State Management for LLM Serving Systems

Shuhao Zhang  |  张书豪

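Professor, School of Computer Science, Huazhong University of Science and Technology | Recruiting master's students, PhD students, and interns for LLM inference engines, inference serving systems, and memory-agent middleware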

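Shuhao Zhang is a professor at the School of Computer Science and Technology, Huazhong University of Science and Technology (HUST) (personal homepage). His research focuses on efficient state management under complex hardware and dynamic workloads, with an emphasis on parallel and distributed systems, and extends to LLM inference infrastructure, in particular access scheduling, execution optimization, and state reuse in inference serving. Before joining HUST, he was an assistant professor at Nanyang Technological University (NTU), Singapore, and a postdoctoral researcher at Technische Universität Berlin (TUB), Germany.

Current work centers on LLM inference engines, inference serving systems, and memory-agent middleware. It focuses on how shared state is organized at three levels (access, execution, and evolution) and on how these mechanisms affect throughput, P99 latency, and service stability.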


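Research Areas

Current research falls into three interconnected areas:

  • Shared-state access and runtime scheduling: observing, modeling, and optimizing shared-state access and runtime scheduling; for LLMs this mainly corresponds to request queuing, batching, routing, and prefill / decode organization
  • State-aware execution optimization: execution coordination and performance optimization for CPU, GPU, and heterogeneous hardware; for LLMs this mainly corresponds to KV cache, operator execution, communication paths, and end-to-end latency/throughput
  • Shared-state evolution and reuse: continuous writing, evolution, and cross-turn reuse of context, KV, and retrieval memory; for LLMs this mainly corresponds to long context, RAG, and memory-augmented inference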

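Current Systems

The group's current systems work comprises three interconnected system lines. The background, public repositories, and representative papers for each are collected under the corresponding entry below:

  • SAGE: an LLM inference serving system for domestically produced heterogeneous compute hardware, covering online serving, scheduling and orchestration, observability, and end-to-end performance optimization
  • Neuromem (currently open-sourced as Neuromem-Benchmark): a middleware series for LLM memory agents, covering streaming memory organization, vector retrieval, long-term memory writes, and cross-turn reuse
  • vLLM-HUST (including BenchmarkDev Hub and related repositories): an inference engine base and plugin ecosystem developed at HUST for domestically produced compute hardware, covering shared workloads, KV state management, plugin-based optimization, and evaluation toolchains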

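Recruiting and Collaboration

The group recruits, on an ongoing basis, master's students, PhD students, and long-term interns interested in systems research and engineering, with a focus on LLM inference engines, inference serving systems, and memory-agent middleware.

Backgrounds that fit well include:

  • Interest in systems research and willingness to design reproducible experiments and metrics around real workloads
  • Enjoyment of performance engineering and prototyping, with familiarity in Python/C++, distributed systems, or GPU/heterogeneous optimization
  • Interest in core inference engine problems, such as batching, scheduling, KV cache and memory state management, observability, and benchmarking

Experience in databases, operating systems, distributed systems, compilers, computer architecture, GPU/heterogeneous optimization, or open-source systems projects usually fits these directions well. When getting in touch, please attach a CV, your research or project experience (with code links), and one or two problems you would most like to work on.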


Contact

 NEWS

  • May 2026 Two papers accepted to ICML 2026, including Neuromem and SAGE
  • Apr 2026 Two collaborative papers accepted to ACL 2026, including one Main Conference paper and one Findings paper
  • Mar 2026 Recruiting students interested in LLM inference engines and serving systems; prospective students are welcome to contact me by email with a CV and relevant project or code links
  • Jan 2026 Two papers accepted to WWW 2026
  • Nov 2025 Launched our LLM inference system: https://ws.sage.org.ai/
  • Nov 2025 One paper accepted to SIGMOD 2026
  • Nov 2025 One paper accepted to TKDE 2026
  • Nov 2025 One paper accepted to AAAI 2026
  • Oct 2025 One paper accepted to ICDE 2026
  • July 2025 One paper accepted to NDBC 2025
  • July 2025 One paper accepted to NDBC 2025
  • July 2025 One paper accepted to VLDB 2025
  • March 2025 One paper accepted to ICDCS 2025
  • March 2025 One paper accepted to TKDE 2025
  • March 2025 One paper accepted to ICDE 2025
  • March 2025 One paper accepted to CVPR 2025

 PUBLICATIONS

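Paper tags: [First Author], [Corresponding Author], [CCF-A].

If a paper listed below has no direct link yet, its PDF has not been uploaded to the current repository.

Overview

Recent work follows a fairly coherent line: shared-state management in complex systems, extending step by step to LLM inference infrastructure. Specifically, it addresses three kinds of problems: how request organization and runtime scheduling affect throughput and P99 latency; how coordination between the KV cache and execution paths affects end-to-end gains; and the system mechanisms for writing, retaining, retrieving, and reusing information across turns in long-context, RAG, and memory-augmented inference.

Representative Papers

The representative papers below are organized by research line to provide a compact entry point; the full list follows later.

1. Shared-State Access, Scheduling, and Runtime Management

This direction studies how conflicts, hotspots, and queuing in shared-state access evolve into system-level performance degradation, and how these problems can move from empirical tuning to observable, modelable, and manageable system methods. The work progresses through path-level conflict observation, shared execution path optimization, modeling of state hotspots and locality, and runtime scheduling frameworks that balance steady-state execution with recovery control.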

  • [ICML 2026] SAGE: A Dataflow-Native Framework for Modular, Controllable, and Transparent LLM-Augmented Reasoning. Jun Liu, Peilin Liu, Ruicheng Zhang, Senlei Zhang, Yanbo Chen, Ziao Wang, Jinyun Yang, Mingqi Wang, Shuhao Zhang, Xiaofei Liao, Hai Jin. International Conference on Machine Learning (ICML). [Corresponding Author] [CCF-A]
  • [TKDE 2025] Scalable Transactional Stream Processing on Multicore Processors. Jianjun Zhao, Yancan Mao, Zhonghao Yang, Haikun Liu, Shuhao Zhang. IEEE Transactions on Knowledge and Data Engineering (TKDE), 37(7): 4254-4269, 2025. [Corresponding Author] [CCF-A]
  • [SIGMOD 2023] MorphStream: Adaptive Scheduling for Scalable Transactional Stream Processing on Multicores. Yancan Mao, Jianjun Zhao, Shuhao Zhang, Haikun Liu, Volker Markl. Proc. ACM Manag. Data (SIGMOD), 1(1), Article 59, 1-26, 2023. [Corresponding Author] [CCF-A]
  • [VLDBJ 2024] A Survey on Transactional Stream Processing. Shuhao Zhang, Juan Soto, Volker Markl. The VLDB Journal, 33(2): 451-479, 2024. [First Author] [CCF-A]

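2. State-Aware Execution Optimization and Hardware-Software Co-Design

This direction addresses coordinated optimization of state-dependent execution on complex hardware. It analyzes the coupling among throughput, latency, energy, accuracy, and quality bounds, and why local optimizations rarely translate reliably into end-to-end gains. The corresponding work brings state organization, data paths, and execution mapping into a unified analysis framework, and further models efficiency bounds, quality bounds, and hardware cost jointly.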

  • [ICML 2026] Neuromem: A Granular Decomposition of the Streaming Lifecycle in External Memory for LLMs. Ruicheng Zhang, Xinyi Li, Tianyi Xu, Shuhao Zhang, Xiaofei Liao, Hai Jin. International Conference on Machine Learning (ICML). [Corresponding Author] [CCF-A]
  • [NeurIPS 2024] LibAMM: Empirical Insights into Approximate Computing for Accelerating Matrix Multiplication. Xianzhi Zeng, Wenchao Jiang, and Shuhao Zhang. Conference on Neural Information Processing Systems (NeurIPS). [Corresponding Author] [CCF-A]
  • [SIGMOD 2024] PECJ: Stream Window Join on Disorder Data Streams with Proactive Error Compensation. Xianzhi Zeng, Shuhao Zhang, Hongbin Zhong, Hao Zhang, Mian Lu, Zhao Zheng, Yuqiang Chen. Proc. ACM Manag. Data (SIGMOD), 2(1): 1-24, 2024. [Corresponding Author] [CCF-A]
  • [TKDE 2024] CStream: Parallel Data Stream Compression on Multicore Edge Devices. Xianzhi Zeng, Shuhao Zhang. IEEE Transactions on Knowledge and Data Engineering, 36(11): 5889-5904, 2024. [Corresponding Author] [CCF-A]

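3. Shared-State Evolution, Reuse, and Stable Inference

This direction studies continuous writing, stable retention, and cross-turn reuse of shared state in dynamic settings, and examines how they support long-term stable inference. The work starts from online updates and write awareness, then extends to joint optimization of write and retention costs, and to a middle layer that organizes updatable, retrievable, and reusable memory objects.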

  • [WWW 2026] FlowRAG: Continual Learning for Dynamic Retriever in Retrieval-Augmented Generation. Senlei Zhang, Tongjun Shi, Dandan Song, Luan Zhang, Shuhao Zhang, Xiaofei Liao, and Hai Jin. The Web Conference (WWW). [Corresponding Author] [CCF-A]
  • [WWW 2026] StreamFP: Fingerprint-guided Data Selection for Efficient Stream Learning. Changwu Li, Tongjun Shi, Shuhao Zhang, Binbin Chen, Bingsheng He, Xiaofei Liao, and Hai Jin. The Web Conference (WWW). [Corresponding Author] [CCF-A]
  • [ICDM 2024] MOStream: A Modular and Self-Optimizing Data Stream Clustering Algorithm. Zhengru Wang, Xin Wang, Shuhao Zhang. International Conference on Data Mining (ICDM). [Corresponding Author]
  • [EMNLP 2023] SentiStream: A Co-Training Framework for Adaptive Online Sentiment Analysis in Evolving Data Streams. Yuhao Wu, Karthick Sharma, Chun Wei Seah, Shuhao Zhang. Empirical Methods in Natural Language Processing (long paper, main track). [Corresponding Author]

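Full Publication List

The list below is grouped by research topic and mainly covers shared-state access and scheduling, state-aware execution optimization, shared-state evolution and reuse, and related application and systems research.

1. Shared-State Access, Scheduling, and Runtime Management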

  • [ICML] SAGE: A Dataflow-Native Framework for Modular, Controllable, and Transparent LLM-Augmented Reasoning. Jun Liu, Peilin Liu, Ruicheng Zhang, Senlei Zhang, Yanbo Chen, Ziao Wang, Jinyun Yang, Mingqi Wang, Shuhao Zhang, Xiaofei Liao, Hai Jin. International Conference on Machine Learning (ICML). [Corresponding Author] [CCF-A]
  • [TKDE] Scalable Transactional Stream Processing on Multicore Processors. Jianjun Zhao, Yancan Mao, Zhonghao Yang, Haikun Liu, Shuhao Zhang. IEEE Transactions on Knowledge and Data Engineering (TKDE), 37(7): 4254-4269, 2025. [Corresponding Author] [CCF-A]
  • [ICDCS] Spacker: Unified State Migration for Distributed Streaming. Yancan Mao, Shuhao Zhang, Richard Ma. International Conference on Distributed Computing Systems.
  • [ICDE] Fast Parallel Recovery for Transactional Stream Processing on Multicores. Jianjun Zhao, Haikun Liu, Shuhao Zhang, Zhuohui Duan, Xiaofei Liao, Hai Jin, and Yu Zhang. IEEE 40th International Conference on Data Engineering (ICDE). [CCF-A]
  • [ICDE] MorphStream: Scalable Processing of Transactions over Streams. Siqi Xiang, Zhonghao Yang, Shuhao Zhang, Jianjun Zhao, and Yancan Mao. IEEE 40th International Conference on Data Engineering (ICDE Demo). [Corresponding Author] [CCF-A]
  • [VLDBJ] A Survey on Transactional Stream Processing. Shuhao Zhang, Juan Soto, Volker Markl. The VLDB Journal, 33(2): 451-479, 2024. [First Author] [CCF-A]
  • [ICDE] Scalable Online Interval Join on Modern Multicore Processors in OpenMLDB. Hao Zhang, Xianzhi Zeng, Shuhao Zhang, Xinyi Liu, Mian Lu, and Zhao Zheng. IEEE 39th International Conference on Data Engineering (ICDE). [CCF-A]
  • [SIGMOD] MorphStream: Adaptive Scheduling for Scalable Transactional Stream Processing on Multicores. Yancan Mao, Jianjun Zhao, Shuhao Zhang, Haikun Liu, Volker Markl. Proc. ACM Manag. Data (SIGMOD), 1(1), Article 59, 1-26, 2023. [Corresponding Author] [CCF-A]
  • [SIGMOD] Parallelizing Intra-Window Join on Multicores: An Experimental Study. Shuhao Zhang, Yancan Mao, Jiong He, Philipp M. Grulich, Steffen Zeuch, Bingsheng He, Richard T. B. Ma, Volker Markl. International Conference on Management of Data (SIGMOD). [First Author] [CCF-A]
  • [ICDE] Towards Concurrent Stateful Stream Processing on Multicore Processors. Shuhao Zhang, Yingjun Wu, Feng Zhang, Bingsheng He. IEEE 36th International Conference on Data Engineering. [First Author] [CCF-A]
  • [SIGMOD] BriskStream: Scaling Data Stream Processing on Shared-Memory Multicore Architectures. Shuhao Zhang, Jiong He, Amelie Chi Zhou, Bingsheng He. International Conference on Management of Data (SIGMOD). [First Author] [CCF-A]
  • [ICDE] Multi-Query Optimization for Complex Event Processing in SAP ESP. Shuhao Zhang, H. T. Vo, D. Dahlmeier, B. He. IEEE 33rd International Conference on Data Engineering. [First Author] [CCF-A]
  • [ICDE] Revisiting the Design of Data Stream Processing Systems on Multi-Core Processors. Shuhao Zhang, B. He, D. Dahlmeier, A. C. Zhou, T. Heinze. IEEE 33rd International Conference on Data Engineering. [First Author] [CCF-A]

2. State-Aware Execution Optimization and Hardware-Software Co-Design

  • [ICML] Neuromem: A Granular Decomposition of the Streaming Lifecycle in External Memory for LLMs. Ruicheng Zhang, Xinyi Li, Tianyi Xu, Shuhao Zhang, Xiaofei Liao, Hai Jin. International Conference on Machine Learning (ICML). [Corresponding Author] [CCF-A]
  • [TKDE] Data-Aware Adaptive Compression for Stream Processing. Yu Zhang, Feng Zhang, Hourun Li, Shuhao Zhang, Xiaoguang Guo, Yuxing Chen, Anqun Pan, and Xiaoyong Du. IEEE Transactions on Knowledge and Data Engineering (TKDE), 36(9): 4531-4549, 2024. [CCF-A]
  • [SIGMOD] Enabling Adaptive Sampling for Intra-Window Join: Simultaneously Optimizing Quantity and Quality. Xilin Tang, Feng Zhang, Shuhao Zhang, Yani Liu, Bingsheng He, Xiaoyong Du. Proc. ACM Manag. Data (SIGMOD), 2(4): 1-31, 2024. [CCF-A]
  • [SIGMOD] MAST: Towards Efficient Analytical Query Processing on Point Cloud Data. Jiangneng Li, Haitao Yuan, Gao Cong, Han Mao Kiah, Shuhao Zhang. Proc. ACM Manag. Data (SIGMOD), 3(1): 1-27, 2025. [CCF-A]
  • [NeurIPS] LibAMM: Empirical Insights into Approximate Computing for Accelerating Matrix Multiplication. Xianzhi Zeng, Wenchao Jiang, and Shuhao Zhang. Conference on Neural Information Processing Systems (NeurIPS). [Corresponding Author] [CCF-A]
  • [SIGMOD] PECJ: Stream Window Join on Disorder Data Streams with Proactive Error Compensation. Xianzhi Zeng, Shuhao Zhang, Hongbin Zhong, Hao Zhang, Mian Lu, Zhao Zheng, Yuqiang Chen. Proc. ACM Manag. Data (SIGMOD), 2(1): 1-24, 2024. [Corresponding Author] [CCF-A]
  • [TKDE] CStream: Parallel Data Stream Compression on Multicore Edge Devices. Xianzhi Zeng, Shuhao Zhang. IEEE Transactions on Knowledge and Data Engineering, 36(11): 5889-5904, 2024. [Corresponding Author] [CCF-A]
  • [SIGMOD] Predictive and Near-Optimal Sampling for View Materialization in Video Databases. Yanchao Xu, Dongxiang Zhang, Shuhao Zhang, Sai Wu, Zexu Feng, Gang Chen. Proc. ACM Manag. Data (SIGMOD), 2(1): 1-27, 2024. [CCF-A]
  • [DEBS] A Hardware-Conscious Stateful Stream Compression Framework for IoT Applications (Vision). Xianzhi Zeng, Shuhao Zhang. International Conference on Distributed and Event-Based Systems (DEBS). [Corresponding Author]
  • [ICDE] Parallelizing Stream Compression for IoT Applications on Asymmetric Multicores. Xianzhi Zeng and Shuhao Zhang. IEEE 39th International Conference on Data Engineering (ICDE). [CCF-A]
  • [ICDE] CompressStreamDB: Fine-Grained Adaptive Stream Processing without Decompression. Yu Zhang, Feng Zhang, Hourun Li, Shuhao Zhang, and Xiaoyong Du. IEEE 39th International Conference on Data Engineering (ICDE). [CCF-A]
  • [ICDE] Scalable Machine Learning for Real-Time Fault Diagnosis in Industrial IoT Cooling Roller Systems (SRTFD). Dandan Zhao, Karthick Sharma, Yuxin Qi, Qixun Liu, and Shuhao Zhang. IEEE 41st International Conference on Data Engineering (ICDE). [Corresponding Author] [CCF-A]
  • [TPDS] Fine-Grained Multi-Query Stream Processing on Integrated Architectures. Feng Zhang, Chenyang Zhang, Lin Yang, Cheng Yang, Shuhao Zhang, Bingsheng He, Wei Lu, Xiaoyong Du. IEEE Transactions on Parallel and Distributed Systems (TPDS), 32(9), 2021. [CCF-A]
  • [USENIX ATC] FineStream: Fine-Grained Window-Based Stream Processing on CPU-GPU Integrated Architectures. Feng Zhang, Lin Yang, Shuhao Zhang, Bingsheng He, Wei Lu, Xiaoyong Du. USENIX Annual Technical Conference (USENIX ATC 20). [CCF-A]
  • [SIGMOD Rec.] Hardware-Conscious Stream Processing: A Survey. Shuhao Zhang, Feng Zhang, Yingjun Wu, Bingsheng He, Paul Johns. SIGMOD Record, 48(4), 2020. [First Author]

3. Shared-State Evolution, Reuse, and Stable Inference

  • [EMNLP] SentiStream: A Co-Training Framework for Adaptive Online Sentiment Analysis in Evolving Data Streams. Yuhao Wu, Karthick Sharma, Chun Wei Seah, Shuhao Zhang. Empirical Methods in Natural Language Processing (long paper, main track). [Corresponding Author]
  • [ICDM] MOStream: A Modular and Self-Optimizing Data Stream Clustering Algorithm. Zhengru Wang, Xin Wang, Shuhao Zhang. International Conference on Data Mining (ICDM). [Corresponding Author]
  • [SIGMOD] Data Stream Clustering: An In-depth Empirical Study. Xin Wang, Zhengru Wang, Zhenyu Wu, Shuhao Zhang, Xuanhua Shi, Li Lu. International Conference on Management of Data (SIGMOD). [Corresponding Author] [CCF-A]
  • [CVPR] Ferret: An Efficient Online Continual Learning Framework under Varying Memory Constraints. Yuhao Zhou, Yuxin Tian, Jindi Lv, Mingjia Shi, Yuanxi Li, Qing Ye, Shuhao Zhang, and Jiancheng Lv. Conference on Computer Vision and Pattern Recognition (CVPR). [CCF-A]
  • [WWW] StreamFP: Fingerprint-guided Data Selection for Efficient Stream Learning. Changwu Li, Tongjun Shi, Shuhao Zhang, Binbin Chen, Bingsheng He, Xiaofei Liao, and Hai Jin. The Web Conference (WWW). [Corresponding Author] [CCF-A]
  • [EMNLP] A Framework of Knowledge Graph-Enhanced Large Language Model Based on Question Decomposition and Atomic Retrieval. Yading Li, Dandan Song, Changzhi Zhou, Yuhang Tian, Hao Wang, Ziyi Yang, and Shuhao Zhang. Empirical Methods in Natural Language Processing (Findings).
  • [TKDE] A Framework of Knowledge Graph-Enhanced Large Language Model Based on Global Planning. Yading Li, Dandan Song, Yuhang Tian, Hao Wang, Changzhi Zhou, and Shuhao Zhang. IEEE Transactions on Knowledge and Data Engineering (TKDE), 38(2), 2026. [CCF-A]
  • [WWW] FlowRAG: Continual Learning for Dynamic Retriever in Retrieval-Augmented Generation. Senlei Zhang, Tongjun Shi, Dandan Song, Luan Zhang, Shuhao Zhang, Xiaofei Liao, and Hai Jin. The Web Conference (WWW). [Corresponding Author] [CCF-A]
  • [SIGMOD] CANDOR-Bench: Benchmarking In-Memory Continuous ANNS under Dynamic Open-World Streams [Experiments & Analysis]. Mingqi Wang, Jun Liu, Ruicheng Zhang, Jianjun Zhao, Ruipeng Wan, Xinyan Lei, Shuhao Zhang, Bolong Zheng, Haikun Liu, Xiaofei Liao, and Hai Jin. International Conference on Management of Data (SIGMOD). [Corresponding Author] [CCF-A]

Other Papers

  • [ACL Findings] Multi-Hop Knowledge Editing via Critic-Guided Multi-Agent Reasoning. Xudong Li, Yuhang Tian, Dandan Song, Zhijing Wu, Shuhao Zhang, Jun Yang, Yongyu Huo, Changzhi Zhou, Xinyu Zhang, Chenhao Li, Huipeng Ma, Luan Zhang, Yan Xu, Qian Liu. Findings of the Association for Computational Linguistics (ACL 2026 Findings).
  • [ACL] FusionFlow: Enabling Deep Structural Exploration for Automated Agentic Workflow Generation. Xiang Wang, Zongtao Yang, Zhuojian Hong, Shuhao Zhang, Wei Wei. Annual Meeting of the Association for Computational Linguistics (ACL 2026 Main Conference). [CCF-A]
  • [ICDE] GRACE: Alleviating Reconstruction Cost in Dynamic Graph Processing Systems. Hongru Gao, Shuhao Zhang, Xiaofei Liao, and Hai Jin. IEEE 42nd International Conference on Data Engineering (ICDE). [Corresponding Author] [CCF-A]
  • [SIGMOD] Select Edges Wisely: Monotonic Path Aware Graph Layout Optimization for Disk-Based ANN Search. Ziyang Yue, Bolong Zheng, Ling Xu, Kanru Xu, Shuhao Zhang, Yajuan Du, Yunjun Gao, Xiaofang Zhou, and Christian S. Jensen. SIGMOD. [CCF-A]
  • [IJCAI] Detecting Hallucination in Large Language Models through Deep Internal Representation Analysis. Luan Zhang, Dandan Song, Zhijing Wu, Yuhang Tian, Changzhi Zhou, Jing Xu, Ziyi Yang, and Shuhao Zhang. International Joint Conference on Artificial Intelligence (IJCAI). [CCF-A]
  • [NC] MatSwarm: Trusted Swarm Transfer Learning Driven Materials Computation for Secure Big Data Sharing. Cheng Xu, Ran Wang, Shuhao Zhang, Fangwen Ye, Yusen Tang, Sisui Tang, Hangning Zhang, Wendi Du, and Xiaotong Zhang. Nature Communications, 15(1), 2024.
  • [ICPP] PREACT: Predictive Resource Allocation for Bursty Workloads in a Co-located Data Center. Ziyang Xiao, Dongxiang Zhang, Dingyu Yang, Shuhao Zhang, Jian Cao, Gang Chen. International Conference on Parallel Processing (ICPP).
  • [IWQoS] Low-Latency Video Conferencing via Optimized Packet Routing and Reordering. Yao Xiao, Amelie Chi Zhou, Sitian Chen, Shuhao Zhang, Yi Wang, Rui Mao, Xuan Yang. IEEE International Symposium on Quality of Service.
  • [VLDBJ] Payment Behavior Prediction on Shared Parking Lots with TR-GCN. Qingyu Xu, Feng Zhang, Mingde Zhang, Jidong Zhai, Bingsheng He, Cheng Yang, Shuhao Zhang, Jiazao Lin, Haidi Liu, Xiaoyong Du. The VLDB Journal, 31(5), 2022. [CCF-A]
  • [MDPI Algorithms] Revisiting the Design of Parallel Stream Joins on Trusted Execution Environments. Souhail Meftah, Shuhao Zhang, Bharadwaj Veeravalli, Khin Mi Mi Aung. MDPI Algorithms.
  • [TKDE] Periodic Weather-Aware LSTM with Event Mechanism for Parking Behavior Prediction. F. Zhang, Y. Liu, N. Feng, C. Yang, J. Zhai, Shuhao Zhang, B. He, J. Lin, X. Zhang, X. Du. IEEE Transactions on Knowledge and Data Engineering (TKDE), 34(12), 2022. [CCF-A]
  • [OJIOT] NebulaStream: Complex Analytics Beyond the Cloud. Steffen Zeuch, Eleni Tzirita Zacharatou, Shuhao Zhang, Xenofon Chatziliadis, Ankit Chaudhary, Bonaventura Del Monte, Dimitrios Giouroukis, Philipp M. Grulich, Ariane Ziehn, Volker Markl. Open Journal of Internet of Things (OJIOT).
  • [IJCAI] PewLSTM: Periodic LSTM with Weather-Aware Gating Mechanism for Parking Behavior Prediction. Feng Zhang, Ningxuan Feng, Yani Liu, Cheng Yang, Jidong Zhai, Shuhao Zhang, Bingsheng He, Jiazao Lin, Xiaoyong Du. International Joint Conference on Artificial Intelligence (IJCAI). [CCF-A]
  • [BigMM] TraV: An Interactive Exploration System for Massive Trajectory Data. J. Ang, T. Fu, J. Paul, Shuhao Zhang, B. He, T. S. D. Wenceslao, S. Y. Tan. IEEE Fifth International Conference on Multimedia Big Data (BigMM). [Corresponding Author]
  • [TPDS] Understanding Co-Running Behaviors on Integrated CPU/GPU Architectures. F. Zhang, J. Zhai, B. He, Shuhao Zhang, W. Chen. IEEE Transactions on Parallel and Distributed Systems (TPDS). [CCF-A]
  • [SC] Elastic Multi-resource Fairness: Balancing Fairness and Efficiency in Coupled CPU/GPU Architectures. S. Tang, B. He, Shuhao Zhang, Z. Niu. International Conference for High Performance Computing, Networking, Storage and Analysis (SC). [CCF-A]
  • [TPDS] Melia: A MapReduce Framework on OpenCL-Based FPGAs. Zeke Wang, Shuhao Zhang, Bingsheng He, Wei Zhang. IEEE Transactions on Parallel and Distributed Systems (TPDS), 27(12): 3547-3560, 2016. [CCF-A]
  • [MASCOTS] To Co-run, or Not to Co-run: A Performance Study on Integrated Architectures. Feng Zhang, Jidong Zhai, Wenguang Chen, Bingsheng He, Shuhao Zhang. IEEE 23rd International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS).
  • [VLDB] In-Cache Query Co-Processing on Coupled CPU-GPU Architectures. Jiong He, Shuhao Zhang, Bingsheng He. Proceedings of the VLDB Endowment (PVLDB), 8(4): 329-340, 2014. [CCF-A]
  • [VLDB] OmniDB: Towards Portable and Efficient Query Processing on Parallel CPU/GPU Architectures. Shuhao Zhang, Jiong He, Bingsheng He, Mian Lu. Proceedings of the VLDB Endowment (PVLDB), 6(12): 1374-1377, 2013. [First Author] [CCF-A]

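Paper Downloads (PDF)

The links below provide author-archived versions (such as preprints, accepted manuscripts, or author copies), not the official versions published on journal, conference proceedings, or publisher websites.

Direct download links for uploaded papers: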

 MORE

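Public teaching materials and research survey downloads are collected here.

Systems

  • Information on representative systems, including current work such as SAGE, Neuromem, and vLLM-HUST, is summarized in the systems section of the homepage.

Teaching Materials

Two courses are currently public:

  • LLM Inference Infrastructure: page
  • Graduate Paper Writing: page

For the 2026 Graduate Paper Writing course, the slides for Lectures 5 through 8 are currently public. All are drafts, intended only as a reference for students in the course and interested readers, and may be revised further.

Public Survey Materials

  • Survey on Efficient State Management in Parallel and Distributed Systems (preprint): PDF
  • Survey on Inference Engines for Domestically Produced Compute Hardware (preprint): PDF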