About Me
Dr. Kai Zhang is an Associate Professor at Fudan University. He received his Ph.D. degree from the University of Science and Technology of China in 2016. Before joining Fudan in Oct. 2017, he was a Research Fellow at the National University of Singapore from Oct. 2016 to Oct. 2017. He was a Visiting Scholar at The Ohio State University from Sep. 2013 to Sep. 2015.
His research areas include AI Systems, Parallel and Distributed Computing, and Databases.
Current main research topics: retrieval-augmented generation (RAG) for large models and vector databases; large-scale recommender systems; data processing on heterogeneous GPU hardware (ray tracing cores).
News and Events
- We won both the OOD track and the Sparse track of the NeurIPS'23 Big-ANN Competition. Congratulations to my students Meng Chen, Yue Chen, and Rui Ma!
- I am looking for highly self-motivated students with a strong interest in systems research. If you are interested in working with me, please send me an email with your CV and transcript.
- Recruiting self-driven undergraduate, master's, and Ph.D. students with a strong interest in systems research.
Teaching
Selected Publications
- RoarGraph: A Projected Bipartite Graph for Efficient Cross-Modal Approximate Nearest Neighbor Search. Meng Chen, Kai Zhang, Zhenying He, Yinan Jing and X. Sean Wang. Proceedings of the 50th International Conference on Very Large Data Bases (VLDB), 2024.
Highlight: State of the art in multi-modal embedding retrieval. The approach won the OOD track of the NeurIPS'23 Big-ANN Competition.
- AdaptChain: Adaptive Data Sharing and Synchronization for NFV Systems on Heterogeneous Architectures. Kai Zhang, Jiahui Hong, Zhenying He, Yinan Jing and X. Sean Wang. IEEE Transactions on Parallel and Distributed Systems (TPDS), Pages: 1281 - 1292, Volume: 35, Issue: 7, 2024.
- RTScan: Efficient Scan with Ray Tracing Cores. Yangming Lv, Kai Zhang, Ziming Wang, Xiaodong Zhang, Rubao Lee, Zhenying He, Yinan Jing and X. Sean Wang. Proceedings of the 50th International Conference on Very Large Data Bases (VLDB), Vol. 17, No. 6, 2024.
Highlight: The first work that demonstrates ray tracing cores can be faster than CUDA cores and CPUs for relational operators.
- MWP: Multi-Window Parallel Evaluation of Regular Path Queries on Streaming Graphs. Siyuan Zhang, Zhenying He, Yinan Jing, Kai Zhang and X. Sean Wang. Proceedings of the 39th ACM International Conference on Management of Data (SIGMOD), 2024.
- MetaSQL: A Generate-then-Rank Framework for Natural Language to SQL Translation. Yuankai Fan, Zhenying He, Tonghui Ren, Can Huang, Yinan Jing, Kai Zhang, X. Sean Wang. IEEE International Conference on Data Engineering (ICDE), 2024.
- PURPLE: Making a Large Language Model a Better SQL Writer. Tonghui Ren, Yuankai Fan, Zhenying He, Ren Huang, Jiaqi Dai, Can Huang, Yinan Jing, Kai Zhang, Yifan Yang, X. Sean Wang. IEEE International Conference on Data Engineering (ICDE), 2024.
- Learned Optimizer for Online Approximate Query Processing in Data Exploration. Liyuan Liu, Hanbing Zhang, Yinan Jing, Zhenying He, Kai Zhang, and X. Sean Wang. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024, Accepted.
- Learning-based Sample Tuning for Approximate Query Processing in Interactive Data Exploration. Hanbing Zhang, Yinan Jing, Zhenying He, Kai Zhang and X. Sean Wang. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024, Accepted.
- CLIC: An Extensible and Efficient Cross-Platform Data Analytics System. Qixiang Chen, Zhijun Chen, Kai Zhang and X. Sean Wang. IEEE Transactions on Parallel and Distributed Systems (TPDS), Pages: 34 - 45, Volume: 35, Issue: 1, 2024.
- GAR: A Generate-and-Rank Approach for Natural Language to SQL Translation. Yuankai Fan, Zhenying He, Tonghui Ren, Dianjun Guo, Lin Chen, Ruisi Zhu, Guanduo Chen, Yinan Jing, Kai Zhang, X. Sean Wang. IEEE International Conference on Data Engineering (ICDE), pages 110-122, 2023.
- BlinkViz: Fast and Scalable Approximate Visualization on Very Large Datasets using Neural-Enhanced Mixed Sum-Product Networks. Yimeng Qiao, Yinan Jing, Hanbing Zhang, Zhenying He, Kai Zhang and X. Sean Wang. The Web Conference (WWW), April 30 - May 4, 2023.
- Gaviss: Boosting the Performance of GPU-Accelerated NFV Systems via Data Sharing. Liangchen Guo, Kai Zhang, X. Sean Wang. IEEE Transactions on Parallel and Distributed Systems (TPDS), Pages: 4472 - 4483, Volume: 33, Issue: 12, 2022.
- Chunk-Level Password Guessing: Towards Modeling Refined Password Composition Representations. Ming Xu, Chuanwang Wang, Jitao Yu, Junjie Zhang, Kai Zhang, Weili Han. Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security (CCS), November 15-19, 2021.
- TransPCFG: Transferring the Grammars From Short Passwords to Guess Long Passwords Effectively. Weili Han, Ming Xu, Junjie Zhang, Chuanwang Wang, Kai Zhang, X. Sean Wang. IEEE Transactions on Information Forensics and Security (TIFS), 16: 451-465, 2021.
- BinDex: A Two-Layered Index for Fast and Robust Scans. Linwei Li, Kai Zhang, Jiading Guo, Wen He, Zhenying He, Yinan Jing, Weili Han, X. Sean Wang. Proceedings of the 39th ACM International Conference on Management of Data (SIGMOD), Portland OR, USA, June 14-19, 2020.
Highlight: State-of-the-art / Fastest scan approach on CPUs.
- An Agile Sample Maintenance Approach for Agile Analytics. Hanbing Zhang, Yazhong Zhang, Zhenying He, Yinan Jing, Kai Zhang, X. Sean Wang. Proceedings of the 36th IEEE International Conference on Data Engineering (ICDE), Dallas TX, USA, April 20-24, 2020.
- Understanding and Optimizing Conjunctive Predicates under Memory-Efficient Storage Layouts. Zeke Wang, Xue Liu, Kai Zhang, Haihang Zhou, Bingsheng He. IEEE Transactions on Knowledge and Data Engineering (TKDE), Volume 33, Issue 6, Pages 2803-2871, 2019.
- G-NET: Effective GPU Sharing in NFV Systems. Kai Zhang, Bingsheng He, Jiayu Hu, Zeke Wang, Bei Hua, Jiayi Meng, Lishan Yang. Proceedings of the 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI), Renton WA, USA, April 9-11, 2018.
Highlight: The first work that efficiently shares a GPU in an NFV system. Follow-up work includes Gaviss and AdaptChain.
- Hebe: An Order-Oblivious and High-Performance Execution Scheme for Conjunctive Predicates. Zeke Wang, Kai Zhang, Haihang Zhou, Xue Liu, Bingsheng He. Proceedings of the 34th IEEE International Conference on Data Engineering (ICDE short paper), Paris, France, April 16-20, 2018.
- A Distributed In-Memory Key-Value Store System on Heterogeneous CPU–GPU Cluster. Kai Zhang, Kaibo Wang, Yuan Yuan, Lei Guo, Rubao Li, Xiaodong Zhang, Bingsheng He, Jiayu Hu, Bei Hua. The International Journal on Very Large Data Bases (VLDB Journal), Volume 26 Number 5, Pages 729-750, October 2017.
- DIDO: Dynamic Pipelines for In-Memory Key-Value Stores on Coupled CPU-GPU Architectures. Kai Zhang, Jiayu Hu, Bingsheng He, Bei Hua. Proceedings of the 33rd IEEE International Conference on Data Engineering (ICDE), San Diego, California, USA, April 19-22, 2017.
- Mega-KV: A Case for GPUs to Maximize the Throughput of In-Memory Key-Value Stores. Kai Zhang, Kaibo Wang, Yuan Yuan, Lei Guo, Rubao Lee, and Xiaodong Zhang. Proceedings of the VLDB Endowment, Vol. 8, Issue 11, July 2015. (Presented at the 41st International Conference on Very Large Data Bases (VLDB), Hawaii, USA, August 31 - September 4, 2015.)
Highlight: Breaking the performance record of in-memory key-value store on commodity processors.
- Hetero-DB: The Next Generation High-Performance Database System on Heterogeneous Computing and Storage Hardware. Kai Zhang, Feng Chen, Xiaoning Ding, Yin Huai, Rubao Lee, Tian Luo, Kaibo Wang, Yuan Yuan, and Xiaodong Zhang. Journal of Computer Science and Technology (JCST), Volume 30, Issue 4, pages 657-678, July 2015.
- A Holistic Approach to Build Real-time Stream Processing System with GPU. Kai Zhang, Jiayu Hu, Bei Hua. Journal of Parallel and Distributed Computing (JPDC), Volume 83 Issue C, Pages 44-57, September 2015.
- Concurrent analytical query processing with GPUs. Kaibo Wang, Kai Zhang, Yuan Yuan, Siyuan Ma, Rubao Lee, Xiaoning Ding, and Xiaodong Zhang. Proceedings of the VLDB Endowment, Vol. 7, Issue 11, July 2014. (Presented at the 40th International Conference on Very Large Data Bases (VLDB), Hangzhou, China, September 1-5, 2014.)