Dr. Dong Li is an associate professor at the University of California, Merced. He is the director of the Parallel Architecture, System, and Algorithm Lab (PASA) and a co-director of the High Performance Computing Systems and Architecture Group at UC Merced. Previously (2011-2014), he was a research scientist at Oak Ridge National Laboratory (ORNL). Before that, he earned a PhD in computer science from Virginia Tech. He is an associate editor for IEEE Transactions on Parallel and Distributed Systems (TPDS). Dong's research focuses on high performance computing (HPC) and maintains strong relevance to computer systems.
UC Merced is ranked #7 in high performance computing (HPC) and #59 in computer science overall by CSRankings.
Recent Research Impacts
- Our work on debugging persistent memory programs won the Distinguished Artifact Award at ASPLOS'21 and has been integrated into Intel PMDK, the de facto development kit for programming persistent memory.
- Our work on training large machine learning models using heterogeneous memory has been integrated into Microsoft DeepSpeed. This is collaborative work with Microsoft. It has been widely reported in the media (Link1, Link2, Link3, Link4, etc.) and is widely used in industry (e.g., NVIDIA, HP, and Microsoft).
- Our work on accelerating power grid simulation using machine learning has been highlighted by the U.S. Department of Energy.
- Our work on an MPI fault tolerance benchmark suite and on understanding natural error resilience in HPC applications was reported by HPCwire (Link1 and Link2). HPCwire is the #1 news and information resource covering HPC.
Selected Awards
- ASPLOS Distinguished Artifact Award, 2021
- Berkeley Lab University Faculty Fellowship, 2016
- NSF CAREER Award, 2016
- SC best poster nomination (2.9% of all poster submissions), 2016
- SC best student paper nomination, 2014
- Oak Ridge National Lab (CSMD) Distinguished Contributor Award, 2013
Current Research Topics
- System support for persistent memory-based big memory platforms
- Memory-centric system optimization for machine learning training and inference
- Scientific machine learning
Selected Recent Publications (a complete list of publications can be found here)
- [VLDB'21] Jie Liu, Wenqian Dong, Qingqing Zhou, and Dong Li. Fauce: Fast and Accurate Deep Ensembles with Uncertainty for Cardinality Estimation. In the 47th International Conference on Very Large Data Bases, 2021
- [ATC'21] Jie Ren, Samyam Rajbhandari, Reza Yazdani Aminabadi, Olatunji Ruwase, Shuangyan Yang, Minjia Zhang, Dong Li, and Yuxiong He. ZeRO-Offload: Democratizing Billion-Scale Model Training. In the USENIX Annual Technical Conference, 2021
- [EuroSys'21] Zhen Xie, Wenqian Dong, Jiawen Liu, Hang Liu, and Dong Li. Tahoe: Tree Structure-Aware High Performance Inference Engine for Decision Tree Ensemble on GPU. In the European Conference on Computer Systems, 2021
- [FAST'21] Kai Wu, Jie Ren, Ivy Peng, and Dong Li. ArchTM: Architecture-Aware, High Performance Transaction for Persistent Memory. In the 19th USENIX Conference on File and Storage Technologies, 2021
- [ASPLOS'21] Bang Di, Jiawen Liu, Hao Chen, and Dong Li. Fast, Flexible and Comprehensive Bug Detection for Persistent Memory Programs. In the 26th International Conference on Architectural Support for Programming Languages and Operating Systems, 2021 (Distinguished Artifact Award)
- [PPoPP'21] Jiawen Liu, Jie Ren, Roberto Gioiosa, Dong Li, and Jiajia Li. Sparta: High-Performance, Element-Wise Sparse Tensor Contraction on Heterogeneous Memory. In the 26th Symposium on Principles and Practice of Parallel Programming, 2021
- [HPCA'21] Jie Ren, Jiaolin Luo, Kai Wu, Minjia Zhang, Hyeran Jeon, and Dong Li. Sentinel: Efficient Tensor Migration and Allocation on Heterogeneous Memory Systems for Deep Learning. In the 27th IEEE International Symposium on High-Performance Computer Architecture, 2021
- [ICS'21] Zhen Xie, Wenqian Dong, Jie Liu, Ivy Peng, Yanbao Ma, and Dong Li. MD-HM: Memoization-based Molecular Dynamics Simulations on Big Memory System. In the 35th International Conference on Supercomputing, 2021
- [ICS'21] Xin He, Jiawen Liu, Zhen Xie, Hao Chen, Guoyang Chen, Weifeng Zhang, and Dong Li. Enabling Energy-Efficient DNN Training on Hybrid GPU-FPGA Accelerators. In the 35th International Conference on Supercomputing, 2021
- [ICS'21] Jie Ren, Jiaolin Luo, Ivy Peng, Kai Wu, and Dong Li. Optimizing Large-Scale Plasma Simulations on Persistent Memory-based Heterogeneous Memory with Effective Data Placement Across Memory Hierarchy. In the 35th International Conference on Supercomputing, 2021
- [ICS'21] Jiawen Liu, Dong Li, Roberto Gioiosa, and Jiajia Li. Athena: High-Performance Sparse Tensor Contraction Sequence on Heterogeneous Memory. In the 35th International Conference on Supercomputing, 2021
- [NeurIPS'20] Jie Ren, Minjia Zhang, and Dong Li. HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory. In the 34th Conference on Neural Information Processing Systems, 2020
- [MICRO'18] Jiawen Liu, Hengyu Zhao, Matheus Ogleari, Dong Li, and Jishen Zhao. Processing-in-Memory for Energy-efficient Neural Network Training: A Heterogeneous Approach. In the 51st IEEE/ACM International Symposium on Microarchitecture, 2018
News
- [9/2021] A paper, "Flame: A Self-Adaptive Auto-Labeling System for Heterogeneous Mobile Processors", was accepted to SEC'21.
- [9/2021] Thanks to ANL and SK Hynix for supporting our research on machine learning systems and big memory!
- [8/2021] Welcome to our new PhD student, Dong Xu :-)
- [6/2021] Dong was invited to give a talk and serve as a panelist at the 2nd Workshop on Heterogeneous Memory Systems (HMEM).
- [6/2021] A paper, "Fauce: Fast and Accurate Deep Ensembles with Uncertainty for Cardinality Estimation", was accepted to VLDB'21.
- [6/2021] Dong was selected as an associate editor for IEEE Transactions on Parallel and Distributed Systems (TPDS).
- [5/2021] Dong was invited to give a keynote at the Eleventh International Workshop on Accelerators and Hybrid Exascale Systems (AsHES).
- [5/2021] An NSF grant was awarded to support our research on big memory for HPC.
- [4/2021] Our work on training billion-scale NLP models on heterogeneous memory was accepted to USENIX ATC'21! This is collaborative work with Microsoft.
- [4/2021] Our ASPLOS'21 paper won the Distinguished Artifact Award! Only two papers received this award.
- [3/2021] Four papers were accepted to ICS'21!
- [3/2021] Our collaborative work with LLNL on an MPI fault tolerance benchmark suite was reported by HPCwire.
- [3/2021] Wenqian received an internship offer! She will spend the summer at HP Labs working on scientific machine learning.
- [1/2021] Our collaborative work with Microsoft on training large NLP models with heterogeneous memory drew attention from the media (see 1 and 2). :)
- [1/2021] A paper, "Tahoe: Tree Structure-Aware High Performance Inference Engine for Decision Tree Ensemble on GPU", was accepted to EuroSys'21.
- [12/2020] Dong was invited to give a talk on heterogeneous memory at IBM Research Almaden.
Professional Services
- PC member: ICS'21, SC'21, ICML'21, HPDC'21 (poster co-chair), IPDPS'21, IPDPS'20 (primary PC), NeurIPS'20, HPDC'18-20 (travel chair), ICPP'19-20, SC'18, ASPLOS'18 (shadow PC), Cluster'18 (poster chair), IPDPS'17, NAS (2016-2017), SC'15, CCGrid (2012-2018), Cluster (2015-2020), ISC (2013-2016), etc.
- External reviewers: PPoPP'14, SC'13, Euro-par'12, HPDC'11, etc.
- Editorial positions: TPDS review board member; co-editor, Special Issue of the International Journal of High Performance Computing Applications for the Fourth International Workshop on Accelerators and Hybrid Exascale Systems
- PC (co-) chair: PASA (2012, 2013, and 2016), HP-PAC (2013 and 2014)
- Steering committee: PASA (2014 and 2015)
- Technical reviewer for major journals: IEEE Transactions on Parallel and Distributed Systems (TPDS), IEEE Transactions on Reliability (TR), IEEE Transactions on Computers (TC), Journal of Parallel and Distributed Computing (JPDC), Journal of Supercomputing, International Journal of High Performance Computing Applications, etc.
Current Students
- Jiawen Liu (PhD student, since 2017 Fall)
- Wenqian Dong (PhD student, since 2017 Fall)
- Jie Ren (PhD student, since 2017 Fall)
- Jie Liu (PhD student, since 2019 Spring)
- Shuangyan Yang (PhD student, since 2021 Spring)
- Dong Xu (PhD student, since 2021 Fall)
Alumni
- Zhen Xie (Postdoc, left in May 2021. First employment: Argonne National Lab)
- Kai Wu (PhD student, graduated in May 2021. First employment: ByteDance)
- Luanzheng Guo (PhD student, graduated in Oct 2020. First employment: Pacific Northwest National Lab)
- Neelam Sinha (Master student, graduated in 2020. First employment: National Cancer Institute)
- Hanlin He (Master student, graduated in 2018. First employment: Byton)
- Wei Liu (Master student, graduated in 2017. First employment: Ctrip)
- Himanshu Pillai (Master student, graduated in 2016. First employment: Barcelona Supercomputer Center)
- Armando Montanez (Undergraduate student, graduated in 2018 as a UC Merced outstanding student. First employment: Google)
- Jing Liang (Undergraduate student, graduated in 2017)
- Nigel Tan (Undergraduate student, graduated in 2017. Joined Rice as a PhD student)
- Hanlin He (Undergraduate student, graduated in 2016 as a UC Merced outstanding student. Joined PASA as a PhD student)
- Zachary Canann (Undergraduate student, graduated in 2016 as a UC Merced outstanding student. First employment: PayPal)
- Kevin Song (Undergraduate student, graduated in 2015 as a UC Merced outstanding student. Joined UT Austin as a PhD student)
Sponsors
- National Science Foundation
- Lawrence Livermore National Lab
- Argonne National Lab
- Lawrence Berkeley National Lab
- SK Hynix
- Intel (Equipment donation)
- Xilinx (Equipment donation)
- University of California, Merced