Dr. Dong Li has been an associate professor in the Department of Electrical Engineering and Computer Science at the University of California, Merced, since 2015. He is the director of the Parallel Architecture, System, and Algorithm Lab (PASA) and a co-director of the High Performance Computing Systems and Architecture Group at UC Merced. Previously (2011-2014), he was a research scientist at Oak Ridge National Laboratory (ORNL). Before that, he earned his PhD in computer science from Virginia Tech. Dong received a CAREER Award from the U.S. National Science Foundation in 2016, a Berkeley Lab University Faculty Fellowship in 2016, and an ORNL/CSMD Distinguished Contributor Award in 2013. His SC'14 paper was nominated for the Best Student Paper Award, and his ASPLOS'21 paper won the Distinguished Artifact Award. He was also the lead PI for the NVIDIA CUDA Research Center at UC Merced, and he is an associate editor for IEEE Transactions on Parallel and Distributed Systems (TPDS).
Dong's research focuses on high performance computing (HPC) and maintains a strong relevance to computer systems. The core theme of his research is how to enable scalable and efficient execution of enterprise and scientific applications on increasingly complex large-scale parallel systems. His work creates innovations in runtimes, architecture, performance modeling, and programming models, and often coordinates software and hardware to address challenges in fault tolerance and execution efficiency on large-scale parallel systems. He also investigates the impact of novel architectures (e.g., non-volatile memory and accelerators with massive parallelism) on the design of applications and runtimes. Dong's approach to research has always been application-driven: he believes that measuring and understanding the characteristics of realistic large-scale parallel applications is critical to identifying problems, provoking thought, and preventing fundamental design flaws.
In collaboration with industry, DOE laboratories, and universities, Dong has been involved in various NSF and DOE projects related to HPC. His work has been published in a number of premier HPC conferences, and he serves on the program committees of many international conferences and workshops.
UC Merced is ranked #10 in high performance computing (HPC) and #60 in computer science overall by CSRankings.
Current Research Topics
- System support for persistent memory-based big memory platforms
- Memory-centric system optimization for machine learning training and inference
- Scientific machine learning
Selected Recent Publications (a complete list of publications can be found here)
- [VLDB'21] Jie Liu, Wenqian Dong, Qingqing Zhou, and Dong Li. "Fauce: Fast and Accurate Deep Ensembles with Uncertainty for Cardinality Estimation". In the 47th International Conference on Very Large Data Bases, 2021
- [ATC'21] Jie Ren, Samyam Rajbhandari, Reza Yazdani Aminabadi, Olatunji Ruwase, Shuangyan Yang, Minjia Zhang, Dong Li, and Yuxiong He. "ZeRO-Offload: Democratizing Billion-Scale Model Training". In the USENIX Annual Technical Conference, 2021
- [EuroSys'21] Zhen Xie, Wenqian Dong, Jiawen Liu, Hang Liu, and Dong Li. "Tahoe: Tree Structure-Aware High Performance Inference Engine for Decision Tree Ensemble on GPU". In the European Conference on Computer Systems, 2021
- [FAST'21] Kai Wu, Jie Ren, Ivy Peng, and Dong Li. "ArchTM: Architecture-Aware, High Performance Transaction for Persistent Memory". In the 19th USENIX Conference on File and Storage Technologies, 2021
- [ASPLOS'21] Bang Di, Jiawen Liu, Hao Chen, and Dong Li. "Fast, Flexible and Comprehensive Bug Detection for Persistent Memory Programs". In the 26th International Conference on Architectural Support for Programming Languages and Operating Systems, 2021 (Distinguished Artifact Award)
- [PPoPP'21] Jiawen Liu, Jie Ren, Roberto Gioiosa, Dong Li, and Jiajia Li. "Sparta: High-Performance, Element-Wise Sparse Tensor Contraction on Heterogeneous Memory". In the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021
- [HPCA'21] Jie Ren, Jiaolin Luo, Kai Wu, Minjia Zhang, Hyeran Jeon, and Dong Li. "Sentinel: Efficient Tensor Migration and Allocation on Heterogeneous Memory Systems for Deep Learning". In the 27th IEEE International Symposium on High-Performance Computer Architecture, 2021
- [ICS'21] Zhen Xie, Wenqian Dong, Jie Liu, Ivy Peng, Yanbao Ma, and Dong Li. "MD-HM: Memoization-based Molecular Dynamics Simulations on Big Memory System". In the 35th International Conference on Supercomputing, 2021
- [ICS'21] Xin He, Jiawen Liu, Zhen Xie, Hao Chen, Guoyang Chen, Weifeng Zhang, and Dong Li. "Enabling Energy-Efficient DNN Training on Hybrid GPU-FPGA Accelerators". In the 35th International Conference on Supercomputing, 2021
- [ICS'21] Jie Ren, Jiaolin Luo, Ivy Peng, Kai Wu, and Dong Li. "Optimizing Large-Scale Plasma Simulations on Persistent Memory-based Heterogeneous Memory with Effective Data Placement Across Memory Hierarchy". In the 35th International Conference on Supercomputing, 2021
- [ICS'21] Jiawen Liu, Dong Li, Roberto Gioiosa, and Jiajia Li. "Athena: High-Performance Sparse Tensor Contraction Sequence on Heterogeneous Memory". In the 35th International Conference on Supercomputing, 2021
- [NeurIPS'20] Jie Ren, Minjia Zhang, and Dong Li. "HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory". In the 34th Conference on Neural Information Processing Systems, 2020
- [MICRO'18] Jiawen Liu, Hengyu Zhao, Matheus Ogleari, Dong Li, and Jishen Zhao. "Processing-in-Memory for Energy-efficient Neural Network Training: A Heterogeneous Approach". In the 51st IEEE/ACM International Symposium on Microarchitecture, 2018
News
- [6/2021] A paper “Fauce: Fast and Accurate Deep Ensembles with Uncertainty for Cardinality Estimation” is accepted to VLDB'21.
- [5/2021] An NSF grant is funded to support our research on big memory for HPC.
- [4/2021] Our work on training billion-scale NLP models on heterogeneous memory is accepted to USENIX ATC'21! This is a collaboration with Microsoft.
- [4/2021] Our ASPLOS'21 paper won the Distinguished Artifact Award! Only two papers received this award.
- [3/2021] Four papers are accepted to ICS'21!
- [3/2021] Our collaboration with LLNL on an MPI fault tolerance benchmark suite was reported by HPCWire.
- [3/2021] Wenqian got an internship offer! She will go to HP Labs this summer to work on scientific machine learning.
- [1/2021] Our collaboration with Microsoft on training large NLP models with heterogeneous memory drew some attention from the media (see 1 and 2). :)
- [1/2021] A paper “Tahoe: Tree Structure-Aware High Performance Inference Engine for Decision Tree Ensemble on GPU” is accepted to EuroSys'21.
- [12/2020] Shuangyan Yang joined us as a PhD student. Welcome aboard, Shuangyan.
- [12/2020] A paper “ArchTM: Architecture-Aware, High Performance Transaction for Persistent Memory” is accepted to FAST'21!
- [12/2020] Welcome Yan Li from Western Digital to visit us. She gave a talk, “NAND Flash and its Application”.
- [11/2020] A paper “Fast, Flexible and Comprehensive Bug Detection for Persistent Memory Programs” is accepted to ASPLOS'21!
- [11/2020] A paper “Sparta: High-Performance, Element-Wise Sparse Tensor Contraction on Heterogeneous Memory” is accepted to PPoPP'21!
- [10/2020] A paper “Sentinel: Efficient Tensor Migration and Allocation on Heterogeneous Memory Systems for Deep Learning” is accepted to HPCA'21!
- [9/2020] Welcome Dr. Zhao Zhang from Texas Advanced Computing Center! He will give us a talk virtually, "Scalable Deep Learning on Supercomputers".
- [9/2020] A paper “HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory” is accepted to NeurIPS'20!
- [9/2020] Congratulations to Jiawen on his internship at Facebook Research!
- [8/2020] The lab has a new website! :)
- [8/2020] A paper “MATCH: An MPI Fault Tolerance Benchmark Suite” is accepted to IISWC'20.
- [7/2020] A paper “Exploring Non-Volatility of Non-Volatile Memory for High Performance Computing Under Failures” is accepted to Cluster'20.
- [7/2020] A paper “Ribbon: High Performance Cache Line Flushing for Persistent Memory” is accepted to PACT'20.
- [6/2020] A paper “Smart-PGSim: Using Neural Network to Accelerate AC-OPF Power Grid Simulation” is accepted to SC'20.
- [5/2020] Dong is promoted to associate professor with tenure.
- [3/2020] Congratulations to Jie Ren, Kai, Jie Liu, Jiawen, and Wenqian on their summer internships at Microsoft Research, ByteDance, Futurewei, and PNNL!
- [3/2020] A paper “Real-time Incremental Learning with Approximate Nearest Neighbor on Mobile Devices” is accepted to USENIX OpML'20.
- [2/2020] A paper “Flame: A Self-Adaptive Auto-Labeling System for Heterogeneous Mobile Processors” is accepted to the On-Device Intelligence Workshop at MLSys'20.
- [1/2020] Dong was invited to join IEEE Transactions on Parallel and Distributed Systems (TPDS) Review Board.
- [1/2020] Dong was invited to participate in the SIAM Parallel Processing panel.
Professional Service
- PC member: ICS'21, SC'21, ICML'21, HPDC'21 (student program chair), IPDPS'21, IPDPS'20 (primary PC), NeurIPS'20, HPDC'18-20 (travel chair), ICPP'19-20, SC'18, ASPLOS'18 (shadow PC), Cluster'18 (poster chair), IPDPS'17, NAS (2016-2017), SC'15, CCGrid (2012-2018), Cluster (2015-2020), ISC (2013-2016), etc.
- External reviewers: PPoPP'14, SC'13, Euro-par'12, HPDC'11, etc.
- Editorial positions: TPDS Review Board member; co-editor, special issue of the Journal of High Performance Computing Applications for the Fourth International Workshop on Accelerators and Hybrid Exascale Systems
- PC (co-) chair: PASA (2012, 2013, and 2016), HP-PAC (2013 and 2014)
- Steering committee: PASA (2014 and 2015)
- Technical reviewer for major journals: IEEE Transactions on Parallel and Distributed Systems (TPDS), IEEE Transactions on Reliability (TR), IEEE Transactions on Computers (TC), Journal of Parallel and Distributed Computing (JPDC), Journal of Supercomputing, International Journal of High Performance Computing, etc.
Current Group Members
- Kai Wu (PhD student, since Summer 2016)
- Jiawen Liu (PhD student, since Fall 2017)
- Wenqian Dong (PhD student, since Fall 2017)
- Jie Ren (PhD student, since Fall 2017)
- Jie Liu (PhD student, since Spring 2019)
- Shuangyan Yang (PhD student, since Spring 2021)
- Neelam (Master's student, since Fall 2018)
Alumni
- Luanzheng Guo (PhD student, graduated in Oct 2020. First employment: Pacific Northwest National Lab)
- Hanlin He (Master's student, graduated in 2018. First employment: Byton)
- Wei Liu (Master's student, graduated in 2017. First employment: Ctrip)
- Himanshu Pillai (Master's student, graduated in 2016. First employment: Barcelona Supercomputing Center)
- Armando Montanez (Undergraduate student, graduated in 2018 as a UC Merced outstanding student. Joined Google)
- Jing Liang (Undergraduate student, graduated in 2017)
- Nigel Tan (Undergraduate student, graduated in 2017. Joined Rice as a PhD student)
- Hanlin He (Undergraduate student, graduated in 2016 as a UC Merced outstanding student. Joined PASA as a PhD student)
- Zachary Canann (Undergraduate student, graduated in 2016 as a UC Merced outstanding student. First employment: PayPal)
- Kevin Song (Undergraduate student, graduated in 2015 as a UC Merced outstanding student. Joined UT Austin as a PhD student)
Sponsors
- National Science Foundation
- Lawrence Livermore National Lab
- Argonne National Lab
- Lawrence Berkeley National Lab
- Intel (Equipment donation)
- Xilinx (Equipment donation)
- University of California, Merced