A postdoctoral scholar is sought in the area of accelerating machine learning on heterogeneous parallel systems. Please see here for details.
One GRA/GTA position (starting Spring or Fall 2019) is available for a PhD student who is interested in optimizing machine learning workloads on distributed and parallel systems.
Dr. Dong Li has been an assistant professor in the Department of Electrical Engineering and Computer Science at the University of California, Merced, since 2015. He is the director of the Parallel Architecture, System, and Algorithm Lab (PASA). Previously (2011-2014), he was a research scientist at the Oak Ridge National Laboratory (ORNL). Before that, he earned his PhD in computer science from Virginia Tech. Dong received a CAREER Award from the U.S. National Science Foundation in 2016, a Berkeley Lab University Faculty Fellowship in 2016, and an ORNL/CSMD Distinguished Contributor Award in 2013. His SC'14 paper was nominated for the best student paper award. He is also the lead PI for the NVIDIA CUDA Research Center at UC Merced.
Dong's research focuses on high performance computing (HPC) and maintains strong relevance to computer systems. The core theme of his research is how to enable scalable and efficient execution of enterprise and scientific applications on increasingly complex large-scale parallel systems. His work innovates in runtime systems, architecture, performance modeling, and programming models, and it often coordinates software and hardware to address the fault tolerance and execution efficiency of large-scale parallel systems. He also investigates the impact of novel architectures (e.g., non-volatile memory and massively parallel accelerators) on the design of applications and runtimes. Dong's approach to research has always been application-driven. He believes that measuring and understanding the characteristics of realistic large-scale parallel applications is critical to identifying problems, provoking thought, and preventing fundamental design flaws.
In collaboration with industry, DOE laboratories, and universities, Dong has been involved in various DOE, DOD, and NSF projects related to HPC. His work has been published in a number of premier HPC conferences. He serves as a program committee member for a number of international conferences and workshops.
Selected Recent Publications (a complete list of publications can be found here)
- [PACT'18] Bang Di, Jianhua Sun, Hao Chen, and Dong Li. "GMOD: A Dynamic GPU Memory Overflow Detector". In 27th International Conference on Parallel Architectures and Compilation Techniques, 2018
- [MICRO'18] Jiawen Liu, Hengyu Zhao, Matheus Ogleari, Dong Li, and Jishen Zhao. "Processing-in-Memory for Energy-efficient Neural Network Training: A Heterogeneous Approach". In 51st IEEE/ACM International Symposium on Microarchitecture, 2018
- [SC'18] Kai Wu, Jie Ren, and Dong Li. "Runtime Data Management on Non-Volatile Memory-Based Heterogeneous Memory for Task Parallel Programs". In 30th ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, 2018
- [SC'18] Luanzheng Guo, Dong Li, Ignacio Laguna, and Martin Schulz. "FlipTracker: Understanding Natural Error Resilience in HPC Applications". In 30th ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, 2018
- [SC'17] Kai Wu, Yingchao Huang, and Dong Li. "Unimem: Runtime Data Management on Non-Volatile Memory-based Heterogeneous Main Memory". In 29th ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, 2017
- [8/2018] One paper on characterizing inherent application error resilience was accepted to SC'18.
- [7/2018] One paper on accelerating machine learning training on a processing-in-memory architecture was accepted to MICRO'18.
- [6/2018] One paper on detecting buffer overflows for co-running kernels on a single GPU was accepted to PACT'18.
- [6/2018] One paper on directing data placement in NVM for task-parallel programs was accepted to SC'18.
- [6/2018] One paper on modeling application resilience to errors at large scale was accepted to ICPP'18.
- [5/2018] Welcome Dr. Guoyang Chen (Alibaba research) to visit the PASA lab!
- [4/2018] Congratulations to Anzheng, Kai, Jie, and Jiawen! They will go to Lawrence Livermore National Lab and Oak Ridge National Lab as summer student interns.
- [4/2018] The PASA Lab received research funding from Lawrence Livermore National Lab to support its work on HPC resilience.
- [4/2018] Welcome Prof. Xia Ning (Indiana University - Purdue University Indianapolis) to visit the PASA lab!
- [2/2018] Two papers were accepted as posters at the Non-Volatile Memories Workshop.
- [11/2017] Congratulations to Wenqian, Jie, and Jiawen! They were selected as student volunteers for SC'17.
- [8/2017] Welcome four new PhD students, Wenqian Dong, Jiawen Liu, Jie Ren, and Letian Kang!
- [6/2017] Two papers were accepted to IEEE Cluster'17. One is about non-volatile memory crash consistency from the algorithm perspective (interesting!), and the other is about GPU performance modeling on heterogeneous memory systems.
- [6/2017] Our paper, Unimem: Runtime Data Management on Non-Volatile Memory-based Heterogeneous Main Memory, was accepted to SC'17.
- [6/2017] Our paper, Performance Evaluation and Modeling of HPC I/O on Non-Volatile Memory, was accepted to NAS'17.
- [6/2017] NSF funds our project on processing-in-memory.
- [2/2017] Two PhD students in the PASA lab received summer internships: Luanzheng Guo (Lawrence Livermore National Lab) and Kai Wu (Los Alamos National Lab). Congratulations, students!
- [11/2016] Our poster was nominated for best poster (2.9% of all poster submissions) at SC'16.
- [6/2016] We were selected as an NVIDIA CUDA Research Center with Dong as the lead PI.
- [6/2016] NSF funds our project on heterogeneous memory.
- PC member: SC'18, ASPLOS'18 (shadow PC), HPDC'18 (travel chair), Cluster'18 (poster chair), IPDPS'17, NAS (2016-2017), SC'15, CCGrid (2012-2017), Cluster (2015-2017), ISC (2013-2016), etc.
- External reviewer: PPoPP'14, SC'13, Euro-Par'12, HPDC'11, etc.
- Editorial position: co-editor, special issue of the Journal of High Performance Computing Applications for the Fourth International Workshop on Accelerators and Hybrid Exascale Systems
- PC (co-) chair: PASA (2012, 2013, and 2016), HP-PAC (2013 and 2014)
- Steering committee: PASA (2014 and 2015)
- Technical reviewer for major journals: IEEE Transactions on Parallel and Distributed Systems (TPDS), IEEE Transactions on Reliability (TR), IEEE Transactions on Computers (TC), Journal of Parallel and Distributed Computing (JPDC), Journal of Supercomputing, International Journal of High Performance Computing, etc.
- Luanzheng Guo (PhD student, since 2015 Fall)
- Kai Wu (PhD student, since 2016 Summer)
- Jiawen Liu (PhD student, since 2017 Fall)
- Letian Kang (PhD student, since 2017 Fall)
- Wenqian Dong (PhD student, since 2017 Fall)
- Jie Ren (PhD student, since 2017 Fall)
- Yaorong Fan (Master's student, since 2016 Fall)
- Hanlin He (Master's student, since 2017 Fall)
- Armando Montanez (REU student, 2017 Summer)
- Logan Smith (REU student, 2017 Summer)
- Wei Liu (Master's student, graduated in 2017. First employment: Ctrip)
- Himanshu Pillai (Master's student, graduated in 2016. First employment: Barcelona Supercomputing Center)
- Jing Liang (Undergraduate student, graduated in 2017. First employment: )
- Nigel Tan (Undergraduate student, graduated in 2017. Joined Rice as a PhD student)
- Hanlin He (Undergraduate student, graduated in 2016 as the UC Merced outstanding student. Joined PASA as a PhD student)
- Zachary Canann (Undergraduate student, graduated in 2016 as the UC Merced outstanding student. First employment: PayPal)
- Kevin Song (Undergraduate student, graduated in 2015 as the UC Merced outstanding student. Joined UT Austin as a PhD student)
Research Sponsors: Our research is generously supported by:
- National Science Foundation
- Lawrence Livermore National Lab
- Lawrence Berkeley National Lab
- NVIDIA (Equipment donation)
- Intel (Equipment donation)
- Xilinx (Equipment donation)
- University of California, Merced