
Tong Zhao (赵曈)

Assistant Professor
Leading a stochastic optimization and reinforcement learning team
at the Institute of Computing Technology, Chinese Academy of Sciences.


📰News

Oct 30, 2024

I am seeking a postdoctoral position in an AI-related field. I have a multidisciplinary background in mathematics, AI, and high-performance computing, and offer the following advantages:

  1. Curiosity, Adaptability, and Self-Motivation. I am capable of working independently, eager to learn, and willing to keep a doctoral student's mindset. I aim to contribute to highly original research. I am also open to discussing ideas with group members and to helping guide PhD students.
  2. Computational Resources. Through my professional experience at the High Performance Computer Research Center, I have established good collaborative relationships with leading supercomputing companies, internet companies, and distributed training centers at universities and research institutes. If necessary, I can provide the computational resources our group requires.
  3. Relevant Theoretical Foundation. I obtained my Ph.D. from the School of Mathematical Sciences, with strong algorithm-analysis skills and the ability to learn new theories. My background covers deep learning optimizers, reinforcement learning, stochastic analysis, control and game theory, and PDEs.

In addition, almost all the personal information relevant to a postdoctoral application (referees, publications, etc.) can be found on this website. If you have any interest or questions, please feel free to contact me or refer to the FAQ section.


🎓Research

Primary: Optimization in Deep Learning

  • I have studied how momentum affects generalization and its relationship with the sharpness of the landscape.
  • I explored why the training performance of deep neural networks deteriorates with larger batch sizes, and how momentum can be introduced to improve the performance. I designed adaptive momentum methods to enhance training with large batch sizes.
  • Many optimizers perform well on small models (<100M parameters), but lag behind SGD on large models (>100M parameters), especially under memory and communication constraints. How can we design efficient and practical optimizers for large-scale models?
  • Kalman filtering is a classical algorithm in the control domain; I adapted it into a novel optimizer, which has shown promising results in fitting molecular potential energy surfaces. This algorithm has been incorporated into the DeePMD library for molecular dynamics simulations.
  • I have also dabbled in training neural networks for PDEs (partial differential equations). During my PhD in the Mathematics Department at Fudan University, I researched stochastic control and operations, focusing on stochastic partial differential equations and on solving forward-backward stochastic differential equations and the corresponding PDEs with deep learning.
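For readers unfamiliar with the momentum mechanism studied above, it can be sketched as classic heavy-ball SGD on a toy quadratic (a minimal illustration with arbitrary step size and momentum coefficient; this is not the adaptive momentum method from my papers):

```python
import numpy as np

def heavy_ball_step(w, v, grad, lr=0.1, beta=0.9):
    """One step of SGD with heavy-ball momentum:
    v_{t+1} = beta * v_t - lr * grad(w_t)
    w_{t+1} = w_t + v_{t+1}
    The velocity v accumulates past gradients, smoothing the trajectory.
    """
    v = beta * v - lr * grad(w)
    return w + v, v

# Minimize f(w) = ||w||^2 / 2, whose gradient is simply w.
w, v = np.ones(2), np.zeros(2)
for _ in range(200):
    w, v = heavy_ball_step(w, v, grad=lambda x: x)
# The iterate spirals in toward the minimizer at the origin.
```

Adaptive variants tune `beta` (and `lr`) during training rather than fixing them, which is one way to counteract the degradation seen at large batch sizes.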

Secondary: Reinforcement Learning

  • I designed a co-running scheduler for AI training tasks using reinforcement learning algorithms, fully utilizing the large scheduling window to achieve better performance than traditional (dynamic) schedulers.
  • I developed a distributed I/O scheduler for training tasks based on reinforcement learning. It is an interference-aware scheduler for concurrent reads and writes, using a neural network to predict the effective bandwidth under interference.
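The interference-aware idea can be sketched as follows (a toy illustration: the task names, solo bandwidths, and the stub interference model below are hypothetical stand-ins for the learned neural predictor in the actual scheduler):

```python
from itertools import combinations

def predicted_bandwidth(solo_bw_a, solo_bw_b):
    # Stub for the learned model: co-running I/O tasks each lose a
    # fraction of their solo bandwidth to interference (assumed 30%).
    interference = 0.3
    return (solo_bw_a + solo_bw_b) * (1 - interference)

# Hypothetical pending I/O tasks and their solo bandwidths in GB/s.
tasks = {"read_ckpt": 5.0, "write_log": 1.0, "read_data": 4.0}

# Greedily pick the pair of tasks whose co-run maximizes predicted
# aggregate bandwidth; an RL policy would learn this choice instead.
best = max(combinations(tasks, 2),
           key=lambda p: predicted_bandwidth(tasks[p[0]], tasks[p[1]]))
```

The real scheduler replaces the stub with a neural network trained on observed bandwidths and makes the pairing decision with a learned policy rather than exhaustive search.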

📑Selected Publications [full list]

    (*) denotes equal contribution, (†) denotes corresponding author

    2025

    1. IPDPS
      Large scale finite-temperature rt-TDDFT simulation with hybrid functional
      Rongrong Liu, Zhuoqiang Guo, Qiuchen Sha, Tong Zhao, Haibo Li, Wei Hu, Lijun Liu, Guangming Tan, and Weile Jia
      In IEEE International Parallel and Distributed Processing Symposium, 2025
    2. JCAM
      Backward error analysis of the Lanczos bidiagonalization with reorthogonalization
      Haibo Li, Guangming Tan, and Tong Zhao
      Journal of Computational and Applied Mathematics, 2025

    2024

    1. JCST
      10-million atoms simulation of first-principle package LS3DF
      Yujin Yan, Haibo Li, Tong Zhao, Lin-Wang Wang, Lin Shi, Tao Liu, Guangming Tan, Weile Jia, and Ninghui Sun
      Journal of Computer Science and Technology, 2024
    2. PPoPP
      Training one DeePMD model in minutes: A step towards online learning
      Siyu Hu, Tong Zhao, Qiuchen Sha, Enji Li, Xiangyu Meng, Lijun Liu, Lin-Wang Wang, Guangming Tan, and Weile Jia
      In Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024
    3. EuroPar
      Accelerating large-scale sparse LU factorization for RF circuit simulation
      Guofeng Feng, Hongyu Wang, Zhuoqiang Guo, Mingzhen Li, Tong Zhao, Zhou Jin, Weile Jia, Guangming Tan, and Ninghui Sun
      In European Conference on Parallel Processing, 2024

    2023

    1. SC
      Enhance the strong scaling of LAMMPS on Fugaku
      Jianxiong Li, Tong Zhao, Zhuoqiang Guo, Shunchen Shi, Lijun Liu, Guangming Tan, Weile Jia, Guojun Yuan, and Zhan Wang
      In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2023
    2. AAAI (Oral)
      RLEKF: An optimizer for deep potential with ab initio accuracy
      Siyu Hu*, Wentao Zhang*, Qiuchen Sha, Feng Pan, Lin-Wang Wang, Weile Jia, Guangming Tan, and Tong Zhao
      In Proceedings of the AAAI Conference on Artificial Intelligence, 2023

    2022

    1. CAM
      Limits of one-dimensional interacting particle systems with two-scale interaction
      Tong Zhao
      Chinese Annals of Mathematics, Series B, 2022