Mingchen Li


Electrical Engineering & Computer Science, University of Michigan, Ann Arbor


  • Ph.D. Candidate in Electrical and Computer Engineering. Advisor: Samet Oymak

Computer Science and Engineering, University of California, Riverside (UCR), USA


  • Ph.D. Candidate in Computer Science. Advisor: Samet Oymak
  • GPA: 3.93/4.0

School of Computer Science and Technology, Fudan University, China


  • Degree: Bachelor of Science in Information Security. Advisor: Yanqiu Chen


My research interests lie in statistical machine learning and optimization, including robust neural network training, network pruning, semi-supervised learning, and bilevel optimization. I am particularly interested in learning from noisy, sparsely labeled, or even unlabeled datasets. My current project explores LLM-based multi-agent systems in competitive environments.


Gradient descent with early stopping is provably robust to label noise for overparameterized neural networks.

Published at AISTATS 2020, 305 citations

  • Demonstrated that large neural networks can overfit label noise, which hurts test accuracy.
  • Proved that overparameterized neural networks become robust to label noise when trained with early stopping.
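The overfitting-versus-early-stopping tradeoff can be illustrated with a toy overparameterized least-squares problem (a minimal numpy sketch, not the paper's neural-network analysis; all dimensions and the noise level are hypothetical). Gradient descent drives the noisy training error toward zero, while the test error against the clean model is typically best at an intermediate step:

```python
import numpy as np

rng = np.random.default_rng(1)

# Overparameterized linear regression with heavy label noise (toy sizes).
n, p = 30, 200
X = rng.normal(size=(n, p)) / np.sqrt(p)
w_star = rng.normal(size=p)
y = X @ w_star + 2.0 * rng.normal(size=n)   # noisy training labels

# Fresh test set drawn from the same clean model.
X_test = rng.normal(size=(500, p)) / np.sqrt(p)
y_test = X_test @ w_star

lr = 1.0 / np.linalg.norm(X, 2) ** 2        # step size <= 1/L: monotone train loss
w = np.zeros(p)
train_errs, test_errs = [], []
for step in range(2000):
    r = X @ w - y
    train_errs.append(np.mean(r ** 2))
    test_errs.append(np.mean((X_test @ w - y_test) ** 2))
    w -= lr * X.T @ r                        # gradient step on 0.5 * ||Xw - y||^2

best = int(np.argmin(test_errs))
print(f"best test MSE at step {best}: {test_errs[best]:.3f}")
print(f"final test MSE:               {test_errs[-1]:.3f}")
```

Stopping at the step with the lowest held-out error plays the role of early stopping; running to convergence interpolates the noisy labels instead.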

FEDNEST: Federated Bilevel, Minimax, and Compositional Optimization

Published at ICML 2022 (oral, 2% acceptance rate)

  • Proposed FEDNEST, a federated gradient-based method for nested optimization problems.
  • Provided provable convergence rates for FEDNEST in the presence of heterogeneity and introduced variants for bilevel, minimax, and compositional optimization. Experiments on hyperparameter optimization and hyper-representation learning demonstrate the practical benefits of FEDNEST.

AutoBalance: Optimized Loss Functions for Imbalanced Data

Published at NeurIPS 2021

  • Established a bilevel optimization framework that automatically designs a training loss function to optimize a blend of accuracy and fairness-seeking objectives on long-tailed and group-imbalanced data.
  • Demonstrated through extensive evaluations that the resulting personalized treatment of class/group-imbalanced datasets outperforms state-of-the-art approaches.

Generalization Guarantees for Neural Architecture Search with Train-Validation Split.

Published at ICML 2021

  • Demonstrated that the validation loss on a near-minimal validation set can be indicative of the true test loss. The theory is established for continuous search spaces, which are relevant to popular differentiable NAS methods.
  • Established generalization bounds for NAS problems with an emphasis on an activation search problem. Proved that the train-validation procedure can recover the best architecture even when the model overfits the training data.

Generalization, Adaptation and Low-Rank Representation in Neural Networks.

Published at the Asilomar Conference, 61 citations

  • Demonstrated that the Jacobian of a neural network exhibits low-rank structure, with a few large singular values and many small ones, leading to a low-dimensional information space.
  • Proved that learning over the information space (large singular values) is fast and generalizes well, while learning over the nuisance space (smaller singular values) can impede optimization and generalization.
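The low-rank Jacobian structure can be observed on a toy network (a numpy sketch using a finite-difference Jacobian; the one-hidden-layer ReLU architecture and all sizes are hypothetical, not the paper's setup). Rows of the Jacobian index samples, columns index parameters, and the singular value spectrum exposes the information/nuisance split:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny one-hidden-layer ReLU network (illustrative sizes).
n, d, h = 20, 5, 16                    # samples, input dim, hidden width
X = rng.normal(size=(n, d))
W1_init = rng.normal(size=(h, d)) / np.sqrt(d)
w2_init = rng.normal(size=h) / np.sqrt(h)
theta = np.concatenate([W1_init.ravel(), w2_init])

def forward(t):
    """Network outputs for all n inputs, with parameters flattened into t."""
    W1 = t[:h * d].reshape(h, d)
    w2 = t[h * d:]
    return np.maximum(X @ W1.T, 0.0) @ w2   # shape (n,)

# Finite-difference Jacobian of outputs w.r.t. all parameters.
eps = 1e-6
f0 = forward(theta)
J = np.empty((n, theta.size))
for j in range(theta.size):
    t = theta.copy()
    t[j] += eps
    J[:, j] = (forward(t) - f0) / eps

s = np.linalg.svd(J, compute_uv=False)  # descending singular values
print("top-3 singular values:   ", s[:3])
print("bottom-3 singular values:", s[-3:])
```

The few dominant singular directions span the information space; the long tail of small singular values spans the nuisance space along which learning is slow.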


Student Researcher at Google LLC., RankLab team

2022 Sep-Nov

Software Engineer Intern at Google LLC., RankLab team

2022 Jun-Nov

  • Focused on noisy-label detection in imbalanced datasets.

Software Engineer Intern at Google LLC., YouTube Shorts team

2021 Jun-Nov

  • Improved the user profiling model to enhance the recommendation system on the YouTube Shorts ranking team.

GPU-Accelerated Deep Learning Framework: Mini-Caffe


  • Designed and implemented a user-friendly GPU-accelerated Caffe-like deep learning framework using C++ and CUDA, supporting Convolution, Fully Connected, ReLU, Local Response Normalization, and Batch Normalization layers. Source code available at https://github.com/DavyVan/MiniCaffe

Data Mining: Behavior-Based Software Malware Detection


  • Designed and implemented a novel feature extraction scheme consisting of behavior counting and PCA-based features for the dataset provided by Qihoo360 DataCon. Trained a neural network to identify malicious software.

Software Engineering Intern at Shanghai ShanCe Technologies Company Ltd.


  • Developed a visual trading system consisting of a backend server and a web interface, using C++, HTML, SQL, JavaScript, and the Flask framework, to deploy strategies and view stock and futures information on the website.




  • Machine learning programming with PyTorch, TensorFlow, and JAX; GPU programming.

Reviewer Experience

  • Reviewer of NeurIPS, ICML, AISTATS, ICLR, CVPR and KDD.

Teaching Experience

  • Teaching assistant for Deep Learning, Artificial Intelligence, and Data Science courses at UCR.


Dissertation Year Program (DYP) fellowship, UCR, CS Department 2022

Second-class scholarship of Fudan University (twice, ranked 4/32) 2015 & 2017

Third-class scholarship of Fudan University 2016