Efficient Low-Dimensional Compression for Deep Overparameterized Learning

June 25, 2024, 3:20 pm

While overparameterization in machine learning models offers great benefits for optimization and generalization, it also leads to increased computational requirements as model sizes grow. In this work, we show that by leveraging the inherent low-dimensional structure within model parameter updates, we can reap the benefits of overparameterization without the computational burden. We demonstrate the effectiveness of this approach in practice for deep low-rank matrix completion as well as for fine-tuning language models. On the theoretical side, for deep overparameterized low-rank matrix recovery, we show that the learning dynamics of each weight matrix are confined to an invariant low-dimensional subspace. Consequently, we can construct and train compact, highly compressed factorizations that possess the same benefits as their overparameterized counterparts. For language model fine-tuning, we introduce a method called "Deep LoRA", which improves on the existing low-rank adaptation (LoRA) technique, leading to reduced overfitting and a simplified hyperparameter setup while maintaining comparable efficiency. We validate the effectiveness of Deep LoRA on natural language understanding tasks, particularly in fine-tuning with a limited number of samples.
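The abstract names Deep LoRA but does not spell out its parameterization here. Below is a minimal PyTorch sketch assuming, as the "deep" in the name suggests, that the standard two-factor LoRA update ΔW = BA is deepened into a three-factor product ABC. The class name `DeepLoRALinear`, the factor shapes, the rank, and the initialization are hypothetical choices for illustration, not the authors' exact recipe.

```python
import torch
import torch.nn as nn


class DeepLoRALinear(nn.Module):
    """Hypothetical sketch of a Deep LoRA-style adapter: a frozen pretrained
    linear layer augmented with a deep low-rank update A @ B @ C (three
    factors) instead of LoRA's two-factor B @ A."""

    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        d_out, d_in = base.weight.shape
        # Three small trainable factors of width `rank` (shapes are assumptions).
        self.A = nn.Parameter(torch.randn(d_out, rank) / rank**0.5)
        self.B = nn.Parameter(torch.randn(rank, rank) / rank**0.5)
        # Zero init on the last factor => the update is zero at the start,
        # so fine-tuning begins from the pretrained model exactly.
        self.C = nn.Parameter(torch.zeros(rank, d_in))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta = self.A @ self.B @ self.C  # low-rank weight update, (d_out, d_in)
        return self.base(x) + x @ delta.T


# Usage: wrap a pretrained layer and train only the adapter factors.
layer = DeepLoRALinear(nn.Linear(768, 768), rank=8)
x = torch.randn(4, 768)
print(layer(x).shape)  # torch.Size([4, 768])
```

Zero-initializing one factor mirrors the common LoRA convention that the adapted layer matches the pretrained one before training, and since only the three small factors receive gradients, the trainable parameter count stays far below that of the full weight matrix.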
                            
                                                             Laura Balzano
Laura Balzano is an associate professor of Electrical Engineering and Computer Science, and of Statistics by courtesy, at the University of Michigan. She is a recipient of the NSF CAREER Award, the ARO Young Investigator Award, the AFOSR Young Investigator Award, and faculty fellowships from Intel and 3M. She received an MLK Spirit Award and the Vulcans Education Excellence Award at the University of Michigan. Her expertise is in statistical signal processing, matrix factorization, and optimization. Laura received a BS from Rice University, an MS from UCLA, and a PhD in Electrical and Computer Engineering from the University of Wisconsin-Madison.