Charles H Martin
Calculated Content, Inc
15th January, 2025, 11:00AM - 12:00PM (GST)
Title: | WeightWatcher: Data Free Diagnostics for Deep Learning |
Abstract: | WeightWatcher is an open-source tool, based on research towards a new theory of learning, that can analyze a pre-trained and/or instruction fine-tuned LLM and provide diagnostic information on how well each individual layer has 'converged'. WeightWatcher is based on SOTA research into Why Deep Learning Works, using advanced techniques from statistical mechanics, random matrix theory, and strongly correlated systems. The theory formulates the DNN learning problem in terms of how the layers converge and, in doing so, provides new and unique layer quality metrics that engineers can use to build better LLMs. The tool and the theory can be used to detect which layers are (potentially) underfit or overfit, to set learning rates dynamically during training, to help compress LLMs, and to fine-tune models more efficiently and effectively. |
Bio: | Dr. Martin earned his PhD in theoretical chemistry from the University of Chicago under the mentorship of Professor Karl Freed, whose group is notable for producing Nobel laureates such as Bawendi (Quantum Dots, 2023) and Jumper (AlphaFold, 2024). He then held an NSF Postdoctoral Fellowship at UIUC. Dr. Martin continues to collaborate with UC Berkeley on foundational AI research and has developed the theory of Heavy-Tailed Self-Regularization (HTSR) and the newer SemiEmpirical Theory of Learning (SETOL). He is the creator and lead of the open-source WeightWatcher project, which has 150,000 downloads and uses the HTSR / SETOL theories to assess the quality of neural network layers. Throughout his career, Dr. Martin has applied his expertise to develop and implement machine learning systems across various industries, contributing to projects at organizations including Roche, France Telecom, GoDaddy, Aardvark (acquired by Google), eBay, eHow, Walmart, GLG, Barclays/BGI, and BlackRock. Beyond his professional endeavors, Dr. Martin offers scientific consulting to the Page family office at the Anthropocene Institute, providing insights on nuclear and quantum technologies with a focus on addressing climate change. His experience spans over 20 years in commercial data science, software engineering, and machine learning/AI. |