Dr. Soufiane Hayou
Simons Institute for the Theory of Computing
Location: Technology Innovation Institute (TII), Yas Island
6 March 2024, 11:00 am – 12:00 pm (GST)
Title: Principled Scaling of Neural Networks
Abstract: Neural networks have achieved impressive performance in many applications, such as image and speech recognition and generation. State-of-the-art performance is usually achieved through a series of engineered modifications to existing neural architectures and their training procedures. A common feature of these systems is their large-scale nature: modern neural networks usually contain billions, if not hundreds of billions, of trainable parameters, and empirical evaluations generally support the claim that increasing the scale of a neural network (e.g. its width and depth) boosts performance if done correctly. However, given a neural network model, it is not straightforward to answer the crucial question: how do we adjust the training process as we scale the network? In this talk, I will show how we can leverage different mathematical results to efficiently scale and train neural networks, with applications in both pretraining and fine-tuning.
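To give a flavor of what "adjusting the training process as we scale" can mean in practice, here is a purely illustrative sketch, not the speaker's exact prescription: in maximal-update-style (muP) parametrizations, when training with an Adam-like optimizer, learning rates for weight matrices are roughly rescaled by base_width/width relative to a rate tuned at a small base width, while initialization follows the usual 1/sqrt(fan_in) scaling. The helper function below is hypothetical and only demonstrates that idea.

```python
import math

def scale_hparams(width: int, base_width: int = 256, base_lr: float = 1e-3):
    """Illustrative width-aware hyperparameter transfer (hypothetical helper).

    Assumes a muP-style rule for Adam-like optimizers: learning rates of
    weight matrices shrink like base_width/width, while the initialization
    standard deviation follows the usual 1/sqrt(fan_in) scaling.
    """
    ratio = width / base_width
    return {
        "init_std": 1.0 / math.sqrt(width),  # variance ~ 1/fan_in
        "matrix_lr": base_lr / ratio,        # tuned at base_width, transferred
    }

# Hyperparameters tuned once at a small base width transfer to wider models:
for w in (256, 1024, 4096):
    print(w, scale_hparams(w))
```

The design point this sketch illustrates is hyperparameter transfer: tune once at a cheap base width, then scale up without re-tuning, which is one of the practical payoffs of principled scaling rules.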
Bio: Soufiane Hayou obtained his PhD in statistics from the University of Oxford in 2021, having graduated from École Polytechnique in Paris before joining Oxford. During his PhD, he worked mainly on the theory of infinite-width neural networks, including the impact of hyperparameters on how 'geometric' information propagates through the network. He is currently a Researcher at the Simons Institute for the Theory of Computing, on leave from his Peng Tsu Ann Assistant Professorship in mathematics at the National University of Singapore. His current research focuses on the theory and practice of scaling neural networks.