Shuai Zheng
- Cofounder
- Boson AI
- 4677 Old Ironsides Dr
- Santa Clara, CA 95054, US
- shuai@boson.ai
- GitHub
About Me
I am building the next generation of foundation models and LLM-powered products at Boson.ai to make AI more accessible to the world. The company is still in stealth mode. Stay tuned for what will be revealed soon!
In 2019 I received my doctoral degree in computer science from the Hong Kong University of Science and Technology. After that, I worked as a scientist at Amazon Web Services until 2023, where I led the distributed systems and LLM training efforts across Amazon. These included scalable distributed training and inference infrastructure, more intelligent models with hundreds of billions of parameters, and faster distributed optimization algorithms.
We at Boson.ai are hiring full-time machine learning engineers and researchers to build LLMs and their applications. Drop me a line if you are interested and want to know more.
Research Interests
- Distributed Systems
- Large-scale Distributed Algorithms
- Deep Learning
- Natural Language Processing
Work Experience
- Cofounder, Boson AI, Santa Clara, CA, USA, Mar 2023 - Present
- Senior Applied Scientist, AWS Deep Learning, Amazon AI, East Palo Alto, CA, USA, Sep 2019 - Feb 2023
- Applied Scientist Intern, AWS Deep Learning, Amazon AI, East Palo Alto, CA, USA, Feb 2018 - Aug 2018
- Research Intern, VIPL Group, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China, Aug 2012 - Apr 2013
Open Source Software
- MXNet: A deep learning framework that mixes symbolic and imperative programming to maximize efficiency and productivity.
- Gluon NLP: GluonNLP is a toolkit that enables easy text preprocessing, dataset loading, and neural model building to help you speed up your Natural Language Processing (NLP) research.
- Slapo: Slapo is a schedule language for progressive optimization of large deep learning model training.
- MiCS: MiCS is a proprietary distributed system that enables the training of trillion-parameter language models on the public cloud. We upstreamed its implementation to DeepSpeed.