Shuai Zheng

Cofounder

4677 Old Ironsides Dr

Santa Clara, CA 95054, US

GitHub

About Me

At Boson.ai, I am building the next generation of foundation models and LLM-powered products to make AI more accessible to the world. The company is still in stealth mode. Stay tuned for what will be revealed soon!

In 2019, I received my doctoral degree in computer science from the Hong Kong University of Science and Technology. After that, I worked as a scientist at Amazon Web Services until 2023, where I led distributed systems and LLM training efforts across Amazon. These included scalable distributed training and inference infrastructure, more intelligent models with hundreds of billions of parameters, and faster distributed optimization algorithms.

We at Boson.ai are hiring full-time machine learning engineers and researchers to build LLMs and their applications. Drop me a line if you are interested and want to know more.

Research Interests

Distributed Systems
Large-scale Distributed Algorithms
Deep Learning
Natural Language Processing

Working Experience

Cofounder
Boson AI
Santa Clara, CA, USA, Mar 2023 - Present

Senior Applied Scientist
AWS Deep Learning, Amazon AI
East Palo Alto, CA, USA, Sep 2019 - Feb 2023

Applied Scientist Intern
AWS Deep Learning, Amazon AI
East Palo Alto, CA, USA, Feb 2018 - Aug 2018

Research Intern
VIPL Group, Institute of Computing Technology, Chinese Academy of Sciences
Beijing, China, Aug 2012 - Apr 2013

Open Source Software

MXNet: A deep learning framework that mixes symbolic and imperative programming to maximize efficiency and productivity.
GluonNLP: GluonNLP is a toolkit that enables easy text preprocessing, dataset loading, and neural model building to help you speed up your Natural Language Processing (NLP) research.
Slapo: Slapo is a schedule language for progressive optimization of large deep learning model training.
MiCS: MiCS is a proprietary distributed system that enables the training of trillion-parameter language models on the public cloud. We upstreamed its implementation to DeepSpeed.