

BLOOM is BigScience's 176-billion-parameter language model, currently being trained on 416 A100 GPUs of the Jean Zay public supercomputer. It uses a decoder-only architecture with 70 layers and 112 attention heads per layer, and it is a multilingual model trained on 46 languages, with ...
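The architecture figures above can be sanity-checked with a back-of-the-envelope parameter count. This sketch takes the layer and head counts from the text; the per-head dimension of 128 and the vocabulary size of 250,880 are assumptions, and the 12·L·d² estimate is the standard rough formula for a decoder-only transformer, not an exact accounting of BLOOM's weights.

```python
# Rough parameter count for a decoder-only transformer.
# From the text: 70 layers, 112 attention heads per layer.
# Assumed here: head dimension 128, vocabulary size 250,880.
n_layers, n_heads, head_dim, vocab = 70, 112, 128, 250_880
d_model = n_heads * head_dim  # hidden size: 112 * 128 = 14336

# Per layer: ~4*d^2 for attention (Q, K, V, output projections)
# plus ~8*d^2 for a 4x-wide MLP, giving the usual 12*d^2 estimate.
transformer_params = 12 * n_layers * d_model**2
embedding_params = vocab * d_model  # token embeddings (tied output head)

total = transformer_params + embedding_params
print(f"~{total / 1e9:.1f}B parameters")
```

Running this lands close to the quoted 176B total, which suggests the layer and head counts in the text are mutually consistent under these assumptions.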