Popularized through LLMs, notably by the GPT-3 paper.
See also: Section 7.1 of The Little Book of Deep Learning
That said, it should be thought of as an intelligence amplification system rather than an “artificial intelligence” system.
Scaling laws
Initial work from OpenAI
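As a rough illustration of what a scaling law expresses, the sketch below evaluates a power law of the form L(N) = (N_c / N)^alpha, the shape used for the parameter-count law in the OpenAI work (Kaplan et al., 2020). The constants are the approximate values reported there for non-embedding parameters and should be treated as illustrative, not exact. The function name `loss_from_params` is hypothetical.

```python
# Approximate constants for the parameter-count power law (Kaplan et al., 2020).
# Assumption: illustrative values only; N counts non-embedding parameters.
ALPHA_N = 0.076
N_C = 8.8e13


def loss_from_params(n_params, n_c=N_C, alpha=ALPHA_N):
    """Predicted loss L(N) = (n_c / N) ** alpha for a model with N parameters."""
    return (n_c / n_params) ** alpha


# Under a pure power law, doubling N scales the loss by a fixed factor 2 ** -alpha.
for n in (1e8, 1e9, 1e10):
    print(f"N = {n:.0e}  predicted loss = {loss_from_params(n):.3f}")
```

The point of the form is that each doubling of model size buys the same multiplicative reduction in loss, regardless of the starting size.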
Distributed serving of large models requires cost-efficient methods [1]
- Petals: a decentralized system that runs Llama 2 over the internet
Large world models
LWM: implementation of RingAttention
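The idea behind RingAttention can be sketched in a single process: the sequence is split into blocks, one per device. Each device keeps its query block and passes key/value blocks around a ring, accumulating attention with an online (running) softmax. This is a minimal pure-Python simulation, not LWM's code. It is non-causal, does not overlap communication with compute, and all names (`ring_attention`, `full_attention`, `rand_matrix`) are hypothetical.

```python
import math
import random


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def full_attention(q, k, v):
    """Reference: softmax(q k^T / sqrt(d)) v over the whole sequence (non-causal)."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        scores = [_dot(qi, kj) * scale for kj in k]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w * vj[c] for w, vj in zip(weights, v)) / total
                    for c in range(len(v[0]))])
    return out


def ring_attention(q, k, v, n_devices):
    """Simulate ring attention: each 'device' owns one query block and sees
    one key/value block per step while the K/V blocks rotate around the ring."""
    n, dv = len(q), len(v[0])
    assert n % n_devices == 0, "sequence length must divide evenly across devices"
    b = n // n_devices
    scale = 1.0 / math.sqrt(len(q[0]))
    q_blocks = [q[i * b:(i + 1) * b] for i in range(n_devices)]
    kv_blocks = [(k[i * b:(i + 1) * b], v[i * b:(i + 1) * b]) for i in range(n_devices)]
    # Running softmax state per device and query row: max score, denominator, numerator.
    run_max = [[-math.inf] * b for _ in range(n_devices)]
    denom = [[0.0] * b for _ in range(n_devices)]
    numer = [[[0.0] * dv for _ in range(b)] for _ in range(n_devices)]
    for _ in range(n_devices):
        for dev in range(n_devices):
            k_blk, v_blk = kv_blocks[dev]
            for r, qi in enumerate(q_blocks[dev]):
                for kj, vj in zip(k_blk, v_blk):
                    s = _dot(qi, kj) * scale
                    new_max = max(run_max[dev][r], s)
                    rescale = math.exp(run_max[dev][r] - new_max)  # 0.0 on first key
                    w = math.exp(s - new_max)
                    denom[dev][r] = denom[dev][r] * rescale + w
                    numer[dev][r] = [a * rescale + w * x
                                     for a, x in zip(numer[dev][r], vj)]
                    run_max[dev][r] = new_max
        # Ring step: every device hands its K/V block to the next device.
        kv_blocks = kv_blocks[-1:] + kv_blocks[:-1]
    return [[a / denom[dev][r] for a in numer[dev][r]]
            for dev in range(n_devices) for r in range(b)]


def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
q, k, v = rand_matrix(8, 4), rand_matrix(8, 4), rand_matrix(8, 4)
ref = full_attention(q, k, v)
out = ring_attention(q, k, v, n_devices=4)
max_err = max(abs(a - b) for ro, rr in zip(out, ref) for a, b in zip(ro, rr))
print("max abs difference vs full attention:", max_err)
```

Because each device only ever holds one key/value block at a time, the per-device memory depends on the block size rather than the full sequence length. That property is what makes very long contexts feasible.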