
Popularized through LLMs, starting with the GPT-3 paper (Brown et al., 2020).

It should be thought of as intelligence amplification rather than as an "artificial intelligence" system.

Scaling laws

Initial work from OpenAI: Kaplan et al. (2020), Scaling Laws for Neural Language Models (arXiv)
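
As a reminder of the shape these laws take, below is the power-law form reported in that paper; the exponents and constants are the approximate fitted values from Kaplan et al. (2020), quoted from memory.

```latex
% Approximate power laws from Kaplan et al. (2020).
% L = cross-entropy loss, N = non-embedding parameter count,
% D = dataset size in tokens. Constants are the paper's rough fits.
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N},
\quad \alpha_N \approx 0.076,\quad N_c \approx 8.8 \times 10^{13}

L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D},
\quad \alpha_D \approx 0.095,\quad D_c \approx 5.4 \times 10^{13}
```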

Distributed serving of large models requires cost-efficient methods.¹

  • Petals: a decentralized system that runs Llama 2 over the internet (see the sketch below)
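
A minimal client-side sketch of that idea, following the Petals README; the checkpoint name and prompt are placeholders, and `AutoDistributedModelForCausalLM` is Petals' distributed drop-in for the usual Transformers class.

```python
# Minimal Petals client sketch (based on the Petals README).
# The model's layers are served by a swarm of volunteer GPUs over the
# internet; only the embeddings and a few layers run locally.
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "meta-llama/Llama-2-70b-chat-hf"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("A decentralized LLM is", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```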

Large world models

LWM: an implementation of RingAttention
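
RingAttention shards the sequence across devices arranged in a ring: each device keeps its own query block, key/value blocks rotate around the ring, and the softmax is computed online, so no device ever materializes the full attention matrix. Below is a single-process NumPy simulation of that idea (a sketch of the mechanism, not the LWM code, which is written in JAX).

```python
import numpy as np

def ring_attention(q_blocks, k_blocks, v_blocks):
    """Single-host simulation of ring attention: "device" i keeps its
    query block; KV blocks rotate around the ring, so each device sees
    every block while storing only one at a time. The softmax is
    streamed (online log-sum-exp)."""
    n_dev = len(q_blocks)
    d = q_blocks[0].shape[-1]
    out = [np.zeros_like(q) for q in q_blocks]                  # unnormalized outputs
    row_max = [np.full(q.shape[0], -np.inf) for q in q_blocks]  # running max per query row
    row_sum = [np.zeros(q.shape[0]) for q in q_blocks]          # running softmax denominator

    k_ring, v_ring = list(k_blocks), list(v_blocks)
    for _ in range(n_dev):                 # n_dev rotations: every device sees every block
        for i in range(n_dev):             # each device attends to the block it holds now
            scores = q_blocks[i] @ k_ring[i].T / np.sqrt(d)
            new_max = np.maximum(row_max[i], scores.max(axis=-1))
            correction = np.exp(row_max[i] - new_max)   # rescale old accumulators
            p = np.exp(scores - new_max[:, None])
            out[i] = out[i] * correction[:, None] + p @ v_ring[i]
            row_sum[i] = row_sum[i] * correction + p.sum(axis=-1)
            row_max[i] = new_max
        k_ring = k_ring[-1:] + k_ring[:-1]  # pass KV blocks to the next device
        v_ring = v_ring[-1:] + v_ring[:-1]
    return [o / s[:, None] for o, s in zip(out, row_sum)]
```

The result matches full softmax attention over the concatenated sequence, but each device only ever holds one KV block, which is what lets the approach scale to very long (million-token) contexts.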

Visions

Bibliography

  • Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D. M., Wu, J., Winter, C., … Amodei, D. (2020). Language Models are Few-Shot Learners. arXiv preprint arXiv:2005.14165.

Note

  1. Distributed Inference and Fine-tuning of Large Language Models over the Internet (arXiv)