Retrieval-Augmented Generation paper: arxiv

Because models have finite memory and limited context windows, their generations often contain “hallucinations” and lack cohesion.

The idea of RAG is to combine a pretrained retriever with a seq2seq model and fine-tune the combination end to end.
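
To make the combination concrete, here is the marginalization as I understand it from the paper's RAG-Sequence variant. The retriever scores documents z for a query x, the generator produces y conditioned on x and each retrieved z, and the top-k documents are summed over. The symbols (η for retriever parameters, θ for generator parameters, d and q for the document and query encoders) follow my reading of the paper and should be checked against it.

```latex
p_{\text{RAG-Sequence}}(y \mid x) \approx
  \sum_{z \in \text{top-}k\left(p_\eta(\cdot \mid x)\right)}
  p_\eta(z \mid x) \prod_{i=1}^{N} p_\theta(y_i \mid x, z, y_{1:i-1}),
\qquad
p_\eta(z \mid x) \propto \exp\left(d(z)^\top q(x)\right)
```

Because the retrieval probability appears inside the likelihood, the training loss on y also sends gradient to the query encoder, which is what makes the fine-tuning end to end.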

Two core components are embeddings, which map text to vectors, and vector databases, which store those vectors and support nearest-neighbor search.
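
A minimal sketch of how these two pieces fit together, using only the Python standard library. Everything here is hypothetical and for illustration: `embed` is a toy hashed bag-of-words function standing in for a learned encoder, `ToyVectorStore` is an in-memory list standing in for a real vector database, and `build_prompt` shows one simple way retrieved passages could be handed to a generator. This is not the paper's implementation.

```python
import hashlib
import math

DIM = 64  # toy embedding size (an arbitrary assumption)


def embed(text):
    """Toy hashed bag-of-words embedding; a stand-in for a learned encoder."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        # md5 gives a hash that is stable across runs (unlike built-in hash()).
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # L2-normalize so that a dot product equals cosine similarity.
    return [v / norm for v in vec] if norm > 0 else vec


class ToyVectorStore:
    """In-memory stand-in for a vector database (brute-force search)."""

    def __init__(self):
        self.items = []  # list of (text, vector) pairs

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=2):
        """Return the k stored texts most similar to the query, best first."""
        q = embed(query)
        scored = [
            (text, sum(a * b for a, b in zip(q, vec)))
            for text, vec in self.items
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


def build_prompt(question, store, k=2):
    """Prepend the retrieved passages to the question for a generator."""
    passages = [text for text, _ in store.search(question, k)]
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    store = ToyVectorStore()
    store.add("RAG combines a retriever with a seq2seq generator")
    store.add("Vector databases store embeddings for similarity search")
    store.add("Context windows limit how much text a model can read")
    print(build_prompt("what do vector databases store", store))
```

A real system would swap `embed` for a trained encoder and the brute-force scan for an approximate nearest-neighbor index, but the flow stays the same: embed the documents, store the vectors, embed the query, retrieve the closest passages, and condition generation on them.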