Retrieval-Augmented Generation
RAG is not memory; search or memory is better than naive RAG. Better retrieval performance ≠ better overall performance.
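For reference, "naive RAG" here can be read as: score every document against the query with a dot product, take the top-k, and paste them into the prompt. The sketch below is only an illustration under that assumption. The bag-of-words `embed` function is a toy stand-in for a real encoder, and `retrieve`, `build_prompt`, and the sample `docs` are hypothetical names and data, not from any library or from the sources linked on this page.

```python
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' (assumption: stands in for a real encoder)."""
    return Counter(text.lower().split())

def dot(a, b):
    """Sparse dot product between two term-count vectors."""
    return sum(a[t] * b[t] for t in a if t in b)

def retrieve(query, docs, k=2):
    """Rank documents by dot-product score against the query, return top-k."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: dot(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, docs, k=2):
    """Naive RAG: prepend the retrieved passages to the question."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Hypothetical sample corpus for illustration only.
docs = [
    "retrieval augmented generation adds retrieved passages to the prompt",
    "knowledge graphs store facts as entities and relations",
    "long context models can read many passages at once",
]

if __name__ == "__main__":
    print(build_prompt("what is retrieval augmented generation", docs, k=1))
```

Nothing in this pipeline checks whether the retrieved passage actually helps the generator, which is one way to read the note that better retrieval scores do not guarantee better final answers.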
Challenges
- retrieval latency
- system complexity
RAG Notion

RAG Usages


Beyond dot product

Stanford CS25: V3 I Retrieval Augmented Language Models
December 5, 2023
Douwe Kiela, Contextual AI
Language models have led to amazing progress, but they also have important shortcomings. One solution for many of these shortcomings is retrieval augmentation. I will introduce the topic, survey recent literature on retrieval augmented language models and finish with some of the main open questions.
More about the course can be found here: https://web.stanford.edu/class/cs25/
View the entire CS25 Transformers United playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM
https://www.youtube.com/watch?v=mE7IDf2SmJg&list=PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM&index=25

Intro of Retrieval Augmented Generation (RAG) and application demos
Introduction of Retrieval Augmented Generation, Jupyter Notebook three demos of Basic RAG, Sentence-window retrieval, Auto-merging…
https://medium.com/@henryhengluo/intro-of-retrieval-augmented-generation-rag-and-application-demos-c1d9239ababf

NVIDIA Research: RAG with Long Context LLMs
This blog post dives into NVIDIA’s recent study comparing retrieval-augmentation with and without long-context LLMs.
https://blog.llamaindex.ai/nvidia-research-rag-with-long-context-llms-7d94d40090c4

KBQA, Knowledge Graph
ML Blog - Improve ChatGPT with Knowledge Graphs
Leveraging knowledge graphs for LLMs using LangChain
https://mlabonne.github.io/blog/posts/Article_Improve_ChatGPT_with_Knowledge_Graphs.html
llama-recipes/recipes/use_cases/agents/langchain at main · meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization an...
https://github.com/meta-llama/llama-recipes/tree/main/recipes/use_cases/agents/langchain
Base models are better at retrieval than instruction-tuned models

Seonglae Cho
