Discussion about this post

Bala Subramanian:

Nice review of the various activities and practices!

Meng Li:

RankRAG instruction-tunes the LLM to simultaneously capture the relevance between a question and its contexts and to use the retrieved context to generate answers.

Phase 1: Supervised Fine-Tuning (SFT)

The language model is fine-tuned on a high-quality instruction-following mix, which includes conversation datasets, chain-of-thought datasets, long-form QA datasets, and LLM-generated instruction datasets (synthetic instructions). In the multi-turn conversation format, the conversation history between the user and the assistant serves as context, and the loss is computed only on the assistant's final response. A total of 128,000 SFT samples were used.
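For illustration, here is a minimal sketch of the loss-masking idea described above, assuming a Hugging Face-style causal LM. The model name, conversation text, and formatting are placeholders, not the paper's actual setup.

```python
# Sketch: compute SFT loss only on the assistant's final response,
# treating the earlier conversation turns purely as context.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")

history = (
    "User: What causes tides?\n"
    "Assistant: Mostly the Moon's gravity.\n"
    "User: Why are there two tides per day?\n"
    "Assistant:"
)
final_response = " Because the Earth rotates through two tidal bulges."

ctx_ids = tokenizer(history, return_tensors="pt").input_ids
resp_ids = tokenizer(final_response, add_special_tokens=False, return_tensors="pt").input_ids

input_ids = torch.cat([ctx_ids, resp_ids], dim=1)
labels = input_ids.clone()
labels[:, : ctx_ids.shape[1]] = -100  # mask context tokens; loss covers only the final response

loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()
```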

Phase 2: Unified Instruction Fine-Tuning for Ranking and Generation

The first phase of SFT gives the LLM basic instruction-following ability. However, its performance on RAG tasks is often suboptimal, because the LLM is not optimized for extracting answers from the retrieved context for a given question. RankRAG therefore instruction-tunes the LLM on both retrieval-augmented generation and context ranking instructions; the context ranking capability is especially important for selecting more relevant top-K contexts when the retriever is imperfect.
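As a rough illustration of what unified instruction fine-tuning could look like, the sketch below casts both tasks into one instruction format: context ranking as a True/False relevance judgment, and retrieval-augmented generation as answering from the top-K contexts. The prompt templates, the True/False framing, and the example data are assumptions for illustration, not the exact templates from the RankRAG paper.

```python
# Hypothetical sketch: mix ranking and generation instructions into one SFT set
# so a single model learns both to rank contexts and to answer from them.

def ranking_example(question: str, passage: str, is_relevant: bool) -> dict:
    """Context ranking framed as an instruction with a True/False answer (assumed template)."""
    prompt = (
        f"Question: {question}\n"
        f"Passage: {passage}\n"
        "Does the passage contain information that answers the question?"
    )
    return {"prompt": prompt, "response": "True" if is_relevant else "False"}

def rag_example(question: str, top_k_passages: list[str], answer: str) -> dict:
    """Retrieval-augmented generation: answer using the top-K retrieved contexts."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(top_k_passages))
    prompt = f"Contexts:\n{context}\n\nQuestion: {question}\nAnswer using the contexts above."
    return {"prompt": prompt, "response": answer}

# Both instruction types are blended into a single fine-tuning dataset.
unified_dataset = [
    ranking_example("Who wrote Hamlet?", "Hamlet is a tragedy by William Shakespeare.", True),
    ranking_example("Who wrote Hamlet?", "The Globe Theatre burned down in 1613.", False),
    rag_example("Who wrote Hamlet?", ["Hamlet is a tragedy by William Shakespeare."], "William Shakespeare"),
]
```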

