Google's RAG Alternative: Retrieval Interleaved Generation (RIG)
Enhancing Large Language Models with Data Commons
Large Language Models (LLMs) have demonstrated remarkable capabilities in tasks such as question answering, text generation, and language translation. However, despite their impressive performance, LLMs are prone to generating factually incorrect information, especially when dealing with numerical and statistical data or other timely facts. This limitation stems from the probabilistic nature of LLMs and the lack of sufficient factual coverage in their training data.
To address this issue, researchers at Google have been exploring ways to integrate LLMs with Data Commons, a vast, open-source repository of public statistics from trusted organizations like the United Nations, the Centers for Disease Control and Prevention, and national census bureaus. Data Commons serves as a unified knowledge graph, organizing and standardizing data from hundreds of sources and making it universally accessible and useful.
By providing a reliable and comprehensive source of statistical information, Data Commons enables LLMs to ground their responses in factual data, reducing the risk of generating inaccurate or misleading information.
In this article, we will delve into two approaches for interfacing LLMs with Data Commons: Retrieval Interleaved Generation (RIG) and Retrieval Augmented Generation (RAG). We will explore the differences between these methods, highlighting their strengths, limitations, and potential for enhancing the factual accuracy of LLMs.
Understanding Retrieval Interleaved Generation (RIG)
RIG is a tool-use-inspired approach (akin to function calling) that aims to improve the factual accuracy of LLMs by integrating them with Data Commons. The core idea behind RIG is to fine-tune the LLM to produce natural language Data Commons queries alongside the statistical values it generates. In the image below, you can see the LLM making a function call via the "DC(question)" format.
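To make the interleaving concrete, here is a minimal sketch of the post-processing step such a pipeline implies. The exact markup the fine-tuned model emits is shown in the article's image; this sketch assumes an illustrative format in which the model wraps its own statistic (the Generated-SV) together with the query, e.g. `[DC("query") --> "generated value"]`. The retrieval backend is stubbed with a dictionary; a real system would call the Data Commons API.

```python
import re

# Stand-in for Data Commons: maps natural language queries to statistics.
# In a real pipeline this would be an API call, not a dictionary lookup.
STUB_DATA_COMMONS = {
    "what is the population of California": "about 39 million (2023)",
}

def retrieve(query: str):
    """Hypothetical Data Commons lookup; returns None on a miss."""
    return STUB_DATA_COMMONS.get(query)

# Assumed interleaved markup: [DC("<query>") --> "<Generated-SV>"]
DC_CALL = re.compile(r'\[DC\("([^"]+)"\)\s*-->\s*"([^"]+)"\]')

def ground_response(text: str) -> str:
    """Replace each Generated-SV with the retrieved Ground-Truth-SV,
    falling back to the model's own value when retrieval misses."""
    def substitute(m: re.Match) -> str:
        query, generated_sv = m.group(1), m.group(2)
        ground_truth = retrieve(query)
        return ground_truth if ground_truth is not None else generated_sv
    return DC_CALL.sub(substitute, text)

llm_output = ('California has a population of '
              '[DC("what is the population of California") --> "roughly 40 million"].')
print(ground_response(llm_output))
# → California has a population of about 39 million (2023).
```

Note the design choice this illustrates: because the query is emitted right next to the value it describes, grounding is a local string substitution rather than a separate retrieval pass over the whole prompt, which is what distinguishes RIG from RAG.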
The RIG pipeline consists of three main components:
Model Fine-tuning: The LLM is fine-tuned on an instruction-response dataset so that it generates a natural language Data Commons query alongside each statistical value it produces (a Generated-SV). Each query describes its Generated-SV and is used to retrieve the corresponding statistical value from Data Commons (the Ground-Truth-SV).
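The fine-tuning target can be pictured as instruction-response pairs whose responses already contain the interleaved queries. The field names and markup below are illustrative assumptions, not taken from the paper; the point is only that the training response pairs each Generated-SV with the query that describes it.

```python
# Hypothetical shape of a single RIG fine-tuning example.
# "instruction" is the user prompt; "response" is the target text the
# model learns to emit, with the assumed [DC("query") --> "value"] markup
# interleaved at the point where a statistic appears.
example = {
    "instruction": "How many people live in California?",
    "response": (
        'California has a population of '
        '[DC("what is the population of California") --> "roughly 40 million"].'
    ),
}

# The fine-tuning set would contain many such pairs covering different
# statistical variables, places, and dates.
dataset = [example]
print(len(dataset), '[DC("' in example["response"])
# → 1 True
```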