RAG Settings

Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) is a feature of the LLM Engine API that lets you store documents in CSV, PDF, or plain-text format and retrieve them as input for your queries. The storage is designed to be fast and scalable.

What is RAG?

RAG is an in-memory vector store to which you upload documents. Each document receives a unique identifier, which you pass in your API calls to use that document as input.

Supported Formats

The following formats are supported:

  • CSV
  • PDF
  • Plain text

File Size Limitation

The maximum file size for each document is 1 MB.
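
The format and size limits above can be checked locally before an upload. The following is a minimal sketch and not part of the LLM Engine API: the function name check_rag_file, the extension list (.csv, .pdf, .txt), and the reading of 1 MB as 1,048,576 bytes are all assumptions.

```shell
#!/bin/sh
# Hypothetical pre-upload check; not provided by the LLM Engine API.
# Assumption: 1 MB means 1,048,576 bytes. Adjust if the service counts 1,000,000.
MAX_BYTES=1048576

# Returns 0 if the file looks uploadable, 1 otherwise.
# The body runs in a subshell, so its variables do not leak to the caller.
check_rag_file() (
    file=$1

    # Reject anything that is not a regular file.
    [ -f "$file" ] || { echo "not a file: $file" >&2; exit 1; }

    # Accept only the three documented formats, judged by (lowercase) extension.
    case "$file" in
        *.csv|*.pdf|*.txt) ;;
        *) echo "unsupported format: $file" >&2; exit 1 ;;
    esac

    # wc -c gives the byte count; the arithmetic expansion strips any padding.
    size=$(($(wc -c < "$file")))
    [ "$size" -le "$MAX_BYTES" ] || { echo "too large: $file ($size bytes)" >&2; exit 1; }
)
```

For example, `check_rag_file notes.txt && echo "ok to upload"` prints the message only when the file passes all three checks.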

Number of Files

The number of files you can store depends on your subscription plan. Please refer to our Pricing page for more information.

How to Use RAG

To use RAG, upload your documents to the LLM Engine API through the provided interface. Each document is assigned a unique identifier; pass that identifier in the user_store_ids field of a query to use the document as input.

Example Usage

Here's an example of a query that uses a stored document. The user_store_ids field takes one or more document IDs as a comma-separated string, for example "1,2,3":

curl --location --request POST 'https://llmengine.ai/query' \
--header 'X-API-Key: YOUR_API_KEY' \
--header 'User-Agent: Apidog/1.0.0 (https://apidog.com)' \
--header 'Content-Type: application/json' \
--data-raw '{
    "query": "When did humanity begin building its first civilizations?",
    "model": "llama31_8b_awq",
    "session_reset": false,
    "user_store_ids": "1",
    "model_options": {
        "temperature": 1.0,
        "max_tokens": 200
    }
}'
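
When a query should draw on several documents, the request body can be assembled from a list of IDs instead of being typed by hand. The sketch below is one possible way to do that. The helper names join_ids, build_payload, and query_with_docs and the LLM_ENGINE_API_KEY environment variable are hypothetical; the endpoint, header, and field names are taken from the example above. It does no JSON escaping, so it assumes the query text contains no double quotes or backslashes.

```shell
#!/bin/sh
# Join document IDs with commas, the form the user_store_ids field takes.
join_ids() (
    IFS=,
    printf '%s' "$*"
)

# Build the JSON request body.
# $1 is the query text; every remaining argument is a document ID.
# Assumption: the query needs no JSON escaping (no " or \ characters).
build_payload() (
    query=$1
    shift
    ids=$(join_ids "$@")
    printf '{"query": "%s", "model": "llama31_8b_awq", "session_reset": false, "user_store_ids": "%s", "model_options": {"temperature": 1.0, "max_tokens": 200}}\n' "$query" "$ids"
)

# Send the query. Reads the API key from the (hypothetical) environment
# variable LLM_ENGINE_API_KEY.
query_with_docs() {
    curl --location --request POST 'https://llmengine.ai/query' \
        --header "X-API-Key: ${LLM_ENGINE_API_KEY}" \
        --header 'Content-Type: application/json' \
        --data-raw "$(build_payload "$@")"
}
```

A call such as `query_with_docs 'When did humanity begin building its first civilizations?' 1 2 3` would send the same request as the example above, with user_store_ids set to "1,2,3".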

Use Cases

RAG is a versatile feature that can be used in a wide range of applications, including:

  • AI Agents: Store documents with instructions on how to reply to user queries.
  • Coding Assistants: Store codebases and use them as input for your API calls.

We hope this helps you get started with using RAG! If you have any questions or need further assistance, please don't hesitate to contact us.