Step 1 - Create Your RAG System
Complete guide to creating a RAG system using the RAG Wizard
Overview
The RAG Wizard guides you through a structured 5-step process to create a complete Retrieval-Augmented Generation (RAG) system. This wizard transforms the complex setup of document processing pipelines, embedding configurations, and retrieval mechanisms into an intuitive, guided workflow.
Entry Point: Dashboard → RAG → Create New
Expected Outcome: Fully configured RAG system ready for testing
Estimated Time: 15-30 minutes depending on document count
Wizard Steps Quick Reference
| Step | Title | What You Do | What Gets Created |
|---|---|---|---|
| 1 | Project Setup | Define use case, scale | Project configuration |
| 2 | Data Sources | Upload documents | Document library |
| 3 | Document Processing | Configure chunking | Processed chunks |
| 4 | Pipeline Configuration | Select embedding, retrieval | Search pipeline |
| 5 | API Setup | Generate API key | Ready-to-use endpoint |
Step 1: Project Setup
Project Name
Choose a descriptive name that clearly identifies your RAG system's purpose.
Example: "Customer Support Knowledge Base"
Tips:
- Use clear, descriptive names
- Avoid special characters
- This name appears in dashboards and API responses
Domain Selection
Select the primary domain that best describes your use case:
| Domain | Best For |
|---|---|
| Customer Support | FAQ systems, help desk automation |
| Research & Academia | Literature reviews, citation assistance |
| Content Creation | Writing assistance, content generation |
| Technical Documentation | API documentation, code explanations |
| Business Intelligence | Data analysis, report generation |
| Education | Tutoring systems, course assistance |
| Legal & Compliance | Document analysis, regulation compliance |
| Healthcare | Medical knowledge bases, patient information |
Your domain selection helps optimize default settings for your use case.
Use Case Specification
Define the specific use case within your chosen domain:
Examples:
- Customer Support: "Automated FAQ responses for SaaS product inquiries"
- Research: "Academic paper summarization and citation management"
- Technical Docs: "Developer API reference with contextual examples"
- Business: "Sales report analysis and trend identification"
Expected Scale
Configure your system for the anticipated load:
| Scale | Queries/Day | Configuration |
|---|---|---|
| Small | < 1,000 | Development/testing optimized |
| Medium | 1,000 - 10,000 | Production-ready |
| Large | > 10,000 | Enterprise-grade |
Query Complexity
Define the typical complexity of user queries:
| Complexity | Description | Example |
|---|---|---|
| Simple | Direct fact retrieval | "What is the return policy?" |
| Moderate | Multi-step reasoning | "How does pricing compare for enterprise?" |
| Complex | Advanced analysis | "Analyze Q3 policy impact on satisfaction" |
Response Type
Select the primary type of responses your system will generate:
| Response Type | Description |
|---|---|
| Factual Answers | Direct, concise responses |
| Explanatory Responses | Detailed explanations with context |
| Analytical Insights | Data interpretation and analysis |
| Creative Content | Content generation and writing |
Step 2: Data Sources
Supported File Formats
| Format | Extensions | Description |
|---|---|---|
| Text Documents | .txt, .md, .rtf | Plain text and markdown |
| PDF Documents | .pdf | Extractable text and scanned (OCR) |
| Office Documents | .docx, .xlsx, .pptx | Microsoft Office formats |
| Structured Data | .csv, .json, .xml | Tabular and structured data |
| Web Content | .html, .htm | HTML documents |
File Size Limits
- Individual File: Up to 25MB per document
- Batch Upload: Up to 100 files simultaneously
- Total Project Storage: Varies by subscription tier
Upload Methods
Drag & Drop:
- Drag files directly into the upload zone
- Visual feedback for file validation
- Automatic format detection
File Browser:
- Traditional file picker interface
- Multi-file selection support
- Folder structure preservation options
Document Processing
Upon upload, documents undergo automatic processing:
- Content Extraction - Text extraction from various formats
- Quality Assessment - Content quality scoring
- Metadata Extraction - Automatic metadata capture
Step 3: Document Processing
Chunking Strategy Overview
Chunking divides large documents into smaller, semantically meaningful pieces:
| Strategy | Best For | Description |
|---|---|---|
| Fixed-Size | Consistent processing | Divides text into predetermined sizes |
| Semantic | Topic coherence | Divides based on semantic boundaries |
| Recursive | Complex structures | Hierarchical chunking at multiple levels |
| Document-Based | Short documents | Treats entire documents as chunks |
Configuration Options
Chunk Size:
- Small (100-300 tokens): Precise retrieval, specific context
- Medium (300-800 tokens): Balance of precision and context
- Large (800-2048 tokens): Rich context, fewer total chunks
Overlap Percentage:
- Low (5-10%): Minimal redundancy, efficient storage
- Medium (10-20%): Recommended for most use cases
- High (20-50%): Maximum context preservation
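To make chunk size and overlap concrete, here is a minimal Python sketch of fixed-size chunking with fractional overlap. It operates on a plain token list as a stand-in for the wizard's actual tokenizer, and the numbers are illustrative only:

```python
def chunk_tokens(tokens, chunk_size=512, overlap_pct=0.15):
    """Split a token list into fixed-size chunks with fractional overlap."""
    overlap = int(chunk_size * overlap_pct)
    step = chunk_size - overlap  # tokens to advance between chunk starts
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # last chunk reached the end of the document
    return chunks

# Example: 1,000 placeholder "tokens", medium chunks with 15% overlap
tokens = [f"tok{i}" for i in range(1000)]
chunks = chunk_tokens(tokens, chunk_size=500, overlap_pct=0.15)
# With these settings the last 75 tokens of each chunk repeat at the
# start of the next one, so context spanning a boundary is preserved.
```

Higher overlap preserves more cross-boundary context at the cost of storing (and embedding) more redundant tokens.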
Advanced Processing Options
- Sentence Boundary Respect: Prevents mid-sentence breaks
- Context Coherence: Preserves conceptual relationships
- Section Header Preservation: Maintains document structure
- Content Filtering: Skips low-information content
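Sentence boundary respect can be sketched in a few lines: split on sentence-ending punctuation, then pack whole sentences into chunks. The regex below is a deliberate simplification of real sentence detection, and `max_chars` stands in for a token budget:

```python
import re

def sentence_chunks(text, max_chars=200):
    """Pack whole sentences into chunks, never breaking mid-sentence."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sent in sentences:
        if current and len(current) + 1 + len(sent) > max_chars:
            chunks.append(current)  # budget exceeded: start a new chunk
            current = sent
        else:
            current = f"{current} {sent}".strip() if current else sent
    if current:
        chunks.append(current)
    return chunks
```

Every chunk boundary falls between sentences, which is why this option trades slightly uneven chunk sizes for cleaner retrieval units.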
Step 4: Pipeline Configuration
Embedding Model Selection
| Model | Dimensions | Context Length | Best For |
|---|---|---|---|
| Stella-EN-1.5B-v5 | 1024D | 512 tokens | Excellent performance (71.19), 1.5B parameters, state-of-the-art |
| BGE-Large-EN-v1.5 | 1024D | 512 tokens | Solid performance (~64), enterprise-grade, English optimized |
| E5-Large-v2 | 1024D | 512 tokens | Good performance (~64), multilingual support, versatile |
| All-MPNet-Base-v2 | 768D | 384 tokens | General-purpose sentence embeddings, semantic search |
| BGE-M3 | 1024D | 8192 tokens | Multilingual support, 8K context, dense + sparse retrieval |
| Jina-Embeddings-v2-Base-EN | 768D | 8192 tokens | 8K context support, balanced performance, long documents |
| All-MiniLM-L6-v2 | 384D | 256 tokens | Optimized for speed, lightweight, quick processing, prototyping |
Best Practices for Model Selection:
- For Production Systems: Use `Stella-EN-1.5B-v5` or `BGE-Large-EN-v1.5` for best overall performance
- For Long Documents: Choose `BGE-M3` or `Jina-Embeddings-v2-Base-EN` with 8K context support
- For Prototyping: Start with `All-MiniLM-L6-v2` for fast iteration, then upgrade
- For Multilingual Content: `E5-Large-v2` or `BGE-M3` provide excellent multilingual support
- For Better Similarity Search: Higher-dimension models (1024D) generally provide better semantic discrimination than lower-dimension models (384D)
Similarity Methods
| Method | Use Case | Description |
|---|---|---|
| Cosine Similarity | Most common | Normalizes for vector magnitude |
| Euclidean Distance | Absolute magnitude | Intuitive distance measurement |
| Dot Product | Performance-critical | Fastest computation |
| Manhattan Distance | Noisy data | Robust to outliers |
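The difference between the two most common methods is easy to see in pure Python (no particular embedding model assumed): cosine similarity is the dot product normalized by both vector magnitudes, so it ignores vector length entirely:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Dot product normalized by both vector magnitudes."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Two vectors pointing in the same direction, one twice as long
v = [1.0, 2.0, 3.0]
w = [2.0, 4.0, 6.0]

print(cosine(v, w))  # ~1.0 -- identical direction, magnitude ignored
print(dot(v, w))     # 28.0 -- grows with vector length
```

This is why cosine similarity is the usual default: two chunks about the same topic score alike even if one embedding happens to have a larger norm.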
Retrieval Methods
| Method | Description | Best For |
|---|---|---|
| Custom Template | Simple template-based | Straightforward Q&A |
| Contextual Retrieval | LLM-enhanced context | Complex narratives |
| ML-Optimized | Multi-level summaries | Hierarchical content |
BM25 Hybrid Search
Enable when:
- Technical documents with proper nouns
- Code repositories with function names
- Legal documents with specific terms
Benefits:
- Combines exact term matching with semantic similarity
- Improves results for terminology-heavy content
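One common way to combine the two signals (illustrative only, not necessarily the wizard's internal fusion method) is to min-max normalize each score list and take a weighted sum per candidate chunk:

```python
def min_max(scores):
    """Scale a score list to [0, 1] so the two signals are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_scores(bm25, semantic, alpha=0.5):
    """Blend exact-match (BM25) and embedding scores per candidate chunk."""
    b, s = min_max(bm25), min_max(semantic)
    return [alpha * x + (1 - alpha) * y for x, y in zip(b, s)]

# Three candidate chunks: chunk 1 wins on exact terms, chunk 0 on semantics
bm25 = [0.8, 12.4, 3.1]
semantic = [0.81, 0.62, 0.78]
scores = hybrid_scores(bm25, semantic, alpha=0.5)
best = max(range(len(scores)), key=scores.__getitem__)
# chunk 2 is moderately good on both signals and wins the blend
```

The `alpha` weight is an assumption here; tilting it toward BM25 favors exact terminology matches, tilting it toward the embedding score favors semantic similarity.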
Step 5: API Setup
API Key Management
Primary API Key:
- Generated automatically upon completion
- Full access to all endpoints
- Regeneration available through dashboard
Key Security:
- Automatic key rotation options
- IP address whitelisting
- Rate limiting per key
- Immediate revocation capabilities
Authentication Methods
Bearer Token (Recommended):
```shell
curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the return policy?"}'
```

Header-based:

```shell
curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the return policy?"}'
```

Rate Limiting
| Tier | Requests/Day | Requests/Minute |
|---|---|---|
| Development | 1,000 | 10 |
| Production | 100,000 | 500 |
| Enterprise | Custom | Custom |
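The curl examples above can also be issued from Python using only the standard library. The endpoint, header, and payload mirror those examples, and `YOUR_API_KEY` remains a placeholder:

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder -- substitute your generated key
URL = "https://api.guidedmind.ai/v1/rag/your-project-id/query"

def build_request(question: str) -> urllib.request.Request:
    """Build a POST request with bearer-token auth and a JSON body."""
    payload = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def query_rag(question: str) -> dict:
    """Send the query and return the decoded JSON response body."""
    with urllib.request.urlopen(build_request(question)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keep the key out of source control; the dashboard's regeneration and revocation options above exist precisely so a leaked key can be retired quickly.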
For GraphRAG Users
Graph Schema Configuration
If you're setting up GraphRAG, additional configuration is available:
Graph Schema:
- Define entity categories (Person, Organization, Location)
- Define relationship types (WORKS_FOR, LOCATED_IN)
- Define entity properties (name, email, date)
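The three schema bullets above might be expressed as a simple structure like the following; the field names and layout are illustrative, since this guide does not specify the wizard's actual schema format:

```python
# Illustrative graph schema mirroring the bullets above; the wizard's
# real schema format may differ.
graph_schema = {
    "entity_categories": ["Person", "Organization", "Location"],
    "relationship_types": [
        {"name": "WORKS_FOR", "from": "Person", "to": "Organization"},
        {"name": "LOCATED_IN", "from": "Organization", "to": "Location"},
    ],
    "entity_properties": {
        "Person": ["name", "email"],
        "Organization": ["name", "date"],
    },
}
```

Typing relationships by their endpoint categories, as sketched here, is what lets graph generation reject nonsensical edges (e.g. a Location working for a Person).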
NER Analysis:
- Automatically identifies entities in documents
- Run on sample documents before full graph generation
- Populates schema with discovered entities
Graph Generation Settings
| Setting | Description | Recommended |
|---|---|---|
| Max Nodes | Control graph size | 100-200 for testing |
| Extraction Scenario | Domain-specific prompts | Business or Technical |
Validation and Completion
What Validation Occurs
At each step, the wizard validates:
- Step 1: All required fields completed, project name unique
- Step 2: At least one document uploaded successfully
- Step 3: Chunking strategy selected and configured
- Step 4: Embedding model and retrieval method selected
- Step 5: API key generated and secured
Completion Criteria
The wizard is complete when:
- ✅ All 5 steps completed successfully
- ✅ At least one document uploaded and processed
- ✅ API key generated
- ✅ Configuration validation passed
Next Step: Test Embedding Search
After completing the wizard, proceed to Step 2: Test Embedding Search to verify your embedding model produces good similarity scores before deployment.
What to Prepare
Before proceeding to Step 2:
- Prepare 5-10 test queries representing typical user questions
- Know your quality threshold (similarity score > 0.7 recommended)
- Have sample documents ready for verification
Tips for Success
- Be Specific in Project Setup: Detailed descriptions help optimize recommendations
- Upload Representative Documents: Include documents that represent your full content range
- Start with Medium Chunk Size: 512-768 tokens works for most use cases
- Test Multiple Embedding Models: Compare results before finalizing
- Enable BM25 for Technical Content: Improves results for terminology-heavy domains
