Step 1: Create Your RAG System

Complete guide to creating a RAG system using the RAG Wizard

Overview

The RAG Wizard guides you through a structured 5-step process to create a complete Retrieval-Augmented Generation (RAG) system. This wizard transforms the complex setup of document processing pipelines, embedding configurations, and retrieval mechanisms into an intuitive, guided workflow.

Entry Point: Dashboard → RAG → Create New

Expected Outcome: Fully configured RAG system ready for testing

Estimated Time: 15-30 minutes depending on document count

Wizard Steps Quick Reference

| Step | Title | What You Do | What Gets Created |
|------|-------|-------------|-------------------|
| 1 | Project Setup | Define use case, scale | Project configuration |
| 2 | Data Sources | Upload documents | Document library |
| 3 | Document Processing | Configure chunking | Processed chunks |
| 4 | Pipeline Configuration | Select embedding, retrieval | Search pipeline |
| 5 | API Setup | Generate API key | Ready-to-use endpoint |

Step 1: Project Setup

Project Name

Choose a descriptive name that clearly identifies your RAG system's purpose.

Example: "Customer Support Knowledge Base"

Tips:

  • Use clear, descriptive names
  • Avoid special characters or spaces
  • This name appears in dashboards and API responses

Domain Selection

Select the primary domain that best describes your use case:

| Domain | Best For |
|--------|----------|
| Customer Support | FAQ systems, help desk automation |
| Research & Academia | Literature reviews, citation assistance |
| Content Creation | Writing assistance, content generation |
| Technical Documentation | API documentation, code explanations |
| Business Intelligence | Data analysis, report generation |
| Education | Tutoring systems, course assistance |
| Legal & Compliance | Document analysis, regulation compliance |
| Healthcare | Medical knowledge bases, patient information |

Your domain selection helps optimize default settings for your use case.

Use Case Specification

Define the specific use case within your chosen domain:

Examples:

  • Customer Support: "Automated FAQ responses for SaaS product inquiries"
  • Research: "Academic paper summarization and citation management"
  • Technical Docs: "Developer API reference with contextual examples"
  • Business: "Sales report analysis and trend identification"

Expected Scale

Configure your system for the anticipated load:

| Scale | Queries/Day | Configuration |
|-------|-------------|---------------|
| Small | < 1,000 | Development/testing optimized |
| Medium | 1,000 - 10,000 | Production-ready |
| Large | > 10,000 | Enterprise-grade |

Query Complexity

Define the typical complexity of user queries:

| Complexity | Description | Example |
|------------|-------------|---------|
| Simple | Direct fact retrieval | "What is the return policy?" |
| Moderate | Multi-step reasoning | "How does pricing compare for enterprise?" |
| Complex | Advanced analysis | "Analyze Q3 policy impact on satisfaction" |

Response Type

Select the primary type of responses your system will generate:

| Response Type | Description |
|---------------|-------------|
| Factual Answers | Direct, concise responses |
| Explanatory Responses | Detailed explanations with context |
| Analytical Insights | Data interpretation and analysis |
| Creative Content | Content generation and writing |

Step 2: Data Sources

Supported File Formats

| Format | Extensions | Description |
|--------|------------|-------------|
| Text Documents | .txt, .md, .rtf | Plain text and markdown |
| PDF Documents | .pdf | Extractable text and scanned (OCR) |
| Office Documents | .docx, .xlsx, .pptx | Microsoft Office formats |
| Structured Data | .csv, .json, .xml | Tabular and structured data |
| Web Content | .html, .htm | HTML documents |

File Size Limits

  • Individual File: Up to 25MB per document
  • Batch Upload: Up to 100 files simultaneously
  • Total Project Storage: Varies by subscription tier
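The limits above can be checked client-side before uploading. The sketch below applies the 25MB per-file and 100-file batch limits from this page; the function name and error format are illustrative, not part of the product's API.

```python
# Illustrative pre-upload check against the limits above. The 25 MB and
# 100-file limits come from this page; everything else is a sketch.

MAX_FILE_BYTES = 25 * 1024 * 1024   # 25 MB per document
MAX_BATCH_FILES = 100               # files per batch upload
SUPPORTED_EXTENSIONS = {
    ".txt", ".md", ".rtf", ".pdf", ".docx", ".xlsx", ".pptx",
    ".csv", ".json", ".xml", ".html", ".htm",
}

def validate_batch(files):
    """files: list of (filename, size_in_bytes) pairs. Returns error strings."""
    errors = []
    if len(files) > MAX_BATCH_FILES:
        errors.append(f"batch has {len(files)} files; limit is {MAX_BATCH_FILES}")
    for name, size in files:
        ext = "." + name.rsplit(".", 1)[-1].lower() if "." in name else ""
        if ext not in SUPPORTED_EXTENSIONS:
            errors.append(f"{name}: unsupported format {ext or '(none)'}")
        if size > MAX_FILE_BYTES:
            errors.append(f"{name}: {size} bytes exceeds 25 MB limit")
    return errors
```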

Upload Methods

Drag & Drop:

  • Drag files directly into the upload zone
  • Visual feedback for file validation
  • Automatic format detection

File Browser:

  • Traditional file picker interface
  • Multi-file selection support
  • Folder structure preservation options

Document Processing

Upon upload, documents undergo automatic processing:

  1. Content Extraction - Text extraction from various formats
  2. Quality Assessment - Content quality scoring
  3. Metadata Extraction - Automatic metadata capture

Step 3: Document Processing

Chunking Strategy Overview

Chunking divides large documents into smaller, semantically meaningful pieces:

| Strategy | Best For | Description |
|----------|----------|-------------|
| Fixed-Size | Consistent processing | Divides text into predetermined sizes |
| Semantic | Topic coherence | Divides based on semantic boundaries |
| Recursive | Complex structures | Hierarchical chunking at multiple levels |
| Document-Based | Short documents | Treats entire documents as chunks |

Configuration Options

Chunk Size:

  • Small (100-300 tokens): Precise retrieval, specific context
  • Medium (300-800 tokens): Balance of precision and context
  • Large (800-2048 tokens): Rich context, fewer total chunks

Overlap Percentage:

  • Low (5-10%): Minimal redundancy, efficient storage
  • Medium (10-20%): Recommended for most use cases
  • High (20-50%): Maximum context preservation
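The interaction between chunk size and overlap can be sketched in a few lines. This is a minimal fixed-size chunker, not the product's implementation; whitespace splitting stands in for a real, model-specific tokenizer.

```python
# Minimal sketch of fixed-size chunking with overlap. Real tokenization is
# model-specific; whitespace splitting stands in for a tokenizer here.

def chunk_tokens(tokens, chunk_size=500, overlap_pct=0.15):
    """Split a token list into chunks of chunk_size with fractional overlap."""
    step = max(1, int(chunk_size * (1 - overlap_pct)))  # stride between chunk starts
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # last chunk already reaches the end of the document
    return chunks

tokens = "the quick brown fox jumps over the lazy dog".split()
# chunk_size=4 with 25% overlap -> stride of 3, so adjacent chunks share 1 token
chunks = chunk_tokens(tokens, chunk_size=4, overlap_pct=0.25)
```

Higher overlap shrinks the stride, so the same document yields more chunks and more duplicated tokens in storage.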

Advanced Processing Options

  • Sentence Boundary Respect: Prevents mid-sentence breaks
  • Context Coherence: Preserves conceptual relationships
  • Section Header Preservation: Maintains document structure
  • Content Filtering: Skips low-information content

Step 4: Pipeline Configuration

Embedding Model Selection

| Model | Dimensions | Context Length | Best For |
|-------|------------|----------------|----------|
| Stella-EN-1.5B-v5 | 1024D | 512 tokens | Excellent performance (71.19), 1.5B parameters, state-of-the-art |
| BGE-Large-EN-v1.5 | 1024D | 512 tokens | Solid performance (~64), enterprise-grade, English optimized |
| E5-Large-v2 | 1024D | 512 tokens | Good performance (~64), multilingual support, versatile |
| All-MPNet-Base-v2 | 768D | 384 tokens | General-purpose sentence embeddings, semantic search |
| BGE-M3 | 1024D | 8192 tokens | Multilingual support, 8K context, dense + sparse retrieval |
| Jina-Embeddings-v2-Base-EN | 768D | 8192 tokens | 8K context support, balanced performance, long documents |
| All-MiniLM-L6-v2 | 384D | 256 tokens | Optimized for speed, lightweight, quick processing, prototyping |

Best Practices for Model Selection:

  • For Production Systems: Use Stella-EN-1.5B-v5 or BGE-Large-EN-v1.5 for best overall performance
  • For Long Documents: Choose BGE-M3 or Jina-Embeddings-v2-Base-EN with 8K context support
  • For Prototyping: Start with All-MiniLM-L6-v2 for fast iteration, then upgrade
  • For Multilingual Content: E5-Large-v2 or BGE-M3 provide excellent multilingual support
  • For Better Similarity Search: Higher dimension models (1024D) generally provide better semantic discrimination than lower dimension models (384D)

Similarity Methods

| Method | Use Case | Description |
|--------|----------|-------------|
| Cosine Similarity | Most common | Normalizes for vector magnitude |
| Euclidean Distance | Absolute magnitude | Intuitive distance measurement |
| Dot Product | Performance-critical | Fastest computation |
| Manhattan Distance | Noisy data | Robust to outliers |
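The four measures in the table are all short formulas; a plain-Python sketch makes their differences concrete. Note that cosine and dot product score higher for more similar vectors, while the two distances score lower.

```python
import math

# The four similarity/distance measures from the table, in plain Python.
# Higher is "more similar" for cosine and dot product; lower is "more
# similar" for the two distances.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms

def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

# Cosine "normalizes for vector magnitude": a vector and a scaled copy
# of it score exactly 1.0, while the distance measures treat them as far apart.
a, b = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
```

This magnitude-invariance is why cosine similarity is the default choice for comparing embeddings.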

Retrieval Methods

| Method | Description | Best For |
|--------|-------------|----------|
| Custom Template | Simple template-based | Straightforward Q&A |
| Contextual Retrieval | LLM-enhanced context | Complex narratives |
| ML-Optimized | Multi-level summaries | Hierarchical content |

BM25 Keyword Search

Enable when:

  • Technical documents with proper nouns
  • Code repositories with function names
  • Legal documents with specific terms

Benefits:

  • Combines exact term matching with semantic similarity
  • Improves results for terminology-heavy content

Step 5: API Setup

API Key Management

Primary API Key:

  • Generated automatically upon completion
  • Full access to all endpoints
  • Regeneration available through dashboard

Key Security:

  • Automatic key rotation options
  • IP address whitelisting
  • Rate limiting per key
  • Immediate revocation capabilities

Authentication Methods

Bearer Token (Recommended):

```bash
curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the return policy?"}'
```

Header-based:

```bash
curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the return policy?"}'
```
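The same two authentication styles can be expressed as a small header-building helper for client code. Only the header names (`Authorization: Bearer` and `X-API-Key`) come from the examples above; the helper itself is an illustrative sketch, not a provided SDK.

```python
# Build request headers for the two auth methods shown in the curl examples.
# Illustrative helper; only the header names come from this page.

def auth_headers(api_key, method="bearer"):
    headers = {"Content-Type": "application/json"}
    if method == "bearer":
        headers["Authorization"] = f"Bearer {api_key}"   # recommended method
    elif method == "header":
        headers["X-API-Key"] = api_key                   # header-based method
    else:
        raise ValueError(f"unknown auth method: {method}")
    return headers
```

Pass the resulting dict to whatever HTTP client you use when POSTing to the query endpoint.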

Rate Limiting

| Tier | Requests/Day | Requests/Minute |
|------|--------------|-----------------|
| Development | 1,000 | 10 |
| Production | 100,000 | 500 |
| Enterprise | Custom | Custom |
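Clients can stay under the per-minute limits with a simple sliding-window throttle. This is a client-side sketch, not part of the platform; the per-minute figure matches the Development tier above, and the clock is passed in explicitly so behavior is deterministic.

```python
import collections

# Illustrative client-side throttle for the per-minute limits above
# (10/min on the Development tier). A deque of timestamps tracks recent
# calls; allow() answers whether another request fits in the 60s window.

class MinuteLimiter:
    def __init__(self, per_minute):
        self.per_minute = per_minute
        self.calls = collections.deque()

    def allow(self, now):
        # Drop calls that have aged out of the 60-second window.
        while self.calls and now - self.calls[0] >= 60.0:
            self.calls.popleft()
        if len(self.calls) < self.per_minute:
            self.calls.append(now)
            return True
        return False

limiter = MinuteLimiter(per_minute=10)
```

In real code you would call `allow(time.monotonic())` before each request and back off (or queue) when it returns False.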

For GraphRAG Users

Graph Schema Configuration

If you're setting up GraphRAG, additional configuration is available:

Graph Schema:

  • Define entity categories (Person, Organization, Location)
  • Define relationship types (WORKS_FOR, LOCATED_IN)
  • Define entity properties (name, email, date)
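The three pieces of a graph schema listed above can be collected into one structure. The example below uses the entity categories, relationship types, and properties named on this page, but the exact schema format the product expects may differ; treat this as an illustrative shape.

```python
# Illustrative GraphRAG schema built from the example categories on this
# page. The product's actual schema format may differ.

graph_schema = {
    "entity_categories": ["Person", "Organization", "Location"],
    "relationship_types": [
        {"type": "WORKS_FOR", "from": "Person", "to": "Organization"},
        {"type": "LOCATED_IN", "from": "Organization", "to": "Location"},
    ],
    "entity_properties": {
        "Person": ["name", "email"],
        "Organization": ["name", "date"],
        "Location": ["name"],
    },
}
```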

NER Analysis:

  • Automatically identifies entities in documents
  • Run on sample documents before full graph generation
  • Populates schema with discovered entities

Graph Generation Settings

| Setting | Description | Recommended |
|---------|-------------|-------------|
| Max Nodes | Control graph size | 100-200 for testing |
| Extraction Scenario | Domain-specific prompts | Business or Technical |

Validation and Completion

What Validation Occurs

At each step, the wizard validates:

  • Step 1: All required fields completed, project name unique
  • Step 2: At least one document uploaded successfully
  • Step 3: Chunking strategy selected and configured
  • Step 4: Embedding model and retrieval method selected
  • Step 5: API key generated and secured

Completion Criteria

The wizard is complete when:

  • ✅ All 5 steps completed successfully
  • ✅ At least one document uploaded and processed
  • ✅ API key generated
  • ✅ Configuration validation passed

After completing the wizard, proceed to Step 2: Test Embedding Search to verify your embedding model produces good similarity scores before deployment.

What to Prepare

Before proceeding to Step 2:

  1. Prepare 5-10 test queries representing typical user questions
  2. Know your quality threshold (similarity score > 0.7 recommended)
  3. Have sample documents ready for verification
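The quality-threshold check you will run in Step 2 amounts to flagging any test query whose best similarity score falls below 0.7. A minimal sketch, assuming your test harness can produce a best score per query (the scores below are made-up examples):

```python
# Flag test queries whose best similarity score falls below the
# 0.7 threshold recommended above. Scores here are illustrative.

THRESHOLD = 0.7

def flag_weak_queries(results, threshold=THRESHOLD):
    """results: dict mapping query -> best similarity score. Returns failing queries."""
    return [query for query, score in results.items() if score < threshold]

scores = {
    "What is the return policy?": 0.84,
    "How do refunds work for annual plans?": 0.62,
}
```

Queries that fail the threshold usually point at missing documents or a chunking/embedding configuration worth revisiting.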

Tips for Success

  1. Be Specific in Project Setup: Detailed descriptions help optimize recommendations
  2. Upload Representative Documents: Include documents that represent your full content range
  3. Start with Medium Chunk Size: 512-768 tokens works for most use cases
  4. Test Multiple Embedding Models: Compare results before finalizing
  5. Enable BM25 for Technical Content: Improves results for terminology-heavy domains