Step 1 - Create Your RAG System
Complete guide to creating a RAG system using the RAG Wizard
Overview
The RAG Wizard guides you through a structured 5-step process to create a complete Retrieval-Augmented Generation (RAG) system. This wizard transforms the complex setup of document processing pipelines, embedding configurations, and retrieval mechanisms into an intuitive, guided workflow.
Entry Point: Dashboard → RAG → Create New
Expected Outcome: Fully configured RAG system ready for testing
Estimated Time: 15-30 minutes depending on document count
Wizard Steps Quick Reference
| Step | Title | What You Do | What Gets Created |
|---|---|---|---|
| 1 | Project Setup | Define use case, scale | Project configuration |
| 2 | Data Sources | Upload documents | Document library |
| 3 | Document Processing | Configure chunking | Processed chunks |
| 4 | Pipeline Configuration | Select embedding, retrieval | Search pipeline |
| 5 | API Setup | Generate API key | Ready-to-use endpoint |
Step 1: Project Setup
Project Name
Choose a descriptive name that clearly identifies your RAG system's purpose.
Example: "Customer Support Knowledge Base"
Tips:
- Use clear, descriptive names
- Avoid special characters
- This name appears in dashboards and API responses
Domain Selection
Select the primary domain that best describes your use case:
| Domain | Best For |
|---|---|
| Customer Support | FAQ systems, help desk automation |
| Research & Academia | Literature reviews, citation assistance |
| Content Creation | Writing assistance, content generation |
| Technical Documentation | API documentation, code explanations |
| Business Intelligence | Data analysis, report generation |
| Education | Tutoring systems, course assistance |
| Legal & Compliance | Document analysis, regulation compliance |
| Healthcare | Medical knowledge bases, patient information |
Your domain selection helps optimize default settings for your use case.
Use Case Specification
Define the specific use case within your chosen domain:
Examples:
- Customer Support: "Automated FAQ responses for SaaS product inquiries"
- Research: "Academic paper summarization and citation management"
- Technical Docs: "Developer API reference with contextual examples"
- Business: "Sales report analysis and trend identification"
Expected Scale
Configure your system for the anticipated load:
| Scale | Queries/Day | Configuration |
|---|---|---|
| Small | < 1,000 | Development/testing optimized |
| Medium | 1,000 - 10,000 | Production-ready |
| Large | > 10,000 | Enterprise-grade |
Query Complexity
Define the typical complexity of user queries:
| Complexity | Description | Example |
|---|---|---|
| Simple | Direct fact retrieval | "What is the return policy?" |
| Moderate | Multi-step reasoning | "How does pricing compare for enterprise?" |
| Complex | Advanced analysis | "Analyze Q3 policy impact on satisfaction" |
Response Type
Select the primary type of responses your system will generate:
| Response Type | Description |
|---|---|
| Factual Answers | Direct, concise responses |
| Explanatory Responses | Detailed explanations with context |
| Analytical Insights | Data interpretation and analysis |
| Creative Content | Content generation and writing |
Step 2: Data Sources
Supported File Formats
| Format | Extensions | Description |
|---|---|---|
| Text Documents | .txt, .md, .rtf | Plain text and markdown |
| PDF Documents | .pdf | Extractable text and scanned (OCR) |
| Office Documents | .docx, .xlsx, .pptx | Microsoft Office formats |
| Structured Data | .csv, .json, .xml | Tabular and structured data |
| Web Content | .html, .htm | HTML documents |
File Size Limits
- Individual File: Up to 25MB per document
- Batch Upload: Up to 100 files simultaneously
- Total Project Storage: Varies by subscription tier
Upload Methods
Drag & Drop:
- Drag files directly into the upload zone
- Visual feedback for file validation
- Automatic format detection
File Browser:
- Traditional file picker interface
- Multi-file selection support
- Folder structure preservation options
Document Processing
Upon upload, documents undergo automatic processing:
- Content Extraction - Text extraction from various formats
- Quality Assessment - Content quality scoring
- Metadata Extraction - Automatic metadata capture
Step 3: Document Processing
Chunking Strategy Overview
Chunking divides large documents into smaller, semantically meaningful pieces:
| Strategy | Best For | Description |
|---|---|---|
| Fixed-Size | Consistent processing | Divides text into predetermined sizes |
| Semantic | Topic coherence | Divides based on semantic boundaries |
| Recursive | Complex structures | Hierarchical chunking at multiple levels |
| Document-Based | Short documents | Treats entire documents as chunks |
Configuration Options
Chunk Size:
- Small (100-300 tokens): Precise retrieval, specific context
- Medium (300-800 tokens): Balance of precision and context
- Large (800-2048 tokens): Rich context, fewer total chunks
Overlap Percentage:
- Low (5-10%): Minimal redundancy, efficient storage
- Medium (10-20%): Recommended for most use cases
- High (20-50%): Maximum context preservation
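To make chunk size and overlap concrete, here is a minimal Python sketch of fixed-size chunking with fractional overlap. It operates on a plain token list as a stand-in for the wizard's actual tokenizer, and the numbers are illustrative only:

```python
def chunk_tokens(tokens, chunk_size=512, overlap_pct=0.15):
    """Split a token list into fixed-size chunks with fractional overlap."""
    overlap = int(chunk_size * overlap_pct)
    step = chunk_size - overlap  # tokens to advance between chunk starts
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # last chunk reached the end of the document
    return chunks

# Example: 1,000 placeholder "tokens", medium chunks with 15% overlap
tokens = [f"tok{i}" for i in range(1000)]
chunks = chunk_tokens(tokens, chunk_size=500, overlap_pct=0.15)
# With these settings the last 75 tokens of each chunk repeat at the
# start of the next one, so context spanning a boundary is preserved.
```

Higher overlap preserves more cross-boundary context at the cost of storing (and embedding) more redundant tokens.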
Advanced Processing Options
- Sentence Boundary Respect: Prevents mid-sentence breaks
- Context Coherence: Preserves conceptual relationships
- Section Header Preservation: Maintains document structure
- Content Filtering: Skips low-information content
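Sentence boundary respect can be sketched in a few lines: split on sentence-ending punctuation, then pack whole sentences into chunks. The regex below is a deliberate simplification of real sentence detection, and `max_chars` stands in for a token budget:

```python
import re

def sentence_chunks(text, max_chars=200):
    """Pack whole sentences into chunks, never breaking mid-sentence."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sent in sentences:
        if current and len(current) + 1 + len(sent) > max_chars:
            chunks.append(current)  # budget exceeded: start a new chunk
            current = sent
        else:
            current = f"{current} {sent}".strip() if current else sent
    if current:
        chunks.append(current)
    return chunks
```

Every chunk boundary falls between sentences, which is why this option trades slightly uneven chunk sizes for cleaner retrieval units.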
Step 4: Pipeline Configuration
Embedding Model Selection
| Model | Dimensions | Context Length | Best For |
|---|---|---|---|
| Stella-EN-1.5B-v5 | 1024D | 512 tokens | Excellent performance (71.19), 1.5B parameters, state-of-the-art |
| BGE-Large-EN-v1.5 | 1024D | 512 tokens | Solid performance (~64), enterprise-grade, English optimized |
| E5-Large-v2 | 1024D | 512 tokens | Good performance (~64), multilingual support, versatile |
| All-MPNet-Base-v2 | 768D | 384 tokens | General-purpose sentence embeddings, semantic search |
| BGE-M3 | 1024D | 8192 tokens | Multilingual support, 8K context, dense + sparse retrieval |
| Jina-Embeddings-v2-Base-EN | 768D | 8192 tokens | 8K context support, balanced performance, long documents |
| All-MiniLM-L6-v2 | 384D | 256 tokens | Optimized for speed, lightweight, quick processing, prototyping |
Best Practices for Model Selection:
- For Production Systems: Use `Stella-EN-1.5B-v5` or `BGE-Large-EN-v1.5` for best overall performance
- For Long Documents: Choose `BGE-M3` or `Jina-Embeddings-v2-Base-EN` with 8K context support
- For Prototyping: Start with `All-MiniLM-L6-v2` for fast iteration, then upgrade
- For Multilingual Content: `E5-Large-v2` or `BGE-M3` provide excellent multilingual support
- For Better Similarity Search: Higher-dimension models (1024D) generally provide better semantic discrimination than lower-dimension models (384D)
Similarity Methods
| Method | Use Case | Description |
|---|---|---|
| Cosine Similarity | Most common | Normalizes for vector magnitude |
| Euclidean Distance | Absolute magnitude | Intuitive distance measurement |
| Dot Product | Performance-critical | Fastest computation |
| Manhattan Distance | Noisy data | Robust to outliers |
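The difference between the two most common methods is easy to see in pure Python (no particular embedding model assumed): cosine similarity is the dot product normalized by both vector magnitudes, so it ignores vector length entirely:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Dot product normalized by both vector magnitudes."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Two vectors pointing in the same direction, one twice as long
v = [1.0, 2.0, 3.0]
w = [2.0, 4.0, 6.0]

print(cosine(v, w))  # ~1.0 -- identical direction, magnitude ignored
print(dot(v, w))     # 28.0 -- grows with vector length
```

This is why cosine similarity is the usual default: two chunks about the same topic score alike even if one embedding happens to have a larger norm.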
Retrieval Methods
| Method | Description | Best For |
|---|---|---|
| Custom Template | Simple template-based | Straightforward Q&A |
| Contextual Retrieval | LLM-enhanced context | Complex narratives |
| ML-Optimized | Multi-level summaries | Hierarchical content |
BM25 Hybrid Search
Enable when:
- Technical documents with proper nouns
- Code repositories with function names
- Legal documents with specific terms
Benefits:
- Combines exact term matching with semantic similarity
- Improves results for terminology-heavy content
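One common way to combine the two signals (illustrative only, not necessarily the wizard's internal fusion method) is to min-max normalize each score list and take a weighted sum per candidate chunk:

```python
def min_max(scores):
    """Scale a score list to [0, 1] so the two signals are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_scores(bm25, semantic, alpha=0.5):
    """Blend exact-match (BM25) and embedding scores per candidate chunk."""
    b, s = min_max(bm25), min_max(semantic)
    return [alpha * x + (1 - alpha) * y for x, y in zip(b, s)]

# Three candidate chunks: chunk 1 wins on exact terms, chunk 0 on semantics
bm25 = [0.8, 12.4, 3.1]
semantic = [0.81, 0.62, 0.78]
scores = hybrid_scores(bm25, semantic, alpha=0.5)
best = max(range(len(scores)), key=scores.__getitem__)
# chunk 2 is moderately good on both signals and wins the blend
```

The `alpha` weight is an assumption here; tilting it toward BM25 favors exact terminology matches, tilting it toward the embedding score favors semantic similarity.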
Step 5: API Setup
API Key Management
Primary API Key:
- Generated automatically upon completion
- Full access to all endpoints
- Regeneration available through dashboard
Key Security:
- Automatic key rotation options
- IP address whitelisting
- Rate limiting per key
- Immediate revocation capabilities
Authentication Methods
Bearer Token (Recommended):
```shell
curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the return policy?"}'
```

Header-based:

```shell
curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the return policy?"}'
```

Rate Limiting
| Tier | Requests/Day | Requests/Minute |
|---|---|---|
| Development | 1,000 | 10 |
| Production | 100,000 | 500 |
| Enterprise | Custom | Custom |
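The curl examples above can also be issued from Python using only the standard library. The endpoint, header, and payload mirror those examples, and `YOUR_API_KEY` remains a placeholder:

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder -- substitute your generated key
URL = "https://api.guidedmind.ai/v1/rag/your-project-id/query"

def build_request(question: str) -> urllib.request.Request:
    """Build a POST request with bearer-token auth and a JSON body."""
    payload = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def query_rag(question: str) -> dict:
    """Send the query and return the decoded JSON response body."""
    with urllib.request.urlopen(build_request(question)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keep the key out of source control; the dashboard's regeneration and revocation options above exist precisely so a leaked key can be retired quickly.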
For GraphRAG Users
Graph Schema Configuration
If you're setting up GraphRAG, additional configuration is available:
Graph Schema:
- Define entity categories (Person, Organization, Location)
- Define relationship types (WORKS_FOR, LOCATED_IN)
- Define entity properties (name, email, date)
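The three schema bullets above might be expressed as a simple structure like the following; the field names and layout are illustrative, since this guide does not specify the wizard's actual schema format:

```python
# Illustrative graph schema mirroring the bullets above; the wizard's
# real schema format may differ.
graph_schema = {
    "entity_categories": ["Person", "Organization", "Location"],
    "relationship_types": [
        {"name": "WORKS_FOR", "from": "Person", "to": "Organization"},
        {"name": "LOCATED_IN", "from": "Organization", "to": "Location"},
    ],
    "entity_properties": {
        "Person": ["name", "email"],
        "Organization": ["name", "date"],
    },
}
```

Typing relationships by their endpoint categories, as sketched here, is what lets graph generation reject nonsensical edges (e.g. a Location working for a Person).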
NER Analysis:
- Automatically identifies entities in documents
- Run on sample documents before full graph generation
- Populates schema with discovered entities
Graph Generation Settings
| Setting | Description | Recommended |
|---|---|---|
| Max Nodes | Control graph size | 100-200 for testing |
| Extraction Scenario | Domain-specific prompts | Business or Technical |
Validation and Completion
What Validation Occurs
At each step, the wizard validates:
- Step 1: All required fields completed, project name unique
- Step 2: At least one document uploaded successfully
- Step 3: Chunking strategy selected and configured
- Step 4: Embedding model and retrieval method selected
- Step 5: API key generated and secured
Completion Criteria
The wizard is complete when:
- ✅ All 5 steps completed successfully
- ✅ At least one document uploaded and processed
- ✅ API key generated
- ✅ Configuration validation passed
Next Step: Test Embedding Search
After completing the wizard, proceed to Step 2: Test Embedding Search to verify your embedding model produces good similarity scores before deployment.
What to Prepare
Before proceeding to Step 2:
- Prepare 5-10 test queries representing typical user questions
- Know your quality threshold (similarity score > 0.7 recommended)
- Have sample documents ready for verification
Tips for Success
- Be Specific in Project Setup: Detailed descriptions help optimize recommendations
- Upload Representative Documents: Include documents that represent your full content range
- Start with Medium Chunk Size: 512-768 tokens works for most use cases
- Test Multiple Embedding Models: Compare results before finalizing
- Enable BM25 for Technical Content: Improves results for terminology-heavy domains
