Project Setup
Configure your RAG project's basic settings and requirements
The first step in creating a RAG system is defining your project's fundamental characteristics. The choices you make here drive every later configuration decision and help tune the system for your specific use case.
Basic Information
Project Name
Choose a descriptive name that clearly identifies your RAG system's purpose. This name will appear in dashboards, API endpoints, and documentation.
Best Practices:
- Use clear, descriptive names (e.g., "customer-support-kb", "research-assistant")
- Avoid spaces and special characters; use hyphens or underscores instead
- Keep it concise but meaningful
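The naming guidance above can be checked before a name reaches a dashboard or API endpoint. The following is a minimal sketch, not the product's actual rule: the lowercase-only convention, the 3 to 50 character limit, and both function names are assumptions made for illustration.

```python
import re

# Assumed rule (not taken from the product): lowercase letters, digits,
# hyphens, and underscores only; 3 to 50 characters; starts with a letter or digit.
NAME_PATTERN = re.compile(r"^[a-z0-9][a-z0-9_-]{2,49}$")


def slugify_project_name(raw: str) -> str:
    """Turn a free-form name into a hyphenated, endpoint-safe slug."""
    # Replace each run of disallowed characters (spaces, punctuation) with one hyphen.
    slug = re.sub(r"[^a-z0-9_-]+", "-", raw.strip().lower())
    # Drop separators left dangling at either end.
    return slug.strip("-_")


def is_valid_project_name(name: str) -> bool:
    """Return True if the name already follows the assumed naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


print(slugify_project_name("Customer Support KB"))
print(is_valid_project_name("Customer Support KB"))
```

Slugifying on input rather than rejecting outright lets users type a readable name while the stored identifier stays safe for URLs.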
Project Description
Provide a detailed description of your RAG system's purpose, target audience, and expected functionality. This helps with:
- Team collaboration and knowledge sharing
- Future maintenance and updates
- Configuration optimization recommendations
Domain Configuration
Domain Selection
Choose the primary domain that best describes your use case:
- Customer Support: FAQ systems, help desk automation, support ticket routing
- Research & Academia: Literature reviews, citation assistance, knowledge synthesis
- Content Creation: Writing assistance, content generation, editing support
- Technical Documentation: API documentation, code explanations, troubleshooting guides
- Business Intelligence: Data analysis, report generation, insight extraction
- Education: Tutoring systems, course assistance, learning materials
- Legal & Compliance: Document analysis, regulatory compliance, contract review
- Healthcare: Medical knowledge bases, patient information, clinical decision support
Use Case Specification
Define the specific use case within your chosen domain:
Examples by Domain:
- Customer Support: "Automated FAQ responses for SaaS product inquiries"
- Research & Academia: "Academic paper summarization and citation management"
- Technical Documentation: "Developer API reference with contextual examples"
- Business Intelligence: "Sales report analysis and trend identification"
Performance Requirements
Expected Scale
Configure your system for the anticipated load:
Small Scale (< 1,000 queries/day)
- Optimized for development and testing
- Single-instance deployment
- Basic caching strategies
Medium Scale (1,000 - 10,000 queries/day)
- Production-ready configuration
- Load balancing considerations
- Enhanced caching and optimization
Large Scale (> 10,000 queries/day)
- Enterprise-grade setup
- Distributed architecture support
- Advanced performance monitoring
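The three tiers above reduce to two thresholds on daily query volume. As a small sketch (the function name and the lowercase tier labels are illustrative, not part of the product), the mapping could look like this:

```python
def scale_tier(queries_per_day: int) -> str:
    """Map an expected daily query volume to one of the three scale tiers."""
    if queries_per_day < 0:
        raise ValueError("queries_per_day must be non-negative")
    if queries_per_day < 1_000:
        return "small"    # development and testing, single instance
    if queries_per_day <= 10_000:
        return "medium"   # production-ready, load balancing considered
    return "large"        # enterprise-grade, distributed architecture


for volume in (250, 1_000, 10_000, 50_000):
    print(volume, scale_tier(volume))
```

Note that exactly 1,000 and exactly 10,000 queries per day both fall in the medium tier, following the ranges as written.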
Query Complexity
Define the typical complexity of user queries:
Simple Queries
- Direct fact retrieval
- Single-topic questions
- Exact match searches
- Example: "What is the return policy?"
Moderate Queries
- Multi-step reasoning required
- Cross-document synthesis
- Conceptual understanding needed
- Example: "How does our pricing compare to competitors for enterprise customers?"
Complex Queries
- Advanced reasoning and analysis
- Multiple document correlation
- Context-dependent responses
- Example: "Analyze the impact of our Q3 policy changes on customer satisfaction trends"
Response Configuration
Response Type
Select the primary type of responses your system will generate:
Factual Answers
- Direct, concise responses
- Fact-based information retrieval
- Minimal interpretation or analysis
Explanatory Responses
- Detailed explanations with context
- Educational or instructional content
- Step-by-step guidance
Analytical Insights
- Data interpretation and analysis
- Trend identification and predictions
- Comparative assessments
Creative Content
- Content generation and writing assistance
- Creative interpretation of information
- Stylistic adaptations
Response Length Preferences
- Concise (< 100 words): Quick answers, key facts
- Balanced (100-300 words): Standard explanations with context
- Comprehensive (> 300 words): Detailed analysis and explanations
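To check whether generated responses actually match the chosen length preference, a whitespace-based word count is usually enough. This is a rough sketch: the function name is hypothetical, and splitting on whitespace is an assumption about how words are counted.

```python
def length_category(response: str) -> str:
    """Classify a response by word count using the three length bands."""
    word_count = len(response.split())  # whitespace-delimited words
    if word_count < 100:
        return "concise"
    if word_count <= 300:
        return "balanced"
    return "comprehensive"


sample = "Returns are accepted within 30 days of purchase."
print(length_category(sample))
```

Running this over a sample of real responses during evaluation shows whether the system drifts away from the configured preference.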
Integration Settings
LLM Integration Preferences
Choose your preferred approach for language model integration:
Direct Integration
- Built-in language models
- Simplified configuration
- Managed scaling and updates
Custom API Integration
- Use your own LLM API endpoints
- Custom model configurations
- Advanced control over model behavior
Hybrid Approach
- Combine multiple LLM sources
- Failover and load distribution
- Model-specific optimization
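The failover part of the hybrid approach can be pictured as trying each LLM source in priority order until one answers. The sketch below is a generic illustration under stated assumptions, not the product's implementation: providers are modeled as plain callables, and `complete_with_failover`, `flaky_builtin`, and `custom_endpoint` are hypothetical names standing in for real clients.

```python
from typing import Callable, Sequence

# A provider is modeled as a callable that takes a prompt and returns text.
Provider = Callable[[str], str]


def complete_with_failover(
    prompt: str, providers: Sequence[tuple[str, Provider]]
) -> tuple[str, str]:
    """Try providers in order; return (provider_name, response) from the first success."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure moves to the next source
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stub providers for demonstration only.
def flaky_builtin(prompt: str) -> str:
    raise TimeoutError("upstream timed out")


def custom_endpoint(prompt: str) -> str:
    return f"echo: {prompt}"


used, answer = complete_with_failover(
    "What is the return policy?",
    [("built-in", flaky_builtin), ("custom", custom_endpoint)],
)
print(used, answer)
```

Load distribution and model-specific routing would sit on top of this, for example by reordering the provider list per request.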
Validation and Dependencies
The Project Setup step validates:
- ✅ All required fields are completed
- ✅ Project name uniqueness
- ✅ Configuration consistency
- ✅ Resource requirement feasibility
Prerequisites for Next Step:
- Complete project name and description
- Select domain and use case
- Configure scale and complexity settings
- Choose response and integration preferences
Once validated, you can proceed to Data Sources configuration, where you'll upload and organize your documents for processing.
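The first two validation checks (required fields and name uniqueness) are simple to express in code. The sketch below is illustrative only: the `ProjectSetup` field names and the `validate_setup` helper are assumptions, and the consistency and resource feasibility checks are omitted because their rules are not described here.

```python
from dataclasses import dataclass, fields


@dataclass
class ProjectSetup:
    """Hypothetical container for the settings collected in this step."""
    name: str = ""
    description: str = ""
    domain: str = ""
    use_case: str = ""
    scale: str = ""
    query_complexity: str = ""
    response_type: str = ""
    llm_integration: str = ""


def validate_setup(setup: ProjectSetup, existing_names: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    # Required-field check: every setting must be non-blank.
    problems = [
        f"missing required field: {f.name}"
        for f in fields(setup)
        if not getattr(setup, f.name).strip()
    ]
    # Uniqueness check against names that are already taken.
    if setup.name and setup.name in existing_names:
        problems.append(f"project name already in use: {setup.name}")
    return problems


draft = ProjectSetup(name="customer-support-kb", domain="Customer Support")
for problem in validate_setup(draft, existing_names={"research-assistant"}):
    print(problem)
```

Returning every problem at once, rather than stopping at the first, lets a form highlight all incomplete fields in a single pass.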
Tips for Success
- Be Specific: Detailed descriptions help optimize recommendations
- Consider Growth: Plan for future scaling requirements
- Align Settings: Ensure complexity settings match your actual use case
- Document Intent: Clear descriptions help with future modifications
The configuration choices made in this step influence all subsequent wizard recommendations, so take time to accurately describe your requirements.