API Setup & Endpoints
Configure API endpoints, authentication, and integration for your RAG system
The final step of the RAG Wizard exposes your completed RAG system to external clients through secure API endpoints. This step covers authentication, rate limiting, and monitoring, and provides integration documentation.
API Endpoint Configuration
Base Configuration
Your RAG system will be available through RESTful API endpoints with the following structure:
Base URL Format:
https://api.guidedmind.ai/v1/rag/{project-id}/
Available Endpoints:
- POST /query: Submit queries to your RAG system
- GET /documents: List and manage documents
- POST /documents: Upload additional documents
- GET /status: Check system health and statistics
- GET /config: Retrieve current configuration (admin only)
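As a minimal sketch of how the base URL and the endpoint paths above fit together, the helper below assembles the full URL for each endpoint. The names `ENDPOINTS` and `endpoint_url` are hypothetical and exist only for this illustration; they are not part of any GuidedMind SDK.

```python
from urllib.parse import quote

BASE_URL = "https://api.guidedmind.ai/v1/rag"

# HTTP method and path for each endpoint listed above.
# The dictionary keys are illustrative labels, not official names.
ENDPOINTS = {
    "query": ("POST", "/query"),
    "list_documents": ("GET", "/documents"),
    "upload_document": ("POST", "/documents"),
    "status": ("GET", "/status"),
    "config": ("GET", "/config"),
}


def endpoint_url(project_id: str, name: str) -> tuple[str, str]:
    """Return (HTTP method, full URL) for a named endpoint."""
    method, path = ENDPOINTS[name]
    # Percent-encode the project ID so it stays a single path segment.
    return method, f"{BASE_URL}/{quote(project_id, safe='')}{path}"


print(endpoint_url("your-project-id", "query"))
```

Keeping the method next to the path makes it harder to mix up the two `/documents` operations, which differ only by HTTP verb.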
Query Endpoint Details
Endpoint: POST /query
Request Format:
{
  "query": "Your question or search query",
  "options": {
    "max_results": 5,
    "temperature": 0.7,
    "max_tokens": 500,
    "include_sources": true,
    "stream": false
  },
  "filters": {
    "document_types": ["pdf", "docx"],
    "date_range": {
      "start": "2024-01-01",
      "end": "2024-12-31"
    },
    "metadata": {
      "department": "engineering",
      "priority": "high"
    }
  }
}
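The request body can be assembled programmatically. The sketch below builds a payload in the shape shown above and leaves out the `filters` object when no filter is given, on the assumption (suggested by the cURL examples in this guide, which send a bare `query` or a partial `date_range`) that `options` and `filters` are optional. The function name `build_query_payload` and its defaults are illustrative only.

```python
import json
from datetime import date


def build_query_payload(query, *, max_results=5, temperature=0.7, max_tokens=500,
                        include_sources=True, stream=False,
                        document_types=None, start=None, end=None, metadata=None):
    """Build a JSON-serializable body for POST /query."""
    if not query.strip():
        raise ValueError("query must not be empty")

    payload = {
        "query": query,
        "options": {
            "max_results": max_results,
            "temperature": temperature,
            "max_tokens": max_tokens,
            "include_sources": include_sources,
            "stream": stream,
        },
    }

    # Only include the filters that were actually requested.
    filters = {}
    if document_types:
        filters["document_types"] = list(document_types)
    date_range = {}
    if start is not None:
        date_range["start"] = start.isoformat()  # YYYY-MM-DD
    if end is not None:
        date_range["end"] = end.isoformat()
    if date_range:
        filters["date_range"] = date_range
    if metadata:
        filters["metadata"] = dict(metadata)
    if filters:
        payload["filters"] = filters
    return payload


body = build_query_payload(
    "Latest product updates",
    document_types=["pdf", "docx"],
    start=date(2024, 1, 1),
    metadata={"department": "engineering"},
)
print(json.dumps(body, indent=2))
```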
Response Format:
{
  "query": "Original user query",
  "response": "Generated response based on retrieved context",
  "sources": [
    {
      "document_id": "doc123",
      "title": "Document Title",
      "chunk_text": "Relevant chunk content",
      "similarity_score": 0.89,
      "metadata": {
        "author": "John Doe",
        "date": "2024-01-15",
        "section": "Chapter 3"
      }
    }
  ],
  "processing_time": 1.23,
  "token_usage": {
    "query_tokens": 12,
    "response_tokens": 145,
    "total_tokens": 157
  }
}
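On the client side, a response in this format can be parsed into typed objects. The following is a sketch, not an official client: the `Source` and `QueryResult` classes are hypothetical, and the code assumes that `sources` may be absent when `include_sources` is false.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Source:
    document_id: str
    title: str
    chunk_text: str
    similarity_score: float
    metadata: dict = field(default_factory=dict)


@dataclass
class QueryResult:
    response: str
    sources: list
    processing_time: float
    total_tokens: int


def parse_query_response(body: str) -> QueryResult:
    """Parse the JSON body returned by POST /query."""
    data = json.loads(body)
    sources = [
        Source(
            document_id=s["document_id"],
            title=s["title"],
            chunk_text=s["chunk_text"],
            similarity_score=float(s["similarity_score"]),
            metadata=s.get("metadata", {}),
        )
        # Assumption: "sources" can be missing if sources were not requested.
        for s in data.get("sources", [])
    ]
    # Present the most similar chunks first.
    sources.sort(key=lambda s: s.similarity_score, reverse=True)
    return QueryResult(
        response=data["response"],
        sources=sources,
        processing_time=float(data["processing_time"]),
        total_tokens=int(data["token_usage"]["total_tokens"]),
    )


sample = json.dumps({
    "query": "Original user query",
    "response": "Generated response based on retrieved context",
    "sources": [{"document_id": "doc123", "title": "Document Title",
                 "chunk_text": "Relevant chunk content", "similarity_score": 0.89}],
    "processing_time": 1.23,
    "token_usage": {"query_tokens": 12, "response_tokens": 145, "total_tokens": 157},
})
result = parse_query_response(sample)
print(result.response, result.total_tokens, [s.title for s in result.sources])
```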
Authentication & Security
API Key Management
Primary API Key:
- Generated automatically upon completion
- Full access to all endpoints
- Regeneration available through dashboard
- Secure storage and transmission required
Secondary Keys:
- Read-only access keys for monitoring
- Limited scope keys for specific endpoints
- Time-limited keys for temporary access
- Key rotation and management tools
Key Security Features:
- Automatic key rotation options
- IP address whitelisting
- Rate limiting per key
- Usage monitoring and alerting
- Immediate revocation capabilities
Authentication Methods
Bearer Token Authentication (Recommended):
curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"query": "What is the return policy?"}'
Header-based Authentication:
curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
-H "X-API-Key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"query": "What is the return policy?"}'
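The two cURL calls above differ only in the header that carries the key. As a rough Python equivalent using just the standard library, the sketch below builds the request for either scheme without sending it. The function name `build_query_request` and the `scheme` labels are assumptions made for this example.

```python
import json
import urllib.request


def build_query_request(project_id: str, api_key: str, query: str,
                        scheme: str = "bearer") -> urllib.request.Request:
    """Build (but do not send) a POST /query request."""
    if scheme == "bearer":
        auth = {"Authorization": f"Bearer {api_key}"}  # recommended method
    elif scheme == "header":
        auth = {"X-API-Key": api_key}
    else:
        raise ValueError(f"unknown auth scheme: {scheme!r}")

    url = f"https://api.guidedmind.ai/v1/rag/{project_id}/query"
    body = json.dumps({"query": query}).encode("utf-8")
    # Note: urllib stores header names as "X-api-key" / "Content-type";
    # HTTP header names are case-insensitive, so this is equivalent on the wire.
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", **auth},
    )


req = build_query_request("your-project-id", "YOUR_API_KEY", "What is the return policy?")
print(req.get_method(), req.full_url)
```

To actually send the request, pass `req` to `urllib.request.urlopen`. Load the key from an environment variable or a secrets manager rather than hard-coding it, in line with the secure storage requirement above.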
OAuth 2.0 Integration (Enterprise):
- Client credentials flow
- Authorization code flow for user-specific access
- JWT tokens with configurable expiration
- Refresh token management
Rate Limiting & Quotas
Rate Limit Configuration
Request Rate Limits:
- Requests per minute (RPM)
- Requests per hour (RPH)
- Requests per day (RPD)
- Burst allowance for temporary spikes
Token-based Limits:
- Input tokens per request
- Output tokens per request
- Total tokens per time period
- Cumulative monthly usage caps
Concurrent Request Limits:
- Maximum simultaneous requests
- Queue management for excess requests
- Priority handling for different key types
- Automatic scaling based on subscription tier
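Clients can avoid hitting request rate limits by pacing themselves. The sketch below is one possible client-side approach, a sliding-window limiter; it is not a description of how the service enforces its limits, and the class name is hypothetical.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` within any `window_seconds` span."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock          # injectable so the limiter can be tested
        self._sent = deque()        # timestamps of recent requests, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before the next request is allowed (0.0 if none)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        # The oldest request must age out before another one fits.
        return self.window - (now - self._sent[0])

    def record(self) -> None:
        """Call once for every request that is actually sent."""
        self._sent.append(self.clock())


# Example: pace a client at 10 requests per minute.
limiter = SlidingWindowLimiter(max_requests=10, window_seconds=60.0)
delay = limiter.wait_time()
if delay > 0:
    time.sleep(delay)
limiter.record()
```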
Usage Tiers
Development Tier:
- 1,000 requests per day
- 10 requests per minute
- 100,000 tokens per month
- Basic support and monitoring
Production Tier:
- 100,000 requests per day
- 500 requests per minute
- 10M tokens per month
- Advanced monitoring and analytics
Enterprise Tier:
- Custom limits based on requirements
- Dedicated infrastructure options
- Premium support and SLA guarantees
- Custom integration assistance
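When choosing a tier, it helps to check which limit binds first. The sketch below works through the arithmetic for the Development and Production figures listed above, assuming a 30-day month and an average of 157 tokens per query (the total from the sample response in this guide). Both assumptions are illustrative.

```python
# Limits copied from the tier descriptions above.
TIERS = {
    "development": {"requests_per_day": 1_000, "tokens_per_month": 100_000},
    "production": {"requests_per_day": 100_000, "tokens_per_month": 10_000_000},
}


def monthly_query_capacity(tier: str, tokens_per_query: int, days: int = 30) -> int:
    """Queries per month allowed by whichever limit is reached first."""
    limits = TIERS[tier]
    by_tokens = limits["tokens_per_month"] // tokens_per_query
    by_requests = limits["requests_per_day"] * days
    return min(by_tokens, by_requests)


print(monthly_query_capacity("development", 157))  # 636
print(monthly_query_capacity("production", 157))   # 63694
```

Under these assumptions the monthly token allowance, not the daily request limit, is the binding constraint on both tiers: the Development tier permits 30,000 requests per month by request count but only about 636 queries by token budget.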
Rate Limit Headers
API responses include rate limiting information:
HTTP/1.1 200 OK
X-RateLimit-Limit: 500
X-RateLimit-Remaining: 342
X-RateLimit-Reset: 1640995200
X-RateLimit-Retry-After: 60
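A client can use these headers to decide how long to pause. The sketch below assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds and that `X-RateLimit-Retry-After` is a delay in seconds; the sample values are consistent with that reading, but this guide does not state it explicitly. The function name is hypothetical.

```python
import time


def seconds_to_wait(headers: dict, now: float) -> float:
    """Return how long to pause before the next request, based on rate limit headers."""
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0  # quota left in the current window

    # Assumption: Retry-After is a delay in seconds.
    retry_after = headers.get("X-RateLimit-Retry-After")
    if retry_after is not None:
        return float(retry_after)

    # Assumption: Reset is an absolute Unix timestamp in seconds.
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        return max(0.0, float(reset) - now)
    return 0.0


sample_headers = {
    "X-RateLimit-Limit": "500",
    "X-RateLimit-Remaining": "342",
    "X-RateLimit-Reset": "1640995200",
    "X-RateLimit-Retry-After": "60",
}
print(seconds_to_wait(sample_headers, now=time.time()))
```

Most HTTP libraries expose response headers as a case-insensitive mapping; with a plain dictionary, as here, the header names must match exactly.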
Monitoring & Analytics
Usage Monitoring
Real-time Metrics:
- Request volume and patterns
- Response times and latency
- Error rates and types
- Token consumption tracking
Performance Analytics:
- Query complexity analysis
- Retrieval accuracy metrics
- User satisfaction indicators
- System resource utilization
Cost Tracking:
- Token usage and associated costs
- Processing time charges
- Storage utilization fees
- Bandwidth consumption
Dashboard Features
Overview Dashboard:
- Key performance indicators (KPIs)
- Usage trends and patterns
- Health status indicators
- Recent activity summaries
Detailed Analytics:
- Query analysis and optimization suggestions
- Document performance metrics
- User behavior patterns
- System performance deep-dives
Alert System:
- Usage threshold alerts
- Error rate notifications
- Performance degradation warnings
- Security incident alerts
Integration Documentation
Code Examples
Python SDK:
from guidedmind import RAGClient

# Create a client for your project
client = RAGClient(
    api_key="your-api-key",
    project_id="your-project-id"
)

# Submit a query with retrieval options
response = client.query(
    query="What are the main features of the product?",
    options={
        "max_results": 3,
        "include_sources": True,
        "temperature": 0.7
    }
)

# Print the generated answer, then each source with its similarity score
print(f"Response: {response.text}")
for source in response.sources:
    print(f"Source: {source.title} ({source.similarity_score:.2f})")
JavaScript SDK:
import { RAGClient } from '@guidedmind/rag-client';

const client = new RAGClient({
  apiKey: 'your-api-key',
  projectId: 'your-project-id'
});

async function queryRAG() {
  try {
    const response = await client.query({
      query: 'How do I reset my password?',
      options: {
        maxResults: 5,
        includeSources: true,
        temperature: 0.5
      }
    });
    console.log('Response:', response.text);
    console.log('Sources:', response.sources);
  } catch (error) {
    console.error('Query failed:', error);
  }
}
cURL Examples:
# Basic query
curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the shipping policy?",
    "options": {
      "max_results": 3,
      "include_sources": true
    }
  }'

# Query with filters
curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Latest product updates",
    "filters": {
      "date_range": {
        "start": "2024-01-01"
      },
      "metadata": {
        "category": "product"
      }
    }
  }'
Webhook Configuration
Event Types:
- query.completed: Query processing finished
- document.processed: New document successfully processed
- usage.threshold: Usage limits approached
- error.occurred: System errors and failures
Webhook Format:
{
  "event_type": "query.completed",
  "timestamp": "2024-01-15T10:30:00Z",
  "project_id": "your-project-id",
  "data": {
    "query_id": "query123",
    "query": "User's original query",
    "response_time": 1.23,
    "tokens_used": 157,
    "success": true
  }
}
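A webhook receiver typically parses the payload and routes it by `event_type`. The sketch below shows one way to do that with a small handler registry. The `on` decorator, `HANDLERS` table, and `dispatch` function are hypothetical names, and the `data` fields of events other than `query.completed` are not documented here, so only that event gets a handler.

```python
import json

HANDLERS = {}


def on(event_type: str):
    """Register a function as the handler for one event type."""
    def register(func):
        HANDLERS[event_type] = func
        return func
    return register


@on("query.completed")
def handle_query_completed(event: dict) -> str:
    data = event["data"]
    return f"{data['query_id']}: {data['tokens_used']} tokens in {data['response_time']}s"


def dispatch(raw_body: str):
    """Parse a webhook body and call the matching handler, if any."""
    event = json.loads(raw_body)
    handler = HANDLERS.get(event["event_type"])
    if handler is None:
        return None  # ignore event types this receiver does not handle
    return handler(event)


sample_event = json.dumps({
    "event_type": "query.completed",
    "timestamp": "2024-01-15T10:30:00Z",
    "project_id": "your-project-id",
    "data": {"query_id": "query123", "query": "User's original query",
             "response_time": 1.23, "tokens_used": 157, "success": True},
})
print(dispatch(sample_event))
```

Returning `None` for unregistered event types keeps the receiver tolerant of events it does not yet handle.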
Advanced Configuration
Custom Domains and SSL
Custom Domain Setup:
- Configure your own domain (e.g., api.yourcompany.com)
- SSL certificate management
- DNS configuration assistance
- Custom branding options
SSL/TLS Configuration:
- TLS 1.3 support
- Custom certificate installation
- Certificate renewal automation
- Security compliance validation
Advanced Authentication
Single Sign-On (SSO) Integration:
- SAML 2.0 support
- OIDC/OAuth 2.0 integration
- Active Directory integration
- Custom identity provider support
Multi-tenant Architecture:
- Tenant isolation and security
- Per-tenant configuration
- Usage tracking and billing
- Administrative controls
Performance Optimization
Caching Strategy:
- Query result caching
- Embedding cache management
- CDN integration for global performance
- Cache invalidation controls
Geographic Distribution:
- Multi-region deployment options
- Edge computing integration
- Latency optimization
- Data residency compliance
Error Handling & Troubleshooting
Common Error Codes
400 Bad Request:
{
  "error": "invalid_query",
  "message": "Query cannot be empty or exceed maximum length",
  "details": {
    "max_query_length": 4000,
    "provided_length": 4500
  }
}
401 Unauthorized:
{
  "error": "invalid_api_key",
  "message": "The provided API key is invalid or expired",
  "details": {
    "suggestion": "Check your API key or generate a new one"
  }
}
429 Too Many Requests:
{
  "error": "rate_limit_exceeded",
  "message": "Request rate limit exceeded",
  "details": {
    "retry_after": 60,
    "limit": 500,
    "window": "1 hour"
  }
}
500 Internal Server Error:
{
  "error": "processing_error",
  "message": "An error occurred while processing your request",
  "details": {
    "request_id": "req_123456",
    "support_contact": "support@guidedmind.ai"
  }
}
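These error bodies can drive a simple retry policy. The sketch below treats 429 and 500 as retryable and 400 and 401 as permanent, which is a common client-side convention rather than documented service behavior. It also assumes `retry_after` is expressed in seconds. The function name and the backoff parameters are illustrative.

```python
import json

# Assumption: only rate limit and server errors are worth retrying.
RETRYABLE_STATUSES = {429, 500}


def retry_delay(status: int, body: str, attempt: int,
                base: float = 1.0, cap: float = 60.0):
    """Return seconds to wait before retrying, or None if the error is permanent.

    `attempt` is zero-based: 0 for the first retry, 1 for the second, and so on.
    """
    if status not in RETRYABLE_STATUSES:
        return None
    details = json.loads(body).get("details", {})
    # Prefer the server's own hint when it is provided.
    if status == 429 and "retry_after" in details:
        return float(details["retry_after"])
    # Otherwise fall back to capped exponential backoff: 1s, 2s, 4s, ...
    return min(cap, base * 2 ** attempt)


rate_limited = json.dumps({
    "error": "rate_limit_exceeded",
    "message": "Request rate limit exceeded",
    "details": {"retry_after": 60, "limit": 500, "window": "1 hour"},
})
print(retry_delay(429, rate_limited, attempt=0))
```

When a 500 response persists after a few attempts, include the `request_id` from the error details when contacting support.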
Troubleshooting Guide
Performance Issues:
- Check current rate limits and usage
- Optimize query complexity and filters
- Implement client-side caching
- Consider request batching for multiple queries
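Client-side caching, mentioned in the list above, can be as simple as an in-memory store with a time-to-live. The sketch below is one possible shape. The class name, the five-minute default, and the choice to treat queries that differ only in case or whitespace as identical are all assumptions to adjust for your application.

```python
import time


class QueryCache:
    """In-memory cache of query responses with a time-to-live."""

    def __init__(self, ttl_seconds: float = 300.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable for testing
        self._entries = {}      # normalized query -> (stored_at, response)

    @staticmethod
    def _key(query: str) -> str:
        # Assumption: case and extra whitespace do not change the answer.
        return " ".join(query.lower().split())

    def get(self, query: str):
        """Return the cached response, or None if missing or expired."""
        key = self._key(query)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, response = entry
        if self.clock() - stored_at >= self.ttl:
            del self._entries[key]  # expired: evict and report a miss
            return None
        return response

    def put(self, query: str, response) -> None:
        self._entries[self._key(query)] = (self.clock(), response)


cache = QueryCache(ttl_seconds=300.0)
cache.put("What is the return policy?", {"response": "cached answer"})
print(cache.get("what is the  return policy?"))
```

Cached answers do not reflect documents uploaded after the entry was stored, so keep the time-to-live short if your knowledge base changes often.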
Quality Issues:
- Review and refine query templates
- Analyze retrieved sources for relevance
- Adjust similarity thresholds
- Consider document processing optimization
Authentication Problems:
- Verify API key validity and permissions
- Check IP whitelisting configuration
- Confirm proper header formatting
- Review SSL/TLS certificate issues
Testing & Deployment
Testing Framework
Unit Testing:
- Individual endpoint testing
- Authentication validation
- Rate limiting verification
- Error handling confirmation
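Unit tests for authentication and error handling do not need a live endpoint. The sketch below uses Python's `unittest` with a fake transport that imitates the 401 response documented above. `run_query` and `fake_transport` are hypothetical stand-ins for your own client code, not part of any SDK.

```python
import json
import unittest


def run_query(transport, api_key: str, query: str) -> str:
    """Send a query through `transport`, a callable (headers, body) -> (status, body)."""
    if not query.strip():
        raise ValueError("query must not be empty")
    headers = {"Authorization": f"Bearer {api_key}"}
    status, body = transport(headers, json.dumps({"query": query}))
    if status != 200:
        raise RuntimeError(json.loads(body)["error"])
    return json.loads(body)["response"]


def fake_transport(headers: dict, body: str):
    """Stand-in for the network: accepts exactly one key."""
    if headers.get("Authorization") != "Bearer good-key":
        return 401, json.dumps({"error": "invalid_api_key"})
    return 200, json.dumps({"response": "ok"})


class QueryTests(unittest.TestCase):
    def test_valid_key_returns_response(self):
        self.assertEqual(run_query(fake_transport, "good-key", "hello"), "ok")

    def test_invalid_key_raises(self):
        with self.assertRaises(RuntimeError) as ctx:
            run_query(fake_transport, "bad-key", "hello")
        self.assertEqual(str(ctx.exception), "invalid_api_key")

    def test_empty_query_is_rejected_locally(self):
        with self.assertRaises(ValueError):
            run_query(fake_transport, "good-key", "   ")
```

Save the block in a test module and run it with `python -m unittest`. The same fake transport pattern extends to rate limit tests by returning a 429 body.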
Integration Testing:
- End-to-end workflow testing
- SDK compatibility verification
- Webhook delivery confirmation
- Performance benchmark validation
Load Testing:
- Concurrent request handling
- Rate limit behavior under load
- System stability under stress
- Performance degradation thresholds
Deployment Checklist
Pre-deployment:
- ✅ All configuration steps completed successfully
- ✅ API keys generated and secured
- ✅ Rate limits configured appropriately
- ✅ Monitoring and alerting set up
- ✅ Integration documentation reviewed
Post-deployment:
- ✅ Endpoint accessibility verified
- ✅ Authentication working correctly
- ✅ Sample queries returning expected results
- ✅ Monitoring dashboards operational
- ✅ Team access and permissions configured
Production Readiness:
- ✅ Load testing completed successfully
- ✅ Error handling and recovery tested
- ✅ Security review and penetration testing completed
- ✅ Documentation and training materials ready
- ✅ Support procedures established
Your RAG system is now fully configured and ready for production use. The API endpoints provide secure, scalable access to your knowledge base with comprehensive monitoring and management capabilities.