API Setup & Endpoints

Configure API endpoints, authentication, and integration for your RAG system

The final step of the RAG Wizard exposes your completed RAG system to external clients through secure API endpoints. This step covers authentication, rate limiting, and monitoring, and provides integration documentation.

API Endpoint Configuration

Base Configuration

Your RAG system will be available through RESTful API endpoints with the following structure:

Base URL Format:

https://api.guidedmind.ai/v1/rag/{project-id}/

Available Endpoints:

  • POST /query - Submit queries to your RAG system
  • GET /documents - List and manage documents
  • POST /documents - Upload additional documents
  • GET /status - Check system health and statistics
  • GET /config - Retrieve current configuration (admin only)
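As a minimal sketch of how the base URL and the endpoint list fit together, the helper below joins the two. The names `ENDPOINTS` and `endpoint_url` are illustrative only and are not part of any official SDK.

```python
from urllib.parse import quote

BASE_URL = "https://api.guidedmind.ai/v1/rag"

# Endpoint names (hypothetical labels) mapped to the documented method and path.
ENDPOINTS = {
    "query": ("POST", "/query"),
    "list_documents": ("GET", "/documents"),
    "upload_document": ("POST", "/documents"),
    "status": ("GET", "/status"),
    "config": ("GET", "/config"),  # admin only
}


def endpoint_url(project_id: str, name: str) -> tuple[str, str]:
    """Return (HTTP method, full URL) for a named endpoint."""
    method, path = ENDPOINTS[name]
    # Percent-encode the project ID so it stays a single path segment.
    return method, f"{BASE_URL}/{quote(project_id, safe='')}{path}"


print(endpoint_url("your-project-id", "query"))
```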

Query Endpoint Details

Endpoint: POST /query

Request Format:

{
  "query": "Your question or search query",
  "options": {
    "max_results": 5,
    "temperature": 0.7,
    "max_tokens": 500,
    "include_sources": true,
    "stream": false
  },
  "filters": {
    "document_types": ["pdf", "docx"],
    "date_range": {
      "start": "2024-01-01",
      "end": "2024-12-31"
    },
    "metadata": {
      "department": "engineering",
      "priority": "high"
    }
  }
}
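The request format above can be assembled with the Python standard library alone. The following is a sketch, not an official client: `build_query_request` is a hypothetical helper, and it assumes that `options` and `filters` may be omitted (as in the cURL examples on this page) and that an empty query is rejected client-side rather than sent.

```python
import json
import urllib.request


def build_query_request(project_id, api_key, query, options=None, filters=None):
    """Build (but do not send) a POST /query request."""
    if not query.strip():
        raise ValueError("query must not be empty")
    body = {"query": query}
    # Only include the optional sections the caller actually supplied.
    if options:
        body["options"] = options
    if filters:
        body["filters"] = filters
    return urllib.request.Request(
        url=f"https://api.guidedmind.ai/v1/rag/{project_id}/query",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_query_request(
    "your-project-id",
    "YOUR_API_KEY",
    "What is the return policy?",
    options={"max_results": 5, "include_sources": True},
    filters={"document_types": ["pdf", "docx"]},
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```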

Response Format:

{
  "query": "Original user query",
  "response": "Generated response based on retrieved context",
  "sources": [
    {
      "document_id": "doc123",
      "title": "Document Title",
      "chunk_text": "Relevant chunk content",
      "similarity_score": 0.89,
      "metadata": {
        "author": "John Doe",
        "date": "2024-01-15",
        "section": "Chapter 3"
      }
    }
  ],
  "processing_time": 1.23,
  "token_usage": {
    "query_tokens": 12,
    "response_tokens": 145,
    "total_tokens": 157
  }
}
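For clients that call the endpoint directly rather than through an SDK, the response above can be mapped onto small typed objects. This is one possible sketch; the class names `Source` and `QueryResult` are hypothetical, and the field names are taken from the sample response only.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Source:
    document_id: str
    title: str
    chunk_text: str
    similarity_score: float
    metadata: dict = field(default_factory=dict)


@dataclass
class QueryResult:
    response: str
    sources: list
    processing_time: float
    total_tokens: int


def parse_query_response(raw: str) -> QueryResult:
    """Convert the JSON body of a /query response into a QueryResult."""
    data = json.loads(raw)
    sources = [
        Source(
            document_id=s["document_id"],
            title=s["title"],
            chunk_text=s["chunk_text"],
            similarity_score=s["similarity_score"],
            metadata=s.get("metadata", {}),
        )
        # "sources" may be absent, e.g. when include_sources is false (assumption).
        for s in data.get("sources", [])
    ]
    return QueryResult(
        response=data["response"],
        sources=sources,
        processing_time=data["processing_time"],
        total_tokens=data["token_usage"]["total_tokens"],
    )


sample = json.dumps({
    "query": "Original user query",
    "response": "Generated response based on retrieved context",
    "sources": [{
        "document_id": "doc123",
        "title": "Document Title",
        "chunk_text": "Relevant chunk content",
        "similarity_score": 0.89,
        "metadata": {"author": "John Doe"},
    }],
    "processing_time": 1.23,
    "token_usage": {"query_tokens": 12, "response_tokens": 145, "total_tokens": 157},
})
result = parse_query_response(sample)
print(result.response, [s.title for s in result.sources])
```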

Authentication & Security

API Key Management

Primary API Key:

  • Generated automatically upon completion
  • Full access to all endpoints
  • Regeneration available through dashboard
  • Secure storage and transmission required

Secondary Keys:

  • Read-only access keys for monitoring
  • Limited scope keys for specific endpoints
  • Time-limited keys for temporary access
  • Key rotation and management tools

Key Security Features:

  • Automatic key rotation options
  • IP address whitelisting
  • Rate limiting per key
  • Usage monitoring and alerting
  • Immediate revocation capabilities

Authentication Methods

Bearer Token Authentication (Recommended):

curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the return policy?"}'

API Key Header Authentication (X-API-Key):

curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the return policy?"}'

OAuth 2.0 Integration (Enterprise):

  • Client credentials flow
  • Authorization code flow for user-specific access
  • JWT tokens with configurable expiration
  • Refresh token management
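The two API key schemes shown in the cURL examples differ only in which header carries the key. The sketch below builds either header set and reads the key from an environment variable rather than hard-coding it. The variable name `GUIDEDMIND_API_KEY` and both function names are assumptions made for this example.

```python
import os


def load_api_key(env_var: str = "GUIDEDMIND_API_KEY") -> str:
    """Read the API key from the environment instead of source code."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key


def auth_headers(api_key: str, scheme: str = "bearer") -> dict:
    """Return request headers for one of the two documented key schemes."""
    if scheme == "bearer":
        auth = {"Authorization": f"Bearer {api_key}"}
    elif scheme == "x-api-key":
        auth = {"X-API-Key": api_key}
    else:
        raise ValueError(f"unknown scheme: {scheme!r}")
    return {**auth, "Content-Type": "application/json"}


print(auth_headers("YOUR_API_KEY"))
print(auth_headers("YOUR_API_KEY", scheme="x-api-key"))
```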

Rate Limiting & Quotas

Rate Limit Configuration

Request Rate Limits:

  • Requests per minute (RPM)
  • Requests per hour (RPH)
  • Requests per day (RPD)
  • Burst allowance for temporary spikes

Token-based Limits:

  • Input tokens per request
  • Output tokens per request
  • Total tokens per time period
  • Cumulative monthly usage caps

Concurrent Request Limits:

  • Maximum simultaneous requests
  • Queue management for excess requests
  • Priority handling for different key types
  • Automatic scaling based on subscription tier

Usage Tiers

Development Tier:

  • 1,000 requests per day
  • 10 requests per minute
  • 100,000 tokens per month
  • Basic support and monitoring

Production Tier:

  • 100,000 requests per day
  • 500 requests per minute
  • 10M tokens per month
  • Advanced monitoring and analytics

Enterprise Tier:

  • Custom limits based on requirements
  • Dedicated infrastructure options
  • Premium support and SLA guarantees
  • Custom integration assistance
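A client can stay under its per-minute limit by throttling itself before the server has to reject requests. The sliding-window limiter below is a sketch of that idea and is not how the service enforces limits: whether the server uses fixed or sliding windows, and how the burst allowance interacts with them, is not specified here. `SlidingWindowLimiter` is a hypothetical name.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests in any window_seconds interval."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock      # injectable so the limiter can be tested
        self._sent = deque()     # timestamps of recent requests, oldest first

    def wait_time(self) -> float:
        """Seconds until the next request is allowed (0.0 if allowed now)."""
        now = self._clock()
        # Forget requests that have left the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self._sent.append(self._clock())


# Development tier: 10 requests per minute.
limiter = SlidingWindowLimiter(max_requests=10, window_seconds=60)
delay = limiter.wait_time()
if delay > 0:
    time.sleep(delay)
limiter.record()  # call right after sending each request
```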

Rate Limit Headers

API responses include rate limiting information:

HTTP/1.1 200 OK
X-RateLimit-Limit: 500
X-RateLimit-Remaining: 342
X-RateLimit-Reset: 1640995200
X-RateLimit-Retry-After: 60
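A client can read these headers after each response to decide whether to pause. The sketch below assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds and that `X-RateLimit-Retry-After` is a number of seconds; the sample values are consistent with that reading, but the page does not state it. `parse_rate_limit` is a hypothetical helper.

```python
import time


def parse_rate_limit(headers, now=None):
    """Extract rate-limit state and a suggested wait from response headers."""
    now = time.time() if now is None else now
    # HTTP header names are case-insensitive, so normalise the keys first.
    h = {k.lower(): v for k, v in headers.items()}
    limit = int(h["x-ratelimit-limit"])
    remaining = int(h["x-ratelimit-remaining"])
    reset_at = int(h["x-ratelimit-reset"])
    retry_after = h.get("x-ratelimit-retry-after")

    if remaining > 0:
        wait = 0.0                      # budget left, no need to pause
    elif retry_after is not None:
        wait = float(retry_after)       # server told us how long to wait
    else:
        wait = max(0.0, float(reset_at - now))  # fall back to the reset time
    return {"limit": limit, "remaining": remaining,
            "reset_at": reset_at, "wait_seconds": wait}


info = parse_rate_limit({
    "X-RateLimit-Limit": "500",
    "X-RateLimit-Remaining": "342",
    "X-RateLimit-Reset": "1640995200",
    "X-RateLimit-Retry-After": "60",
})
print(info)
```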

Monitoring & Analytics

Usage Monitoring

Real-time Metrics:

  • Request volume and patterns
  • Response times and latency
  • Error rates and types
  • Token consumption tracking

Performance Analytics:

  • Query complexity analysis
  • Retrieval accuracy metrics
  • User satisfaction indicators
  • System resource utilization

Cost Tracking:

  • Token usage and associated costs
  • Processing time charges
  • Storage utilization fees
  • Bandwidth consumption

Dashboard Features

Overview Dashboard:

  • Key performance indicators (KPIs)
  • Usage trends and patterns
  • Health status indicators
  • Recent activity summaries

Detailed Analytics:

  • Query analysis and optimization suggestions
  • Document performance metrics
  • User behavior patterns
  • System performance deep-dives

Alert System:

  • Usage threshold alerts
  • Error rate notifications
  • Performance degradation warnings
  • Security incident alerts

Integration Documentation

Code Examples

Python SDK:

from guidedmind import RAGClient

# Create a client bound to a single project.
client = RAGClient(
    api_key="your-api-key",
    project_id="your-project-id"
)

# Submit a query and ask for the supporting sources.
response = client.query(
    query="What are the main features of the product?",
    options={
        "max_results": 3,
        "include_sources": True,
        "temperature": 0.7
    }
)

# Print the answer, then each source with its similarity score.
print(f"Response: {response.text}")
for source in response.sources:
    print(f"Source: {source.title} ({source.similarity_score:.2f})")

JavaScript SDK:

import { RAGClient } from '@guidedmind/rag-client';
 
const client = new RAGClient({
  apiKey: 'your-api-key',
  projectId: 'your-project-id'
});
 
async function queryRAG() {
  try {
    // Submit a query and ask for the supporting sources.
    const response = await client.query({
      query: 'How do I reset my password?',
      options: {
        maxResults: 5,
        includeSources: true,
        temperature: 0.5
      }
    });

    console.log('Response:', response.text);
    console.log('Sources:', response.sources);
  } catch (error) {
    // Log the failure instead of leaving the promise rejection unhandled.
    console.error('Query failed:', error);
  }
}

// Run the example.
queryRAG();

cURL Examples:

# Basic query
curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the shipping policy?",
    "options": {
      "max_results": 3,
      "include_sources": true
    }
  }'
 
# Query with filters
curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Latest product updates",
    "filters": {
      "date_range": {
        "start": "2024-01-01"
      },
      "metadata": {
        "category": "product"
      }
    }
  }'

Webhook Configuration

Event Types:

  • query.completed - Query processing finished
  • document.processed - New document successfully processed
  • usage.threshold - Usage limits approached
  • error.occurred - System errors and failures

Webhook Format:

{
  "event_type": "query.completed",
  "timestamp": "2024-01-15T10:30:00Z",
  "project_id": "your-project-id",
  "data": {
    "query_id": "query123",
    "query": "User's original query",
    "response_time": 1.23,
    "tokens_used": 157,
    "success": true
  }
}
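A receiver for these webhooks typically parses the body and routes on `event_type`. The dispatcher below is a sketch under stated assumptions: only the `query.completed` payload is documented above, so the shape of `data` for other events is unknown, and the `on` and `dispatch` names are hypothetical.

```python
import json

HANDLERS = {}  # event_type -> handler function


def on(event_type):
    """Decorator that registers a handler for one event type."""
    def register(func):
        HANDLERS[event_type] = func
        return func
    return register


@on("query.completed")
def handle_query_completed(data):
    return f"query {data['query_id']} used {data['tokens_used']} tokens"


def dispatch(raw_body: str):
    """Route a webhook body to its handler; return None for unhandled events."""
    event = json.loads(raw_body)
    handler = HANDLERS.get(event["event_type"])
    if handler is None:
        return None  # ignore event types this receiver does not handle
    return handler(event["data"])


body = json.dumps({
    "event_type": "query.completed",
    "timestamp": "2024-01-15T10:30:00Z",
    "project_id": "your-project-id",
    "data": {"query_id": "query123", "tokens_used": 157, "success": True},
})
print(dispatch(body))
```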

Advanced Configuration

Custom Domains and SSL

Custom Domain Setup:

  • Configure your own domain (e.g., api.yourcompany.com)
  • SSL certificate management
  • DNS configuration assistance
  • Custom branding options

SSL/TLS Configuration:

  • TLS 1.3 support
  • Custom certificate installation
  • Certificate renewal automation
  • Security compliance validation

Advanced Authentication

Single Sign-On (SSO) Integration:

  • SAML 2.0 support
  • OIDC/OAuth 2.0 integration
  • Active Directory integration
  • Custom identity provider support

Multi-tenant Architecture:

  • Tenant isolation and security
  • Per-tenant configuration
  • Usage tracking and billing
  • Administrative controls

Performance Optimization

Caching Strategy:

  • Query result caching
  • Embedding cache management
  • CDN integration for global performance
  • Cache invalidation controls

Geographic Distribution:

  • Multi-region deployment options
  • Edge computing integration
  • Latency optimization
  • Data residency compliance

Error Handling & Troubleshooting

Common Error Codes

400 Bad Request:

{
  "error": "invalid_query",
  "message": "Query cannot be empty or exceed maximum length",
  "details": {
    "max_query_length": 4000,
    "provided_length": 4500
  }
}

401 Unauthorized:

{
  "error": "invalid_api_key",
  "message": "The provided API key is invalid or expired",
  "details": {
    "suggestion": "Check your API key or generate a new one"
  }
}

429 Too Many Requests:

{
  "error": "rate_limit_exceeded",
  "message": "Request rate limit exceeded",
  "details": {
    "retry_after": 60,
    "limit": 500,
    "window": "1 minute"
  }
}

500 Internal Server Error:

{
  "error": "processing_error",
  "message": "An error occurred while processing your request",
  "details": {
    "request_id": "req_123456",
    "support_contact": "support@guidedmind.ai"
  }
}
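The error bodies above suggest a simple retry policy: give up on 400 and 401, honour `retry_after` on 429, and back off exponentially on 500. The sketch below encodes that policy. Treating 500 as retryable is an assumption of this example (it presumes that repeating a query is harmless), and `retry_delay` is a hypothetical helper.

```python
import random

RETRYABLE_STATUSES = {429, 500}


def retry_delay(status, error_body, attempt, base=1.0, cap=60.0):
    """Return seconds to wait before retrying, or None if retrying is pointless.

    attempt is 0 for the first retry, 1 for the second, and so on.
    """
    if status not in RETRYABLE_STATUSES:
        return None  # 400 and 401 will fail again until the request is fixed
    if status == 429:
        retry_after = error_body.get("details", {}).get("retry_after")
        if retry_after is not None:
            return float(retry_after)
    # Exponential backoff with full jitter, capped at `cap` seconds.
    return random.uniform(0, min(cap, base * 2 ** attempt))


print(retry_delay(401, {"error": "invalid_api_key"}, attempt=0))
print(retry_delay(429, {"details": {"retry_after": 60}}, attempt=0))
print(retry_delay(500, {"error": "processing_error"}, attempt=2))
```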

Troubleshooting Guide

Performance Issues:

  1. Check current rate limits and usage
  2. Optimize query complexity and filters
  3. Implement client-side caching
  4. Consider request batching for multiple queries
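Client-side caching (step 3) can be as simple as a time-limited lookup keyed on the query and its options. The following is a minimal sketch; `TTLCache` and `cache_key` are hypothetical names, and the right time-to-live depends on how often your documents change. Note that with a non-zero `temperature`, a cached answer hides the variation a fresh request would produce.

```python
import json
import time


def cache_key(query, options=None):
    """Stable key: the same query and options always serialise identically."""
    return json.dumps({"query": query, "options": options or {}}, sort_keys=True)


class TTLCache:
    """Keep computed values for ttl_seconds, then recompute on next access."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (stored_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # fresh hit: skip the API call
        value = compute()
        self._entries[key] = (now, value)
        return value


cache = TTLCache(ttl_seconds=300)
key = cache_key("What is the return policy?", {"max_results": 3})
# Replace the lambda with a real API call in practice.
answer = cache.get_or_compute(key, lambda: "stubbed answer")
print(answer)
```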

Quality Issues:

  1. Review and refine query templates
  2. Analyze retrieved sources for relevance
  3. Adjust similarity thresholds
  4. Consider document processing optimization

Authentication Problems:

  1. Verify API key validity and permissions
  2. Check IP whitelisting configuration
  3. Confirm proper header formatting
  4. Review SSL/TLS certificate issues

Testing & Deployment

Testing Framework

Unit Testing:

  • Individual endpoint testing
  • Authentication validation
  • Rate limiting verification
  • Error handling confirmation

Integration Testing:

  • End-to-end workflow testing
  • SDK compatibility verification
  • Webhook delivery confirmation
  • Performance benchmark validation

Load Testing:

  • Concurrent request handling
  • Rate limit behavior under load
  • System stability under stress
  • Performance degradation thresholds
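A basic load test needs only a way to issue requests concurrently and summarise the latencies. The harness below is a sketch that takes any zero-argument callable, so it can be pointed at a stub first and at a real request function later. `run_load_test` is a hypothetical name, and the 95th percentile uses the nearest-rank method.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(send, total_requests, concurrency):
    """Call send() total_requests times using `concurrency` worker threads."""
    def timed_call(_):
        start = time.perf_counter()
        try:
            send()
            ok = True
        except Exception:
            ok = False  # count any exception as a failed request
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))

    latencies = sorted(elapsed for _, elapsed in results)
    failures = sum(1 for ok, _ in results if not ok)
    # Nearest-rank 95th percentile, computed with integer arithmetic.
    p95_index = (95 * len(latencies) + 99) // 100 - 1
    return {
        "requests": total_requests,
        "failures": failures,
        "p95_seconds": latencies[p95_index],
    }


# Stub that simulates a 10 ms request; swap in a real call when ready.
stats = run_load_test(lambda: time.sleep(0.01), total_requests=20, concurrency=5)
print(stats)
```

Keep `concurrency` and `total_requests` within your tier's limits when targeting the live API, since load-test traffic counts against the same quotas.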

Deployment Checklist

Pre-deployment:

  • ✅ All configuration steps completed successfully
  • ✅ API keys generated and secured
  • ✅ Rate limits configured appropriately
  • ✅ Monitoring and alerting set up
  • ✅ Integration documentation reviewed

Post-deployment:

  • ✅ Endpoint accessibility verified
  • ✅ Authentication working correctly
  • ✅ Sample queries returning expected results
  • ✅ Monitoring dashboards operational
  • ✅ Team access and permissions configured

Production Readiness:

  • ✅ Load testing completed successfully
  • ✅ Error handling and recovery tested
  • ✅ Security review and penetration testing completed
  • ✅ Documentation and training materials ready
  • ✅ Support procedures established

Your RAG system is now fully configured and ready for production use. The API endpoints provide secure, scalable access to your knowledge base with comprehensive monitoring and management capabilities.