API Setup & Endpoints

Configure API endpoints, authentication, and integration for your RAG system

The final step of the RAG Wizard exposes your completed RAG system to external clients through secure API endpoints. This step covers authentication, rate limiting, and monitoring, and provides integration documentation.

API Endpoint Configuration

Base Configuration

Your RAG system will be available through RESTful API endpoints with the following structure:

Base URL Format:

https://api.guidedmind.ai/v1/rag/{project-id}/

Available Endpoints:

POST /query - Submit queries to your RAG system
GET /documents - List and manage documents
POST /documents - Upload additional documents
GET /status - Check system health and statistics
GET /config - Retrieve current configuration (admin only)
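
As a rough illustration of how the base URL and endpoint paths combine, the sketch below builds the full URL for each endpoint listed above. The `ENDPOINTS` table and the `endpoint_url` helper are hypothetical names introduced for this example; they are not part of any official SDK.

```python
from __future__ import annotations

from urllib.parse import quote

BASE_URL = "https://api.guidedmind.ai/v1/rag"

# HTTP method and path for each endpoint listed above.
# The dictionary keys are illustrative labels, not official names.
ENDPOINTS = {
    "query": ("POST", "query"),
    "list_documents": ("GET", "documents"),
    "upload_document": ("POST", "documents"),
    "status": ("GET", "status"),
    "config": ("GET", "config"),  # admin only
}


def endpoint_url(project_id: str, name: str) -> tuple[str, str]:
    """Return (method, url) for a named endpoint of the given project."""
    if not project_id:
        raise ValueError("project_id must not be empty")
    method, path = ENDPOINTS[name]
    # Percent-encode the project ID so it is safe as a single path segment.
    return method, f"{BASE_URL}/{quote(project_id, safe='')}/{path}"


if __name__ == "__main__":
    for label in ENDPOINTS:
        print(*endpoint_url("your-project-id", label))
```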

Query Endpoint Details

Endpoint: POST /query

Request Format:

{
  "query": "Your question or search query",
  "options": {
    "max_results": 5,
    "temperature": 0.7,
    "max_tokens": 500,
    "include_sources": true,
    "stream": false
  },
  "filters": {
    "document_types": ["pdf", "docx"],
    "date_range": {
      "start": "2024-01-01",
      "end": "2024-12-31"
    },
    "metadata": {
      "department": "engineering",
      "priority": "high"
    }
  }
}

Response Format:

{
  "query": "Original user query",
  "response": "Generated response based on retrieved context",
  "sources": [
    {
      "document_id": "doc123",
      "title": "Document Title",
      "chunk_text": "Relevant chunk content",
      "similarity_score": 0.89,
      "metadata": {
        "author": "John Doe",
        "date": "2024-01-15",
        "section": "Chapter 3"
      }
    }
  ],
  "processing_time": 1.23,
  "token_usage": {
    "query_tokens": 12,
    "response_tokens": 145,
    "total_tokens": 157
  }
}
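
The request and response formats above can be handled with the standard library alone. The following is a minimal sketch, assuming the field names shown in the two JSON samples; the `Source` class and the `build_query_payload` and `parse_sources` helpers are hypothetical names used only for illustration.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Source:
    """One entry of the "sources" array in a query response."""
    document_id: str
    title: str
    chunk_text: str
    similarity_score: float
    metadata: dict[str, Any] = field(default_factory=dict)


def build_query_payload(query: str, *, max_results: int = 5,
                        include_sources: bool = True,
                        filters: dict[str, Any] | None = None) -> str:
    """Serialize a request body in the documented shape."""
    if not query.strip():
        raise ValueError("query must not be empty")
    payload: dict[str, Any] = {
        "query": query,
        "options": {"max_results": max_results,
                    "include_sources": include_sources},
    }
    if filters:
        payload["filters"] = filters
    return json.dumps(payload)


def parse_sources(response_body: str, min_score: float = 0.0) -> list[Source]:
    """Extract sources from a response body, best match first."""
    data = json.loads(response_body)
    sources = [
        Source(s["document_id"], s["title"], s["chunk_text"],
               float(s["similarity_score"]), s.get("metadata", {}))
        for s in data.get("sources", [])
    ]
    kept = [s for s in sources if s.similarity_score >= min_score]
    return sorted(kept, key=lambda s: s.similarity_score, reverse=True)
```

Filtering on `min_score` client-side is a design choice for this sketch; it does not change what the server retrieves.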

Authentication & Security

API Key Management

Primary API Key:

Generated automatically when the wizard completes
Full access to all endpoints
Regeneration available through dashboard
Secure storage and transmission required

Secondary Keys:

Read-only access keys for monitoring
Limited scope keys for specific endpoints
Time-limited keys for temporary access
Key rotation and management tools

Key Security Features:

Automatic key rotation options
IP address whitelisting
Rate limiting per key
Usage monitoring and alerting
Immediate revocation capabilities

Authentication Methods

Bearer Token Authentication (Recommended):

curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the return policy?"}'

API Key Header Authentication (X-API-Key):

curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the return policy?"}'
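
The two header styles shown in the cURL commands above differ only in which header carries the key. A small sketch of a helper that produces either form follows; the `auth_headers` function and the `GUIDEDMIND_API_KEY` environment variable name are assumptions made for this example, not documented names.

```python
from __future__ import annotations

import os


def auth_headers(api_key: str, scheme: str = "bearer") -> dict[str, str]:
    """Build request headers for one of the two documented key styles."""
    if not api_key:
        raise ValueError("api_key must not be empty")
    headers = {"Content-Type": "application/json"}
    if scheme == "bearer":
        headers["Authorization"] = f"Bearer {api_key}"
    elif scheme == "x-api-key":
        headers["X-API-Key"] = api_key
    else:
        raise ValueError(f"unknown scheme: {scheme!r}")
    return headers


if __name__ == "__main__":
    # Reading the key from the environment keeps it out of source control.
    # The variable name is hypothetical.
    key = os.environ.get("GUIDEDMIND_API_KEY", "YOUR_API_KEY")
    print(sorted(auth_headers(key)))
```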

OAuth 2.0 Integration (Enterprise):

Client credentials flow
Authorization code flow for user-specific access
JWT tokens with configurable expiration
Refresh token management

Rate Limiting & Quotas

Rate Limit Configuration

Request Rate Limits:

Requests per minute (RPM)
Requests per hour (RPH)
Requests per day (RPD)
Burst allowance for temporary spikes

Token-based Limits:

Input tokens per request
Output tokens per request
Total tokens per time period
Cumulative monthly usage caps

Concurrent Request Limits:

Maximum simultaneous requests
Queue management for excess requests
Priority handling for different key types
Automatic scaling based on subscription tier

Usage Tiers

Development Tier:

1,000 requests per day
10 requests per minute
100,000 tokens per month
Basic support and monitoring

Production Tier:

100,000 requests per day
500 requests per minute
10M tokens per month
Advanced monitoring and analytics

Enterprise Tier:

Custom limits based on requirements
Dedicated infrastructure options
Premium support and SLA guarantees
Custom integration assistance

Rate Limit Headers

API responses include rate limiting information:

HTTP/1.1 200 OK
X-RateLimit-Limit: 500
X-RateLimit-Remaining: 342
X-RateLimit-Reset: 1640995200
X-RateLimit-Retry-After: 60
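
A client can use these headers to pause before it runs out of quota. The sketch below assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds, which the sample value suggests but the text does not state; `seconds_until_reset` is a hypothetical helper name.

```python
from __future__ import annotations

import time
from typing import Mapping


def seconds_until_reset(headers: Mapping[str, str],
                        now: float | None = None) -> float:
    """Return how long to wait before sending the next request.

    Returns 0.0 while requests remain in the current window. Otherwise
    returns the time left until the window resets (never negative).
    """
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    current = time.time() if now is None else now
    # Assumption: the reset value is epoch seconds.
    reset_at = float(headers.get("X-RateLimit-Reset", current))
    return max(0.0, reset_at - current)


if __name__ == "__main__":
    sample = {"X-RateLimit-Limit": "500",
              "X-RateLimit-Remaining": "0",
              "X-RateLimit-Reset": "1640995200"}
    print(seconds_until_reset(sample, now=1640995140.0))
```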

Monitoring & Analytics

Usage Monitoring

Real-time Metrics:

Request volume and patterns
Response times and latency
Error rates and types
Token consumption tracking

Performance Analytics:

Query complexity analysis
Retrieval accuracy metrics
User satisfaction indicators
System resource utilization

Cost Tracking:

Token usage and associated costs
Processing time charges
Storage utilization fees
Bandwidth consumption

Dashboard Features

Overview Dashboard:

Key performance indicators (KPIs)
Usage trends and patterns
Health status indicators
Recent activity summaries

Detailed Analytics:

Query analysis and optimization suggestions
Document performance metrics
User behavior patterns
System performance deep-dives

Alert System:

Usage threshold alerts
Error rate notifications
Performance degradation warnings
Security incident alerts

Integration Documentation

Code Examples

Python SDK:

from guidedmind import RAGClient
 
client = RAGClient(
    api_key="your-api-key",
    project_id="your-project-id"
)
 
response = client.query(
    query="What are the main features of the product?",
    options={
        "max_results": 3,
        "include_sources": True,
        "temperature": 0.7
    }
)
 
print(f"Response: {response.text}")
for source in response.sources:
    print(f"Source: {source.title} ({source.similarity_score:.2f})")

JavaScript SDK:

import { RAGClient } from '@guidedmind/rag-client';
 
const client = new RAGClient({
  apiKey: 'your-api-key',
  projectId: 'your-project-id'
});
 
async function queryRAG() {
  try {
    const response = await client.query({
      query: 'How do I reset my password?',
      options: {
        maxResults: 5,
        includeSources: true,
        temperature: 0.5
      }
    });
    
    console.log('Response:', response.text);
    console.log('Sources:', response.sources);
  } catch (error) {
    console.error('Query failed:', error);
  }
}

queryRAG();

cURL Examples:

# Basic query
curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the shipping policy?",
    "options": {
      "max_results": 3,
      "include_sources": true
    }
  }'
 
# Query with filters
curl -X POST "https://api.guidedmind.ai/v1/rag/your-project-id/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Latest product updates",
    "filters": {
      "date_range": {
        "start": "2024-01-01"
      },
      "metadata": {
        "category": "product"
      }
    }
  }'

Webhook Configuration

Event Types:

query.completed - Query processing finished
document.processed - New document successfully processed
usage.threshold - Usage limits approached
error.occurred - System errors and failures

Webhook Format:

{
  "event_type": "query.completed",
  "timestamp": "2024-01-15T10:30:00Z",
  "project_id": "your-project-id",
  "data": {
    "query_id": "query123",
    "query": "User's original query",
    "response_time": 1.23,
    "tokens_used": 157,
    "success": true
  }
}
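
On the receiving side, a webhook consumer typically parses the body and routes on `event_type`. The following sketch shows one way to do that with the standard library. Only the `query.completed` payload is documented above, so that is the only handler included; the `on` decorator, `HANDLERS` registry, and `dispatch` function are hypothetical names, and signature verification is omitted because the source does not describe it.

```python
from __future__ import annotations

import json
from typing import Callable

Handler = Callable[[dict], str]
HANDLERS: dict[str, Handler] = {}


def on(event_type: str) -> Callable[[Handler], Handler]:
    """Register a handler function for one webhook event type."""
    def register(fn: Handler) -> Handler:
        HANDLERS[event_type] = fn
        return fn
    return register


@on("query.completed")
def handle_query_completed(data: dict) -> str:
    status = "ok" if data.get("success") else "failed"
    return f"query {data['query_id']} {status} in {data['response_time']}s"


def dispatch(raw_body: str) -> str:
    """Parse a webhook body and route it to the matching handler."""
    event = json.loads(raw_body)
    handler = HANDLERS.get(event["event_type"])
    if handler is None:
        # Unknown or unhandled event types are acknowledged but ignored.
        return f"ignored {event['event_type']}"
    return handler(event.get("data", {}))
```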

Advanced Configuration

Custom Domains and SSL

Custom Domain Setup:

Configure your own domain (e.g., api.yourcompany.com)
SSL certificate management
DNS configuration assistance
Custom branding options

SSL/TLS Configuration:

TLS 1.3 support
Custom certificate installation
Certificate renewal automation
Security compliance validation

Advanced Authentication

Single Sign-On (SSO) Integration:

SAML 2.0 support
OIDC/OAuth 2.0 integration
Active Directory integration
Custom identity provider support

Multi-tenant Architecture:

Tenant isolation and security
Per-tenant configuration
Usage tracking and billing
Administrative controls

Performance Optimization

Caching Strategy:

Query result caching
Embedding cache management
CDN integration for global performance
Cache invalidation controls

Geographic Distribution:

Multi-region deployment options
Edge computing integration
Latency optimization
Data residency compliance

Error Handling & Troubleshooting

Common Error Codes

400 Bad Request:

{
  "error": "invalid_query",
  "message": "Query cannot be empty or exceed maximum length",
  "details": {
    "max_query_length": 4000,
    "provided_length": 4500
  }
}

401 Unauthorized:

{
  "error": "invalid_api_key",
  "message": "The provided API key is invalid or expired",
  "details": {
    "suggestion": "Check your API key or generate a new one"
  }
}

429 Too Many Requests:

{
  "error": "rate_limit_exceeded",
  "message": "Request rate limit exceeded",
  "details": {
    "retry_after": 60,
    "limit": 500,
    "window": "1 minute"
  }
}

500 Internal Server Error:

{
  "error": "processing_error",
  "message": "An error occurred while processing your request",
  "details": {
    "request_id": "req_123456",
    "support_contact": "support@guidedmind.ai"
  }
}
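
One way to act on these error responses is to decide, per status code, whether a retry is worthwhile and how long to wait. The sketch below honors `details.retry_after` on a 429 and falls back to capped exponential backoff otherwise. Treating 500 as retryable is an assumption of this example, not documented behavior, and `retry_delay` is a hypothetical helper name.

```python
from __future__ import annotations

import json

# Assumption: 429 and 500 are worth retrying; 400 and 401 are not,
# because resending the same request would fail the same way.
RETRYABLE_STATUS = {429, 500}


def retry_delay(status_code: int, body: str, attempt: int,
                base: float = 1.0, cap: float = 60.0) -> float | None:
    """Return seconds to wait before retrying, or None to give up.

    `attempt` is zero-based: 0 for the first retry, 1 for the second.
    """
    if status_code not in RETRYABLE_STATUS:
        return None
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        parsed = {}
    details = parsed.get("details", {}) if isinstance(parsed, dict) else {}
    if status_code == 429 and "retry_after" in details:
        # Prefer the wait time reported in the error body.
        return float(details["retry_after"])
    # Exponential backoff: base, 2*base, 4*base, ... up to the cap.
    return min(cap, base * 2 ** attempt)
```

A caller would sleep for the returned number of seconds and resend the request, stopping after a fixed number of attempts.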

Troubleshooting Guide

Performance Issues:

Check current rate limits and usage
Optimize query complexity and filters
Implement client-side caching
Consider request batching for multiple queries
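
Client-side caching, as suggested in the list above, can be as simple as remembering recent answers for a short time. This is a minimal in-memory sketch under stated assumptions: the `TTLCache` class is hypothetical, `fetch` stands in for whatever function actually calls the query endpoint, and the five-minute default lifetime is an arbitrary choice.

```python
from __future__ import annotations

import time
from typing import Callable


class TTLCache:
    """Cache query answers in memory for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, str]] = {}

    def get_or_fetch(self, query: str, fetch: Callable[[str], str]) -> str:
        # Normalize case and whitespace so trivially different
        # spellings of the same query share one cache entry.
        key = " ".join(query.lower().split())
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = fetch(query)
        self._entries[key] = (now, value)
        return value
```

Cached answers can go stale when documents change, so a short lifetime is the safer default.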

Quality Issues:

Review and refine query templates
Analyze retrieved sources for relevance
Adjust similarity thresholds
Consider document processing optimization

Authentication Problems:

Verify API key validity and permissions
Check IP whitelisting configuration
Confirm proper header formatting
Review SSL/TLS certificate issues

Testing & Deployment

Testing Framework

Unit Testing:

Individual endpoint testing
Authentication validation
Rate limiting verification
Error handling confirmation

Integration Testing:

End-to-end workflow testing
SDK compatibility verification
Webhook delivery confirmation
Performance benchmark validation

Load Testing:

Concurrent request handling
Rate limit behavior under load
System stability under stress
Performance degradation thresholds

Deployment Checklist

Pre-deployment:

✅ All configuration steps completed successfully
✅ API keys generated and secured
✅ Rate limits configured appropriately
✅ Monitoring and alerting set up
✅ Integration documentation reviewed

Post-deployment:

✅ Endpoint accessibility verified
✅ Authentication working correctly
✅ Sample queries returning expected results
✅ Monitoring dashboards operational
✅ Team access and permissions configured

Production Readiness:

✅ Load testing completed successfully
✅ Error handling and recovery tested
✅ Security review and penetration testing completed
✅ Documentation and training materials ready
✅ Support procedures established

Your RAG system is now fully configured and ready for production use. The API endpoints provide secure, scalable access to your knowledge base with comprehensive monitoring and management capabilities.
