Step 4 - Deploy API Endpoint

Configure and test your production API endpoint with proper authentication

Purpose

Configure API security settings and test your RAG endpoint with proper authentication. This step transforms your tested configuration into a production-ready API.

Entry Point: API Setup tab → "Test Endpoint" button

Prerequisites: Pipeline configuration optimized (Step 3 complete)

Expected Outcome: Verified API endpoint working with authentication

API Configuration Checklist

API Key Management

Item	Status	Notes
API key generated	✅ Automatic	Created on project creation
API key securely stored	⚠️ Manual	Copy immediately after creation
API key masked display	✅ Automatic	For verification only

Important: The raw API key is only shown once during initial generation. Store it securely immediately.
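Because the raw key is shown only once, one common approach is to place it in an environment variable and read it at startup. The sketch below assumes a variable named `GUIDEDMIND_API_KEY`; that name and both helper functions are hypothetical and not part of the platform.

```python
import os

# Hypothetical variable name; use whatever your deployment defines.
API_KEY_ENV_VAR = "GUIDEDMIND_API_KEY"


def load_api_key(env=os.environ):
    """Return the API key from the environment, failing loudly if it is absent."""
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"{API_KEY_ENV_VAR} is not set")
    return key


def mask_key(key, visible=4):
    """Mask a key for logs, keeping only the last few characters visible."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Logging only the masked form keeps the raw key out of log files while still letting you confirm which key is in use.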

Security Settings

Setting	Recommended Value	Purpose
Rate Limit	1000 req/hour	Prevent abuse
Timeout	30 seconds	Resource management
IP Whitelist	Your application IPs	Additional security

Input Configuration

Verify your API accepts:

Query text - The user's question or search
Optional filters - Document IDs, metadata filters
Model options - Temperature, max_tokens, stream
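The three input groups above can be assembled into one request body. This is a minimal sketch: the `query`, `filters.document_ids`, and `options` keys mirror the cURL examples in this guide, while placing `max_tokens` and `stream` under `options` is an assumption, and `build_query_payload` is a hypothetical helper.

```python
def build_query_payload(query, document_ids=None, **options):
    """Build a JSON-serializable request body from query text, filters, and options."""
    if not query or not query.strip():
        raise ValueError("query text must not be empty")

    payload = {"query": query.strip()}

    # Filters are optional; only include them when document IDs are given.
    if document_ids:
        payload["filters"] = {"document_ids": list(document_ids)}

    # Drop unset options so the server-side defaults apply.
    chosen = {name: value for name, value in options.items() if value is not None}
    if chosen:
        payload["options"] = chosen
    return payload
```

For example, `build_query_payload("What is the return policy?", max_results=5, temperature=0.7)` produces the body used in the basic cURL query.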

Testing Methods

Method 1: Built-in Test Console

Location: API Setup tab → "Test Endpoint" button

Steps:

1. Enter a test query in the console.
2. Confirm that the API key is auto-filled from your project.
3. Click "Send" to execute the request.
4. Review the response format and timing.

What to Verify:

Response contains expected fields
Similarity scores match Step 2 testing
Processing time is acceptable
Sources are correctly attributed

Method 2: cURL Command

Basic Query:

curl -X POST "https://api.guidedmind.ai/v1/rag/YOUR_PROJECT_ID/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the return policy?",
    "options": {
      "max_results": 5,
      "temperature": 0.7
    }
  }'

Query with Filters:

curl -X POST "https://api.guidedmind.ai/v1/rag/YOUR_PROJECT_ID/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the return policy?",
    "filters": {
      "document_ids": ["doc_123", "doc_456"]
    },
    "options": {
      "max_results": 5
    }
  }'

Query with GraphRAG:

curl -X POST "https://api.guidedmind.ai/v1/rag/YOUR_PROJECT_ID/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Who works at Acme Corp?",
    "options": {
      "include_graph": true
    }
  }'
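The same requests can be issued from application code. The following sketch uses only the Python standard library and reuses the URL pattern and headers from the cURL commands above; `build_request` and `send_query` are hypothetical helper names, and the 30-second default simply mirrors the recommended timeout setting.

```python
import json
import urllib.request

BASE_URL = "https://api.guidedmind.ai/v1/rag"


def build_request(project_id, api_key, payload):
    """Create the POST request without sending it, so it can be inspected."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{project_id}/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_query(project_id, api_key, payload, timeout=30):
    """Send the request and return the parsed JSON body."""
    request = build_request(project_id, api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to check the URL and headers in unit tests without network access.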

Method 3: MCP (AI Agent Integration)

For AI agents using MCP, the tool is automatically discovered:

User: "What is the return policy?"
Agent: [Automatically calls MCP tool `rag_query` with query]
Agent: "Based on the documents, the return policy states..."

MCP Tool Discovery:

Agents automatically discover available tools
No manual API key management needed
Natural language tool invocation
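Under the hood, an MCP client turns the agent's decision into a JSON-RPC `tools/call` message. The sketch below shows roughly what that message could look like for the `rag_query` tool; the single `query` argument is an assumption about the tool's input schema, and in practice an MCP client library builds this message for you.

```python
def build_tool_call(query, request_id=1):
    """Build a JSON-RPC 2.0 message that invokes the rag_query MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "rag_query",
            # Assumed argument schema: a single "query" string.
            "arguments": {"query": query},
        },
    }
```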

Expected Response Format

Standard Response

{
  "query": "What is the return policy?",
  "response": "Returns are accepted within 30 days of purchase. Refunds are processed within 5-7 business days.",
  "sources": [
    {
      "document_id": "doc_123",
      "title": "Return Policy",
      "content": "Returns accepted within 30 days...",
      "similarity_score": 0.89,
      "metadata": {
        "page": 3,
        "section": "Returns"
      }
    }
  ],
  "processing_time": 1.23,
  "token_usage": {
    "query_tokens": 8,
    "response_tokens": 45,
    "total_tokens": 53
  }
}

Response Fields Explained

Field	Description
`query`	Original user query
`response`	Generated answer
`sources`	Retrieved chunks with scores
`processing_time`	Total latency in seconds
`token_usage`	Token consumption breakdown
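A small check like the one below can confirm that a parsed response contains the fields listed above. It is a sketch, not an official validator: `check_response` is a hypothetical helper, and the rule that `total_tokens` equals the sum of query and response tokens is an assumption based on the sample response.

```python
EXPECTED_FIELDS = ("query", "response", "sources", "processing_time", "token_usage")


def check_response(body, max_seconds=30.0):
    """Return a list of problems found in a parsed response body (empty if none)."""
    problems = [f"missing field: {name}" for name in EXPECTED_FIELDS if name not in body]

    # Each source should identify its document and carry a similarity score.
    for index, source in enumerate(body.get("sources", [])):
        for key in ("document_id", "similarity_score"):
            if key not in source:
                problems.append(f"sources[{index}] lacks {key}")

    elapsed = body.get("processing_time")
    if isinstance(elapsed, (int, float)) and elapsed > max_seconds:
        problems.append(f"processing_time {elapsed}s exceeds {max_seconds}s")

    # Assumed invariant: total_tokens = query_tokens + response_tokens.
    usage = body.get("token_usage") or {}
    parts = (usage.get("query_tokens"), usage.get("response_tokens"))
    if usage and None not in parts and sum(parts) != usage.get("total_tokens"):
        problems.append("token_usage totals do not add up")
    return problems
```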

GraphRAG Response

{
  "query": "Who works at Acme Corp?",
  "response": "John Smith and Jane Doe work at Acme Corp.",
  "sources": [...],
  "graph_results": {
    "entities": [
      {
        "id": "node_1",
        "name": "John Smith",
        "category": "Person"
      },
      {
        "id": "node_2",
        "name": "Acme Corp",
        "category": "Organization"
      }
    ],
    "relationships": [
      {
        "source": "node_1",
        "target": "node_2",
        "type": "WORKS_FOR"
      }
    ]
  }
}
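Relationships in `graph_results` refer to entities by ID, so a client usually needs to join the two lists before displaying them. A minimal sketch, assuming the field names shown in the sample above (`relationship_triples` is a hypothetical helper):

```python
def relationship_triples(graph_results):
    """Resolve relationship endpoints to entity names as (source, type, target) tuples."""
    names = {entity["id"]: entity["name"] for entity in graph_results.get("entities", [])}
    triples = []
    for rel in graph_results.get("relationships", []):
        # Fall back to the raw node ID if an entity is missing from the list.
        source = names.get(rel["source"], rel["source"])
        target = names.get(rel["target"], rel["target"])
        triples.append((source, rel["type"], target))
    return triples
```

Applied to the sample response, this yields a single triple linking John Smith to Acme Corp through `WORKS_FOR`.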

API vs MCP Comparison

Aspect	REST API v1	MCP Server
Best For	Traditional apps, web backends	AI agents (Claude, Cursor)
Authentication	API key in header	MCP protocol handles auth
Query Style	HTTP POST with JSON	Natural language tool calls
Response Format	Structured JSON	Structured JSON (MCP standard)
Streaming	SSE support	MCP progress tokens
Tool Discovery	Manual (read docs)	Automatic (MCP manifest)

When to Use REST API

Building a web application backend
Integrating with existing systems
Need fine-grained control over HTTP headers
Working with non-LLM clients

When to Use MCP

Building AI agent integrations
Want natural language tool discovery
Using Claude Desktop, Cursor, or similar
Prefer automatic tool documentation

Troubleshooting

401 Unauthorized

Cause: Invalid or missing API key

Solution:

# Verify that the API key matches the stored key exactly and follows the "Bearer " prefix
curl -X POST "https://api.guidedmind.ai/v1/rag/YOUR_PROJECT_ID/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  ...

429 Rate Limit Exceeded

Cause: Too many requests

Solution:

Reduce request frequency
Contact support for higher limits
Implement client-side rate limiting
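Client-side rate limiting can be as simple as a sliding-window counter that refuses to send once the budget is spent. The sketch below defaults to the recommended 1000 requests per hour; the class and its method names are hypothetical, and the injectable clock exists only to make it testable.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds interval."""

    def __init__(self, max_requests=1000, window_seconds=3600.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent = deque()  # timestamps of requests still inside the window

    def _purge(self, now):
        # Forget requests that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()

    def try_acquire(self):
        """Record a request and return True, or return False if the budget is spent."""
        now = self._clock()
        self._purge(now)
        if len(self._sent) >= self.max_requests:
            return False
        self._sent.append(now)
        return True

    def seconds_until_free(self):
        """Return how long to wait before the next request would be allowed."""
        now = self._clock()
        self._purge(now)
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])
```

Call `try_acquire()` before each request; when it returns False, sleep for `seconds_until_free()` rather than sending and receiving a 429.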

500 Internal Error

Cause: Server-side issue

Solution:

Check query format
Verify project has processed documents
Contact support if persistent

Timeout

Cause: Query taking too long

Solution:

Reduce Top-K value
Simplify query
Check server resources
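To surface this guidance in application logs, the failure modes above can be encoded as a lookup. This is only a sketch that restates the troubleshooting lists from this section; `troubleshooting_hints` is a hypothetical helper, and unlisted status codes return no hints.

```python
HINTS = {
    401: ["Check that the API key is present and correct"],
    429: [
        "Reduce request frequency",
        "Contact support for higher limits",
        "Implement client-side rate limiting",
    ],
    500: [
        "Check query format",
        "Verify project has processed documents",
        "Contact support if persistent",
    ],
}

TIMEOUT_HINTS = ["Reduce Top-K value", "Simplify query", "Check server resources"]


def troubleshooting_hints(status_code=None, timed_out=False):
    """Return the suggested fixes for an HTTP status code or a client-side timeout."""
    if timed_out:
        return list(TIMEOUT_HINTS)
    return list(HINTS.get(status_code, []))
```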

Next Steps

For Standard RAG Users

Proceed to Step 6: Benchmarking & Iteration to set up quality tracking.

For GraphRAG Users

Proceed to Step 5: AI Graph Editor to test graph search and editing capabilities.

What to Bring to Next Step

Working API endpoint with verified authentication
API key stored securely
Baseline response format for comparison
Processing time metrics for monitoring

Tips for Success

Test All Methods: Verify that the built-in console and cURL both work, plus MCP if you plan to use it
Store API Key Securely: Use environment variables or secret management
Monitor Rate Limits: Track usage to avoid hitting limits
Document Response Format: Share with your team for integration
Set Up Monitoring: Track latency and error rates
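For the monitoring tip, a lightweight in-process tracker is often enough to start with. The sketch below records per-request latency and failures and reports an error rate and a nearest-rank 95th percentile; the class is hypothetical, and a production setup would more likely export these numbers to an existing metrics system.

```python
class EndpointStats:
    """Collect latency and error counts for calls to the endpoint."""

    def __init__(self):
        self.latencies = []  # seconds, one entry per request
        self.errors = 0

    def record(self, seconds, ok=True):
        """Record one request's latency and whether it succeeded."""
        self.latencies.append(seconds)
        if not ok:
            self.errors += 1

    def error_rate(self):
        """Fraction of recorded requests that failed (0.0 when nothing is recorded)."""
        if not self.latencies:
            return 0.0
        return self.errors / len(self.latencies)

    def p95_latency(self):
        """Nearest-rank 95th percentile latency, or None when nothing is recorded."""
        if not self.latencies:
            return None
        ordered = sorted(self.latencies)
        rank = (95 * len(ordered) + 99) // 100  # integer ceiling of 0.95 * n
        return ordered[rank - 1]
```

The `processing_time` field from each response can be passed to `record()` directly, or you can time the full round trip on the client to include network latency.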
