Step 4 - Deploy API Endpoint

Configure and test your production API endpoint with proper authentication

Purpose

Configure API security settings and test your RAG endpoint with proper authentication. This step transforms your tested configuration into a production-ready API.

Entry Point: API Setup tab → "Test Endpoint" button

Prerequisites: Pipeline configuration optimized (Step 3 complete)

Expected Outcome: Verified API endpoint working with authentication

API Configuration Checklist

API Key Management

| Item | Status | Notes |
| --- | --- | --- |
| API key generated | ✅ Automatic | Created on project creation |
| API key securely stored | ⚠️ Manual | Copy immediately after creation |
| API key masked display | ✅ Automatic | For verification only |

Important: The raw API key is shown only once, at initial generation. Copy it to secure storage right away; afterwards only the masked version is displayed.
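One common way to keep the key out of source code is to read it from an environment variable. The following is a minimal Python sketch, not part of the platform: the variable name `GUIDEDMIND_API_KEY` and both helper functions are hypothetical and chosen only for illustration.

```python
import os


def load_api_key(env_var: str = "GUIDEDMIND_API_KEY") -> str:
    """Read the API key from the environment, failing loudly if it is absent.

    The variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before calling the API")
    return key


def mask_key(key: str, visible: int = 4) -> str:
    """Return a masked form of the key that is safe to print in logs."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Logging `mask_key(load_api_key())` instead of the raw value keeps the secret out of log files while still letting you confirm which key is in use.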

Security Settings

| Setting | Recommended Value | Purpose |
| --- | --- | --- |
| Rate Limit | 1000 req/hour | Prevent abuse |
| Timeout | 30 seconds | Resource management |
| IP Whitelist | Your application IPs | Additional security |

Input Configuration

Verify your API accepts:

  • Query text - The user's question or search
  • Optional filters - Document IDs, metadata filters
  • Model options - temperature, max_tokens, stream
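The three input groups above can be assembled into a single request body. The sketch below is illustrative only: `build_payload` is a hypothetical helper, the field names `query`, `filters`, `document_ids`, and `options` follow the cURL examples on this page, and placing `max_tokens` or `stream` under `options` is an assumption.

```python
import json
from typing import Any, Optional


def build_payload(
    query: str,
    document_ids: Optional[list[str]] = None,
    **options: Any,
) -> dict[str, Any]:
    """Assemble a query body; empty groups are left out entirely."""
    if not query.strip():
        raise ValueError("query must not be empty")
    payload: dict[str, Any] = {"query": query}
    if document_ids:
        # Optional filters: restrict retrieval to specific documents.
        payload["filters"] = {"document_ids": list(document_ids)}
    if options:
        # Model options such as temperature or max_results.
        payload["options"] = dict(options)
    return payload


body = build_payload("What is the return policy?", ["doc_123"], max_results=5)
print(json.dumps(body, indent=2))
```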

Testing Methods

Method 1: Built-in Test Console

Location: API Setup tab → "Test Endpoint" button

Steps:

  1. Enter a test query in the console
  2. Confirm that the API key has been auto-filled from your project
  3. Click "Send" to execute the request
  4. Review the response format and timing

What to Verify:

  • Response contains expected fields
  • Similarity scores match Step 2 testing
  • Processing time is acceptable
  • Sources are correctly attributed

Method 2: cURL Command

Basic Query:

curl -X POST "https://api.guidedmind.ai/v1/rag/YOUR_PROJECT_ID/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the return policy?",
    "options": {
      "max_results": 5,
      "temperature": 0.7
    }
  }'

Query with Filters:

curl -X POST "https://api.guidedmind.ai/v1/rag/YOUR_PROJECT_ID/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the return policy?",
    "filters": {
      "document_ids": ["doc_123", "doc_456"]
    },
    "options": {
      "max_results": 5
    }
  }'

Query with GraphRAG:

curl -X POST "https://api.guidedmind.ai/v1/rag/YOUR_PROJECT_ID/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Who works at Acme Corp?",
    "options": {
      "include_graph": true
    }
  }'
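For application code, the same requests can be issued from Python. This is a hedged sketch using only the standard library: the URL pattern and headers are copied from the cURL commands above, while `build_request`, `send_query`, and the 30-second client timeout (mirroring the recommended Timeout setting) are assumptions made for illustration.

```python
import json
import urllib.request

BASE_URL = "https://api.guidedmind.ai/v1/rag"


def build_request(project_id: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the cURL examples."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{project_id}/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_query(project_id: str, api_key: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    request = build_request(project_id, api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting construction from sending makes it possible to inspect the URL, headers, and body in a unit test without touching the network.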

Method 3: MCP (AI Agent Integration)

For AI agents using MCP, the tool is automatically discovered:

User: "What is the return policy?"
Agent: [Automatically calls MCP tool `rag_query` with query]
Agent: "Based on the documents, the return policy states..."

MCP Tool Discovery:

  • Agents automatically discover available tools
  • No manual API key management needed
  • Natural language tool invocation

Expected Response Format

Standard Response

{
  "query": "What is the return policy?",
  "response": "Returns are accepted within 30 days of purchase. Refunds are processed within 5-7 business days.",
  "sources": [
    {
      "document_id": "doc_123",
      "title": "Return Policy",
      "content": "Returns accepted within 30 days...",
      "similarity_score": 0.89,
      "metadata": {
        "page": 3,
        "section": "Returns"
      }
    }
  ],
  "processing_time": 1.23,
  "token_usage": {
    "query_tokens": 8,
    "response_tokens": 45,
    "total_tokens": 53
  }
}

Response Fields Explained

| Field | Description |
| --- | --- |
| query | Original user query |
| response | Generated answer |
| sources | Retrieved chunks with scores |
| processing_time | Total latency in seconds |
| token_usage | Token consumption breakdown |
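The "expected fields" check can be automated against a decoded response. The sketch below assumes the response has the shape of the standard sample above; the helper names and the 0.8 score threshold are illustrative choices, not platform defaults.

```python
EXPECTED_FIELDS = {"query", "response", "sources", "processing_time", "token_usage"}


def missing_fields(response: dict) -> set[str]:
    """Return the documented top-level fields that the response lacks."""
    return EXPECTED_FIELDS - set(response)


def top_sources(response: dict, min_score: float = 0.8) -> list[tuple[str, float]]:
    """(document_id, similarity_score) pairs at or above min_score, best first."""
    pairs = [
        (source["document_id"], source["similarity_score"])
        for source in response.get("sources", [])
        if source["similarity_score"] >= min_score
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Trimmed copy of the standard sample response, used as test data.
sample = {
    "query": "What is the return policy?",
    "response": "Returns are accepted within 30 days of purchase.",
    "sources": [
        {"document_id": "doc_123", "title": "Return Policy", "similarity_score": 0.89}
    ],
    "processing_time": 1.23,
    "token_usage": {"query_tokens": 8, "response_tokens": 45, "total_tokens": 53},
}

print(missing_fields(sample))
print(top_sources(sample))
```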

GraphRAG Response

{
  "query": "Who works at Acme Corp?",
  "response": "John Smith and Jane Doe work at Acme Corp.",
  "sources": [...],
  "graph_results": {
    "entities": [
      {
        "id": "node_1",
        "name": "John Smith",
        "category": "Person"
      },
      {
        "id": "node_2",
        "name": "Acme Corp",
        "category": "Organization"
      }
    ],
    "relationships": [
      {
        "source": "node_1",
        "target": "node_2",
        "type": "WORKS_FOR"
      }
    ]
  }
}
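Relationships refer to entities by `id`, so a client usually needs to join the two lists before displaying them. A minimal sketch, assuming the `graph_results` shape shown in the sample above (`describe_relationships` is a hypothetical helper):

```python
def describe_relationships(graph_results: dict) -> list[str]:
    """Render each relationship as 'Source Name -[TYPE]-> Target Name'."""
    names = {entity["id"]: entity["name"] for entity in graph_results.get("entities", [])}
    lines = []
    for rel in graph_results.get("relationships", []):
        # Fall back to the raw node id if an entity is missing from the list.
        source = names.get(rel["source"], rel["source"])
        target = names.get(rel["target"], rel["target"])
        lines.append(f"{source} -[{rel['type']}]-> {target}")
    return lines


# Test data copied from the GraphRAG sample response.
graph_sample = {
    "entities": [
        {"id": "node_1", "name": "John Smith", "category": "Person"},
        {"id": "node_2", "name": "Acme Corp", "category": "Organization"},
    ],
    "relationships": [
        {"source": "node_1", "target": "node_2", "type": "WORKS_FOR"}
    ],
}

print(describe_relationships(graph_sample))
```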

API vs MCP Comparison

| Aspect | REST API v1 | MCP Server |
| --- | --- | --- |
| Best For | Traditional apps, web backends | AI agents (Claude, Cursor) |
| Authentication | API key in header | MCP protocol handles auth |
| Query Style | HTTP POST with JSON | Natural language tool calls |
| Response Format | Structured JSON | Structured JSON (MCP standard) |
| Streaming | SSE support | MCP progress tokens |
| Tool Discovery | Manual (read docs) | Automatic (MCP manifest) |

When to Use REST API

  • Building a web application backend
  • Integrating with existing systems
  • Need fine-grained control over HTTP headers
  • Working with non-LLM clients

When to Use MCP

  • Building AI agent integrations
  • Want natural language tool discovery
  • Using Claude Desktop, Cursor, or similar
  • Prefer automatic tool documentation

Troubleshooting

401 Unauthorized

Cause: Invalid or missing API key

Solution:

# Verify that the API key is correct and is sent as a Bearer token in the Authorization header
curl -X POST "https://api.guidedmind.ai/v1/rag/YOUR_PROJECT_ID/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  ...

429 Rate Limit Exceeded

Cause: Too many requests

Solution:

  • Reduce request frequency
  • Contact support for higher limits
  • Implement client-side rate limiting
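Client-side rate limiting can be as simple as tracking recent request timestamps. The sketch below is one possible approach (a sliding-window limiter), with defaults that mirror the recommended 1000 req/hour setting; the class and its method names are hypothetical, and the server may count requests differently.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls in any window_seconds span."""

    def __init__(self, max_requests: int = 1000, window_seconds: float = 3600.0,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._sent = deque()  # timestamps of requests still inside the window

    def wait_time(self) -> float:
        """Seconds to wait before the next request is allowed (0.0 if none)."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record(self) -> None:
        """Call once for every request actually sent."""
        self._sent.append(self._clock())
```

Before each request, call `time.sleep(limiter.wait_time())` and then `limiter.record()`.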

500 Internal Server Error

Cause: Server-side issue

Solution:

  • Check query format
  • Verify project has processed documents
  • Contact support if persistent

Timeout

Cause: Query taking too long

Solution:

  • Reduce Top-K value
  • Simplify query
  • Check server resources
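Reducing Top-K can also be done automatically when a request times out. The following sketch assumes that Top-K corresponds to the `max_results` option used in the cURL examples (an assumption, not confirmed by this page) and that `send` is any function you supply that takes a payload and raises `TimeoutError` on a timeout.

```python
def query_with_shrinking_top_k(send, payload: dict, min_results: int = 1) -> dict:
    """Retry send(payload) with max_results halved after each TimeoutError."""
    options = dict(payload.get("options", {}))
    top_k = options.get("max_results", 5)
    while True:
        # Build a fresh payload so the caller's dict is never mutated.
        attempt = {**payload, "options": {**options, "max_results": top_k}}
        try:
            return send(attempt)
        except TimeoutError:
            if top_k <= min_results:
                raise  # nothing left to reduce
            top_k = max(min_results, top_k // 2)
```

If the request still times out at the smallest value, the error is re-raised so the caller can simplify the query or investigate server resources.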

Next Steps

For Standard RAG Users

Proceed to Step 6: Benchmarking & Iteration to set up quality tracking.

For GraphRAG Users

Proceed to Step 5: AI Graph Editor to test graph search and editing capabilities.

What to Bring to Next Step

  1. Working API endpoint with verified authentication
  2. API key stored securely
  3. Baseline response format for comparison
  4. Processing time metrics for monitoring

Tips for Success

  1. Test All Methods: Verify that the built-in console and cURL both work, plus MCP if you use it
  2. Store API Key Securely: Use environment variables or secret management
  3. Monitor Rate Limits: Track usage to avoid hitting limits
  4. Document Response Format: Share with your team for integration
  5. Set Up Monitoring: Track latency and error rates
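As a starting point for monitoring, latency and error rate can be tracked in-process before wiring up a full metrics system. This is a minimal sketch: `EndpointStats` is a hypothetical class, and it assumes you feed it the `processing_time` value from each successful response.

```python
import statistics


class EndpointStats:
    """Accumulate per-request latency and success/failure counts."""

    def __init__(self):
        self.latencies = []  # processing_time values in seconds
        self.errors = 0
        self.total = 0

    def record(self, processing_time=None, ok=True):
        """Record one request; pass ok=False for failed calls."""
        self.total += 1
        if not ok:
            self.errors += 1
        if processing_time is not None:
            self.latencies.append(processing_time)

    def error_rate(self) -> float:
        """Fraction of recorded requests that failed (0.0 if none recorded)."""
        return self.errors / self.total if self.total else 0.0

    def median_latency(self):
        """Median recorded latency in seconds, or None if nothing was recorded."""
        return statistics.median(self.latencies) if self.latencies else None
```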