Step 4 - Deploy API Endpoint
Configure and test your production API endpoint with proper authentication
Purpose
Configure API security settings and test your RAG endpoint with proper authentication. This step transforms your tested configuration into a production-ready API.
Entry Point: API Setup tab → "Test Endpoint" button
Prerequisites: Pipeline configuration optimized (Step 3 complete)
Expected Outcome: Verified API endpoint working with authentication
API Configuration Checklist
API Key Management
| Item | Status | Notes |
|---|---|---|
| API key generated | ✅ Automatic | Created on project creation |
| API key securely stored | ⚠️ Manual | Copy immediately after creation |
| API key masked display | ✅ Automatic | For verification only |
Important: The raw API key is only shown once during initial generation. Store it securely immediately.
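One way to keep the key out of source code is to read it from an environment variable and log only a masked form. The following is a minimal sketch; the variable name `GUIDEDMIND_API_KEY` and both helper functions are assumptions made for illustration, not part of the product.

```python
import os

# Hypothetical variable name; use whatever your deployment defines.
API_KEY_ENV_VAR = "GUIDEDMIND_API_KEY"


def load_api_key(env=os.environ):
    """Return the API key from the environment, failing loudly if it is absent."""
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"{API_KEY_ENV_VAR} is not set")
    return key


def mask_api_key(key, visible=4):
    """Hide all but the last few characters so the value is safer to log."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Failing at startup when the variable is missing tends to be easier to diagnose than a 401 at request time.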
Security Settings
| Setting | Recommended Value | Purpose |
|---|---|---|
| Rate Limit | 1000 req/hour | Prevent abuse |
| Timeout | 30 seconds | Resource management |
| IP Whitelist | Your application IPs | Additional security |
Input Configuration
Verify your API accepts:
- Query text - The user's question or search
- Optional filters - Document IDs, metadata filters
- Model options - Temperature, max_tokens, stream
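The three input groups above can be assembled into one request body. This sketch follows the shape used by the cURL examples in this guide (`filters.document_ids` and an `options` object); the helper name `build_payload` is hypothetical, and placing `max_tokens` and `stream` under `options` is an assumption.

```python
import json


def build_payload(query, document_ids=None, **options):
    """Assemble a request body in the shape used by the cURL examples."""
    if not query or not query.strip():
        raise ValueError("query must be a non-empty string")
    payload = {"query": query}
    if document_ids:
        payload["filters"] = {"document_ids": list(document_ids)}
    # Drop unset options so that server-side defaults apply.
    cleaned = {name: value for name, value in options.items() if value is not None}
    if cleaned:
        payload["options"] = cleaned
    return payload


body = build_payload("What is the return policy?", max_results=5, temperature=0.7)
print(json.dumps(body, indent=2))
```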
Testing Methods
Method 1: Built-in Test Console
Location: API Setup tab → "Test Endpoint" button
Steps:
- Enter test query in the console
- API key is auto-filled from your project
- Click "Send" to execute
- Review response format and timing
What to Verify:
- Response contains expected fields
- Similarity scores match Step 2 testing
- Processing time is acceptable
- Sources are correctly attributed
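The checklist above can also be automated against a decoded JSON response. This is a rough sketch: the field names come from the standard response format in this guide, while the function name and the `max_seconds` and `min_score` thresholds are assumptions you would tune to your own Step 2 results.

```python
REQUIRED_FIELDS = ("query", "response", "sources", "processing_time", "token_usage")


def check_response(body, max_seconds=5.0, min_score=0.0):
    """Return a list of problems; an empty list means every check passed."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in body]
    elapsed = body.get("processing_time")
    if isinstance(elapsed, (int, float)) and elapsed > max_seconds:
        problems.append(f"slow response: {elapsed:.2f}s > {max_seconds:.2f}s")
    for index, source in enumerate(body.get("sources") or []):
        # A source without a document ID cannot be attributed.
        if not source.get("document_id"):
            problems.append(f"source {index} has no document_id")
        score = source.get("similarity_score")
        if score is None or score < min_score:
            problems.append(f"source {index} score below {min_score}: {score}")
    return problems
```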
Method 2: cURL Command
Basic Query:
curl -X POST "https://api.guidedmind.ai/v1/rag/YOUR_PROJECT_ID/query" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"query": "What is the return policy?",
"options": {
"max_results": 5,
"temperature": 0.7
}
}'
Query with Filters:
curl -X POST "https://api.guidedmind.ai/v1/rag/YOUR_PROJECT_ID/query" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"query": "What is the return policy?",
"filters": {
"document_ids": ["doc_123", "doc_456"]
},
"options": {
"max_results": 5
}
}'
Query with GraphRAG:
curl -X POST "https://api.guidedmind.ai/v1/rag/YOUR_PROJECT_ID/query" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"query": "Who works at Acme Corp?",
"options": {
"include_graph": true
}
}'
Method 3: MCP (AI Agent Integration)
For AI agents using MCP, the tool is automatically discovered:
User: "What is the return policy?"
Agent: [Automatically calls MCP tool `rag_query` with query]
Agent: "Based on the documents, the return policy states..."
MCP Tool Discovery:
- Agents automatically discover available tools
- No manual API key management needed
- Natural language tool invocation
Expected Response Format
Standard Response
{
"query": "What is the return policy?",
"response": "Returns are accepted within 30 days of purchase. Refunds are processed within 5-7 business days.",
"sources": [
{
"document_id": "doc_123",
"title": "Return Policy",
"content": "Returns accepted within 30 days...",
"similarity_score": 0.89,
"metadata": {
"page": 3,
"section": "Returns"
}
}
],
"processing_time": 1.23,
"token_usage": {
"query_tokens": 8,
"response_tokens": 45,
"total_tokens": 53
}
}
Response Fields Explained
| Field | Description |
|---|---|
| `query` | Original user query |
| `response` | Generated answer |
| `sources` | Retrieved chunks with similarity scores |
| `processing_time` | Total latency in seconds |
| `token_usage` | Token consumption breakdown |
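For integration code, it can help to convert the decoded JSON into typed objects so that a missing field fails early. The sketch below assumes the standard response shape documented in this section; the class and function names are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    document_id: str
    title: str
    content: str
    similarity_score: float
    metadata: dict = field(default_factory=dict)


@dataclass
class RagResponse:
    query: str
    response: str
    sources: list
    processing_time: float
    total_tokens: int


def parse_response(body):
    """Convert a decoded JSON body into typed objects (KeyError if a required field is missing)."""
    sources = [
        Source(
            document_id=item["document_id"],
            title=item.get("title", ""),
            content=item.get("content", ""),
            similarity_score=float(item["similarity_score"]),
            metadata=item.get("metadata", {}),
        )
        for item in body["sources"]
    ]
    return RagResponse(
        query=body["query"],
        response=body["response"],
        sources=sources,
        processing_time=float(body["processing_time"]),
        total_tokens=int(body["token_usage"]["total_tokens"]),
    )
```

With the parsed object, the best-scoring source is `max(parsed.sources, key=lambda s: s.similarity_score)`.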
GraphRAG Response
{
"query": "Who works at Acme Corp?",
"response": "John Smith and Jane Doe work at Acme Corp.",
"sources": [...],
"graph_results": {
"entities": [
{
"id": "node_1",
"name": "John Smith",
"category": "Person"
},
{
"id": "node_2",
"name": "Acme Corp",
"category": "Organization"
}
],
"relationships": [
{
"source": "node_1",
"target": "node_2",
"type": "WORKS_FOR"
}
]
}
}
API vs MCP Comparison
| Aspect | REST API v1 | MCP Server |
|---|---|---|
| Best For | Traditional apps, web backends | AI agents (Claude, Cursor) |
| Authentication | API key in header | MCP protocol handles auth |
| Query Style | HTTP POST with JSON | Natural language tool calls |
| Response Format | Structured JSON | Structured JSON (MCP standard) |
| Streaming | SSE support | MCP progress tokens |
| Tool Discovery | Manual (read docs) | Automatic (MCP manifest) |
When to Use REST API
- Building a web application backend
- Integrating with existing systems
- Need fine-grained control over HTTP headers
- Working with non-LLM clients
When to Use MCP
- Building AI agent integrations
- Want natural language tool discovery
- Using Claude Desktop, Cursor, or similar
- Prefer automatic tool documentation
Troubleshooting
401 Unauthorized
Cause: Invalid or missing API key
Solution:
# Verify API key is correct
curl -X POST "https://api.guidedmind.ai/v1/rag/YOUR_PROJECT_ID/query" \
-H "Authorization: Bearer YOUR_API_KEY" \
...
429 Rate Limit Exceeded
Cause: Too many requests
Solution:
- Reduce request frequency
- Contact support for higher limits
- Implement client-side rate limiting
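Client-side rate limiting can be as simple as a rolling-window counter. The sketch below defaults to the 1000 requests/hour value from the Security Settings table; whether the server enforces a rolling or a fixed window is not stated in this guide, so treat the window semantics and the class name as assumptions.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any rolling `window` seconds."""

    def __init__(self, limit=1000, window=3600.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.sent = deque()  # timestamps of calls still inside the window

    def wait_time(self):
        """Seconds until the next call is allowed (0.0 if one is allowed now)."""
        now = self.clock()
        # Forget calls that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        return self.window - (now - self.sent[0])

    def try_acquire(self):
        """Record a call and return True if a slot is free; otherwise return False."""
        if self.wait_time() > 0:
            return False
        self.sent.append(self.clock())
        return True
```

A caller would typically loop with `while not limiter.try_acquire(): time.sleep(limiter.wait_time())` before each request.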
500 Internal Error
Cause: Server-side issue
Solution:
- Check query format
- Verify project has processed documents
- Contact support if persistent
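Before concluding that an error is persistent, it may be worth retrying a few times with exponential backoff. This is a generic sketch rather than documented client behavior: `ApiError` is a hypothetical exception type, and `send` stands for whatever function issues your request.

```python
import random
import time


class ApiError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_retry(send, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call `send()` and retry on 429 or 5xx, doubling the base delay each time."""
    for attempt in range(attempts):
        try:
            return send()
        except ApiError as error:
            retryable = error.status == 429 or error.status >= 500
            if not retryable or attempt == attempts - 1:
                raise
            # Exponential backoff plus jitter to avoid synchronized retries.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

Client errors such as 401 are re-raised immediately, since repeating the same request would not change the outcome.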
Timeout
Cause: Query taking too long
Solution:
- Reduce Top-K value
- Simplify query
- Check server resources
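On the client side, an explicit timeout keeps a slow query from hanging the caller. The sketch below mirrors the cURL examples using only the standard library; the URL pattern and headers are taken from those examples, the 30-second default matches the recommended Timeout setting, and the function names are illustrative.

```python
import json
import urllib.request

BASE_URL = "https://api.guidedmind.ai/v1/rag"


def build_request(project_id, api_key, payload):
    """Build the POST request without sending it."""
    return urllib.request.Request(
        f"{BASE_URL}/{project_id}/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30.0):
    """Send the request; urlopen raises an exception if the server does not answer in time."""
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

Usage would look like `send(build_request("YOUR_PROJECT_ID", api_key, {"query": "What is the return policy?"}))`.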
Next Steps
For Standard RAG Users
Proceed to Step 6: Benchmarking & Iteration to set up quality tracking.
For GraphRAG Users
Proceed to Step 5: AI Graph Editor to test graph search and editing capabilities.
What to Bring to Next Step
- Working API endpoint with verified authentication
- API key stored securely
- Baseline response format for comparison
- Processing time metrics for monitoring
Tips for Success
- Test All Methods: Verify that both the built-in console and cURL requests work
- Store API Key Securely: Use environment variables or secret management
- Monitor Rate Limits: Track usage to avoid hitting limits
- Document Response Format: Share with your team for integration
- Set Up Monitoring: Track latency and error rates
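As a starting point for the monitoring tip, latency and error rate can be tracked in-process before a full metrics stack is in place. This is a minimal sketch with hypothetical names; it uses a nearest-rank 95th percentile, which is one of several common percentile definitions.

```python
class EndpointStats:
    """Collect per-request latency and success/failure counts."""

    def __init__(self):
        self.latencies = []
        self.errors = 0

    def record(self, seconds, ok=True):
        """Store one request's latency and whether it succeeded."""
        self.latencies.append(seconds)
        if not ok:
            self.errors += 1

    def error_rate(self):
        """Fraction of recorded requests that failed (0.0 if none recorded)."""
        return self.errors / len(self.latencies) if self.latencies else 0.0

    def p95(self):
        """Nearest-rank 95th-percentile latency, or None if nothing is recorded."""
        if not self.latencies:
            return None
        ordered = sorted(self.latencies)
        rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
        return ordered[rank - 1]
```

The `processing_time` field of each response is a natural value to pass to `record`, alongside the wall-clock time measured by the client.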
