# IOA Cloud API Integration Guide

Complete integration tutorial for third-party applications connecting to IOA Cloud with OpenAI-compatible APIs.

## Overview

IOA Cloud provides OpenAI-compatible REST APIs with automatic governance, evidence generation, and compliance features. Third-party applications can integrate seamlessly by changing their API endpoint and adding authentication.
### Key Benefits
- Zero breaking changes - OpenAI-compatible API
- Automatic governance - Policy enforcement and evidence generation
- Governance profiles - Domain- and regulation-specific mappings layered above the runtime kernel
- Flexible billing - House key or bring-your-own-key
- Audit trails - verifiable evidence bundles and runtime records
## Integration Options
| Method | Use Case | Complexity |
|---|---|---|
| OpenAI-Compatible API | Drop-in replacement for OpenAI calls | Low |
| SDK Integration | Enhanced governance features | Medium |
| Evidence-Only | Audit compliance without governance | Advanced |
## Prerequisites

### 1. API Key
Get your API key from console.orchintel.com:

- Sign up for an IOA Cloud account
- Navigate to Settings → API Keys
- Generate a new API key (format: `ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX`)
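As a client-side sanity check, the documented key shape can be validated before any request is sent. This is a sketch: the suffix alphabet and length are assumptions inferred from the placeholder above, and `looks_like_ioa_key` is an illustrative helper, not part of any SDK.

```python
import re

# Hypothetical shape check; the suffix alphabet/length are assumptions
# based on the placeholder format shown in this guide.
KEY_PATTERN = re.compile(r"^ioa_usr_sk_live_[A-Za-z0-9]+$")

def looks_like_ioa_key(key: str) -> bool:
    """Return True if the key matches the documented ioa_usr_sk_live_ shape."""
    return bool(KEY_PATTERN.match(key))

print(looks_like_ioa_key("ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX"))  # True
print(looks_like_ioa_key("sk-abc123"))                             # False
```

A check like this catches pasted OpenAI keys early, before they fail with a 401 at the API.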
### 2. Plan Selection
Choose the appropriate plan:
| Plan | Monthly Cost | Governance Modes | Rate Limits | Use Case |
|---|---|---|---|---|
| Launch | Free | Shadow only | 2 RPS, 1K requests | Development/testing |
| Scale | $299 | Shadow + Enforce | 5 RPS, 25K requests | Production apps |
| Trust | Custom | All modes + Enterprise | Custom limits | Enterprise deployments |
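The included volumes in the table above can be encoded as a small helper for estimating which tier covers a given monthly volume. This is a sketch based only on the table; `suggest_plan` is illustrative, not part of any SDK, and ignores rate limits and mode requirements.

```python
# Included monthly request volumes from the plan table above
PLANS = [("Launch", 1_000), ("Scale", 25_000)]

def suggest_plan(monthly_requests: int) -> str:
    """Return the lowest tier whose included volume covers the given usage."""
    for name, included in PLANS:
        if monthly_requests <= included:
            return name
    return "Trust"  # Custom volumes

print(suggest_plan(500))      # Launch
print(suggest_plan(10_000))   # Scale
print(suggest_plan(100_000))  # Trust
```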
### 3. Development Environment

```bash
# Install required packages
pip install httpx openai requests

# Or for async applications
pip install aiohttp httpx
```

## Authentication
IOA Cloud uses Bearer token authentication compatible with OpenAI's format.
### Authentication Headers

```python
headers = {
    "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
    "Content-Type": "application/json"
}
```

### API Key Validation
```python
import asyncio

import httpx

async def validate_api_key(api_key: str) -> bool:
    """Validate an API key against IOA Cloud's models endpoint."""
    try:
        async with httpx.AsyncClient() as client:
            response = await client.get(
                "https://api.orchintel.com/api/models",
                headers={"Authorization": f"Bearer {api_key}"}
            )
        return response.status_code == 200
    except httpx.HTTPError:
        return False

# Usage
is_valid = asyncio.run(validate_api_key("ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX"))
print(f"API key valid: {is_valid}")
```

### Security Best Practices
- Never expose API keys in client-side code
- Rotate keys regularly via console.orchintel.com
- Use environment variables for key storage
- Log key usage for audit purposes
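Following the environment-variable recommendation, the key can be loaded once at startup. A minimal sketch: the variable name `IOA_API_KEY` is an arbitrary convention, and the placeholder fallback is for illustration only and will not authenticate.

```python
import os

# Read the key from the environment instead of hard-coding it.
# IOA_API_KEY is an arbitrary variable name; the fallback placeholder
# is illustrative only and will not authenticate against the API.
IOA_API_KEY = os.environ.get("IOA_API_KEY", "ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX")

headers = {
    "Authorization": f"Bearer {IOA_API_KEY}",
    "Content-Type": "application/json"
}
```

Keeping the key out of source control also makes rotation a deployment-config change rather than a code change.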
## API Endpoints
IOA Cloud provides OpenAI-compatible endpoints with additional governance features.
### Base URL

```
https://api.orchintel.com
```

### Core Endpoints
#### 1. Chat Completions (Primary Integration Point)

Endpoint: `POST /v1/chat/completions`

Purpose: Generate chat completions with governance enforcement.

Request:
```python
import asyncio

import httpx

async def chat_completion_example():
    payload = {
        "model": "gpt-4o",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain quantum computing in simple terms."}
        ],
        "temperature": 0.7,
        "max_tokens": 500,
        "stream": False
    }
    headers = {
        "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
        "Content-Type": "application/json",
        "ioa-mode": "enforce"  # Governance mode
    }
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.orchintel.com/v1/chat/completions",
            headers=headers,
            json=payload
        )
    if response.status_code == 200:
        data = response.json()
        content = data["choices"][0]["message"]["content"]
        usage = data["usage"]
        print(f"Response: {content}")
        print(f"Tokens used: {usage['total_tokens']}")
    else:
        print(f"Error: {response.status_code} - {response.text}")

# Run the example
asyncio.run(chat_completion_example())
```

Response (OpenAI-compatible):
```json
{
  "id": "chatcmpl-1234567890",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Quantum computing uses quantum mechanics principles..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
```

#### 2. Models List
Endpoint: `GET /api/models`

Purpose: Get available models and their capabilities.
```python
import asyncio

import httpx

async def get_available_models():
    headers = {"Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX"}
    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://api.orchintel.com/api/models",
            headers=headers
        )
    if response.status_code == 200:
        data = response.json()
        models = data.get("models", [])
        provider_versions = data.get("provider_versions", {})
        print(f"Available models: {len(models)}")
        for model in models:
            print(f"- {model['provider']}/{model['model']}: {model.get('supported_modes', [])}")
        print(f"Provider versions: {provider_versions}")
    else:
        print(f"Error: {response.status_code}")

asyncio.run(get_available_models())
```

#### 3. Health Check
Endpoint: `GET /v1/healthz`

Purpose: Verify API availability.
```python
import asyncio

import httpx

async def health_check():
    async with httpx.AsyncClient() as client:
        response = await client.get("https://api.orchintel.com/v1/healthz")
    if response.status_code == 200:
        data = response.json()
        print(f"Status: {data['status']} - Service: {data['service']}")
    else:
        print(f"Health check failed: {response.status_code}")

asyncio.run(health_check())
```

## Governance Modes
IOA Cloud provides three governance modes via the `ioa-mode` header.
### 1. Shadow Mode (`ioa-mode: shadow`)

Behavior: Log policy evaluations but allow all requests.

Use Cases:
- Development and testing
- Learning governance behavior
- Gradual rollout
```python
headers = {
    "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
    "ioa-mode": "shadow"
}
# All requests pass through; evidence is generated for analysis
```

### 2. Enforce Mode (`ioa-mode: enforce`)
Behavior: Block requests that violate policies.

Use Cases:
- Production applications
- Compliance-critical scenarios
- Risk mitigation
```python
headers = {
    "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
    "ioa-mode": "enforce"
}
# Requests violating policies return HTTP 403
```

Policy Violation Response:
```json
{
  "error": {
    "message": "Request blocked by policy",
    "type": "policy_violation",
    "code": "policy_violation",
    "violations": [
      {
        "rule": "content_safety.high_risk_content",
        "severity": "high",
        "description": "Content contains prohibited material"
      }
    ]
  }
}
```

### 3. Consensus Mode (`ioa-mode: consensus`)
Behavior: Multi-LLM voting for high-confidence decisions.

Requirements: Scale or Trust plan with consensus add-on.

Use Cases:
- Critical business decisions
- High-stakes content
- Regulatory compliance
```python
headers = {
    "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
    "ioa-mode": "consensus"
}
# Multiple LLMs vote on the response
```

For more details, see the Consensus Mode documentation.
## Integration Examples

### Example 1: Quick Start (Python)
The simplest way to get started with IOA Cloud:
```python
import requests

# Your IOA Cloud API key
IOA_API_KEY = "ioa_usr_sk_live_YOUR_KEY_HERE"

# Make a chat completion request
response = requests.post(
    "https://api.orchintel.com/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {IOA_API_KEY}",
        "Content-Type": "application/json"
    },
    json={
        "model": "gpt-4o",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ],
        "max_tokens": 100
    }
)

# Get the response
data = response.json()
print(data["choices"][0]["message"]["content"])
# Output: The capital of France is Paris.
```

### Example 2: Using Governance Modes
Control governance behavior with the `ioa-mode` header:
```python
import requests

IOA_API_KEY = "ioa_usr_sk_live_YOUR_KEY_HERE"

def chat_with_governance(message, mode="enforce"):
    """Send a message with the specified governance mode."""
    response = requests.post(
        "https://api.orchintel.com/v1/chat/completions",
        headers={
            "Authorization": f"Bearer {IOA_API_KEY}",
            "Content-Type": "application/json",
            "ioa-mode": mode  # "shadow", "enforce", or "consensus"
        },
        json={
            "model": "gpt-4o",
            "messages": [{"role": "user", "content": message}],
            "max_tokens": 200
        }
    )
    if response.status_code == 200:
        return response.json()["choices"][0]["message"]["content"]
    elif response.status_code == 403:
        # Request blocked by policy
        return f"Blocked: {response.json()}"
    else:
        return f"Error: {response.status_code}"

# Shadow mode: logs policies but doesn't block
result = chat_with_governance("Hello!", mode="shadow")
print(f"Shadow: {result}")

# Enforce mode: actively blocks policy violations
result = chat_with_governance("What is 2+2?", mode="enforce")
print(f"Enforce: {result}")
```

### Example 3: cURL Command
Test the API directly from your terminal:
```bash
curl -X POST https://api.orchintel.com/v1/chat/completions \
  -H "Authorization: Bearer ioa_usr_sk_live_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -H "ioa-mode: enforce" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 50
  }'
```

### Example 4: Async Python (httpx)
For async applications using httpx:
```python
import asyncio

import httpx

async def chat_async(message):
    """Async chat completion with IOA Cloud."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.orchintel.com/v1/chat/completions",
            headers={
                "Authorization": "Bearer ioa_usr_sk_live_YOUR_KEY_HERE",
                "Content-Type": "application/json",
                "ioa-mode": "enforce"
            },
            json={
                "model": "gpt-4o",
                "messages": [{"role": "user", "content": message}],
                "max_tokens": 200
            },
            timeout=30.0
        )
    if response.status_code == 200:
        return response.json()["choices"][0]["message"]["content"]
    else:
        raise Exception(f"API Error: {response.status_code}")

# Run the async function
result = asyncio.run(chat_async("Explain quantum computing briefly."))
print(result)
```

### Example 5: Chatbot with Conversation History
```python
import requests

class IOAChatbot:
    def __init__(self, api_key):
        self.api_key = api_key
        self.history = []

    def chat(self, message):
        """Send a message and maintain conversation history."""
        self.history.append({"role": "user", "content": message})
        response = requests.post(
            "https://api.orchintel.com/v1/chat/completions",
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
                "ioa-mode": "enforce"
            },
            json={
                "model": "gpt-4o",
                "messages": self.history,
                "max_tokens": 300
            }
        )
        if response.status_code == 200:
            reply = response.json()["choices"][0]["message"]["content"]
            self.history.append({"role": "assistant", "content": reply})
            return reply
        else:
            return f"Error: {response.status_code}"

# Usage
bot = IOAChatbot("ioa_usr_sk_live_YOUR_KEY_HERE")
print(bot.chat("What is Python?"))
print(bot.chat("What are its main uses?"))  # Remembers context
```

## Error Handling
### HTTP Status Codes
| Status Code | Meaning | Action |
|---|---|---|
| 200 | Success | Process response normally |
| 401 | Invalid API key | Check API key format and validity |
| 403 | Policy violation | Request blocked by governance |
| 429 | Rate limit exceeded | Implement exponential backoff |
| 500 | Server error | Retry with backoff |
| 502 | Upstream provider error | Retry or fallback |
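The 429 and 5xx rows above call for retry with exponential backoff. A minimal transport-agnostic sketch follows; the `send` callable, `call_with_backoff` helper, and status set are illustrative, not part of any IOA SDK.

```python
import random
import time

# Statuses the table above marks as retryable
RETRYABLE = {429, 500, 502}

def call_with_backoff(send, max_retries=5, base_delay=1.0):
    """Call send() -> (status, body) until a non-retryable status is
    returned, sleeping exponentially longer (with jitter) between attempts."""
    for attempt in range(max_retries):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        delay = min(base_delay * (2 ** attempt) + random.uniform(0, base_delay), 30.0)
        time.sleep(delay)
    return status, body

# Simulated transport: two 429s, then success
responses = iter([(429, ""), (429, ""), (200, "ok")])
status, body = call_with_backoff(lambda: next(responses), base_delay=0.01)
print(status, body)  # 200 ok
```

In a real integration, `send` would wrap the `requests.post` call and honor the `Retry-After` header when present.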
### Policy Violation Handling
```python
class PolicyViolationError(Exception):
    def __init__(self, violations):
        self.violations = violations
        super().__init__(f"Request blocked by {len(violations)} policy violations")

class RateLimitError(Exception):
    pass

class ServerError(Exception):
    pass

class APIError(Exception):
    pass

async def handle_ioa_response(response):
    """Handle an IOA Cloud API response with proper error handling."""
    if response.status_code == 200:
        return response.json()
    elif response.status_code == 403:
        error_data = response.json()
        violations = error_data.get("error", {}).get("violations", [])
        raise PolicyViolationError(violations)
    elif response.status_code == 429:
        # Rate limited - implement backoff
        retry_after = response.headers.get("Retry-After", "60")
        raise RateLimitError(f"Rate limited. Retry after {retry_after} seconds")
    elif response.status_code >= 500:
        raise ServerError(f"Server error: {response.status_code}")
    else:
        raise APIError(f"API error: {response.status_code} - {response.text}")

# Usage in your application (inside an async function)
try:
    result = await handle_ioa_response(response)
    print("Success:", result["choices"][0]["message"]["content"])
except PolicyViolationError as e:
    print(f"Policy violation: {len(e.violations)} rules triggered")
    # Handle blocked content (show a user-friendly message)
except RateLimitError as e:
    print(f"Rate limited: {e}")
    # Implement backoff/retry logic
except ServerError as e:
    print(f"Server error: {e}")
    # Retry or show an error to the user
```

## Billing & Usage
### Cost Models

IOA Cloud offers flexible billing options:

#### 1. House Key Billing (IOA Pays for LLM)
- Launch: 1,000 requests/month included (Free)
- Scale: 25,000 requests/month included ($299/month)
- Trust: Custom request volumes (starting at $25,000/year)
#### 2. Bring Your Own Key (BYOK) Billing
- Use your own OpenAI/Anthropic API keys
- IOA charges for governance only; LLM provider bills separately
- Full control over LLM costs and model selection
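For comparison, the house-key tiers above imply an effective per-request governance cost. On the Scale plan, $299 spread over the 25,000 included requests works out to about 1.2 cents per request:

```python
# Effective cost per included request on the Scale plan (house key billing)
monthly_cost = 299.00
included_requests = 25_000
per_request = monthly_cost / included_requests
print(f"${per_request:.4f} per request")  # $0.0120 per request
```

Overage and consensus-mode pricing are not covered by this back-of-envelope figure.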
### Rate Limits
| Plan | Requests/Second | Requests/Month |
|---|---|---|
| Launch | 2 | 1,000 |
| Scale | 5 | 25,000 |
| Trust | Custom | Custom |
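To stay under a plan's RPS cap client-side, outgoing requests can be gated by a simple token-bucket limiter. A sketch under the limits above; `RateLimiter` is an illustrative helper, not an SDK class.

```python
import time

class RateLimiter:
    """Token-bucket limiter: call acquire() before each request to stay
    under a plan's requests-per-second cap (e.g. 2 RPS on Launch)."""

    def __init__(self, rps: float):
        self.rps = rps
        self.allowance = rps       # Start with a full bucket
        self.last = time.monotonic()

    def acquire(self):
        now = time.monotonic()
        # Refill tokens accrued since the last call, capped at the bucket size
        self.allowance = min(self.rps, self.allowance + (now - self.last) * self.rps)
        self.last = now
        if self.allowance < 1.0:
            # Sleep just long enough for one token to accrue, then consume it
            time.sleep((1.0 - self.allowance) / self.rps)
            self.last = time.monotonic()  # Don't re-credit the sleep time
            self.allowance = 0.0
        else:
            self.allowance -= 1.0

# Gate requests at the Launch plan's 2 RPS
limiter = RateLimiter(rps=2)
limiter.acquire()  # would precede each API call
```

Server-side enforcement still applies; this only smooths bursts so fewer requests hit the 429 path.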
## Migration Guide

### From OpenAI Direct

Minimal changes required:
```python
# BEFORE (OpenAI direct)
import openai

client = openai.OpenAI(api_key="sk-...")
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}]
)

# AFTER (IOA Cloud, inside an async function)
import httpx

headers = {
    "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
    "Content-Type": "application/json"
}
payload = {
    "model": "gpt-4o",  # Note: gpt-4o instead of gpt-4
    "messages": [{"role": "user", "content": "Hello"}]
}
async with httpx.AsyncClient() as client:
    response = await client.post(
        "https://api.orchintel.com/v1/chat/completions",
        headers=headers,
        json=payload
    )
data = response.json()
content = data["choices"][0]["message"]["content"]
```

## Support & Resources
### Documentation

### Support Channels
- Email: support@orchintel.com
- Chat: Available in console dashboard
- GitHub Issues: github.com/OrchIntel/ioa-core/issues
## Conclusion
IOA Cloud integration enables third-party applications to add hosted AI governance workflows with minimal code changes. The OpenAI-compatible API eases migration, while evidence generation provides auditable runtime records.
### Key Takeaways
- Simple integration - Change endpoint, add Bearer token
- Flexible governance - Shadow/Enforce/Consensus modes
- Automatic compliance - Evidence bundles for audits
- Enterprise features - Multi-LLM consensus, custom policies
- Cost optimization - House key or bring-your-own-key billing
### Next Steps
- Get API key from console.orchintel.com
- Test with shadow mode in development
- Gradually roll out with enforce mode in production
- Monitor usage and evidence in console dashboard