# IOA Cloud API Integration Guide

Complete integration tutorial for third-party applications connecting to IOA Cloud with OpenAI-compatible APIs.

## Overview

IOA Cloud provides OpenAI-compatible REST APIs with automatic governance, evidence generation, and compliance features. Third-party applications can integrate seamlessly by changing their API endpoint and adding authentication.
### Key Benefits
- Zero breaking changes - OpenAI-compatible API
- Automatic governance - Policy enforcement and evidence generation
- Governance profiles - Domain- and regulation-specific mappings layered above the runtime kernel
- Flexible billing - House key or bring-your-own-key
- Audit trails - verifiable evidence bundles and runtime records
## Integration Options
| Method | Use Case | Complexity |
|---|---|---|
| OpenAI-Compatible API | Drop-in replacement for OpenAI calls | Low |
| SDK Integration | Enhanced governance features | Medium |
| Evidence-Only | Audit compliance without governance | Advanced |
## Prerequisites

### 1. API Key
Get your API key from console.orchintel.com:

- Sign up for an IOA Cloud account
- Navigate to Settings → API Keys
- Generate a new API key (format: `ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX`)
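As a client-side sanity check, the documented key shape can be validated before any request is sent. This is a sketch: the suffix alphabet and length are assumptions inferred from the placeholder above, and `looks_like_ioa_key` is an illustrative helper, not part of any SDK.

```python
import re

# Hypothetical shape check; the suffix alphabet/length are assumptions
# based on the placeholder format shown in this guide.
KEY_PATTERN = re.compile(r"^ioa_usr_sk_live_[A-Za-z0-9]+$")

def looks_like_ioa_key(key: str) -> bool:
    """Return True if the key matches the documented ioa_usr_sk_live_ shape."""
    return bool(KEY_PATTERN.match(key))

print(looks_like_ioa_key("ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX"))  # True
print(looks_like_ioa_key("sk-abc123"))                             # False
```

A check like this catches pasted OpenAI keys early, before they fail with a 401 at the API.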
### 2. Plan Selection
Choose the appropriate plan:
| Plan | Monthly Cost | Governance Modes | Rate Limits | Use Case |
|---|---|---|---|---|
| Launch | Free | Shadow only | 2 RPS, 1K requests | Development/testing |
| Scale | $299 | Shadow + Enforce | 5 RPS, 25K requests | Production apps |
| Trust | Custom | All modes + Enterprise | Custom limits | Enterprise deployments |
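The included volumes in the table above can be encoded as a small helper for estimating which tier covers a given monthly volume. This is a sketch based only on the table; `suggest_plan` is illustrative, not part of any SDK, and ignores rate limits and mode requirements.

```python
# Included monthly request volumes from the plan table above
PLANS = [("Launch", 1_000), ("Scale", 25_000)]

def suggest_plan(monthly_requests: int) -> str:
    """Return the lowest tier whose included volume covers the given usage."""
    for name, included in PLANS:
        if monthly_requests <= included:
            return name
    return "Trust"  # Custom volumes

print(suggest_plan(500))      # Launch
print(suggest_plan(10_000))   # Scale
print(suggest_plan(100_000))  # Trust
```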
### 3. Development Environment

```bash
# Install required packages
pip install httpx openai requests

# Or for async applications
pip install aiohttp httpx
```

## Authentication
IOA Cloud uses Bearer token authentication compatible with OpenAI's format.
### Authentication Headers

```python
headers = {
    "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
    "Content-Type": "application/json"
}
```

### API Key Validation
```python
import asyncio

import httpx

async def validate_api_key(api_key: str) -> bool:
    """Validate an API key against IOA Cloud's models endpoint."""
    try:
        async with httpx.AsyncClient() as client:
            response = await client.get(
                "https://api.orchintel.com/api/models",
                headers={"Authorization": f"Bearer {api_key}"}
            )
        return response.status_code == 200
    except httpx.HTTPError:
        return False

# Usage
is_valid = asyncio.run(validate_api_key("ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX"))
print(f"API key valid: {is_valid}")
```

### Security Best Practices
- Never expose API keys in client-side code
- Rotate keys regularly via console.orchintel.com
- Use environment variables for key storage
- Log key usage for audit purposes
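Following the environment-variable recommendation, the key can be loaded once at startup. A minimal sketch: the variable name `IOA_API_KEY` is an arbitrary convention, and the placeholder fallback is for illustration only and will not authenticate.

```python
import os

# Read the key from the environment instead of hard-coding it.
# IOA_API_KEY is an arbitrary variable name; the fallback placeholder
# is illustrative only and will not authenticate against the API.
IOA_API_KEY = os.environ.get("IOA_API_KEY", "ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX")

headers = {
    "Authorization": f"Bearer {IOA_API_KEY}",
    "Content-Type": "application/json"
}
```

Keeping the key out of source control also makes rotation a deployment-config change rather than a code change.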
## API Endpoints
IOA Cloud provides OpenAI-compatible endpoints with additional governance features.
### Base URL

```
https://api.orchintel.com
```

### Core Endpoints
#### 1. Chat Completions (Primary Integration Point)

Endpoint: `POST /v1/chat/completions`

Purpose: Generate chat completions with governance enforcement.

Request:
```python
import asyncio

import httpx

async def chat_completion_example():
    payload = {
        "model": "gpt-4o",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain quantum computing in simple terms."}
        ],
        "temperature": 0.7,
        "max_tokens": 500,
        "stream": False
    }
    headers = {
        "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
        "Content-Type": "application/json",
        "ioa-mode": "enforce"  # Governance mode
    }
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.orchintel.com/v1/chat/completions",
            headers=headers,
            json=payload
        )
    if response.status_code == 200:
        data = response.json()
        content = data["choices"][0]["message"]["content"]
        usage = data["usage"]
        print(f"Response: {content}")
        print(f"Tokens used: {usage['total_tokens']}")
    else:
        print(f"Error: {response.status_code} - {response.text}")

# Run the example
asyncio.run(chat_completion_example())
```

Response (OpenAI-compatible):
```json
{
  "id": "chatcmpl-1234567890",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Quantum computing uses quantum mechanics principles..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
```

#### 2. Models List
Endpoint: `GET /api/models`

Purpose: Get available models and their capabilities.
```python
import asyncio

import httpx

async def get_available_models():
    headers = {"Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX"}
    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://api.orchintel.com/api/models",
            headers=headers
        )
    if response.status_code == 200:
        data = response.json()
        models = data.get("models", [])
        provider_versions = data.get("provider_versions", {})
        print(f"Available models: {len(models)}")
        for model in models:
            print(f"- {model['provider']}/{model['model']}: {model.get('supported_modes', [])}")
        print(f"Provider versions: {provider_versions}")
    else:
        print(f"Error: {response.status_code}")

asyncio.run(get_available_models())
```

#### 3. Health Check
Endpoint: `GET /v1/healthz`

Purpose: Verify API availability.
```python
import asyncio

import httpx

async def health_check():
    async with httpx.AsyncClient() as client:
        response = await client.get("https://api.orchintel.com/v1/healthz")
    if response.status_code == 200:
        data = response.json()
        print(f"Status: {data['status']} - Service: {data['service']}")
    else:
        print(f"Health check failed: {response.status_code}")

asyncio.run(health_check())
```

## Governance Modes
IOA Cloud provides three governance modes via the `ioa-mode` header.
### 1. Shadow Mode (`ioa-mode: shadow`)

Behavior: Log policy evaluations but allow all requests.

Use Cases:
- Development and testing
- Learning governance behavior
- Gradual rollout
```python
headers = {
    "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
    "ioa-mode": "shadow"
}
# All requests pass through; evidence is generated for analysis
```

### 2. Enforce Mode (`ioa-mode: enforce`)
Behavior: Block requests that violate policies.

Use Cases:
- Production applications
- Compliance-critical scenarios
- Risk mitigation
```python
headers = {
    "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
    "ioa-mode": "enforce"
}
# Requests violating policies return HTTP 403
```

Policy Violation Response:
```json
{
  "error": {
    "message": "Request blocked by policy",
    "type": "policy_violation",
    "code": "policy_violation",
    "violations": [
      {
        "rule": "content_safety.high_risk_content",
        "severity": "high",
        "description": "Content contains prohibited material"
      }
    ]
  }
}
```

### 3. Consensus Mode (`ioa-mode: consensus`)
Behavior: Multi-LLM voting for high-confidence decisions.

Requirements: Scale or Trust plan with consensus add-on.

Use Cases:
- Critical business decisions
- High-stakes content
- Regulatory compliance
```python
headers = {
    "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
    "ioa-mode": "consensus"
}
# Multiple LLMs vote on the response
```

For more details, see the Consensus Mode documentation.
## Integration Examples

### Example 1: Quick Start (Python)
The simplest way to get started with IOA Cloud:
```python
import requests

# Your IOA Cloud API key
IOA_API_KEY = "ioa_usr_sk_live_YOUR_KEY_HERE"

# Make a chat completion request
response = requests.post(
    "https://api.orchintel.com/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {IOA_API_KEY}",
        "Content-Type": "application/json"
    },
    json={
        "model": "gpt-4o",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ],
        "max_tokens": 100
    }
)

# Get the response
data = response.json()
print(data["choices"][0]["message"]["content"])
# Output: The capital of France is Paris.
```

### Example 2: Using Governance Modes
Control governance behavior with the `ioa-mode` header:
```python
import requests

IOA_API_KEY = "ioa_usr_sk_live_YOUR_KEY_HERE"

def chat_with_governance(message, mode="enforce"):
    """Send a message with the specified governance mode."""
    response = requests.post(
        "https://api.orchintel.com/v1/chat/completions",
        headers={
            "Authorization": f"Bearer {IOA_API_KEY}",
            "Content-Type": "application/json",
            "ioa-mode": mode  # "shadow", "enforce", or "consensus"
        },
        json={
            "model": "gpt-4o",
            "messages": [{"role": "user", "content": message}],
            "max_tokens": 200
        }
    )
    if response.status_code == 200:
        return response.json()["choices"][0]["message"]["content"]
    elif response.status_code == 403:
        # Request blocked by policy
        return f"Blocked: {response.json()}"
    else:
        return f"Error: {response.status_code}"

# Shadow mode: logs policies but doesn't block
result = chat_with_governance("Hello!", mode="shadow")
print(f"Shadow: {result}")

# Enforce mode: actively blocks policy violations
result = chat_with_governance("What is 2+2?", mode="enforce")
print(f"Enforce: {result}")
```

### Example 3: cURL Command
Test the API directly from your terminal:
```bash
curl -X POST https://api.orchintel.com/v1/chat/completions \
  -H "Authorization: Bearer ioa_usr_sk_live_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -H "ioa-mode: enforce" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 50
  }'
```

### Example 4: Async Python (httpx)
For async applications using httpx:
```python
import asyncio

import httpx

async def chat_async(message):
    """Async chat completion with IOA Cloud."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.orchintel.com/v1/chat/completions",
            headers={
                "Authorization": "Bearer ioa_usr_sk_live_YOUR_KEY_HERE",
                "Content-Type": "application/json",
                "ioa-mode": "enforce"
            },
            json={
                "model": "gpt-4o",
                "messages": [{"role": "user", "content": message}],
                "max_tokens": 200
            },
            timeout=30.0
        )
    if response.status_code == 200:
        return response.json()["choices"][0]["message"]["content"]
    else:
        raise Exception(f"API Error: {response.status_code}")

# Run the async function
result = asyncio.run(chat_async("Explain quantum computing briefly."))
print(result)
```

### Example 5: Chatbot with Conversation History
```python
import requests

class IOAChatbot:
    def __init__(self, api_key):
        self.api_key = api_key
        self.history = []

    def chat(self, message):
        """Send a message and maintain conversation history."""
        self.history.append({"role": "user", "content": message})
        response = requests.post(
            "https://api.orchintel.com/v1/chat/completions",
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
                "ioa-mode": "enforce"
            },
            json={
                "model": "gpt-4o",
                "messages": self.history,
                "max_tokens": 300
            }
        )
        if response.status_code == 200:
            reply = response.json()["choices"][0]["message"]["content"]
            self.history.append({"role": "assistant", "content": reply})
            return reply
        else:
            return f"Error: {response.status_code}"

# Usage
bot = IOAChatbot("ioa_usr_sk_live_YOUR_KEY_HERE")
print(bot.chat("What is Python?"))
print(bot.chat("What are its main uses?"))  # Remembers context
```

## Error Handling
### HTTP Status Codes
| Status Code | Meaning | Action |
|---|---|---|
| 200 | Success | Process response normally |
| 401 | Invalid API key | Check API key format and validity |
| 403 | Policy violation | Request blocked by governance |
| 429 | Rate limit exceeded | Implement exponential backoff |
| 500 | Server error | Retry with backoff |
| 502 | Upstream provider error | Retry or fallback |
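The 429 and 5xx rows above call for retry with exponential backoff. A minimal transport-agnostic sketch follows; the `send` callable, `call_with_backoff` helper, and status set are illustrative, not part of any IOA SDK.

```python
import random
import time

# Statuses the table above marks as retryable
RETRYABLE = {429, 500, 502}

def call_with_backoff(send, max_retries=5, base_delay=1.0):
    """Call send() -> (status, body) until a non-retryable status is
    returned, sleeping exponentially longer (with jitter) between attempts."""
    for attempt in range(max_retries):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        delay = min(base_delay * (2 ** attempt) + random.uniform(0, base_delay), 30.0)
        time.sleep(delay)
    return status, body

# Simulated transport: two 429s, then success
responses = iter([(429, ""), (429, ""), (200, "ok")])
status, body = call_with_backoff(lambda: next(responses), base_delay=0.01)
print(status, body)  # 200 ok
```

In a real integration, `send` would wrap the `requests.post` call and honor the `Retry-After` header when present.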
### Policy Violation Handling
```python
class PolicyViolationError(Exception):
    def __init__(self, violations):
        self.violations = violations
        super().__init__(f"Request blocked by {len(violations)} policy violations")

class RateLimitError(Exception):
    pass

class ServerError(Exception):
    pass

class APIError(Exception):
    pass

async def handle_ioa_response(response):
    """Handle an IOA Cloud API response with proper error handling."""
    if response.status_code == 200:
        return response.json()
    elif response.status_code == 403:
        error_data = response.json()
        violations = error_data.get("error", {}).get("violations", [])
        raise PolicyViolationError(violations)
    elif response.status_code == 429:
        # Rate limited - implement backoff
        retry_after = response.headers.get("Retry-After", "60")
        raise RateLimitError(f"Rate limited. Retry after {retry_after} seconds")
    elif response.status_code >= 500:
        raise ServerError(f"Server error: {response.status_code}")
    else:
        raise APIError(f"API error: {response.status_code} - {response.text}")

# Usage in your application (inside an async function)
try:
    result = await handle_ioa_response(response)
    print("Success:", result["choices"][0]["message"]["content"])
except PolicyViolationError as e:
    print(f"Policy violation: {len(e.violations)} rules triggered")
    # Handle blocked content (show a user-friendly message)
except RateLimitError as e:
    print(f"Rate limited: {e}")
    # Implement backoff/retry logic
except ServerError as e:
    print(f"Server error: {e}")
    # Retry or show an error to the user
```

## Billing & Usage
### Cost Models

IOA Cloud offers flexible billing options:

#### 1. House Key Billing (IOA Pays for LLM)
- Launch: 1,000 requests/month included (Free)
- Scale: 25,000 requests/month included ($299/month)
- Trust: Custom request volumes (starting at $25,000/year)
#### 2. Bring Your Own Key (BYOK) Billing
- Use your own OpenAI/Anthropic API keys
- IOA charges for governance only; LLM provider bills separately
- Full control over LLM costs and model selection
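For comparison, the house-key tiers above imply an effective per-request governance cost. On the Scale plan, $299 spread over the 25,000 included requests works out to about 1.2 cents per request:

```python
# Effective cost per included request on the Scale plan (house key billing)
monthly_cost = 299.00
included_requests = 25_000
per_request = monthly_cost / included_requests
print(f"${per_request:.4f} per request")  # $0.0120 per request
```

Overage and consensus-mode pricing are not covered by this back-of-envelope figure.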
### Rate Limits
| Plan | Requests/Second | Requests/Month |
|---|---|---|
| Launch | 2 | 1,000 |
| Scale | 5 | 25,000 |
| Trust | Custom | Custom |
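To stay under a plan's RPS cap client-side, outgoing requests can be gated by a simple token-bucket limiter. A sketch under the limits above; `RateLimiter` is an illustrative helper, not an SDK class.

```python
import time

class RateLimiter:
    """Token-bucket limiter: call acquire() before each request to stay
    under a plan's requests-per-second cap (e.g. 2 RPS on Launch)."""

    def __init__(self, rps: float):
        self.rps = rps
        self.allowance = rps       # Start with a full bucket
        self.last = time.monotonic()

    def acquire(self):
        now = time.monotonic()
        # Refill tokens accrued since the last call, capped at the bucket size
        self.allowance = min(self.rps, self.allowance + (now - self.last) * self.rps)
        self.last = now
        if self.allowance < 1.0:
            # Sleep just long enough for one token to accrue, then consume it
            time.sleep((1.0 - self.allowance) / self.rps)
            self.last = time.monotonic()  # Don't re-credit the sleep time
            self.allowance = 0.0
        else:
            self.allowance -= 1.0

# Gate requests at the Launch plan's 2 RPS
limiter = RateLimiter(rps=2)
limiter.acquire()  # would precede each API call
```

Server-side enforcement still applies; this only smooths bursts so fewer requests hit the 429 path.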
## Migration Guide

### From OpenAI Direct

Minimal changes required:
```python
# BEFORE (OpenAI direct)
import openai

client = openai.OpenAI(api_key="sk-...")
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}]
)

# AFTER (IOA Cloud, inside an async function)
import httpx

headers = {
    "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
    "Content-Type": "application/json"
}
payload = {
    "model": "gpt-4o",  # Note: gpt-4o instead of gpt-4
    "messages": [{"role": "user", "content": "Hello"}]
}
async with httpx.AsyncClient() as client:
    response = await client.post(
        "https://api.orchintel.com/v1/chat/completions",
        headers=headers,
        json=payload
    )
data = response.json()
content = data["choices"][0]["message"]["content"]
```

## Support & Resources
### Documentation

### Support Channels
- Email: support@orchintel.com
- Chat: Available in console dashboard
- GitHub Issues: github.com/OrchIntel/ioa-core/issues
## Conclusion
IOA Cloud integration enables third-party applications to add hosted AI governance workflows with minimal code changes. The OpenAI-compatible API eases migration, while evidence generation provides auditable runtime records.
### Key Takeaways
- Simple integration - Change endpoint, add Bearer token
- Flexible governance - Shadow/Enforce/Consensus modes
- Automatic compliance - Evidence bundles for audits
- Enterprise features - Multi-LLM consensus, custom policies
- Cost optimization - House key or bring-your-own-key billing
### Next Steps
- Get API key from console.orchintel.com
- Test with shadow mode in development
- Gradually roll out with enforce mode in production
- Monitor usage and evidence in console dashboard