IOA Cloud API Integration Guide

A complete integration tutorial for third-party applications connecting to IOA Cloud through its OpenAI-compatible APIs.

Overview

IOA Cloud provides OpenAI-compatible REST APIs with automatic governance, evidence generation, and compliance features. Third-party applications can integrate seamlessly by changing their API endpoint and adding authentication.

Key Benefits

  • Zero breaking changes - OpenAI-compatible API
  • Automatic governance - Policy enforcement and evidence generation
  • Governance profiles - domain and regulation-specific mappings layered above the runtime kernel
  • Flexible billing - House key or bring-your-own-key
  • Audit trails - verifiable evidence bundles and runtime records

Integration Options

| Method                | Use Case                             | Complexity |
| --------------------- | ------------------------------------ | ---------- |
| OpenAI-Compatible API | Drop-in replacement for OpenAI calls | Low        |
| SDK Integration       | Enhanced governance features         | Medium     |
| Evidence-Only         | Audit compliance without governance  | Advanced   |

Prerequisites

1. API Key

Get your API key from console.orchintel.com:

  1. Sign up for an IOA Cloud account
  2. Navigate to Settings → API Keys
  3. Generate a new API key (format: ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX)
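Before making any network calls, you can sanity-check a key's shape locally. The 20-character alphanumeric suffix in the pattern below is an assumption inferred from the format shown above, not an official specification:

```python
import re

# Shape check for an IOA Cloud key (assumed format: a fixed prefix
# followed by 20 alphanumeric characters; adjust if your keys differ).
KEY_PATTERN = re.compile(r"^ioa_usr_sk_live_[A-Za-z0-9]{20}$")

def looks_like_ioa_key(key: str) -> bool:
    """Return True if the key matches the documented format."""
    return bool(KEY_PATTERN.match(key))

print(looks_like_ioa_key("ioa_usr_sk_live_" + "A" * 20))  # True
print(looks_like_ioa_key("sk-not-an-ioa-key"))            # False
```

This catches copy-paste mistakes early; full validation against the API is shown in the Authentication section below.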

2. Plan Selection

Choose the appropriate plan:

| Plan   | Monthly Cost | Governance Modes        | Rate Limits          | Use Case               |
| ------ | ------------ | ----------------------- | -------------------- | ---------------------- |
| Launch | Free         | Shadow only             | 2 RPS, 1K requests   | Development/testing    |
| Scale  | $299         | Shadow + Enforce        | 5 RPS, 25K requests  | Production apps        |
| Trust  | Custom       | All modes + Enterprise  | Custom limits        | Enterprise deployments |

3. Development Environment

# Install required packages
pip install httpx openai requests

# Or for async applications
pip install aiohttp httpx

Authentication

IOA Cloud uses Bearer token authentication compatible with OpenAI's format.

Authentication Headers

headers = {
    "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
    "Content-Type": "application/json"
}

API Key Validation

import asyncio
import httpx

async def validate_api_key(api_key: str) -> bool:
    """Validate API key with IOA Cloud."""
    try:
        async with httpx.AsyncClient() as client:
            response = await client.get(
                "https://api.orchintel.com/api/models",
                headers={"Authorization": f"Bearer {api_key}"}
            )
            return response.status_code == 200
    except httpx.HTTPError:
        # Network failures or timeouts are treated as an invalid key here.
        return False

# Usage (await only works inside a coroutine, so run it via asyncio.run)
is_valid = asyncio.run(validate_api_key("ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX"))
print(f"API key valid: {is_valid}")

Security Best Practices

  • Never expose API keys in client-side code
  • Rotate keys regularly via console.orchintel.com
  • Use environment variables for key storage
  • Log key usage for audit purposes
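The environment-variable practice above can be sketched as follows. `IOA_API_KEY` is a variable name chosen for this example, not one mandated by IOA Cloud:

```python
import os

def get_api_key(var: str = "IOA_API_KEY") -> str:
    """Read the API key from the environment; fail fast if it is missing."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; export it before running")
    return key
```

Set the variable once in your shell (`export IOA_API_KEY=ioa_usr_sk_live_...`) and build headers with `f"Bearer {get_api_key()}"` instead of hardcoding the key in source files.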

API Endpoints

IOA Cloud provides OpenAI-compatible endpoints with additional governance features.

Base URL

https://api.orchintel.com

Core Endpoints

1. Chat Completions (Primary Integration Point)

Endpoint: POST /v1/chat/completions

Purpose: Generate chat completions with governance enforcement.

Request

import httpx
import json

async def chat_completion_example():
    payload = {
        "model": "gpt-4o",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain quantum computing in simple terms."}
        ],
        "temperature": 0.7,
        "max_tokens": 500,
        "stream": False
    }

    headers = {
        "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
        "Content-Type": "application/json",
        "ioa-mode": "enforce"  # Governance mode
    }

    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.orchintel.com/v1/chat/completions",
            headers=headers,
            json=payload
        )

        if response.status_code == 200:
            data = response.json()
            content = data["choices"][0]["message"]["content"]
            usage = data["usage"]
            print(f"Response: {content}")
            print(f"Tokens used: {usage['total_tokens']}")
        else:
            print(f"Error: {response.status_code} - {response.text}")

# Run the example
import asyncio
asyncio.run(chat_completion_example())

Response (OpenAI-compatible)

{
  "id": "chatcmpl-1234567890",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Quantum computing uses quantum mechanics principles..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
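Given the response shape above, a small helper can pull out the reply text and token usage. `extract_reply` is an illustrative name, not part of any IOA SDK:

```python
def extract_reply(data: dict) -> tuple[str, int]:
    """Return (assistant message, total tokens) from a chat completion response."""
    content = data["choices"][0]["message"]["content"]
    total_tokens = data.get("usage", {}).get("total_tokens", 0)
    return content, total_tokens

# Using the sample response shown above:
sample = {
    "choices": [{
        "index": 0,
        "message": {"role": "assistant",
                    "content": "Quantum computing uses quantum mechanics principles..."},
        "finish_reason": "stop"
    }],
    "usage": {"prompt_tokens": 25, "completion_tokens": 150, "total_tokens": 175},
}
content, tokens = extract_reply(sample)
print(tokens)  # 175
```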

2. Models List

Endpoint: GET /api/models

Purpose: Get available models and their capabilities.

async def get_available_models():
    headers = {"Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX"}

    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://api.orchintel.com/api/models",
            headers=headers
        )

        if response.status_code == 200:
            data = response.json()
            models = data.get("models", [])
            provider_versions = data.get("provider_versions", {})

            print(f"Available models: {len(models)}")
            for model in models:
                print(f"- {model['provider']}/{model['model']}: {model.get('supported_modes', [])}")

            print(f"Provider versions: {provider_versions}")
        else:
            print(f"Error: {response.status_code}")

asyncio.run(get_available_models())

3. Health Check

Endpoint: GET /v1/healthz

Purpose: Verify API availability.

async def health_check():
    async with httpx.AsyncClient() as client:
        response = await client.get("https://api.orchintel.com/v1/healthz")
        if response.status_code == 200:
            data = response.json()
            print(f"Status: {data['status']} - Service: {data['service']}")
        else:
            print(f"Health check failed: {response.status_code}")

asyncio.run(health_check())

Governance Modes

IOA Cloud provides three governance modes via the ioa-mode header.

1. Shadow Mode (ioa-mode: shadow)

Behavior: Log policy evaluations but allow all requests.

Use Cases:

  • Development and testing
  • Learning governance behavior
  • Gradual rollout

headers = {
    "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
    "ioa-mode": "shadow"
}
# All requests pass through, evidence generated for analysis

2. Enforce Mode (ioa-mode: enforce)

Behavior: Block requests that violate policies.

Use Cases:

  • Production applications
  • Compliance-critical scenarios
  • Risk mitigation

headers = {
    "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
    "ioa-mode": "enforce"
}
# Requests violating policies return HTTP 403

Policy Violation Response

{
  "error": {
    "message": "Request blocked by policy",
    "type": "policy_violation",
    "code": "policy_violation",
    "violations": [
      {
        "rule": "content_safety.high_risk_content",
        "severity": "high",
        "description": "Content contains prohibited material"
      }
    ]
  }
}
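A 403 body with this shape can be summarized by a small parser before logging or showing a user-facing message. `parse_violations` is an illustrative helper, not part of any IOA SDK:

```python
def parse_violations(error_body: dict) -> list[str]:
    """Summarize policy violations from a 403 response body (schema as shown above)."""
    violations = error_body.get("error", {}).get("violations", [])
    return [f"{v.get('severity', 'unknown')}: {v.get('rule', '?')}" for v in violations]

# Using the sample policy violation response shown above:
sample_403 = {
    "error": {
        "message": "Request blocked by policy",
        "type": "policy_violation",
        "code": "policy_violation",
        "violations": [
            {"rule": "content_safety.high_risk_content",
             "severity": "high",
             "description": "Content contains prohibited material"}
        ],
    }
}
print(parse_violations(sample_403))  # ['high: content_safety.high_risk_content']
```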

3. Consensus Mode (ioa-mode: consensus)

Behavior: Multi-LLM voting for high-confidence decisions.

Requirements: Scale or Trust plan with consensus add-on.

Use Cases:

  • Critical business decisions
  • High-stakes content
  • Regulatory compliance

headers = {
    "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
    "ioa-mode": "consensus"
}
# Multiple LLMs vote on the response

For more details, see Consensus Mode documentation.

See It In Action: View real-world examples of IOA blocking discriminatory content and enforcing the 7 System Laws in our Enforcement Demo.

Integration Examples

Example 1: Quick Start (Python)

The simplest way to get started with IOA Cloud:

import requests

# Your IOA Cloud API key
IOA_API_KEY = "ioa_usr_sk_live_YOUR_KEY_HERE"

# Make a chat completion request
response = requests.post(
    "https://api.orchintel.com/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {IOA_API_KEY}",
        "Content-Type": "application/json"
    },
    json={
        "model": "gpt-4o",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ],
        "max_tokens": 100
    }
)

# Get the response
data = response.json()
print(data["choices"][0]["message"]["content"])
# Output: The capital of France is Paris.

Example 2: Using Governance Modes

Control governance behavior with the ioa-mode header:

import requests

IOA_API_KEY = "ioa_usr_sk_live_YOUR_KEY_HERE"

def chat_with_governance(message, mode="enforce"):
    """Send a message with specified governance mode."""
    response = requests.post(
        "https://api.orchintel.com/v1/chat/completions",
        headers={
            "Authorization": f"Bearer {IOA_API_KEY}",
            "Content-Type": "application/json",
            "ioa-mode": mode  # "shadow", "enforce", or "consensus"
        },
        json={
            "model": "gpt-4o",
            "messages": [{"role": "user", "content": message}],
            "max_tokens": 200
        }
    )

    if response.status_code == 200:
        return response.json()["choices"][0]["message"]["content"]
    elif response.status_code == 403:
        # Request blocked by policy
        return f"Blocked: {response.json()}"
    else:
        return f"Error: {response.status_code}"

# Shadow mode: logs policies but doesn't block
result = chat_with_governance("Hello!", mode="shadow")
print(f"Shadow: {result}")

# Enforce mode: actively blocks policy violations
result = chat_with_governance("What is 2+2?", mode="enforce")
print(f"Enforce: {result}")

Example 3: cURL Command

Test the API directly from your terminal:

curl -X POST https://api.orchintel.com/v1/chat/completions \
  -H "Authorization: Bearer ioa_usr_sk_live_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -H "ioa-mode: enforce" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 50
  }'

Example 4: Async Python (httpx)

For async applications using httpx:

import httpx
import asyncio

async def chat_async(message):
    """Async chat completion with IOA Cloud."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.orchintel.com/v1/chat/completions",
            headers={
                "Authorization": "Bearer ioa_usr_sk_live_YOUR_KEY_HERE",
                "Content-Type": "application/json",
                "ioa-mode": "enforce"
            },
            json={
                "model": "gpt-4o",
                "messages": [{"role": "user", "content": message}],
                "max_tokens": 200
            },
            timeout=30.0
        )

        if response.status_code == 200:
            return response.json()["choices"][0]["message"]["content"]
        else:
            raise Exception(f"API Error: {response.status_code}")

# Run async function
result = asyncio.run(chat_async("Explain quantum computing briefly."))
print(result)

Example 5: Chatbot with Conversation History

import requests

class IOAChatbot:
    def __init__(self, api_key):
        self.api_key = api_key
        self.history = []

    def chat(self, message):
        """Send message and maintain conversation history."""
        self.history.append({"role": "user", "content": message})

        response = requests.post(
            "https://api.orchintel.com/v1/chat/completions",
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
                "ioa-mode": "enforce"
            },
            json={
                "model": "gpt-4o",
                "messages": self.history,
                "max_tokens": 300
            }
        )

        if response.status_code == 200:
            reply = response.json()["choices"][0]["message"]["content"]
            self.history.append({"role": "assistant", "content": reply})
            return reply
        else:
            return f"Error: {response.status_code}"

# Usage
bot = IOAChatbot("ioa_usr_sk_live_YOUR_KEY_HERE")
print(bot.chat("What is Python?"))
print(bot.chat("What are its main uses?"))  # Remembers context

Error Handling

HTTP Status Codes

| Status Code | Meaning                 | Action                            |
| ----------- | ----------------------- | --------------------------------- |
| 200         | Success                 | Process response normally         |
| 401         | Invalid API key         | Check API key format and validity |
| 403         | Policy violation        | Request blocked by governance     |
| 429         | Rate limit exceeded     | Implement exponential backoff     |
| 500         | Server error            | Retry with backoff                |
| 502         | Upstream provider error | Retry or fallback                 |
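The backoff advice for 429 and 5xx responses can be sketched as a small retry helper. `backoff_delays` and `with_retries` are illustrative names, not part of any IOA SDK:

```python
import time

def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponential backoff schedule: base * 2**attempt, capped at `cap` seconds."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]

def with_retries(send, retries: int = 5):
    """Call `send()` (any function returning an HTTP response) and retry on 429/5xx."""
    for delay in backoff_delays(retries):
        response = send()
        if response.status_code not in (429, 500, 502):
            return response
        time.sleep(delay)  # consider adding random jitter in production
    return send()  # one final attempt after the schedule is exhausted

print(backoff_delays(5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

Wrap any of the `requests` or `httpx` calls from the examples above in a zero-argument function and pass it to `with_retries`; honoring a `Retry-After` header when present is a sensible refinement.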

Policy Violation Handling

class PolicyViolationError(Exception):
    def __init__(self, violations):
        self.violations = violations
        super().__init__(f"Request blocked by {len(violations)} policy violations")

class RateLimitError(Exception):
    """Raised on HTTP 429 responses."""

class ServerError(Exception):
    """Raised on HTTP 5xx responses."""

class APIError(Exception):
    """Raised for other unexpected responses."""

async def handle_ioa_response(response):
    """Handle IOA Cloud API response with proper error handling."""
    if response.status_code == 200:
        return response.json()
    elif response.status_code == 403:
        error_data = response.json()
        violations = error_data.get("error", {}).get("violations", [])
        raise PolicyViolationError(violations)
    elif response.status_code == 429:
        # Rate limited - implement backoff
        retry_after = response.headers.get("Retry-After", "60")
        raise RateLimitError(f"Rate limited. Retry after {retry_after} seconds")
    elif response.status_code >= 500:
        raise ServerError(f"Server error: {response.status_code}")
    else:
        raise APIError(f"API error: {response.status_code} - {response.text}")

# Usage in your application
try:
    result = await handle_ioa_response(response)
    print("Success:", result["choices"][0]["message"]["content"])
except PolicyViolationError as e:
    print(f"Policy violation: {len(e.violations)} rules triggered")
    # Handle blocked content (show user-friendly message)
except RateLimitError as e:
    print(f"Rate limited: {e}")
    # Implement backoff/retry logic
except ServerError as e:
    print(f"Server error: {e}")
    # Retry or show error to user

Billing & Usage

Cost Models

IOA Cloud offers flexible billing options:

1. House Key Billing (IOA Pays for LLM)

  • Launch: 1,000 requests/month included (Free)
  • Scale: 25,000 requests/month included ($299/month)
  • Trust: Custom request volumes (starting at $25,000/year)

2. Bring Your Own Key (BYOK) Billing

  • Use your own OpenAI/Anthropic API keys
  • IOA charges for governance only; LLM provider bills separately
  • Full control over LLM costs and model selection

Rate Limits

| Plan   | Requests/Second | Requests/Month |
| ------ | --------------- | -------------- |
| Launch | 2               | 1,000          |
| Scale  | 5               | 25,000         |
| Trust  | Custom          | Custom         |
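To stay under your plan's per-second limit, a simple client-side pacer can help. This `Throttle` class is an illustrative sketch, not part of the IOA API:

```python
import time

class Throttle:
    """Client-side pacer that spaces requests to respect a requests-per-second cap."""
    def __init__(self, rps: float):
        self.min_interval = 1.0 / rps
        self.last = 0.0

    def wait_time(self, now: float) -> float:
        """Seconds to wait before the next request is allowed."""
        return max(0.0, self.last + self.min_interval - now)

    def acquire(self):
        """Block until a request may be sent, then record the send time."""
        delay = self.wait_time(time.monotonic())
        if delay > 0:
            time.sleep(delay)
        self.last = time.monotonic()

# Launch plan allows 2 RPS, so requests are spaced at least 0.5 s apart.
throttle = Throttle(rps=2)
```

Call `throttle.acquire()` immediately before each API request; server-side 429 responses should still be handled as described in Error Handling.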

Migration Guide

From OpenAI Direct

Minimal Changes Required:

# BEFORE (OpenAI direct)
import openai

client = openai.OpenAI(api_key="sk-...")
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}]
)

# AFTER (IOA Cloud)
import httpx

headers = {
    "Authorization": "Bearer ioa_usr_sk_live_XXXXXXXXXXXXXXXXXXXX",
    "Content-Type": "application/json"
}

payload = {
    "model": "gpt-4o",  # Note: gpt-4o instead of gpt-4
    "messages": [{"role": "user", "content": "Hello"}]
}

async with httpx.AsyncClient() as client:
    response = await client.post(
        "https://api.orchintel.com/v1/chat/completions",
        headers=headers,
        json=payload
    )
    data = response.json()
    content = data["choices"][0]["message"]["content"]

Support & Resources

Documentation

Support Channels

Conclusion

IOA Cloud integration enables third-party applications to add hosted AI governance workflows with minimal code changes. The OpenAI-compatible API eases migration, while evidence generation provides auditable runtime records.

Key Takeaways

  1. Simple integration - Change endpoint, add Bearer token
  2. Flexible governance - Shadow/Enforce/Consensus modes
  3. Automatic compliance - Evidence bundles for audits
  4. Enterprise features - Multi-LLM consensus, custom policies
  5. Cost optimization - House key or bring-your-own-key billing

Next Steps

  1. Get API key from console.orchintel.com
  2. Test with shadow mode in development
  3. Gradually roll out with enforce mode in production
  4. Monitor usage and evidence in console dashboard