What counts as a request?

Understanding how IOA Cloud meters requests and what counts toward your monthly quota.

Request definition

A "request" in IOA Cloud is defined as a single governance evaluation of an AI operation. Each time your application invokes an LLM through IOA's governance layer, we evaluate policies, generate evidence, and potentially enforce decisions. This complete cycle counts as one request.

What counts

LLM invocations: Any call to an LLM provider (OpenAI, Anthropic, Google, Cohere, etc.) routed through IOA
Policy evaluations: Pre-invoke and post-invoke policy checks
Consensus rounds: For Gold/Gold+ plans, multi-LLM consensus counts as one request (not per LLM)
Retries: If a policy blocks a request and your app retries, each retry counts separately

What doesn't count

Evidence downloads: Retrieving stored evidence bundles
Dashboard views: Viewing metrics, logs, or console pages
Webhook deliveries: Policy event notifications
API health checks: Status and diagnostic endpoints

Examples

Scenario 1: Simple chat completion

Action: Your app sends a prompt to OpenAI GPT-4 via IOA
Requests counted: 1

Scenario 2: Consensus with 3 LLMs

Action: Gold plan uses consensus mode with OpenAI, Anthropic, and Google
Requests counted: 1 (not 3)

Scenario 3: Policy blocks request, app retries

Action: Initial request blocked by PII policy, app modifies and retries
Requests counted: 2 (original + retry)
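
The three scenarios above can be sketched as a small counting function. This is an illustrative model only: the Event class, the event kind names, and count_requests are hypothetical and are not part of any IOA SDK or API. It assumes each governance evaluation cycle is recorded as one event, whether or not a policy blocked it.

```python
from dataclasses import dataclass

# Hypothetical event kinds, named for illustration only (not IOA identifiers).
BILLABLE_KINDS = {"llm_invocation", "consensus_round"}
NON_BILLABLE_KINDS = {
    "evidence_download",
    "dashboard_view",
    "webhook_delivery",
    "health_check",
}


@dataclass(frozen=True)
class Event:
    kind: str
    providers: tuple = ()   # LLM providers involved in the cycle
    blocked: bool = False   # True if a policy blocked the request


def count_requests(events):
    """Count one request per governance evaluation cycle.

    A consensus round counts once regardless of how many providers took part,
    and a blocked request still counts, so a retry adds a second request.
    """
    return sum(1 for event in events if event.kind in BILLABLE_KINDS)


# Scenario 1: simple chat completion.
scenario_1 = [Event("llm_invocation", ("openai",))]

# Scenario 2: consensus with 3 LLMs.
scenario_2 = [Event("consensus_round", ("openai", "anthropic", "google"))]

# Scenario 3: original request blocked by a PII policy, then a retry.
scenario_3 = [
    Event("llm_invocation", ("openai",), blocked=True),
    Event("llm_invocation", ("openai",)),
]

# Non-billable activity mixed in with one real request.
mixed = scenario_1 + [Event("dashboard_view"), Event("evidence_download")]

for name, events in [("1", scenario_1), ("2", scenario_2), ("3", scenario_3)]:
    print(f"Scenario {name}: {count_requests(events)} request(s)")
```

Counting by cycle rather than by provider call is what keeps Scenario 2 at one request, while the blocked attempt in Scenario 3 still counts because its policy evaluation ran.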

Request limits & overages

Each plan includes a monthly request quota. If you exceed your quota:

Rate limiting: Requests are rate-limited (not blocked)
Notification: You'll receive email alerts at 80% and 100% usage
Overages: Purchase overage packs ($10 per 10k requests) or upgrade your plan
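
As a rough worked example of the overage arithmetic, the sketch below estimates how many overage packs a given month would need and which email alerts would have fired. The function names and the 100,000-request quota are hypothetical, and the sketch assumes packs are sold only in whole units of 10k requests, so a partial pack rounds up. Check your plan details before relying on these numbers.

```python
import math

PACK_SIZE = 10_000        # requests per overage pack (from the pricing above)
PACK_PRICE_USD = 10       # price per pack (from the pricing above)
ALERT_THRESHOLDS = (80, 100)  # percent of quota at which email alerts are sent


def overage_packs_needed(used, quota):
    """Whole packs needed to cover usage beyond the quota (assumes rounding up)."""
    overage = max(0, used - quota)
    return math.ceil(overage / PACK_SIZE)


def overage_cost_usd(used, quota):
    """Estimated overage cost in US dollars."""
    return overage_packs_needed(used, quota) * PACK_PRICE_USD


def alerts_due(used, quota):
    """Alert thresholds (in percent) that current usage has reached."""
    return [pct for pct in ALERT_THRESHOLDS if used * 100 >= quota * pct]


# Hypothetical month: 112,500 requests against a 100,000-request quota.
quota = 100_000
used = 112_500
print(overage_packs_needed(used, quota))  # packs to purchase
print(overage_cost_usd(used, quota))      # cost in USD
print(alerts_due(used, quota))            # thresholds already crossed
```

In this hypothetical month the 12,500 requests over quota span two packs under the round-up assumption, for an estimated $20.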

Monitoring usage

Track your request usage in real time via:

Console dashboard: Current month usage and trending
Usage API: Programmatic access to usage metrics
Webhooks: Receive alerts at custom thresholds
