What counts as a request?

Understanding how IOA Cloud meters requests and what counts toward your monthly quota.

Request definition

A "request" in IOA Cloud is a single governance evaluation of an AI operation. Each time your application invokes an LLM through IOA's governance layer, we evaluate policies, generate evidence, and potentially enforce decisions. This complete cycle counts as one request.

What counts

  • LLM invocations: Any call to an LLM provider (OpenAI, Anthropic, Google, Cohere, etc.) routed through IOA
  • Policy evaluations: Pre-invoke and post-invoke policy checks, counted as part of the same request cycle rather than separately
  • Consensus rounds: For Gold/Gold+ plans, multi-LLM consensus counts as one request (not per LLM)
  • Retries: If a policy blocks a request and your app retries, each retry counts separately

What doesn't count

  • Evidence downloads: Retrieving stored evidence bundles
  • Dashboard views: Viewing metrics, logs, or console pages
  • Webhook deliveries: Policy event notifications
  • API health checks: Status and diagnostic endpoints

Examples

Scenario 1: Simple chat completion

Action: Your app sends a prompt to OpenAI GPT-4 via IOA
Requests counted: 1

Scenario 2: Consensus with 3 LLMs

Action: Gold plan uses consensus mode with OpenAI, Anthropic, and Google
Requests counted: 1 (not 3)

Scenario 3: Policy blocks request, app retries

Action: Initial request blocked by PII policy, app modifies and retries
Requests counted: 2 (original + retry)
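
The counting rules and the three scenarios above can be sketched as a small function. This is an illustrative model only, not IOA's metering code: the `Event` type, its fields, and the event kind names are hypothetical and chosen to mirror the lists in this article.

```python
from dataclasses import dataclass

# Hypothetical event kinds named after the lists above; not IOA identifiers.
BILLABLE = {"llm_invocation", "consensus"}
NON_BILLABLE = {"evidence_download", "dashboard_view", "webhook_delivery", "health_check"}


@dataclass
class Event:
    kind: str            # one of the kinds above
    providers: int = 1   # number of LLMs consulted (more than 1 only in consensus mode)
    retries: int = 0     # times the app re-sent the call after a policy block


def requests_counted(event: Event) -> int:
    """Return how many requests one event adds to the monthly quota."""
    if event.kind in NON_BILLABLE:
        return 0
    if event.kind not in BILLABLE:
        raise ValueError(f"unknown event kind: {event.kind}")
    # One governance cycle per attempt, no matter how many providers take
    # part in consensus; each retry is a separate cycle.
    return 1 + event.retries


scenarios = {
    "Scenario 1": Event("llm_invocation"),
    "Scenario 2": Event("consensus", providers=3),
    "Scenario 3": Event("llm_invocation", retries=1),
}
for name, event in scenarios.items():
    print(f"{name}: {requests_counted(event)}")
```

Note that `providers` is deliberately ignored in the count, which is what makes a three-LLM consensus round cost the same as a single invocation.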

Request limits & overages

Each plan includes a monthly request quota. If you exceed your quota:

  • Rate limiting: Additional requests are rate-limited rather than blocked
  • Notification: You'll receive email alerts at 80% and 100% usage
  • Overages: Purchase overage packs ($10 per 10k requests) or upgrade your plan
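
As a rough worked example of the arithmetic, the sketch below estimates which email alerts have been reached and what overage packs would cost. The pack price and alert levels come from the list above; the 100,000-request quota is a made-up figure, and the assumption that packs are sold whole (so a partial pack rounds up) is ours, not a documented rule.

```python
PACK_SIZE = 10_000            # requests per overage pack
PACK_PRICE_USD = 10           # price per overage pack
ALERT_LEVELS_PCT = (80, 100)  # email alerts at 80% and 100% usage


def alerts_reached(used: int, quota: int) -> list[int]:
    """Return the alert levels (in percent) that current usage has reached."""
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    return [pct for pct in ALERT_LEVELS_PCT if used * 100 >= pct * quota]


def overage_cost_usd(used: int, quota: int) -> int:
    """Cost of the packs needed to cover usage beyond the quota.

    Assumption: packs are sold whole, so a partial pack is rounded up.
    """
    excess = max(0, used - quota)
    packs = (excess + PACK_SIZE - 1) // PACK_SIZE  # ceiling division
    return packs * PACK_PRICE_USD


quota = 100_000  # hypothetical plan quota, for illustration only
for used in (75_000, 80_000, 125_000):
    print(used, alerts_reached(used, quota), overage_cost_usd(used, quota))
```

Under these assumptions, 125,000 requests against a 100,000 quota is 25,000 over, which needs three packs, or $30.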

Monitoring usage

Track your request usage in real time via:

  • Console dashboard: Current month usage and trending
  • Usage API: Programmatic access to usage metrics
  • Webhooks: Receive alerts at custom thresholds