# AI Usage Tracking
Loquent logs every AI call — text generation, transcription, and realtime voice sessions — to the `ai_usage_log` table. Each record captures the organization, feature, model, and token counts, enabling per-org cost tracking and usage analytics.
## How It Works

After every AI call completes, a fire-and-forget logger spawns a background task to insert a usage record. Logging never blocks the response or propagates errors — failures produce a warning log and nothing more.
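A minimal sketch of this pattern, with stand-ins for the real pieces: the actual logger presumably spawns an async task, while `std::thread` and `eprintln!` substitute here for the runtime and the warning log, and the entry/insert types are purely illustrative.

```rust
use std::thread::{self, JoinHandle};

// Illustrative stand-in for AiUsageEntry.
struct UsageEntry {
    feature: String,
}

// Illustrative stand-in for the database insert.
fn insert_usage_row(entry: &UsageEntry) -> Result<(), String> {
    if entry.feature.is_empty() {
        return Err("missing feature".to_string());
    }
    Ok(())
}

/// Fire-and-forget: run the insert in the background and swallow failures.
/// The handle is returned only so tests can wait on it; callers just drop it.
fn spawn_log_usage(entry: UsageEntry) -> JoinHandle<()> {
    thread::spawn(move || {
        if let Err(e) = insert_usage_row(&entry) {
            // A failure produces a warning log and nothing more.
            eprintln!("warn: failed to log AI usage: {e}");
        }
    })
}
```

The key property is that the caller never awaits the insert and never sees its errors, so a logging outage cannot degrade AI responses.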
```rust
use crate::mods::ai::{spawn_log_ai_usage, AiUsageEntry, AiUsageFeature, AiModels};

let response = builder.build().generate_text().await?;

// Fire-and-forget: logs asynchronously, never fails the caller
spawn_log_ai_usage(AiUsageEntry::from_text_generation(
    organization_id,
    Some(call_id), // None if not call-related
    AiUsageFeature::SummarizeCall,
    AiModels::SUMMARIZE_CALL,
    &response.usage(),
));

let result = response.into_schema()?;
```

For transcriptions, use `from_transcription` with the usage data from the OpenAI response:
```rust
// Extract token counts from the transcription response
let (input_tokens, output_tokens, input_audio_tokens) = response
    .usage
    .map(|u| {
        let input_audio = u.input_token_details.as_ref().and_then(|d| d.audio_tokens);
        let input_text = u.input_token_details.and_then(|d| d.text_tokens);
        (input_text, u.output_tokens, input_audio)
    })
    .unwrap_or((None, None, None));

spawn_log_ai_usage(AiUsageEntry::from_transcription(
    organization_id,
    Some(call_id),
    AiUsageFeature::Transcription,
    "gpt-4o-transcribe",
    input_tokens,
    output_tokens,
    input_audio_tokens,
    Some(duration_secs),
));
```

## Realtime Voice Session Tracking
During voice calls, each model response turn emits usage data. Loquent captures these per-turn token counts from both OpenAI and Gemini realtime sessions using a provider-agnostic `RealtimeUsageTick` struct.
### RealtimeUsageTick

```rust
#[derive(Debug, Clone)]
pub struct RealtimeUsageTick {
    pub provider: &'static str, // "openai" or "gemini"
    pub model: String,          // e.g., "gpt-4o-realtime-preview"
    pub input_tokens: Option<i32>,
    pub output_tokens: Option<i32>,
    pub input_audio_tokens: Option<i32>,
    pub output_audio_tokens: Option<i32>,
    pub cached_tokens: Option<i32>,
}
```

### How Providers Report Usage
OpenAI sends a `response.done` event after each model response turn. The event includes `usage` with `input_token_details` and `output_token_details` that break down text vs. audio tokens.

Gemini includes `usage_metadata` on server messages with `prompt_token_count` and `candidates_token_count`. Gemini doesn't provide separate audio token counts — those fields are `None`.

Gemini model names arrive as full Vertex AI resource paths. Loquent strips the path prefix, converting `projects/.../models/gemini-live-2.5-flash-native-audio` to `gemini-live-2.5-flash-native-audio`.
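The Gemini handling above can be sketched as follows. The `GeminiUsageMetadata` shape, the helper names, and the placeholder path segments are assumptions for illustration; `RealtimeUsageTick` mirrors the struct shown earlier.

```rust
#[derive(Debug, Clone)]
pub struct RealtimeUsageTick {
    pub provider: &'static str,
    pub model: String,
    pub input_tokens: Option<i32>,
    pub output_tokens: Option<i32>,
    pub input_audio_tokens: Option<i32>,
    pub output_audio_tokens: Option<i32>,
    pub cached_tokens: Option<i32>,
}

/// Assumed deserialized shape of Gemini's `usage_metadata`.
pub struct GeminiUsageMetadata {
    pub prompt_token_count: Option<i32>,
    pub candidates_token_count: Option<i32>,
}

/// Keep only the final segment of a Vertex AI model resource path.
/// A bare model name passes through unchanged.
fn strip_model_path(model: &str) -> &str {
    model.rsplit('/').next().unwrap_or(model)
}

fn tick_from_gemini(model_path: &str, usage: &GeminiUsageMetadata) -> RealtimeUsageTick {
    RealtimeUsageTick {
        provider: "gemini",
        model: strip_model_path(model_path).to_string(),
        input_tokens: usage.prompt_token_count,
        output_tokens: usage.candidates_token_count,
        // Gemini doesn't break out audio or cached tokens.
        input_audio_tokens: None,
        output_audio_tokens: None,
        cached_tokens: None,
    }
}
```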
### Logging Realtime Usage

In the Twilio stream route's realtime event loop, `UsageTick` events trigger a fire-and-forget log:
```rust
RealtimeInEvent::UsageTick(tick) => {
    spawn_log_ai_usage(AiUsageEntry::from_realtime_turn(
        organization_id,
        call_id,
        tick,
    ));
}
```

`from_realtime_turn` maps the tick to an `AiUsageEntry` with `usage_type: "realtime"` and `feature: "realtime_turn"`. Every turn in a voice call produces one row in `ai_usage_log`.
## Database Schema

The `ai_usage_log` table stores one row per AI call:
| Column | Type | Description |
|---|---|---|
| `id` | UUID | Primary key |
| `organization_id` | UUID | FK to organization (CASCADE) |
| `call_id` | UUID? | FK to call (SET NULL) — only for call-related features |
| `feature` | TEXT | snake_case feature label from `AiUsageFeature` |
| `provider` | TEXT | Extracted from model slug (e.g., "google", "openai", "anthropic", "deepseek") |
| `model` | TEXT | Model identifier (e.g., "deepseek/deepseek-v3.2") |
| `usage_type` | TEXT | "text_generation", "transcription", or "realtime" |
| `input_tokens` | INT? | Input tokens consumed |
| `output_tokens` | INT? | Output tokens generated |
| `cached_tokens` | INT? | Cached/prompt-cached tokens |
| `reasoning_tokens` | INT? | Reasoning tokens (extended thinking models) |
| `input_audio_tokens` | INT? | Input audio tokens (realtime voice sessions) |
| `output_audio_tokens` | INT? | Output audio tokens (realtime voice sessions) |
| `audio_duration_secs` | DOUBLE? | Audio duration in seconds (transcription only) |
| `created_at` | TIMESTAMPTZ | Auto-set to CURRENT_TIMESTAMP |
Indexes: A composite index on `(organization_id, created_at)` optimizes per-org date-range queries. A partial index on `call_id WHERE call_id IS NOT NULL` speeds up call-specific lookups.
## AiUsageFeature Enum

`AiUsageFeature` in `src/mods/ai/types/ai_usage_type.rs` identifies which capability generated the usage:
| Variant | Feature | Provider |
|---|---|---|
| `EnrichContact` | Contact enrichment from transcription | OpenRouter |
| `EnrichContactFromMessages` | Contact enrichment from SMS history | OpenRouter |
| `SummarizeCall` | Call summary generation | OpenRouter |
| `UpdateContactMemory` | System note updates | OpenRouter |
| `AnalyzeCall` | Call analysis (user-defined analyzers) | OpenRouter |
| `IdentifySpeakers` | Speaker identification | OpenRouter |
| `AutoTagContact` | Automatic contact tagging | OpenRouter |
| `ExtractTodos` | Todo extraction from calls | OpenRouter |
| `ExecuteTodo` | Todo action execution | OpenRouter |
| `QueryKnowledge` | Knowledge base RAG queries | OpenRouter |
| `GenerateInstructions` | AI builder: generate agent instructions | OpenRouter |
| `EditInstructions` | AI builder: edit agent instructions | OpenRouter |
| `CustomEditInstructions` | AI builder: custom edit instructions | OpenRouter |
| `Transcription` | Standard audio transcription | OpenAI |
| `DiarizedTranscription` | Diarized transcription (speaker labels) | OpenAI |
| `TextAgentSuggestions` | Text agent response suggestions | OpenRouter |
| `RealtimeTurn` | Per-turn usage from realtime voice sessions | OpenAI / Gemini |
## Adding Usage Tracking to New Features

When you add a new AI call site, follow these steps:

- Add an `AiUsageFeature` variant and its `as_str()` mapping in `ai_usage_type.rs`.
- Add an `AiModels` constant in `ai_models_type.rs` if you're using a new model.
- Ensure `organization_id` is available — add it as a function parameter if needed.
- Call `spawn_log_ai_usage` immediately after receiving the AI response, before consuming it.
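The first step can be sketched like this. `TranslateMessage` is a hypothetical new variant, and the existing variants shown are only a subset of the real enum in `ai_usage_type.rs`.

```rust
#[derive(Debug, Clone, Copy)]
pub enum AiUsageFeature {
    SummarizeCall,
    Transcription,
    TranslateMessage, // hypothetical new variant
}

impl AiUsageFeature {
    /// snake_case label stored in the `feature` column.
    pub fn as_str(&self) -> &'static str {
        match self {
            Self::SummarizeCall => "summarize_call",
            Self::Transcription => "transcription",
            Self::TranslateMessage => "translate_message",
        }
    }
}
```

Keeping the label next to the variant in one `match` means the compiler flags any variant you add but forget to map.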
## Cost Calculation

Loquent calculates costs server-side using hardcoded per-model pricing in `src/mods/ai/types/ai_pricing_type.rs`. The `get_model_pricing(provider, model)` function returns a `ModelPricing` struct with rates per 1M tokens for input, output, cached, and audio tokens.
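As a rough sketch of the arithmetic, assuming per-1M-token rates: the field names below are illustrative, not the real `ai_pricing_type.rs` definition, which also carries cached and audio rates.

```rust
/// Illustrative pricing struct: rates are USD per 1M tokens.
pub struct ModelPricing {
    pub input_per_1m: f64,
    pub output_per_1m: f64,
}

/// Cost = (tokens / 1,000,000) * rate, summed per token class.
fn estimate_cost(p: &ModelPricing, input_tokens: i64, output_tokens: i64) -> f64 {
    (input_tokens as f64 / 1_000_000.0) * p.input_per_1m
        + (output_tokens as f64 / 1_000_000.0) * p.output_per_1m
}
```

For example, 500k input tokens at $2/1M plus 250k output tokens at $8/1M comes to $3.00.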
The Admin AI Costs dashboard uses this pricing engine to aggregate costs across models, providers, features, and organizations with period-over-period comparisons.
## Querying Usage Data

The admin AI Costs dashboard (`GET /api/admin/ai-costs`) provides aggregated cost analytics. See the Admin module docs for details.
For raw data, query the database directly:
```sql
SELECT
    feature,
    model,
    SUM(COALESCE(input_tokens, 0)) AS total_input,
    SUM(COALESCE(output_tokens, 0)) AS total_output,
    COUNT(*) AS call_count
FROM ai_usage_log
WHERE organization_id = 'your-org-id'
    AND created_at >= '2026-03-01'
    AND created_at < '2026-04-01'
    AND usage_type = 'text_generation'
GROUP BY feature, model
ORDER BY total_output DESC;
```

## Key Files
| File | Purpose |
|---|---|
| `src/mods/ai/types/ai_usage_type.rs` | `AiUsageFeature` enum and `AiUsageEntry` struct |
| `src/mods/ai/services/log_ai_usage_service.rs` | `spawn_log_ai_usage()` fire-and-forget logger |
| `src/mods/ai/types/ai_pricing_type.rs` | `ModelPricing` struct and `get_model_pricing()` lookup |
| `src/mods/ai/types/ai_models_type.rs` | Centralized model registry |
| `src/mods/agent/types/realtime_usage_tick_type.rs` | `RealtimeUsageTick` provider-agnostic struct |
| `src/mods/openai/types/openai_realtime_in_event_response_done_type.rs` | OpenAI `response.done` usage parsing |
| `src/mods/gemini/types/gemini_realtime_in_event_type.rs` | Gemini `usage_metadata` parsing |
| `src/mods/twilio/routes/twilio_stream_route.rs` | Realtime event loop with usage logging |
| `migration/src/m20260308_000000_create_ai_usage_log_table.rs` | Table migration |