Agent
The agent module defines AI voice agents that handle incoming phone calls. Each agent has an identity, a system prompt, a realtime provider configuration, a tool configuration, and optional knowledge bases.
Data Model
```rust
pub struct Agent {
    pub id: Uuid,
    pub name: String,
    pub prompt: String,
    pub realtime_config: AgentRealtimeConfig,
    pub tool_config: AgentToolConfig,
    pub knowledge_base_ids: Vec<Uuid>,
}
```
AgentData (Create/Update Payload)
```rust
pub struct AgentData {
    pub name: String,
    pub prompt: String,
    pub realtime_config: AgentRealtimeConfig,
    pub tool_config: AgentToolConfig,
    pub knowledge_base_ids: Vec<Uuid>,
}
```
API Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /api/agents | List all agents for the organization |
| GET | /api/agents/:id | Get agent with knowledge base IDs |
| POST | /api/agents | Create agent + link knowledge bases |
| PUT | /api/agents/:id | Update agent (replaces all knowledge base links) |
| DELETE | /api/agents/:id | Delete agent |
All endpoints require an authenticated Session. Agents are scoped to the session’s organization.
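The replace-all behaviour of PUT can be illustrated with a small in-memory sketch. The `LinkStore` type and its methods are hypothetical, and `u128` stands in for `Uuid` so the example needs only the standard library; the real module presumably persists these links in a database.

```rust
use std::collections::{BTreeSet, HashMap};

/// Stand-in for `Uuid` so the sketch compiles with the standard library only.
type Id = u128;

/// Hypothetical in-memory store of agent -> knowledge base links.
#[derive(Default)]
pub struct LinkStore {
    links: HashMap<Id, BTreeSet<Id>>,
}

impl LinkStore {
    /// Mirrors the PUT semantics: the new list replaces every existing link.
    pub fn replace_links(&mut self, agent_id: Id, knowledge_base_ids: &[Id]) {
        let set: BTreeSet<Id> = knowledge_base_ids.iter().copied().collect();
        self.links.insert(agent_id, set);
    }

    /// Returns the linked knowledge base IDs in ascending order.
    pub fn links_for(&self, agent_id: Id) -> Vec<Id> {
        self.links
            .get(&agent_id)
            .map(|set| set.iter().copied().collect())
            .unwrap_or_default()
    }
}
```

Under these semantics a client that wants to add one knowledge base has to send the full list, including the links it wants to keep.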
Realtime Configuration
AgentRealtimeConfig is a tagged enum: the provider field determines which AI backend handles voice sessions.
OpenAI Provider
```json
{
  "provider": "openai",
  "model": "gpt-realtime-1.5",
  "voice": "marin",
  "speed": 1.0,
  "noise_reduction": "near_field",
  "vad": {
    "type": "semantic_vad",
    "eagerness": "medium",
    "create_response": true,
    "interrupt_response": true
  }
}
```
| Setting | Options | Default |
|---|---|---|
| model | gpt-realtime-1.5, gpt-realtime, gpt-realtime-mini | gpt-realtime-1.5 |
| voice | marin, cedar, alloy, ash, ballad, coral, echo, sage, shimmer, verse | marin |
| speed | f32 | 1.0 |
| noise_reduction | near_field, far_field, none | near_field |
| vad.type | semantic_vad, server_vad | semantic_vad |
| vad.eagerness | low, medium, high, auto | medium |
Server VAD exposes additional fields: threshold, prefix_padding_ms, silence_duration_ms, idle_timeout_ms.
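Because the two VAD modes carry different fields, they map naturally onto a Rust enum. The sketch below is an assumption about how such a type could look, not the module's actual definition: the type names, field types, and `Option` wrappers are guesses. The field names come from the text above, and the defaults come from the settings table and the example payload.

```rust
/// Eagerness levels listed for `semantic_vad`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Eagerness {
    Low,
    Medium,
    High,
    Auto,
}

/// Hypothetical model of the `vad` object; the variant is selected by `type`.
#[derive(Debug, Clone, PartialEq)]
pub enum Vad {
    Semantic {
        eagerness: Eagerness,
        create_response: bool,
        interrupt_response: bool,
    },
    Server {
        threshold: Option<f32>,
        prefix_padding_ms: Option<u32>,
        silence_duration_ms: Option<u32>,
        idle_timeout_ms: Option<u32>,
    },
}

impl Default for Vad {
    /// Matches the documented defaults: semantic VAD with medium eagerness.
    fn default() -> Self {
        Vad::Semantic {
            eagerness: Eagerness::Medium,
            create_response: true,
            interrupt_response: true,
        }
    }
}

impl Vad {
    /// The value that would appear in the JSON `type` field.
    pub fn type_tag(&self) -> &'static str {
        match self {
            Vad::Semantic { .. } => "semantic_vad",
            Vad::Server { .. } => "server_vad",
        }
    }
}
```

Modelling the modes as variants keeps server-only fields such as `threshold` out of a semantic VAD configuration at compile time.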
Gemini Provider
```json
{
  "provider": "gemini",
  "model": "gemini-live-2.5-flash-preview-native-audio-09-2025",
  "voice": "Aoede",
  "start_of_speech_sensitivity": "START_SENSITIVITY_HIGH",
  "end_of_speech_sensitivity": "END_SENSITIVITY_HIGH",
  "activity_handling": "START_OF_ACTIVITY_INTERRUPTS"
}
```
| Setting | Options | Default |
|---|---|---|
| voice | Aoede, Charon, Fenrir, Kore, Puck | Aoede |
| start_of_speech_sensitivity | START_SENSITIVITY_HIGH, START_SENSITIVITY_LOW | START_SENSITIVITY_HIGH |
| end_of_speech_sensitivity | END_SENSITIVITY_HIGH, END_SENSITIVITY_LOW | END_SENSITIVITY_HIGH |
| silence_duration_ms | Option<u32> | None |
| activity_handling | START_OF_ACTIVITY_INTERRUPTS, NO_INTERRUPTION | START_OF_ACTIVITY_INTERRUPTS |
Tool Configuration
Each agent has a tool_config field that controls which tools are available and holds their per-tool settings:
```rust
pub struct AgentToolConfig {
    pub disabled_tools: Vec<String>,                        // Tools explicitly disabled for this agent
    pub tool_settings: HashMap<String, serde_json::Value>,  // Per-tool config
}
```
An empty config (the default) enables all tools with no custom settings, which preserves backward compatibility for existing agents.
Available Tools
| Tool | Enabled When | Description |
|---|---|---|
| end_call | Always (cannot be disabled) | Ends the call programmatically via Twilio REST API |
| transfer_call | Not in disabled_tools | Transfers the active call to a specified phone number |
| query_knowledge | Agent has linked knowledge bases | Queries knowledge bases using LLM-powered search |
| lookup_caller | organization.client_lookup_url is configured | Looks up caller info via external API |
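The enablement rules in the table can be read as a single function. The following is a sketch, not the module's code: `ToolConfig` and `enabled_tools` are hypothetical names, and it assumes that `disabled_tools` can also switch off `query_knowledge` and `lookup_caller` once their own conditions are met, which the table does not state explicitly.

```rust
/// Minimal stand-in for `AgentToolConfig`; only `disabled_tools` matters here.
pub struct ToolConfig {
    pub disabled_tools: Vec<String>,
}

/// Returns the tool names available to an agent, following the table above.
/// Assumption: `disabled_tools` applies to every tool except `end_call`.
pub fn enabled_tools(
    config: &ToolConfig,
    knowledge_base_count: usize,
    client_lookup_url: Option<&str>,
) -> Vec<&'static str> {
    let is_disabled = |name: &str| config.disabled_tools.iter().any(|t| t == name);

    // `end_call` is always present and ignores `disabled_tools`.
    let mut tools = vec!["end_call"];
    if !is_disabled("transfer_call") {
        tools.push("transfer_call");
    }
    if knowledge_base_count > 0 && !is_disabled("query_knowledge") {
        tools.push("query_knowledge");
    }
    if client_lookup_url.is_some() && !is_disabled("lookup_caller") {
        tools.push("lookup_caller");
    }
    tools
}
```

With an empty config, no knowledge bases, and no lookup URL, this yields `end_call` and `transfer_call` only, since the other two tools depend on data outside the tool config.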
Transfer Call Tool
The transfer_call tool supports optional per-agent transfer numbers. If tool_settings["transfer_call"] contains a TransferCallSettings object, the agent can only transfer to the configured numbers. If no settings are provided, the agent may transfer to any number mentioned in its prompt.
```rust
pub struct TransferCallSettings {
    pub numbers: Vec<TransferNumber>,
}

pub struct TransferNumber {
    pub number: String, // E.164 format: "+12125551234"
    pub label: String,  // "Sales", "Support", etc.
}
```
The UI renders a ToolConfigForm component with checkboxes for each tool and inline inputs for transfer numbers when transfer_call is enabled.
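The restriction described for transfer_call could be enforced with a guard like the one below. This is a sketch under stated assumptions: `is_e164` and `transfer_allowed` are hypothetical helpers, the structs are redeclared so the block stands alone, and the check only validates the general E.164 shape (a leading `+`, a non-zero first digit, at most 15 digits).

```rust
pub struct TransferNumber {
    pub number: String,
    pub label: String,
}

pub struct TransferCallSettings {
    pub numbers: Vec<TransferNumber>,
}

/// Checks the E.164 shape: '+', a non-zero leading digit, at most 15 digits.
pub fn is_e164(number: &str) -> bool {
    match number.strip_prefix('+') {
        Some(digits) => {
            (1..=15).contains(&digits.len())
                && !digits.starts_with('0')
                && digits.bytes().all(|b| b.is_ascii_digit())
        }
        None => false,
    }
}

/// Hypothetical guard for a transfer request.
/// With settings, only configured numbers pass; without settings, any
/// well-formed number passes and the prompt is the only restriction.
pub fn transfer_allowed(settings: Option<&TransferCallSettings>, target: &str) -> bool {
    if !is_e164(target) {
        return false;
    }
    match settings {
        Some(s) => s.numbers.iter().any(|n| n.number == target),
        None => true,
    }
}
```

Checking the target on the server side, rather than trusting the model's tool arguments, keeps a misheard or hallucinated number from being dialled when an allow-list is configured.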
Realtime Session Traits
The module defines provider-agnostic traits for realtime communication:
```rust
#[async_trait]
pub trait RealtimeSessionSender: Send + Sync {
    async fn send_audio_delta(&mut self, audio_delta: String) -> Result<...>;
    async fn send_tool_response(&mut self, call_id: String, function_name: String, output: String) -> Result<...>;
}

#[async_trait]
pub trait RealtimeSessionReceiver: Send {
    async fn receive_event(&mut self) -> Result<RealtimeInEvent, ...>;
}
```
Events received from any provider are normalized into RealtimeInEvent:
| Variant | Meaning |
|---|---|
AudioDelta | Audio chunk to relay to the caller |
SpeechStarted | User interrupted — model cancelled its response |
ResponseStarted | Model started a new response |
ToolCall | Model requesting a function call |
Unknown | Any other event (ignored) |
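A call loop would typically match on these variants to decide what to do next. The sketch below is illustrative only: the payload fields on `RealtimeInEvent`, the `Action` enum, and `next_action` are assumptions, as is the choice to clear queued playback on `SpeechStarted` and to take no action on `ResponseStarted`.

```rust
/// Sketch of the normalized event type; the payload fields are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum RealtimeInEvent {
    AudioDelta { audio: String },
    SpeechStarted,
    ResponseStarted,
    ToolCall { call_id: String, function_name: String, arguments: String },
    Unknown,
}

/// What the call loop should do next (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    RelayAudio(String),
    ClearPlayback,
    RunTool { call_id: String, function_name: String, arguments: String },
    Ignore,
}

/// Maps a provider-independent event to the next step of the call loop.
pub fn next_action(event: RealtimeInEvent) -> Action {
    match event {
        RealtimeInEvent::AudioDelta { audio } => Action::RelayAudio(audio),
        // The caller spoke over the model, so queued audio should be dropped.
        RealtimeInEvent::SpeechStarted => Action::ClearPlayback,
        RealtimeInEvent::ToolCall { call_id, function_name, arguments } => {
            Action::RunTool { call_id, function_name, arguments }
        }
        RealtimeInEvent::ResponseStarted | RealtimeInEvent::Unknown => Action::Ignore,
    }
}
```

Because the match is exhaustive, adding a variant to the event enum forces every consumer to decide how to handle it.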
AI Instruction Builder
The instruction builder uses GPT-4.5 to generate and edit agent system prompts from natural language descriptions.
Generate Instructions
POST /api/instructions/generate
Creates structured voice agent instructions from a user description. Automatically includes organization context when available.
Request:
```rust
pub struct GenerateInstructionsRequest {
    pub user_description: String,
}
```
Response:
```rust
pub struct AiInstructionResponse {
    pub instructions: String,
}
```
The system appends organization profile data (if configured) to the user description before sending it to GPT-4.5.
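The appending step might look roughly like the following. This is a guess at the shape rather than the actual service code: `build_generation_input` is a hypothetical name, the "Organization context:" separator is invented for illustration, and the organization profile is assumed to arrive already formatted as text.

```rust
/// Builds the text sent to the model for instruction generation.
/// `org_context` is the already formatted organization profile, if any.
pub fn build_generation_input(user_description: &str, org_context: Option<&str>) -> String {
    // Treat a missing or blank profile the same way: send the description alone.
    match org_context.map(str::trim).filter(|c| !c.is_empty()) {
        Some(context) => format!(
            "{}\n\nOrganization context:\n{}",
            user_description.trim(),
            context
        ),
        None => user_description.trim().to_string(),
    }
}
```

Falling back to the bare description keeps the endpoint usable for organizations that have not filled in a profile.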
Edit Instructions
POST /api/instructions/edit
Refines existing instructions using a predefined action (improve, make concise, add examples, etc.).
Request:
```rust
pub struct EditInstructionsRequest {
    pub action: InstructionAction,
    pub current_instructions: String,
}

pub enum InstructionAction {
    Improve,
    MakeConcise,
    ImproveTone,
    AddExamples,
    Fix,
    StrengthenSafety,
    AddPersonality,
    EnhanceWithOrgData, // Injects org profile context
}
```
The EnhanceWithOrgData action fetches the organization profile and formats it as markdown context for the LLM.
Custom Edit Instructions
POST /api/instructions/custom-edit
Applies a free-text editing instruction (e.g., “make this sound more professional”).
Request:
```rust
pub struct CustomEditRequest {
    pub edit_instruction: String,
    pub current_instructions: String,
}
```
UI Components
InstructionPreview displays instructions in three view modes:
| Mode | Purpose |
|---|---|
| Preview | Rendered markdown with syntax highlighting |
| Edit | Raw textarea for direct editing |
| Diff | Tabs to compare current vs original instructions |
The component supports undo by snapshotting the original content on mount.
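The snapshot-based undo can be sketched independently of any UI framework. `InstructionState` and its methods are hypothetical and only illustrate the idea of capturing the original text once and restoring it on demand; the actual component's state handling may differ.

```rust
/// Hypothetical state behind the preview: the snapshot is taken once.
pub struct InstructionState {
    original: String,
    current: String,
}

impl InstructionState {
    /// Corresponds to mounting the component: snapshot the initial content.
    pub fn mount(initial: &str) -> Self {
        Self {
            original: initial.to_string(),
            current: initial.to_string(),
        }
    }

    /// Applies a manual edit or an AI-generated revision.
    pub fn set_current(&mut self, text: &str) {
        self.current = text.to_string();
    }

    /// Restores the content captured at mount time.
    pub fn undo(&mut self) {
        self.current = self.original.clone();
    }

    /// True when there is something for the Diff view to show.
    pub fn is_dirty(&self) -> bool {
        self.current != self.original
    }

    /// The text currently shown in the preview.
    pub fn current(&self) -> &str {
        &self.current
    }
}
```

A single snapshot gives one-step undo back to the starting point; stepping back through intermediate revisions would need a history stack instead.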
AiInstructionToolbar — renders quick-action buttons for each InstructionAction. Clicking a button calls the edit API and updates the preview.
ToolConfigForm — per-agent tool toggles (transfer_call, lookup_caller, query_knowledge). When transfer_call is enabled, displays inline inputs for configuring transfer numbers (label + E.164 phone number).
Module Structure
```
src/mods/agent/
├── api/
│   ├── instruction_builder/    # Generate/edit/custom-edit APIs
│   └── ...                     # CRUD endpoints
├── components/
│   ├── instruction_builder/    # InstructionPreview, AiInstructionToolbar
│   └── ...                     # List, details, realtime config form
├── services/
│   ├── build_organization_context_service.rs  # Formats org profile for LLM
│   └── ...                     # Session, tool handling
├── prompts/                    # System prompt templates for AI editing actions
├── traits/                     # RealtimeSessionSender, RealtimeSessionReceiver
├── types/
│   ├── instruction_builder/    # Request/response types, InstructionAction enum
│   └── ...                     # Agent, realtime config, tool definitions
└── views/                      # Page views
```