Agent

The agent module defines AI voice agents that handle incoming phone calls. Each agent has a name, a system prompt, a realtime provider configuration, a tool configuration, and an optional set of linked knowledge bases.

Data Model

Agent

pub struct Agent {
    pub id: Uuid,
    pub name: String,
    pub prompt: String,
    pub realtime_config: AgentRealtimeConfig,
    pub tool_config: AgentToolConfig,
    pub knowledge_base_ids: Vec<Uuid>,
}

AgentData (Create/Update Payload)

pub struct AgentData {
    pub name: String,
    pub prompt: String,
    pub realtime_config: AgentRealtimeConfig,
    pub tool_config: AgentToolConfig,
    pub knowledge_base_ids: Vec<Uuid>,
}

API Endpoints

Method	Path	Description
`GET`	`/api/agents`	List all agents for the organization
`GET`	`/api/agents/:id`	Get agent with knowledge base IDs
`POST`	`/api/agents`	Create agent + link knowledge bases
`PUT`	`/api/agents/:id`	Update agent (replaces all knowledge base links)
`DELETE`	`/api/agents/:id`	Delete agent

All endpoints require an authenticated Session. Agents are scoped to the session’s organization.
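
The "replaces all knowledge base links" behaviour of `PUT` can be pictured with a small in-memory sketch. `LinkStore` and its methods are hypothetical names, `u64` stands in for `Uuid`, and the real module presumably persists links in a database rather than a map.

```rust
use std::collections::HashMap;

/// In-memory stand-in for the agent-to-knowledge-base link table (hypothetical).
#[derive(Default)]
pub struct LinkStore {
    // agent id -> linked knowledge base ids (u64 stands in for Uuid)
    links: HashMap<u64, Vec<u64>>,
}

impl LinkStore {
    /// PUT semantics: discard whatever was linked before and store exactly the new set.
    pub fn replace_links(&mut self, agent_id: u64, knowledge_base_ids: &[u64]) {
        self.links.insert(agent_id, knowledge_base_ids.to_vec());
    }

    /// Returns the current links for an agent, or an empty slice if none exist.
    pub fn links_for(&self, agent_id: u64) -> &[u64] {
        self.links.get(&agent_id).map(Vec::as_slice).unwrap_or(&[])
    }
}
```

Under this reading, a client that wants to keep existing links has to send the full list of IDs on every update, not only the new ones.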

Realtime Configuration

AgentRealtimeConfig is a tagged enum — the provider field determines which AI backend handles voice sessions.
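
As a rough illustration of the tagged-enum shape, the sketch below models the two providers with only their `model` and `voice` fields and exposes the tag through a `provider()` method. The struct names `OpenAiConfig` and `GeminiConfig` and the method itself are assumptions made for this example; the real types carry more fields and most likely rely on serde's internally tagged representation, which this page does not confirm.

```rust
// Illustrative only: trimmed-down stand-ins for the real config types.
#[derive(Debug, Clone, PartialEq)]
pub struct OpenAiConfig {
    pub model: String,
    pub voice: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct GeminiConfig {
    pub model: String,
    pub voice: String,
}

// With serde this would presumably be annotated with
// #[serde(tag = "provider", rename_all = "lowercase")] (assumption).
#[derive(Debug, Clone, PartialEq)]
pub enum AgentRealtimeConfig {
    OpenAi(OpenAiConfig),
    Gemini(GeminiConfig),
}

impl AgentRealtimeConfig {
    /// Returns the value carried by the JSON `provider` field.
    pub fn provider(&self) -> &'static str {
        match self {
            AgentRealtimeConfig::OpenAi(_) => "openai",
            AgentRealtimeConfig::Gemini(_) => "gemini",
        }
    }
}
```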

OpenAI Provider

{
  "provider": "openai",
  "model": "gpt-realtime-1.5",
  "voice": "marin",
  "speed": 1.0,
  "noise_reduction": "near_field",
  "vad": {
    "type": "semantic_vad",
    "eagerness": "medium",
    "create_response": true,
    "interrupt_response": true
  }
}

Setting	Options	Default
`model`	`gpt-realtime-1.5`, `gpt-realtime`, `gpt-realtime-mini`	`gpt-realtime-1.5`
`voice`	`marin`, `cedar`, `alloy`, `ash`, `ballad`, `coral`, `echo`, `sage`, `shimmer`, `verse`	`marin`
`speed`	`f32`	`1.0`
`noise_reduction`	`near_field`, `far_field`, `none`	`near_field`
`vad.type`	`semantic_vad`, `server_vad`	`semantic_vad`
`vad.eagerness`	`low`, `medium`, `high`, `auto`	`medium`

When vad.type is server_vad, the vad object exposes additional fields: threshold, prefix_padding_ms, silence_duration_ms, idle_timeout_ms.
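
The defaults in the table could be captured in a `Default` implementation along the following lines. This is a sketch under assumptions: `OpenAiRealtimeSettings` and `Vad` are hypothetical names, enum-like settings are kept as plain strings for brevity, and the server VAD fields are left out because their types are not stated here.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Vad {
    Semantic { eagerness: String },
    // Server VAD fields omitted: their types are not given on this page.
    Server,
}

#[derive(Debug, Clone, PartialEq)]
pub struct OpenAiRealtimeSettings {
    pub model: String,
    pub voice: String,
    pub speed: f32,
    pub noise_reduction: String,
    pub vad: Vad,
}

impl Default for OpenAiRealtimeSettings {
    /// Mirrors the "Default" column of the settings table.
    fn default() -> Self {
        Self {
            model: "gpt-realtime-1.5".to_string(),
            voice: "marin".to_string(),
            speed: 1.0,
            noise_reduction: "near_field".to_string(),
            vad: Vad::Semantic { eagerness: "medium".to_string() },
        }
    }
}
```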

Gemini Provider

{
  "provider": "gemini",
  "model": "gemini-live-2.5-flash-preview-native-audio-09-2025",
  "voice": "Aoede",
  "start_of_speech_sensitivity": "START_SENSITIVITY_HIGH",
  "end_of_speech_sensitivity": "END_SENSITIVITY_HIGH",
  "activity_handling": "START_OF_ACTIVITY_INTERRUPTS"
}

Setting	Options	Default
`voice`	`Aoede`, `Charon`, `Fenrir`, `Kore`, `Puck`	`Aoede`
`start_of_speech_sensitivity`	`START_SENSITIVITY_HIGH`, `START_SENSITIVITY_LOW`	`START_SENSITIVITY_HIGH`
`end_of_speech_sensitivity`	`END_SENSITIVITY_HIGH`, `END_SENSITIVITY_LOW`	`END_SENSITIVITY_HIGH`
`silence_duration_ms`	`Option<u32>`	`None`
`activity_handling`	`START_OF_ACTIVITY_INTERRUPTS`, `NO_INTERRUPTION`	`START_OF_ACTIVITY_INTERRUPTS`

Tool Configuration

Each agent has a tool_config field controlling which tools are available and their per-tool settings:

pub struct AgentToolConfig {
    pub disabled_tools: Vec<String>,  // Tools explicitly disabled for this agent
    pub tool_settings: HashMap<String, serde_json::Value>,  // Per-tool config
}

An empty config (the default) enables all tools with no custom settings — preserving backward compatibility for existing agents.

Available Tools

Tool	Enabled When	Description
`end_call`	Always (cannot be disabled)	Ends the call programmatically via Twilio REST API
`transfer_call`	Not in `disabled_tools`	Transfers the active call to a specified phone number
`query_knowledge`	Agent has linked knowledge bases	Queries knowledge bases using LLM-powered search
`lookup_caller`	`organization.client_lookup_url` is configured	Looks up caller info via external API
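
The "Enabled When" column can be read as a small decision function. The sketch below is one possible reading, not the module's actual code: `ToolContext` and `is_tool_enabled` are hypothetical names, and it assumes that an entry in `disabled_tools` overrides the other conditions for every tool except `end_call`.

```rust
/// Facts the availability rules depend on (hypothetical helper type).
pub struct ToolContext<'a> {
    pub disabled_tools: &'a [String],
    pub has_knowledge_bases: bool,
    pub client_lookup_url_configured: bool,
}

/// Returns true when `tool` should be offered to the model.
pub fn is_tool_enabled(tool: &str, ctx: &ToolContext) -> bool {
    // end_call cannot be disabled, so it is checked before disabled_tools.
    if tool == "end_call" {
        return true;
    }
    // Assumption: an explicit entry in disabled_tools wins over other conditions.
    if ctx.disabled_tools.iter().any(|t| t == tool) {
        return false;
    }
    match tool {
        "transfer_call" => true,
        "query_knowledge" => ctx.has_knowledge_bases,
        "lookup_caller" => ctx.client_lookup_url_configured,
        // Unknown tool names are treated as unavailable in this sketch.
        _ => false,
    }
}
```

With an empty `disabled_tools` list, the function falls through to the per-tool conditions, which matches the backward-compatible default described above.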

Transfer Call Tool

The transfer_call tool supports optional per-agent transfer numbers. If tool_settings["transfer_call"] contains a TransferCallSettings object, the agent can only transfer to the configured numbers. If no settings are provided, the agent may transfer to any number mentioned in its prompt.

pub struct TransferCallSettings {
    pub numbers: Vec<TransferNumber>,
}

pub struct TransferNumber {
    pub number: String,  // E.164 format: "+12125551234"
    pub label: String,   // "Sales", "Support", etc.
}
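
A minimal sketch of how the allow-list rule might be enforced when the model requests a transfer. `looks_like_e164` and `may_transfer_to` are hypothetical helpers, the structs are repeated so the example compiles on its own, and the E.164 check is deliberately loose. The sketch does not model the "mentioned in its prompt" condition for agents without settings; it only shows that configuration imposes no restriction in that case.

```rust
#[derive(Debug, Clone)]
pub struct TransferNumber {
    pub number: String,
    pub label: String,
}

#[derive(Debug, Clone)]
pub struct TransferCallSettings {
    pub numbers: Vec<TransferNumber>,
}

/// Loose E.164 shape check: '+' followed by 1 to 15 digits, the first not being 0.
pub fn looks_like_e164(number: &str) -> bool {
    match number.strip_prefix('+') {
        Some(digits) => {
            !digits.is_empty()
                && digits.len() <= 15
                && digits.chars().all(|c| c.is_ascii_digit())
                && !digits.starts_with('0')
        }
        None => false,
    }
}

/// Hypothetical policy check for a requested transfer target.
pub fn may_transfer_to(target: &str, settings: Option<&TransferCallSettings>) -> bool {
    // Assumption: malformed numbers are rejected regardless of settings.
    if !looks_like_e164(target) {
        return false;
    }
    match settings {
        // Settings present: only the configured numbers are allowed.
        Some(s) => s.numbers.iter().any(|n| n.number == target),
        // No settings: configuration does not restrict the target.
        None => true,
    }
}
```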

The UI renders a ToolConfigForm component with checkboxes for each tool and inline inputs for transfer numbers when transfer_call is enabled.

Realtime Session Traits

The module defines provider-agnostic traits for realtime communication:

#[async_trait]
pub trait RealtimeSessionSender: Send + Sync {
    async fn send_audio_delta(&mut self, audio_delta: String) -> Result<...>;
    async fn send_tool_response(&mut self, call_id: String, function_name: String, output: String) -> Result<...>;
}

#[async_trait]
pub trait RealtimeSessionReceiver: Send {
    async fn receive_event(&mut self) -> Result<RealtimeInEvent, ...>;
}

Events received from any provider are normalized into RealtimeInEvent:

Variant	Meaning
`AudioDelta`	Audio chunk to relay to the caller
`SpeechStarted`	The caller started speaking and interrupted the model, which cancelled its response
`ResponseStarted`	Model started a new response
`ToolCall`	Model requesting a function call
`Unknown`	Any other event (ignored)
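
To show how a call handler might consume these normalized events, here is a small dispatch sketch. The payload fields on `RealtimeInEvent`, the `CallAction` enum, and `next_action` are all assumptions made for illustration; in particular, clearing buffered audio on `SpeechStarted` is a common pattern for telephony bridges, not something this page states.

```rust
/// Stand-in for the normalized event type; payload fields are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum RealtimeInEvent {
    AudioDelta { audio: String },
    SpeechStarted,
    ResponseStarted,
    ToolCall { call_id: String, function_name: String, arguments: String },
    Unknown,
}

/// What a call handler might do next (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub enum CallAction {
    RelayAudio(String),
    ClearBufferedAudio,
    RunTool { call_id: String, function_name: String },
    Nothing,
}

/// Maps one provider-agnostic event to the handler's next step.
pub fn next_action(event: RealtimeInEvent) -> CallAction {
    match event {
        RealtimeInEvent::AudioDelta { audio } => CallAction::RelayAudio(audio),
        // The caller interrupted: drop audio that has not been played yet.
        RealtimeInEvent::SpeechStarted => CallAction::ClearBufferedAudio,
        RealtimeInEvent::ToolCall { call_id, function_name, .. } => {
            CallAction::RunTool { call_id, function_name }
        }
        // Nothing to relay for these events in this sketch.
        RealtimeInEvent::ResponseStarted | RealtimeInEvent::Unknown => CallAction::Nothing,
    }
}
```

Because the match is exhaustive, adding a variant to the event enum forces every handler to decide how to treat it.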

AI Instruction Builder

The instruction builder uses GPT-4.5 to generate and edit agent system prompts from natural language descriptions.

Generate Instructions

POST /api/instructions/generate

Creates structured voice agent instructions from a user description. Automatically includes organization context when available.

Request:

pub struct GenerateInstructionsRequest {
    pub user_description: String,
}

Response:

pub struct AiInstructionResponse {
    pub instructions: String,
}

The system appends organization profile data (if configured) to the user description before sending to GPT-4.5.
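
The append step might look roughly like the function below. `build_generation_input` is a hypothetical name, and the separator and the "Organization Context" heading are assumptions; the actual format produced by the organization context service may differ.

```rust
/// Appends organization context to the user's description when it is present.
/// The separator and heading text are assumptions made for this sketch.
pub fn build_generation_input(user_description: &str, org_context: Option<&str>) -> String {
    let description = user_description.trim();
    // Treat a missing or blank context the same way: send the description alone.
    match org_context.map(str::trim).filter(|c| !c.is_empty()) {
        Some(context) => format!(
            "{}\n\n## Organization Context\n{}",
            description, context
        ),
        None => description.to_string(),
    }
}
```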

Edit Instructions

POST /api/instructions/edit

Refines existing instructions using a predefined action (improve, make concise, add examples, etc.).

Request:

pub struct EditInstructionsRequest {
    pub action: InstructionAction,
    pub current_instructions: String,
}

pub enum InstructionAction {
    Improve,
    MakeConcise,
    ImproveTone,
    AddExamples,
    Fix,
    StrengthenSafety,
    AddPersonality,
    EnhanceWithOrgData,  // Injects org profile context
}

The EnhanceWithOrgData action fetches the organization profile and formats it as markdown context for the LLM.
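
Since only one action needs the organization profile, a handler could decide whether to fetch it with a helper like the one sketched here. The `ALL` constant and `requires_org_context` are hypothetical additions for illustration and are not part of the documented enum.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InstructionAction {
    Improve,
    MakeConcise,
    ImproveTone,
    AddExamples,
    Fix,
    StrengthenSafety,
    AddPersonality,
    EnhanceWithOrgData,
}

impl InstructionAction {
    /// Every action, in declaration order (hypothetical convenience constant).
    pub const ALL: [InstructionAction; 8] = [
        InstructionAction::Improve,
        InstructionAction::MakeConcise,
        InstructionAction::ImproveTone,
        InstructionAction::AddExamples,
        InstructionAction::Fix,
        InstructionAction::StrengthenSafety,
        InstructionAction::AddPersonality,
        InstructionAction::EnhanceWithOrgData,
    ];

    /// True only for the action that injects organization profile context.
    pub fn requires_org_context(self) -> bool {
        matches!(self, InstructionAction::EnhanceWithOrgData)
    }
}
```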

Custom Edit Instructions

POST /api/instructions/custom-edit

Applies a free-text editing instruction (e.g., “make this sound more professional”).

Request:

pub struct CustomEditRequest {
    pub edit_instruction: String,
    pub current_instructions: String,
}

UI Components

InstructionPreview — displays instructions with three view modes:

Mode	Purpose
Preview	Rendered markdown with syntax highlighting
Edit	Raw textarea for direct editing
Diff	Tabs to compare current vs original instructions

The component supports undo by snapshotting the original content on mount.
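
The snapshot-based undo can be modelled independently of any UI framework. In the sketch below, `InstructionBuffer` and its methods are hypothetical names; it captures the content once at mount time and restores it on undo, which gives a single-level revert instead of a full edit history.

```rust
/// Minimal model of snapshot-based undo (names are hypothetical).
pub struct InstructionBuffer {
    original: String,
    current: String,
}

impl InstructionBuffer {
    /// Called once, when the component mounts: snapshot the initial content.
    pub fn mount(initial: &str) -> Self {
        Self {
            original: initial.to_string(),
            current: initial.to_string(),
        }
    }

    /// Replaces the working copy, e.g. after a manual edit or an AI edit.
    pub fn set(&mut self, text: &str) {
        self.current = text.to_string();
    }

    pub fn current(&self) -> &str {
        &self.current
    }

    /// True when the working copy differs from the mount-time snapshot.
    pub fn is_modified(&self) -> bool {
        self.current != self.original
    }

    /// Restores the content captured at mount time.
    pub fn undo(&mut self) {
        self.current = self.original.clone();
    }
}
```

Keeping both copies also gives the Diff mode the two texts it needs to compare.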

AiInstructionToolbar — renders quick-action buttons for each InstructionAction. Clicking a button calls the edit API and updates the preview.

ToolConfigForm — per-agent tool toggles (transfer_call, lookup_caller, query_knowledge). When transfer_call is enabled, displays inline inputs for configuring transfer numbers (label + E.164 phone number).

Module Structure

src/mods/agent/
├── api/
│   ├── instruction_builder/   # Generate/edit/custom-edit APIs
│   └── ...                     # CRUD endpoints
├── components/
│   ├── instruction_builder/   # InstructionPreview, AiInstructionToolbar
│   └── ...                     # List, details, realtime config form
├── services/
│   ├── build_organization_context_service.rs   # Formats org profile for LLM
│   └── ...                                     # Session, tool handling
├── prompts/                    # System prompt templates for AI editing actions
├── traits/                     # RealtimeSessionSender, RealtimeSessionReceiver
├── types/
│   ├── instruction_builder/   # Request/response types, InstructionAction enum
│   └── ...                     # Agent, realtime config, tool definitions
└── views/                      # Page views

Related Modules

Settings — organization profile powers instruction generation
Twilio — delivers incoming calls to agents
OpenAI — implements sender/receiver for OpenAI Realtime
Gemini — implements sender/receiver for Gemini Live
Knowledge — knowledge bases queried by query_knowledge tool
