
Twilio

The twilio module handles all telephony through Twilio’s APIs. It receives incoming calls via webhooks, streams audio bidirectionally over WebSocket, processes recordings, and manages phone numbers.

Incoming call → Twilio
  → POST /twilio/voice
      → TwiML: start recording (mono) + open WebSocket stream
  → WSS /twilio/stream
      → start event → resolve agent, create call record, open AI session
      → media events → forward audio to AI provider
      → AI audio delta → send back to Twilio stream
      → speech started → clear Twilio audio buffer (interruption)
  → Call ends → stop event → close sessions
  → POST /twilio/recording (async)
      → download recording → transcribe → analyze → report → todos → contacts

| Path | Method | Handler | Auth |
|------|--------|---------|------|
| /twilio/voice | POST | twilio_voice | None (Twilio webhook) |
| /twilio/stream | WSS | twilio_stream | None (Twilio WebSocket) |
| /twilio/recording | POST | twilio_recording | None (Twilio webhook) |
| /api/twilio/buy-phone-number | POST | buy_phone_number_api | Session required |

The voice webhook (POST /twilio/voice) receives the initial call from Twilio. It returns TwiML that:

  1. Starts a mono recording with a status callback to /twilio/recording
  2. Opens a WebSocket media stream to /twilio/stream with from and to as custom parameters
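The two TwiML steps above can be sketched as a small string builder. This is a minimal illustration, not the module's actual code: `build_voice_twiml` and `escape_xml` are hypothetical helpers, and the `channels` and `recordingStatusCallback` attributes on `<Recording>` are assumptions about how the mono recording and its callback are expressed. `<Connect><Stream>` is used because the stream is bidirectional.

```rust
/// Escape XML special characters so caller-supplied values
/// cannot break out of an attribute value.
fn escape_xml(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for ch in value.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&apos;"),
            other => out.push(other),
        }
    }
    out
}

/// Build the TwiML for the voice webhook (hypothetical helper).
/// `recording_callback` and `stream_url` are absolute URLs chosen by the caller.
pub fn build_voice_twiml(
    recording_callback: &str,
    stream_url: &str,
    from: &str,
    to: &str,
) -> String {
    format!(
        r#"<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Start>
    <Recording channels="mono" recordingStatusCallback="{cb}"/>
  </Start>
  <Connect>
    <Stream url="{url}">
      <Parameter name="from" value="{from}"/>
      <Parameter name="to" value="{to}"/>
    </Stream>
  </Connect>
</Response>"#,
        cb = escape_xml(recording_callback),
        url = escape_xml(stream_url),
        from = escape_xml(from),
        to = escape_xml(to),
    )
}
```

Passing `from` and `to` as `<Parameter>` elements is what makes them available to the stream handler in the start event.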

The stream handler (WSS /twilio/stream) is the core real-time handler. On WebSocket upgrade:

  1. Wait for start event — reads messages until a valid Start arrives (discards Connected and others)
  2. Resolve agent — calls handle_twilio_stream_in_event_start:
    • Looks up phone number → finds assigned agent
    • Loads agent’s realtime_config to determine provider
    • Resolves or creates a contact for the caller
    • Creates a call record in the database
  3. Create AI session — opens WebSocket to OpenAI or Gemini based on config
  4. Run two concurrent loops (tokio::select!):
    • Twilio listener — forwards audio from caller to AI via send_audio_delta
    • Realtime listener — dispatches AI events:
      • AudioDelta → send audio to Twilio (skipped if interrupted)
      • SpeechStarted → set interrupted flag, send clear to flush Twilio buffer
      • ResponseStarted → reset interrupted flag
      • ToolCall → execute tool, send result back to AI
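The interruption logic in the realtime listener amounts to a small state machine around one flag. The sketch below models only that decision; the event names follow the list above, while `StreamState`, `Action`, and the payload shapes are hypothetical. The real handler is async and runs inside `tokio::select!`.

```rust
/// Events from the AI provider. Payload shapes are assumptions for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum RealtimeEvent {
    AudioDelta(String), // base64 audio chunk from the AI
    SpeechStarted,      // the caller started talking
    ResponseStarted,    // the AI began a new response
    ToolCall { name: String, arguments: String },
}

/// What the handler should do in reaction to an event.
#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    SendMediaToTwilio(String),
    SendClearToTwilio,
    RunTool { name: String, arguments: String },
    Nothing,
}

/// Per-call state: only the interruption flag is modelled here.
#[derive(Debug, Default)]
pub struct StreamState {
    interrupted: bool,
}

impl StreamState {
    pub fn on_event(&mut self, event: RealtimeEvent) -> Action {
        match event {
            // Audio from a response the caller already talked over is dropped.
            RealtimeEvent::AudioDelta(payload) => {
                if self.interrupted {
                    Action::Nothing
                } else {
                    Action::SendMediaToTwilio(payload)
                }
            }
            // Caller speech interrupts playback: flush Twilio's buffer.
            RealtimeEvent::SpeechStarted => {
                self.interrupted = true;
                Action::SendClearToTwilio
            }
            // A fresh response re-enables audio forwarding.
            RealtimeEvent::ResponseStarted => {
                self.interrupted = false;
                Action::Nothing
            }
            RealtimeEvent::ToolCall { name, arguments } => {
                Action::RunTool { name, arguments }
            }
        }
    }
}
```

Dropping deltas while the flag is set matters because the provider may still emit audio for the interrupted response after the clear message has been sent.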

The recording webhook (POST /twilio/recording) fires asynchronously after the call ends. It spawns a background task (tokio::spawn) that:

  1. Downloads the MP3 recording using the org’s Twilio credentials
  2. Saves the audio file ({call_sid}.mp3)
  3. Transcribes via transcribe_audio (GPT-4o Transcribe)
  4. Saves the transcript ({call_sid}.txt)
  5. Updates the call record with recording_url and transcription
  6. Runs post-call pipeline (all fire-and-forget, errors logged but not propagated):
    • enrich_contact — update contact with call context
    • analyze_call — run all org analyzers
    • extract_todos — AI-powered to-do extraction
    • send_call_report — email report
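The "errors logged but not propagated" behaviour of the post-call pipeline can be illustrated with a synchronous sketch. `Step` and `run_post_call_pipeline` are hypothetical names; the real steps are async functions running inside the spawned task, and this version returns the failed step names only to make the behaviour observable.

```rust
/// One post-call step: a name for logging plus the work itself.
pub struct Step {
    pub name: &'static str,
    pub run: Box<dyn Fn() -> Result<(), String>>,
}

/// Run every step in order. A failing step is logged and skipped;
/// it never prevents the remaining steps from running.
pub fn run_post_call_pipeline(steps: &[Step]) -> Vec<&'static str> {
    let mut failed = Vec::new();
    for step in steps {
        if let Err(err) = (step.run)() {
            // Log and continue instead of returning the error.
            eprintln!("post-call step {} failed: {}", step.name, err);
            failed.push(step.name);
        }
    }
    failed
}
```

With this shape, a failure in one step (for example an analyzer timing out) leaves the to-do extraction and email report unaffected.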
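
Inbound stream events (Twilio → server):

| Event | Type | Key Fields |
|-------|------|------------|
| connected | TwilioStreamInEventConnected | protocol, version |
| start | TwilioStreamInEventStart | stream_sid, call_sid, from, to, media_format |
| media | TwilioStreamInEventMedia | payload (base64 µ-law), sequence_number |
| stop | TwilioStreamInEventStop | call_sid, account_sid |

Outbound stream events (server → Twilio):

| Event | Type | Purpose |
|-------|------|---------|
| media | TwilioStreamOutEventMedia | Audio chunk for playback |
| clear | TwilioStreamOutEventClear | Flush buffered audio (interruption) |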
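For orientation, the two outbound messages have the following JSON shape on the wire. The sketch uses a single hypothetical enum and hand-written formatting to stay dependency-free; the module itself defines separate types (listed above) and most likely serializes them with a JSON library. It assumes the stream SID and base64 payload contain no characters that need JSON escaping.

```rust
/// Outbound messages sent to Twilio over the media stream (sketch).
pub enum TwilioStreamOutEvent {
    /// Audio chunk for playback; `payload` is base64-encoded µ-law.
    Media { stream_sid: String, payload: String },
    /// Ask Twilio to drop any audio it has buffered but not yet played.
    Clear { stream_sid: String },
}

impl TwilioStreamOutEvent {
    /// Render the message as the JSON text frame Twilio expects.
    pub fn to_json(&self) -> String {
        match self {
            Self::Media { stream_sid, payload } => format!(
                r#"{{"event":"media","streamSid":"{}","media":{{"payload":"{}"}}}}"#,
                stream_sid, payload
            ),
            Self::Clear { stream_sid } => {
                format!(r#"{{"event":"clear","streamSid":"{}"}}"#, stream_sid)
            }
        }
    }
}
```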

Audio format: base64-encoded µ-law, 8 kHz, mono.
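To make the format concrete, here is a sketch of decoding µ-law bytes to 16-bit linear PCM using the standard G.711 expansion. The module forwards audio to the AI provider and may never need to decode it, so this is purely illustrative. The function names are hypothetical, and the base64 decoding step is left out because it would require an external crate.

```rust
/// Expand one G.711 µ-law byte to a 16-bit linear PCM sample.
pub fn mulaw_to_pcm16(byte: u8) -> i16 {
    // µ-law bytes are stored bit-inverted.
    let u = !byte;
    let sign = u & 0x80;
    let exponent = (u >> 4) & 0x07;
    let mantissa = (u & 0x0F) as i32;
    // 0x84 is the µ-law bias (132) added during encoding.
    let magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
    if sign != 0 {
        (-magnitude) as i16
    } else {
        magnitude as i16
    }
}

/// Decode a buffer of µ-law bytes (already base64-decoded) to PCM samples.
pub fn decode_mulaw(bytes: &[u8]) -> Vec<i16> {
    bytes.iter().map(|&b| mulaw_to_pcm16(b)).collect()
}

/// Duration of a chunk: one byte per sample at 8000 samples per second.
pub fn chunk_duration_ms(byte_len: usize) -> u64 {
    byte_len as u64 * 1000 / 8000
}
```

At one byte per sample and 8000 samples per second, a 160-byte chunk carries 20 ms of audio.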

| Utility | What It Does |
|---------|--------------|
| buy_twilio_phone_number | POST to Twilio API to purchase a number by area code |
| create_twilio_subaccount | Create a per-org Twilio subaccount |
| set_phone_number_twiml | Set the voice webhook URL on a phone number |

The buy flow also assigns default analyzers to the new number and configures the webhook URL.
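As a rough picture of the purchase request, the sketch below builds the URL and form body for Twilio's IncomingPhoneNumbers resource. It is an assumption that the module targets this endpoint with these parameters; `BuyNumberRequest`, `build_buy_number_request`, and `form_encode` are hypothetical names, and the HTTP call itself (with Basic auth using the org's SID and token) is omitted. Whether the webhook URL is set in the same request or in a follow-up call via `set_phone_number_twiml` is not stated in the source.

```rust
pub const TWILIO_API_BASE: &str = "https://api.twilio.com/2010-04-01";

/// Percent-encode a form value, leaving only unreserved characters as-is.
fn form_encode(value: &str) -> String {
    let mut out = String::new();
    for byte in value.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// The pieces of the HTTP request, ready to hand to an HTTP client.
pub struct BuyNumberRequest {
    pub url: String,
    pub form_body: String,
}

/// Build a purchase request for any available number in `area_code`,
/// pointing its voice webhook at `voice_webhook_url` (assumed parameters).
pub fn build_buy_number_request(
    account_sid: &str,
    area_code: &str,
    voice_webhook_url: &str,
) -> BuyNumberRequest {
    BuyNumberRequest {
        url: format!(
            "{}/Accounts/{}/IncomingPhoneNumbers.json",
            TWILIO_API_BASE, account_sid
        ),
        form_body: format!(
            "AreaCode={}&VoiceUrl={}&VoiceMethod=POST",
            form_encode(area_code),
            form_encode(voice_webhook_url)
        ),
    }
}
```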

pub struct TwilioCoreConf {
    pub twilio_main_sid: String,
    pub twilio_main_token: String,
}

| Field | Purpose |
|-------|---------|
| twilio_account_sid | Org’s Twilio subaccount SID |
| twilio_auth_token | Org’s Twilio auth token |

| Constant | Value |
|----------|-------|
| TWILIO_API_BASE | https://api.twilio.com/2010-04-01 |
| TWILIO_VOICE_PATH | /twilio/voice |
| TWILIO_STREAM_PATH | /twilio/stream |
| TWILIO_RECORDING_PATH | /twilio/recording |

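In Rust these constants would plausibly be declared as string slices. The sketch below also shows one way the paths could be combined with a public base URL to produce the webhook and stream addresses handed to Twilio. The helpers `webhook_url` and `stream_url` are hypothetical, as is the idea that the base URL comes from configuration.

```rust
pub const TWILIO_API_BASE: &str = "https://api.twilio.com/2010-04-01";
pub const TWILIO_VOICE_PATH: &str = "/twilio/voice";
pub const TWILIO_STREAM_PATH: &str = "/twilio/stream";
pub const TWILIO_RECORDING_PATH: &str = "/twilio/recording";

/// Join a public base URL with one of the webhook paths,
/// tolerating a trailing slash on the base.
pub fn webhook_url(public_base: &str, path: &str) -> String {
    format!("{}{}", public_base.trim_end_matches('/'), path)
}

/// Derive the WebSocket URL for the media stream by swapping the scheme.
pub fn stream_url(public_base: &str) -> String {
    let http_url = webhook_url(public_base, TWILIO_STREAM_PATH);
    if let Some(rest) = http_url.strip_prefix("https://") {
        format!("wss://{}", rest)
    } else if let Some(rest) = http_url.strip_prefix("http://") {
        format!("ws://{}", rest)
    } else {
        // Unknown scheme: return unchanged rather than guess.
        http_url
    }
}
```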
src/mods/twilio/
├── api/ # Voice webhook, recording webhook, buy number
├── conf/ # TwilioCoreConf (from core_conf table)
├── constants/ # API base URL, webhook paths
├── routes/ # WebSocket stream handler (the core loop)
├── services/ # Start event handling, media forwarding
├── types/ # Stream events (in/out), call data, phone numbers, recording
└── utils/ # Buy number, create subaccount, process recording, set TwiML
  • Agent — resolved from phone number, provides realtime config and tools
  • OpenAI — receives audio stream for AI processing
  • Call — call records created during stream start
  • Contact — caller resolution and enrichment
  • Analyzer — post-call analysis triggered after recording