Core Concepts

Call Initiation Responsibility

The SDK supports two approaches for call initiation, differing in who is responsible for creating the call:

Client-Side Initiation

  • Responsibility: The SDK directly creates calls with Hidoba Research
  • Process: SDK uses your API key to create and manage the entire call lifecycle
  • Use Case: Development, prototyping, simple applications

Server-Side Initiation

  • Responsibility: Your backend creates calls, SDK connects using provided credentials
  • Process: Your server creates the call and returns a signed URL for the SDK to use
  • Use Case: Production applications, enhanced security, user management
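
The difference between the two approaches can be sketched as a small type model. This is only an illustration: the `CallInitiation` shape, its field names, and `summarizeInitiation` are hypothetical and are not taken from the SDK's actual options. It also assumes that client-side initiation means the API key is present in browser code.

```typescript
// Hypothetical configuration shapes; the real SDK option names may differ.
type CallInitiation =
  | { kind: "client-side"; apiKey: string }
  | { kind: "server-side"; signedUrl: string };

interface InitiationSummary {
  callCreatedBy: "sdk" | "backend";
  // Assumption: with client-side initiation the key ships with the front end.
  apiKeyInBrowser: boolean;
}

// Summarizes who creates the call and where the API key lives.
function summarizeInitiation(init: CallInitiation): InitiationSummary {
  switch (init.kind) {
    case "client-side":
      return { callCreatedBy: "sdk", apiKeyInBrowser: true };
    case "server-side":
      return { callCreatedBy: "backend", apiKeyInBrowser: false };
  }
}
```

With server-side initiation the browser only ever sees the signed URL, which is likely why that mode is recommended for production.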

Call Flow

Understanding the typical flow of an AI voice call helps you implement proper event handling:

  1. Call Initiation: User clicks call button → onCallStart triggered
  2. Call Creation: Either the SDK creates the call directly (client-side) or your backend creates it (server-side)
  3. Permission Request: Browser requests microphone access automatically
  4. Permission Granted: onCallStarting callback triggered
  5. WebSocket Connection: SDK establishes audio streaming connection
  6. Connected: onConnected callback triggered → conversation can begin
  7. Active Conversation: Real-time audio streaming and optional transcript display
  8. Call End: User hangs up → onHangUp callback triggered
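
One way to reason about the flow above is as a small state machine driven by the four callbacks. The callback names come from the steps above. The phase names (`idle`, `creating`, `connecting`, `connected`, `ended`) and the `nextPhase` helper are assumptions made for this sketch and are not part of the SDK.

```typescript
// Phase names are illustrative; the SDK may track state differently.
type CallPhase = "idle" | "creating" | "connecting" | "connected" | "ended";
type CallbackName = "onCallStart" | "onCallStarting" | "onConnected" | "onHangUp";

// Allowed transitions: for each phase, which callback moves it to which phase.
const transitions: Record<CallPhase, Partial<Record<CallbackName, CallPhase>>> = {
  idle: { onCallStart: "creating" },
  creating: { onCallStarting: "connecting", onHangUp: "ended" },
  connecting: { onConnected: "connected", onHangUp: "ended" },
  connected: { onHangUp: "ended" },
  ended: {},
};

// Returns the next phase, or the current one if the callback is unexpected here.
function nextPhase(phase: CallPhase, callback: CallbackName): CallPhase {
  return transitions[phase][callback] ?? phase;
}

// Replays a sequence of callbacks starting from "idle".
function replay(callbacks: CallbackName[]): CallPhase {
  let phase: CallPhase = "idle";
  for (const callback of callbacks) {
    phase = nextPhase(phase, callback);
  }
  return phase;
}
```

Calling `nextPhase` from inside each callback gives the UI a single phase value to render, and out-of-order callbacks are ignored rather than corrupting the state.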

Callback Architecture

The SDK is event-driven: it reports state changes through callbacks, which serve four purposes:

  • Status Callbacks: Keep your UI updated with current call state
  • Error Handling: Gracefully handle connection issues and permissions
  • RAG Integration: Display external document sources when AI references them
  • User Feedback: Show connection progress and call quality information
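
As a rough sketch of the status and error handling roles, the callbacks can be wired to a plain view model that the UI renders. `onCallStart`, `onCallStarting`, `onConnected`, and `onHangUp` are the callbacks named in the call flow. `CallViewModel`, `createCallbacks`, and especially `onError` are hypothetical names introduced here, since this page does not name an error callback.

```typescript
// Minimal UI state that a component or template could render.
interface CallViewModel {
  status: string;
  canHangUp: boolean;
  lastError: string | null;
}

// Builds a set of callbacks that keep the view model in sync with the call.
function createCallbacks(view: CallViewModel) {
  return {
    onCallStart: (): void => {
      view.status = "Starting call";
      view.lastError = null;
    },
    onCallStarting: (): void => {
      view.status = "Microphone ready, connecting";
      view.canHangUp = true;
    },
    onConnected: (): void => {
      view.status = "Connected";
    },
    onHangUp: (): void => {
      view.status = "Call ended";
      view.canHangUp = false;
    },
    // Hypothetical: the actual error callback name and payload may differ.
    onError: (message: string): void => {
      view.status = "Call failed";
      view.lastError = message;
      view.canHangUp = false;
    },
  };
}
```

Keeping the callbacks this thin means all rendering decisions depend on one object, which is straightforward to test without a browser.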

Permission Management

Microphone access is handled automatically by the SDK:

  • Automatic Request: Permission requested immediately after backend call creation
  • Early Optimization: Stream acquired during backend polling to reduce perceived latency
  • No Manual Setup: No need to request permissions separately; the SDK handles it
  • HTTPS Required: Browsers only allow microphone access in a secure context (HTTPS, or localhost during development)
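
Although the SDK requests permission itself, an application may still want to explain failures to the user. The sketch below is not SDK code. It relies only on standard browser behavior (`window.isSecureContext`, `navigator.mediaDevices.getUserMedia`, and the `DOMException` names that `getUserMedia` rejects with). The `MicEnvironment`, `microphoneBlocker`, and `describeMicError` names are made up for illustration.

```typescript
// Facts a real app would read from the browser:
// window.isSecureContext, and whether navigator.mediaDevices.getUserMedia exists.
interface MicEnvironment {
  isSecureContext: boolean;
  hasGetUserMedia: boolean;
}

// Returns null when a microphone request can be attempted, otherwise a reason.
function microphoneBlocker(env: MicEnvironment): string | null {
  if (!env.isSecureContext) {
    return "Microphone access needs HTTPS (or localhost during development).";
  }
  if (!env.hasGetUserMedia) {
    return "This browser does not expose getUserMedia.";
  }
  return null;
}

// Maps standard getUserMedia DOMException names to user-facing text.
function describeMicError(errorName: string): string {
  switch (errorName) {
    case "NotAllowedError":
      return "Microphone permission was denied.";
    case "NotFoundError":
      return "No microphone was found.";
    case "NotReadableError":
      return "The microphone is in use or unavailable.";
    default:
      return "Could not access the microphone.";
  }
}
```

In a browser, `microphoneBlocker` could be called before showing the call button, and `describeMicError` could receive the `name` of whatever error the permission step reports.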

Audio Processing

The SDK handles audio processing internally:

  • Real-time Processing: Uses AudioWorklet for low-latency audio handling
  • Automatic Optimization: Built-in echo cancellation and noise reduction
  • Device Flexibility: Switch microphones and speakers during active calls
  • Quality Control: Automatic audio format optimization for best performance
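
For orientation, echo cancellation, noise suppression, and microphone selection are typically expressed in browsers as standard `getUserMedia` audio constraints. The sketch below shows that standard shape only. It is an assumption that the SDK does something similar internally, and `MicTrackConstraints` and `buildMicConstraints` are illustrative names, not SDK APIs.

```typescript
// Local copy of the relevant subset of the standard MediaTrackConstraints
// shape, declared here so the sketch compiles without DOM typings.
interface MicTrackConstraints {
  echoCancellation: boolean;
  noiseSuppression: boolean;
  deviceId?: { exact: string };
}

// Builds getUserMedia-style constraints, optionally pinned to one microphone.
function buildMicConstraints(deviceId?: string): { audio: MicTrackConstraints; video: false } {
  const audio: MicTrackConstraints = {
    echoCancellation: true,
    noiseSuppression: true,
  };
  if (deviceId !== undefined) {
    // "exact" makes the request fail rather than silently fall back to another device.
    audio.deviceId = { exact: deviceId };
  }
  return { audio, video: false };
}
```

In a browser, the result could be passed to `navigator.mediaDevices.getUserMedia(...)`, with device IDs obtained from `navigator.mediaDevices.enumerateDevices()`. Switching microphones during a call would then amount to requesting a new stream with a different `deviceId`.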