The OpenAI Realtime WebRTC transport implementation enables real-time audio communication with the OpenAI Realtime service, using a direct WebRTC connection.
This transport connects directly to OpenAI’s API from the client, so your API key is embedded in the app and exposed to anyone who inspects it. It is designed primarily for development and testing. For production applications, proxy through a server component to keep credentials secure.
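As a rough illustration of the server-side approach, the sketch below asks your own backend for a credential at runtime instead of compiling a key into the app. The endpoint URL, the function name `fetchSessionCredential`, and the plain-text response format are all hypothetical, and whether this transport accepts a short-lived credential in place of a long-lived API key is an assumption to verify before relying on it.

```kotlin
import java.net.HttpURLConnection
import java.net.URL

// Hypothetical helper: POSTs to your own backend and returns the response body
// as the credential. The backend, not the app, holds the real OpenAI API key.
// On Android, call this off the main thread.
fun fetchSessionCredential(endpoint: String, timeoutMs: Int = 5_000): String {
    val conn = URL(endpoint).openConnection() as HttpURLConnection
    try {
        conn.requestMethod = "POST"
        conn.connectTimeout = timeoutMs
        conn.readTimeout = timeoutMs
        val code = conn.responseCode
        check(code in 200..299) { "Credential request failed with HTTP $code" }
        // Assumes the backend replies with the credential as plain text.
        return conn.inputStream.bufferedReader().use { it.readText().trim() }
    } finally {
        conn.disconnect()
    }
}
```

The returned string would then be passed wherever the hard-coded key appears, so no secret has to be stored in the APK.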

Installation

Add the transport dependency to your app/build.gradle.kts:
dependencies {
    implementation("ai.pipecat:openai-realtime-webrtc-transport:1.2.0")
}
Add the microphone permission to AndroidManifest.xml:
<uses-permission android:name="android.permission.RECORD_AUDIO" />
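Declaring the permission in the manifest is not enough on Android 6.0 and later, where `RECORD_AUDIO` must also be granted at runtime. A minimal sketch of that check using the AndroidX Activity Result API follows; the activity name and the `startSession` and `showMicRationale` helpers are hypothetical placeholders.

```kotlin
import android.Manifest
import android.content.pm.PackageManager
import androidx.activity.ComponentActivity
import androidx.activity.result.contract.ActivityResultContracts
import androidx.core.content.ContextCompat

class VoiceActivity : ComponentActivity() {

    // Registered as a property initializer so it exists before the activity starts.
    private val requestMicPermission =
        registerForActivityResult(ActivityResultContracts.RequestPermission()) { granted ->
            if (granted) startSession() else showMicRationale()
        }

    // Starts the session immediately if the permission is already granted,
    // otherwise shows the system permission dialog first.
    private fun ensureMicPermission() {
        val state = ContextCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO)
        if (state == PackageManager.PERMISSION_GRANTED) {
            startSession()
        } else {
            requestMicPermission.launch(Manifest.permission.RECORD_AUDIO)
        }
    }

    // Hypothetical: create the client and call connect() here.
    private fun startSession() {}

    // Hypothetical: explain why the microphone is needed.
    private fun showMicRationale() {}
}
```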

Usage

import android.util.Log
import ai.pipecat.client.PipecatClientOptions
import ai.pipecat.client.PipecatEventCallbacks
import ai.pipecat.client.openai_realtime_webrtc.OpenAIRealtimeSessionConfig
import ai.pipecat.client.openai_realtime_webrtc.OpenAIServiceOptions
import ai.pipecat.client.openai_realtime_webrtc.PipecatClientOpenAIRealtimeWebRTC
import ai.pipecat.client.openai_realtime_webrtc.OpenAIRealtimeWebRTCTransport
import ai.pipecat.client.types.BotReadyData
import ai.pipecat.client.types.Value

// TAG is assumed to be a String constant defined elsewhere in your class.
val callbacks = object : PipecatEventCallbacks() {
    override fun onBackendError(message: String) {
        Log.e(TAG, "Backend error: $message")
    }

    override fun onBotReady(data: BotReadyData) {
        Log.d(TAG, "Bot is ready")
    }
}

val options = PipecatClientOptions(callbacks = callbacks, enableMic = true)
val client = PipecatClientOpenAIRealtimeWebRTC(OpenAIRealtimeWebRTCTransport(context), options)

client.connect(
    OpenAIServiceOptions(
        apiKey = "your-openai-api-key",
        model = "gpt-4o-realtime-preview",
        sessionConfig = OpenAIRealtimeSessionConfig(
            voice = "alloy",
            instructions = "You are a helpful assistant.",
            turnDetection = Value.Object("type" to Value.Str("semantic_vad")),
            inputAudioTranscription = Value.Object("model" to Value.Str("gpt-4o-transcribe"))
        )
    )
).withCallback { result ->
    result.errorOrNull?.let { Log.e(TAG, "Connection failed: $it") }
}

Configuration

OpenAIServiceOptions

| Parameter | Type | Description |
| --- | --- | --- |
| apiKey | String | Your OpenAI API key |
| sessionConfig | OpenAIRealtimeSessionConfig | Session configuration |
| model | String? | Model name (default: "gpt-realtime") |
| initialMessages | List<LLMContextMessage> | Messages to inject at session start |

OpenAIRealtimeSessionConfig

| Parameter | Type | Description |
| --- | --- | --- |
| modalities | List<String>? | Output modalities (e.g. ["audio", "text"]) |
| instructions | String? | System instructions for the model |
| voice | String? | Voice name (e.g. "alloy", "ballad") |
| turnDetection | Value? | Turn detection config |
| inputAudioNoiseReduction | Value? | Noise reduction config |
| inputAudioTranscription | Value? | Transcription model config |
| tools | Value? | Tool/function definitions |
| toolChoice | String? | Tool choice strategy |
| temperature | Float? | Sampling temperature |
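A fuller session configuration might look like the sketch below. It assumes the constructor accepts named arguments matching the parameter names listed above, as the Usage example does for `voice`, `instructions`, `turnDetection`, and `inputAudioTranscription`. The `"server_vad"`, `"near_field"`, and `"auto"` values come from the OpenAI Realtime API rather than from this page, so treat them as illustrative.

```kotlin
import ai.pipecat.client.openai_realtime_webrtc.OpenAIRealtimeSessionConfig
import ai.pipecat.client.types.Value

val sessionConfig = OpenAIRealtimeSessionConfig(
    // Ask for both spoken and text output.
    modalities = listOf("audio", "text"),
    instructions = "You are a concise assistant. Keep answers under three sentences.",
    voice = "ballad",
    // Assumed value: server-side voice activity detection.
    turnDetection = Value.Object("type" to Value.Str("server_vad")),
    // Assumed value: noise reduction tuned for a close microphone.
    inputAudioNoiseReduction = Value.Object("type" to Value.Str("near_field")),
    inputAudioTranscription = Value.Object("model" to Value.Str("gpt-4o-transcribe")),
    // Assumed value: let the model decide when to call tools.
    toolChoice = "auto",
    temperature = 0.8f
)
```

Parameters that are left out are nullable and can simply be omitted.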

Audio devices

The transport exposes static constants for audio routing:
// Route audio to speakerphone (default) or earpiece
client.updateMic(OpenAIRealtimeWebRTCTransport.AudioDevices.Speakerphone.id)
client.updateMic(OpenAIRealtimeWebRTCTransport.AudioDevices.Earpiece.id)

Resources

Demo: Simple Chatbot Demo
Source: Client Transports
Pipecat Android Client Reference: Complete API documentation for the Pipecat Android client.