The Realtime API allows developers to build low-latency, Multimodal experiences, including speech-to-speech functionality. Text and Audio processed by the Realtime API are priced separately. This model supports a maximum context length of 128,000 tokens.