
How to Get Started with Corti’s API: From Authentication to Real-Time Clinical Insights


Whether you’re prototyping a digital scribe or enhancing your EHR platform with real-time insights, Corti’s API is designed to be easy to integrate, fast to deploy, and optimized for the complex language and workflows of healthcare.

This guide walks you through the full flow - from secure authentication to real-time conversational transcription and clinical fact extraction - using modular Python examples. If you’re building ambient documentation tools or speech-enabled assistants, this is your starting point.

Before You Begin

To access and authenticate with the Corti API, you’ll need:

  • A client ID and secret

  • Your tenant name and environment

You can request these credentials via the Corti Developer Portal, or reach out to your Corti representative if you have an existing integration.

Step 1: Authenticate Securely with OAuth2

Corti’s APIs are protected with OAuth2 (Client Credentials Grant), ensuring every integration is secure and tenant-aware.

Use your client_id and client_secret along with your tenant and environment identifiers to request an access token.

Security best practice: Store credentials in a .env file and load them using python-dotenv.

python
from dotenv import load_dotenv

# Load credentials from the .env file into environment variables
load_dotenv()
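
The examples below reference CLIENT_ID, CLIENT_SECRET, TENANT, and ENVIRONMENT. Here’s a minimal sketch of reading them from the environment; the CORTI_* key names are this guide’s own convention, not fixed by the API:

python
import os

# Read the credentials that load_dotenv() loaded from .env
CLIENT_ID = os.getenv("CORTI_CLIENT_ID")
CLIENT_SECRET = os.getenv("CORTI_CLIENT_SECRET")
TENANT = os.getenv("CORTI_TENANT")
ENVIRONMENT = os.getenv("CORTI_ENVIRONMENT")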

Here’s how to authenticate:

python
import requests

def get_access_token():
    # Token endpoint, scoped to your tenant and environment
    url = f"https://auth.{ENVIRONMENT}.corti.ai/realms/{TENANT}/protocol/openid-connect/token"
    payload = {
        'grant_type': 'client_credentials',
        'client_id': CLIENT_ID,
        'client_secret': CLIENT_SECRET,
        'scope': 'openid'
    }
    response = requests.post(url, data=payload)
    return response.json().get("access_token")

This access token is passed as a Bearer token in all subsequent requests.
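
For example, the REST calls below reuse a header set like this (a sketch; the X-Tenant header appears in the Step 2 example):

python
# Common headers for subsequent REST requests
headers = {
    "Authorization": f"Bearer {access_token}",
    "X-Tenant": TENANT,
}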

Detailed authentication documentation is available here.

Step 2: Create an Interaction

An interaction represents a single clinical session - a consultation, a triage call, or any patient encounter you want to power with Corti AI.

Here’s how to programmatically create an interaction:

python
import uuid
import requests
from datetime import datetime

def create_interaction(access_token):
    headers = {
        'Authorization': f'Bearer {access_token}',
        'X-Tenant': TENANT,
        'Content-Type': 'application/json'
    }
    payload = {
        'assignedUserId': str(uuid.uuid4()),
        'encounter': {
            'identifier': str(uuid.uuid4()),
            'type': 'first-consultation',
            'status': 'planned',
            'period': {
                'start': datetime.utcnow().isoformat() + 'Z',
                'end': datetime.utcnow().isoformat() + 'Z'
            }
        }
    }
    # INTERACTIONS_URL is the interactions endpoint for your environment
    response = requests.post(INTERACTIONS_URL, headers=headers, json=payload)
    return response.json()

The response includes:

  • interaction_id to track the session

  • websocket_url for streaming audio
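
For example, you might pull both out of the response like this (the exact key names are an assumption; inspect the response you actually receive, which may use camelCase):

python
interaction = create_interaction(access_token)

# Key names are assumptions - check the actual response shape
interaction_id = interaction.get("interactionId") or interaction.get("id")
websocket_url = interaction.get("websocketUrl")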

Detailed documentation about interactions is available here.

Step 3: Connect via WebSocket and Set the Configuration

With the websocket_url in hand, establish a connection and send a configuration message.

This config tells Corti’s engine how to process the audio. In this case:

  • One speaker, single channel

  • English language

  • Fact extraction mode enabled

Here’s an example config:

python
config = {
    "transcriptionOptions": {
        "language": "en",
        "isMultiChannel": False,
        "enableSpeakerDiarization": False
    },
    "processing": {
        "mode": "facts",
        "output": {
            "locales": ["en"]
        }
    }
}

And how to send it on connect:

python
import json

def on_open(ws):
    # Send the session configuration as soon as the socket opens
    ws.send(json.dumps({
        "messageType": "config",
        "config": config
    }))

Once the server responds with configAccepted, you’re ready to stream audio.
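
If you want to gate streaming on that acknowledgement, one simple pattern is to set a flag from your message handler; the exact shape of the acknowledgement message is an assumption in this sketch:

python
import json
import threading

# Set once the server acknowledges the configuration
config_accepted = threading.Event()

def handle_ack(ws, message):
    # Hypothetical handler: flip the flag when the ack arrives
    data = json.loads(message)
    if data.get("messageType") == "configAccepted":
        config_accepted.set()

# Before streaming, block briefly until the ack (or time out):
# config_accepted.wait(timeout=10)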

Corti’s streaming engine is built for the intricacies of clinical dialogue - handling interruptions, non-linear language, and context-rich phrasing with higher precision than general-purpose transcription tools. Detailed documentation on establishing a connection is available here and on streaming configuration here.

Step 4: Stream Audio in Real Time

You can stream audio from a file, a microphone, or a call feed. In this example, we simulate real-time streaming from a file using 32KB chunks.

python
import json
import time
import websocket

def stream_audio(ws, audio_path):
    with open(audio_path, 'rb') as audio_file:
        # Send the recording as 32 KB binary frames
        while chunk := audio_file.read(32000):
            ws.send(chunk, opcode=websocket.ABNF.OPCODE_BINARY)
            time.sleep(0.5)  # Simulate natural pacing
    # Signal that no more audio is coming
    ws.send(json.dumps({"messageType": "end"}))

This approach mimics live playback while preventing the stream from overwhelming the server.
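
If you want the pacing to track the clock exactly, you can derive the sleep interval from the audio format instead of hard-coding 0.5 seconds. A sketch, assuming 16 kHz, 16-bit mono PCM (your format may differ):

python
# Each 32,000-byte chunk of 16 kHz, 16-bit mono PCM is exactly one
# second of audio: 32000 / (16000 samples/s * 2 bytes/sample) = 1.0 s
SAMPLE_RATE = 16000       # samples per second (assumption)
BYTES_PER_SAMPLE = 2      # 16-bit audio
CHUNK_BYTES = 32000

chunk_seconds = CHUNK_BYTES / (SAMPLE_RATE * BYTES_PER_SAMPLE)
# Pass chunk_seconds to time.sleep() in the streaming loop for real-time pacing.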

Details on how to send audio data are available here.

Step 5: Receive Real-Time Transcripts and Clinical Facts

As the session progresses, you’ll receive two core message types over the WebSocket:

  • Transcript messages: real-time transcription of spoken content

  • Fact messages: structured clinical insights, such as symptoms, medications, or diagnoses

Here’s a handler for both:

python
def on_message(ws, message):
    data = json.loads(message)
    if data["messageType"] == "transcript":
        print("Transcript:", data["transcript"])
    elif data["messageType"] == "fact":
        print("Fact:", data["fact"])

When the stream ends, you’ll receive a sessionEnded message, and all transcription and fact extraction will be complete.
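
You can extend the handler above to react to it; here’s a sketch that closes the socket on sessionEnded (assuming the same messageType convention as the other messages):

python
def on_message(ws, message):
    data = json.loads(message)
    if data["messageType"] == "transcript":
        print("Transcript:", data["transcript"])
    elif data["messageType"] == "fact":
        print("Fact:", data["fact"])
    elif data["messageType"] == "sessionEnded":
        # Everything has been delivered; safe to close the connection
        ws.close()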

Details on the response messages are available here.

Bringing It All Together

This end-to-end pipeline gives you fine-grained control over how clinical conversations are streamed, transcribed, and analyzed.

To recap:

  1. Authenticate securely using OAuth2

  2. Create an interaction session with contextual metadata

  3. Connect to the WebSocket and set the session configuration

  4. Stream audio in real time

  5. Receive live transcripts and structured clinical facts
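
As a closing sketch, here’s one way those pieces can be wired together using the websocket-client library. The audio file name and the websocketUrl response key are assumptions, and a production version would also wait for configAccepted before streaming:

python
import json
import threading
import websocket  # pip install websocket-client

AUDIO_PATH = "consultation.wav"  # hypothetical sample recording

def run_session():
    token = get_access_token()                   # Step 1
    interaction = create_interaction(token)      # Step 2
    ws_url = interaction["websocketUrl"]         # key name is an assumption

    def handle_open(ws):
        # Step 3: send the configuration, then stream audio on a
        # background thread so on_message keeps receiving replies (Step 4)
        ws.send(json.dumps({"messageType": "config", "config": config}))
        threading.Thread(target=stream_audio, args=(ws, AUDIO_PATH), daemon=True).start()

    ws = websocket.WebSocketApp(
        ws_url,
        header={"Authorization": f"Bearer {token}", "X-Tenant": TENANT},
        on_open=handle_open,
        on_message=on_message,                   # Step 5
    )
    ws.run_forever()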

Corti’s API is designed to be:

  • Developer-friendly and easy to integrate

  • Purpose-built for healthcare use cases

  • Capable of extracting more accurate, concise, and relevant clinical signals from live conversations

Where to Go From Here

If you’re ready to start building, the detailed documentation linked throughout this guide is the next stop. Whether you’re enabling ambient scribing, building tools for live clinical intake, or supporting clinical decision-making, Corti gives you the infrastructure to turn conversations into clinical intelligence.
