The Ultimate Guide to Corti's API

What You’ll Learn

  • How Corti’s API Works: Real-time and post-call AI-driven transcription and analysis.

  • Key API Endpoints: Overview of interactions, recordings, transcripts, documents, and more.

  • Supported Languages & ASR Tiers: Understanding Corti’s AI models and language support.

  • Why Choose Corti API?: Seamless workflow integration, real-time processing, and healthcare compliance.

📢 Get started fast: Visit Corti API Documentation

Introduction

Corti's API is the foundation for integrating advanced AI-powered speech and text processing into your applications. This guide provides a comprehensive overview of how the API works, the key components, and how you can leverage its capabilities for real-time and post-call transcription, analysis, and workflow automation.

Corti Foundation

Corti offers three high-performing Foundation Models, each purpose-built to meet the rigorous demands of the healthcare sector:

  • Solo: A fast model optimized for audio reasoning, enabling dictation, transcription, and conversation diarization.

  • Ensemble: A robust model designed for automating documentation, reducing administrative burden.

  • Symphony: A premium model that combines power and speed to deliver unparalleled performance.

📖 To learn more about the wide range of medical specialties that Corti supports, read this article on Medical Specialties Supported by Corti AI Foundation Models.

Why Choose the Corti API?

  • Purpose-Built for Healthcare: Designed to meet the specific needs and compliance standards of the medical industry.

  • Real-Time Processing: Supports live data streaming and highly accurate fact generation.

  • Seamless Workflow Integration: Works across multiple interaction points in clinical and operational workflows.

  • Customizable & Scalable: Adaptable to fit your organization's needs.

Below is an overview of Corti's available endpoints, along with a brief description of their functionality.

Available API Endpoints

Interactions

An Interaction is the fundamental unit within the Corti API that encapsulates the entire conversation or session between a medical professional and a patient. It ties together all related data and operations, enabling a cohesive workflow from the start of the interaction to the generation of final documentation. The Interactions endpoint allows you to discover all interactions available for your organization, create a new interaction, and update existing ones. When creating an interaction, you’ll receive an InteractionId as well as a WebSocket URL that can be used for real-time, streaming workflows.
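As a rough sketch, creating an interaction and capturing the returned identifiers might look like the following. The base URL, auth header, and response field names here are illustrative assumptions, not the exact schema; consult the API reference for the real request shape.

```python
import json

API_BASE = "https://api.example.corti.app/v2"  # placeholder base URL


def build_create_interaction_request(token: str, encounter: dict) -> dict:
    """Assemble a POST request for creating a new interaction (shape assumed)."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/interactions/",
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps(encounter),
    }


def parse_create_interaction_response(response_body: str) -> tuple:
    """Pull the interactionId and WebSocket URL out of the response."""
    payload = json.loads(response_body)
    return payload["interactionId"], payload["websocketUrl"]


# Simulated response body, mirroring the two values described above:
sample = '{"interactionId": "int_123", "websocketUrl": "wss://example/streams/int_123"}'
interaction_id, ws_url = parse_create_interaction_response(sample)
req = build_create_interaction_request("my-token", {"encounterType": "consultation"})
```

The interactionId anchors every later call (recordings, transcripts, documents), while the WebSocket URL is only needed if you use the real-time streams workflow.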

Recordings

The recordings endpoint in the Corti API allows clients to manage audio recordings associated with interactions. It is part of a larger workflow that includes initializing interactions, uploading recordings, generating transcripts, and creating documents based on the interaction data.

As an example, after initializing an interaction, clients can upload an audio file by sending a POST request to /interactions/:interactionId/recording/. The API responds with a 200 status code and returns a recordingId, confirming that the audio file has been successfully uploaded and linked to the specific interaction. Use the recordingId in your next step, such as generating transcripts or documents.
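A minimal sketch of building the upload URL described above (the base URL is a placeholder; only the path shape comes from this article):

```python
def recording_upload_url(base_url: str, interaction_id: str) -> str:
    """Build the recording upload endpoint for a given interaction."""
    return f"{base_url}/interactions/{interaction_id}/recording/"


# With a real HTTP client you would POST the audio bytes to this URL and
# keep the returned recordingId for the transcript and document steps.
url = recording_upload_url("https://api.example.corti.app/v2", "int_123")
```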

Transcripts

The transcripts endpoint in the Corti API is part of the workflow for processing recorded interactions. After uploading an audio recording, clients can initiate the transcription process by sending a POST request to /interactions/:interactionId/transcripts/. The API then processes the audio and returns a 200 status code along with the generated transcript. This transcript contains the text version of the recorded interaction, extracted and formatted for review. The transcripts endpoint plays a crucial role in converting speech to text for clinical conversations and dictations, enabling further processing and documentation creation based on the interaction data.
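To illustrate working with the returned transcript, the sketch below parses a simulated response body. The segment-based shape is an assumption for illustration; the actual response schema is in the API reference.

```python
import json


def extract_text(transcript_response: str) -> str:
    """Join transcript segments in order (segment shape is assumed)."""
    payload = json.loads(transcript_response)
    return " ".join(segment["text"] for segment in payload["segments"])


# Simulated response from POST /interactions/:interactionId/transcripts/
sample = '{"segments": [{"text": "Patient reports"}, {"text": "mild headache."}]}'
text = extract_text(sample)
```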

Transcribe

The transcribe endpoint enables stateless speech-to-text that can be used to power dictation workflows. Spoken or automated punctuation is supported, and commands can be defined in configuration requests.

  • See the languages page for more information on how and which languages are supported.

  • See the dictation SDK page for a packaged SDK ready to integrate into your app within minutes.
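Putting the points above together, a transcribe configuration with custom commands might be sketched like this. All field names here are hypothetical; treat this as a picture of the concepts (language selection, punctuation mode, command definitions), not the real schema.

```python
# Hypothetical configuration for a stateless transcribe request.
transcribe_config = {
    "languageCode": "en",          # BCP 47 code, per the supported-languages table
    "automaticPunctuation": True,  # alternatively, rely on spoken punctuation
    "commands": [
        # phrase -> action mappings are illustrative only
        {"phrase": "new line", "action": "insert", "value": "\n"},
        {"phrase": "full stop", "action": "insert", "value": "."},
    ],
}
```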

Streams

The streams endpoint enables real-time conversational transcript and fact generation. Designed for scenarios requiring immediate processing and feedback during live interactions, the streams endpoint operates over a WebSocket connection after initializing an interaction. Once connected, the client streams audio packets to the API, which responds with live transcripts and fact updates in real-time.

This continuous data flow allows for instantaneous capture and processing of information, making it ideal for situations where immediate action is crucial. The stream can be updated with new facts during the interaction, and it concludes when the client sends an “end” message, formally closing the live session.
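The send side of that flow can be sketched as follows. A stand-in object replaces the real WebSocket connection so the example runs offline, and the "end" message shape is assumed for illustration.

```python
import json


class FakeStream:
    """Stand-in for a WebSocket connection so the flow below is runnable."""

    def __init__(self):
        self.sent = []

    def send(self, message):
        self.sent.append(message)


def stream_audio(ws, chunks):
    """Send audio chunks, then the 'end' message that closes the live session."""
    for chunk in chunks:
        ws.send(chunk)                    # binary audio frames
    ws.send(json.dumps({"type": "end"}))  # formally close the live session


ws = FakeStream()
stream_audio(ws, [b"\x00\x01", b"\x02\x03"])
```

In a real integration the client would concurrently receive live transcript and fact updates on the same connection until the session closes.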

Facts

Within a real-time streaming interaction, the Corti API extracts Facts: atomic pieces of information that are critical for documenting an interaction accurately. These facts can include details such as the patient’s name, height, blood pressure, symptoms, and other clinically relevant data. Facts are designed to help clinicians quickly identify the salient points of an interaction, making it easier to draft complete and truthful clinical documents.

Key features of Facts:

  • Relevance and Precision: Facts are distilled from the conversation in real-time, ensuring that every crucial piece of information is captured as the interaction unfolds.

  • Customizability: Clinicians can easily add, remove, or modify facts to ensure that the documentation reflects the true nature of the interaction.

  • Integration with Documentation: These facts form the building blocks of clinical documents, ensuring that every relevant detail is included without overwhelming the clinician with unnecessary information.
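To make the customizability point concrete, the sketch below edits a fact the way a clinician might before documents are drafted. The fact objects and field names are illustrative; the real fact schema may differ.

```python
# Illustrative fact objects extracted during a streaming interaction.
facts = [
    {"id": "f1", "text": "Blood pressure 120/80", "group": "vitals"},
    {"id": "f2", "text": "Reports mild headache", "group": "symptoms"},
]


def update_fact(facts: list, fact_id: str, new_text: str):
    """Edit a fact in place, mirroring the clinician add/remove/modify workflow."""
    for fact in facts:
        if fact["id"] == fact_id:
            fact["text"] = new_text
            return fact
    return None


edited = update_fact(facts, "f2", "Reports severe headache")
```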

Documents

The documents endpoint in the Corti API is an essential part of the workflow for processing recorded interactions. After initializing an interaction, uploading a recording, and generating a transcript, clients can create the necessary documentation by sending a POST request to /interactions/:interactionId/documents/. This request utilizes generated transcripts and a templateKey defined in the request to specify the desired format of the output document. Upon processing the request, the API returns a 200 status code along with the final document.

This endpoint allows for the generation of documentation tailored to specific needs, such as clinical notes or referral letters, based on the interaction data and predefined templates. The endpoint can be called multiple times for a given interaction if multiple document outputs are desired. See more details in the Templates & Documents page.
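A sketch of the request body described above, called once per desired output. Only templateKey is taken from this article; any other fields the endpoint accepts would come from the API reference, and the key values here are hypothetical.

```python
def document_request_body(template_key: str) -> dict:
    """Body for POST /interactions/:interactionId/documents/ (shape assumed)."""
    return {"templateKey": template_key}


# One call per desired document output for the same interaction:
note_body = document_request_body("clinical-note")
referral_body = document_request_body("referral-letter")
```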

Templates

Templates in the Corti API allow users to specify the output of medical documents generated from various inputs, such as transcripts, facts, or other medical documents. They enable the definition of specific sections, structure, language, and writing style for documentation, ensuring consistency and clarity in medical reporting.

For more information on how to find and use existing templates, how to build templates dynamically in the document generation request, or how to request a custom template, please see more details in the Templates & Documents page.

Codes

The codes endpoint in the Corti API provides access to code prediction models, including support for ICD-10 diagnosis and procedure codes as well as CPT codes with modifiers. The API can generate codes either for text supplied directly in the request or for documents associated with the interaction.

Currently, the coding endpoint is available on a per-customer basis: customer-specific tuning is required so that coding output matches each client's note styles and requirements. A general-use coding model will be available soon.

📖 To read more about Corti's endpoints and how they work, see this article.

Languages Supported in Corti APIs

To help users navigate the available options, we have introduced a tier system that categorizes functionality and performance. Selecting the right language model tier allows you to balance capabilities and cost while ensuring the best fit for your needs.

Corti ASR Tiers

Corti's Automatic Speech Recognition (ASR) models are available in three tiers:

| Tier | Description | Primary Use Case |
| --- | --- | --- |
| Base | Enables speech-to-text with a general medicine model suitable for a variety of medical specialties. | Broad support for speech-enabled clinical documentation workflows. |
| Enhanced | Optimized for medical terminology recognition across different specialties, balancing speed, functionality, and quality. | Powers ambient documentation workflows. |
| Premier | Provides the highest accuracy, widest language support, and most advanced functionality, including customized commands, user-defined lexicon, and specialist terminology. | Supports robust dictation or ambient documentation workflows. |

Available Languages

The table below outlines the languages currently supported by Corti ASR models, including their availability by tier:

| Language | Language Code (BCP 47) | Tier Availability |
| --- | --- | --- |
| English | en | Base, Enhanced, Premier |
| English (US) | en-us | Base, Enhanced, Premier |
| English (UK) | en-gb | Base, Enhanced, Premier |
| Danish | da | Base, Enhanced, Premier (beta) |
| German | de | Base, Enhanced, Premier (beta) |
| French | fr | Base, Enhanced |
| Swiss-German | de-CH | Base, Enhanced (beta) |
| Swedish | sv | Base, Enhanced (beta) |
| Spanish | es | Base, Enhanced (coming soon) |
| Dutch | nl | Base |
| Norwegian | no | Base |
| Italian | it | Base |
| Portuguese | pt | Base |
| Arabic | ar | Base (coming soon) |

Glossary

This section provides definitions for key terms and concepts related to Corti’s platform and AI capabilities. Whether you're exploring our APIs, using our AI-powered tools, or integrating Corti into your workflows, this glossary will help you better understand the terminology we use.

| Term | Definition |
| --- | --- |
| Front-end Dictation | A real-time speech-to-text process where spoken words are immediately transcribed and displayed to the speaker (e.g., a clinician dictating notes directly into an EHR and seeing the transcript appear live). |
| Back-end Transcription | A process where recorded audio (either conversational or dictated) is transcribed after the fact. Unlike front-end dictation, back-end transcription does not require real-time streaming interaction from the speaker. Instead, a complete audio file is processed and a text output is returned. |
| Conversational Diarized Transcript | A transcript that includes both the spoken content and speaker attribution (i.e., who said what). "Diarized" means that the transcript is segmented and labeled by the different speakers. |
| WebSocket | A protocol that allows for real-time communication between a user (via a client-side application) and a server. Opening a WebSocket connection initiates a persistent, two-way communication channel required for real-time workflows. When the real-time interaction is complete, the WebSocket connection is closed. |
| Live | Workflows that require data streaming to support real-time updates while the audio is being recorded and processed. Examples include front-end dictation using Corti Solo and ambient documentation using Corti Symphony. |
| Asynchronous (or "Batch") | Workflows that do not happen live. In asynchronous (batch) transactions, data files are delivered to an API endpoint, processed, and a resulting output is returned. No WebSocket is required. Example: uploading an audio recording and receiving a transcript via the Transcripts endpoint. |
| Feature Flag | A technique that allows Corti to enable or disable features (API endpoints or components) within a customer's tenant/infrastructure environment. |
| Natural Language Dictation (NLD) | The next generation of front-end speech recognition, allowing clinicians to dictate naturally and have AI transform their dictation into a structured summary document. Example: a dictated radiology report is converted into a formatted note output. |
| ASR Commands | Pre-defined phrases that execute actions instead of transcribing spoken words. Used for punctuation, line breaks, inserting text macros, etc. Supported in the Premier-tier ASR language models. Example: saying "Plan colon new line insert diabetes controlled plan" results in structured text output. |
| Speech vs. Voice | Speech recognition converts spoken words into text (e.g., for dictation, transcription, virtual assistants). Voice recognition identifies a specific person based on unique vocal characteristics (e.g., for security access, personalized services, speaker verification). |
| Foundation Models, Language Models, and Corti | Foundation models (FMs) are large AI models trained on vast data to support multiple use cases. Language models (like ASR or coding models) are specialized FMs for natural language tasks. Corti builds healthcare-specific FMs to ensure accurate, cost-effective AI models for speech recognition, documentation, and more. |
| Corti Solo | A foundation model pre-trained on extensive healthcare data for fast, accurate speech recognition. Includes ASR models optimized for different languages. |
| Corti Ensemble | Builds on Corti Solo, adding AI models for automatic document generation from conversational or dictated audio (e.g., AI Scribe applications). |
| Corti Symphony | The most advanced Corti foundation model, including Solo and Ensemble, plus real-time fact generation, contextual search agents, knowledge retrieval, chart summarization, and alignment models. |
| SDK (Software Development Kit) | A collection of tools, libraries, documentation, and code samples that developers use to build software applications. Example: Corti's Dictation SDK provides a guide and components for building a dictation solution using the API. |


📢 Need help? If you have questions or require additional support, feel free to contact us at help@corti.ai
