Welcome to Corti! This three-step setup guide will help you get familiar with different areas of Corti's API. We recommend keeping our documentation open in another tab as you walk through these steps. You can also use the AI assistant on the bottom left of this page. It is trained on our documentation, and you can also use it to request to speak to a person. We are happy to help if you have any questions.
Prerequisites
Sign up for a Corti Console account and create a Project.
Create your API Client Credentials within your Console Project.
Access to API platform/client tools, including:
Postman
Python 3.9+ with websockets installed (pip install websockets)
Step 1: Console Studio
AI Studio is your in-browser playground for testing speech recognition and text generation before writing any code. It can also be used to generate code snippets to help with your development.
A few things to note before you get started:
We recommend using the studio in conjunction with your browser's developer tools so you can see the network traffic of the API requests and responses.
Studio use consumes credits at the same rate as API requests, so you can use the studio to estimate your total usage.
Speech Recognition
Speech Recognition allows you to explore real-time, bidirectional communication with the Corti system for stateless speech-to-text. The available settings give you control over the produced text without interrupting the stream.
Click "View Code Sample" to generate JSON , JavaScript, and HTML for your current config settings
Configure speech recognition
These configuration options should be set before starting the recording. Explore the different options to see how each affects the produced transcription; a sketch of these settings as a configuration payload follows the list.
Primary Language
Choose the language the model should expect (e.g., en – English). Pick the language that matches your audio for best accuracy.
Interim Results
Toggle to see partial (real-time) hypotheses as speech is processed, simulating a live “typing” effect. The interim results will appear light gray while the final results will be black text.
Spoken Punctuation
When enabled, spoken words like “comma” or “period” are interpreted as punctuation rather than literal words.
Automatic Punctuation
Allow the model to insert punctuation automatically based on prosody and context. Use with Spoken Punctuation disabled if you want fully hands-free punctuation.
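As a rough illustration, the settings above map to a configuration payload like the one below. This is a minimal sketch: the field names are assumptions, not Corti's exact schema, so use the "View Code Sample" button to generate the authoritative configuration for your current settings.

```python
# Illustrative only: these field names mirror the Studio settings above but
# are assumptions, not Corti's exact schema. Use "View Code Sample" in the
# studio to generate the authoritative configuration.
speech_config = {
    "primaryLanguage": "en",       # language the model should expect
    "interimResults": True,        # stream partial hypotheses while you speak
    "spokenPunctuation": False,    # treat "comma"/"period" as punctuation when True
    "automaticPunctuation": True,  # let the model punctuate from prosody and context
}
```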
Commands
Define voice commands that trigger structured actions.
You will see the commands appear below the transcript, but they will not affect the text in the studio demo.
To Create a Command:
Enter the required information:
Command ID: The name of your command (e.g., next_section)
Phrases: The phrases that will be spoken by the user (e.g., “next section”, “go to next section”)
Optional Variables: Placeholders that allow a user to supply specific values when using voice commands (e.g., {template-name} in the phrase "insert my {template-name} template")
Click 'Add Command' to complete the creation process. A sketch of a command definition is shown below.
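For orientation, a command definition built from the fields above might look like this in Python. The key names are hypothetical; verify the exact schema against the code sample generated in the studio.

```python
# Hypothetical command definition matching the fields described above;
# the exact schema may differ, so verify against the studio's code sample.
command = {
    "id": "insert_template",                            # Command ID
    "phrases": ["insert my {template-name} template"],  # spoken trigger phrases
    "variables": ["template-name"],                     # optional placeholder values
}
```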
Text Generation
Text Generation allows you to generate standardized documents from clinical facts.
You can find sample code from our SDK for the document generation request, along with links to relevant technical documentation, in the bottom-left section.
Template Selection
Use this section to select a document template you would like to use to generate a new document. We have 12 different standard templates you can choose from. To see more information about a template and its potential use cases, you can select a template and view the preview on the right.
You can also choose the language of the document. Please note these options may differ between US- and EU-hosted API clients.
Clinical Context
Use this section to select and edit the information used to generate a new document. We have several predefined contexts you can choose from. To see how the document changes based on the input, you can edit the context and regenerate the document.
If you would like to use your own context, you can use a custom clinical context.
Copy your results from Speech Recognition into a custom context to simulate the end-to-end process of speech to standardized documentation.
Generated Document
After you generate a document, you can toggle between a UI-friendly rendered view and the raw JSON output.
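To connect this to code, here is a sketch of what a document-generation request could look like. The endpoint path, template identifier, and payload fields are assumptions, not the confirmed API shape; copy the SDK snippet shown in the bottom-left section of the studio for the authoritative request.

```python
import requests

# Sketch of a document-generation request, assuming a bearer token from Step 2.
# The endpoint path, template identifier, and payload fields are assumptions;
# copy the SDK snippet from the studio for the authoritative request.
BASE_URL = "https://api.eu.corti.app/v2"   # substitute your environment
ACCESS_TOKEN = "your-access-token"
TENANT_NAME = "your-tenant"

payload = {
    "templateKey": "soap-note",            # hypothetical template identifier
    "outputLanguage": "en",
    "context": [
        {"type": "transcript",             # clinical facts to document
         "data": "Patient reports three days of sore throat and mild fever."}
    ],
}
response = requests.post(
    f"{BASE_URL}/documents",               # hypothetical path
    json=payload,
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}",
             "Tenant-Name": TENANT_NAME},
)
response.raise_for_status()
print(response.json())
```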
Step 2: Postman Collection
Download the Postman collection at the bottom of this page. This will allow you to explore our REST API using pre-recorded audio sent asynchronously. We have included an example file below for your use.
Authentication Setup Instructions
Open the Collection Variables
In Postman, click on your collection name.
Go to the Variables tab.
Set up the following variables:
token
client-id
environment
Tenant-Name
client-secret
baseURL
Fill in your variable values
Locate the variables for authentication (e.g., baseURL, client-id, client-secret, environment, Tenant-Name).
Fill in your credentials from your Console API Key.
Your baseURL will be https://api.{{environment}}.corti.app/v2
Get Your Access Token
Save the Token
Click 'Use Token' on the pop-up modal. This will save the token for future requests; it will be refreshed in the background when a new request is made.
You will find your environment and tenant in the 'API Client' created within your Console.
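If you later want to fetch a token outside Postman, the sketch below shows an OAuth2 client-credentials request in Python. The token URL shape is an assumption; copy the real URL from the collection's Authorization tab, which is pre-wired to these variables.

```python
import requests

# Sketch of fetching an access token with the OAuth2 client-credentials flow.
# The token URL shape is an assumption; copy the real one from the Postman
# collection's Authorization tab.
ENVIRONMENT = "eu"             # from your Console API Client
TENANT_NAME = "your-tenant"    # from your Console API Client
CLIENT_ID = "your-client-id"
CLIENT_SECRET = "your-client-secret"

TOKEN_URL = (                  # hypothetical shape; verify in Postman
    f"https://auth.{ENVIRONMENT}.corti.app/realms/{TENANT_NAME}"
    "/protocol/openid-connect/token"
)
response = requests.post(TOKEN_URL, data={
    "grant_type": "client_credentials",
    "client_id": CLIENT_ID,
    "client_secret": CLIENT_SECRET,
})
response.raise_for_status()
access_token = response.json()["access_token"]
```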
Explore our REST API
Our recommended starter course, sketched in Python after this list, is:
Creating an Interaction (Be sure to save the interaction ID in an associated variable)
Upload a recording to the interaction (Copy the recording ID)
Create a transcript
Generate a document from the transcript text
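The sketch below walks through this starter course in Python. The endpoint paths, request bodies, and response field names are assumptions; mirror the requests in the Postman collection, which are the authoritative shapes.

```python
import requests

# Sketch of the recommended starter flow. Paths, bodies, and response field
# names are assumptions; mirror the requests in the Postman collection.
BASE_URL = "https://api.eu.corti.app/v2"   # {{baseURL}} in Postman
HEADERS = {"Authorization": "Bearer your-access-token",
           "Tenant-Name": "your-tenant"}

# 1. Create an interaction and save its ID (the collection stores it in a variable).
interaction = requests.post(f"{BASE_URL}/interactions",
                            json={},                       # hypothetical body
                            headers=HEADERS).json()
interaction_id = interaction["interactionId"]              # hypothetical field name

# 2. Upload a recording to the interaction and copy the recording ID.
with open("sample.wav", "rb") as audio:
    recording = requests.post(
        f"{BASE_URL}/interactions/{interaction_id}/recordings",  # hypothetical path
        data=audio, headers=HEADERS).json()
recording_id = recording["recordingId"]                    # hypothetical field name

# 3. Create a transcript from the uploaded recording.
transcript = requests.post(
    f"{BASE_URL}/interactions/{interaction_id}/transcripts",     # hypothetical path
    json={"recordingId": recording_id}, headers=HEADERS).json()

# 4. Generate a document from the transcript text (see the Text Generation sketch).
```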
Once you are comfortable with this process, explore the other parts of the collection and our API.
Step 3: Open a WebSocket in Python to Stream Audio
You can download the Python example below and view the video guide to get started. A minimal sketch of the streaming loop follows.
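The sketch below shows the general shape of streaming raw audio with the websockets library. The URL, header name, and framing are assumptions; the downloadable Python example is the authoritative reference.

```python
import asyncio
import websockets

# Sketch of streaming raw audio over a WebSocket, assuming a bearer token
# from Step 2. The URL, header name, and framing are assumptions; the
# downloadable Python example is the authoritative reference.
WS_URL = "wss://api.eu.corti.app/v2/transcribe"   # hypothetical endpoint
ACCESS_TOKEN = "your-access-token"

async def stream(path: str) -> None:
    headers = {"Authorization": f"Bearer {ACCESS_TOKEN}"}
    # websockets >= 14 uses additional_headers; older releases call it extra_headers
    async with websockets.connect(WS_URL, additional_headers=headers) as ws:
        # A real client would likely send a JSON configuration message first.
        with open(path, "rb") as audio:
            while chunk := audio.read(3200):  # ~100 ms of 16 kHz, 16-bit mono audio
                await ws.send(chunk)          # binary frame of raw audio
                await asyncio.sleep(0.1)      # pace the upload roughly in real time
        async for message in ws:              # print transcript events as they arrive
            print(message)

asyncio.run(stream("sample.wav"))
```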