Introduction:

Corti Foundation Model

Corti’s Foundation Model is split into three components, each purpose-built to meet the rigorous demands of the healthcare sector:

Solo: A fast model optimized for audio reasoning, enabling dictation, transcription, and conversation diarization.
Ensemble: A robust model designed for automating documentation, reducing administrative burden.
Symphony: A premium model that combines power and speed to deliver unparalleled performance.

Through the API, Solo, Ensemble, and Symphony are activated based on the task (or endpoint) that you request.

Why Corti's Models

Purpose-Built for Healthcare: Designed to meet the specific needs and compliance standards of the medical industry.
Real-Time Processing: Supports live data streaming and highly accurate fact generation.
Seamless Workflow Integration: Works across multiple interaction points in clinical and operational workflows.
Customizable & Scalable: Adaptable to fit your organization's needs.

Model Details

Below is an overview of Corti's Foundation Models, covering their intended use, evaluation, training data, and ethical considerations.

Category	Details
Model information
Basic Information	Corti's Foundation Models (Solo, Ensemble, and Symphony) power AI-driven automation in healthcare. Each model provides varying levels of complexity and performance for different use cases. Read more at help.corti.ai and docs.corti.ai.
Developer	Corti ApS / Corti America Inc.
Model Date	2025-03-04
Model Versions	v1.0 (Release Notes)
Model Type	Foundation Models for audio and text, capable of: speech recognition; text generation; retrieval-augmented generation; instruct-based text generation; classification; ranking.
Training Algorithms & Parameters	The Symphony Foundation Model has 100B+ parameters, while Solo and Ensemble are optimized for efficiency and robustness. All models are trained on healthcare-specific datasets, designed to minimize bias through rigorous methodologies. Read more: Bias Mitigation, Tuning AI Models.
Resources for More Information	Customer introductions, whitepapers, and peer-reviewed publications available upon request. Research links: help.corti.ai, Corti Research, docs.corti.ai
Citation Details	Corti ApS (2025). Corti Foundation Models (Solo, Ensemble, Symphony). Release Notes.
License	Subscription License.
Intended Use
Use Cases	Speech recognition; text generation; retrieval-augmented generation; instruct-based text generation; classification; ranking.
Primary Users	Healthcare professionals: doctors, nurses, secretaries, administrative staff, call center agents, paramedics.
Out-of-Scope Uses	Any non-healthcare applications.
Factors
Factors Affecting Model Performance	The model has been trained on extensive healthcare data spanning broad populations over an extended time period. No data was selectively curated to mitigate specific biases, so the model reflects general population trends. Outliers may therefore see lower accuracy if they are less well represented in the general population. Corti continuously monitors such outliers and ensures that they are represented in the datasets used for fine-tuning the model.
Relevant Factors	Dialects, accents, topics outside of the normal distribution.
Evaluation Factors	K-fold cross-validation is used for all evaluation, with multiple validation and test sets representing the general distribution as well as subgroups for the relevant factors. These subgroups include dialects, accents, and specific conversational topics.
Variation Approaches	Subgroup analysis, data distribution variation, sensitivity analysis, temporal variation, and cross-validation across model variants.
Metrics
Metrics with real-world impact	1. Speech recognition: 1.1. Levenshtein distance: word-error-rate, character-error-rate, and medical keyword word-error-rate. 1.2. Sub-measures: ROUGE scores. Read more: General transcription guidelines; How to Evaluate the Accuracy of a Speech Recognition Model. 2. Generative AI: 2.1. Completeness, conciseness, and groundedness. Read more: Benchmark of LLM documentation. 3. Recommendations: 3.1. CSAT scores. 3.2. Normalized Review Acceptance Rate.
Model performance measures	Throughout any evaluation we expect: word-error-rate < 5%; completeness > 90%; conciseness > 95%; groundedness > 95%; F1-score > 60%; CSAT > 70%; Normalized Review Acceptance Rate 90%+.
Evaluation Data
Data Details	1000+ hours of transcribed speech from in-room and telephone/virtual conversations. 100M+ curated summaries of recorded conversations. 700K+ patient summaries and clinical codes, including denial data.
Datasets Used	Proprietary datasets from Corti's customers. Public datasets are also used in evaluation and research, including: LibriSpeech, Common Voice, Switchboard, Fisher Corpus, LJSpeech, MIMIC-III, MIMIC-IV, PriMock57.
Training Data
Training Data	Vast publicly available data; synthetic datasets capturing specific language anomalies; silver-labeled transcribed speech; human-annotated summaries of recorded conversations; human-annotated patient summaries and clinical codes.
Quantitative Analyses
Unitary results	Accent & Dialect Performance: Word-error-rate stays below 10% even across varied accents. Long Context Summaries: Conciseness and completeness may drop by ~5%, while groundedness remains stable. Clinical Coding: Performance declines as a function of code rarity.
Intersectional results	Distributional anomalies are evaluated as part of our cross-validation (see Evaluation Factors above).
Ethical Considerations
Ethical Considerations	Data collected with informed consent. Compliant with HIPAA and relevant regulations. Security: End-to-end encryption for data at rest and in transit. Bias mitigation: Diverse datasets and continuous evaluation. Real-time monitoring & risk assessment (EU AI Act, Responsible AI). Regulatory compliance: Verified via Drata (Drata Trust).
Contact	help.corti.ai
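
Example: computing word-error-rate and character-error-rate

The speech recognition metrics listed under Metrics (word-error-rate and character-error-rate, both based on Levenshtein distance) can be illustrated with a minimal Python sketch. This is not Corti's evaluation code: the function names, the lowercasing and whitespace tokenization, and the sample sentences are assumptions made for illustration only. Medical keyword word-error-rate is not covered, since its exact definition is not given here.

```python
from typing import Sequence


def levenshtein(ref: Sequence, hyp: Sequence) -> int:
    """Minimum number of substitutions, insertions, and deletions
    needed to turn `ref` into `hyp` (works on strings or word lists)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of r
                curr[j - 1] + 1,     # insertion of h
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.
    Assumption: text is lowercased and split on whitespace."""
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return levenshtein(ref_words, hyp_words) / len(ref_words)


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Character-level edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference.lower(), hypothesis.lower()) / len(reference)


if __name__ == "__main__":
    # Hypothetical transcript pair, not taken from any Corti dataset.
    reference = "patient denies chest pain and shortness of breath"
    hypothesis = "patient denies chest pain and shortness of breadth"

    wer = word_error_rate(reference, hypothesis)
    cer = character_error_rate(reference, hypothesis)
    print(f"WER: {wer:.3f}")  # one wrong word out of eight: 0.125
    print(f"CER: {cer:.3f}")
```

In this example a single misrecognized word out of eight gives a word-error-rate of 12.5%, which would exceed the <5% expectation stated under Model performance measures. In practice the rate is aggregated over many utterances rather than judged on a single sentence.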