Speechmatics ASR REST API

The Speechmatics Automatic Speech Recognition REST API is used to submit ASR jobs and receive the results. The only supported job type currently is transcription of audio files.

Version: 2.0.0

Contact information:
support@speechmatics.com

/jobs


POST

Summary: Create a new job.

Parameters

Name Located in Description Required Schema
Authorization header Customer API token Yes string
config formData JSON containing a JobConfig model indicating the type and parameters for the recognition job. Yes string
data_file formData The data file to be processed. Alternatively, the data file can be fetched from a URL specified in JobConfig. No file

Responses

Code Description Schema
201 OK CreateJobResponse
400 Bad request ErrorResponse
401 Unauthorized ErrorResponse
403 Forbidden ErrorResponse
429 Rate Limited See Troubleshooting
500 Internal Server Error ErrorResponse
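
As a minimal sketch of the request described above, the following builds (without sending) a multipart/form-data POST to /jobs carrying the config field and an optional data_file. The base URL, the "Bearer" token scheme, and the function name are assumptions for illustration; the reference only states that the Authorization header holds the customer API token.

```python
import json
import uuid
import urllib.request

# Assumption: placeholder base URL; substitute the one for your deployment.
BASE_URL = "https://asr.example.com/v2"


def build_create_job_request(token, config, audio_name=None, audio_bytes=None):
    """Build (but do not send) a multipart/form-data POST to /jobs."""
    boundary = uuid.uuid4().hex
    parts = []

    # The 'config' form field carries the JobConfig serialised as JSON.
    config_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="config"\r\n\r\n'
        f"{json.dumps(config)}\r\n"
    )
    parts.append(config_part.encode("utf-8"))

    # 'data_file' is optional: omit it when JobConfig.fetch_data supplies a URL.
    if audio_bytes is not None:
        file_header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="data_file"; filename="{audio_name}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        )
        parts.append(file_header.encode("utf-8") + audio_bytes + b"\r\n")

    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        f"{BASE_URL}/jobs",
        data=b"".join(parts),
        method="POST",
        headers={
            # Assumption: the token is sent with a Bearer scheme.
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it; a 201 response body should then follow the CreateJobResponse schema.
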

GET

Summary: List all jobs, limited to the 100 most recent jobs from the last 2 days.

Parameters

Name Located in Description Required Schema
Authorization header Customer API token Yes string

Responses

Code Description Schema
200 OK RetrieveJobsResponse
401 Unauthorized ErrorResponse
429 Rate Limited See Troubleshooting
500 Internal Server Error ErrorResponse

/jobs/{jobid}


GET

Summary: Get job details, including progress and any error reports.

Parameters

Name Located in Description Required Schema
Authorization header Customer API token Yes string
jobid path ID of the job. Yes string

Responses

Code Description Schema
200 OK RetrieveJobResponse
401 Unauthorized ErrorResponse
404 Not found ErrorResponse
429 Rate Limited See Troubleshooting
500 Internal Server Error ErrorResponse

DELETE

Summary: Delete a job and remove all associated resources.

Parameters

Name Located in Description Required Schema
Authorization header Customer API token Yes string
jobid path ID of the job to delete. Yes string

Responses

Code Description Schema
200 The job that was deleted. DeleteJobResponse
401 Unauthorized ErrorResponse
404 Not found ErrorResponse
429 Rate Limited See Troubleshooting
500 Internal Server Error ErrorResponse

/jobs/{jobid}/data


GET

Summary: Get the data file used as input to a job.

Parameters

Name Located in Description Required Schema
Authorization header Customer API token Yes string
jobid path ID of the job. Yes string

Responses

Code Description Schema
200 OK file
401 Unauthorized ErrorResponse
404 Not found ErrorResponse
410 Gone ErrorResponse
429 Rate Limited See Troubleshooting
500 Internal Server Error ErrorResponse

/jobs/{jobid}/transcript


GET

Summary: Get the transcript for a transcription job.

Parameters

Name Located in Description Required Schema
Authorization header Customer API token Yes string
jobid path ID of the job. Yes string
format query The transcription format, either txt or json-v2 (the default). No string

Responses

Code Description Schema
200 OK RetrieveTranscriptResponse
401 Unauthorized ErrorResponse
404 Not found ErrorResponse
410 Gone ErrorResponse
429 Rate Limited See Troubleshooting
500 Internal Server Error ErrorResponse
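
A small sketch of how the transcript URL and its optional format query parameter might be assembled. The base URL and the helper name are hypothetical; only the path and the two format values come from the reference above.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://asr.example.com/v2"  # Assumption: placeholder base URL.
TRANSCRIPT_FORMATS = ("json-v2", "txt")  # The two formats named in the reference.


def transcript_url(job_id, fmt=None):
    """Return the URL for GET /jobs/{jobid}/transcript."""
    url = f"{BASE_URL}/jobs/{quote(job_id, safe='')}/transcript"
    if fmt is None:
        # No format given: the service default (json-v2) applies.
        return url
    if fmt not in TRANSCRIPT_FORMATS:
        raise ValueError(f"unsupported transcript format: {fmt!r}")
    return f"{url}?{urlencode({'format': fmt})}"
```

Rejecting unknown formats on the client side avoids spending a request (and rate-limit budget) on a call that cannot succeed.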

ErrorResponse

Name Type Description Required
code integer The HTTP status code. Yes
error string The error message. Yes
detail string The details of the error. No
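
One way a client might turn an ErrorResponse body into an exception is sketched below. The class and function names are hypothetical; the field names (code, error, detail) are those listed in the table above.

```python
import json


class ApiError(Exception):
    """Client-side wrapper for an ErrorResponse body (hypothetical name)."""

    def __init__(self, code, error, detail=None):
        message = f"{code} {error}"
        if detail:
            message += f": {detail}"
        super().__init__(message)
        self.code = code
        self.error = error
        self.detail = detail


def parse_error_response(body):
    """Parse a JSON ErrorResponse; 'detail' is optional and may be absent."""
    payload = json.loads(body)
    return ApiError(int(payload["code"]), payload["error"], payload.get("detail"))
```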

TrackingData

Name Type Description Required
title string The title of the job. No
reference string External system reference. No
tags [ string ] A set of keywords. No
details object Customer-defined JSON structure. No

DataFetchConfig

Name Type Description Required
url string Yes
auth_headers [ string ] A list of additional headers to be added to the input fetch request when using http or https. This is intended to support authentication or authorization, for example by supplying an OAuth2 bearer token. No

TranscriptionConfig

Name Type Description Required
language string Language model to process the audio input, normally specified as an ISO language code. Yes
additional_vocab [ object ] List of custom words or phrases that should be recognized. Alternative pronunciations can be specified to aid recognition. No
diarization string Specify whether speaker or channel labels are added to the transcript. The default is none.
- none: no speaker or channel labels are added.
- speaker: speaker attribution is performed based on acoustic matching; all input channels are mixed into a single stream for processing.
- channel: multiple input channels are processed individually and collated into a single transcript.
No
channel_diarization_labels [ string ] Transcript labels to use when collating separate input channels. No
punctuation_overrides [ object ] Overrides to the default settings for advanced punctuation.
Ignored if the language model does not support advanced punctuation.
No
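
The diarization options above can be sketched as a small builder that validates its inputs before a job is submitted. The function name is hypothetical, and treating channel_diarization_labels as valid only with channel diarization is an assumption drawn from the field description, not a documented server rule.

```python
DIARIZATION_MODES = ("none", "speaker", "channel")


def make_transcription_config(language, diarization="none", channel_labels=None):
    """Return a TranscriptionConfig dict with only the fields that are needed."""
    if diarization not in DIARIZATION_MODES:
        raise ValueError(f"diarization must be one of {DIARIZATION_MODES}")

    config = {"language": language}
    if diarization != "none":
        # 'none' is the default, so it is left out of the payload.
        config["diarization"] = diarization

    if channel_labels:
        # Assumption: labels are only meaningful when channels are kept separate.
        if diarization != "channel":
            raise ValueError("channel labels require diarization='channel'")
        config["channel_diarization_labels"] = list(channel_labels)
    return config
```

For example, make_transcription_config("en", "channel", ["Agent", "Caller"]) yields a config that labels the two input channels in the collated transcript.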

NotificationConfig

Name Type Description Required
url string The URL to which a notification message will be sent upon completion of the job. The job id and status are added as query parameters, and any combination of the job inputs and outputs can be included by listing them in contents.
If contents is empty, the body of the request will be empty.
If only one item is listed, it will be sent as the body of the request with Content-Type set to an appropriate value such as application/octet-stream or application/json.
If multiple items are listed they will be sent as named file attachments using the multipart content type.
If contents is not specified, the transcript item will be sent as a file attachment named data_file, for backwards compatibility.
If the job was rejected or failed during processing, that will be indicated by the status, and any output items that are not available as a result will be omitted. The body formatting rules will still be followed as if all items were available.
The user-agent header is set to Speechmatics API V2 in all cases.
In case of errors, status can be one of the following: fetch_error indicates a problem with fetching the file from DataFetchConfig.url; trim_error indicates a problem with trimming the audio file length to meet subscription limits; error indicates a general error processing the job.
Yes
contents [ string ] Specifies a list of items to be attached to the notification message. When multiple items are requested, they are included as named file attachments. Possible options:
  • `jobinfo`: A summary of the job. See JobInfo.
  • `transcript`: The transcript in the default format.
  • `transcript.json-v2`: The transcript in `json-v2` format.
  • `transcript.txt`: The transcript in `txt` format.
  • `data`: The audio file submitted for the job.
No
method string The method to be used with http and https URLs. The default is POST. No
auth_headers [ string ] A list of additional headers to be added to the notification request when using http or https. This is intended to support authentication or authorization, for example by supplying an OAuth2 bearer token. No
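
The body-formatting rules for notifications can be summarised as a small decision function. This is an illustrative sketch only: the function name and the returned labels are hypothetical, while the item names and the four cases follow the descriptions of url and contents above.

```python
NOTIFICATION_ITEMS = (
    "jobinfo",
    "transcript",
    "transcript.json-v2",
    "transcript.txt",
    "data",
)


def notification_body_kind(contents=None):
    """Describe the callback body implied by a NotificationConfig contents list."""
    if contents is None:
        # contents not specified: the transcript is sent as a file attachment
        # named data_file, for backwards compatibility.
        return "attachment named data_file"

    unknown = [item for item in contents if item not in NOTIFICATION_ITEMS]
    if unknown:
        raise ValueError(f"unknown notification items: {unknown}")

    if not contents:
        return "empty body"
    if len(contents) == 1:
        return "single item as body"
    return "multipart attachments"
```

A callback receiver can use the same distinction to decide whether to read the raw request body or to parse a multipart payload.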

JobConfig

JSON object that contains various groups of job configuration parameters. Depending on the value of type, a type-specific object such as transcription_config must be present; it specifies all configuration settings or parameters needed to process the job inputs as expected.

If the results of the job are to be forwarded on completion, notification_config can be provided with a list of callbacks to be made; no assumptions should be made about the order in which they will occur.

Customer-specific job details or metadata can be supplied in tracking, and this information will be available where possible in the job results and in callbacks.

Name Type Description Required
type string Yes
fetch_data DataFetchConfig No
transcription_config TranscriptionConfig No
notification_config [ NotificationConfig ] No
tracking TrackingData No
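
A minimal sketch of assembling a JobConfig and serialising it for the config form field. The helper name and example values are hypothetical, and the type value "transcription" is an assumption based on transcription being the only supported job type; the table above does not list the allowed values.

```python
import json


def make_job_config(language, fetch_url=None, notify_url=None, tracking=None):
    """Return a JobConfig serialised as a JSON string."""
    config = {
        "type": "transcription",  # Assumption: value for the transcription job type.
        "transcription_config": {"language": language},
    }
    if fetch_url:
        # DataFetchConfig: the service fetches the audio instead of receiving data_file.
        config["fetch_data"] = {"url": fetch_url}
    if notify_url:
        # notification_config is a list, so several callbacks can be registered.
        config["notification_config"] = [
            {"url": notify_url, "contents": ["transcript"]}
        ]
    if tracking:
        config["tracking"] = tracking
    return json.dumps(config)
```

Optional groups are omitted entirely rather than sent as empty objects, which keeps the payload limited to the fields the table marks as required plus those actually in use.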

CreateJobResponse

Name Type Description Required
id string The unique ID assigned to the job. Keep a record of this for later retrieval of your completed job. Yes

JobDetails

Name Type Description Required
created_at dateTime The UTC date time the job was created. Yes
data_name string Name of the data file submitted for the job. Yes
duration integer The file duration (in seconds). Yes
id string The unique id assigned to the job. Yes
status string The status of the job.
queued - The job is waiting to run.
running - The job is actively running.
done - The job completed successfully.
rejected - The job was accepted at first, but later could not be processed by the transcriber.
deleted - The user deleted the job.
expired - The system deleted the job, usually because the job was in the done state for a very long time.
Yes
config JobConfig Yes
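
The status values above suggest a simple polling loop, sketched here under stated assumptions. fetch_status is a hypothetical caller-supplied function that performs GET /jobs/{jobid} and returns job.status; the poll interval and timeout defaults are arbitrary and should be chosen with the 429 rate limit in mind.

```python
import time

# Statuses after which the job will not change state again.
TERMINAL_STATUSES = {"done", "rejected", "deleted", "expired"}


def wait_for_job(fetch_status, poll_seconds=5.0, timeout_seconds=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches a terminal status, then return that status."""
    deadline = clock() + timeout_seconds
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_seconds}s")
        # 'queued' or 'running': wait before asking again.
        sleep(poll_seconds)
```

Where a notification_config callback is available, it avoids polling altogether; the loop is mainly useful for scripts and tests.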

RetrieveJobsResponse

Name Type Description Required
jobs [ JobDetails ] Yes

RetrieveJobResponse

Name Type Description Required
job JobDetails Yes

DeleteJobResponse

Name Type Description Required
job JobDetails Yes

JobInfo

Summary information about an ASR job, to support identification and tracking.

Name Type Description Required
created_at dateTime The UTC date time the job was created. Yes
data_name string Name of the data file submitted for the job. Yes
duration integer The data file audio duration (in seconds). Yes
id string The unique id assigned to the job. Yes
tracking TrackingData No

RecognitionMetadata

Summary information about the output from an ASR job, comprising the job type and configuration parameters used when generating the output.

Name Type Description Required
created_at dateTime The UTC date time the transcription output was created. Yes
type string Yes
transcription_config TranscriptionConfig No

RecognitionDisplay

Name Type Description Required
direction string Yes

RecognitionAlternative

List of possible job output item values, ordered by likelihood.

Name Type Description Required
content string Yes
confidence float Yes
language string Yes
display RecognitionDisplay No
speaker string No

RecognitionResult

An ASR job output item. The primary item types are word and punctuation. Other item types may be present, for example to provide semantic information of different forms.

Name Type Description Required
channel string No
start_time float Yes
end_time float Yes
type string New types of items may appear without being requested; unrecognised item types can be ignored. Yes
alternatives [ RecognitionAlternative ] Yes

RecognitionOutput

The results element contains a list of ASR job output items, ordered by start time and sub-ordered by end time and type. The primary item types are word and punctuation; the most likely value for each of these items can be combined in the order of occurrence to form a linear transcript. Items may overlap in time if there are multiple channels. Other item types may be present which overlap with the primary items and each other, for example to provide different forms of semantic information as time-based annotations.

Name Type Description Required
metadata RecognitionMetadata Yes
results [ RecognitionResult ] Yes
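
Forming a linear transcript from the results list could look roughly like the sketch below. It takes the first (most likely) alternative of each word and punctuation item and ignores other item types. Attaching punctuation to the preceding word without a space is an assumption about presentation, and the sketch assumes a single channel, since multi-channel items may overlap in time.

```python
def linear_transcript(results):
    """Join the most likely word and punctuation values in order of occurrence."""
    pieces = []
    for item in results:
        # Unrecognised item types can be ignored, as the reference allows.
        if item["type"] not in ("word", "punctuation"):
            continue
        if not item["alternatives"]:
            continue
        # Alternatives are ordered by likelihood, so the first is the best guess.
        content = item["alternatives"][0]["content"]
        if item["type"] == "punctuation" and pieces:
            pieces[-1] += content  # Assumption: no space before punctuation.
        else:
            pieces.append(content)
    return " ".join(pieces)
```

The input is the results array from a json-v2 RetrieveTranscriptResponse, that is, response["output"]["results"] after JSON decoding.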

RetrieveTranscriptResponse

Name Type Description Required
format string Speechmatics JSON transcript format version number. Yes
job JobInfo Yes
output RecognitionOutput Yes
