Overview

What's New

The following features are now fully supported using the V2 API:

  • Custom Dictionary
  • Channel Diarization

What's Changed

New hostname

Use the hostname asr.api.speechmatics.com instead of api.speechmatics.com.

Authorization headers

Access to the API requires use of an authorization token ('auth token'). In the V1 API your auth token is passed as a query string parameter on the URI. In the V2 API it is passed in an Authorization header, which is the recommended OAuth2 approach.
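
As a minimal sketch of the header-based approach, the Python snippet below builds (but does not send) a request that carries the token in an Authorization header. The Bearer scheme is an assumption based on the OAuth2 convention mentioned above; build_request and the placeholder token are hypothetical names used only for illustration.

```python
import urllib.request

# Hostname and /v2 path as described on this page.
API_BASE = "https://asr.api.speechmatics.com/v2"


def build_request(path: str, auth_token: str) -> urllib.request.Request:
    """Build a request with the token in a header rather than the query string."""
    request = urllib.request.Request(f"{API_BASE}/{path.lstrip('/')}")
    # Assumption: the token is sent using the OAuth2 "Bearer" scheme.
    request.add_header("Authorization", f"Bearer {auth_token}")
    return request


request = build_request("jobs/", "YOUR_AUTH_TOKEN")  # placeholder token
print(request.full_url)
print(request.get_header("Authorization"))
```

Keeping the token out of the URL means it is less likely to end up in access logs or browser history.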

[info] Auth tokens

Currently there is no way to generate an auth token; if you require a new auth token please contact support@speechmatics.com. In the future we will provide the ability to generate new tokens for the V2 API.

User endpoint

There is no longer a /user endpoint. Instead, jobs are referenced using a simpler /jobs endpoint, and user access is controlled using the new Authorization header. V2 API calls use a /v2 path in the URL. Requests to submit or refer to jobs now look like this:

https://asr.api.speechmatics.com/v2/jobs/

A separate authentication service that provides the equivalent capabilities of /user will be added in the future.
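
The URL scheme above can be sketched as a small Python helper. The job_url function is hypothetical, and addressing a single job by appending its ID to the /jobs/ URL is an assumption rather than something stated on this page.

```python
from typing import Optional
from urllib.parse import quote

JOBS_URL = "https://asr.api.speechmatics.com/v2/jobs/"


def job_url(job_id: Optional[str] = None) -> str:
    """Return the jobs collection URL, or the URL of one job when an ID is given."""
    if job_id is None:
        return JOBS_URL
    # Assumption: a single job is addressed by appending its ID to the collection URL.
    return JOBS_URL + quote(job_id, safe="")


print(job_url())
print(job_url("yjbmf9kqub"))
```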

Status endpoint

The /status endpoint has been removed.

Speaker diarization

Speaker diarization is now off by default.

Form fields for configuration

A JSON configuration object ('config JSON') replaces the form fields that were previously used for configuration of a job. The audio file name is still specified as a form field, but all other configuration is passed in the config JSON.
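
To illustrate the split between the remaining form field and the config JSON, here is a hedged Python sketch. The form field names (data_file, config) and the keys inside the config object are assumptions for illustration only; check the API reference for the exact names.

```python
import json

# Assumption: these key names are illustrative, not confirmed by this page.
config = {
    "type": "transcription",
    "transcription_config": {"language": "en"},
}

# Hypothetical form fields: the audio file is still a form field,
# while all other settings travel as one serialized JSON string.
form_fields = {
    "data_file": "example.wav",
    "config": json.dumps(config),
}

print(form_fields["config"])
```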

Job IDs are strings

We now use a random string job ID value to refer to jobs, rather than an incrementing integer. The Job IDs that you see will look like this: yjbmf9kqub.

Legacy JSON output format dropped

A richer JSON format (json-v2) is now used which provides support for new features. Plain text output (txt) is still available. In the JSON transcript output you will see the following:

"format": "2.4"

ISO 8601 timestamps

Timestamps are now represented in ISO 8601 format, for example: 2018-10-02T13:10:25Z. Coordinated Universal Time (UTC) is used (indicated by the Z suffix).
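
A timestamp in this form can be parsed into a timezone-aware value. The sketch below uses Python's standard library and assumes timestamps always have whole seconds and the Z suffix, as in the example above; parse_timestamp is a hypothetical helper name.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp such as 2018-10-02T13:10:25Z into an aware datetime."""
    # Assumption: no fractional seconds and a literal "Z" suffix.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


created = parse_timestamp("2018-10-02T13:10:25Z")
print(created.isoformat())  # 2018-10-02T13:10:25+00:00
```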

Metadata

The meta form parameter is replaced with the tracking element in config JSON. This supports a title, list of tags and a customer-defined JSON object. You can use this information to track jobs through your workflow.
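
A tracking element might be assembled as in the following Python sketch. The key names inside tracking (title, tags, details) are assumptions derived from the description above, and build_tracking is a hypothetical helper.

```python
import json
from typing import Any, Dict, List


def build_tracking(title: str, tags: List[str], details: Dict[str, Any]) -> Dict[str, Any]:
    """Build a tracking element: a title, a list of tags and a customer-defined object."""
    # Assumption: key names are illustrative; consult the API reference for the real ones.
    return {"tracking": {"title": title, "tags": list(tags), "details": dict(details)}}


config_fragment = build_tracking(
    title="Weekly stand-up",
    tags=["meetings", "internal"],
    details={"ticket": 1234, "owner": "example-team"},
)
print(json.dumps(config_fragment, indent=2))
```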

Egress IP addresses (for whitelisting)

You may want to whitelist Speechmatics SaaS for the notification callback service to prevent misuse of your endpoints. Currently, callbacks can be made from the following addresses:

40.74.41.91
52.236.157.154
40.74.37.0
52.142.116.223
52.155.88.26
52.142.90.149

The following addresses were previously in use, but are in the process of being deprecated:

52.236.176.166
40.85.99.235
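
One way to apply these lists is to check the source address of an incoming callback before processing it. The Python sketch below uses only the addresses listed above; is_speechmatics_callback is a hypothetical helper, and how you obtain the remote address depends on your web framework and any proxies in front of it.

```python
import ipaddress

CURRENT_CALLBACK_IPS = frozenset(
    ipaddress.ip_address(ip)
    for ip in (
        "40.74.41.91", "52.236.157.154", "40.74.37.0",
        "52.142.116.223", "52.155.88.26", "52.142.90.149",
    )
)
# Addresses that are in the process of being deprecated.
DEPRECATED_CALLBACK_IPS = frozenset(
    ipaddress.ip_address(ip) for ip in ("52.236.176.166", "40.85.99.235")
)


def is_speechmatics_callback(remote_addr: str, include_deprecated: bool = True) -> bool:
    """Return True if remote_addr is one of the listed callback source addresses."""
    try:
        address = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # not a valid IP address at all
    if address in CURRENT_CALLBACK_IPS:
        return True
    return include_deprecated and address in DEPRECATED_CALLBACK_IPS


print(is_speechmatics_callback("40.74.41.91"))
print(is_speechmatics_callback("10.0.0.1"))
```

Because the lists can change, it is safer to load them from configuration than to hard-code them as done here.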

Language Support

The Speechmatics V2 API now supports the following languages:

  • English (en)
  • German (de)
  • Spanish (es)
  • French (fr)
  • Italian (it)
  • Dutch (nl)
  • Portuguese (pt)
  • Japanese (ja)
  • Korean (ko)
  • Danish (da)
  • Polish (pl)
  • Catalan (ca)
  • Hindi (hi)
  • Russian (ru)
  • Swedish (sv)
  • Bulgarian (bg)
  • Slovenian (sl)
  • Czech (cs)
  • Greek (el)
  • Finnish (fi)
  • Hungarian (hu)
  • Croatian (hr)
  • Lithuanian (lt)
  • Latvian (lv)
  • Romanian (ro)
  • Slovakian (sk)
  • Mandarin (cmn)
  • Norwegian (no)
  • Arabic (ar)
  • Turkish (tr)
  • Malay (ms)

Note: it is not our intention to update the legacy V1 SaaS with new language models; in order to ensure you get access to the newest features and most accurate transcription results we recommend that you use the V2 API.
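
A client can validate a language code against the list above before submitting a job. This Python sketch simply transcribes that list; require_supported_language is a hypothetical helper, and the list will go out of date as languages are added.

```python
SUPPORTED_LANGUAGES = {
    "en": "English", "de": "German", "es": "Spanish", "fr": "French",
    "it": "Italian", "nl": "Dutch", "pt": "Portuguese", "ja": "Japanese",
    "ko": "Korean", "da": "Danish", "pl": "Polish", "ca": "Catalan",
    "hi": "Hindi", "ru": "Russian", "sv": "Swedish", "bg": "Bulgarian",
    "sl": "Slovenian", "cs": "Czech", "el": "Greek", "fi": "Finnish",
    "hu": "Hungarian", "hr": "Croatian", "lt": "Lithuanian", "lv": "Latvian",
    "ro": "Romanian", "sk": "Slovakian", "cmn": "Mandarin", "no": "Norwegian",
    "ar": "Arabic", "tr": "Turkish", "ms": "Malay",
}


def require_supported_language(code: str) -> str:
    """Return the normalized language code, or raise ValueError if it is not listed."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Language code {code!r} is not in the supported list")
    return normalized


print(require_supported_language("EN"))
```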

Current Limitations

Email Notifications

There is currently no support for email notification of job completion using the V2 API. However, there is full support for notifications using webhooks.

Alignment

Alignment jobs are not supported by the V2 API. If you want to submit alignment jobs then you should continue to do so using the V1 API.

Rate Limiting

Unless agreed otherwise with Speechmatics, the following behaviour will be considered acceptable use of the Cloud Services ASR:

  • The Customer shall limit the rate of submission of files to a maximum of 2 jobs per second, with a maximum of 100 jobs in progress at any one time.
  • The Customer shall limit the rate of polling for the status of submitted jobs to a maximum of 20 queries per second (across all jobs).

Speechmatics reserves the right to change the rate limits at any time in order to ensure continuity of service for all customers of the Cloud. If you believe your use case needs increased limits, please contact support@speechmatics.com.
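
A client can stay within these limits by spacing out its own calls. The Python sketch below is one simple approach (a minimum-interval throttle), not an official client; MinIntervalThrottle and can_submit are hypothetical names, and the numeric limits are those quoted above.

```python
import time
from typing import Callable


class MinIntervalThrottle:
    """Space calls at least 1/max_per_second seconds apart."""

    def __init__(
        self,
        max_per_second: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = float("-inf")  # the first call is never delayed

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Reserve the next slot one interval after this call's slot.
        self._next_allowed = max(now, self._next_allowed) + self._interval
        return delay


MAX_JOBS_IN_PROGRESS = 100
submit_throttle = MinIntervalThrottle(max_per_second=2)   # job submissions
poll_throttle = MinIntervalThrottle(max_per_second=20)    # status queries, all jobs


def can_submit(jobs_in_progress: int) -> bool:
    """Return True if another job may be submitted without exceeding the cap."""
    return jobs_in_progress < MAX_JOBS_IN_PROGRESS
```

Call submit_throttle.wait() before each submission and poll_throttle.wait() before each status query; sharing one poll throttle across all jobs matches the "across all jobs" wording of the limit.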
