Transcription and translation of audio responses

Last updated: 14 Jun 2026

KoboToolbox’s natural language processing tools help you collect, manage, and analyze qualitative data more effectively. These tools include automatic speech-to-text transcription and machine translation, which can prepare audio responses for automated qualitative analysis.

This article covers how to transcribe audio responses and translate transcripts, including supported languages and usage limits for automatic options.

Note: Automatic transcription and translation are not available for all languages. For languages without automatic support, only manual transcription and translation are possible.

To use KoboToolbox’s transcription and translation features, start by collecting audio responses in your form using the Audio question type or background audio recordings. After you transcribe and translate audio responses, the transcript for each audio file and all of its translations are added as new columns in the data table and can be downloaded alongside your survey data.
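Because transcripts are downloaded as ordinary columns, you can check a downloaded file for audio responses that still need a transcript. The following is a minimal sketch, not an official KoboToolbox tool. It assumes a comma-delimited CSV export, and the column names (interview_audio and its transcript column) are hypothetical: replace them with the names in the header row of your own export, and adjust the delimiter if your export uses a different one.

```python
import csv
import io

# Hypothetical column names: check the header row of your own export.
AUDIO_COLUMN = "interview_audio"
TRANSCRIPT_COLUMN = "interview_audio - transcript (en)"


def rows_missing_transcript(csv_file, audio_column, transcript_column, delimiter=","):
    """Return rows that have an audio file but an empty or missing transcript."""
    reader = csv.DictReader(csv_file, delimiter=delimiter)
    missing = []
    for row in reader:
        has_audio = bool((row.get(audio_column) or "").strip())
        has_transcript = bool((row.get(transcript_column) or "").strip())
        if has_audio and not has_transcript:
            missing.append(row)
    return missing


# Made-up sample data standing in for a downloaded export.
SAMPLE_EXPORT = """\
_id,interview_audio,interview_audio - transcript (en)
1,a1.m4a,We fetch water every morning.
2,a2.m4a,
3,,
"""

pending = rows_missing_transcript(
    io.StringIO(SAMPLE_EXPORT), AUDIO_COLUMN, TRANSCRIPT_COLUMN
)
print([row["_id"] for row in pending])  # prints ['2']
```

To run this against a real file, pass an open file object instead of the sample string, for example open("export.csv", newline="", encoding="utf-8").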

Adding audio transcripts

To start transcribing your audio responses:

  1. Open your project and navigate to DATA > Table.

  2. Click the Open button next to the audio response you would like to transcribe.

  3. In the TRANSCRIPT tab, click begin.

    • Select the original language of the audio file.

    • If available, select the automatic option. The manual option lets you transcribe the audio recording yourself in any language.

    • Click create transcript to begin the automatic transcription.

  4. Once the transcript is complete, you can edit it manually. You can play the audio recording in the top right corner to help check the accuracy of the transcript.

    • After editing the transcript, click the Save button to ensure your work is safely stored.

  5. When complete, either click DONE to exit, navigate to the next submission by clicking the arrows next to the DONE button, or proceed to the TRANSLATIONS tab.

    • If you click DONE, you will be taken back to the data table view, where a new column containing the transcript will have been added.

Note: Automatically generated transcripts and translations must be saved to prevent data loss. Navigating away from the page without saving will result in losing the data.

Adding translations

Once you have a completed transcript for your audio response, you can add translations into multiple languages:

  1. Proceed to the TRANSLATIONS tab.

    • The translation option is only available once a transcript has been completed.

  2. Click begin and choose the language of the translation.

    • If available, select automatic for machine translation. The manual option lets you translate the transcript yourself into any language.

    • Click create translation to begin the automatic translation.

  3. Once the translation is complete, you can edit it manually. The original transcript appears on the right of the screen, and the original audio appears underneath.

    • After editing the translation, click the Save button to ensure your work is safely stored.

  4. When the translation is complete, you can add another translation by clicking new translation, move to the next submission by clicking the arrows next to the item number in the top right corner, or click DONE to navigate back to the data table.

Note: Each audio file can have only one transcript, but each transcript can have multiple translations.

Language list

KoboToolbox’s natural language processing features integrate automated speech recognition (ASR) and machine translation (MT) capabilities provided by Google Cloud Compute, which currently offers automatic transcription in 80 languages (with 145 regional variants) and automatic translation in 129 languages.

For manual transcription or translation, you can select from approximately 7,000 languages based on the ISO 639-3 comprehensive list, maintained by SIL International (filtered for “living languages”). If a language supports ASR or MT, you can choose between manual and automatic methods. For other languages, only the manual method is available.

For a full list of supported languages, see Languages supported for automatic transcription and translation.

If you cannot find a language in the list, consider alternative spellings or names. All language names are currently listed using their English names and spelling (e.g., Spanish instead of Español). For languages with fewer speakers, there might be alternative names. For example, the Bura language in Northern Nigeria is listed as Bura-Pabir but is also known as Bourrah or Babir.

Note: When manually transcribing audio responses, it is important to select the correct language. If the manual transcript does not match the chosen language or region, automatic translations based on that transcript may be inaccurate.

Usage limits for automatic transcription and translation

Community Plan users can use up to 10 minutes of automatic speech-to-text transcription per month and up to 6,000 characters of automatic transcript translation per month.

If you need more transcription or translation capacity, you can upgrade to a plan with a higher quota or purchase a Natural Language Processing (NLP) Package add-on. Add-ons are available based on your data processing needs, starting at $9.95 for 100 additional transcription minutes and 60,000 additional translation characters. You can always continue transcribing and translating audio responses manually with no usage limit.
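To plan ahead, you can estimate whether a batch of recordings fits within these limits. The sketch below is illustrative only: the function name and its inputs are hypothetical, it assumes that each target language counts separately against the character quota, and it does not account for how usage is actually metered (for example, any rounding per file). Treat the result as a rough estimate.

```python
# Community Plan monthly limits, as stated in this article.
FREE_TRANSCRIPTION_MINUTES = 10
FREE_TRANSLATION_CHARACTERS = 6_000

# Smallest add-on described in this article.
ADDON_TRANSCRIPTION_MINUTES = 100
ADDON_TRANSLATION_CHARACTERS = 60_000


def estimate_usage(audio_seconds, transcript_characters, target_languages=1):
    """Estimate monthly usage and any overage beyond the Community Plan limits.

    Assumption: each target language is counted separately against the
    translation character quota.
    """
    minutes_needed = sum(audio_seconds) / 60
    characters_needed = sum(transcript_characters) * target_languages
    extra_minutes = max(0.0, minutes_needed - FREE_TRANSCRIPTION_MINUTES)
    extra_characters = max(0, characters_needed - FREE_TRANSLATION_CHARACTERS)
    return {
        "minutes_needed": minutes_needed,
        "characters_needed": characters_needed,
        "extra_minutes": extra_minutes,
        "extra_characters": extra_characters,
        "within_free_limits": extra_minutes == 0 and extra_characters == 0,
        "fits_smallest_addon": (
            extra_minutes <= ADDON_TRANSCRIPTION_MINUTES
            and extra_characters <= ADDON_TRANSLATION_CHARACTERS
        ),
    }


# Three recordings (5, 7, and 3 minutes), each translated into two languages.
estimate = estimate_usage(
    audio_seconds=[300, 420, 180],
    transcript_characters=[1200, 2500, 900],
    target_languages=2,
)
for key, value in estimate.items():
    print(f"{key}: {value}")
```

In this example the recordings total 15 minutes and the translations total 9,200 characters, so the batch exceeds the Community Plan limits by 5 minutes and 3,200 characters, which is within the smallest add-on.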

Troubleshooting

Translation not loading

Sometimes, the second translation may get stuck with a loading icon. If this happens, refresh the page, and the translation should appear. This is an issue we are working to fix.