Last updated: 28 Aug 2025
KoboToolbox’s natural language processing tools help you collect, manage, and analyze qualitative data more effectively. These tools include automatic speech-to-text transcription and machine translation, with automated qualitative analysis coming soon. The original transcript for your audio files and all translated text are added as new data columns in the data table and can be downloaded alongside your survey data.
To use these features, first collect audio responses in your form using the Audio question type. These features currently work only on audio responses, not on background audio recordings.
Note: Automatic transcription and translation are not available for all languages. For languages without automatic support, only manual transcription and translation are possible.
To start transcribing your audio responses:
Open your project and navigate to DATA > Table.
Click the Open button next to the audio response you would like to transcribe.
In the TRANSCRIPT tab, click begin.
Select the original language of the audio file and the automatic option (the manual option will allow you to manually transcribe the audio recording).
Click create transcript to begin the automatic transcription.
Once the transcript is complete, you can edit it manually. You can play the audio recording in the top right corner to help check the accuracy of the transcript.
After editing the transcript, click the Save button to ensure your work is safely stored.
When complete, either click DONE to exit, navigate to the next submission by clicking the arrows next to the DONE button, or proceed to the TRANSLATIONS tab.
If you click DONE, you will be taken back to the data table view, where a new column containing the transcript will have been added.
Note: Automatically generated transcripts and translations must be saved to prevent data loss. If you navigate away from the page without saving, the generated text is lost.
Once you have a completed transcript for your audio response, you can add translations into multiple languages:
Proceed to the TRANSLATIONS tab.
The translation option is only available once a transcript has been completed.
Click begin and choose the language of the translation.
Click automatic for machine translation (the manual option will allow you to manually translate the transcript).
Click create translation to begin the automatic translation.
Once the translation is complete, you can edit it manually. The original transcript appears on the right of the screen, and the original audio appears underneath.
After editing the translation, click the Save button to ensure your work is safely stored.
When the translation is complete, you can add another translation by clicking new translation, move to the next submission by clicking the arrows next to the item number in the top right corner, or click DONE to navigate back to the data table.
Note: Each audio file can have only one transcript, but each transcript can have multiple translations.
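The relationship described in this note can be pictured as a small data model. The following is a minimal sketch in Python, not KoboToolbox's actual implementation: the class name AudioResponse, its fields, and its methods are hypothetical and chosen only for illustration. It also assumes that saving a transcript again replaces the existing one.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class AudioResponse:
    """Hypothetical model of one audio response (not a KoboToolbox API)."""

    filename: str
    transcript: Optional[str] = None  # at most one transcript per audio file
    transcript_language: Optional[str] = None
    # Maps a target language name to the translated text.
    translations: Dict[str, str] = field(default_factory=dict)

    def set_transcript(self, language: str, text: str) -> None:
        # Assumption: saving again overwrites the single transcript
        # instead of creating a second one.
        self.transcript_language = language
        self.transcript = text

    def add_translation(self, language: str, text: str) -> None:
        # A translation can only be added once a transcript exists.
        if self.transcript is None:
            raise ValueError("A transcript is required before adding translations")
        self.translations[language] = text
```

In this sketch, calling add_translation before set_transcript raises an error, which mirrors the TRANSLATIONS tab only becoming usable once a transcript has been completed.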
These natural language processing features integrate automated speech recognition (ASR) and machine translation (MT) capabilities provided by Google Cloud, which currently offers automatic transcription in 72 languages (with 138 regional variants) and automatic translation in 106 languages. For manual transcription or translation, you can select from approximately 7,000 languages based on the ISO 639-3 comprehensive list, maintained by SIL International (filtered for “living languages”). If a language supports ASR or MT, you can choose between manual and automatic methods. For other languages, only the manual method is available.
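The rule for which methods are offered can be summarized in a few lines. This is a minimal sketch in Python under stated assumptions: the function name available_methods and the tiny example set of languages are hypothetical placeholders, not the real list of supported languages or any KoboToolbox function.

```python
from typing import List, Set

# Placeholder set for illustration only; the real list of languages with
# automatic support is much longer and is not reproduced here.
EXAMPLE_AUTOMATIC_LANGUAGES: Set[str] = {"English", "Spanish"}


def available_methods(language: str, automatic_languages: Set[str]) -> List[str]:
    """Return the methods offered for a language.

    Manual is always offered; automatic is offered only when the
    language has ASR or MT support.
    """
    if language in automatic_languages:
        return ["manual", "automatic"]
    return ["manual"]


print(available_methods("Spanish", EXAMPLE_AUTOMATIC_LANGUAGES))
print(available_methods("Bura-Pabir", EXAMPLE_AUTOMATIC_LANGUAGES))
```

With the placeholder set above, the first call returns both methods and the second returns only the manual method.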
If you cannot find a language in the list, consider alternative spellings or names. All language names are currently listed using their English names and spelling (e.g., Spanish instead of Español). For languages with fewer speakers, there might be alternative names. For example, the Bura language in Northern Nigeria is listed as Bura-Pabir but is also known as Bourrah or Babir.
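The idea of searching under alternative names can be illustrated with a short lookup. This is a minimal sketch in Python, assuming a hypothetical table that maps each listed English name to its alternative names; the table LANGUAGE_ALIASES and the function find_listed_name are illustrative only and do not reflect how the language selector is implemented.

```python
from typing import Dict, List, Optional

# Hypothetical excerpt: listed English name -> alternative names or spellings.
LANGUAGE_ALIASES: Dict[str, List[str]] = {
    "Bura-Pabir": ["Bura", "Bourrah", "Babir"],
    "Spanish": ["Español", "Castilian"],
}


def find_listed_name(query: str) -> Optional[str]:
    """Return the listed name matching the query, or None if nothing matches."""
    normalized = query.strip().casefold()  # ignore case and surrounding spaces
    for listed_name, aliases in LANGUAGE_ALIASES.items():
        candidates = [listed_name] + aliases
        if any(normalized == name.casefold() for name in candidates):
            return listed_name
    return None


print(find_listed_name("bourrah"))
print(find_listed_name("Español"))
```

Here, searching for "bourrah" or "Español" resolves to the listed names "Bura-Pabir" and "Spanish", while a name that is in neither column returns None.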
Note: When manually transcribing audio responses, it is important to select the correct language. If the transcript is not actually written in the selected language or regional variant, automatic translations generated from it may be inaccurate.
KoboToolbox is maintained by Kobo Inc.