Last updated: 15 Jun 2026
Many data quality problems do not begin during analysis, but during data collection. Decisions made when building a form, such as how questions are structured, how option choices are named, and how missing data is handled, can affect how much cleaning and preparation is required later.
KoboToolbox includes several tools that support high-quality data collection and help prepare your data for analysis.
This article covers recommendations for designing forms that produce cleaner, more consistent data, from using form logic and calculations to planning question and choice names, and downloading your XLSForm as a data dictionary.
One of the most effective ways to improve your analysis is to prevent errors during data collection. KoboToolbox includes form logic features that can help you collect more accurate and consistent responses.
Validation criteria help ensure that respondents provide valid answers. For example, you can use validation criteria to:
Restrict age to realistic values
Prevent dates of birth from being entered in the future
Require phone numbers to follow a specific format
Validation criteria are especially useful for questions where answers must fall within a specific range or follow a predictable format.
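The three examples above can also be expressed as checks on data that has already been collected, which may be useful for auditing older submissions. The following is a minimal Python sketch, not part of KoboToolbox: the field names (age, date_of_birth, phone), the 0 to 120 age range, and the 10-digit phone format are assumptions chosen for illustration.

```python
import re
from datetime import date

# Assumed rules mirroring the three examples above. In XLSForm, similar
# checks are commonly written in the constraint column, for example:
#   age:           . >= 0 and . <= 120
#   date of birth: . <= today()
#   phone:         regex(., '^[0-9]{10}$')
PHONE_PATTERN = re.compile(r"[0-9]{10}")  # assumed 10-digit format


def check_record(record, today=None):
    """Return a list of problems found in one submission (a dict)."""
    today = today or date.today()
    problems = []
    if not 0 <= record["age"] <= 120:
        problems.append("age out of range")
    if record["date_of_birth"] > today:
        problems.append("date of birth in the future")
    if not PHONE_PATTERN.fullmatch(record["phone"]):
        problems.append("phone format invalid")
    return problems


sample = {"age": 134, "date_of_birth": date(2001, 5, 17), "phone": "0712345678"}
print(check_record(sample, today=date(2026, 6, 15)))  # ['age out of range']
```

Checks like these are a fallback. Applying the same rules as validation criteria in the form stops invalid values from being submitted in the first place.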
Skip logic helps ensure that respondents only see questions that are relevant to them. For example, a respondent who reports never being pregnant should not be asked questions about previous pregnancies.
Asking the right questions of the right people improves data quality and reduces the burden on respondents. It also reduces the amount of cleaning or data removal required later.
To learn more about form logic, see Introduction to form logic in the Formbuilder.
The way you define question names and choice names affects how easy your data is to work with after export. Question names become column names in your exported dataset, while choice names represent response values for select questions.
For best results, follow the recommendations below.
Use question and choice names that are short, informative, unique, and free of spaces and special characters (e.g., age, sex, or district).
Clear names make your exported data easier to read, process, and analyze in external tools.
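One way to apply this recommendation is to review a list of planned names before building the form. The sketch below is illustrative only: the naming pattern (letters, digits, and underscores, starting with a letter) and the 30-character limit are assumed conventions, not rules enforced by KoboToolbox.

```python
import re

# Assumed convention: letters, digits and underscores, starting with a letter.
# The length limit is an arbitrary choice for this example.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
MAX_LENGTH = 30


def find_name_problems(names):
    """Map each problematic name to the reasons it was flagged."""
    problems = {}
    seen = set()
    for name in names:
        reasons = []
        if not NAME_PATTERN.fullmatch(name):
            reasons.append("contains spaces or special characters")
        if len(name) > MAX_LENGTH:
            reasons.append("too long")
        if name in seen:
            reasons.append("duplicate")
        seen.add(name)
        if reasons:
            problems[name] = reasons
    return problems


print(find_name_problems(["age", "sex", "district", "date of birth", "age"]))
# {'date of birth': ['contains spaces or special characters'], 'age': ['duplicate']}
```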
Keep question and choice names consistent across forms whenever possible. If multiple surveys collect the same information, using the same variable names makes it easier to combine and compare datasets.
For example, if one form uses district and another form uses location_district for the same type of information, you may need to rename variables before combining the datasets.
Similarly, using standard choice names makes analysis easier and reduces the need for recoding data later.
For example, you can use 0 for “No” and 1 for “Yes” across all relevant questions in your form.
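The renaming step described above might look like the following after export. This is a sketch under stated assumptions: the rows stand in for exported submissions, and the consent column and the RENAME_FORM_B mapping are hypothetical.

```python
# Hypothetical mapping from the second form's column name to the shared name.
RENAME_FORM_B = {"location_district": "district"}


def harmonize(rows, rename):
    """Return copies of the rows with columns renamed according to `rename`."""
    return [{rename.get(key, key): value for key, value in row.items()} for row in rows]


# Stand-ins for exported submissions from two forms.
form_a = [{"district": "north", "consent": "1"}]
form_b = [{"location_district": "south", "consent": "0"}]

combined = form_a + harmonize(form_b, RENAME_FORM_B)
print(combined)
# [{'district': 'north', 'consent': '1'}, {'district': 'south', 'consent': '0'}]

# With 0/1 choice names, counting "Yes" answers is a simple sum.
print(sum(int(row["consent"]) for row in combined))  # 1
```

If both forms had used district from the start, the harmonize step would not be needed.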
Once data collection has started, avoid modifying question names or choice names. Changing them can create inconsistencies between existing and new submissions.
If you need to update labels shown to respondents, edit the question label but keep the question name the same whenever possible.
For more information, see Best practices for deploying and redeploying forms.
When forms contain related variables, consistent naming conventions can help organize your data. Consider using a prefix or suffix for questions or choices related to the same topic or format.
For example:
Use household_ for household-related questions, such as household_size or household_income
Use _other for “Other, specify” questions, such as income_source_other
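A consistent prefix or suffix also makes it easy to select related columns after export. A brief sketch, using a hypothetical list of column names:

```python
def columns_with_prefix(columns, prefix):
    """Return the column names that start with the given prefix."""
    return [name for name in columns if name.startswith(prefix)]


def columns_with_suffix(columns, suffix):
    """Return the column names that end with the given suffix."""
    return [name for name in columns if name.endswith(suffix)]


# Hypothetical column names from an exported dataset.
columns = ["household_size", "household_income", "income_source", "income_source_other", "age"]

print(columns_with_prefix(columns, "household_"))  # ['household_size', 'household_income']
print(columns_with_suffix(columns, "_other"))      # ['income_source_other']
```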
Note: Question and choice names remain the same across form languages. This makes multilingual data easier to analyze after export.
When designing a form, think about the analyses you may want to perform later. Planning ahead can reduce the amount of processing required after export.
Calculations can create variables that would otherwise require additional processing after export. For example, you can use calculations to create:
Respondent age based on date of birth
Household size totals
Body mass index (BMI)
Scores or indicators based on several responses
Creating these variables during data collection can save time and improve consistency across analyses.
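For comparison, this is roughly what deriving two of these variables after export could involve. The Python sketch below is illustrative: it assumes weight is recorded in kilograms and height in metres, and the function names are hypothetical.

```python
from datetime import date


def age_in_years(date_of_birth, on_date):
    """Completed years between date_of_birth and on_date."""
    had_birthday = (on_date.month, on_date.day) >= (date_of_birth.month, date_of_birth.day)
    return on_date.year - date_of_birth.year - (0 if had_birthday else 1)


def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


print(age_in_years(date(1990, 8, 20), date(2026, 6, 15)))  # 35
print(round(bmi(70, 1.75), 1))                             # 22.9
```

When the same formulas are defined once as calculations in the form, every export already contains the result, and analysts do not need to reimplement the logic separately.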
To learn more about adding calculations, see Adding calculations in the Formbuilder.
When analyzing data, it is important to know why information is missing. A response may be missing because the question was skipped or not applicable, or because the information was unavailable, not remembered, or deliberately withheld.
A common best practice is to make questions required while providing explicit response options, such as:
Prefer not to say
Don’t know
Don’t remember
Not applicable
This helps reduce unexplained missing data and makes results easier to interpret during analysis.
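When the reason for missing data is recorded as an explicit choice, it can be tabulated directly during analysis. The sketch below assumes hypothetical choice names for the options listed above and a hypothetical income_source question.

```python
from collections import Counter

# Hypothetical choice names for the explicit options listed above.
MISSING_CODES = {"prefer_not_to_say", "dont_know", "dont_remember", "not_applicable"}


def summarize_missing(values):
    """Split responses into substantive answers and counts per missing reason."""
    reasons = Counter(value for value in values if value in MISSING_CODES)
    answers = [value for value in values if value not in MISSING_CODES]
    return answers, dict(reasons)


# Hypothetical responses to an income_source question.
responses = ["farming", "dont_know", "trade", "prefer_not_to_say", "dont_know"]
answers, reasons = summarize_missing(responses)
print(answers)  # ['farming', 'trade']
print(reasons)  # {'dont_know': 2, 'prefer_not_to_say': 1}
```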
Open text responses are often difficult to analyze. When possible, use structured question types instead:
Use multiple choice questions when responses can be standardized into predefined options.
Use date questions when collecting calendar dates.
Use GPS questions or cascading select questions when collecting location information.
Reserve text questions for information that cannot reasonably be standardized.
The XLSForm behind your KoboToolbox form can serve as a data dictionary. It documents your form structure, question names, choice names, labels, translations, question types, and form logic.
To download your form’s XLSForm:
Open your project.
Go to FORM.
Click More actions.
Click Download XLS.
Each row in the survey tab represents a question in your form, and each column provides information about that question, such as the question name, label, type, translations, and relevant form logic. For select questions, option choices are listed in the choices tab.
Keeping your XLSForm with your exported dataset can make it easier to interpret variables, understand response values, and document your analysis workflow.
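As an illustration of using the XLSForm as a data dictionary, the sketch below builds a lookup from question names to labels and choice labels. It assumes the survey and choices tabs have been saved as CSV text and contain only the type, name, label, and list_name columns. The sample questions are hypothetical, and a real XLSForm usually has more columns.

```python
import csv
import io

# Hypothetical contents of the survey and choices tabs, saved as CSV.
SURVEY_CSV = """type,name,label
integer,age,How old are you?
select_one yes_no,consent,Do you agree to take part?
"""
CHOICES_CSV = """list_name,name,label
yes_no,1,Yes
yes_no,0,No
"""


def build_dictionary(survey_csv, choices_csv):
    """Map each question name to its label and, for selects, its choice labels."""
    choices = {}
    for row in csv.DictReader(io.StringIO(choices_csv)):
        choices.setdefault(row["list_name"], {})[row["name"]] = row["label"]

    dictionary = {}
    for row in csv.DictReader(io.StringIO(survey_csv)):
        parts = row["type"].split()
        is_select = len(parts) > 1 and parts[0] in ("select_one", "select_multiple")
        list_name = parts[1] if is_select else None
        dictionary[row["name"]] = {
            "label": row["label"],
            "choices": choices.get(list_name, {}),
        }
    return dictionary


data_dictionary = build_dictionary(SURVEY_CSV, CHOICES_CSV)
print(data_dictionary["consent"]["choices"])  # {'1': 'Yes', '0': 'No'}
```

A lookup like this can be used to translate stored values such as 1 and 0 back into the labels respondents saw.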
To learn more about XLSForm, see Getting started with XLSForm and Using XLSForm with KoboToolbox.
KoboToolbox is maintained by Kobo Inc.