Using KoBoSync

Last Updated: Oct 09, 2014 08:14PM EDT

KoBoSync is a small application used to aggregate individual survey records into a simple data file. KoBoSync is not needed when using KoBoToolbox either offline or online, as this aggregation is built in and uses an advanced database system. KoBoSync can be useful if data is collected in an environment without Internet access and KoBoToolbox offline hasn't yet been downloaded to and installed on a local computer.

The file created by KoBoSync is in the CSV (Comma Separated Value) format, which can be imported into any kind of spreadsheet (such as Excel) or analytical software packages (such as R, SPSS, or SAS).

Installing and Running KoBoSync

KoBoSync can be downloaded here. To run it, you need to first install Java, a common software platform. You probably already have Java installed - check here if that's the case.

After downloading KoBoSync, just open the .jar file by double-clicking it.

Potential problems with opening KoBoSync: Depending on what other software you have installed, the .jar file type may not be correctly associated with Java yet. For example, it might open as a zipped file archive in another program. To fix this you need to associate the .jar file type with Java. Help can be found here for Windows users and for Mac users.

Using KoBoSync

Mounting devices as external drives no longer possible with new Android devices

Since version 4.0, Android has dropped the ability to mount devices as if they were removable storage, such as USB keys. While you can still connect and access your device by USB cable, it won't be assigned a drive letter, and as a result it won't appear in the list of drives and directories KoBoSync can access. This means that the first 'Browse' button and the 'Aggregate' button are no longer needed. The following instructions assume that you are using Android 4.0 or newer.

Step 1: Copy Survey Data to Computer
  1. Attach your device to your computer by USB cable. Make sure the device is powered on and you've entered your code or pattern on the lockscreen (if any).
  2. Go to Computer (Windows) and open your Android device from the list shown there. Mac users should follow these instructions from Android on how to copy files.
  3. On your device find the folder named odk and open it.
  4. Copy-paste the instances folder to a new folder on your computer, for example a folder called Survey data 2014.
  5. (Optional) If you have multiple devices you're planning to synchronize, rename the instances folder to a unique number or name that represents the device you copied it from, for example Tablet_01.

Repeat these steps for all of your devices until all of your survey data has been copied to your local computer. You should now have a folder that contains multiple sub-folders, one for each of your devices.
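Before moving on to Step 2, it can help to check that every device's data actually arrived. The following is a minimal Python sketch, not part of KoBoSync, that counts the XML files found under each device sub-folder. The function name count_instances_per_device is hypothetical, and it assumes the layout described above: one top-level folder (such as Survey data 2014) with one sub-folder per device.

```python
from pathlib import Path


def count_instances_per_device(survey_data_dir):
    """Return {device folder name: number of .xml files found beneath it}."""
    root = Path(survey_data_dir)
    counts = {}
    # Each direct sub-folder is assumed to hold the data copied from one device.
    for device_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # rglob searches recursively, because ODK keeps each survey in its own folder.
        counts[device_dir.name] = sum(1 for _ in device_dir.rglob("*.xml"))
    return counts
```

Calling count_instances_per_device("Survey data 2014") returns a dictionary with one entry per device folder. A count of 0 for a device suggests that its instances folder was not copied completely.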

Step 2: Convert your survey data to CSV

If you haven't already done so, open KoBoSync (see above).

  1. In the line Aggregate To: click the Browse button. Choose the folder that contains all your survey data (in the above example, this would be 'Survey data 2014').
  2. In the line Save CSV To: click the Browse button. Choose a folder where your CSV file should be saved to.
  3. Click on the Convert to CSV button.

All your survey instances have now been transcribed into a CSV file. If your Android devices were used to collect data with more than one form, there will be one CSV file corresponding to each form.

You can repeat this process as you copy additional data from more devices to your computer - the records will be appended to the CSV file without interfering with the existing records.

How KoBoSync Works Behind the Scenes

KoBoSync copies all completed forms from the directory you point it to (the XML source directory) into a directory of your choosing (the XML storage directory), so that files from multiple phones are gathered in a single location on the computer on which the app is run. The search is recursive, so it doesn't matter that ODK puts each form into a separate folder; you just point KoBoSync at the top-level folder. The individual surveys are renamed based on the survey instance name, the DeviceID of the phone used to collect the data, and the time at which the survey was started, to the millisecond. This combination is used as a unique key throughout the process of backing up and transcribing the surveys into CSV, and allows surveys from multiple phones to be collected into one location without losing or overwriting existing data.
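The idea of a recursive sync with a unique key can be illustrated with a short Python sketch. This is not KoBoSync's actual code (KoBoSync is a Java application), and several details are assumptions made for illustration: the function names unique_key and sync_instances are hypothetical, the instance name is taken from the XML root element, and the device ID and start time are assumed to be stored in elements named deviceid and start.

```python
import re
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def local_name(tag):
    """Strip an XML namespace such as '{uri}name' down to 'name'."""
    return tag.rsplit("}", 1)[-1]


def unique_key(xml_path, device_field="deviceid", start_field="start"):
    """Build 'instance_device_start' from one completed form.

    The element names 'deviceid' and 'start' are assumptions.
    """
    root = ET.parse(xml_path).getroot()
    values = {}
    for element in root.iter():
        # Keep the first occurrence of each element name.
        values.setdefault(local_name(element.tag), (element.text or "").strip())
    parts = [local_name(root.tag), values.get(device_field, ""), values.get(start_field, "")]
    # Replace characters that are awkward in file names, such as ':' in timestamps.
    return "_".join(re.sub(r"[^A-Za-z0-9.-]+", "-", part) for part in parts)


def sync_instances(source_dir, storage_dir):
    """Copy every XML file under source_dir into storage_dir, named by its key."""
    storage = Path(storage_dir)
    storage.mkdir(parents=True, exist_ok=True)
    copied = []
    # rglob walks all sub-folders, so only the top-level folder is needed.
    for xml_path in sorted(Path(source_dir).rglob("*.xml")):
        target = storage / (unique_key(xml_path) + ".xml")
        if not target.exists():  # never overwrite a survey that is already stored
            shutil.copy2(xml_path, target)
            copied.append(target.name)
    return copied
```

Because the file name is derived from the key rather than from the original file name, two devices that produce identically named files still end up as two separate files, and running the sync a second time copies nothing new.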

From the storage directory, KoBoSync then aggregates all the records into a single CSV file and places that file in a directory of your choosing. The application uses the combination of instance name, DeviceID, and start time to create a unique key for each record within the CSV file. It handles changes in the schema over time, so adding a question to your survey won't confuse the sync: the CSV gains a new column for the new data field, and records collected before that question was added have a null in that column.

The app looks in the XML storage directory and reads each completed form (these are stored as XML files). It reads the XML schema and writes the headers to the first line of the CSV, then writes the first record to the second line. Next it opens each additional XML form and gives each one a row in the CSV. The CSV is stored in the directory you chose for storage. XML files that were previously written to the CSV are not written again, so there is no need to clean out your XML storage directory between runs of the transcriber. If the transcriber comes across an XML form whose schema differs from the others, it extends the schema to include the new fields and inserts blank fields for those columns in all the previous records. While the headers are updated to accommodate changes to the schema, the existing data is not altered or deleted by the transcription process.
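The transcription behaviour described above (skip forms already in the CSV, add columns when the schema grows, leave older records blank in new columns) can be sketched in Python as follows. This is an illustration under stated assumptions, not KoBoSync's implementation: the names read_record, transcribe, and KEY_COLUMN are hypothetical, each stored file's name (without .xml) is assumed to be its unique key, and every leaf element of a form is treated as one question.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path

KEY_COLUMN = "key"  # hypothetical column name that holds the unique key


def read_record(xml_path):
    """Return {question name: answer} for every leaf element of one form."""
    root = ET.parse(xml_path).getroot()
    record = {}
    for element in root.iter():
        if len(element) == 0 and element is not root:  # leaf element = one question
            record[element.tag.rsplit("}", 1)[-1]] = (element.text or "").strip()
    return record


def transcribe(storage_dir, csv_path):
    """Add forms not yet in the CSV, growing the header row when needed."""
    csv_path = Path(csv_path)
    headers, rows = [KEY_COLUMN], []
    if csv_path.exists():
        with csv_path.open(newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            headers = list(reader.fieldnames or headers)
            rows = list(reader)
    seen = {row[KEY_COLUMN] for row in rows}
    for xml_path in sorted(Path(storage_dir).glob("*.xml")):
        if xml_path.stem in seen:
            continue  # already transcribed in an earlier run
        record = read_record(xml_path)
        record[KEY_COLUMN] = xml_path.stem
        for name in record:
            if name not in headers:
                headers.append(name)  # the schema grew: add a new column
        rows.append(record)
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        # restval="" leaves a blank cell where an older record lacks a newer column.
        writer = csv.DictWriter(handle, fieldnames=headers, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return headers
```

The sketch rewrites the whole file on each run so that the header row can be extended. Existing rows are read back and written out unchanged, apart from blank cells in any newly added columns.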

Note that if nobody ever answers a question with a given selection, then that column will never be created. In the above example, if valid answers included 1, 2, 3, 4, 5, 6, 7, 8, or 9, but we'd only received the two records previously described, then columns for 3, 5, 6, or 8 would never be created.

Advanced: Random Record Generator

We needed a way to test our system with lots and lots of records. The generator works from the command line: you point the application at a completed survey XML file and tell it to make random records, how many to make, and where to put them. In a second it can create hundreds of fake records. 100 is the default, but you can create more. The data consists of random strings, longs, ints, and date stamps, which are generated by inspecting the completed survey file to determine the data type for each question.

On the command line on your computer, type in the following:

java -cp KoboSync_0.93.1.jar org.oyrm.kobo.postproc.test.KoboXMLGen <XMLFile> <DestinationDirectory>

<XMLFile> should refer to a completed survey file as is stored in the odk/instances/ directory after a survey is completed. <DestinationDirectory> should refer to the output folder.


java -cp KoboSync_0.93.1.jar org.oyrm.kobo.postproc.test.KoboXMLGen ../instance/CAR_2009-09-27_17-41-53.xml ../test_instances/
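To show how type-based random generation might work in principle, here is a small Python sketch. It is not the KoboXMLGen code and makes simplifying assumptions: the names random_like and generate_records are hypothetical, whole numbers are detected with int(), date stamps are detected with datetime.fromisoformat() (which accepts only some ISO 8601 variants, depending on the Python version), and everything else is treated as a string.

```python
import copy
import random
import string
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta
from pathlib import Path


def looks_like(parser, value):
    """Return True if parser(value) succeeds without a ValueError."""
    try:
        parser(value)
        return True
    except ValueError:
        return False


def random_like(value, rng):
    """Return a random value of the same apparent type as the sample value."""
    if looks_like(int, value):
        return str(rng.randint(0, 10**6))
    if looks_like(datetime.fromisoformat, value):
        moment = datetime(2000, 1, 1) + timedelta(seconds=rng.randrange(10**9))
        return moment.isoformat()
    return "".join(rng.choice(string.ascii_letters) for _ in range(8))


def generate_records(xml_file, destination_dir, count=100, seed=None):
    """Write `count` randomized copies of one completed survey file."""
    rng = random.Random(seed)
    template = ET.parse(xml_file)
    destination = Path(destination_dir)
    destination.mkdir(parents=True, exist_ok=True)
    for index in range(count):
        tree = copy.deepcopy(template)
        root = tree.getroot()
        for element in root.iter():
            text = (element.text or "").strip()
            if len(element) == 0 and element is not root and text:
                element.text = random_like(text, rng)  # replace each answer
        target = destination / f"random_{index:05d}.xml"  # hypothetical file naming
        tree.write(str(target), encoding="utf-8", xml_declaration=True)
    return count
```

Passing a seed makes the output repeatable, which is convenient when the generated records are used to reproduce a problem.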


