Our recent work launching an AMR Surveillance System in Nepal highlighted the recurring issue that exists in standardising spreadsheet data from multiple locations. The human and animal data gathered for transmission into the Nepal One Health system was coming from over 20 different laboratories, with multiple spreadsheet formats used by the locations. Prior to this innovation, data had to be manually cleansed and reformatted by individuals within Nepal’s National Public Health Laboratory (NPHL) and Central Veterinary Laboratory (CVL). This process was extremely time consuming, so the requirement for a more automated cleansing mechanism was identified early on in the project. To address this issue, and simplify the data cleansing process, SfHF has developed a desktop application, Open Data XLS Transformer (or ODX), which can be mapped to automatically cleanse and standardise spreadsheets ahead of loading them into an analytics platform, or health information system. Spreadsheets are passed through a ‘fix cycle’, where the automatic “correction” mappings and “error” call out features are used to produce a clean, standardised spreadsheet output.
Using ODX to reformat, fix and cleanse spreadsheet data
The initial development of ODX allows for the transmission of information from multiple public health laboratory spreadsheets to DHIS2, specifically for AMR AST/DST testing. The application aims to solve the manual process required to sort data and can make both “corrections” and identify “errors” in spreadsheets uploaded to a data analytics system. It can be modified to map for spreadsheets from different origins and can be programmed to be sent to the desired analytics platform (not exclusively DHIS2). The current version of ODX has been programmed to identify relevant headings and information for data analytics.
The main features of the application are the ODX mappings, which can be set up to ‘correct’ and standardise column headings, common spelling mistakes or name variations for pathogens and antibiotics, and reformat the spreadsheet so that all columns are in the same order and header information is in the same place. ODX is also able to identify any ‘errors’ in a spreadsheet that would prevent data rows from being sent into an analytics platform, such as missing information or information entered in an unrecognised format. This cleansing process is known as a ‘fix cycle’, as the user may need to pass a spreadsheet through ODX more than once to ensure that all errors have been addressed. Additional features can be added to meet country-specific needs, such as the more unique requirement of the Nepal project for mappings to correct for variation in date format used, as some Nepal laboratories use both the Gregorian and Bikram Sambat date formats.
The following walkthrough video demonstrates how the ODX application runs, and how it has been used in Nepal: