The Australian Reference Genome Atlas: An online platform to improve data discoverability

Keeva Connolly1,5, Dr Nigel Ward1,5, Dr Jeff Christiansen1,5, Dr Kelly Scarlett2, Dr Nick dos Remedios3, Dr Kathryn Hall4, Mr Hamish Holewa3

1Australian BioCommons, St Lucia, Australia, 2Bioplatforms Australia, Macquarie Park, Australia, 3Atlas of Living Australia, CSIRO, Black Mountain, Australia, 4Atlas of Living Australia, CSIRO, Dutton Park, Australia, 5Queensland Cyber Infrastructure Foundation, St Lucia, Australia

 

High-throughput sequencing has facilitated exponential growth in the volume of DNA data being generated. Existing infrastructure has struggled to adapt to this evolving landscape, and data sources are currently fragmented and disconnected. Our discussions with life sciences researchers revealed their frustration with finding, accessing, comparing, and making use of existing genomic data.

To address this need, the Atlas of Living Australia (ALA), together with the Australian BioCommons and Bioplatforms Australia, has initiated the Australian Reference Genome Atlas (ARGA) – an NCRIS-enabled platform – to improve the discoverability of genomic data for Australian species and other taxa relevant to Australia. ARGA provides an online portal that indexes public repositories and databases, aggregates genomic data with relevant metadata, occurrence data and phenotypic trait data, and makes them available to other cloud-based analysis platforms, such as Galaxy Australia. The ARGA portal enables researchers to quickly understand what datasets are available for a species, including reference genomes, mitogenomes, SNP data and eDNA data. This content is combined with sample metadata curated using Darwin Core Archives and, leveraging the power of the ALA, with species occurrence information. Data discoverability is improved by comprehensive search and filtering capabilities, enabling the curation and selection of data according to geospatial location, genomic data type, and inclusion on species lists, such as threatened species or biosecurity pest lists. ARGA is governed by FAIR principles and delivers a novel platform that facilitates the sharing and reuse of genomic data in research across academic, industry and policy sectors.


Biography:

Keeva Connolly is a scientific business analyst for the Australian Reference Genome Atlas (ARGA) project, a joint initiative between Australian BioCommons, Bioplatforms Australia and the Atlas of Living Australia at CSIRO.