Ann McCartney – Utilising Oxford nanopore data for the genome assembly of endemic New Zealand species
Wednesday 12 February – 1:50 pm
As part of Genomics Aotearoa, a high-quality genomes project has been established to generate pipelines for the assembly of species across New Zealand. These pipelines specifically target key species that are on the verge of extinction, are treasured by Māori, are key players in the primary production industry, pose a significant threat to biosecurity within New Zealand, or have complex genomes, i.e. genomes that are abnormally large, of higher ploidy, highly repetitive or highly heterozygous. These species have been sequenced using a variety of NGS platforms, namely Illumina, Oxford Nanopore (ONT), PacBio, Chromium 10X and Hi-C.
Here, genome assembly strategies under development on NeSI will be outlined, specifically using ONT and Illumina data types, in order to highlight the impact of read depth and coverage on genome assembly quality. This study deals with the optimisation of genome assembly construction for projects with a limited budget or those confined to certain locations or sequencing platforms. It also addresses optimal assembly strategies for species with unique genome architectures. To investigate this, five species with unique genome characteristics were selected, including: Hericium novae-zealandiae, a small diploid fungus; Clitarchus hookeri, a species with a large, repetitive and highly heterozygous genome; Knightia excelsa, a plant species with a medium-sized, non-repetitive genome with low heterozygosity; and kiwifruit, another plant species with a smaller and more repetitive genome structure. A focus will also be placed on the importance of appropriate data management, transfer and sharing when working with taonga species.
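The effect of read depth can be made concrete with a back-of-the-envelope calculation: expected coverage is simply total sequenced bases divided by genome size. A minimal sketch with illustrative numbers (not figures from the talk):

```python
def expected_coverage(num_reads: int, mean_read_length_bp: float, genome_size_bp: int) -> float:
    """Expected sequencing depth: total sequenced bases / genome size."""
    return (num_reads * mean_read_length_bp) / genome_size_bp


# e.g. 500,000 ONT reads averaging 10 kb against a 100 Mb genome:
depth = expected_coverage(500_000, 10_000, 100_000_000)
print(f"{depth:.0f}x expected coverage")  # prints "50x expected coverage"
```

For a fixed sequencing budget this makes the trade-off explicit: a larger or higher-ploidy genome divides the same pool of bases into proportionally lower coverage, which is why the species above were chosen to span a range of genome sizes and architectures.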
Wallace Chase – Why so slow? Molasses-based data transfers
Wednesday 12 February – 2:50 pm
Why is my data moving so slowly? Come hear the top reasons why your data transfer is so very slow and how you can speed it up…
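One of the usual culprits is the TCP bandwidth-delay product: a single stream can never move data faster than its window size divided by the round-trip time, no matter how fast the link is. A rough sketch with illustrative numbers (not figures from the talk):

```python
def max_tcp_throughput_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-stream TCP throughput (window / RTT), in Mbit/s."""
    return (window_bytes * 8) / rtt_seconds / 1_000_000


# A legacy 64 KiB TCP window over a 150 ms trans-Pacific path:
print(f"{max_tcp_throughput_mbps(65_536, 0.150):.1f} Mbit/s")  # ~3.5 Mbit/s
```

Even on a 10 Gbit/s research link, that small window caps the stream at a few Mbit/s, which is why window tuning and parallel streams feature in most "why so slow" diagnoses.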
Laura Armstrong – Identifying, connecting and citing research with persistent identifiers
Wednesday 12 February – 3:30 pm
Increasingly, the research community, including funders and publishers, is recognising the power of ‘connected up’ research to facilitate reuse, reproducibility and transparency of research. Persistent identifiers (PIDs) are critical enablers for identifying and linking related research objects including datasets, people, grants, concepts, places, projects and publications. PID systems:
- Provide social and technical infrastructure to identify and cite a research output over time
- Enable machine readability and exchange
- Collect and make available metadata that can provide further context and connections
- Facilitate the linkage and discovery of research outputs, objects, related people and things
- Provide key tools for tracking the impact of research and researchers
Join this BoF to learn about recent developments in PID services and infrastructure with a particular focus on DOI (research data), ORCID (people and organisations), RAID (research activities and projects), IGSN (physical samples and specimens) and ROR (research organisations).
Find out how to maximise the return on your investment in PIDs through participation in national and global initiatives such as the NZ DOI consortium, Scholix and the Project FREYA PID Graph, which uses PIDs to offer researchers and research institutions a richer, more connected experience.
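Part of what makes PIDs machine readable is their predictable syntax: a DOI, for example, always consists of a registrant prefix beginning with "10." followed by a slash and a suffix, so software can recognise and resolve one automatically. A minimal sketch (the regex is an illustrative approximation, not the official DOI grammar):

```python
import re

# Loose DOI shape: "10.<registrant code>/<suffix>", e.g. 10.1000/182.
# Illustrative only; the DOI Handbook defines the authoritative syntax.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def looks_like_doi(identifier: str) -> bool:
    """Return True if the string has the general shape of a DOI."""
    return bool(DOI_PATTERN.match(identifier))


print(looks_like_doi("10.1000/182"))  # True
print(looks_like_doi("not-a-doi"))    # False
```

The same predictability is what lets services such as Scholix and the PID Graph harvest and interlink metadata across DOIs, ORCIDs and RORs at scale.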
Dinindu Senanayake – HPC for life sciences: handling the challenges posed by a domain that relies on big-data
Thursday 13 February – 2:10 pm
The advancement of sequencing technologies, proteomics, high-throughput high-content microscopy and related techniques, together with decreasing costs, is creating an avalanche of data across the many sub-domains of the life sciences. This data deluge demands an interdisciplinary approach to the associated challenges: data storage, parallel and high-performance computing solutions for data analysis, scalability, security and data integration. The ability to deliver solutions to these needs will convert highly granular, unstructured data into real scientific insights, accelerating advances in precision medical treatment based on an individual’s genetic makeup, the development of drugs with minimal side effects, species conservation programmes, and more.
New Zealand eScience Infrastructure (NeSI) is focused on delivering the tools required by researchers who might need a “huge” amount of memory to assemble a large genome, simulate the Newtonian equations of motion of biomolecules such as proteins and nucleic acids in parallel, meet ever-increasing data storage requirements (from day-to-day to sensitive data), or deploy efficient methods for end-to-end data transfers. NeSI’s partnership with Genomics Aotearoa has also been instrumental in introducing training tools such as virtual machines, and the extensive number of workshops hosted on these machines is helping beginner-level bioinformaticians and computational biologists acquire advanced skills within a short period, to be used in their search to understand the rules of life.
Steven McEachern – Sharing across the ditch: infrastructure for social science research data in Australia and New Zealand
Thursday 13 February – 2:30 pm
The social science research community has a long tradition of working collaboratively to study comparative political, social and economic problems in an international context. Australian and New Zealand researchers have made regular, long-term contributions to international research programs such as the International Social Survey Program, World Values Survey and the Comparative Study of Electoral Systems, and the results of this work are disseminated internationally.
These international collaborations highlight significant opportunities for the establishment of shared resources and infrastructure to support such programs. Social science data archives have been established in many countries to support the efforts associated with these programs. Particularly in the European Union, these now represent the major EU-funded social science infrastructures under the ESFRI program, such as the Consortium of European Social Science Data Archives (CESSDA), and the associated survey research programs of the European Social Survey (ESS) and the Survey of Health and Retirement in Europe (SHARE).
Jonathan Flutey – Micro-credentials and Research Skills Development
Thursday 13 February – 3:30 pm
The New Zealand Qualifications Authority (NZQA) has recently formalised a new micro-credential policy and piloted a series of small skills based courses that align to the NZQA credit framework.
While this has not yet gained traction in universities, Tertiary Education Organisations (TEOs) with a strong skills-based focus are finding new ways of rewarding, and recognising, learning through this new policy and framework.
The policy’s focus is not only on TEOs: NZQA has formalised a process for non-TEOs (professional groups, accreditation boards, communities) to benchmark their skills-based learning programmes for micro-credential equivalency.
This birds-of-a-feather session opens up the discussion, and possibilities, of nationally recognised credentials and development pathways for RSEs and research support staff, with a particular focus on skills-based and professional-practice assessment. Is this something our communities want? Let’s discuss!
Stephanie Guichard – Data-intensive approaches to finding and predicting research outcomes for New Zealand health research
Friday 14 February – 11:20 am
How can we use data science to measure research outcomes at scale? Can quantitative data be used at all to understand research’s “impact” in its truest sense? In this presentation, we will share how—by asking the right questions, using the right data, and understanding data science’s strengths and limitations—New Zealanders can measure their success towards achieving health research outcomes, and even forecast future success.
Using the New Zealand Health Ministry’s “New Zealand Health Research Strategy 2017-2027” report as a case study, we will first show how thoughtful strategic planning makes it possible for data scientists to answer pressing questions like, “How can we track the implementation of research into health policy?” and “How can we produce the best research that supports the well-being of all New Zealanders?”
Next, we will discuss how linked bibliometric and altmetric data sources can help analysts better understand if and how New Zealand health research has achieved strategic priorities. Using unique data from Dimensions Analytics, a linked research intelligence database, and Altmetric Explorer, which provides data for understanding the broader impacts of research, we will use large scale visualization and statistical approaches to understand the current state of New Zealand health research with regard to desired outcomes; predict future trends based on past funding and publishing activity; and offer suggestions for ways to improve the likelihood of achieving desired research outcomes in the future. Among the outcomes studied will be international and industry collaboration trends, the translation of research into innovation and public policy, and public engagement with health research.
Finally, we will offer a frank discussion on the benefits and limitations of quantitative data in measuring desired outcomes like community collaborations and whether research is improving health outcomes for Māori and disabled peoples. In some cases, leading engagement indicators like altmetrics can be used as a rough proxy for success, and are complementary to traditional program evaluation approaches. We will explore several instances where altmetric and bibliometric data succeed and where they fail.
Carina Kemp – Building an International FAIR Infrastructure for ‘Uniting’ Research Data
Friday 14 February – 1:30 pm
Over the past ~6 years, a budding community of NREN and discipline operators of synch&share stores has popped up. These operators typically run one of [ownCloud, seafile, NextCloud, PowerFolder]. Judging by site surveys presented at consecutive synch&share-focused CS3 conferences, their services have all become runaway successes – it’s not unusual for these stores to be in the PB range and to serve tens of thousands of real researchers and their real research data. The next wave of open science policy, however, tells us that data shouldn’t be locked inside a single vault – instead it needs to be interlinked, citable, free to move; in short, FAIR. The CS3 community have always been working towards enabling interlinking of the data between stores at the identity and metadata levels. An open protocol was developed to announce, accept and propagate shared volumes from one installation to another. This protocol is called OpenCloudMesh and is by now supported by most synch&share software vendors. So, we have the installed base, the incentive to interlink, and the technology to interlink. We just haven’t taken actual linking beyond proof of concept yet; not in an operational, sustainable way in any case.
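The core of OpenCloudMesh is simple: one installation notifies another that a shared volume exists, along with enough protocol detail for the receiver to mount it. The sketch below is a simplified, hypothetical share-notification payload loosely modelled on the OCM API; the field names and endpoint are illustrative, not the authoritative schema:

```python
import json

# Hypothetical OCM-style notification: instance A announces a shared volume
# to a recipient at instance B. Field names are a simplified illustration
# loosely modelled on OpenCloudMesh; consult the OCM spec for the real schema.
share_notification = {
    "shareWith": "researcher@institution-b.example.org",
    "name": "sequencing-run-2020-02",
    "providerId": "a1b2c3",
    "owner": "pi@institution-a.example.org",
    "protocol": {"name": "webdav", "options": {"sharedSecret": "<token>"}},
}

# Serialised body that instance A would POST to instance B's OCM endpoint:
payload = json.dumps(share_notification)
print(payload)
```

Because the payload is plain JSON over HTTPS and vendor-neutral, an ownCloud, Seafile, Nextcloud or PowerFolder instance can accept a share from any of the others, which is exactly the interlinking capability the installed base now needs to exercise operationally.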