A day in the life of a UCL Research Data Steward

Dr Nick Owen, University College London
1 July 2025

A FAIRytale by Nick Owen – a personal story about the data stewardship role in modern academia

The role of a research data steward is an emerging career path, and it was a new role for me when I joined the Centre for Advanced Research Computing (ARC) at University College London (UCL) in 2023. My prior experience as an academic researcher specialising in rare human disease, followed by several years as a senior bioinformatician, gave me firsthand insight into the data management and analysis challenges that life science researchers face. This foundation motivated me to join ARC’s pioneering data stewardship initiative, where I could directly address these challenges and advance research data practices.

ARC is a research centre with unique opportunities. It is committed to enabling researchers by developing and sustaining critical services for research data, training, software and infrastructure. ARC also proactively contributes expert collaborative support to research projects, not only within UCL but across the UK and wider European scientific communities. Unusually, it functions as both a service provider and an independent, active research department.

Established in 2022, ARC was initially composed principally of research software engineers and has since expanded to include other professions: data scientists, data stewards, research infrastructure developers, education, community, leadership, PRISMs and operational staff. Importantly, the way ARC has evolved allows for incredible levels of communication across all professions, something crucial for maintaining focus and direction and for adopting new, improved ways of working.

The role of the Research Data Steward over the past 12 months

The research data stewards team here has grown significantly, in part because UCL leadership recognises the post as critical to the future of research data management. Within our team, some members contribute domain-specific expertise, while others are fully committed to the stewardship of specific projects. My focus is on supporting and developing services, training and knowledge exchange across the life science domain at UCL. Beyond my specialist background and interests, I am also keen to be involved with wider scientific collaborations and communities.

As in academic research, a typical day’s routine does not revolve around one area and varies as projects progress. Our time can generally be split across our main project (research or service-based), collaborations, side projects, supporting researchers’ requirements, outreach and personal development. To coordinate, we work in an Agile manner, with termly plans broken into two-week sprints. This enables us to plan our time iteratively and in the most effective way, aligning with our goals.

Supporting researchers and best practices

My current primary project involves developing relationships with researchers across various departments and institutes to champion best practices for research data management and storage. I promote the use of ARC’s centralised research data storage platform (RDP), which offers researchers a reliable, resilient and secure solution for data storage. This centralised approach provides significant advantages over disparate, unmanaged local storage methods, such as NAS boxes and external drives. As a data steward, my role is to encourage, support and outline better ways of working with large volumes of data, whilst respecting that the final decision rests with the researchers.

Consider, for instance, an institution relocating to a new building. In such a scenario, we can advocate the adoption of centralised services, such as the RDP, reducing reliance on small, localised storage within the new building, which may have physical space restrictions. This approach is also crucial for forecasting future data storage needs and accommodating evolving requirements. We are currently equipped to support over 13 PB of research data.

Developing services and training

Another aspect of my work centres on the development of a comprehensive data catalogue for use within the university. A common challenge in many research-intensive universities is that isolated research groups pursue similar projects, often with overlapping datasets. These may include, for example, population cohort data or smart data, frequently acquired under institutional licences via research grants. Often, these licences permit wider internal reuse. This represents an opportunity: repurposing existing data across multiple projects can significantly reduce time and financial investment. To facilitate this, we have been developing a catalogue to host the metadata for such datasets. This will empower researchers to identify UCL-held data and access details on where to obtain further information. Our overarching objective is to substantially reduce both licence and data duplication within the university, thereby enhancing research efficiency and fostering greater collaboration.

Overall, this initiative will streamline the processes for data discovery and access. For this project, I am the solutions technical lead, responsible for developing and customising the code based on a critically assessed open-source solution. As a team, we leverage our diverse strengths to accelerate project outcomes; my coding experience has been a significant asset, complementing other team members’ focus on spatial data, governance and innovation.

Another key project involves the development of educational content, including training modules and informational videos. This material is designed to effectively demonstrate the services we offer to UCL researchers, staff, and students. 

In universities with a geographically dispersed campus, such as UCL, vital information can often get lost among the high volume of communications. We aim to streamline our stewardship outreach to ensure that staff, many of whom may be unaware of our roles or the services we offer, receive clear and relevant information. Our efforts to increase exposure and awareness are gaining momentum. As a team, we regularly meet with departments, present at local seminars, and cultivate a community around research data management. This initiative brings together anyone involved in data stewardship to meet, hold discussions and solve problems collaboratively.

Collaboration is key

We see collaboration as fundamental to the role of a data steward, and to the other professions at ARC. Researchers approach us seeking guidance, specialised expertise, or comprehensive solutions for their projects. These opportunities are openly documented within the department, ensuring that support is strategically assigned across diverse professional specialisms. Team members are empowered to self-assign to incoming projects where their skill set aligns, or to recommend suitable colleagues for involvement. Early engagement with projects is crucial; we actively participate in the grant application process, often becoming involved with initiatives a year or more before funding is secured.

Expanding Data Stewardship

Another aspect of the ways of working at ARC that I haven’t yet mentioned is the “hub and spoke” model we’ve initiated with UCL departments. While we maintain centralised teams within ARC, we actively encourage departments to secure funding for their own dedicated staff (data stewards, scientists, infrastructure developers etc.). These staff are directly connected to ARC’s resources and expertise, effectively positioning ARC as the hub and the departmental ARC associates as the spokes. This model builds more coherent research connections across UCL.

My focus: connecting researchers and communities

Another area I want to discuss is particularly important to me: proactively engaging with researchers to demonstrate possibilities, highlight available and upcoming resources, and introduce emerging solutions that can make their research lives that bit easier. When I transitioned from wet lab research to a dry lab role, incorporating bioinformatics and project management, I often found myself “reinventing the wheel”. This was frequently necessary to achieve rapid results for publication, a common pressure within the academic ‘publish or perish’ mentality. While this situation is slowly evolving, it remains a prevalent aspect of academic life.

My bioinformatics background primarily focused on analysing next-generation sequencing data, including whole-genome sequence analysis (the Genomics England 100,000 Genomes Project and others), transcriptomics (bulk and single-cell), and epigenomics. When I joined ARC, I highlighted the need for support in these areas across UCL. I observed numerous siloed research groups often conducting similar analyses independently. By centralising best practices and clearly communicating available support, we can significantly strengthen their capabilities and foster greater collaborative research.

Given the diverse nature of a data steward’s role, I’ve actively embraced the opportunity to engage with these groups, fostering connections, exploring their specific requirements, and cultivating a collaborative community. Whilst this is very much still in progress, I have found encouraging results over the past nine months. This is also where engagement with communities such as ELIXIR UK and EU, the Global Alliance for Genomics and Health (GA4GH), and BioFAIR becomes particularly valuable.

It’s often the case that researchers lack the time to explore broader community support, develop standards, or establish best practices. Their focus is, understandably, on their immediate research and domain-specific advancements. I believe a core aspect of the stewardship role is to bridge this gap, connecting researchers with external resources and innovations that can directly benefit or influence their research – in essence, to make a tangible difference. As a researcher myself, it has been important for me to develop capabilities beyond my primary focus. I am grateful that I can leverage my skill set in my current role as a research data steward.

I became involved with ELIXIR, GA4GH, and the Research Data Alliance (RDA) when I started this role for several key reasons. My aim was to stay current with developments in supporting the research domain, to network with peers to enhance research, and to actively contribute to the development of emerging services and standards. Like many in these communities, I believe in fostering streamlined standards and processes that genuinely benefit the entire research community, rather than creating new ones unnecessarily.

Currently, as the sole data steward at ARC with a focus on life sciences, a critical part of my role is to relay feedback to all our teams regarding new tooling and developments in standards. This also involves identifying how ARC can proactively contribute to these evolving communities. For instance, many universities are actively developing secure or Trusted Research Environments for analysing sensitive data, such as genetic information. This raises important questions: How can we adopt existing standards for data ingress and egress, and how can these standards be further expanded for novel use cases? All these considerations stem from the communication and dissemination of information both within and beyond the university community.

Continuous development

The final area I would like to highlight is continued personal and professional development. While this is an ongoing process for everyone, I am always keen to exceed expectations, staying abreast of emerging trends in data stewardship, as well as my core areas of bioinformatics and rare disease research. Beyond expanding my coding and governance expertise, I also explore collaborative opportunities outside my knowledge base to further understand how stewardship principles can be applied across diverse scenarios. ARC has a very clear model for career development for all professions. It aims to make it easier for high-performing staff who demonstrate the skills and behaviours expected at a higher grade to be recognised and to accelerate their internal career progression.

The future of Research Data Stewardship

The role of a data steward is inherently diverse, both within the position itself and across institutions. This career path is evolving, and it’s exciting to anticipate its future. As more stewardship roles gain recognition in research, as communities become more interconnected, and as researchers increasingly understand the value we provide, I believe this field will only grow in strength and utility. I am particularly encouraged by the support for data stewardship at UCL and I truly hope the recognition of the role’s importance continues to expand.  With the immense volume of data being generated today, the demand on researchers to manage every aspect themselves is simply unsustainable without our enhanced support. 

I feel incredibly fortunate in my role, as I can dedicate all my time and efforts to improving the culture around research data. I look forward to meeting other stewardship groups, discussing approaches and new ideas, and disseminating information to encourage the adoption of the tools and services we develop. Too often, these remain underappreciated and poorly adopted.

We aim to change that.