Nothing but benefits: embedding my ELIXIR-UK material in university teaching
It’s a normal Wednesday. My block of planned meetings is done, I’ve got hot coffee, and I’ve finally managed to log on to our servers to look at some new data (or, let’s be honest – the data I should have finished analysing and writing up weeks ago…). Then comes the knock at the door from a wet lab colleague: “I’ve found the perfect dataset in this paper. It’s only twelve samples – could you take a quick look? We’re trying to finish a manuscript, so you’d obviously be on that.”
I’m sure this is a scenario many of us recognise, perhaps with alarm bells ringing:
Invariably, the dataset has no metadata, there are weird biases and incomprehensible file names, and it doesn’t actually show the “perfect” biological result after all, which earns a whimsical shrug and an “oh well” from your colleague. Now there’s no authorship, and your own data still hasn’t been analysed.
I’ve faced this many times since joining the University of York. Large datasets are now crucial in all areas of the Life Sciences, and whilst we have some excellent bioinformatics groups here, the breadth of topics is vast, and our Data Science core team is overstretched. I was lucky enough to complete my PhD at a research institute where bioinformatics was a major component of most research programmes. This meant that whilst my data skills were largely self-taught through the requirements of my project, there were always people around with relevant expertise who could spot a stray comma or suggest a different database to check. At York, this critical mass doesn’t (yet) exist, so, as an approachable bioinformatician, I get many requests for help.
Teaching is something that I’ve always enjoyed and sought out, starting with undergraduate demonstrating and private high school tutoring during my PhD. Now, as an academic, I feel strongly that helping to develop research data acumen in students and staff is part of my remit. However, I felt that balancing these requests with managing my research group and delivering undergraduate and postgraduate teaching as a Lecturer required me to further develop my skills as a data science trainer.
ELIXIR-UK’s Data Stewardship Training Fellowship was a perfect fit for these ambitions, and I was fortunate to be accepted into the first cohort in 2021. During my fellowship, I developed six RDM bites (5-minute videos on different aspects of Research Data Management) on next-generation sequencing data; delivered a training session on using cBioPortal; and contributed to the ELIXIR cookbook on analysing bulk RNA sequencing data. Beyond the large compendium of high-quality resources from ELIXIR-UK and the two cohorts of Data Stewardship Fellows, I have personally benefited in two main areas. First, I received advice and guidance on how to create, refine, and deliver data science teaching applicable at all levels, which massively increased my confidence throughout the year before my lectureship position started in November 2022. Second, I developed highly relevant materials that I have been able to reuse and develop beyond the fellowship.
This last point should be obvious, in a way. In academia, we always talk about the future applications, development and adaptability of our work. Still, so often, these statements are necessities of an application rather than what eventually happens in practice. However, I have been able to adapt and embed my fellowship material within my teaching and group member training. These materials look good, have undergone refinement and editing, and offer the benefit of introducing ELIXIR (and the full host of training materials) to the next generation of researchers, for whom the analysis of big data has been integral to their training as life scientists.
RDM bites really work
My sequencing collection of RDM bites was designed to help non-bioinformaticians understand the data they could download from papers and public repositories, essentially to answer the question:
Is this data worth my time?
This is an incredibly empowering judgement to make and, selfishly, one which could hopefully save the friendly down-the-corridor bioinformaticians a lot of time. Naturally, these videos also served as great starting points for new staff and students in my group, particularly if they came from a wet-lab background. When I came to develop my teaching material for undergraduates, either working with count data (second years) or the whole analysis pipeline from raw FASTQs (third years), these videos again provided incredibly useful reference points for students. This has been particularly relevant as we ensure that our teaching materials are fully inclusive, including the reality of asynchronous and/or distance learning.
Beyond my videos, I have also used content developed by the core ELIXIR team or other fellows, and the RDM bite format continues to strongly influence new video content I produce. Students’ feedback is that the length is good and that the videos are very rewatchable, with some students clicking through to other ELIXIR content.
Adapting a workshop for different audiences
While the bank of RDM bites provides an incredible resource, in-person delivery is often successful in engaging less self-motivated learners – particularly if free coffee is involved. My original ‘Introduction to cBioPortal’ workshop was targeted at postgraduate researchers of all stages and delivered in person to around 50 learners. This was my first time designing and delivering a workshop, and I was nervous – particularly as my learners were my peers. Following the theme of making my life easier, I wanted to highlight how far researchers who rely solely on a web browser can delve into public cancer cohort data (without coming to my door for help). It was a mixed room, ranging from student cell biologists looking for ways to demonstrate the broader applicability of their results to professors seeking ways to deliver cost-effective student projects. In terms of my delivery, I learned a lot about the balance between talking and letting people get on with the tasks, as well as about managing divergent learner expectations against the planned learning outcomes.
This latter point was crucial. I realised I had a highly accessible core workshop which was readily adaptable for other audiences. Initially, this involved refining and extending the workshop for learners with coding experience (in-person and online, with an extended support window for asynchronous learners), and then focusing on a specific cancer (now part of my teaching in two taught Master’s programmes at York). We were also able to integrate the workshop into our research unit’s STEM Learning ENTHUSE partnership with local high schools from deprived areas, giving students experience with both laboratory work and data analysis, which is genuinely representative of a career in the life sciences.
Long-lasting benefits of the Data Stewardship Training Fellowship
I started my fellowship while preparing for my move to my first tenured Lectureship position. Up until that point, my teaching style had largely relied on being personable enough and knowing how to explain core concepts in different ways. The fellowship allowed me to actually refine data-specific teaching skills and develop new materials. I now have a bank of resources I can draw on when the enthusiastic wet lab colleague comes knocking, and when students start in my lab, I can either point them towards my RDM bites or there’s a good chance they’ve already seen them.
Whilst York could still do with getting more bioinformaticians (are there ever too many?!), I am fortunate to have colleagues also dedicated to improving data acumen in the life sciences, and, unlike many universities, our data core remains supported and funded centrally.
As our data skills become increasingly valued, so too does our ability to train others effectively, particularly when we can draw on the now vast repository of tailored ELIXIR materials.
