ReproducibiliTea Blog

Easing Into Open Science 17/09/2021 with Dr Priya Silverstein

written by Laura Klinkhamer (co-organiser of Edinburgh ReproducibiliTea)

Dr. Priya Silverstein

In this session we took a look at the following paper:
Ummul-Kiram Kathawalla, Priya Silverstein, Moin Syed; Easing Into Open Science: A Guide for Graduate Students and Their Advisors. Collabra: Psychology 4 January 2021; 7 (1): 18684. doi: https://doi.org/10.1525/collabra.18684

and we were joined by one of the authors, Dr Priya Silverstein, for a live Q&A.

The paper is a great place to start for people who are new to open research concepts and provides a very useful summary and guide for some practices you could consider applying to your research. It introduces open science (now often referred to as open research, to include the academic disciplines that would not describe themselves as a “science”), as a “broad term that refers to a variety of principles and behaviors pertaining to transparency, credibility, reproducibility, and accessibility” (Kathawalla et al., 2021, p. 2). The paper is written specifically for the types of situations that graduate students are more likely to encounter, but the practices described are broadly applicable to researchers of any career stage.

Paper Summary

Eight Open Research (OR) practices are outlined in this guide and classified according to the authors’ perception of the difficulty of implementation.

Practice 1 is to set up or join an open research (journal) club, such as one with the ReproducibiliTea organisation. This can be a quick and efficient way of getting to grips with some key concepts of the reproducibility and OR movement, while meeting new people along the way and growing your network.
For researchers in Edinburgh – we encourage you to join the Edinburgh Open Research Initiative Teams group, which serves as a hub for bringing people interested in OR together. The University of Edinburgh now also has an Open Research Blog and newsletter that you can sign up for here.

Practice 2 refers to thinking about your project workflow, in particular setting up your file organisation, data access regulations and keeping clear records so that (future) you and others can quickly get an overview of your project and are able to reproduce the outcomes. For more information and tips on how to work reproducibly, we refer you to Kaitlyn Hair’s talk (Edinburgh ReproducibiliTea session Nov 2020) on selfish reasons to work reproducibly and Ralitsa Madsen’s talk on RSpace, a platform that you could consider to set up a project workflow in addition to the freely accessible Open Science Framework.

Practice 3 is about preprints, which refers to the practice of publishing your manuscript before or during peer review. First check the preprint policy of the journal where you intend to publish your work, either by messaging them or by checking this source suggested by Priya. Preprints are a way to bring your research out to the world, even if publication is delayed or rejected. Posting a preprint may also increase the number of times your work is cited. There are free servers to which you can upload your manuscript as a preprint, such as bioRxiv for biology.

Practice 4 refers to creating reproducible code/analyses. It is very helpful for your project workflow and reproducibility to write your code/analysis plans in such a way that it is clear beyond a doubt for others and your future self what you did. Annotating your steps and writing README files, basic text files describing for instance what files are in your project space/folder and what role they play in your project (e.g. data_spreadsheet_version3 contains the clean data on x number of participants that is used for Analysis B.), are very useful practices.
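As a rough illustration of the README idea, here is a minimal Python sketch that drafts a README listing every file in a project folder next to a short description of its role. The function name build_readme, the file name README.txt and the example descriptions are all made up for this illustration and do not come from the paper.

```python
from pathlib import Path


def build_readme(project_dir, descriptions):
    """Return README text listing each file in project_dir with its role.

    descriptions maps a file name to a one-line explanation. Files without
    an entry are flagged so that nothing stays undocumented.
    """
    lines = ["Project files", "============="]
    for path in sorted(Path(project_dir).iterdir()):
        # Skip sub-folders and the README itself (hypothetical file name).
        if path.is_file() and path.name != "README.txt":
            role = descriptions.get(path.name, "TODO: describe this file")
            lines.append(f"{path.name}: {role}")
    return "\n".join(lines) + "\n"
```

You could call build_readme(".", {"data_spreadsheet_version3.csv": "clean data used for Analysis B"}) and save the returned text as README.txt in the project folder. Any file still marked TODO is a reminder to add a description.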

Practice 5 is sharing data, which is very useful to the scientific community. There are many platforms to which you could upload your project’s anonymized data set (e.g. OSF again). However, it is very important to make sure you are legally allowed to share the data. This will depend on your local and wider data regulations (e.g. GDPR applies in the EU) as well as on exactly what was put into the consent forms (if applicable to your project, of course).
There are also options to upload only part of your data set or set up a system so that others can access more sensitive data. Talk to your supervisors/collaborators and check University research support services to see what would be most suitable in your case (for instance see this resource for the school of PPLS).

Practice 6 is being very open in your manuscript writing. In a way it’s fascinating how the norm in manuscript writing is that the research story gets presented as an almost perfect execution of a plan with a happy ending (i.e. significant results), whereas in reality you often hear of researchers struggling with all kinds of issues and ending up with a manuscript that is only very slightly connected to the original research idea. It’s not very realistic, and actually harmful to scientific integrity. So if we allow ourselves to be humans, who make mistakes, and allow others to read about and learn from our mistakes, wouldn’t that make life easier?

Practices 7 & 8 are related.
Pre-registration: a time-stamped, read-only version of your research plan created before you begin data collection/analysis.
Registered report: similar to pre-registration, but your research plan undergoes peer review before results are known. A helpful resource is available on the Center for Open Science website here.
Both practices are very useful ways of making you sit down and plan your research before executing it. In the case of registered reports, you will also obtain feedback before executing it, which may be much more useful than receiving feedback after the fact in the regular peer-review system. Although you state what you intend to research in a pre-registration or registered report, it is important to realise that you do not sign a binding contract. If it turns out that another method or additional exploratory analyses are relevant to your research question, you are of course able to make changes. It is, however, your responsibility to transparently report and justify these changes.
As Niamh summarised: these practices do not stifle creativity, but create accountability.

It is important to realise that engaging in OR practices is not an all-or-nothing approach. It’s much more about adopting a certain critical mindset and taking (small) steps that are suitable for you and your specific project.

Discussion

During the discussion Priya mentioned that if the paper were written this year she would probably include the same practices, but elaborate on the increased number of ways in which they can be applied. For example, one thing that has changed in recent years is that registered reports have become available for projects with secondary data analysis (rather than only for projects where the data are still to be collected).

Another interesting development is that of Peer Community In Registered Reports, which facilitates scheduled peer review. You indicate beforehand when you intend to hand in Stage 1 of your registered report (Introduction & Methods) and the community tries to arrange reviewers for that particular time frame, meaning that the peer review process can be completed much more rapidly. Priya mentioned that in the past this has been one of the main criticisms of registered reports: it was unclear when researchers could start their data collection/analysis because of uncertainty regarding the duration of peer review. This new facility makes registered reports an even more attractive option.

Will Cawthorn added that Review Commons is another place you can send your manuscript to for general peer review. If the manuscript then passes review, you can choose where to publish your paper from a list of journals. This approach decreases redundancy in peer review (i.e. if rejected from one journal, you don’t go through the roulette wheel of another round of completely new peer review).
Priya confirmed that this procedure is also in place for PCI registered reports. From the website: “Following the completion of peer review, authors of RRs that are positively recommended have the option to publish their articles in the growing list of PCI RR-friendly journals that have committed to accepting PCI RR recommendations without further peer review.”

We also had a group discussion about how we could further promote OR in the University. One of the suggested routes was to include more OR practices in undergraduate and postgraduate course curricula. Will Cawthorn also referred to an OR roadmap for the University of Edinburgh that he and several others (including many from the Library and Research Support Services) are working on and that is due to be published soon. Priya emphasised that it is important to create momentum through bottom-up and top-down initiatives at the same time to bring about real change in the research culture. This nicely connected to our next session on Friday 15 October, which will be on how to build an open research culture in your lab/research group, by Dr. Will Cawthorn (LERU Open Science Ambassador for the University of Edinburgh).

Then I said something silly about how we should all jump aboard the Open Research train and Priya kindly replied with a “choo choo!” making me feel slightly less embarrassed.

All things considered, we look back at a successful first session of this academic year!

The presentation slides and meeting recording can be found on our OSF page. The meeting recordings can also be found on our YouTube channel.

If you’d like to stay up to date with Edinburgh ReproducibiliTea, please consider joining our mailing list by filling in this form and/or following us on Twitter.

For any questions/suggestions, please send us an email: edinburgh.reproducibilitea@ed.ac.uk

A selfish guide to RSpace: Why and How?

In this session, Post-Doctoral Research Fellow Dr. Ralitsa Madsen covers why using an Electronic Lab Notebook (ELN) is a great idea. Dr. Madsen suggests that using an ELN for a reproducible workflow brings many rewards, including:

  • It saves a lot of time when reading and searching through documents.
  • If you are a postgraduate or graduate student, it is more convenient to retrieve the details that you need for the materials and methods section of your research.
  • An ELN makes it easier to collaborate not only within but also outside of the group.
  • Lab members can pick up where you left off, which ensures the continuity of the research.
  • It is much safer to rely on an ELN than on your hard drive. Your documents will be accessible even if your computer gets damaged/stolen.
  • You need extensive documentation, version control and traceability of your work if you would like to make a patent application.
  • In addition, RSpace is well-integrated with many other services like Mendeley, Microsoft Office, Dropbox and Google Drive, as well as data repositories like GitHub.

After naming several great reasons, Dr. Madsen goes on to do a walk-through of the tool and gives useful tips to facilitate RSpace adoption within the lab: 

First, you should think FAIR: are your documents easily findable? Are they accessible to researchers inside or outside of the lab? Are they interpretable? Can others read through the lab book and reuse your protocol for their experiment?

But for this to work, says Dr. Madsen, you also need to create:

  • A lab book entry template, which will ensure consistency and make it easier to collaborate,
  • Notebook-based project organisation,
  • Data storage rules that encourage the use of external repositories and
  • Consistent file naming rules
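To make the last point a bit more concrete, here is a small Python sketch that checks file names against a naming rule. The rule itself (date, project, description and a two-digit version, e.g. 2021-09-17_pilot_western-blot_v01.csv) is purely an assumption for illustration and is not a convention from Dr. Madsen's talk. Your lab would agree on its own pattern.

```python
import re

# Hypothetical convention: YYYY-MM-DD_project_description_vNN.ext
# Words are lowercase letters/digits, optionally joined by hyphens.
WORD = r"[a-z0-9]+(?:-[a-z0-9]+)*"
PATTERN = re.compile(rf"\d{{4}}-\d{{2}}-\d{{2}}_{WORD}_{WORD}_v\d{{2}}\.[a-z0-9]+")


def follows_convention(filename):
    """Return True if the whole file name matches the agreed pattern."""
    return PATTERN.fullmatch(filename) is not None


def nonconforming(filenames):
    """Return the file names that break the convention, in input order."""
    return [name for name in filenames if not follows_convention(name)]
```

Running nonconforming over the file names in a shared folder gives a quick list of files to rename before they spread through the lab.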

Do not forget to check out Dr. Ralitsa Madsen’s RSpace demonstration on Edinburgh ReproducibiliTea’s YouTube channel if you haven’t already!

This blog is written by Bengü Kalo

Find more information about RSpace here

Edinburgh RT YouTube Channel

Edinburgh RT OSF page

Edinburgh RT Mailing List

How (some) scientists talk about openness

Edinburgh ReproducibiliTea held another great session last Friday, with Dr. Rosalind Attenborough from the University of Edinburgh – Science, Technology and Innovation Studies! Her research focuses on researchers’ attitudes towards open science. Here are the main points of her insightful talk for those who missed it:

For her PhD project, Dr. Attenborough interviewed 54 individuals from various career stages, genders and disciplines in biology. She mainly explored what open science means to them. Although the interviewees came up with various responses to her question, the majority of them fell under three categories: open access, open data and interpersonal openness.

In general, researchers tend to be positive when talking about open access and believe that it is a good idea. Yet they rarely do so without mentioning the monetary and bureaucratic issues around it.

Open data is a completely different story. While interviewing scientists and policymakers, Dr. Attenborough saw that people’s attitudes varied immensely. Some of the interviewees perceived it as a norm and embraced it with passion, while others were cautious. The reluctance to share data seems to stem from the possibility of receiving destructive criticism and getting scooped.

The last category, interpersonal openness, refers to the willingness and ability to talk about unpublished research ideas. Like data sharing, interpersonal openness is negatively affected by the competitive research culture as well as by unsupportive mentorship.

Dr. Attenborough’s work is particularly insightful as it sheds light on the ways in which academia has to change so that researchers, especially ECRs, can feel more comfortable embracing open science practices.

This blog is written by Bengü Kalo

Skills training for Open Science: impact and rewards of working with Edinburgh Carpentries

In our last Edinburgh ReproducibiliTea session, Edward Wallace, PI of the Wallace Lab, shared the benefits of working with Edinburgh Carpentries. Here are some of the key points that were discussed:

Dr. Wallace argued that all researchers need to learn how to analyse their data reproducibly, reliably and efficiently, regardless of which career stage they are at. 

Researchers need some foundational skills like coding, data science and project organisation in order to practise open science. However, many of the group leaders, postdocs, PhD students and RAs across the university stated that they have no formal training at all in computing (45%) or statistics (35%). This, says Dr. Wallace, was one of his main reasons for working with the Carpentries.

The Carpentries relies on an open community, a shared ethos and a pedagogical drive. All the resources are developed by volunteers on GitHub, and both learning and teaching are well structured.

One very important reason to get involved with the Edinburgh Carpentries is that the funding bodies are as interested in open science as the researchers are. For instance, UKRI-BBSRC plans to “take actions to increase the capacity in computational skills within the biosciences”. In fact, Edinburgh Carpentries is now funded by UKRI for two years to expand its training.

In the session, Dr. Wallace informed us that the Edinburgh Carpentries are currently developing new teaching materials for statistics, FAIR principles and data management, and data science computing with reproducible workflows. In one of these workshops, the instructors teach skills that can save a lot of time and improve your work, such as how to organise and document your code efficiently, which is also discussed in “Good Enough Practices in Scientific Computing”.

Edinburgh Carpentries is a community that is growing every day and is in need of more instructors. Some of the benefits of getting involved with the community are that:

  • You can get better at coding and teaching,
  • The training helps you to get funding and
  • You will be part of a nice, supportive community.

Many of our attendees seemed to be interested in participating in the Edinburgh Carpentries and in thinking of ways to engage their own labs in open research. To receive updates about EdCarp workshops and/or sign up as an Instructor and/or Helper for Edinburgh Carpentries, you can sign up to the Edinburgh Carpentries mailing list. If you would like to help Dr. Wallace and his colleagues in developing a workshop on FAIR practices in biosciences (and be paid for that), please email them at bio_rdm@ed.ac.uk

This blog was written by Bengü Kalo

Find the resources discussed in this session:

Wallace Lab 
Example open source coding from the Wallace Lab 


Edinburgh Carpentries mailing list to receive updates about EdCarp workshops and/or sign up as an Instructor and/or Helper for Edinburgh Carpentries

Wilson (2016) Software Carpentry: Lessons Learned  

Edinburgh-based Coding Club: 
https://ourcodingclub.github.io/  
https://ourcodingclub.github.io/tutorials.html  

Mailing List of the Coding Club

Selfish Reasons to Work Reproducibly

Reproducibility is not a newly adopted principle; in fact, it dates back to the 1600s. The Irish chemist Robert Boyle was the first to emphasize the importance of obtaining the same results when a study is re-created. Since then, scientists have consistently reflected on how adopting a reproducible workflow helps advance science. The question is, does it only benefit science? During an invited talk at Edinburgh ReproducibiliTea Journal Club, Kaitlyn Hair explained how sharing your data, materials and code is also in your own interest.

Perhaps the most important reason for adopting a reproducible workflow is simply to avoid a disaster, according to Hair. Researchers all around the world publish their work continuously, and these publications lead to other ideas, discoveries and products like vaccines and cancer therapeutics. Some years ago, scientists from Duke University published a paper in which they claimed to have found a way to efficiently target tumours based on their genetic sequencing. It is not hard to imagine that this was seen as a very important achievement at the time. However, after some failed attempts to replicate the results, scientists came to a shocking realisation: the promising findings were only a by-product of a technical problem that occurred while copying the data from an Excel sheet to another statistical program. Because the authors were unable to provide evidence against fraud claims, “this mistake was career ending for some of its authors”, says Hair.

These kinds of mistakes are not as rare as we would wish. However, adopting open science principles can be life-saving in such situations. This was the case for Dr. Julia Strand. In 2018, she published the greatest achievement of her career: it appeared that there was a way to improve speech perception and reduce the cognitive effort that we spend in noisy environments. Simply presenting a modulating circle that expanded when the speech got louder made participants respond faster, or so she thought. In 2020, she published a blog post that drew a lot of attention from scientists. “The central finding was the result of a software glitch…” she said. The effect had appeared only because she had made a mistake while programming the timing clock. In this case, however, the code was openly available, the study was pre-registered and her work was fully transparent. Having detected the mistake herself also helped her show that this was not scientific fraud.

Another reason for working reproducibly, according to Hair, is that it makes writing easier. Writing a paper is not a smooth process if you don’t use tools like R, OSF and GitHub along the way. Having to copy your figures and paste them into your Word document, and going back and forth to create a table for your analysis, can be overwhelming, especially on a tight schedule. One of the best features of R is that you can run your analysis, make tables and figures and write up your results at the same time. On top of that, Hair explains that the rticles package for RMarkdown comes with many different paper formats such as PLOS or Frontiers style, and the knitr package allows you to save your document as a PDF, Word document or HTML file when you are done with it.

In addition, if you are collaborating with other scientists, you may easily end up with dozens of updated versions of the same code. GitHub is a great tool to avoid getting lost in a pool of code files in such situations. All you need to do is upload your file to a remote GitHub repository so that your co-authors can pull it to their own computers, make changes and push it back to the repository. It tracks the changes that are made and who made them, which means that you don’t need to keep all the versions on your computer.

Working reproducibly also ensures continuity, according to Hair. Especially as you make progress in your career and publish more often, you will notice that you have forgotten what you did, what the variable X in your dataset refers to and how it differs from the variable Y, which looks just like it. Uploading your notes to OSF or creating a bookdown page can help you remember what you did and easily inform your team without having to go through everything again and again. Likewise, you can use GitHub to share your code and README files in which you explain exactly what your code does.
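One simple way to avoid forgetting what variable X means is to keep a small codebook next to the dataset and check that every column has an entry. The Python sketch below illustrates the idea. The function name undocumented_columns and the example column names are made up for illustration and are not from Hair's talk.

```python
import csv
import io


def undocumented_columns(csv_text, codebook):
    """Return the column names in the CSV header that the codebook lacks.

    csv_text is the content of a CSV file as a string, and codebook maps a
    column name to a plain-language description of that variable.
    """
    header = next(csv.reader(io.StringIO(csv_text)))
    return [column for column in header if column not in codebook]


# Hypothetical dataset and codebook
data = "participant_id,rt_ms,rt_ms_clean\n1,532,532\n"
codebook = {
    "participant_id": "anonymised participant number",
    "rt_ms": "raw response time in milliseconds",
}
missing = undocumented_columns(data, codebook)
```

In this made-up example, missing contains rt_ms_clean, the look-alike variable that nobody has described yet.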

Furthermore, it helps you get through peer review. “If you are using RMarkdown, it helps the reviewers understand what you have done,” says Hair. Reviewers can just download your data and RMarkdown file and re-run the whole analysis, which could improve the reviewing process and help to avoid misunderstandings.

Reproducibility can help build your reputation and future-proof your work as well. Open science practices such as sharing your code and data are increasingly adopted not only by scientists but also by stakeholders. There is some funding available specifically for open science projects, so adopting open science practices can help you secure funding for your studies. Hair explains that some journals like eLife are also sharing “living figures”. Here, the text and the figures are updated in the light of new data and information. It is possible to create such work using RMarkdown, and you can even submit it directly to journals like PLOS One.

Finally, Hair underlines that adopting open science practices eventually leads to more citations, recognition and opportunities, and therefore helps you build a career as a scientist. She makes a convincing case and finishes her talk by saying:

“It’s not just good for science – it’s good for you!”

Adopting open science practices like a reproducible and transparent workflow is becoming widespread among scientists. Whatever your reason, be it a selfless act of advancing science or sparing yourself quite a lot of time, the best time to start is now.

Check out the full video on the Edinburgh ReproducibiliTea YouTube page to get more detail on “Selfish Reasons to Work Reproducibly” by Kaitlyn Hair.

Her talk includes the reasons discussed in:

Markowetz, F. Five selfish reasons to work reproducibly. Genome Biol 16, 274 (2015).

Read more about the Duke University paper, Dr. Julia Strand’s blog post and the Living Figures.

For the materials related to this talk please refer to our OSF page

This blog was written by Bengü Kalo