ReproducibiliTea Blog

Errors in Research

“Fallibility in Science: Responding to Errors in the Work of Oneself and Others”

This was the first session of 2022 and revolved around a discussion of a paper on errors in research. It was led by Laura Klinkhamer, a PhD student at The University of Edinburgh whose research interests lie at the intersection of neuroscience and psychology. The discussion centred on Professor Dorothy Bishop’s 2018 commentary paper ‘Fallibility in Science: Responding to Errors in the Work of Oneself and Others’. Apart from the paper discussion, the session involved interactive anonymous polls and some interesting discussions in the breakout rooms. 

The session began by imagining a scenario in which a PhD student runs a series of studies looking for a positive effect. After null findings in three studies, the student changes the design and finds a statistically significant effect in the fourth study. The paper is published in a prestigious journal with the student as first author, and the study is even featured on National Public Radio. Two weeks later, however, while preparing for a conference talk, the student realizes that the groups in the study were miscoded and the finding is invalid. Participants in the session were asked to imagine this scenario and report anonymously what they would do. 

According to Azoulay, Bonatti and Krieger (2017), authors who publicly admitted a mistake saw an average decline of 10% in subsequent citations of their earlier work. However, the effect was small when the mistake was an honest one, and there was no reputational damage in the case of junior researchers. According to Hosseini, Hilhorst, de Beaufort and Fanelli (2018), 14 authors who self-retracted their papers feared their reputations would be badly damaged; in reality, however, self-retraction did not damage their reputations but improved them. 

Incentives for Errors in Research or Research Misconduct:

  1. Pressure from colleagues, institutions and journal editors to publish more and more papers
  2. Progression in an academic career is largely determined by metrics that incentivize publications, not retractions

Unfortunately, according to Bishop (2018), there are very few incentives for honesty in academic careers. Participants were encouraged to share their opinions on what they would do to incentivize scientific integrity. 

Open Research:

  1. Research that is publicly accessible is not necessarily free from errors. However, open data and open code improve the chances of errors being detected by other researchers
  2. Open research encourages scientists to double-check their data and code before publication
  3. Open research helps normalize error detection and reduces stigma, which ultimately improves scientific accuracy 

How to Respond to Errors in the Work of Other Researchers:

There are several channels for doing this, including:

  • Contacting researchers directly
  • Contacting researchers via journal (if possible)
  • Preprint servers
  • PubMed Commons (discontinued)
  • PubPeer (commentators can be anonymous)
  • Twitter
  • Personal blogs
  • OSF and Octopus (emerging platforms)

One drawback of anonymous platforms is that the resulting criticism of someone’s work can be harsh and discouraging. When responding to errors in the work of other scientists, it is important to make no assumptions, because a failure to replicate an original study can have causes beyond incompetence or fraudulent intentions. The scale of the error can be a useful guide when approaching the situation.

Scale of errors:

  • Honest errors- coding mistakes
  • Paltering- using a truthful statement to mislead by failing to provide the relevant contextual information
  • P-hacking
  • Citing only the part of the literature that matches one’s position, commonly referred to as confirmation bias
  • Inaccurate presentation of results from cited studies
  • Inventing fake data
  • Paper mills- businesses producing fake studies for profit

There was a brief discussion of the case of Diederik Stapel, who was fired after it was discovered that he had fabricated data on a large scale during his academic career. There was also some discussion of paper mills, which pollute the scientific literature for profit. An important question remains: who is, or should be, responsible for detecting and responding to large errors? 

  1. At an internal level: the head of the department/lab, along with whistleblowing and research misconduct policies
  2. Journals 
  3. Separate institutes like UKRIO (UK Research Integrity Office)
  4. Technology
  5. External researchers

There was a lot more to be discussed, and hopefully the conversation can continue in later sessions and/or at the ‘Edinburgh Open Research Conference’ on Friday 27 May 2022, organised by the Library Research Support Team and EORI/Edinburgh ReproducibiliTea. SAVE THE DATE!


This blog is written by Sumbul Syed


Edinburgh RT Twitter

Edinburgh RT OSF page

Edinburgh RT mailing list

For any questions/suggestions, please send us an email at


Bayesian data analysis and preregistration 17/12/2021 with Dr Zachary Horne

This session was the final one of 2021. The speaker was Dr Zachary Horne, a lecturer at the School of Philosophy, Psychology & Language Sciences, The University of Edinburgh. Dr Horne talked about Bayesian statistics and preregistration in the context of open research practices. He began by explaining what Bayesian data analysis is: very broadly, it is data analysis that takes into consideration prior information about a particular domain, in addition to the collected data. This prior information is expressed as a prior distribution. 

There are different aspects to keep in mind when it comes to preregistration in Bayesian data analysis:

  • How the data is going to be collected
  • Why is the data being collected in a particular way?
  • Sample size
  • Operationalization of constructs
  • Specifying key analyses
  • Aspects of analysis that will be exploratory

Bayesian workflow (Gelman et al., 2020)

  1. Choosing an initial model
  2. Prior predictive checking
  3. Fitting the model
  4. Computational problems and algorithm diagnostics
  5. Posterior predictive checking
  6. Prior robustness

Dr Horne discussed prior predictive checking in a bit more detail; it covers the following questions:

  • Prior to data collection, is the model consistent with what is already known about the world?
  • What distribution is implied for an outcome variable given prior and likelihood?
  • Assessing the credibility of model before collecting the data
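As an illustrative sketch (not code from the session), a prior predictive check amounts to drawing parameters from the prior, simulating data sets from the likelihood, and asking whether they look plausible before any real data are collected. The model below, a Poisson likelihood with a log-normal prior on the mean number of likes per tweet, is invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
n_sims, n_tweets = 1000, 50

# Prior: the average number of likes per tweet is "around 100, give
# or take an order of magnitude" (a log-normal prior, invented here
# purely for illustration).
mu = np.exp(rng.normal(np.log(100), 1.0, size=n_sims))

# Likelihood: for each prior draw, simulate a full data set of
# like-counts for 50 hypothetical tweets.
simulated = rng.poisson(mu[:, None], size=(n_sims, n_tweets))

# Check: do the simulated data sets look like plausible Twitter data?
# If the prior routinely implies millions of likes per tweet, it
# conflicts with what we already know and should be revised.
print(np.percentile(simulated.mean(axis=1), [5, 50, 95]))
```

Each row of `simulated` is one hypothetical data set the model considers credible before seeing any data; inspecting their summary statistics is the check itself.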

A question, ‘Do tweets from activist groups (e.g., PETA, Greenpeace, etc.) with photos get liked more than tweets without photos?’, was central to the session’s discussion of models in Bayesian data analysis. The analysis showed that, as far as likes on Twitter are concerned, tweets with photos do better. As for which model is the ‘right’ model, the regularizing model provided better estimates of the central tendency of the distribution. However, none of the priors (optimistic, regularizing and improper) captured the fact that the larger central tendency comes not just from many tweets getting 200 or so likes, but also from some tweets getting huge numbers of likes! The models therefore still have room for improvement. 
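That heavy-tail behaviour is easy to reproduce in a quick simulation (my own sketch, with invented parameters rather than anything fitted in the session): count data like tweet likes are often modelled with a negative binomial distribution, whose long right tail pulls the mean well above what a typical tweet receives.

```python
import numpy as np

rng = np.random.default_rng(1)

# A negative binomial with a small shape parameter n gives a long
# right tail: most tweets get a modest number of likes, a few go
# viral. Parameters are illustrative; mean = n * (1 - p) / p = 200.
n, p = 0.5, 0.5 / (0.5 + 200)
likes = rng.negative_binomial(n, p, size=10_000)

print(np.median(likes))   # the "typical" tweet: well below the mean
print(likes.mean())       # pulled upward by the viral tail
print(likes.max())        # a handful of enormous counts
```

The gap between the median and the mean is exactly the point made in the session: the central tendency is not driven by typical tweets alone.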

The session concluded with pre-registering priors in Bayesian data analysis: Dr Horne suggested using regularizing priors for the parameters of interest, especially when those parameters are expected to ‘do something’, and incorporating posterior information into the priors of subsequent related models.

This blog is written by Sumbul Syed




Edinburgh University Research Optimisation Course (EUROC) 19/11/2021 with Dr Gillian Currie

In this session, Dr Gillian Currie, a Postdoctoral Research Fellow in the CAMARADES group, Centre for Clinical Brain Sciences at The University of Edinburgh, talked about EUROC (Edinburgh University Research Optimisation Course), which encourages open research practices in animal research. Dr Currie is a meta-researcher whose research interests include improving research methodology.

Dr Currie began by talking about EUROC, a course focused on the rigorous design, conduct, analysis and reporting of research using animals. She then mentioned some key points about research using animals:

  • In the year 2020, 2.8 million animals were used in research across the UK
  • These studies helped in understanding basic biology and complex diseases and in developing potential treatments
  • However, there were certain concerns regarding difficulties in replication, reproducibility and translation

Dr Currie talked briefly about the translational pipeline, which aims to translate pre-clinical research into clinical research and, ultimately, improved health. A survey conducted by Nature involving 1,576 researchers found that 52% of them agreed there is a ‘reproducibility crisis’. The crisis can be attributed to the following reasons:

  1. Small sample sizes in studies
  2. Publication bias
  3. Limited randomization and blinding

Dr Currie continued by discussing new opportunities in open research practices, including:

  1. An increased focus on methodological rigour which involves ensuring appropriate power, appropriate statistics and p values
  2. An increased transparency through pre-registration of studies, reporting of methods as well as sharing of data
  3. Measures to reduce risks of biases

It is important to realise that a small improvement, made by a large number of researchers, can add up to a substantial effect overall. 

Course structure of EUROC:

EUROC comprises three modules, which can be completed across multiple sessions. Every module consists of one core and one extended lecture.

MODULE 1: Study Design and Data Analysis

In Module 1, the ‘Study Design’ section covers internal validity, risks of bias, construct and external validity, and exploratory vs confirmatory research. The ‘Data Analysis’ section covers statistical analysis, significance testing, sample size and statistical power, outliers, units of analysis, and multiple outcome testing.

MODULE 2: Experimental Procedure

Module 2 is divided into two sections: Maximizing Study Validity and Study Design. The former includes topics such as risks of bias, pilot studies, confounding characteristics and variables, validity of outcomes, and optimisation of complex treatment parameters. The latter covers the use of reference compounds, statistical analysis tips, replication, and standardisation.

MODULE 3: Pre-registration and Reporting

The final module will deal with Pre-registration (including Study protocols) and Reporting (Data sharing, Statement of conflict of interest, Reporting standards).

The course is a contribution by The University of Edinburgh towards improving research, and it is therefore also available to researchers outside the university through this link

How to access EUROC on Learn (for people within the University of Edinburgh):

  1. Log in to Learn
  2. Click on ‘self-enrol’ (available on top right of the screen)
  3. Scroll down to Research Improvement
  4. Click on EUROC (Edinburgh University Research Optimisation Course)

The session concluded with Dr Currie describing a forthcoming research improvement project. Delays in the dissemination of research findings impede scientific progress, so one of the project’s most important aims is to increase the speed at which findings are shared through the use of pre-prints. A pre-print is an early version of a scholarly article that has not yet undergone peer review. It is open to comments and is a good means of establishing the priority of new ideas.

This blog is written by Sumbul Syed


Session’s video on YouTube



Building an Open Research Culture 29/10/2021 with Dr Will Cawthorn

In this session, Dr Will Cawthorn of The University of Edinburgh Centre for Cardiovascular Science talked about building an open research culture. Dr Cawthorn began by discussing conflicts of interest in research, which can be extrinsic or intrinsic, and how important it is for open research to be free of potential conflicts of interest. Here are the key points discussed during the session:

  • The value of a research study these days is judged largely on whether it is published in a high-impact journal and whether it is highly cited
  • Expert opinions are often flawed, especially in noisy environments
  • Mismeasuring science has many consequences, including valuable research being devalued or ignored, publication delays, incentivization of poor research practice, and external pressures killing the inner motivation to do good research 
  • Researchers pay to publish, in a journal, the research they produced in the first place, only for others to have to pay to access it. This is the opposite of open research

Dr Cawthorn went on to describe the steps he is taking to bring about an open research culture in his own lab:

  • Encouraging members of his lab to follow their own ideas
  • Publishing negative results, because there is no such thing as ‘positive’ or ‘negative’ results, only ‘conclusive’ and ‘inconclusive’ ones
  • Encouraging more preprints and open access papers

However, an open research culture is easier said than done, and there are additional practices that Dr Cawthorn wants to introduce or further improve in his lab, including:

  • Electronic lab notebook
  • Writing a lab manual
  • Robust data management

The session concluded with the question of whether open research culture has a brighter future. It probably does, with many initiatives emerging, such as DORA (The Declaration on Research Assessment) and LERU (The League of European Research Universities).

Dr Cawthorn is the LERU Open Science Ambassador for The University of Edinburgh and he is collaborating with many others on writing an Open Science Roadmap for the University which is due to be published soon.

This blog is written by Sumbul Syed


Session’s video on YouTube



Easing Into Open Science 17/09/2021 with Dr Priya Silverstein

written by Laura Klinkhamer (co-organiser of Edinburgh ReproducibiliTea)

Dr. Priya Silverstein

In this session we took a look at the following paper:
Ummul-Kiram Kathawalla, Priya Silverstein, Moin Syed; Easing Into Open Science: A Guide for Graduate Students and Their Advisors. Collabra: Psychology 4 January 2021; 7 (1): 18684. doi:

and we were joined by one of the authors, Dr Priya Silverstein for a live Q&A.

The paper is a great place to start for people who are new to open research concepts and provides a very useful summary and guide for some practices you could consider applying to your research. It introduces open science (now often referred to as open research, to include the academic disciplines that would not describe themselves as a “science”), as a “broad term that refers to a variety of principles and behaviors pertaining to transparency, credibility, reproducibility, and accessibility” (Kathawalla et al., 2021, p. 2). The paper is written specifically for the types of situations that graduate students are more likely to encounter, but the practices described are broadly applicable to researchers of any career stage.

Paper Summary

Eight Open Research (OR) practices are outlined in the guide and classified according to the authors’ perception of how difficult they are to implement.

Practice 1 is to set up or join an open research (journal) club, such as one affiliated with the ReproducibiliTea organisation. This can be a quick and efficient way of getting to grips with some key concepts of the reproducibility and OR movement, while meeting new people along the way and growing your network.
For researchers in Edinburgh – we encourage you to join the Edinburgh Open Research Initiative Teams group, which serves as a hub for bringing people interested in OR together. The University of Edinburgh now also has an Open Research Blog and newsletter that you can sign up for here.

Practice 2 refers to thinking about your project workflow, in particular setting up your file organisation, data access regulations and keeping clear records so that (future) you and others can quickly get an overview of your project and are able to reproduce the outcomes. For more information and tips on how to work reproducibly, we refer you to Kaitlyn Hair’s talk (Edinburgh ReproducibiliTea session Nov 2020) on selfish reasons to work reproducibly and Ralitsa Madsen’s talk on RSpace, a platform that you could consider to set up a project workflow in addition to the freely accessible Open Science Framework.

Practice 3 is about preprints: publishing your manuscript before or during peer review. First check the pre-print policy of the journal where you intend to publish your work, either by messaging them or by checking this source suggested by Priya. Pre-prints are a way to bring your research out into the world, even if publication is delayed or the manuscript is rejected. They also increase the number of times your work will be cited. There are free servers you can upload your manuscript to as a pre-print, such as bioRxiv for biology.

Practice 4 refers to creating reproducible code/analyses. It is very helpful for your project workflow and reproducibility to write your code/analysis plans in such a way that it is clear beyond a doubt for others and your future self what you did. Annotating your steps and writing README files, basic text files describing for instance what files are in your project space/folder and what role they play in your project (e.g. data_spreadsheet_version3 contains the clean data on x number of participants that is used for Analysis B.), are very useful practices.
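As a small, hypothetical sketch of what such annotation might look like in practice (the file names and function below are invented for illustration, echoing the example in the text):

```python
# analysis_b.py -- runs Analysis B (see README for what each file is)
# Input: data_spreadsheet_version3.csv -- the clean data, one row per
# participant, produced by a (hypothetical) earlier cleaning script.
import csv

def load_clean_data(path):
    """Read the cleaned spreadsheet and return rows as dictionaries.

    Keeping the loading step in one documented function means future
    readers (including future you) can see exactly which file feeds
    Analysis B and what shape the data takes.
    """
    with open(path, newline="") as f:
        return list(csv.DictReader(f))
```

A README alongside the script would then only need one line per file (name, contents, role in the project) for the whole analysis to be navigable by someone else.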

Practice 5 Sharing data is very useful to the scientific community, and there are many platforms you could upload your project’s anonymized data set to (e.g. OSF again). However, it is very important to make sure you are legally allowed to share the data. This will depend on your local and wider data regulations (e.g. GDPR in the EU) as well as on what exactly was put into the consent forms (if applicable to your project, of course).
There are also options to upload only part of your data set or set up a system so that others can access more sensitive data. Talk to your supervisors/collaborators and check University research support services to see what would be most suitable in your case (for instance see this resource for the school of PPLS).

Practice 6 Being very open in your manuscript writing. In a way it’s fascinating how the norm in manuscript writing is that the research story gets presented as an almost perfect execution of a plan with a happy ending (i.e. significant results), whereas in reality you often hear researchers struggling with all kinds of issues and ending up with a manuscript that is only very slightly connected to the original research idea. It’s not very realistic, and actually harmful to scientific integrity. So if we allow ourselves to be humans, who make mistakes, and allow others to read about and learn from our mistakes, wouldn’t that make life easier?

Practices 7 & 8 are related.
Pre-registration: a time-stamped, read-only version of your research plan created before you begin data collection/analysis.
Registered report: similar to pre-registration but your research plan undergoes peer review before results are known. Helpful resource on the Centre for Open Science website here.
Both practices are very useful ways of making you sit down and plan your research before executing it. In the case of registered reports, you will also obtain feedback before executing it, which may be much more useful than receiving feedback after the fact in the regular peer-review system. Although you state what you intend to research in a pre-registration or registered report, it is important to realise that you are not signing a binding contract. If it turns out that another method or additional exploratory analyses are interesting for your research question, you are of course able to make changes. It is, however, your responsibility to transparently report and justify these changes.
As Niamh summarised: these practices do not stifle creativity, but create accountability.

It is important to realise that engaging in OR practices is not an all-or-nothing approach. It’s much more about adopting a certain critical mindset and taking (small) steps that are suitable for you and your specific project.


During the discussion Priya mentioned that if the paper were to get written this year she would probably include the same practices, but elaborate on the increased number of options in which they could be applied. For example, one thing that has changed in recent years is that registered reports have become available for projects with secondary data analysis (rather than it just being available for projects where the data still is to be collected).

Another interesting development is that of Peer Community In Registered Reports, which facilitates scheduled peer review. You indicate beforehand when you intend to hand in Stage 1 of your registered report (Introduction & Methods), and the community tries to arrange reviewers for that particular time frame, meaning that the peer review process can be completed much more rapidly. Priya mentioned that one of the main criticisms of registered reports in the past has been that it was unclear when researchers could start their analysis/data collection because of uncertainty about the duration of peer review. This new facility makes registered reports an even more attractive option.

Will Cawthorn added that Review Commons is another place you can send your manuscript to for general peer review. If the manuscript then passes review, you can choose from a list of journals where to publish your paper. This approach decreases redundancy in peer review (i.e. if rejected from one journal, you don’t go through the roulette wheel of another round of completely new peer review).
Priya confirmed that this procedure is also in place for PCI registered reports. From the website: “Following the completion of peer review, authors of RRs that are positively recommended have the option to publish their articles in the growing list of PCI RR-friendly journals that have committed to accepting PCI RR recommendations without further peer review.”

We also had a group discussion about how we could further promote OR in the University. One suggested route was to include more OR practices in undergraduate and postgraduate course curricula. Will Cawthorn also referred to an OR roadmap for the University of Edinburgh that he and several others (including many from the Library and Research Support Services) are working on, which is due to be published soon. Priya emphasised that it is important to create momentum through bottom-up and top-down initiatives at the same time to bring about real change in the research culture. This nicely connected to our next session on Friday 15 October, which will be on how to build an open research culture in your lab/research group, by Dr. Will Cawthorn (LERU Open Science Ambassador for the University of Edinburgh).

Then I said something silly about how we should all jump aboard the Open Research train and Priya kindly replied with a “choo choo!” making me feel slightly less embarrassed.

All things considered, we look back at a successful first session of this academic year!

The presentation slides and meeting recording can be found on our OSF page. The meeting recordings can also be found on our YouTube channel.

If you’d like to stay up to date with Edinburgh ReproducibiliTea, please consider joining our mailing list by filling in this form and/or following us on Twitter.



A selfish guide to RSpace: Why and How?

In this session, Post-Doctoral Research Fellow Dr. Ralitsa Madsen covers why using an Electronic Lab Notebook (ELN) is a great idea. Dr. Madsen suggests that there are many rewards to using an ELN for a reproducible workflow, including:

  • Saving a lot of time when reading, searching documents and so on.
  • If you are a postgraduate or graduate student, it is more convenient for retrieving the details you need for the materials and methods section of your research.
  • An ELN makes it easier to collaborate, not only within but also outside of the group.
  • Lab members can pick up where you left off, so it ensures the continuity of the research.
  • It is much safer to rely on an ELN than on your hard drive: your documents will remain accessible even if your computer is damaged or stolen.
  • Extensive documentation, version control and traceability of your work are necessary if you would like to make a patent application.
  • In addition, RSpace is well integrated with many other services like Mendeley, Microsoft Office, Dropbox and Google Drive, as well as repositories like GitHub.

After naming several great reasons, Dr. Madsen goes on to do a walk-through of the tool and gives useful tips to facilitate RSpace adoption within the lab: 

First, you should think FAIR: are your documents easily findable? Are they accessible to researchers inside or outside of the lab? Are they interoperable? Can others read through the lab book and reuse your protocol for their experiment?

But for this to work, says Dr. Madsen, you also need to create:

  • A lab book entry template, which will ensure consistency and make it easier to collaborate,
  • Notebook-based project organisation,
  • Data storage rules that encourage the use of external repositories, and
  • Consistent file naming rules

Do not forget to check Dr. Ralitsa Madsen’s RSpace demonstration on Edinburgh Reproducibility’s YouTube channel if you haven’t already!

This blog is written by Bengü Kalo

Find more information about RSpace here

Edinburgh RT YouTube Channel



How (some) scientists talk about openness

Edinburgh ReproducibiliTea held another great session last Friday, with Dr. Rosalind Attenborough from the University of Edinburgh’s Science, Technology and Innovation Studies! Her research focuses on researchers’ attitudes towards open science, and here are the main points of her insightful talk for those who missed it:

For her PhD project, Dr. Attenborough interviewed 54 individuals in biology, across various career stages, genders and disciplines. She mainly explored what open science means to them. Although the interviewees gave various responses to her question, the majority fell under three categories: open access, open data and interpersonal openness.

In general, researchers tend to be positive when talking about open access and believe that it is a good idea, though rarely without mentioning the monetary and bureaucratic issues around it. 

Open data is a completely different story. While interviewing scientists and policymakers, Dr. Attenborough saw that people’s attitudes varied immensely. Some interviewees perceived it as a norm and embraced it with passion, while others were cautious. The reluctance to share data seems to stem from the possibility of receiving destructive criticism or of getting scooped.

The last category, interpersonal openness, refers to the willingness and ability to talk about unpublished research ideas. Like data sharing, interpersonal openness is negatively affected by the competitive research culture as well as by unsupportive mentorship.

Dr. Attenborough’s work is particularly insightful as it sheds light on the ways academia has to change so that researchers, especially ECRs, can feel more comfortable embracing open science practices.

This blog is written by Bengü Kalo



Skills training for Open Science: impact and rewards of working with Edinburgh Carpentries

In our last Edinburgh ReproducibiliTea session, Dr Edward Wallace, PI of the Wallace Lab, shared the benefits of working with Edinburgh Carpentries. Here are some of the key points discussed:

Dr. Wallace argued that all researchers need to learn how to analyse their data reproducibly, reliably and efficiently, regardless of which career stage they are at. 

Researchers need foundational skills like coding, data science and project organisation in order to practice open science. However, many group leaders, postdocs, PhD students and RAs across the university stated that they have no formal training in computing (45%) or statistics (35%) at all. This, says Dr. Wallace, was one of his main reasons for working with the Carpentries.

The Carpentries is built on an open community, ethos and pedagogical drive. All its resources are developed by volunteers on GitHub, and both learning and teaching are well structured.

One very important reason to get involved with Edinburgh Carpentries is that funding bodies are as interested in open science as researchers are. For instance, UKRI-BBSRC plans to “take actions to increase the capacity in computational skills within the biosciences”. In fact, Edinburgh Carpentries is now funded by UKRI for two years to expand its training.

In the session, Dr. Wallace told us that Edinburgh Carpentries is currently developing new teaching materials on statistics, FAIR principles and data management, and data science computing with reproducible workflows. In these workshops, the instructors teach skills that can save a lot of time and improve your work, such as how to organise and document your code efficiently, a topic also discussed in “Good Enough Practices in Scientific Computing”. 

Edinburgh Carpentries is a community that is growing every day and is in need of more instructors. Some of the benefits of getting involved with the community are that:

  • You can get better at coding and teaching
  • The training can help you get funding and
  • You will be a part of a nice, supportive community. 

Many of our attendees seemed interested in participating in Edinburgh Carpentries and in thinking of ways to engage their own labs in open research. To receive updates about EdCarp workshops and/or sign up as an Instructor and/or Helper for Edinburgh Carpentries, you can join the Edinburgh Carpentries mailing list. If you would like to help Dr. Wallace and his colleagues develop a workshop on FAIR practices in the biosciences (and be paid for it), please email them at

This blog was written by Bengü Kalo

Find the resources discussed in this session:

Wallace Lab 
Example open source coding from the Wallace Lab 

Edinburgh Carpentries mailing list to receive updates about EdCarp workshops and/or sign up as an Instructor and/or Helper for Edinburgh Carpentries

Wilson (2016) Software Carpentry: Lessons Learned  

Edinburgh-based Coding Club:  

Mailing List of the Coding Club

Edinburgh RT YouTube Channel

Edinburgh RT OSF page

Edinburgh RT Mailing List

ReproducibiliTea Blog

Selfish Reasons to Work Reproducibly

Reproducibility is not a newly adopted principle; in fact, it dates back to the 1600s. The Irish chemist Robert Boyle was the first to emphasise the importance of obtaining the same results when a study is re-created. Since then, scientists have consistently reflected on how adopting a reproducible workflow helps advance science. The question is, does it only benefit science? During an invited talk at the Edinburgh ReproducibiliTea Journal Club, Kaitlyn Hair explained how sharing your data, materials and code is also in your own interest.

Perhaps the most important reason for adopting a reproducible workflow is simply to avoid a disaster, according to Hair. Researchers all around the world publish continuously, and these publications lead on to other ideas, discoveries and products such as vaccines and cancer therapeutics. A few years ago, scientists from Duke University published a paper in which they claimed to have found a way to efficiently target tumours based on their genetic sequencing. It is not hard to imagine how important an achievement this seemed at the time. However, after some failed attempts to replicate the results, scientists came to a shocking realisation: the promising findings were only a by-product of a technical problem that occurred while copying the data from an Excel sheet into another statistical program. Being unable to provide evidence against fraud claims, “this mistake was career ending for some of its authors”, says Hair.

Such mistakes are not as rare as we would wish, but adopting open science principles can be life-saving in these situations. This was the case for Dr. Julia Strand. In 2018, she published the greatest achievement of her career: it turned out there was a way to improve speech perception and diminish the cognitive effort we spend in noisy environments. Simply presenting a modulating circle that expanded as the speech got louder made participants respond faster, or so she thought. In 2020, she published a blog post that drew a lot of attention from scientists. “The central finding was the result of a software glitch…”, she wrote; she had found the effect only because of a mistake in programming the timing clock. In this case, however, the code was openly available, the study was pre-registered and her work was fully transparent. Detecting the mistake herself also helped her prove that this was not scientific fraud.

Another reason for working reproducibly, according to Hair, is that it makes writing easier. Writing a paper is not a smooth process if you don’t use tools like R, the OSF and GitHub along the way. Having to copy your figures and paste them into your Word document, and going back and forth to create a table for your analysis, can be overwhelming, especially on a tight schedule. One of the best features of R is that you can run your analysis, make tables and figures, and write up your results all in the same place. On top of that, Hair explains that the rticles package for R Markdown comes with many different paper formats, such as the PLOS or Frontiers styles, and the knitr package allows you to save your document as a PDF, Word document or HTML file when you are done.
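As a small sketch of what this looks like in practice: an R Markdown document chooses its journal template and output format in the YAML header at the top of the file. The format shown below, `rticles::plos_article`, is the PLOS template shipped with the rticles package; the title and author are invented for illustration.

```yaml
---
title: "My Analysis"
author: "Jane Doe"
output:
  rticles::plos_article: default  # journal template from the rticles package
# other knitr/rmarkdown output options include
# pdf_document, word_document and html_document
---
```

Swapping the output format is then a one-line change, with the analysis, figures and text untouched.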

In addition, if you are collaborating with other scientists, you can easily end up with dozens of updated versions of the same code. GitHub is a great tool for avoiding getting lost in a pool of code files in such situations. All you need to do is upload your files to a remote GitHub repository so that your co-authors can pull them to their own computers, make changes and push them back to the repository. GitHub tracks the changes that are made and who made them, which means you don’t need to keep every version on your own computer.

Working reproducibly also ensures continuity, according to Hair. Especially as you progress in your career and publish more often, you will notice that you forget what you did, what variable X in your dataset refers to, and how it differs from variable Y, which looks just like it. Uploading our notes to the OSF or creating a bookdown page can help us remember what we did and easily inform our team without having to go through everything again and again. Likewise, you can use GitHub to share your code along with README files that explain exactly what the code does.
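A README does not need to be elaborate to do this job. A short sketch along these lines (the file names are invented for illustration) already tells future-you, and your team, what each file is and how the pieces fit together:

```markdown
# Speech-in-noise analysis

Code and data for the speech perception study.

- `data/raw_responses.csv` – one row per trial; `rt` is response time in ms
- `analysis.R` – cleans the data and fits the main model
- `figures.R` – reproduces every figure in the paper

To reproduce the results, run `analysis.R` first, then `figures.R`.
```

Even a few lines like these answer the “what was X again?” questions before they are asked.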

Furthermore, it helps you get through peer review. “If you are using RMarkdown, it helps the reviewers understand what you have done”, says Hair. Reviewers can simply download your data and RMarkdown file and re-run the whole analysis, which can improve the reviewing process and help avoid misunderstandings.

Reproducibility can also help build your reputation and future-proof your work. Open science practices such as sharing your code and data are increasingly adopted not only by scientists but also by stakeholders. Some funding is available specifically for open science projects, so adopting open science practices can help you secure funding for your studies. Hair explains that some journals, such as eLife, also publish “living figures”, where the text and the figures are updated in the light of new data and information. It is possible to create such work using RMarkdown, and you can even submit it directly to journals like PLOS ONE.

Finally, Hair underlines that adopting open science practices eventually leads to more citations, recognition and opportunities, and therefore helps you build a career as a scientist. She makes a convincing case and finishes her talk by saying:

“It’s not just good for science – it’s good for you!”

Adopting open science practices such as a reproducible and transparent workflow is becoming widespread among scientists. Whatever your reason, be it a selfless act of advancing science or sparing yourself quite a lot of time, the best time to start is now.

Check out the full video on the Edinburgh ReproducibiliTea YouTube page for more detail on “Selfish Reasons to Work Reproducibly” by Kaitlyn Hair.

Her talk includes the reasons discussed in:

Markowetz, F. (2015). Five selfish reasons to work reproducibly. Genome Biology, 16, 274.

Read more about the Duke University paper, Dr. Julia Strand’s blog post and the Living Figures.

For the materials related to this talk please refer to our OSF page

This blog was written by Bengü Kalo