ReproducibiliTea Blog

Bayesian data analysis and preregistration 17/12/2021 with Dr Zachary Horne

This was the final session of 2021. The speaker was Dr Zachary Horne, a lecturer at the School of Philosophy, Psychology & Language Sciences, The University of Edinburgh. Dr Horne talked about Bayesian statistics and preregistration in the context of open research practices. He began by explaining what Bayesian data analysis is: very broadly, it is data analysis that takes into account prior information about a particular domain in addition to the collected data. This prior information is expressed as a prior distribution.

There are different aspects to keep in mind when it comes to preregistration in Bayesian data analysis:

  • How the data will be collected
  • Why the data is being collected in a particular way
  • Sample size
  • Operationalization of constructs
  • Specifying key analyses
  • Aspects of analysis that will be exploratory

Bayesian workflow (Gelman et al., 2020)

  1. Choosing an initial model
  2. Prior predictive checking
  3. Fitting the model
  4. Computational problems and algorithm diagnostics
  5. Posterior predictive checking
  6. Prior robustness

Dr Horne discussed prior predictive checking in some detail. It covers the following:

  • Prior to data collection, is the model consistent with what is already known about the world?
  • What distribution is implied for an outcome variable given prior and likelihood?
  • Assessing the credibility of the model before collecting the data
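
As a rough illustration of these points, the sketch below simulates likes for a tweet with a photo from the priors alone, before any data are seen. Everything in it is an assumption made for illustration and not the model from the talk: the log-normal likelihood, the prior values, the function names and the one-billion-likes plausibility cutoff are all hypothetical.

```python
import math
import random


def prior_predictive_log_likes(intercept_prior, slope_prior, sigma, n_draws, seed=0):
    """Draw log(likes) for a tweet WITH a photo using the priors only.

    intercept_prior and slope_prior are (mean, sd) pairs on the log scale.
    The log-normal likelihood is an assumption chosen for simplicity.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        a = rng.gauss(*intercept_prior)  # log-scale baseline likes (no photo)
        b = rng.gauss(*slope_prior)      # log-scale effect of adding a photo
        draws.append(rng.gauss(a + b, sigma))  # simulated log(likes)
    return draws


def share_implausible(log_draws, max_plausible_likes=1e9):
    """Share of simulated tweets above a (hypothetical) plausibility cutoff."""
    cutoff = math.log(max_plausible_likes)
    return sum(d > cutoff for d in log_draws) / len(log_draws)


# Hypothetical priors: a very vague one and a regularizing one.
vague = prior_predictive_log_likes((0, 10), (0, 10), sigma=1.0, n_draws=5000)
regularizing = prior_predictive_log_likes((4, 1), (0, 0.5), sigma=1.0, n_draws=5000)

vague_share = share_implausible(vague)
regularizing_share = share_implausible(regularizing)
print(f"vague prior: {vague_share:.3f} of simulated tweets exceed the cutoff")
print(f"regularizing prior: {regularizing_share:.3f} of simulated tweets exceed the cutoff")
```

Under these assumed numbers, the vague prior puts a noticeable share of its simulated tweets above a billion likes, while the regularizing prior puts essentially none there. That kind of mismatch with what is already known about the world is what a prior predictive check is meant to reveal.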

The session's running example was the question ‘Do tweets from activist groups (e.g., PETA, Greenpeace) with photos get liked more than tweets without photos?’, which was used to discuss models in Bayesian data analysis. The analysis showed that tweets with photos receive more likes on Twitter. As for which model is the ‘right’ one, the regularizing model gave better estimates of the central tendency of the distribution. However, none of the priors (optimistic, regularizing and improper) captured that the larger central tendency comes not just from many tweets getting 200 or so likes, but also from some tweets getting huge numbers of likes. The models, then, still have room for improvement.

The session concluded with a discussion of preregistering priors in Bayesian data analysis. Dr Horne suggested using regularizing priors for the parameters of interest, especially when those parameters are expected to ‘do something’, and incorporating posterior information into the priors of subsequent related models.
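
One way to picture the last suggestion is a small conjugate example in which the posterior from a first study becomes the prior for a second, related study. This is only a sketch under strong assumptions that were not part of the talk: a normal model with a known standard deviation, a hypothetical helper `update_normal`, and made-up data standing in for log-scale differences in likes.

```python
import math


def update_normal(prior_mean, prior_sd, data, sigma):
    """Posterior (mean, sd) for a normal mean with known data sd `sigma`."""
    prior_precision = 1 / prior_sd ** 2
    data_precision = len(data) / sigma ** 2
    post_precision = prior_precision + data_precision
    sample_mean = sum(data) / len(data)
    post_mean = (prior_precision * prior_mean + data_precision * sample_mean) / post_precision
    return post_mean, post_precision ** -0.5


# Made-up observations for two related studies (illustration only).
study_1 = [0.8, 1.1, 0.6, 1.3]
study_2 = [0.9, 1.0, 1.2]
sigma = 0.5  # assumed known

# Study 1 starts from a regularizing prior centred on "no effect".
posterior_1 = update_normal(0.0, 1.0, study_1, sigma)

# Study 2 uses the posterior of study 1 as its prior.
posterior_2 = update_normal(posterior_1[0], posterior_1[1], study_2, sigma)

# For comparison: analysing both studies at once from the original prior.
pooled = update_normal(0.0, 1.0, study_1 + study_2, sigma)

print("after study 1:", posterior_1)
print("after study 2:", posterior_2)
print("pooled analysis:", pooled)
```

In this simple setting, carrying the posterior forward gives the same answer as analysing both studies together, which is one reason it is natural to reuse posterior information in the priors of later related models.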

This blog post was written by Sumbul Syed

SOCIALS:

Edinburgh RT Twitter

Edinburgh RT OSF page

Edinburgh RT mailing list

For any questions/suggestions, please send us an email at edinburgh.reproducibilitea@ed.ac.uk
