JsPsychR

Open source, standard tooling for experimental protocols: towards Registered reports

Gorka Navarrete & Herman Valencia

Presentation

https://gorkang.github.io/jsPsychRpresentation/

The past

Gather ’round little ones

Old school science

 

Old school science


  1. Read literature & come up with a shiny idea 1
  2. Design and run experiment
  3. Prepare data & explore different analytic approaches
  4. If (when) significant result → Write paper

Scientific method

Some issues

Experimenter degrees of freedom, incentives, issues

The need for significance and novelty:

The natural selection of bad science


The persistence of poor methods results partly from incentives that favour them, leading to the natural selection of bad science. This dynamic requires no conscious strategizing (…), only that publication is a principal factor for career advancement.


(…) in the absence of change, the existing incentives will necessarily lead to the degradation of scientific practices. (Smaldino and McElreath 2016) 1

The replication crisis in Psychology

Replication of 100 studies 1

Statistically significant:

  • 97% original → 36% replications

Effect sizes:

  • Replication effects were half the magnitude of original effects

Open-Science-Collaboration (2015)

Is there a crisis?

Baker (2016)

Contributing factors

Baker (2016)

Context

Mean number of publications for new hires in the Canadian cognitive psychology job market

Pennycook and Thompson (2018)

And not only Psychology

Issues

Issues → Opportunities

Improving Replicability


Improve methods

Reduce errors

Openness



Nosek et al. (2022)

Improving Replicability


Improve methods

  • increase the number of participants (see the power sketch after this list)
  • better measures and manipulations
  • improve design
  • piloting
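
A rough illustration of the sample-size point, a minimal sketch using the pwr package (not part of jsPsychR): if published effects are inflated and the true effect is about half the size, as the replication estimates above suggest, the number of participants needed roughly quadruples.

# Minimal sketch (pwr package): participants per group needed for
# 90% power at alpha = .05 in a two-sample t-test
library(pwr)

pwr.t.test(d = 0.50, sig.level = 0.05, power = 0.9, type = "two.sample")$n  # ~86 per group
pwr.t.test(d = 0.25, sig.level = 0.05, power = 0.9, type = "two.sample")$n  # ~338 per group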

Improving Replicability


Reduce errors

  • preregistration, which reduces:

    • p-hacking
    • hypothesizing after the results are known (i.e. HARKing)
    • selective reporting
  • internal replications

Improving Replicability


Openness

  • transparency of the research process

  • sharing methods, materials, procedures, and data

Registered reports

Registered reports (RRs)



RRs were conceived to alter the incentives for authors and journals away from (…) novel, positive, clean findings and towards (…) rigorous research on important questions. Soderberg et al. (2021)

How RRs work

  • Write introduction, method, … before collecting data!
  • Send to journal for review
  • Revise and resubmit (improve before collecting data)
  • Once you get an In-Principle Acceptance (IPA):
    • collect data & run planned analysis
    • report results and conclusions & send for final review

RRs do help

RRs outperformed comparison papers on all 19 criteria (Soderberg et al. 2021)


Sizable improvements in:

  • rigor of methodology and analysis, and overall paper quality

Statistically indistinguishable in:

  • novelty and creativity

RRs could improve research quality while reducing publication bias…

RRs advantages

  • More open, preregistered, reproducible by design
  • It does not matter if p < 0.05:
    • Fewer incentives for p-hacking
    • No hypothesizing after the results are known (HARKing)
    • More trustworthy results, less noise
  • You can still explore, but you have to say so explicitly

Registered reports are great

But isn’t this a bit… hard?

  • Hard to know how to analyze an experiment before having the data

  • There are always surprises when the data arrive. How do you create an analysis plan that will hold?

  • Collecting ALL-THE-THINGS™ allows me to figure out the best way to do the analysis

  • My code is a mess. Would take more time to make it shareable…

Our path towards RRs

Background

  • At the CSCN (~5-10 PIs) we used different technologies for experiments: PsychoPy, Qualtrics, LimeSurvey, jsPsych,…

  • Each protocol started almost from scratch. A single pre-existing task would define the technology used

  • Multiple implementations of the same tasks, not always exact replicas, not always easy to find

Issues



  • Experiments
  • Resources
  • Reproducibility

Experiment issues

  • Errors in experiment logic
  • Errors in item coding
  • Data not what we expected
  • Data structure made data preparation hard
  • Match between hypotheses and data not clear
  • Variables or questions not used in the analysis/paper

Resources issues: projects as islands

  • Time wasted re-programming tasks
  • Thousands of $/€ ‘invested’ in licenses (e.g. Qualtrics)
  • Piloting protocols as a part-time job for Research Assistants
  • Time wasted re-doing data preparation (each software has its own output format)

Reproducibility issues

  • Can I see what the participants saw in the 2012 protocol?
  • Data preparation/analyses so ugly that sharing them is hard (let me clean this up a bit before sending it)
  • Idiosyncratic analyses, some of which require licensed closed software (SPSS, Matlab,…)
  • Location and organization of projects
  • Why is this 2012 paradigm/data analysis not running?

Issues Survey

Our wish list

Our wish list

  • Open source software based on standard technologies
  • Reusable tasks (my project feeds future projects)
  • Based on mature projects and technologies (De Leeuw 2015) 1
  • As many ‘automagic’ things as possible
  • Easy to create, analyze and share (paradigms and analysis)
  • Balancing participants
  • Online/offline
  • etc.

The present

A few years later…

jsPsychR

Open source tools to create experimental paradigms with jsPsych, simulate participants and standardize data preparation and analysis

The tool


A big catalog of reusable tasks in jsPsychMaker. Each task runs with jsPsychMonkeys to create virtual participants, and has a script in jsPsychHelpeR to automate data preparation (re-coding, reversing items, calculating dimensions, etc.)

The goal


Help us have the data preparation and analysis ready before collecting any real data

  • reduce errors in the protocols
  • make the move towards registered reports easier

So far

  • ~100 tasks (maker + helper)
  • Used by researchers in Chile, Colombia, Spain:
    • > 30 online protocols with > 5000 participants (Prolific Academic, Social Media, etc.) + offline
  • Everything open source. > 80-page manual
  • So, many, errors, caught early ♥

The team

Gorka Navarrete, Herman Valencia


Initial idea and development:

  • Gorka Navarrete, Nicolas Sanchez-Fuenzalida, Nicolas Alarcón, Alejandro Cofre, Herman Valencia

Discussions, ideas, testing:

  • Esteban Hurtado, Alvaro Rivera, Juan Pablo Morales, …

jsPsychR



1) jsPsychMaker

2) jsPsychMonkeys

3) jsPsychHelpeR

jsPsychMaker

Available tasks

Features jsPsychMaker

  • Fully open source, based on web standards (jsPsych)
  • Reuse ~ 100 tasks
  • Online and offline protocols
  • Balancing of participants across between-participants conditions
  • Easy to create new tasks
  • Full control over the order of tasks (randomization, etc.)
  • Participants can continue where they left off (or not)
  • Limits on time and number of participants
  • Multilingual support (for a selection of tasks)
  • All the parameters can be quickly changed by editing a single file

Create Tasks and Protocols

  1. Copy example task to your computer
jsPsychMaker::copy_example_tasks(
  destination_folder = "~/Downloads/ExampleTasks", 
  which_tasks = "MultiChoice")
  2. Copy-paste items into the csv/excel file, edit instructions
# Open the source document with the items
rstudioapi::navigateToFile(here::here("R/BRS.txt")) # Brief Resilience Scale

# Edit instructions
rstudioapi::navigateToFile("~/Downloads/ExampleTasks/MultiChoice/MultiChoice_instructions.html")

# Adapt csv/excel file
system("nautilus ~/Downloads/ExampleTasks/MultiChoice/MultiChoice.csv")
  3. Create protocol (see next slide)
jsPsychMaker::create_protocol(
  canonical_tasks = c("BNT"), # Berlin Numeracy Test
  folder_tasks = "~/Downloads/ExampleTasks/",
  folder_output = "~/Downloads/protocol9996",
  launch_browser = TRUE
)

jsPsychR tools



1) jsPsychMaker

2) jsPsychMonkeys

3) jsPsychHelpeR

jsPsychMonkeys

 

Features jsPsychMonkeys

  • Fully open source (R, Docker, Selenium)
  • Works for online and offline protocols
  • Run monkeys sequentially or in parallel
  • Get pictures of each screen
  • Store logs to make debugging easier
  • Watch the monkeys as they work for you
  • Random pauses or refreshing to simulate human behavior
  • Set a random seed to make the monkeys' behavior predictable

Release monkeys

Release a single Monkey and take a look:

jsPsychMonkeys::release_the_monkeys(
  uid = "1",
  local_folder_tasks = "~/Downloads/protocol9996/",
  open_VNC = TRUE,
  wait_retry = 0
)


Release 10 Monkeys in parallel:

jsPsychMonkeys::release_the_monkeys(
  uid = 1:10,
  sequential_parallel = "parallel",
  number_of_cores = 10,
  local_folder_tasks = "~/Downloads/protocol9996/",
  open_VNC = FALSE
)

jsPsychR tools



1) jsPsychMaker

2) jsPsychMonkeys

3) jsPsychHelpeR

jsPsychHelpeR

Features jsPsychHelpeR

  • Fully open source (R)
  • Get tidy output data frames for each task, and for the whole protocol
  • Standard naming for tasks, dimensions, scales, …
  • Include tests for common issues
  • Snapshots to detect changes in data processing
  • Functions to help create correction scripts for new tasks using a standard template
  • Automatic reports with progress, descriptive statistics, codebook, etc.
  • Create a fully reproducible Docker container with the project’s data preparation and analysis
  • Create a blinded data frame to perform blinded analyses

jsPsychHelpeR


Create project for data preparation:

jsPsychHelpeR::run_initial_setup(
  pid = 9996,
  data_location = "~/Downloads/protocol9996/.data/",
  folder = "~/Downloads/jsPsychHelpeR9996", 
  dont_ask = TRUE
  )

Challenge: everything in 3 minutes?

Create protocol, simulate participants and prepare data…

rstudioapi::navigateToFile("R/script-full-process.R")
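
That script is not reproduced here, but a minimal sketch of an end-to-end run, reusing the calls shown in the previous slides (paths and the protocol id 9996 are illustrative), could look like this:

# Hypothetical end-to-end sketch reusing the calls from previous slides;
# paths and the protocol id (9996) are illustrative

# 1) Create a protocol with the Berlin Numeracy Test
jsPsychMaker::create_protocol(
  canonical_tasks = c("BNT"),
  folder_tasks = "~/Downloads/ExampleTasks/",
  folder_output = "~/Downloads/protocol9996",
  launch_browser = FALSE
)

# 2) Simulate 10 participants in parallel
jsPsychMonkeys::release_the_monkeys(
  uid = 1:10,
  sequential_parallel = "parallel",
  number_of_cores = 10,
  local_folder_tasks = "~/Downloads/protocol9996/",
  open_VNC = FALSE
)

# 3) Create the data preparation project for the simulated data
jsPsychHelpeR::run_initial_setup(
  pid = 9996,
  data_location = "~/Downloads/protocol9996/.data/",
  folder = "~/Downloads/jsPsychHelpeR9996",
  dont_ask = TRUE
)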

Create, simulate, prepare

Survey Experiment Issues

Survey results

Let’s try to download the data, process it and show a report with the results:


Plan A: run Experiment Issues project

rstudioapi::openProject("jsPsychHelpeR-ExperimentIssues/jsPsychHelpeR-ExperimentIssues.Rproj", newSession = TRUE)



Plan B: If something fails, we always have the monkeys!

utils::browseURL(here::here("jsPsychHelpeR-ExperimentIssues/outputs/reports/report_analysis_monkeys.html"), 
                 browser = "firefox")

Limitations

  • Easy to create new scales and simple tasks, but complex experimental tasks require JavaScript knowledge (although a good number of examples are available)

  • Data preparation for new experimental tasks requires expertise in R (simple surveys not so much)

  • Analysis reports require some R knowledge (simple templates available)

  • Needs access to a server for online tasks

  • Only behavioral tasks (no EEG or fMRI; maybe eye tracking…)

The future

The past is always tense, the future perfect

Zadie Smith

Too many things, too little time

Development is linked to our needs, time and resources. Future roadmap:

  • Templates for common experimental designs (tasks, data preparation and analysis)
  • More tasks, translations, tests, …
  • Upgrade, improve, clean, simplify, share
  • jsPsychR paper

Help welcome!

  • Javascript programmers

  • R programmers

  • Documentation

  • Task creators

  • Testers

  • Coffee brewers

  • Patrons

Back to Registered reports




Can jsPsychR really help? (1/2)

  • Protocols are standardized with (mostly) clean code, open source, and based on web standards

  • Data preparation 90% automatic, standardized, beautiful

  • Fewer errors in protocols and in data preparation

  • Time from idea to protocol is much shorter

  • Super easy to self-replicate (adapt, re-run, analysis already works)

  • When errors are found and fixed, old protocols can benefit from the corrections, old results can be checked, …

Can jsPsychR really help? (2/2)

  • Trivial to work on analysis before collecting human data

  • Much easier to write up a good analysis plan, share it, improve it, …

  • Easy to create fully reproducible papers and results’ reports

  • Sharing the protocol, materials and data preparation is painless (single command) 1

  • Creating future-proof fully reproducible data preparation projects (with Docker) is one command away 2

jsPsychR ♥ Registered Reports

More information

RRs’ templates, checklists, participating journals (> 300):

Also, check out the future:

And if you are a reviewer:

References

Baker, Monya. 2016. “Reproducibility Crisis.” Nature 533 (26): 353–66.
Begley, C Glenn, and Lee M Ellis. 2012. “Raise Standards for Preclinical Cancer Research.” Nature 483 (7391): 531–33.
Bettis, Richard A. 2012. “The Search for Asterisks: Compromised Statistical Tests and Flawed Theories.” Strategic Management Journal 33 (1): 108–13.
Boulesteix, Anne-Laure, Sabine Hoffmann, Alethea Charlton, and Heidi Seibold. 2020. “A Replication Crisis in Methodological Research?” Significance 17 (5): 18–21.
Bruns, Stephan B, and John PA Ioannidis. 2016. “P-Curve and p-Hacking in Observational Research.” PloS One 11 (2): e0149144.
Coiera, Enrico, Elske Ammenwerth, Andrew Georgiou, and Farah Magrabi. 2018. “Does Health Informatics Have a Replication Crisis?” Journal of the American Medical Informatics Association 25 (8): 963–68.
De Leeuw, Joshua R. 2015. “jsPsych: A JavaScript Library for Creating Behavioral Experiments in a Web Browser.” Behavior Research Methods 47: 1–12.
Forstmeier, Wolfgang, Eric-Jan Wagenmakers, and Timothy H Parker. 2017. “Detecting and Avoiding Likely False-Positive Findings–a Practical Guide.” Biological Reviews 92 (4): 1941–68.
Frias-Navarro, Dolores, Juan Pascual-Llobell, Marcos Pascual-Soler, Jose Perezgonzalez, and Jose Berrios-Riquelme. 2020. “Replication Crisis or an Opportunity to Improve Scientific Production?” European Journal of Education 55 (4): 618–31.
Ioannidis, John PA. 2005. “Why Most Published Research Findings Are False.” PLoS Medicine 2 (8): e124.
Kerr, Norbert L. 1998. “HARKing: Hypothesizing After the Results Are Known.” Personality and Social Psychology Review 2 (3): 196–217.
Maniadis, Zacharias, Fabio Tufano, and John A List. 2017. “To Replicate or Not to Replicate? Exploring Reproducibility in Economics Through the Lens of a Model and a Pilot Study.” Oxford University Press Oxford, UK.
Nosek, Brian A, Tom E Hardwicke, Hannah Moshontz, Aurélien Allard, Katherine S Corker, Anna Dreber, Fiona Fidler, et al. 2022. “Replicability, Robustness, and Reproducibility in Psychological Science.” Annual Review of Psychology 73: 719–48.
Open-Science-Collaboration. 2015. “Estimating the Reproducibility of Psychological Science.” Science 349 (6251): aac4716.
Pagell, Mark. 2021. “Replication Without Repeating Ourselves: Addressing the Replication Crisis in Operations and Supply Chain Management Research.” Journal of Operations Management 67 (1): 105–15.
Pennycook, Gordon, and Valerie A Thompson. 2018. “An Analysis of the Canadian Cognitive Psychology Job Market (2006–2016).” Canadian Journal of Experimental Psychology/Revue Canadienne de Psychologie Expérimentale 72 (2): 71.
Rogers, Lee F. 1999. “Salami Slicing, Shotgunning, and the Ethics of Authorship.” AJR. American Journal of Roentgenology 173 (2): 265.
Rubin, Mark. 2017. “An Evaluation of Four Solutions to the Forking Paths Problem: Adjusted Alpha, Preregistration, Sensitivity Analyses, and Abandoning the Neyman-Pearson Approach.” Review of General Psychology 21 (4): 321–29.
Smaldino, Paul E, and Richard McElreath. 2016. “The Natural Selection of Bad Science.” Royal Society Open Science 3 (9): 160384.
Soderberg, Courtney K, Timothy M Errington, Sarah R Schiavone, Julia Bottesini, Felix Singleton Thorn, Simine Vazire, Kevin M Esterling, and Brian A Nosek. 2021. “Initial Evidence of Research Quality of Registered Reports Compared with the Standard Publishing Model.” Nature Human Behaviour 5 (8): 990–97.

Thanks!

Gorka Navarrete

gorkang@gmail.com