data-as-a-science
Module 1 - Lesson 9: Sample robustness, the Central Limit Theorem, and the ethics and abuses of p-hacking
ETHICS
Appraise the risk of bias in p-hacking, and the risk to scientific self-correction from stigmatising researchers.
P-hacking can happen unintentionally. Review the mechanisms by which it occurs and how to avoid it, and work out what to do about it. However, how ethical is it to stigmatise researchers whose research subsequently turns out to be p-hacked? Examples: Amy Cuddy, Data Colada, and the guidelines from “False-positive psychology”.
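One mechanism by which p-hacking occurs without intent is measuring many outcomes and reporting whichever one reaches significance. The sketch below is illustrative only, not part of the lesson materials: it simulates studies in which no real effect exists and counts how often at least one outcome gives p < 0.05. It assumes independent outcomes, a known standard deviation of 1, and a two-sided z-test; the function names and parameters are hypothetical. Under those assumptions, testing 10 outcomes should give a false-positive rate near 1 - 0.95^10, roughly 0.40, instead of the nominal 0.05.

```python
import random
from statistics import NormalDist, mean


def p_value_two_sample(a, b, sigma=1.0):
    """Two-sided z-test p-value for a difference of means (sigma assumed known)."""
    se = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_outcomes, n_per_group=30, n_studies=2000,
                        alpha=0.05, seed=1):
    """Share of null studies in which at least one outcome has p < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        p_values = []
        for _ in range(n_outcomes):
            # Both groups come from the same population: any "effect" is noise.
            a = [rng.gauss(0, 1) for _ in range(n_per_group)]
            b = [rng.gauss(0, 1) for _ in range(n_per_group)]
            p_values.append(p_value_two_sample(a, b))
        # Reporting only the best outcome is the p-hacking step.
        if min(p_values) < alpha:
            hits += 1
    return hits / n_studies


rate_one = false_positive_rate(n_outcomes=1)
rate_ten = false_positive_rate(n_outcomes=10)
print(f"1 outcome tested:   false-positive rate = {rate_one:.3f}")
print(f"10 outcomes tested: false-positive rate = {rate_ten:.3f}")
```

Deciding on a single primary outcome before collecting data, or correcting the significance threshold for the number of tests, are two common ways to keep the rate near the nominal level.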
CURATION
Prepare data for long-term accessibility through unique digital object identifiers and the platforms that support them.
DOIs and URNs are essential to ensure persistent referencing and discovery; data effectively don’t exist if their location keeps changing. This also leads into a discussion of the value of long-term cohort studies.
ANALYSIS
Assess sample robustness using the Central Limit Theorem, and infer statistical significance based on inference for numerical data.
Central Limit Theorem; variability of the sample mean; choosing approaches based on point estimates or test statistics; difference of means, and hypothesis testing based on the difference of means.
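The two ideas above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the lesson's own code. The first part draws repeated samples from a skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen here only for illustration) and compares the observed spread of the sample means with the standard error sigma / sqrt(n). The second part, `diff_of_means_test` (a hypothetical helper), tests a difference of means using the sample standard deviations and a normal approximation, which is reasonable for large samples only; small samples would call for a t-distribution instead.

```python
import random
from statistics import NormalDist, mean, stdev

rng = random.Random(42)


def sample_means(n, n_samples=5000):
    """Means of repeated samples of size n from a skewed Exponential(1) population."""
    return [mean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(n_samples)]


# Variability of the sample mean: observed spread vs. sigma / sqrt(n), sigma = 1.
for n in (5, 30, 100):
    observed = stdev(sample_means(n))
    expected = 1.0 / n ** 0.5
    print(f"n={n:3d}  observed SE={observed:.3f}  sigma/sqrt(n)={expected:.3f}")


def diff_of_means_test(a, b):
    """Point estimate, standard error, z statistic and two-sided p-value.

    Uses a normal approximation, so it assumes both samples are large.
    """
    diff = mean(a) - mean(b)                       # point estimate
    se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    z = diff / se                                  # test statistic
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, se, z, p


# Simulated (not real) data for two groups with different population means.
group_a = [rng.gauss(10.0, 2.0) for _ in range(100)]
group_b = [rng.gauss(8.0, 2.0) for _ in range(100)]
diff, se, z, p = diff_of_means_test(group_a, group_b)
print(f"difference={diff:.2f}  SE={se:.2f}  z={z:.2f}  p={p:.4f}")
```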
PRESENTATION
Plot sampling distributions of the mean for different sample sizes, and the distribution of different sample means.
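As a rough sketch of what such plots show, the example below draws the sampling distribution of the mean for three sample sizes as text histograms, so that it runs with the Python standard library alone; in practice a plotting library would normally be used. The skewed Exponential(1) population, the bin range, and the helper names `sample_means` and `text_histogram` are assumptions made for illustration. The histograms should narrow around the population mean of 1 and look more symmetric as n grows.

```python
import random
from statistics import mean


def sample_means(n, n_samples=2000, seed=0):
    """Means of repeated samples of size n from a skewed Exponential(1) population."""
    rng = random.Random(seed)
    return [mean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(n_samples)]


def text_histogram(values, lo=0.0, hi=3.0, bins=15, width=50):
    """Render a histogram as text; values outside [lo, hi) are ignored."""
    counts = [0] * bins
    step = (hi - lo) / bins
    for v in values:
        if lo <= v < hi:
            index = min(int((v - lo) / step), bins - 1)
            counts[index] += 1
    peak = max(counts) or 1          # avoid dividing by zero for empty input
    lines = []
    for i, count in enumerate(counts):
        left_edge = lo + i * step
        bar = "#" * round(width * count / peak)
        lines.append(f"{left_edge:5.2f} | {bar}")
    return "\n".join(lines)


for n in (2, 10, 50):
    print(f"Sampling distribution of the mean, n = {n}")
    print(text_histogram(sample_means(n)))
    print()
```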
CASE STUDY
Impact of mothers who smoke on birth weight / Breast-feeding and baby head circumference? Plus, how could these data be p-hacked?