
REPORT — porteretalndpuzzleaboutknowledge

What I processed

  • Target PDF: inbox/Porter_et_al_2025.pdf
  • Paper key: porteretalndpuzzleaboutknowledge
  • Raw dataset used for recomputation: OSF GPP study 3 nlp.csv (https://osf.io/download/59qd6/)

Sources used

  • Local extraction outputs:
      • out/tables/camelot_stream_p4_t1.csv (sample demographics / passed check counts)
      • out/tables/camelot_stream_p9_t4.csv (published evidence-fixed stakes coefficients)
      • out/tables/camelot_stream_p13_t7.csv (published evidence-seeking stakes coefficients)
      • out/fulltext.md
  • OSF materials:
      • GPP study 3 nlp.csv (participant-level raw data)
      • A Puzzle About Knowledge Ascription.Rmd (analysis script used by authors)

Current computation workflow (updated)

papers/porteretalndpuzzleaboutknowledge/analysis/effect_sizes.qmd now computes split effects from raw data:

  • Evidence-fixed (binary q2_knowledge):
      • split by evidence strength (weak = num_checks == "O", strong = num_checks == "F")
      • stakes contrast within each split via exact 2x2 counts
      • effect size via esc::esc_2x2(es.type = "d")
      • continuity correction of +0.5, applied only if any 2x2 cell is zero
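The evidence-fixed steps above can be sketched in Python. This is an illustrative re-implementation, not the R code in effect_sizes.qmd. It assumes that the d returned for a 2x2 table is the log odds ratio rescaled by sqrt(3)/pi (the usual logit conversion) and that the +0.5 correction is added to all four cells; both should be checked against the esc documentation. The function name and argument order are hypothetical.

```python
import math


def d_from_2x2(low_yes, low_no, high_yes, high_no):
    """Cohen's d and its variance from a 2x2 table via the log odds ratio.

    Rows are the stakes groups (low first, matching d = low - high);
    columns are knowledge ascribed (yes) versus not (no).
    Returns None when one stakes group is empty.
    """
    if low_yes + low_no == 0 or high_yes + high_no == 0:
        return None  # one stakes group is missing in this evidence stratum

    cells = [low_yes, low_no, high_yes, high_no]
    # Assumption: the correction adds 0.5 to every cell, and only when a zero occurs.
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells

    log_or = math.log((a * d) / (b * c))
    var_log_or = 1 / a + 1 / b + 1 / c + 1 / d

    # Logit conversion: d = ln(OR) * sqrt(3) / pi, variance scaled by 3 / pi^2.
    scale = math.sqrt(3) / math.pi
    return log_or * scale, var_log_or * scale ** 2
```

Putting the low-stakes row first makes a positive d mean that knowledge is ascribed more often under low stakes.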

  • Evidence-seeking (numeric nlp):
      • split by evidence strength (weak/strong)
      • stakes contrast within each split via group means/SDs
      • effect size via esc::esc_mean_sd(es.type = "d")
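For the evidence-seeking outcome, the calculation can be sketched as a standardized mean difference with a pooled SD. This is a minimal Python illustration under the assumption that the pooled SD uses the usual (n - 1)-weighted formula; the function name is hypothetical and the pipeline itself calls the R function named above.

```python
import math
from statistics import mean, stdev


def d_from_groups(low, high):
    """Cohen's d = (mean(low) - mean(high)) / pooled SD.

    `low` and `high` are lists of numeric nlp values for the two stakes groups.
    Returns None when a group has fewer than two values (SD undefined)
    or when the pooled SD is zero.
    """
    n_low, n_high = len(low), len(high)
    if n_low < 2 or n_high < 2:
        return None

    sd_low, sd_high = stdev(low), stdev(high)
    pooled_var = ((n_low - 1) * sd_low ** 2 + (n_high - 1) * sd_high ** 2) / (
        n_low + n_high - 2
    )
    if pooled_var == 0:
        return None
    return (mean(low) - mean(high)) / math.sqrt(pooled_var)
```

A group with a single participant has no sample SD, which is why such strata are treated as non-computable rather than forced through the formula.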

Sign convention everywhere:

  • d = mean(low stakes) - mean(high stakes), so a positive d indicates higher values in the low-stakes group

Filtering logic applied (matching paper workflow):

  • q1_importance < 3
  • merge Russia sub-sites into russia
  • age >= 18
  • comprehension-check pass proxy: stakes == importance
  • for split logic: valid evidence code (num_checks in {O, F})
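The filter chain above could look roughly like the following Python sketch. Several details are assumptions rather than facts from the analysis script: the `site` column name, the idea that Russian sub-site labels share a "russia" prefix, and the reading of each listed condition as a keep condition. Rows are plain dicts here to keep the example self-contained.

```python
def apply_filters(rows):
    """Return the rows that survive the listed filters, with site labels normalised."""
    kept = []
    for raw in rows:
        row = dict(raw)  # copy so the caller's data is not mutated

        # Assumption: Russian sub-site labels share the prefix "russia".
        if str(row.get("site", "")).lower().startswith("russia"):
            row["site"] = "russia"

        rating = row.get("q1_importance")
        age = row.get("age")
        stakes = row.get("stakes")

        if rating is None or not rating < 3:
            continue  # q1_importance < 3 (read here as a keep condition)
        if age is None or age < 18:
            continue  # age >= 18
        if stakes is None or stakes != row.get("importance"):
            continue  # comprehension-check proxy: stakes == importance
        if row.get("num_checks") not in {"O", "F"}:
            continue  # valid evidence code needed for the weak/strong split
        kept.append(row)
    return kept
```

Rows with missing values in any filter column are dropped in this sketch; the real workflow may treat them differently.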

YAML update summary

papers/porteretalndpuzzleaboutknowledge/porteretalndpuzzleaboutknowledge.yaml contains four extracted effects per site/language sample:

  • sX_e1: Evidence-fixed, weak evidence
  • sX_e2: Evidence-fixed, strong evidence
  • sX_e3: Evidence-seeking, weak evidence
  • sX_e4: Evidence-seeking, strong evidence

Each effect now includes:

  • split-specific groups (low_stakes, high_stakes)
  • raw-data-based reported_test.notes
  • effect_size from esc when computable
  • needs_review: true + quality_flags: [insufficient_data_for_split_effect] when not computable
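As an illustration of the rule in the list above, the following Python sketch builds one effect record and attaches the review flag when no effect size could be computed. Only `low_stakes`, `high_stakes`, `effect_size`, `needs_review`, `quality_flags`, and the flag value come from this report; the remaining keys and the function name are hypothetical and may not match the actual YAML schema.

```python
def build_effect(effect_id, n_low, n_high, d=None):
    """Assemble a dict for one split effect, flagging it when d is unavailable."""
    effect = {
        "id": effect_id,  # hypothetical key name
        "groups": {
            "low_stakes": {"n": n_low},
            "high_stakes": {"n": n_high},
        },
    }
    if d is not None:
        effect["effect_size"] = {"type": "d", "value": d}  # hypothetical layout
    else:
        # No effect size: mark the entry for manual review instead of guessing.
        effect["needs_review"] = True
        effect["quality_flags"] = ["insufficient_data_for_split_effect"]
    return effect
```

For example, `build_effect("s6_e1", 0, 20)` yields a flagged record with no `effect_size` key.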

Recoding decision — study/site/effect structure

  • Updated on 2026-04-20: the YAML now treats Porter et al. as one paper-reported cross-cultural dam study, not as 15 separate studies.
  • The 15 former site/language study entries were moved into effect-level site_id, site_label, language, language_other, and sample fields.
  • All 60 site×evidence-strength×outcome effect sizes and effect IDs were retained. Effect subgroup labels are prefixed with the site label to keep exported rows readable.
  • This change is semantic/structural only: sites no longer inflate the study count, while site-specific sample metadata remains available to the exporter through effect-level overrides.
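The restructuring described above can be pictured as a flattening step: each former site-level study contributes its effects to a single study, and the site metadata travels with each effect. The Python sketch below is only an illustration; the input layout, the `subgroup` key, and the "label: subgroup" prefix format are assumptions, not the actual YAML schema.

```python
SITE_FIELDS = ("site_id", "site_label", "language", "language_other", "sample")


def flatten_sites(site_studies, study_id="s1"):
    """Merge per-site study entries into one study with effect-level site fields."""
    effects = []
    for site in site_studies:
        for effect in site["effects"]:
            merged = dict(effect)  # effect IDs and effect sizes are kept unchanged
            for field in SITE_FIELDS:
                if field in site:
                    merged[field] = site[field]
            # Prefix the subgroup label with the site label for readable exports.
            merged["subgroup"] = f'{site["site_label"]}: {effect["subgroup"]}'
            effects.append(merged)
    return {"study_id": study_id, "effects": effects}
```

With 15 sites and four effects per site, the result is one study holding 60 effects.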

Computability summary

  • Evidence-fixed split effects: 29 / 30 computable
  • Evidence-seeking split effects: 23 / 30 computable
  • Total: 52 / 60 computable

Machine-readable audit file:

  • papers/porteretalndpuzzleaboutknowledge/scratch/split_effects_from_raw.csv

8 non-computable split effects (explicit)

  1. s6_e1 (India - Meitei, evidence-fixed, weak): n_low=0, n_high=20. Reason: missing one stakes group in this evidence stratum.
  2. s6_e3 (India - Meitei, evidence-seeking, weak): n_low=0, n_high=18. Reason: insufficient per-group data for esc_mean_sd.
  3. s9_e3 (Peru - Shipibo, evidence-seeking, weak): n_low=0, n_high=1. Reason: insufficient per-group data for esc_mean_sd.
  4. s9_e4 (Peru - Shipibo, evidence-seeking, strong): n_low=0, n_high=2. Reason: insufficient per-group data for esc_mean_sd.
  5. s12_e3 (South Africa - Sepedi, evidence-seeking, weak): n_low=1, n_high=10. Reason: insufficient per-group data for esc_mean_sd (SD undefined in low group).
  6. s12_e4 (South Africa - Sepedi, evidence-seeking, strong): n_low=1, n_high=10. Reason: insufficient per-group data for esc_mean_sd (SD undefined in low group).
  7. s13_e3 (South Africa - isiZulu, evidence-seeking, weak): n_low=0, n_high=7. Reason: insufficient per-group data for esc_mean_sd.
  8. s13_e4 (South Africa - isiZulu, evidence-seeking, strong): n_low=1, n_high=4. Reason: insufficient per-group data for esc_mean_sd (SD undefined in low group).
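The list above can be cross-checked against the computability summary with a short Python sketch. The group sizes are copied from the list; the assumption that effect IDs ending in e1/e2 are evidence-fixed and e3/e4 are evidence-seeking follows the YAML summary, and the helper name is hypothetical.

```python
# (n_low, n_high) for each non-computable effect, copied from the list above.
NON_COMPUTABLE = {
    "s6_e1": (0, 20), "s6_e3": (0, 18), "s9_e3": (0, 1), "s9_e4": (0, 2),
    "s12_e3": (1, 10), "s12_e4": (1, 10), "s13_e3": (0, 7), "s13_e4": (1, 4),
}


def shortfall(n_low, n_high):
    """Name the data problem implied by the smaller group, or None if there is none."""
    smallest = min(n_low, n_high)
    if smallest == 0:
        return "empty stakes group"
    if smallest == 1:
        return "SD undefined (n = 1)"
    return None


# e1/e2 are evidence-fixed effects; e3/e4 are evidence-seeking effects.
fixed_missing = sum(eid.endswith(("_e1", "_e2")) for eid in NON_COMPUTABLE)
seeking_missing = len(NON_COMPUTABLE) - fixed_missing

computable_fixed = 30 - fixed_missing      # 29
computable_seeking = 30 - seeking_missing  # 23
```

Every listed effect has a smallest group of size 0 or 1, and the resulting counts match the 29 / 30 and 23 / 30 figures in the computability summary.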

Notes

  • This report supersedes the older Table-4/Table-7-only computation notes for this paper.
  • The older statement that no evidence-seeking effect is computable from the locally extracted tables still holds for table-only extraction. The raw OSF data, however, now support split-effect computation for most effects (52 of 60 overall).