This retrospective is one reading of 2.6 million words. It is not the only reading, and it is almost certainly not the best one. We chose twelve themes, but someone else might find fifteen, or eight, or a structure we never imagined. We used keyword dictionaries and z-scores, but a different method — topic modeling, network analysis, close reading — might surface patterns that our approach systematically missed.
That's not a caveat. It's the point. SLL was built on the premise that education works better when you share openly. This archive extends that premise to the analysis itself. Every transcript, every score, every dictionary, every script is below. Question our questions, or ask entirely different ones. Challenge our answers. Run the numbers yourself.
Search the Full SLL Archive
Looking for a specific episode, guest, or topic? Search the entire Silver Lining for Learning archive. Results open in a new tab on the SLL website.
Open Data Archive
In the spirit of transparency and the open education values that run through SLL itself, we are making the complete research archive freely available. Everything needed to verify, replicate, or extend this analysis is below, piece by piece. Each component can be downloaded individually, or you can grab the full archive as a single download at the bottom.
Methodology Overview PDF
A standalone document describing the full analytical process: how themes were derived, how keyword dictionaries were built and calibrated, how scoring and normalization work, and how independent replication was used to test stability. Covers all eight interactive explorations (PCA, streamgraph, keyword evolution, intersections, Shannon entropy, Spearman correlations, novelty scoring, and convergence), plus theme ranking, host vs. guest divergence, and the Big Three time series. Start here.
Download 31 KB
264 Episode Transcripts ZIP · TXT
The cleaned, standardized corpus: every episode from March 2020 through March 2026, totaling over 2.6 million words. These are the auto-captioned YouTube transcripts that form the raw material for the entire analysis. Each file includes episode metadata (number, title, date, source URL, word count).
Download 4.7 MB
Scored Episode Data ZIP · CSV & JSON
Complete thematic scores for all 264 episodes in both CSV and JSON formats. Includes raw keyword-frequency scores across all 12 themes, z-score normalized versions, top-three theme identification for each episode, per-keyword frequency breakdowns, keyword evolution data, host vs. guest divergence, theme rankings, the Big Three time series, and all five visualization datasets that power the interactive pages on this site.
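As a rough illustration of what "top-three theme identification" means, the sketch below picks the three highest-scoring themes for one episode. It assumes the ranking is done on the z-score normalized values; the theme names and numbers are placeholders, not values from the archive, and the function name `top_three` is hypothetical.

```python
def top_three(theme_scores):
    """Return the three theme names with the highest scores, best first."""
    ranked = sorted(theme_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:3]]


# Hypothetical z-scores for a single episode (placeholder theme names).
episode = {"theme_a": 1.8, "theme_b": -0.4, "theme_c": 0.9, "theme_d": 2.1}
print(top_three(episode))  # ['theme_d', 'theme_a', 'theme_c']
```

The scored CSV and JSON files remain the authoritative source for each episode's actual top three.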
Download 557 KB
Keyword Dictionaries ZIP · TXT
The 12 scoring instruments: one dictionary per theme, each with primary, secondary, and tertiary keywords at different weights. These are what turn transcripts into thematic scores. Includes revision notes and a validation report documenting cross-theme overlap.
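To make the idea of tiered, weighted dictionaries concrete, here is a minimal sketch of how one theme's dictionary might be applied to a transcript. The tier weights (3, 2, 1), the sample keywords, the whole-word matching rule, and the per-1,000-words normalization are all assumptions made for illustration; the dictionaries and scripts in the archive define the real values and procedure.

```python
import re

# Hypothetical tier weights; the archive's dictionaries define their own.
TIER_WEIGHTS = {"primary": 3.0, "secondary": 2.0, "tertiary": 1.0}

# Hypothetical miniature dictionary for one theme (not taken from the archive).
EXAMPLE_THEME = {
    "primary": ["open education"],
    "secondary": ["open access", "oer"],
    "tertiary": ["sharing"],
}


def count_phrase(text, phrase):
    """Count whole-word, case-insensitive occurrences of a phrase."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def score_theme(text, theme, weights=TIER_WEIGHTS):
    """Return weighted keyword hits per 1,000 words (assumed normalization)."""
    n_words = len(text.split())
    if n_words == 0:
        return 0.0
    weighted = sum(
        weights[tier] * count_phrase(text, keyword)
        for tier, keywords in theme.items()
        for keyword in keywords
    )
    return 1000.0 * weighted / n_words


sample = "Open education depends on sharing. OER and open access help."
print(score_theme(sample, EXAMPLE_THEME))  # 800.0
```

In the sample sentence, one primary hit (3), two secondary hits (2 + 2), and one tertiary hit (1) give a weighted total of 8 over 10 words, hence 800 per 1,000 words.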
Download 11 KB
Reproducibility Scripts ZIP · Python
Four Python scripts that reproduce the full analysis pipeline from transcript to scored data: cleaning and standardizing raw transcripts, applying keyword dictionaries to produce scores, computing z-score normalization, and running dictionary validation checks. Python 3.8+, no external dependencies.
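For readers unfamiliar with the normalization step, the following is a minimal standard-library sketch of z-scoring. It assumes each theme is normalized across episodes using the population standard deviation; whether the archive's script uses the population or sample form, or a different axis, is not stated here, so treat this as an illustration and check the script itself.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Convert raw scores to z-scores: (value - mean) / standard deviation.

    Assumes the population standard deviation; returns zeros when all
    values are identical, since the z-score is undefined in that case.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical raw scores for one theme across three episodes.
print(z_scores([1.0, 2.0, 3.0]))
```

A z-score of 0 marks an episode at the theme's average; positive values mark episodes where the theme is more prominent than usual.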
Download 14 KB
Methodology & Dialogue Record ZIP · Markdown
The complete analytical trail: a chronological research log of every step taken, the full 12-theme taxonomy with temporal narratives, both batches of theme derivation notes (100 episodes analyzed in detail), convergence analysis between the two independent samples, temporal breakdowns by year, the dialogic coding note, and a detailed analytical methods document describing the eight visualization methods, including PCA, Shannon entropy, Spearman correlations, novelty scoring, phase detection, and convergence synthesis.
Download 53 KB
Every number in the analysis can be traced from the scored data back through the dictionaries to the original transcripts. The methodology folder preserves the full dialogue between researcher and AI collaborator, making the interpretive process auditable in a way that conventional codebooks do not.
If you use this data, find something interesting, or have used SLL episodes in your teaching, we'd love to hear about it.