07-source-material/ — RDCO bookshelf
The canonical home for the raw source material that grounds Ray’s reasoning and the founder’s research.
This directory is the answer to the founder’s 2026-04-30 question: “Where is our ‘bookshelf’ if we need to go back to the source to rediscover the details for a specific situation?”
See ../06-reference/2026-04-30-bookshelf-source-material-architecture-gap for the architectural concept and ../06-reference/2026-04-30-bookshelf-discovery-source-material-inventory for the discovery inventory that produced this initial migration.
Architecture
07-source-material/
├── books/                       ← canonical books (1 dir per title)
│   └── <slug>/
│       ├── source.pdf           (or source.epub, source.txt, source.docx)
│       ├── extracted.txt        (or extracted.md; text version for retrieval)
│       └── metadata.yaml        (title, author, ISBN, source, format, copyright)
├── masterclass/                 ← purchased course materials (multi-file)
│   └── <slug>/
│       └── (mixed PDFs + xlsx + docx)
├── articles/                    ← long-form articles saved raw
├── papers/                      ← arxiv / academic papers
├── transcripts/                 ← video/audio transcripts
│   └── youtube-and-podcast/     (symlink → 06-reference/transcripts)
├── newsletter-bodies/           ← raw newsletter HTML/text (extension to /process-newsletter)
└── web-archives/                ← saved web articles + blog posts
    └── <author-slug>-<title-slug>/
        ├── source.html          (or source.md if natively markdown)
        ├── extracted.md
        └── metadata.yaml
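As a rough illustration of the per-title layout, the sketch below checks whether a book directory fills the three slots shown in the tree (a source file, an extracted text file, and metadata.yaml). The function name `missing_book_files` is hypothetical and not part of any existing vault tooling; it only tests for file presence, so an empty placeholder file still counts as filled.

```python
from pathlib import Path

# Accepted file names, taken from the layout tree.
SOURCE_NAMES = ("source.pdf", "source.epub", "source.txt", "source.docx")
EXTRACTED_NAMES = ("extracted.txt", "extracted.md")


def missing_book_files(book_dir):
    """Return the layout slots that are not filled in book_dir (hypothetical helper)."""
    book_dir = Path(book_dir)
    missing = []
    # Any one of the accepted source formats satisfies the source slot.
    if not any((book_dir / name).is_file() for name in SOURCE_NAMES):
        missing.append("source.*")
    # Either extracted.txt or extracted.md satisfies the extracted slot.
    if not any((book_dir / name).is_file() for name in EXTRACTED_NAMES):
        missing.append("extracted.*")
    if not (book_dir / "metadata.yaml").is_file():
        missing.append("metadata.yaml")
    return missing
```

Calling it on each directory under books/ gives a quick list of titles that still need an extraction or a metadata file.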
Per-source slugging
- Books: <author-lastname>-<short-title-slug> → wheeler-understanding-variation, boyd-vandenberghe-convex-optimization
- Masterclasses: <author-or-publisher>-<course-slug> → wheeler-spc-metrics-masterclass
- Web archives: <author-slug>-<title-slug> → cedric-chin-process-behaviour-charts-more-than-you-need-to-know
- Papers: arxiv-<id>-<title-slug> or <conf>-<year>-<title-slug>
- Newsletter bodies (when extended): <sender-slug>/<YYYY-MM-DD>-<topic-slug>.md
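The slug patterns above could be generated mechanically. The following is a minimal sketch, assuming slugs are lowercase ASCII with runs of other characters collapsed to single hyphens; that normalization rule and the helper names (`slugify`, `book_slug`, `newsletter_path`) are assumptions for illustration, not something this README specifies.

```python
import re
import unicodedata


def slugify(text):
    """Lowercase, strip accents, and collapse non-alphanumeric runs to '-' (assumed rule)."""
    normalized = unicodedata.normalize("NFKD", text)
    ascii_text = normalized.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def book_slug(author_lastname, short_title):
    """Books: <author-lastname>-<short-title-slug>."""
    return f"{slugify(author_lastname)}-{slugify(short_title)}"


def newsletter_path(sender, date_iso, topic):
    """Newsletter bodies: <sender-slug>/<YYYY-MM-DD>-<topic-slug>.md."""
    return f"{slugify(sender)}/{date_iso}-{slugify(topic)}.md"
```

For example, `book_slug("Wheeler", "Understanding Variation")` reproduces the wheeler-understanding-variation slug listed above. Choosing the short title is still a human decision; the helper only normalizes whatever it is given.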
Per-source metadata.yaml schema
title: "Understanding Variation: The Key to Managing Chaos"
authors: ["Donald J. Wheeler"]
publisher: "SPC Press"
edition: "2nd"
year: 1993
isbn: "9780945320807"
pages: 161
source_format: "VitalSource ebook → screencap+OCR text extraction"
acquired: "2026-04-21"
acquired_by: "founder (purchased SPC Press → VitalSource)"
copyright: "Personal-use only. NEVER quote at length. NEVER index in QMD without explicit founder OK."
extraction_quality_notes: "Italics sometimes render with stray slashes. Page-number footers reliable."
canonical_citation_format: "Wheeler, *Understanding Variation*, Ch. N, p. M"
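A small completeness check can catch metadata files that drift from this schema. The sketch below assumes the YAML has already been parsed into a plain dict (for instance with PyYAML's `yaml.safe_load`, which is not imported here), and it treats every field in the example as expected. Which fields are truly mandatory is not stated in this README, so `EXPECTED_FIELDS` is an assumption to adjust.

```python
# Field names copied from the example schema; treating all of them as
# expected is an assumption, not a documented requirement.
EXPECTED_FIELDS = (
    "title", "authors", "publisher", "edition", "year", "isbn", "pages",
    "source_format", "acquired", "acquired_by", "copyright",
    "extraction_quality_notes", "canonical_citation_format",
)


def missing_metadata_fields(metadata):
    """Return expected fields that are absent or empty in a parsed metadata dict."""
    missing = []
    for field in EXPECTED_FIELDS:
        value = metadata.get(field)
        # None, empty string, and empty list all count as "not filled in".
        if value is None or value == "" or value == []:
            missing.append(field)
    return missing
```

An empty return value means every field from the example is present and non-empty; it says nothing about whether the values are correct.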
Copyright posture (inherited from private/ deprecation)
- Never commit. This entire directory is in .gitignore (line: 07-source-material/).
- Never quote at length in vault notes, public writing, social posts, or any context the founder didn’t approve. Standard fair-use limits apply: ≤15 words in quotation marks per excerpt, only for genuine commentary, and never for a summary that would displace (substitute for) the source.
- Don’t index in QMD by default. The Phase-1 retrieval pattern is grep + Read for specific passages. QMD indexing into a separate source-material collection is a Phase-2 decision pending founder green-light.
- Don’t ship full source material to sub-agents unless the founder asks for a specific extraction. Reading these into context costs ~225KB-740KB per file. The right pattern is to grep for the specific concept and read only the matching pages.
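The 15-word quotation limit is easy to check before an excerpt goes into a note. This is a minimal sketch, assuming words are counted by splitting on whitespace; the counting rule and the function names are illustrative assumptions, and the check does not decide whether a use counts as genuine commentary.

```python
MAX_QUOTED_WORDS = 15  # limit stated in the copyright posture


def quoted_word_count(excerpt):
    """Count whitespace-separated words in an excerpt (assumed counting rule)."""
    return len(excerpt.split())


def within_quote_limit(excerpt, limit=MAX_QUOTED_WORDS):
    """Return True when the excerpt is at or under the word limit."""
    return quoted_word_count(excerpt) <= limit
```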
Retrieval patterns
Pattern 1 — grep for a concept
grep -n "operational definition" ~/rdco-vault/07-source-material/books/wheeler-understanding-spc/extracted.txt | head -5
Returns line numbers + matching lines. Use Read with offset/limit on those line numbers to fetch the surrounding context.
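The same grep-then-read step can be sketched in Python for cases where a script needs the surrounding lines directly. The helper name `grep_with_context` and its parameters are hypothetical. It does a case-sensitive substring match (not a regular expression) and stops after `max_hits` matches, loosely mirroring `grep -n ... | head -5`.

```python
from pathlib import Path


def grep_with_context(path, needle, context=2, max_hits=5):
    """Return (line_number, context_lines) for lines containing needle.

    Line numbers are 1-based, like `grep -n`. Matching is a plain,
    case-sensitive substring test.
    """
    lines = Path(path).read_text(encoding="utf-8", errors="replace").splitlines()
    hits = []
    for index, line in enumerate(lines):
        if needle in line:
            start = max(0, index - context)
            end = min(len(lines), index + context + 1)
            hits.append((index + 1, lines[start:end]))
            if len(hits) >= max_hits:
                break
    return hits
```

Note that this reads the whole extracted file into memory, which is fine for a local script but is exactly what should be avoided when the reader is a sub-agent with a limited context window.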
Pattern 2 — citation chain
When making a strong claim that benefits from grounding:
- grep the relevant slug in 07-source-material/books/
- Read the matching passage with surrounding context
- Cite as: Wheeler, *Understanding SPC*, Ch. N, p. M ([07-source-material/books/wheeler-understanding-spc/extracted.txt:LINE](path))
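To keep citations uniform, the citation string could be assembled by a small helper. This is a sketch under assumptions: the function name and parameters are hypothetical, and the link target is set to the extracted file's vault-relative path because the README leaves it as a `path` placeholder.

```python
def format_citation(author, title, chapter, page, slug, line,
                    extracted_name="extracted.txt"):
    """Build a citation in the 'Cite as' shape used by Pattern 2.

    The link target is an assumption: the vault-relative path of the
    extracted file, without the line suffix.
    """
    path = f"07-source-material/books/{slug}/{extracted_name}"
    return (
        f"{author}, *{title}*, Ch. {chapter}, p. {page} "
        f"([{path}:{line}]({path}))"
    )
```

The chapter, page, and line values come from the grep and Read steps; the helper only handles formatting.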
Pattern 3 — Phase-2 (deferred): semantic search via separate QMD collection
Pending founder green-light: ingest 07-source-material/ into a SECOND QMD collection named source-material (distinct from the rdco-vault collection). Query via mcp__qmd__query with collection: "source-material". Keeps synthesis search and source search independent.
Current inventory (2026-04-30 initial migration)
books/ (17 total, ~488MB)
Founder-purchased / pre-existing:
- wheeler-understanding-variation/: Wheeler, Understanding Variation (2nd ed) (OCR text)
- wheeler-making-sense-of-data/: Wheeler, Making Sense of Data: SPC for the Service Sector (OCR text)
- wheeler-understanding-spc/: Wheeler & Chambers, Understanding Statistical Process Control (3rd ed) (OCR text)
- boyd-vandenberghe-convex-optimization/: Boyd & Vandenberghe, Convex Optimization (free PDF, Cambridge)
- corr-stagnitto-agile-data-warehouse-design/: Corr & Stagnitto, Agile Data Warehouse Design (BEAM* Modelling)
Free graduate ML curation (sub-agent download 2026-04-30, per ../06-reference/2026-04-30-dami-defi-12-graduate-ml-textbooks-curation):
- mohri-foundations-of-machine-learning/: 505p, 5.9MB ✅
- prince-understanding-deep-learning/: 541p, 21MB ✅
- mlsysbook-machine-learning-systems/: Vol1+Vol2 = 2152p, 31MB ✅ (note: split into 2 volumes since curation note written; consolidated PDF no longer offered)
- kochenderfer-algorithms-for-decision-making/: 700p, 12MB ✅
- goodfellow-deep-learning/: ⚠️ PARTIAL: only front_matter.pdf (9p). Publisher contract (MIT Press) forbids PDF distribution; chapters HTML-only at deeplearningbook.org/contents/.html. On-demand HTML lookup is the workaround.
- sutton-barto-reinforcement-learning-introduction/: 548p, 70MB ✅
- bellemare-distributional-reinforcement-learning/: ❌ FAILED: no PDF available; publisher contract forbids; HTML-only at /contents/chapterN. Empty source.pdf slot; see metadata.yaml for retrieval path.
- albrecht-multi-agent-reinforcement-learning/: 395p, 7.6MB ✅
- kochenderfer-decision-making-under-uncertainty/: 349p, 9.2MB ✅
- barocas-hardt-narayanan-fairness-and-machine-learning/: 294p, 2.1MB ✅
- murphy-probabilistic-ml-introduction/: 860p, 88MB ✅
- murphy-probabilistic-ml-advanced-topics/: 1370p, 144MB ✅
Free curation result: 10/12 success, 1 partial (Goodfellow front-matter only), 1 failed (Bellemare). ~408MB added.
masterclass/
- wheeler-spc-metrics-masterclass/: 28 PDFs + xlsx + docx from Wheeler’s SPC Metrics Masterclass
web-archives/
- cedric-chin-process-behaviour-charts-more-than-you-need-to-know/: Cedric Chin’s PBC explainer (commoncog.com)
transcripts/
- youtube-and-podcast/: symlink → 06-reference/transcripts/ (367 transcript files: Moonshots, Acquired, Tim Ferriss, Lex Fridman, Dwarkesh, etc.)
Pending acquisitions (per ../06-reference/2026-04-30-bookshelf-discovery-source-material-inventory decision queue)
Free downloads (queued, sub-agent dispatched 2026-04-30; outcomes are recorded in the books/ inventory above: 10 succeeded, 1 partial, 1 failed):
- Mohri et al., Foundations of Machine Learning
- Prince, Understanding Deep Learning
- MLSysBook (community/MIT)
- Kochenderfer/Wheeler/Wray, Algorithms for Decision Making
- Goodfellow/Bengio/Courville, Deep Learning
- Sutton & Barto, Reinforcement Learning: An Introduction
- Bellemare et al., Distributional Reinforcement Learning
- Albrecht et al., Multi-Agent Reinforcement Learning
- Kochenderfer (long-game), Decision Making Under Uncertainty
- Barocas/Hardt/Narayanan, Fairness and Machine Learning
- Murphy Vol I, Probabilistic Machine Learning: An Introduction
- Murphy Vol II, Probabilistic Machine Learning: Advanced Topics
Paid acquisitions (founder yellow-light 2026-04-30, format-clarification pending):
- Reis & Housley, Fundamentals of Data Engineering (O’Reilly direct DRM-free PDF/EPUB ~$45)
- Deming, Out of the Crisis (MIT Press 2018, format TBD)
- Andy Grove, High Output Management (format TBD; likely Kindle DRM)
Web archive scrape candidates (deferred, founder call):
- Kingsbury Jepsen analyses (~50 articles, ~100MB)
- Commoncog source consolidation for top-cited posts
Skill integrations (Phase 2)
- /save-to-bookshelf <url-or-path>: new skill (in development)
- /process-newsletter: extend to also save raw body to newsletter-bodies/<sender>/
- /process-youtube: symlink already in place; future ingest writes directly under transcripts/
- /cite-from-bookshelf <claim>: Phase-2 retrieval skill (deferred until QMD-indexing decision)
Migration history
- 2026-04-30: initial migration. 5 books, 28 masterclass files, 1 web archive moved into structure. Wheeler material moved out of private/ (now deprecated; see private/README.md). Boyd & Vandenberghe Convex Optimization moved out of 06-reference/textbooks/ (folder now empty, can be removed). Founder approved scaffold-first approach, all 12 free downloads, yellow-light on 3 purchases pending format clarification.