Practical Engineering — An Engineer’s Perspective on the Texas Floods
Why this is in the vault
23-minute first-person Grady Hillhouse essay on the July 4th 2025 Kerr County, Texas flood that killed 100+ people, many of them children, when the Guadalupe River rose 35 ft in 3 hours in the middle of the night. Grady lives near the affected area, has worked the Guadalupe professionally as a civil engineer, and has kids approaching summer-camp age — the personal proximity is the load-bearing frame for an essay that’s not a forensic post-mortem but a structural critique of how the engineering profession communicates flood risk. The technical core: hydrology relies on rainfall return-period analysis (Technical Paper 40 from 1961, now Atlas 14) that is fundamentally an extrapolation from extremely sparse historical data — about 100 rain gauges within a 50-mile radius of Hunt, TX, only four with records >70 years, none with hourly data before 1940 — to predict events that the stations themselves have never measured. Compounding problems: stream gauges are sparser still, can break during the events they’re trying to measure (one gauge near Hunt broke at 35 ft); rainfall is wildly spatially variable (some watersheds got 100-year rainfall while neighbors got 5-year); and the entire framework assumes temporal stationarity (the distribution of extreme events doesn’t change over time), which climate change is invalidating in ways the methodology can’t easily incorporate. Grady’s load-bearing thesis: floodplain maps draw crisp lines around an inherently uncertain quantity, and the binary “in/out of floodplain” framing actively misleads the public about risk.
The vault keeps this for four reasons: (1) it’s the canonical case for “honest about uncertainty beats false certainty” — directly applicable to RDCO’s distribution work where the temptation is always to flatten uncertainty into a clean number; (2) it strengthens ~/rdco-vault/06-reference/concepts/brier-score (existing concept page) with the civil-engineering analog — Brier score is the meteorologists’ version of what Grady is arguing the hydrology profession needs more of; (3) it adds a sparse-data-extrapolation discipline that maps onto every model trained on incident data (LLM evals, MLOps drift detection, anomaly scoring); (4) the “binary line drawn around a continuous probability gradient” anti-pattern is structurally identical to several agent-system design failures (rate limit boundaries, model temperature thresholds, retry/no-retry decisions) and is worth promoting as a candidate concept page.
Episode summary
23-minute first-person essay by Grady Hillhouse (Practical Engineering) reflecting on the July 4th 2025 Kerr County flood that killed 100+ people in Central Texas when a stalled storm cell over the upper Guadalupe River watershed dropped enough rain to raise the river 35 ft in 3 hours overnight. Grady’s frame is not forensic — he writes from the position of a civil engineer who lives in the area and has kids — and the essay opens onto a structural critique of how flood risk is estimated, mapped, and communicated. Three load-bearing technical sections: (1) the inherent uncertainty of return-period estimation from sparse historical rainfall data (TP40 from 1961, Atlas 14 ongoing NOAA update, both extrapolating from gauges that have never measured the events they’re predicting); (2) stream gauges are sparser, can break during events, and can’t capture spatial variability (rainfall on July 4th showed enormous within-watershed variation — some areas got 100-year rainfall, others got 5-year); (3) temporal stationarity is the hidden load-bearing assumption that climate change is invalidating, but attribution to any single event is still epistemically out of reach. Closes on Grady’s central thesis: floodplain maps draw crisp lines around inherently uncertain quantities, and the engineering profession has a responsibility to communicate uncertainty honestly because “facing the limitations of our understanding head-on actually instills more trust than pretending like we have all the answers.” No paid sponsor read.
Key arguments / segments
- [00:00:00] The setup. Radar animation of central Texas July 3rd-4th 2025: torrential rain across the state from tropical storm Barry remnants, plus a small severe storm cell that got stuck northwest of San Antonio and stayed in place for several hours. Looks insignificant in context but raised the upper Guadalupe higher than ever recorded.
- [00:01:00] The personal frame. Grady lives near the affected area. His family wasn’t directly affected. He has kids approaching summer-camp age. His career has been spent designing flood-coping infrastructure. The emotional opening establishes why this essay exists in this form rather than as a standard PE forensic explainer.
- [00:02:00] The fundamental engineering problem: predicting future loads under deep uncertainty. Especially weather loads — wind, ice, snow, waves, rain. Two opposing forces: caution wants overestimation for safety; cost wants underestimation for budget. The professional discipline is to find the line.
- [00:03:00] Technical Paper 40 (1961). Monumental effort to compile US rainfall data, fit probability distributions, and map results by duration and recurrence interval. The first widely-used resource for flood-frequency analysis in the US. Still in use in many places.
- [00:04:00] The 100-year flood is widely misunderstood. It does NOT mean cyclical recurrence. Floods are statistically independent events. The “100-year” terminology survives because the technically correct definition — “the depth of precipitation over a given duration that has a 1% probability of being equaled or exceeded in a given year” — doesn’t roll off the tongue.
- [00:05:00] Snake-eyes analogy. The odds of rolling snake-eyes in craps are 1 in 36. If you go 35 rolls without snake-eyes, the odds on roll 36 haven’t changed. The dice don’t remember. Floods are the same — the atmosphere rolls the metaphorical dice every year.
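The snake-eyes framing reduces to independent annual Bernoulli trials, which makes the "how likely over a lifetime" arithmetic a one-liner. A quick illustrative check (mine, not from the video):

```python
# Probability of at least one exceedance of a T-year event within n years,
# assuming independent years (the "dice don't remember" assumption).
def exceedance_prob(return_period_years: float, horizon_years: int) -> float:
    annual_p = 1.0 / return_period_years
    return 1.0 - (1.0 - annual_p) ** horizon_years

# A "100-year" flood over a 30-year mortgage: ~26% chance of at least one.
p30 = exceedance_prob(100, 30)
print(f"{p30:.1%}")  # → 26.0%
```

Which is the whole communication problem in miniature: "1% per year" sounds negligible, "1-in-4 over your mortgage" does not.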
- [00:06:00] TP40’s 100-year rainfall for Kerr County: ~9.5 in / 240 mm in 24 hours. Based entirely on pre-1961 historical data. Decades of newer rainfall not included. Statistical methodology of the time was less rigorous than today’s.
- [00:07:00] Atlas 14 (NOAA, ongoing update). Same county now estimated at 11.5 in / nearly 300 mm for the 100-year 24-hour storm — a 20% increase over TP40. What was the 100-year storm in 1961 is now the 50-year storm. Confidence intervals: 8 to 16 in. Wide enough to swallow most engineering decisions.
- [00:08:00] The Guadalupe flood as a sparse-data case study. Hourly rainfall map of central Texas July 4th 2025. The yellow watershed of the upper Guadalupe. The cell that caused most of the fatal flooding was there and gone in 4 hours. Classic flash flood — short burst on a small steep rocky basin.
- [00:08:30] Hourly rainfall records weren’t common until the 1940s. Within a 50-mile radius of Hunt, TX (where most fatalities occurred), about 100 rain gauges were used by Atlas 14. None had hourly data before 1940. Of the ones that did, only four had records >70 years.
- [00:09:00] Spatial variability is enormous and the map is sparse. Pick four random pixels of the radar map and try to recreate it — that’s what we’re doing with rainfall frequency analysis. Most of the rain gauges we use to estimate flood probabilities have never even seen an event of the magnitude we’re trying to use them to predict. The Mona Lisa from a dozen pixels.
- [00:10:00] Within-watershed variation on July 4th. Some areas saw extreme precipitation (>100-year). Most of the upper Guadalupe basin saw 2-to-5-year rainfall. The storm was a flood-of-record event for one part of the watershed and a normal storm for the rest.
- [00:10:30] The thing we actually care about isn’t rainfall — it’s stream rise. Stream gauge upstream of Hunt: river rose 20 ft / 6 m in 3.5 hours starting 2 AM. Downstream: 35 ft / nearly 11 m in 3 hours before the gauge broke. Practically a wall of water. Not enough time for evacuation. >100 dead, many children.
- [00:11:00] Stream gauges are even sparser than rain gauges. Records often shorter, gauges more expensive, can break during exactly the events they’re trying to measure. Engineers/hydrologists then visit affected areas and map the high-water line by hand to validate or fill gaps when gauges break.
- [00:12:00] Hydrologic models add another layer of uncertainty. Most engineering predictions use models that convert rainfall into runoff into flooding — each conversion compounds uncertainty.
- [00:12:30] Temporal stationarity is the hidden load-bearing assumption. The distribution of extreme events doesn’t change over time. Future precipitation can be represented by past observations. This assumption is increasingly indefensible. Within the professional community of hydrologists/engineers/climate scientists, the question isn’t whether the climate is changing but how much, how quickly, and where the effects are most pronounced.
- [00:13:30] Atlas 14 Texas volume tested for long-term trends. Some scattered weather stations showed increased extreme rainfall over time. Most didn’t. Other studies looking at recent decades found more pronounced increases.
- [00:14:00] You can’t ascribe a single event to climate change deterministically. Attribution studies can estimate the contribution of extra energy in the system, but no single weather event is “caused by” warming in a clean attributional sense. Climate change is one more confounding source of uncertainty in flood risk estimation, not the only one.
- [00:15:00] Why does any of this matter? Because before you can mitigate flood impacts, you have to know what the actual risks are. Humans are notoriously bad at using probabilities to make decisions about rare and extreme events. Almost nothing in our biology is optimized for long-term rational decision-making about catastrophes that almost never happen.
- [00:16:00] The deterministic bias in engineering practice. Even within engineering, where we should know better, we have a strong tendency to treat uncertain quantities deterministically. Take the bold number from the table, plug it into the equation, forget the confidence bands existed. In some ways it makes sense — you do have to choose a number for “how high to build the bridge.” But the deterministic choice gets translated into a confidence that doesn’t actually exist.
- [00:17:00] The floodplain map as the canonical anti-pattern. National Flood Insurance Program requires floodplain mapping for participation. Maps draw crisp lines around base flood (100-year) and sometimes 500-year zones. Property owners spend significant resources to shift the line slightly to reduce regulatory burden.
- [00:18:00] Grady’s load-bearing question. What’s the difference in risk profile between just-inside-the-line and just-outside? Is the difference enough to justify a sharp line between them? And if not — if the true situation is more nebulous — is the map doing a good job of communicating flood risk to the public?
- [00:19:00] When the map gets updated, distrust follows. New data → updated maps → public reaction: “We’ve had 200-year floods in the past 5 years. These engineers don’t know what they’re talking about.” Mostly a misunderstanding of return-period semantics, but partly a real failure of communication. Same dynamic as meteorology forecast-criticism without acknowledging the wizard-stuff difficulty.
- [00:20:00] The Kerr County puzzle: why here and not elsewhere? Many areas of central Texas got more than 100-year 24-hour precipitation on July 4th. Severe flooding region-wide. Yet nearly all the fatalities happened in one place. Likely combination of timing (overnight), warning systems, rural location, floodplain regulation differences, and bad luck. Worth comparative study.
- [00:21:00] The closing thesis. People can’t act to reduce their risk unless they can internalize what it actually is. Professionals think about these issues every day. But most people don’t have the same cognizance of the hazards. The engineering profession has a responsibility to improve flood-risk communication — for accessibility AND for honesty.
- [00:22:00] The counterintuitive close. Facing the limitations of our understanding head-on actually instills more trust than pretending like we have all the answers. And when people understand those uncertainties, they get a deeper appreciation for how flood hazards vary across the landscape, giving them MORE insight, not less, to prepare for what’s ahead.
Notable claims
- [00:00:30] July 4 2025 Kerr County flood: one of the deadliest inland flooding events in the past 50 years in the US. >100 dead, many children.
- [00:07:30] Atlas 14 raised the Kerr County 100-year 24-hour rainfall from 9.5 in (TP40, 1961) to 11.5 in — a 20% increase. What was the 100-year storm in 1961 is now the 50-year storm. Confidence intervals: 8-16 in.
- [00:08:30] Hourly rainfall records weren’t common until the 1940s. Within 50 mi of Hunt, TX, ~100 rain gauges in Atlas 14. None had hourly data before 1940. Only 4 had records >70 years.
- [00:10:30] Stream rise during the flood: 20 ft / 6 m in 3.5 hours upstream of Hunt; 35 ft / 11 m in 3 hours downstream before the gauge broke.
- [00:11:00] Stream gauges can go offline during the events they’re trying to measure — a known and ironic failure mode.
- [00:12:30] Within the professional climate-science / hydrology / engineering community, climate change is no longer a yes/no question — only how much, how quickly, where most pronounced.
- [00:14:30] Climate change is on net systematically causing engineering methods to underestimate future flood loads if a stationary climate is assumed. Strong consensus across climate models and recorded data.
- [00:18:00] Floodplain maps draw crisp binary lines around continuous probability gradients. This is a known structural communication problem in the National Flood Insurance Program design.
- [00:20:00] Most central-Texas areas hit by the July 4 2025 storm received >100-year 24-hour rainfall. The Kerr County fatalities were geographically concentrated despite the rainfall being region-wide.
Mapping against Ray Data Co
- The “honest about uncertainty beats false certainty” thesis is the load-bearing argument for ~/rdco-vault/06-reference/concepts/brier-score. Existing concept page covers calibration metrics for probabilistic predictions. Grady’s essay is the civil engineering twin: same epistemic discipline (acknowledge uncertainty bands, calibrate probability claims, distinguish “we computed a number” from “we know the answer”) applied to a different domain. Action: update brier-score.md with this video as a 4th source — it’s the cleanest recent argument that the communication of uncertainty is the under-developed half of the discipline. ~10 min edit.
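For reference while editing the concept page, the calibration metric it covers is a two-line computation. A minimal sketch with illustrative forecasts (not real data):

```python
# Brier score: mean squared error between forecast probabilities and binary
# outcomes. Lower is better; always answering 0.5 scores exactly 0.25.
def brier_score(forecasts, outcomes):
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# A forecaster who says 0.9 when it floods and 0.1 when it doesn't scores
# far better than one who hedges everything at 0.5.
sharp = brier_score([0.9, 0.1, 0.9, 0.1], [1, 0, 1, 0])   # 0.01
hedged = brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])  # 0.25
```

This is the "meteorologists' version" in one formula: the metric rewards stating a calibrated probability, not a crisp deterministic claim.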
- The floodplain-map binary-line anti-pattern maps directly onto agent-system design. RDCO’s skills make many binary decisions (retry yes/no, escalate to founder yes/no, mark task done yes/no, file to vault yes/no) that are fundamentally probability gradients. Most of those decisions happen at hard-coded thresholds (a number in a SKILL.md or a condition in a hook) without any mechanism to surface the underlying uncertainty. Same shape as the floodplain map. The engineering response Grady wants is the same one we should adopt: when a decision is fundamentally probabilistic, expose the probability alongside the decision. Concrete: every `/check-board` task triage decision could carry a confidence (“0.85: this is in scope for Ray to handle this cycle”) and the founder-facing report could show the distribution of confidences over the cycle’s decisions. Forces honest acknowledgement of uncertainty into the workflow.
- Sparse-data extrapolation is a structural risk in every model RDCO runs. Grady’s case: rainfall return periods are extrapolated from gauges that have never seen the events being predicted. Same shape: LLM evals on rare failure modes, MLOps anomaly detection on imbalanced data, RDCO’s own newsletter-quality scores on a tiny sample of human-graded examples. We are extrapolating from sparse data to predict events the data has never seen. Worth a discipline: every model output that’s an extrapolation outside the training distribution should be labeled as such, and the audit pipeline should flag predictions in extrapolation zones for human review. Concrete first cut: add an `extrapolation_flag` field to `audit-newsletter-outputs.py` for any score where the input lies outside the training distribution.
- Temporal stationarity is the hidden load-bearing assumption in EVERY backtested skill. The hydrology equivalent of “future will look like past” is in every skill that uses historical data to predict outcomes (process-newsletter sponsor-detection, finance-pulse anomaly thresholds, sync-contacts last-touched decay curves). All assume the underlying generative process is stationary. Climate change for hydrologists is API drift / model behavior change / market regime change for our skills. Worth a quarterly audit: for each cron skill that uses thresholds, validate the thresholds against the most recent 90 days of data — if drift is detected, recalibrate. Maps cleanly to the CA-019 (design-for-controlled-decay) discipline already in the candidates list — quarterly threshold-recalibration is a slot-cut.
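A minimal sketch of the extrapolation-zone check. The field name comes from this note; the range-based heuristic, the function shape, and the feature names are assumptions, not the existing `audit-newsletter-outputs.py` API:

```python
# Flag a score as an extrapolation when any input feature falls outside the
# range the grading sample covered -- Grady's "the gauge has never seen an
# event of this magnitude" warning, applied to model inputs.
def extrapolation_flag(features: dict, training_ranges: dict) -> bool:
    return any(
        features[name] < lo or features[name] > hi
        for name, (lo, hi) in training_ranges.items()
    )

# Hypothetical feature ranges observed in the human-graded training sample.
ranges = {"word_count": (400, 2200), "link_count": (0, 30)}
assert extrapolation_flag({"word_count": 5000, "link_count": 4}, ranges)
assert not extrapolation_flag({"word_count": 900, "link_count": 12}, ranges)
```

The point is not the heuristic (a min/max box is crude) but the labeling: an extrapolated score gets routed to human review instead of being reported as if the model had seen its like before.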
- The Kerr County “why here and not elsewhere” puzzle is a perfect Sanity Check angle. Most of central Texas got >100-year rainfall on July 4. Severe flooding region-wide. Yet fatalities concentrated in one place. The data-engineering analog: most production data systems have similar inputs, similar workloads, similar SLAs — but outage damage is highly concentrated in a few systems. Why? Combination of timing, warning systems, geography, regulation, bad luck. The Sanity Check angle: when a catastrophe is concentrated, the lessons aren’t in the catastrophe itself — they’re in the comparison to the systems that didn’t fail. Worth a 1500-word essay on “the value of comparing the catastrophe to its near-miss neighbors.” Pairs with CA-018 (emergent correlated failure).
- The “humans are notoriously bad at using probabilities to make decisions about rare events” frame is operationally relevant for the channels-agent. The autonomous loop sometimes faces low-probability-but-high-cost decisions (should I escalate this ambiguous task? should I publish this draft? should I delete this stale file?). The default human-cognitive-bias pattern is to either treat the event as impossible (no escalation, no review) or treat it as catastrophic (escalate everything, review everything). The right answer is calibrated probability + asymmetric-cost weighting. Worth a SKILL.md addition for `/check-board`: for any ambiguous decision, compute a probability and an asymmetric cost ratio; act accordingly rather than defaulting to either extreme.
- Sanity Check angle: “The 100-Year Anything Is a Lie.” Lead with the visceral image (35 ft of stream rise in 3 hours, gauge breaks during the flood), the stat (none of the rain gauges within 50 mi of Hunt had hourly data before 1940), and Grady’s load-bearing question (what’s the difference between just-inside-the-floodplain and just-outside?). Pivot to data: “100-year MTBF” on hardware, “99.99% uptime” on cloud services, “p99 latency” on APIs — all extrapolations from sparse data being communicated as crisp deterministic claims. Land on the operating discipline: publish the confidence band, not just the number. ~1500-1800 words. Strong angle.
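The calibrated-probability-plus-asymmetric-cost rule can be sketched in a few lines. The cost numbers are hypothetical, not from any existing SKILL.md:

```python
# Escalate when the expected cost of staying silent exceeds the expected
# cost of acting -- rather than rounding the probability to 0 or 1 first.
def should_escalate(p_problem: float, cost_miss: float, cost_false_alarm: float) -> bool:
    expected_cost_silent = p_problem * cost_miss
    expected_cost_escalate = (1.0 - p_problem) * cost_false_alarm
    return expected_cost_silent > expected_cost_escalate

# A 5% chance of a real problem is worth escalating when a miss costs
# 100x a false alarm (0.05 * 100 > 0.95 * 1)...
assert should_escalate(0.05, cost_miss=100, cost_false_alarm=1)
# ...but not when the costs are symmetric.
assert not should_escalate(0.05, cost_miss=1, cost_false_alarm=1)
```

This is the formal version of "not impossible, not catastrophic": the probability and the asymmetry both stay explicit instead of collapsing into a gut-feel binary.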
Open follow-ups
- Update ~/rdco-vault/06-reference/concepts/brier-score with this video as a 4th source. The civil-engineering twin of the calibration discipline. ~10 min edit.
- Add an `extrapolation_flag` field to `audit-newsletter-outputs.py` for any score where the input lies outside the training distribution. Mirrors Grady’s “the gauge has never seen this magnitude” warning. ~30 min build.
- Add quarterly threshold-recalibration to every cron skill that uses thresholds: finance-pulse anomaly thresholds, process-newsletter sponsor-detection thresholds, sync-contacts decay curves. Recalibrate against the most recent 90 days; flag drift. ~2 hours total. Maps to CA-019 (design-for-controlled-decay) — this is a slot-cut.
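The quarterly recalibration could be sketched like this; the percentile rule and the 10% drift tolerance are assumptions to be tuned per skill, not existing config:

```python
# Recompute a percentile-based anomaly threshold from the most recent
# 90 days of observations and flag drift against the configured value.
def check_threshold_drift(recent_values, configured_threshold, pct=0.99, tol=0.10):
    ordered = sorted(recent_values)
    idx = min(int(pct * len(ordered)), len(ordered) - 1)
    recalibrated = ordered[idx]
    drift = abs(recalibrated - configured_threshold) / configured_threshold
    return recalibrated, drift > tol  # (new threshold, drift_detected)

# 90 days of a slowly shifting metric: the old threshold of 50 no longer
# sits near the 99th percentile, so drift is flagged for recalibration.
values = [40 + 0.3 * day for day in range(90)]
new_t, drifted = check_threshold_drift(values, configured_threshold=50)
assert drifted
```

This is the anti-stationarity check in miniature: the threshold is re-derived from recent data instead of trusted because it was once correct.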
- Add to the `/check-board` SKILL.md: for any ambiguous decision, compute a probability + asymmetric-cost ratio. ~15 min edit. Avoids the cognitive-bias default to either treat-as-impossible or treat-as-catastrophic.
- Write the Sanity Check piece “The 100-Year Anything Is a Lie.” Lead with Kerr County stream rise + gauge breakage; pivot to MTBF / uptime / p99 claims; land on the publish-the-confidence-band discipline. ~1500-1800 words. Strong angle; pairs naturally with the next finance-pulse cycle if recent anomalies surface.
- Add CA-022 candidate: binary-decision-around-continuous-probability anti-pattern. Currently 1 source (this video, floodplain map example). Needs 2 more. Likely candidates: any post-mortem on a hard-threshold-driven outage (LB threshold, retry threshold, autoscaling threshold), a UX critique of binary trust badges, or a paper on probabilistic vs deterministic ML decision boundaries.
Sponsorship
No paid sponsor read. Grady mentions his book and Nebula at the start (per video description) but the YouTube cut omits the Nebula promo entirely — the video closes on the thesis, not on a pitch. This is unusual for Practical Engineering and signals Grady deliberately chose not to monetize this episode. Per RDCO bias-flagging discipline:
- The technical content (TP40, Atlas 14, return-period semantics, snake-eyes analogy, stream-gauge fragility, temporal-stationarity assumption, attribution-study limits, NFIP floodplain-map mechanics) is editorial — drawn from Grady’s professional civil-engineering background and standard hydrology references.
- The first-person framing (Grady lives near the area, has kids approaching summer-camp age) is a deliberate editorial choice rather than a sponsored angle. The personal proximity is load-bearing for the “communicate uncertainty honestly” thesis — Grady has skin in the game on this risk.
- No paid sponsor read in the YouTube cut. The book + Nebula references in the description are creator self-promotion, not paid placements.
- The climate-change framing acknowledges public discourse contention up-front and stays on the professional-consensus position (changing yes; how much/how fast/where = open question). Worth flagging as the most politically-loaded content in the video — Grady is careful but the position is not neutral and reflects professional climate-science consensus, not a balanced-debate framing.
Related
- ~/rdco-vault/06-reference/transcripts/2026-04-20-practical-engineering-an-engineers-perspective-on-the-texas-floods-transcript.md — full transcript
- ~/rdco-vault/06-reference/2026-04-20-practical-engineering-do-retention-ponds-actually-work — paired hydrology piece; retention ponds are downstream of exactly this rainfall-uncertainty discipline (CA-018 source)
- ~/rdco-vault/06-reference/2026-04-20-practical-engineering-spillway-failed-on-purpose — engineered-failure-mode hydrology; the Asheville bypass-line case is the infrastructure-design response to flood-prediction uncertainty (CA-016, CA-018 source)
- ~/rdco-vault/06-reference/2026-04-20-practical-engineering-californias-tallest-bridge-has-nothing-underneath — same-cycle PE companion; both are essays-about-the-engineering-profession rather than standard explainers
- ~/rdco-vault/06-reference/concepts/brier-score — existing concept page on calibration; this video is the civil-engineering twin and should be added as a 4th source
- ~/rdco-vault/06-reference/concepts/CANDIDATES.md — surfaces CA-022 candidate (binary-decision-around-continuous-probability anti-pattern); strengthens CA-019 (design-for-controlled-decay) with quarterly threshold-recalibration as the slot-cut analog for cron-skill thresholds