Practical Engineering — Hurricane vs. Tiny Houses
Why this is in the vault
22-minute Grady Hillhouse field piece on storm-surge resilience experiments at Oregon State's O.H. Hinsdale Wave Research Laboratory, motivated by FEMA's post-Hurricane-Ian (Sept 2022) finding that flood insurance claims for elevated coastal structures in Fort Myers averaged ~1/3 the cost of claims for non-elevated buildings. The episode follows Dr. Dan Cox's team running identical 1/3-scale model homes through a directional wave basin, the only difference between them being 3 feet of vertical elevation (1 foot at scale). Same surge, same waves, same construction: the lower house ultimately collapses, the higher house takes "hardly any" damage, and the marginal upfront cost of the extra 3 feet is "almost negligible" compared to total structure value. The vault keeps this for four reasons:
- (1) It's a clean layered-defense exemplar. Elevation is one independent failure-mode layer in a stack that also includes building codes, floodplain maps, evacuation policy, and the NFIP's insurance pricing signal; the experiment isolates the marginal value of one layer in a way real disasters can't.
- (2) It's the canonical "binary decision around continuous probability" companion to ~/rdco-vault/06-reference/2026-04-20-practical-engineering-an-engineers-perspective-on-the-texas-floods. Cox explicitly calls out the public misperception that a "500-year flood" requires 5x the elevation of a 100-year flood (it doesn't; the marginal foot does most of the work), and the floodplain-edge "where do you draw the line" problem is the same anti-pattern as Kerr County.
- (3) Physical-model-vs-numerical-simulation epistemics. Cox's "the simulations always look pretty but you have to verify; the physical model is closer to the real world" is the engineering articulation of why RDCO needs end-to-end physical tests of agent behavior, not just unit-test simulations of skill outputs.
- (4) The scale-collapse surprise (the 1/6-scale model failed progressively; the 1/3-scale model failed in fits and starts because the destroyed first floor acted as another level of stilts) is the textbook "you learn things at higher fidelity that don't show up in cheaper models" finding, directly relevant to RDCO's discipline of running real cron cycles, not just dry-run skill simulations.
Episode summary
Grady visits Oregon State's directional wave basin to watch Dr. Dan Cox run two identical 1/3-scale model homes through escalating storm surge — the green house elevated 1 foot above the orange (3 ft real-world). The motivating data is FEMA's post-Ian analysis: elevated structures in Fort Myers had ~1/3 the claim cost of non-elevated. The experimental question, how tall is tall enough?, is the most expensive open question in coastal engineering, because elevating structures has nonlinear cost (passed down to renters, suppressing housing supply) but the cost of getting it wrong is total. Wave conditions escalate over an hour; the lower house's first-story walls fail, then waves penetrate, then portions are swept away, and surprisingly the structure stabilizes for a while because the destroyed first floor functions like a second tier of stilts (a fits-and-starts failure mode the prior 1/6-scale study missed, where collapse was instead a smooth progression). Eventually the orange house collapses; the green house, separated by only 3 ft of elevation, takes nearly zero damage. Cox's punch line: marginal elevation cost is almost negligible vs total structure value; the public misperception that a 500-year storm requires 5x defense vs a 100-year storm is wrong; the right framing is "a little more height does most of the work." The data will be used to calibrate hydrodynamic computer models so future engineers can answer similar questions without building scale houses. SendCutSend sponsor read at the end.
Key arguments / segments
- [00:00:00] Cold open + Hurricane Ian framing. Sept 2022 Ian — one of the strongest/deadliest modern storms. Storm surge, not wind, did most of the catastrophic damage. Buildings weren’t just wet — they were swept off foundations.
- [00:01:00] FEMA post-Ian finding. >1000 flood claims analyzed. Elevated structures in Fort Myers averaged ~1/3 the claim cost of non-elevated buildings. Headline data point that motivates the wave-lab visit.
- [00:02:00] Stilt construction = the dominant coastal flood-mitigation pattern since the early 2000s. Move living space above the surge; let the wave action flow around stilts instead of slamming into walls.
- [00:03:00] But how tall is tall enough? Some elevated buildings still failed in Ian — they weren’t elevated enough. This is the hardest open question in hurricane engineering.
- [00:03:30] The cost-asymmetry conundrum. Losing your home is expensive; building it to withstand anything is also expensive. Higher elevation means higher upfront cost, passed to renters, suppressing housing supply. Hurricanes are rare → uncertain ROI on resilience investment.
- [00:04:00] The codes-and-policies tradeoff. Local governments want resilient development and they want development to happen. NFIP wants fewer claims and must accept reduced property tax base, limited housing supply, individual-compliance burden. “At the scale of a single structure these decisions seem trivial; multiplied across a coastline, each extra foot of elevation has monumental implications.”
- [00:05:00] Why the wave lab matters. Building codes need data + stakeholder consensus. Computer models are limited. You can't summon a real hurricane (and even if you could, it would be an ethics violation). The wave lab is the next-best thing.
- [00:06:00] O.H. Hinsdale Wave Research Lab, Oregon State. Operating since the 1970s. Two test beds: large wave flume (2D) and directional wave basin (3D). The basin uses dozens of independently piston-driven paddles to generate complex multi-directional waves — “a wave pool turned up to 11.”
- [00:07:30] Dr. Dan Cox on choosing the basin over the flume. Wanted to test an entire house in 3D, not a wall slice in 2D.
- [00:08:00] The 1/3-scale models. Two near-perfect replicas of a real coastal house. Identical interior walls, windows, framing, paint. Only difference: green is 1 ft (30 cm) higher than orange — 3 ft (1 m) at full scale. Wave periods, velocities, structural stiffness all scaled to maintain dynamic similarity.
- [00:09:00] Cox on the limits of scale modeling. “Forces can be scaled up; how the structure failed is more qualitative than quantitative scaling.” The mode of failure is what the lab adds over computer simulation.
- [00:10:00] The experiment. Waves start small, build gradually in height and frequency, simulating storm approach. LiDAR + cameras + sensors record wave height, velocity, pressure, accelerations, internal motion. The data calibrates and validates computer models so future engineers don’t need to build scale houses to answer similar questions.
- [00:12:00] Cox on physical-model vs numerical-simulation epistemics. “The numerical simulation is the best we think we can do, and it always looks pretty, but you have to verify it. You have to show that it’s correct, not just looks cool. When we get to the laboratory, like we’re seeing during this test — okay, it’s not as simple as we think. There’s a lot more complexity inherent in the physical model.”
- [00:13:00] The communication-tool subtext. The models have roofing, paint, windows that don’t affect results — they exist so the footage tells a story for non-academic audiences. “You don’t need data to understand which of these two structures you’d want to live in when a hurricane comes.”
- [00:14:00] First damage. Wall under the window of the orange (lower) house gives way. Waves penetrate the interior.
- [00:15:00] The unexpected stabilization. First-story walls fully obliterated. The destroyed first floor almost begins to act like another level of stilts. Second story remains fine for a while even as waves intensify. Cox: at half this scale (1/6), prior tests showed smooth progressive collapse — at 1/3 scale, failure comes in fits and starts. “You learn things at higher scale and realism that aren’t always expected.”
- [00:16:00] Cox’s calibration confession. “It was a tough problem. I thought I knew the answer and it turns out I didn’t. A little bit tough to swallow, but it highlights — okay, this is more complicated than we thought. That’s a success.” The honesty-about-limits move that mirrors the Texas floods essay.
- [00:17:00] The orange house finally collapses. Cox + Grady: “Holy moly.”
- [00:17:30] The green house — almost no damage. Same conditions, 3 ft of elevation, “hardly any damage and that was only really after we tried to take the other one out.”
- [00:18:00] Cox on the misperception. “People talk about 100-year, 500-year and there’s a misperception that the 500-year is five times bigger, five times worse, I have to elevate five times greater. There was not much of a difference in elevation between those two buildings — one is toast, the other had hardly any damage.”
- [00:19:00] The “just don’t build there” counter-position. Grady acknowledges the cleanest answer is buyout-and-buffer — but where’s the line between flood-prone and not? When annual probabilities are 1-in-100 or 1-in-500, the floodplain edge is fundamentally fuzzy.
- [00:20:00] The closing thesis. Engineering as balancing act — strong, safe, affordable, occupiable, even pleasing. Tests like this give clearer definition of the edges of the problem.
- [00:20:30] SendCutSend sponsor read.
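The dynamic-similarity scaling mentioned at [00:08:00] is worth pinning down. The episode doesn't name the scaling law, but gravity-dominated free-surface models like these conventionally use Froude similarity, under which velocity and time scale with the square root of the length ratio. A minimal sketch, assuming Froude scaling (not confirmed on screen):

```python
import math

def froude_scale(length_ratio: float) -> dict:
    """Scale factors under Froude similarity (gravity-dominated free-surface flow).

    The Froude number V / sqrt(g * L) is held equal between model and
    prototype, so velocity and time scale with sqrt(length_ratio).
    """
    return {
        "length": length_ratio,
        "velocity": math.sqrt(length_ratio),
        "time": math.sqrt(length_ratio),   # wave periods shrink by sqrt(lambda)
        "force": length_ratio ** 3,        # rho * g * L^3, same fluid both sides
    }

factors = froude_scale(1 / 3)
# A 10 s prototype wave period becomes about 5.77 s at 1/3 scale:
print(f"model period for a 10 s wave: {10 * factors['time']:.2f} s")
```

This is why the lab's "wave periods, velocities, structural stiffness all scaled" line matters: the 1/3-scale run is not just a smaller house in smaller waves, it's a rescaled dynamical system.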
Notable claims
- [00:01:30] FEMA post-Ian: elevated coastal structures had ~1/3 the average flood-claim cost of non-elevated. Headline empirical finding.
- [00:08:30] Marginal elevation: 3 feet (1 m) full-scale — the only structural difference between the two test houses.
- [00:12:30] Cox: physical-model verification is required because numerical simulations “always look pretty” but need ground truth. The structure-class-determines-mitigations point translated to the physical-vs-numerical model split.
- [00:15:30] Scale-fidelity surprise: 1/6-scale model failed progressively; 1/3-scale failed in fits-and-starts (destroyed first floor acted as second stilts level). “You learn things at higher fidelity that don’t show up at lower fidelity.”
- [00:17:30] Same surge, same waves, 3 ft of elevation difference: lower house = collapse, higher house = nearly zero damage. The headline visual finding.
- [00:18:00] Public misperception: “500-year storm” requires 5x defense over 100-year. False — the marginal foot does most of the work; the ratio isn’t linear in elevation.
- [00:19:00] Floodplain edges are fuzzy because annual exceedance probabilities are 1-in-100 or 1-in-500. Same anti-pattern as Kerr County, Texas (CA-022 source).
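The return-period arithmetic behind the [00:18:00] and [00:19:00] claims is easy to make concrete. The numbers below come from the standard definitions (a "100-year" event has a 1% annual exceedance probability), not from the episode; the horizon math assumes independent years:

```python
def prob_at_least_one(return_period_years: float, horizon_years: int) -> float:
    """P(at least one exceedance within the horizon), assuming independent years."""
    annual_p = 1.0 / return_period_years
    return 1.0 - (1.0 - annual_p) ** horizon_years

# Over a 30-year mortgage:
print(f"100-year flood: {prob_at_least_one(100, 30):.0%}")  # ~26%
print(f"500-year flood: {prob_at_least_one(500, 30):.0%}")  # ~6%
```

A 1-in-100 annual chance compounds to roughly a 1-in-4 lifetime-of-mortgage chance, which is why the floodplain edge is fuzzy: the labels differ by 5x in return period, but the site-level hazard difference is a few percentage points, not a 5x taller wall.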
Mapping against Ray Data Co
- CA-016 (layered-defense-architecture) — direct source addition. Coastal hurricane resilience is a classic layered stack: elevation (structural), building codes (regulatory), floodplain maps (informational), NFIP insurance pricing (economic signal), evacuation policy (operational), buyout-and-buffer (geographic). The experiment isolates one layer’s marginal value — exactly the kind of evidence the layered-defense canon needs. The orange house failed because its elevation layer was just below the wave height; the green house’s elevation layer absorbed the same stress. The independence-of-failure-modes rule from CA-016 maps cleanly: elevation works only if the structural layer is sufficient to remain standing while the wave action flows around stilts. When the orange house’s first-story walls failed, the structure improvised a new second layer (the destroyed first floor as additional stilts) — an emergent layer that bought time. RDCO analog: when one defense layer collapses, sometimes the failure mode itself creates a new layer (e.g., when a skill hits a 429, the rate-limit error itself becomes a backoff signal that reshapes the next call). Worth adding as a sub-pattern: emergent-layer-from-graceful-degradation. Also strengthens the structure-class determines available mitigations sub-pattern — the 1/6-scale model behaved differently than the 1/3-scale because higher fidelity surfaced fits-and-starts failure modes the cheaper model couldn’t capture. Same logic for skill testing: the cheap dry-run can’t surface the fits-and-starts failure modes that only emerge in real cron cycles.
- CA-022 (binary-decision-around-continuous-probability) — direct source addition. Cox’s misperception observation — the public thinks a “500-year storm” needs 5x the defense of a 100-year — is the same anti-pattern as the floodplain map: a continuous probability gradient (annual exceedance probability) gets collapsed to a discrete return-period label, which then gets treated as if the labels were ratios. They aren’t. The marginal foot does most of the work is the operational counter-rule: when the underlying probability is continuous, small interventions near the threshold have outsized effect, and large interventions far from the threshold have diminishing returns. RDCO analog: most “is this newsletter worth filing” / “should this task be escalated” decisions are continuous confidence scores collapsed to binary, with the threshold treated as a hard line rather than a gradient. The right discipline is to expose the score and let the downstream consumer treat the marginal probability appropriately. Add this video as the 2nd source for CA-022 (currently 1 source — Texas floods essay; this brings it to 2; still needs 1 more for ripeness, likely the 3Blue1Brown LLM-token-sampling source already cross-referenced in CA-022 draft).
- Physical-model vs numerical-simulation epistemics → end-to-end skill testing discipline. Cox: "The numerical simulation is the best we think we can do, and it always looks pretty, but you have to verify it." The physical model surfaces complexity the simulation misses. RDCO equivalent: dry-run / unit-test simulation of a skill is "the best we think we can do," but actual cron-cycle execution is the physical model. The fits-and-starts failure mode that emerged at 1/3 scale but not 1/6 is exactly the kind of thing that surfaces only in real cycles, never in --dry-run. Discipline: every cron skill should have at least one full real-execution canary cycle per significant change, not just a unit-test pass. Pairs naturally with CA-019 (design-for-controlled-decay) — the canary cycle is the slot-cut.
- Cox's "I thought I knew the answer and it turns out I didn't" frame. The same honesty-about-limits move Grady made in the Texas floods essay. "It's a success right there. Say, hey, this is more complicated than we thought." RDCO equivalent: calibration-checkpoint discipline — when a skill behaves differently than expected, the right move is to log the surprise, not to paper over it. Worth a SKILL.md addition: when a skill's actual output diverges from the expected behavior, the cycle log should capture the divergence as a first-class event, not as a transient warning.
- Sanity Check angle: “The Marginal Foot.” Lead with the visual (two identical houses, 3 ft of elevation, one collapses, one doesn’t) and Cox’s misperception observation (people think 5x defense for 5x return period). Pivot to data-engineering: the marginal investment in the right place (one more redundancy layer, one more validation step, one more retry) often does most of the resilience work, but teams default to more of the same (5x bigger cluster, 5x longer SLA buffer, 5x more retries) and get diminishing returns. Land on the operating principle: diagnose what layer is actually thin, then add 1 ft to that layer, not 5x to the layer that’s already adequate. ~1500-1800 words. Strong, concrete, visual. Pairs with the existing “100-Year Anything Is a Lie” angle as a two-part series on flood-engineering thinking applied to data infrastructure.
- Communication-tool framing. Cox + Grady deliberately added paint, windows, roofing to the test models so the footage could communicate the result to non-academic audiences. The footage is the real product, not the data. RDCO analog: every cron-cycle report should be designed as a communication tool for the founder, not just a log dump for the assistant. The headline should be the visual (“orange collapsed, green stood — 3 ft difference”), not the data spec.
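The CA-022 analog above ("expose the score and let the downstream consumer treat the marginal probability appropriately") can be sketched in a few lines. Everything here is hypothetical illustration, not an existing RDCO interface; the point is that the filter returns the continuous score alongside the binary call, so items near the threshold stay visible instead of silently vanishing:

```python
from dataclasses import dataclass

@dataclass
class FilingDecision:
    """Carry the continuous score with the binary call, never just the call."""
    score: float           # continuous relevance confidence in [0, 1]
    file_it: bool          # the binary decision a downstream step needs
    near_threshold: bool   # flag the fuzzy floodplain edge for review

def decide(score: float, threshold: float = 0.6, margin: float = 0.1) -> FilingDecision:
    """Collapse a continuous score to binary, but keep the gradient visible.

    Items within +/- margin of the threshold are the floodplain edge:
    the binary label there is nearly arbitrary, so they get flagged.
    (threshold and margin values are illustrative assumptions.)
    """
    return FilingDecision(
        score=score,
        file_it=score >= threshold,
        near_threshold=abs(score - threshold) < margin,
    )

# A 0.58 and a 0.62 land on opposite sides of the line, but both get flagged:
print(decide(0.58))
print(decide(0.62))
```

The design choice mirrors Cox's point: the interesting cases live near the line, and a bare boolean erases exactly the information the downstream consumer needs to apply marginal effort there.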
Open follow-ups
- Update ~/rdco-vault/06-reference/concepts/layered-defense-architecture with this video as a new source. Add the emergent-layer-from-graceful-degradation sub-pattern (the destroyed first floor improvised a new stilts level). Also reinforces the structure-class-determines-mitigations sub-pattern via the 1/6-vs-1/3-scale finding. ~15 min edit.
- Update ~/rdco-vault/06-reference/concepts/binary-decision-around-continuous-probability with this video as the 2nd source. Cox’s “5x return period ≠ 5x elevation” observation is the cleanest operational counter-rule for the anti-pattern. Brings CA-022 from 1 source to 2; needs 1 more for ripeness. ~10 min edit.
- Add a calibration-checkpoint discipline to the cron-cycle skill template. When a skill behaves differently than expected, log the divergence as a first-class event. Mirrors Cox’s honesty-about-the-surprise move. ~30 min edit across the template.
- Add an end-to-end-canary discipline to every cron skill. Real-execution canary cycle per significant change, not just a --dry-run pass. Pairs with CA-019 (slot-cut). ~1-2 hours total across active cron skills.
- Write the Sanity Check piece "The Marginal Foot." ~1500-1800 words. Lead with the visual; pivot to data-infra resilience layer-by-layer; land on the diagnose-then-add-one-foot operating principle. Pairs with "The 100-Year Anything Is a Lie" as a two-part flood-engineering-for-data series.
- CANDIDATES bump for CA-022 — now has 2 sources (Texas floods + this); needs 1 more to hit ripeness. Cleanest 3rd source likely 3Blue1Brown LLM-token-sampling (already cross-referenced in the CA-022 draft).
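The calibration-checkpoint follow-up could look something like this. The event shape, field names, and skill name below are hypothetical sketches, assuming a cycle log that is a list of structured events; the point is that an expectation miss is recorded as its own event type rather than a throwaway warning line:

```python
from datetime import datetime, timezone

def log_divergence(cycle_log: list, skill: str, expected: str,
                   actual: str, note: str = "") -> dict:
    """Record an expectation miss as a first-class cycle event.

    Mirrors Cox's "I thought I knew the answer and it turns out I didn't":
    the surprise itself is the finding, so it gets the same structure as
    any other cycle event instead of a transient warning.
    """
    event = {
        "type": "calibration_divergence",  # first-class event type, greppable later
        "skill": skill,
        "expected": expected,
        "actual": actual,
        "note": note,
        "at": datetime.now(timezone.utc).isoformat(),
    }
    cycle_log.append(event)
    return event

cycle_log: list = []
log_divergence(
    cycle_log,
    skill="vault-filing",  # hypothetical skill name
    expected="3-5 items filed per cycle",
    actual="0 items filed, 12 deferred",
    note="threshold drift? inspect score distribution before papering over",
)
print(cycle_log[0]["type"])
```

Because the divergence is a structured event, a later review pass can grep for calibration_divergence across cycle logs the same way Cox's team mines the wave-basin sensor data: the surprises become the dataset.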
Sponsorship
SendCutSend sponsor read at the end (~1 min, clearly marked, after the technical content closes). Standard Practical Engineering placement. Bias-flagging:
- The technical content (FEMA post-Ian data, O.H. Hinsdale lab capabilities, dynamic-similarity scaling, Cox’s commentary on physical-vs-numerical models, the 3-ft elevation difference, the fits-and-starts failure mode) is editorial and grounded in the on-site experiment. No commercial conflict with SendCutSend.
- No paid placements for the experiment itself — Cox’s team invited Grady to film, which is standard academic-PR; Grady’s framing reciprocates by treating the team’s data needs (calibrating computer models, communicating to public/policy audiences) as the load-bearing argument.
- The 1/3-scale model is a real research artifact, not a demo built for the video. The educational subtext (footage as communication tool) is acknowledged explicitly by Grady, which is unusually honest framing for a sponsored science-communication piece.
- No climate-change framing. Notable absence — the episode stays narrowly on coastal-engineering tradeoffs. Pairs interestingly with the Texas floods essay where Grady is more willing to engage climate-change framing directly.
Related
- ~/rdco-vault/06-reference/transcripts/2026-04-20-practical-engineering-hurricane-vs-tiny-houses-transcript.md — full transcript
- ~/rdco-vault/06-reference/2026-04-20-practical-engineering-an-engineers-perspective-on-the-texas-floods — paired hydrology essay; same return-period misperception, same binary-around-continuous-probability anti-pattern, same honesty-about-limits move
- ~/rdco-vault/06-reference/2026-04-20-practical-engineering-do-retention-ponds-actually-work — companion flood-engineering piece on the regulatory-policy layer of the same defense stack
- ~/rdco-vault/06-reference/concepts/layered-defense-architecture — CA-016; this video adds the emergent-layer-from-graceful-degradation sub-pattern
- ~/rdco-vault/06-reference/concepts/binary-decision-around-continuous-probability — CA-022; this video is the 2nd source (Cox’s 5x-misperception observation)
- ~/rdco-vault/06-reference/concepts/CANDIDATES.md — strengthens CA-022 (now 2 sources) and CA-019 (canary-cycle as slot-cut)