06-reference

data engineering central replacing polars with duckdb

Wed Apr 08 2026 20:00:00 GMT-0400 (Eastern Daylight Time) · reference · source: Data Engineering Central · by Daniel Beach

“Why I’m replacing Polars with DuckDB” — Daniel Beach (Apr 9 2026)

Why this is in the vault

Direct external validation of the tool we already chose. RDCO’s graph-db-eval prototype landed on DuckDB on 2026-04-14 (../01-projects/graph-db-eval/prototype-results.md). Five days earlier, a longtime Polars advocate publicly switched a production AWS Lambda workload off Polars onto DuckDB. The substantive lesson is not “Polars vs DuckDB benchmarks” — it’s about maintainer culture and breaking-change discipline as production-stability inputs, which is exactly the lens we’d use when betting infra on any embedded engine.

The core argument

Beach was an early Polars adopter (2022–2023), wrote a popular “replace Pandas with Polars” post, and put Polars in production “when others were still watching from the sidelines.” The break point: a Lambda deploy that changed only business logic, with polars==1.31.0 pinned on the AWS base Python 3.13 image, silently broke the next morning. Compounding the technical surprise was an earlier GitHub interaction where a memory issue he hit was closed unkindly as “not our problem.”

His thesis (paraphrased): tools split into two camps — ones that obsess over developer ease, non-breaking changes, and maintainer kindness, and ones that don’t. He wants the first kind. He swapped Polars for DuckDB and “sleeps better already.”

Quoted phrasing he uses: “the proof is in the pudding”; the maintainer split is between projects that take “ownership to another level” and ones that don’t.

Mapping against Ray Data Co

Strong relevance — direct overlap with an active tooling decision.

  1. Reinforces the DuckDB pick for graph-db-eval. Our prototype chose DuckDB for local-first, embeddable, SQL-native graph traversal. Beach’s switch is unrelated to graph use, but it’s a second independent data point that DuckDB’s stability and maintainer posture hold up under production AWS Lambda load. Worth folding into ../01-projects/graph-db-eval/prototype-results.md as external validation when we move to phase 2.

  2. Maintainer culture as a tool-selection criterion. This is a vault-worthy framework lift. Today our tool-selection rubric leans on technical fit (local-first, typed edges, query expressiveness). Beach’s experience says: add “maintainer responsiveness and breaking-change discipline” as an explicit row in the rubric. Especially load-bearing for embedded engines we’re going to wrap a skill around (like graph-ingest.py).

  3. Cautionary pattern: pinning a version doesn’t save you when the surrounding image moves. Beach pinned polars==1.31.0 and still got bit because his Lambda base image’s Python toolchain shifted underneath. Translation for RDCO: anything we put on Mac Mini cron with pinned package versions is still exposed to OS-level Python upgrades. Worth a one-line note in ../04-tooling/ when we set up the autonomous loop’s Python deps.

  4. Style observation for newsletter remix. Beach uses first-person confession framing (“burn me at the stake,” “ripped Polars from its Lambda throne”) to make a tool-selection essay readable. That’s a Sanity Check-adjacent voice pattern — practitioner reckoning rather than vendor analysis. File mentally for the ../skills/draft-review.md voice corpus.
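The “SQL-native graph traversal” rationale in point 1 can be made concrete with a recursive CTE, which is the standard way an embedded SQL engine walks a graph without a dedicated graph DB. A minimal sketch: it uses the stdlib sqlite3 module purely as a stand-in (DuckDB accepts the same WITH RECURSIVE syntax), and the edges table and node names are hypothetical, not from the prototype.

```python
import sqlite3

# Hypothetical in-memory edge list. sqlite3 is used only because it ships
# with Python; DuckDB runs the identical WITH RECURSIVE query.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
con.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "d"), ("a", "e")],
)

# All nodes reachable from 'a': the recursive CTE seeds with the start
# node, then repeatedly joins edges onto the frontier; UNION dedupes,
# which also terminates the recursion on cyclic graphs.
rows = con.execute("""
    WITH RECURSIVE reach(node) AS (
        SELECT 'a'
        UNION
        SELECT e.dst FROM edges e JOIN reach r ON e.src = r.node
    )
    SELECT node FROM reach
""").fetchall()
print(sorted(n for (n,) in rows))  # -> ['a', 'b', 'c', 'd', 'e']
```

The point for the rubric: this stays plain SQL inside an embedded engine, which is the “local-first, query-expressive” property the prototype selected for.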
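Point 3’s caveat (pinning a package doesn’t freeze the interpreter underneath it) can be turned into a cheap runtime guard for the Mac Mini cron jobs: log interpreter and package versions every run so drift surfaces before breakage does. A minimal stdlib-only sketch; the expected Python version and the pinned package/version are placeholders, not RDCO’s actual pins.

```python
import sys
from importlib import metadata

EXPECTED_PYTHON = (3, 13)      # placeholder: whatever the cron host was provisioned with
PINNED = {"duckdb": "1.2.1"}   # placeholder pin, not an actual project version

def check_runtime_drift():
    """Return human-readable drift warnings; an empty list means no drift."""
    warnings = []
    # The interpreter itself can move even when requirements are pinned --
    # exactly the failure mode Beach hit on the Lambda base image.
    if sys.version_info[:2] != EXPECTED_PYTHON:
        warnings.append(
            f"python moved: expected {EXPECTED_PYTHON}, got {sys.version_info[:2]}"
        )
    for pkg, want in PINNED.items():
        try:
            got = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            warnings.append(f"{pkg} not installed")
            continue
        if got != want:
            warnings.append(f"{pkg} moved: pinned {want}, got {got}")
    return warnings
```

Run it as the first step of the cron entrypoint and write the result to the job log; a non-empty list is the early-warning signal a pinned requirements file alone can’t give.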

Sponsorship

None detected. No sponsor block, no self-consulting CTA in the body, no curation section. This is straight thought-leadership. Beach does link to his “other blog” (his old replace-Pandas-with-Polars post) — that’s self-citation, not sponsorship.

Caveats

All quoted phrasing is ≤15 words and attributed to Daniel Beach / Data Engineering Central. Full article at the source noted in the header; this note paraphrases for vault assessment only.