IndyDevDan — Why are top engineers DITCHING MCP Servers? (3 PROVEN Solutions) (Transcript)

Source: https://www.youtube.com/watch?v=OIKTsVjTVJE Title: Why are top engineers DITCHING MCP Servers? (3 PROVEN Solutions) Channel: IndyDevDan Duration: 32m 1s Published: 2025-11-10

What’s up, engineers? Indy Dev Dan here. Once again, my MCP server just ate 10,000 tokens before my agent even started working. That’s 5% of my agent’s context window gone, and my Cal-She prediction market MCP server isn’t even that large. Stack up two or three more MCP servers, and I’ll be bleeding 20%+ context in no time. This isn’t new, but what is new are these three proven alternatives to MCP servers. You know about skills. We’re going to break down two additional approaches you can use to connect your agents to external tools without torching your context window while maintaining control. These approaches are being used by some of the top leading companies in the agent space, like Anthropic, and by some of the top engineers who use agentic coding tools every single day. Let’s go beyond MCP.

The first version of this is the classic MCP server. The great part is that the MCP server manages the connection to your external data source. Here we have a Haiku model and a Cal-She prediction markets MCP server giving us agentic access to the Cal-She prediction markets betting platform. We can search “OpenAI achieves AGI”, get the order book, summarize bets and market sentiment in a concise table — bearish by 2029, only 43% chance. The market is telling us with their dollars that by 2029 we expect only a 43% chance OpenAI achieves AGI. Great tool, great resource — but the big problem is, slash context, 10K tokens torched right away.

These approaches trade off complexity for control. The theme is the same: use raw code as tools.

Approach 2 — CLI. We prompt our agent and give it a specific prompt that teaches them how to use a CLI, a set of functions they can call. Open up a brand new agent, drop the MCP server, just running our cheap fast Haiku model. /mcp shows nothing. How does this work? It all boils down to context, model, prompt, and tools. “Prime Cal-She CLI tools” — our fast agent reads two files: a readme and a CLI. Our agent now understands the tool exactly. The “prime” prompt is a concise 25-line prompt: “Read only these two files, the readme of our second app, and the CLI. As you work with the user, call the right tools to get the data you need.”

Now market search trillionaire — instead of running an MCP tool, our agent runs a CLI command. There’s a little more upfront cost — we are prompt-engineering how to use this — but as a reward, we get to fully control everything the agent can and cannot do. MCP just says “here’s a bunch of tools, here are all the descriptions, here’s all the context you’ll need to consume every time you boot the agent up.” Here, we’re in full control. “Here’s the readme, here’s the CLI, this is what you should do. Do not read any other Python files.” We can do the exact same work — pulling up trillionaire markets, Elon Musk by 2030, etc. Mario (top-tier engineer) argues you don’t need MCP at all. He sets up a prompt through a readme file and just tells the agent: look at this, here are all the tools you can use.

If we restart the agent and run prime, only reading those two files: /context shows tokens down from 10% to just 5.6%. Saved roughly 4% context window with our CLI approach.

Approach 3 — Scripts. Scripts look a lot like skills. Same setup: prompt your agent with a “prime” prompt. The trick is progressive disclosure — Anthropic mentions this in their blog about direct tool calls, scaling better by writing code to tool calls. Anthropic actually ends up calling the MCP server under the hood. Mario and I would argue you don’t need to go that far — just hand the agent the script directly.

Boot up the agent, /context, run /file-system-scripts. Same deal — priming with information. This time we only read a readme. Our agent understands when to use each script — condition mapped to files. This is a powerful agentic data structure to activate or ignore context. /context: less than 1%, just under 2K context. Key part: “I will not read scripts themselves unless —help doesn’t provide information needed.” We are prompt-engineering this outcome.

A lot of engineers are super obsessed with context engineering. Even before context comes prompt engineering. This is still THE critical skill for engineers in 2025 and beyond. The prompt shows up before the context gets in your context window. We have just prompt-engineered out 10,000 tokens.

Default MCP server: 10K tokens. CLI: ~5K. Script-based: ~10% of original. Same prompt structure broken down piece by piece. “Do not read the scripts themselves” + “as you work through each script, you use —help to understand how to use it.” Every single script is a self-contained Astral UV Python single-file script with dependencies declared at the top. Code duplication all over the place — that’s fine, we’re willing to pay that price because it makes our agents more effective.

Kalshi market search government shutdown — uv run app3 file-system-scripts search --json. Effectively the same thing as the MCP server. All these approaches help you solve the same problem. The question is how you do that and what does it cost?

Approach 4 — Skills. Skills look very similar to scripting. The big key difference is how you invoke them. With scripts you have a priming prompt to fire things off. With a skill, the prime prompt IS the skill.md file — the prompt that kicks things off and lets your agent understand the tools. Don’t give away your understanding of how to write great prompts: at the end of the day, everything is just the core four — context, model, prompt, and tools. Every feature builds on top of your agent’s context-model-prompt-tools.

The interesting thing is where and how the tools are discovered. cd app4, boot up Haiku, /mcp nothing, /context — only the definition of our skill. Skills have progressive disclosure. Kalshi market search top LLM — this kicks off the skill. We’ve embedded all the scripts into the skills directory. Self-contained, isolated. Powerful approach to give your agents specific tools without building an MCP server.

Tradeoffs:

	MCP	CLI	Scripts	Skills
Agent-invoked	✅	❌ (slash command first)	❌	✅
Context window	❌ (high consumption)	✅	✅✅	✅✅
Customizability	❌ unless you own it	✅	✅	✅
Portability	low	higher	highest (single file)	high (single directory)
Composability	✅ (resources/prompts/elicitation/sampling)	✅	✅	✅
Simplicity	high	medium	medium	medium
Engineering investment	low (external)	medium	medium	medium

Engineers gloss over MCP features all the time. It’s not just tools — there are resources, prompts, elicitation, completion, sampling. Skills are Claude ecosystem lock-in. Be clear about that. CLI and scripts: full control, share however you like.

My recommendation for tool access — when using EXTERNAL tools (you don’t own):

80% of the time: use MCP servers. Don’t reinvent the wheel.
15% of the time: dial down to CLI when you need to modify, extend, or specifically control tools/context.
5% of the time: scripts or skills, only if you need context preservation (the value-add of scripts/skills is progressive disclosure — you rarely use every single tool every single time).

Stack 2-3 large MCP servers and 20% of your context window is gone. That’s a problem.

For NEW tools (you’re building):

80% of the time: just use CLI. Why? CLI works for you, your team, AND your agents. The trifecta. A simple “prime” prompt with a 3-step workflow: read these files, summarize tools, as you work call the right tools.
10% of the time: wrap in MCP server when you need multiple agents at scale and don’t want to focus on the context. Build CLI first so it’s simple to wrap. All your MCP tools just call right into the CLI — interoperability with MCP because you have a CLI server first.
10% of the time: scripts or skills if you need context preservation.

If you’re really deep in the Cloud Code ecosystem, go right for skills. But I keep a balanced approach. As much as I love Cloud Code, I’m always aware of lock-in.

This codebase is available, link in the description. Check out blogs from Vitalik, from Mario, and from the Anthropic team. Stay focused and keep building.