Apideck CLI – An AI-agent interface with much lower context consumption than MCP (apideck.com)

76 points by gertjandewilde 3 hours ago

caust1c 3 hours ago

I'm getting tired of everyone saying "MCP is dead, use CLIs!".

Yes, MCP eats up context windows, but agents can also be smarter about how they load the MCP context in the first place, using a similar strategy to skills.

The problem with tossing it out entirely is that it leaves a lot more questions for handling security.

When using skills, there's no implicit way to apply policies in the same way across many different servers.

MCP gives us a registry such that we can enforce MCP chain policies, i.e. no doing web search after viewing financials.

Doing the same with skills is not possible in a programmatic and deterministic way.

There needs to be a middle ground instead of throwing out MCP entirely.
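
To make the "no web search after viewing financials" idea concrete, here is a minimal Python sketch of a chain policy. It assumes a gateway that sees every tool call in a session. The `ChainPolicy` class and the tool names `view_financials` and `web_search` are hypothetical and not part of MCP itself.

```python
from dataclasses import dataclass, field


@dataclass
class ChainPolicy:
    """Deny a tool once any of its 'blocking' tools has run in the session.

    Hypothetical sketch: MCP does not define this class. It only shows how a
    gateway that sees every call could enforce ordering rules deterministically.
    """
    # Maps a tool name to the set of earlier tools that forbid it.
    forbidden_after: dict[str, set[str]]
    history: list[str] = field(default_factory=list)

    def allow(self, tool: str) -> bool:
        blockers = self.forbidden_after.get(tool, set())
        return not blockers.intersection(self.history)

    def call(self, tool: str) -> str:
        if not self.allow(tool):
            raise PermissionError(f"{tool} is blocked by an earlier call in this session")
        self.history.append(tool)
        return f"called {tool}"


policy = ChainPolicy(forbidden_after={"web_search": {"view_financials"}})
policy.call("web_search")        # allowed: nothing sensitive has been seen yet
policy.call("view_financials")   # allowed
# A further policy.call("web_search") would now raise PermissionError.
```

The check is a plain set intersection on the call history, so the outcome does not depend on what the model decides to do.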

ewild 27 minutes ago

I feel like I don't fully understand MCP. I've done research on it but I definitely couldn't explain it. I get lost on the fact that, to my knowledge, it's a server with API endpoints that are well defined in a JSON schema, which is then sent to the LLM, and the LLM parses that and decides which endpoints to hit (I'm aware some LLMs use smart calling now, so they load the tool name and description but nothing else until it's called). How exactly are you stopping the LLM from using web search after it hits a certain endpoint in your MCP server? Or is this referring strictly to when you own the whole workflow, where you can then deny web search capabilities on the next LLM step?

yoyohello13 3 hours ago

It is a weird trend. I see the appeal of Skills over MCP when you are just a solo dev doing your work. MCP is incredibly useful in an organization context when you need to add controls and process. Both are useful. I feel like the anti-MCP push is coming from people who don't need to work in a large org.

krzyk 2 hours ago

Not sure. Our big org banned MCPs because they are unsafe, and they have no way to enforce only certain MCPs (in GitHub Copilot).

9rx 2 hours ago

> I feel like the anti-MCP push is coming from people who don't need to work in a large org.

Any kind of social push like that is always understood to be something to ignore if you understand why you need to ignore it. Do you agree that a typical solo dev caught in the MCP hype should run the other way, even if it is beneficial to your unique situation?

CuriouslyC 2 hours ago

Skills are just prompts, so policy doesn't apply there. MCP isn't giving you any special policy control there, it's just a capability border. You could do the same thing with a service mesh or any other capability compartmentalization technique.

The only value in MCP is that it's intended "for agents" and it has traction.

consumer451 2 hours ago

> Yes, MCP eats up context windows, but agents can also be smarter about how they load the MCP context in the first place, using similar strategy to skills.

I have been keeping an eye on MCP context usage with Claude Code's /context command.

When I ran it a couple months ago, supabase used 13.2k tokens all the time, with the search_docs tool using 8k! So, I disabled that tool in my config.

I just ran /context now, and when not being used it uses only ~300 tokens.

I have a question. Does anyone know a good way to benchmark actual MCP context usage in Claude Code now? I just tried a few different things and none of them worked.

skybrian 3 hours ago

Towards the end of the article, they do write about some things that MCP does better.

il 3 hours ago

Tool search pretty much completely negates the MCP context window argument.

mvrckhckr 3 hours ago

I agree, and it's context-dependent when to use what (the author mentions use cases for other solutions). I'm glad there are multiple solutions to choose from.

j45 3 hours ago

MCPs are handy in their place. Agents calling CLI locally is much more efficient.

mihir_kanzariya an hour ago

The real issue isn't MCP vs Skills/CLIs, it's that most MCP servers dump their entire schema into context on init regardless of whether you'll actually use those tools. Lazy loading tool definitions based on what the agent is actually doing would solve like 80% of the bloat problem without throwing out the protocol entirely.

The security/registry point in the thread is underrated too. Being able to enforce policies at the protocol level is something you lose completely with ad hoc skill files.
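
A minimal Python sketch of the lazy-loading idea, under stated assumptions: the tool catalog below is invented, and sizes are measured in JSON characters as a rough stand-in for tokens. The agent gets a cheap index up front and fetches a full schema only for the tool it picks.

```python
import json

# Hypothetical tool catalog; names and schemas are made up for illustration.
TOOLS = {
    "create_invoice": {
        "description": "Create an invoice for a customer.",
        "input_schema": {
            "type": "object",
            "properties": {
                "customer_id": {"type": "string"},
                "amount_cents": {"type": "integer"},
                "currency": {"type": "string"},
            },
            "required": ["customer_id", "amount_cents"],
        },
    },
    "list_contacts": {
        "description": "List contacts, optionally filtered by email.",
        "input_schema": {
            "type": "object",
            "properties": {
                "email": {"type": "string"},
                "limit": {"type": "integer"},
            },
            "required": [],
        },
    },
}


def tool_index() -> list[dict]:
    """Cheap listing sent up front: names and one-line descriptions only."""
    return [{"name": n, "description": t["description"]} for n, t in TOOLS.items()]


def load_tool(name: str) -> dict:
    """Full definition, fetched only when the agent decides to use the tool."""
    return {"name": name, **TOOLS[name]}


index_size = len(json.dumps(tool_index()))
full_size = len(json.dumps([load_tool(n) for n in TOOLS]))
print(f"index: {index_size} chars, full schemas: {full_size} chars")
```

The gap between the two sizes grows with the number and complexity of tools, which is where the savings would come from.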

JohnMakin an hour ago

This matches my experience building in-house MCP servers. The mechanism I prefer on load is something like a quick FTS5 lookup with BM25 ranking to find the tools the agent needs, and then serve only those. I think a lot of these things are implemented pretty naively. For instance, we ran into the huge context problem with Jira, so we just built our own Jira MCP interface that doesn't have all the bloat. If the agent finds it needs something it doesn't have, it can ask again.
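
For readers who have not used it, an FTS5 lookup with BM25 ranking can be sketched with Python's built-in `sqlite3` module. This is not the commenter's implementation: the tool names and descriptions are hypothetical, and it assumes the underlying SQLite build has FTS5 enabled.

```python
import sqlite3

# Hypothetical tool descriptions; a real server would index its own catalog.
TOOLS = [
    ("jira_create_issue", "Create a new Jira issue in a project"),
    ("jira_search_issues", "Search Jira issues with a JQL query"),
    ("jira_add_comment", "Add a comment to an existing Jira issue"),
    ("confluence_get_page", "Fetch a Confluence page by id"),
]

conn = sqlite3.connect(":memory:")
# Requires an SQLite build with the FTS5 extension compiled in.
conn.execute("CREATE VIRTUAL TABLE tools USING fts5(name, description)")
conn.executemany("INSERT INTO tools VALUES (?, ?)", TOOLS)


def find_tools(query: str, limit: int = 2) -> list[str]:
    """Return the best-matching tool names; bm25() is lower for better matches."""
    rows = conn.execute(
        "SELECT name FROM tools WHERE tools MATCH ? ORDER BY bm25(tools) LIMIT ?",
        (query, limit),
    )
    return [name for (name,) in rows]


print(find_tools("comment"))
```

Only the definitions for the returned names would then be served to the agent. Note that raw user text passed to `MATCH` is parsed as FTS5 query syntax, so a production version would need to quote or sanitize it.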

rob an hour ago

The real issue isn't MCP, it's these fucking bots posting here every day.

mritchie712 32 minutes ago

claude code solved this about a month ago

esafak an hour ago

This is becoming a solved problem with tool search; MCP is back.

robot-wrangler 29 minutes ago

> Limit integrations → agent can only talk to a few services

The idea that people see this as one horn of a trilemma instead of just good practice is a bit strange. Who would complain that every import isn't a star-import? Bring in what you need at first, then load new things dynamically with good semantics for cascade / drill-down. Let's maybe abandon simple classics like namespacing and the Unix philosophy for the kitchen-sink approach after the kitchen-sink thing is shown to work.

hparadiz 3 hours ago

10 years from now: "Can you believe they did anything with such a small context window?"

this_user 3 hours ago

More likely: "Can you believe they were actually trying to use LLMs for this?"

nipponese 2 hours ago

OSes and software engs did not end up using less RAM.

lionkor 3 hours ago

10 years from now: "The next big thing: HENG - Human Engineers! These make mistakes, but when they do, they can just learn from it and move on and never make it again! It's like magic! Almost as smart as GPT-63.3-Fast-Xtra-Ultra-Google23-v2-Mem-Quantum"

cheevly 3 hours ago

Imagine believing humans don’t make the same mistakes. You live in a different universe than me buddy.

mbreese 3 hours ago

10 years from now: “what’s a context window?”

sghiassy 3 hours ago

10 years from now: “come with me if you want to live”

Terminator 2 Clip: https://youtu.be/XTzTkRU6mRY?t=72&si=dmfLNDqpDZosSP4M

MattGaiser 3 hours ago

I am kind of already at that point. For all the complaining about context windows being stuffed with MCPs, I am curious what they are up to and how many MCPs they have that this is a problem.

smrtinsert 2 hours ago

"That was back when models were so slow and weighty they had to use cloud based versions. Now the same LLM power is available in my microwave"

berziunas 3 hours ago

“640K ought to be enough for anybody”

hparadiz 3 hours ago

I dunno why you're getting down voted. This is funny.

dend 2 hours ago

One of the MCP Core Maintainers here, so take this with a boulder of salt if you're skeptical of my biases.

The debate around "MCP vs. CLI" is somewhat pointless to me personally. Use whatever gets the job done. MCP is much more than just tool calling - it also happens to provide a set of consistent rails for an agent to follow. Besides, we as developers often forget that the things we build are also consumed by non-technical folks - I have no desire to teach my parents to install random CLIs to get things done instead of plugging a URI to a hosted MCP server with a well-defined impact radius. The entire security posture of "Install this CLI with access to everything on your box" terrifies me.

The context window argument is also an agent harness challenge more than anything else - modern MCP clients do smart tool search that obviates the entire "I am sending the full list of tools back and forth" mode of operation. At this point it's just a trope that is repeated from blog post to blog post. This blog post too alludes to this and talks about the need for infrastructure to make it work, but it just isn't the case. It's a pattern that's being adopted broadly as we speak.

o_____________o an hour ago

> modern MCP clients do smart tool search that obviates the entire "I am sending the full list of tools back and forth" mode of operation

How, "Dynamic Tool Discovery"? Has this been codified anywhere? I've only seen somewhat hacky implementations of this idea

https://github.com/modelcontextprotocol/modelcontextprotocol...

Or are you talking about the pressure being on the client/harnesses as in,

https://platform.claude.com/docs/en/agents-and-tools/tool-us...

dend 15 minutes ago

More of the latter than the former. The protocol itself is constrained to a set of well-defined primitives, but clients can do a bunch of pre-processing before invoking any of them.

kristjansson 3 hours ago

CLIs are great for some applications! But 'progressive disclosure' means more mistakes to be corrected and more round trips to the model - every time[1] you use the tool in a new thread. You're trading latency for lower cost/more free context. That might be great! But it might not be, and the opposite trade (more money/less context for lower latency) makes a lot of sense for some applications. esp. if the 'more money' part can be amortized over lots of users by keeping the tool definitions block cached.

[1]: one might say 'of course you can just add details about the CLI to the prompt' ... which reinvents MCP in an ad hoc underspecified non-portable mode in your prompt.

amzil 2 hours ago

This is a fair trade-off and the post should probably be more explicit about it. You're right that progressive disclosure trades latency for cost and context space. For some workloads that's the wrong trade.

The amortization point is interesting too. If you're running a support agent that calls the same 5 tools thousands of times a day, paying the schema cost once and caching it makes total sense. The post covers this in the "tightly scoped, high-frequency tools" section but your framing of it as a caching problem is cleaner.

On the footnote: guilty as charged, partially. The ~80 token prompt is a minimal bootstrap, not a full schema. It tells the agent how to discover, not what to call. But yeah, the moment you start expanding that prompt with specific flags and patterns, you're drifting toward a hand-rolled tool definition. The difference is where you stop. 80 tokens of "here's how to explore" is different from 10,000 tokens of "here's everything you might ever need." But the line between the two is blurrier than the post implies. Fair point.

nzoschke 2 hours ago

The industry is talking in circles here. All you need is "composability".

UNIX solved this with files and pipes for data, and processes for compute.

AI agents are solving this with sub-agents for data, and "code execution" for compute.

The UNIX approach is both technically correct and elegant, and what I strongly favor too.

The agent + MCP approach is getting there. But not every harness has sub-agents, or their invocation is non-deterministic, which is where "MCP context bloat" happens.

Source: building a small business agent at https://housecat.com/.

We do have APIs wrapped in MCP. But we only give the agent BASH, a CLI wrapper for the MCPs, and the ability to write code, and it works great.

"It's a UNIX system! I know this!"

Havoc an hour ago

Getting LLMs to reliably trigger CLI functions is quite hard in my experience though especially if it’s a custom tool

nicoritschel 3 hours ago

While I generally prefer CLI over MCP locally, this is bad outdated information.

The major harnesses like Claude Code + Codex have had tool search for months now.

injidup 3 hours ago

Can you explain how to take advantage of it? Is there any specific info from Anthropic with regard to context window size and not having to care about MCP?

amzil 3 hours ago

Fair point on tool search. Claude Code and Codex do have it.

But tool search is solving the symptom, not the cause. You still pay the per-tool token cost for every tool the search returns. And you've added a search step (with its own latency and token cost) before every tool call.

With a CLI, the agent runs `--help` and gets 50-200 tokens of exactly what it needs. No search index, no ranking, no middleware. The binary is the registry.

Tool search makes MCP workable. CLIs make the search unnecessary.
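
The `--help` discovery pattern can be illustrated with a small Python `argparse` sketch. The CLI name `acme`, its commands, and its flags are invented for illustration; per the thread the actual client is written in Go, so this only shows the general shape of two-step discovery.

```python
import argparse

# Hypothetical CLI named "acme"; commands and flags are invented for illustration.
parser = argparse.ArgumentParser(prog="acme", description="Example API client")
sub = parser.add_subparsers(dest="command", metavar="COMMAND")

invoices = sub.add_parser("invoices", help="Manage invoices")
invoices.add_argument("--customer", help="Filter by customer id")
invoices.add_argument("--limit", type=int, default=20, help="Maximum rows to return")

contacts = sub.add_parser("contacts", help="Manage contacts")
contacts.add_argument("--email", help="Filter by email address")

# Step 1: the agent reads the top-level help, which only names the commands.
top_level = parser.format_help()
# Step 2: it asks for detail on the one command it actually needs.
detail = invoices.format_help()

print(top_level)
print(detail)
```

The top-level help never mentions `--customer` or `--email`; those flags only enter the context when the agent asks about the specific command.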

bkummel 3 hours ago

There's already an open source tool that does exactly the same thing: https://github.com/knowsuchagency/mcp2cli

amzil 2 hours ago

Great tool, however we went to a dedicated CLI client (think gh, aws, stripe) in Go.

machinecontrol 3 hours ago

The trend is obviously towards larger and larger context windows. We moved from 200K to 1M tokens being standard just this year.

This might be a complete non issue in 6 months.

hrmtst93837 an hour ago

Those bigger windows come with lovely surcharges on compute, latency, and prompt complexity, so "just wait for more tokens" is a nice fantasy that melts the moment someone has to pay the bill. If your use case is tiny or your budget is infinite, fine, but for everyone else the "make the window bigger" crowd sounds like they're budgeting by credit card. Quality still falls off near the edge.

amzil 2 hours ago

Context windows getting bigger doesn't make the economics go away. Tokens still cost money. 50K tokens of schemas at 1M context is the same dollar cost as 50K tokens at 200K context, you just have more room left over.

The pattern with every resource expansion is the same: usage scales to fill it. Bigger windows mean more integrations connected, not leaner ones. Progressive disclosure is cheaper at any window size.
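
The arithmetic behind that claim can be sketched in a few lines of Python. The price below is an assumed flat rate chosen for illustration, not a quote from any provider, and the sketch ignores prompt caching and any long-context surcharges.

```python
# Assumed price, for illustration only; check your provider's current rates.
PRICE_PER_MILLION_INPUT_TOKENS = 3.00  # USD, hypothetical


def schema_cost(schema_tokens: int, requests: int) -> float:
    """Dollar cost of resending schema tokens on every request, without caching."""
    return schema_tokens * requests * PRICE_PER_MILLION_INPUT_TOKENS / 1_000_000


def window_share(schema_tokens: int, window: int) -> float:
    """Fraction of the context window the schemas occupy."""
    return schema_tokens / window


cost = schema_cost(50_000, requests=1_000)
print(f"cost for 1,000 requests: ${cost:.2f}")
print(f"share of 200K window: {window_share(50_000, 200_000):.0%}")
print(f"share of 1M window:   {window_share(50_000, 1_000_000):.0%}")
```

Under these assumptions the dollar figure does not depend on the window size at all; only the share of the window changes.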

magospietato 2 hours ago

Context caching deals with a lot of the cost argument here.

gertjandewilde 3 hours ago

We built a unified API with a large surface area and ran into a problem when building our MCP server: tool definitions alone burned 50,000+ tokens before the agent touched a single user message.

The fix that worked for us was giving agents a CLI instead. ~80 tokens in the system prompt, progressive discovery through --help, and permission enforcement baked into the binary rather than prompts.

The post covers the benchmarks (Scalekit's 75-run comparison showed 4-32x token overhead for MCP vs CLI), the architecture, and an honest section on where CLIs fall short (streaming, delegated auth, distribution).
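
As a rough idea of what "permission enforcement baked into the binary" could look like, here is a minimal Python sketch. The permission table, resource names, and functions are hypothetical; the thread does not show how the actual CLI implements its rules.

```python
# Hypothetical permission table; the real CLI's rules are not shown in the thread.
ALLOWED = {
    "invoices": {"list", "get"},            # read-only
    "contacts": {"list", "get", "create"},
}


def authorize(resource: str, action: str) -> None:
    """Raise before any network call if the action is not explicitly allowed."""
    if action not in ALLOWED.get(resource, set()):
        raise PermissionError(f"{action} on {resource} is not permitted")


def run(resource: str, action: str) -> str:
    authorize(resource, action)
    # A real client would call the API here; this sketch just echoes the request.
    return f"ok: {resource} {action}"


print(run("contacts", "create"))
```

Because the check runs in code before the request is made, a prompt cannot talk the agent past it.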

austinhutch 3 hours ago

> Not a protocol error, not a bad tool call. The connection never completed.

Very interesting topic, but this LLM structure is instant anathema. I just have to stop reading once I smell it.

enraged_camel 2 hours ago

With context windows starting to get much larger (see the recent 1M context size for Claude models), I think this will be a non-issue very soon.

m3kw9 2 hours ago

The thing with CLIs is that you also need to return results efficiently. If both MCP and CLI return results efficiently, CLI wins.

ekropotin 2 hours ago

Let me guess - another article about how CLIs are superior to MCP?

rirze 3 hours ago

At this point, I feel like MCP servers are just not feasible at the current level of context windows and LLMs. Good idea, but we're way too early.