The Webpage Has Instructions. The Agent Has Your Credentials (openguard.sh)
25 points by everlier 4 hours ago
redgridtactical 2 hours ago
This is the natural consequence of building everything around "the agent needs access to everything to be useful." The more capabilities you hand an agent, the larger the attack surface when it encounters a malicious page.
The simplest mitigation is also the least popular one: don't give the agent credentials in the first place. Scope it to read-only where possible, and treat every page it visits as untrusted input. But that limits what agents can do, which is why nobody wants to hear it.
rocho an hour ago
I absolutely agree, although even that doesn't solve the root problem. The underlying LLM architecture is fundamentally insecure as it doesn't separate between instructions and pure content to read/operate on.
I wonder if it'd be possible to train an LLM with such architecture: one input for the instructions/conversation and one "data-only" input. Training would ensure that the latter isn't interpreted as instructions, although I'm not knowledgeable enough to understand if that's even theoretically possible: even if the inputs are initially separate, they eventually mix in the neural network. However, I imagine that training could be done with massive amounts of prompt injections in the "data-only" input to penalize execution of those instructions.
stavros 2 hours ago
Why does the agent have your credentials? There's no need for that! I made one that doesn't:
indigodaddy an hour ago
So this is like a claw type thing? I’ve never used these “agents”. Not sure what I would do with them. Probably not for coding right?
amelius 28 minutes ago
You can do basically anything with a claw agent. For example, I asked one to build me a Dyson sphere. It is still working on it, but so far so good.
stavros an hour ago
Yeah, it's more of a personal assistant. It can do coding, but it's most useful as a PA.
indigodaddy 37 minutes ago
petesergeant 26 minutes ago
I am building https://agentblocks.ai for just this; you set fine-grained rules on what your agents are allowed to access and when they have to ask you out-of-channel (eg via WhatsApp or Slack) for permissions, with no direct agent access. It works today, well, supports more tools than are on the website, and if you have any need for this at all, I’d love to give you an account: [email protected]
Works great with OpenClaw, Claude Cowork, or anything, really