Schedule tasks on the web (code.claude.com)
259 points by iBelieve 15 hours ago
jFriedensreich 6 hours ago
We need to fight model providers trying to own memory, workflows and tooling. Don't give them an inch more of your software than needed even if there is a slight inconvenience setting up.
tossandthrow 28 minutes ago
I have tasks files in the code base that Claude executes on a schedule. I can easily move to other agents.
arrowleaf 4 hours ago
Why? As a user of these tools, I love the convenience factor of having one tool rather than wrangling dozens. It's why in the past I've used an IDE (JetBrains), a language created by the provider of the IDE (Kotlin), web framework created by the same people (ktor), etc.
jFriedensreich 3 hours ago
This is very different to a framework, language or IDE. It's more comparable to Apple or Amazon trying to create corporate anti-competitive hellscapes of enslaved users that have no agency, no dignity and no real choice, reduced to rent-extraction targets. Just with much more dire consequences and much more at stake. We still have the power to make AI providers have no moat and be an interchangeable commodity. But we have to fight to keep them from getting control of the other layers they are trying to grab. We are in a war; people who still use Claude Code or their other garbage tools, after Anthropic threatened and shut off opencode, are very naive and ignorant.
wyre 3 hours ago
Except you can write Kotlin and ktor outside of JetBrains' IDEs.
Anthropic wants a world where they own your agent, where it can't exist outside the Claude desktop app or Claude Code.
There could exist a world where your agent isn't confined by the whims of a corporation.
sharemywin 5 hours ago
I wish there was a company that was easy to use but wouldn't sell out in this arena.
solaceb 4 hours ago
Hi, I don’t normally promote here, but I feel compelled to ask if you’d like to test my thing. It’s a personal agent / API for creating and managing background cloud agents that I’m 100% committed to keeping open source and accessible, as an alternative platform to putting all your eggs in one basket. There’s also a desktop app, and I’m expanding the API to include storage. Kind of like an agentic Dropbox that can also do coding, has a full computer, and can spin up N agents.
Bnjoroge 3 hours ago
thats like looking for a unicorn.
andai 3 hours ago
>slight inconvenience
You misspelt ">95% discount relative to API pricing" ;)
gowthamgts12 14 hours ago
Interesting to see that feature launches come via the official website while usage restrictions come via a team member's Twitter account - https://x.com/trq212/status/2037254607001559305.
Also, someone rightly predicted this rugpull when they announced 2x usage - https://x.com/Pranit/status/2033043924294439147
stingraycharles 14 hours ago
To me it makes perfect sense for them to encourage people to do this, rather than eg making things more expensive for everyone.
The same as charging a different toll price on the road depending on the time of day.
trvz 4 hours ago
If you use the cloud providers you accept this and more.
If you want stability, own the means of inference and buy a Mac Studio or Strix Halo computer.
girvo 9 hours ago
Funnily, Anthropic's pricing etc. is why I'm using GLM-5 a bunch more outside of work. Definitely not Opus level, but surprisingly decent. Though I got lucky and got the Alibaba Coding Model lite plan, which is so cheap they got rid of it.
saratogacx 4 hours ago
I've been doing something similar. I use Claude for analysis and non-coding work, GLM for most coding tasks (GLM's coding plan), and when I need to do a larger implementation project I use GLM & Claude to build out an in-depth plan and toss it to GitHub Copilot to Opus the implementation.
I was trying to get the Alibaba plan but missed the mark. I'm curious to try out the Minimax coding plan ($10/mo) or Kimi ($20/mo) at some point to see how they stack up.
For pricing: GLM was $180 for a year of their pro tier during a Black Friday sale, and GHCP was $100/year, but they don't have the annual plan any more, so it is now $120. Alibaba's only coding plan today is $50/mo, too rich for me.
brianjking 7 hours ago
Does GLM-5 have multimodality or are they still wanting you to load an MCP for vision support?
tyre 12 hours ago
If you read the replies to the second, you’ll see an engineer on Claude Code at Anthropic saying that it is false.
Someone spread FUD on the internet, incorrectly, and now others are spreading it without verifying.
hobofan 10 hours ago
And if you look closely at the usernames, you see that the same engineer from link 2 who said "nah it’s just a bonus 2x, it’s not that deep" (just two weeks ago) is now saying "we're going to throttle you during peak hours" (as predicted).
Yes, it was FUD, but it ended up being correct. With the track record that Anthropic has (e.g. the months-long denial of dumbed-down models last year, only to later confirm it as a "bug"), this just continues to erode trust, and such predictions are the result of that.
nickandbro 15 hours ago
I feel like we are just inching closer and closer to a world where rapid iteration of software will be the default. For example: a trusted user gives feedback -> the feedback gets curated into a ticket by an AI agent, then turned into a PR by an agent, then reviewed by an agent, before being deployed by an agent. We are maybe one or two steps from the flywheel being completed. Or maybe we are already there.
jwpapi 8 hours ago
I just don’t see it coming. I was fully in that camp 3 months ago, but I've realized every step introduces more mistakes. It leads into a deadlock, when no human has the mental model anymore.
Don’t you guys have hard business problems that AI just can't solve, or solves only very slowly, presenting you 17 ideas until it finds the right one? I’m using the most expensive models.
I think the nature of AI might block that progress, and I think some companies have woken up and others will wake up later.
The mistake rate is just too high. And every system you implement to reduce that rate has a mistake rate as well, and increases complexity and the necessary exploration time.
I think a big bulk of people are where the early adopters were in December: AI can implement functional functionality on a well-maintained codebase.
But it can’t write maintainable code itself. It actually makes you slower, compared to assisted-writing the code, because assisted you are way more in the loop and you can stop a lot of small issues right away. And you fast-iterate everything.
I hadn't opened my IDE for a month, and it became hell at a point. I’ve now deleted 30k lines, and the amount of issues I’m seeing has been an eye-opening experience.
Unscalable performance issues, verbosity, straight-up bugs, escape hatches against my verification layers, quadrupled types.
Now I could monitor the AI output closer, but then again I’m faster writing it myself, because it’s one task. AI-assisted typing isn’t slower than my brain is.
Also, thinking more about it: FAANG pays ~$300 per line in production, so what are we really trying to achieve here? Speed was never the issue. A great coder writes 10 production lines per day.
Accuracy, architecture, etc. are the issue. You get those by building good, solid fundamental blocks that make feature additions easier over time, not slower.
onionisafruit 5 hours ago
I know it’s not your main point, but I’m curious where $300/line comes from. I don’t think I’ve ever seen a dollar amount attached to a line of production code before.
aspenmartin 7 hours ago
I think this is a true yet short-sighted take. Keep in mind these features are immature, but they exist to obtain a flywheel and corner the market. I don’t know why, but people seem to consistently miss two points and their implications:
- performance is continuing to increase incredibly quickly, even if you rightfully don’t trust any particular evaluation; see scaling laws like Chinchilla and the RL scaling laws (both training and test time)
- coding is a verifiable domain
The second one is most important. Agent quality is NOT limited by human code in the training set; this code is simply used for efficiency: it gets you to a good starting point for RL.
Claiming that things will not reach superhuman performance, INCLUDING on all end-to-end tasks (understanding a vague, poorly articulated business objective, architecting a system, building it out, testing it, maintaining it, fixing bugs, adding features, refactoring, etc.), is what carries the burden of proof, because we can literally predict performance (albeit performance has a complicated relationship with benchmarks and real-world results).
Yes, definitely, error rates are too high so far for this to be totally trusted end to end, but the error rates are improving consistently, and this is what explains the METR time-horizon benchmark.
chatmasta 14 hours ago
I love everything about this direction except for the insane inference costs. I don’t mind the training costs, since models are commoditized as soon as they’re released. Although I do worry that if inference costs drop, the companies training the models will have no incentive to publish their weights because inference revenue is where they recuperate the training cost.
Either way… we badly need more innovation in inference price per performance, on both the software and hardware side. It would be great if software innovation unlocked inference on commodity hardware. That’s unlikely to happen, but today’s bleeding edge hardware is tomorrow’s commodity hardware so maybe it will happen in some sense.
If Taalas can pull off burning models into hardware with a two-month lead time, that will be huge progress, but still wasteful, because then we’ve just shifted the problem to a hardware bottleneck. I expect we’ll see something akin to Game Boy cartridges that are cheap to produce and can plug into base models to augment specialization.
But I also wonder if anyone is pursuing some more insanely radical ideas, like reverting back to analog computing and leveraging voltage differentials in clever ways. It’s too big brain for me, but intuitively it feels like wasting entropy to reduce a voltage spike to 0 or 1.
efromvt 5 hours ago
Inference costs at least seem like the thing that is easiest to bring down, and there's plenty of demand to drive innovation. There's a lot less uncertainty here than with architectural/capability scaling. To your point, tomorrow's commodity hardware will solve this for the demands of today at some point in the future (though we'll probably have even more inference demand then).
throwaw12 11 hours ago
> I love everything about this direction except for the insane inference costs.
If this direction holds true, the ROI works out cheaper.
Instead of employing 4 people (customer support, PM, eng, marketing), you will have 3-5 agents, and the whole ticket flow might cost you ~$20.
But I hope we won't go this far, because when things fail, every customer will be impacted, and there will be no one who understands the system to fix it.
michaelmior 9 hours ago
I worry about the costs from an energy and environmental impact perspective. I love that AI tools make me more productive, but I don't like the side effects.
eksu 13 hours ago
This is the wrong way to see it. If a technology gets cheaper, people will use more and more of it. If inference costs drop, you can throw way more reasoning tokens at a problem, and combine many, many agents to increase accuracy or creativity and such.
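In its simplest form, the "many agents for accuracy" idea is self-consistency voting: sample several independent answers and keep the majority. A toy sketch (the agent call here is a deterministic stand-in, not any real API):

```python
from collections import Counter

def sample_agent(prompt: str, seed: int) -> str:
    # Stand-in for one independent agent/LLM call; deterministic here
    # purely so the sketch is runnable without a real model behind it.
    return "42" if seed % 3 else "41"

def majority_answer(prompt: str, n: int = 5) -> str:
    # Cheaper inference lets you buy accuracy with redundancy: take n
    # independent samples and keep the most common answer.
    votes = Counter(sample_agent(prompt, seed) for seed in range(n))
    return votes.most_common(1)[0][0]
```

The point is that the marginal cost of each extra sample is what decides whether n=5 or n=50 is affordable.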
mastermage 13 hours ago
I mean, theoretically, if there are many competitors the cost of the product should generally drop, because competition.
Sadly enough, I have not seen this happening in a long time.
Leptonmaniac 14 hours ago
I think that as a user I'm so far removed from the actual (human) creation of software that if I think about it, I don't really care either way. Take for example this article on Hacker News: I am reading it in a custom app someone programmed, which pulls articles hosted on Hacker News which themselves are on some server somewhere and everything gets transported across wires according to a specification. For me, this isn't some impressionist painting or heartbreaking poem - the entity that created those things is so far removed from me that it might be artificial already. And that's coming from a kid of the 90s with some knowledge in cyber security, so potentially I could look up the documentation and maybe even the source code for the things I mentioned; if I were interested.
slopinthebag 13 hours ago
Art is and has always been about the creator.
theredbeard 13 hours ago
We haven’t been inching closer to users writing a half-decent ticket in decades though.
fhub 10 hours ago
Solutions like https://bugherd.com/ might make the issue context capture part more accurate.
aembleton 12 hours ago
Maybe the agent can ask the user clarifying questions. Even better if it could do it at the point of submission.
heavyset_go 12 hours ago
Feedback loops like that would be an exercise in raising garbage-in->garbage-out to exponential terms.
It's the "robots will just build/repair themselves" trope but the robots are agents
TeMPOraL 11 hours ago
Yes. Next they'll want nanobots that build/repair themselves.
Oh wait. That's already here and is working fine.
jvuygbbkuurx 14 hours ago
Trusted user like Jia Tan.
mindwok 11 hours ago
I think Anthropic will launch backend hosting off the back of their Bun acquisition very soon. It makes sense to basically run your entire business out of Claude, and share bespoke apps built by Claude code for whatever your software needs are.
pxtail 9 hours ago
100% it's going to happen - and OpenAI will do the same; there were already rumors about them building an internal "GitHub", which is a stepping stone for that. It's also a requirement for completing the lock-in - the dream for these companies.
lancekey 8 hours ago
Ha, I just spec'd out a version of this. I have a simple static website that I want a few people to be able to update.
So, we will give these 3 or 4 trusted users access to an on-site chat interface to request updates.
Next, a dev environment is spun up, agent makes the changes, creates PR and sends branch preview link back to user.
Sort of an agent driven CMS for non-technical stakeholders.
Let’s see if it works.
EastLondonCoder 7 hours ago
I think some types of tickets can be done like this, but your trusted-user assumption does a lot of work here. I don't see this getting better than that with the current architecture of LLMs: you can add all sorts of feedback mechanisms, which help, but since LLMs are not conscious, drift is unavoidable unless there is a human in the loop who understands and steers what's going on.
But I do think even now with certain types of crud apps, things can be largely automated. And that's a fairly large part of our profession.
andy_ppp 10 hours ago
Users are often incorrect about what the software should actually be doing and don’t see the bigger picture.
backscratches 9 hours ago
In the past three weeks, a couple of projects I follow have implemented AI tools with their own GitHub accounts which have been doing exactly this. And they appear to be doing good work! Dozens of open issues iterated, tested and closed. At one point I had almost 50 notifications for one project's backlog being eradicated in 24 hours. The maintainer reviewed all of it, and some were not merged.
obastani 8 hours ago
I don't know if this is the future, but if it is, why bother building one version of the software for everyone? We can have agents build the website for each user exactly the way they want. That would be the most exciting possibility to come out of AI-generated software.
bwestergard 8 hours ago
"why bother building one version of the software for everyone?"
So one user's experience is relevant to another, so they can learn from one another?
slopinthebag 14 hours ago
What kind of software are people building where AI can just one shot tickets? Opus 4.6 and GPT 5.4 regularly fail when dealing with complicated issues for me.
girvo 9 hours ago
GPT 5.4 straight up just dies with broken API responses sometimes, let alone when it struggles with an even moderately complex task.
I still can't get a good mental model for when these things will work well and when they won't. Really does feel like gambling...
withinboredom 14 hours ago
Not just complicated ones, but even simple ones, if the current software uses too “new” a pattern that they’ve never seen before or trained on.
victorbjorklund 13 hours ago
Of course not all tickets are complex. Last week I had to fix a ticket which was to display the update date on a blog post next to the publish date. Perfect use case for AI to one shot.
thin_carapace 14 hours ago
I don't see anyone sane trusting AI to this degree any time soon, outside of web dev. The chances of this strategy failing are still well above acceptable margins for most software, and in safety-critical instances it will be decades before standards allow for such adoption. Anyway, we are paying pennies on the dollar for compute at the moment - as soon as the gravy train stops rolling, all this intelligence will be out of reach for most humans, unless some more efficient generalizable architecture is identified.
eerikkivistik 6 hours ago
I know a company already operating like this in the fintech space. I foresee a front page headline about their demise in their future.
tuo-lei 13 hours ago
The missing piece for me is post-hoc review.
A PR tells me what changed, but not how an AI coding session got there: which prompts changed direction, which files churned repeatedly, where context started bloating, what tools were used, and where the human intervened.
I ended up building a local replay/inspection tool for Claude Code / Cursor sessions mostly because I wanted something more reviewable than screenshots or raw logs.
dominotw 8 hours ago
I don't mean this as shade, but people who are not coders now seem to think "coding is now solved" and are pushing absurd ideas like shipping software with Slack messages. These people are often high up in the chain and have never done serious coding.
Stripe is apparently pushing a gazillion PRs now from Slack, but their feature velocity has not changed. So what gives?
How is it that the number of PRs is now the primary metric of productivity, and no one cares about what is being shipped or whether we are shipping product faster? It's total madness right now. Everyone has lost their collective minds.
rkomorn 8 hours ago
I ask myself the same question.
I'm not seeing the apps, SaaS, and other tools I use getting better, with either more features or fewer bugs.
Whatever is being shipped, as an end user, I'm just not seeing it.
edf13 13 hours ago
Or perhaps we end up where all software is self evolving via agents… adjusting dynamically to meet the users needs.
PeterStuer 12 hours ago
The "user" being the one that's in charge of the AI, not the person on the receiving end.
eru 13 hours ago
Instead of having a trusted user, you can also do statistics on many users.
(That's basically what A/B testing is about.)
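For what it's worth, the statistics here usually reduce to something like a two-proportion z-test on conversion counts; a back-of-envelope sketch using a normal approximation (function name and interface are made up for illustration):

```python
from math import sqrt, erf

def ab_pvalue(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for 'are these two conversion rates different?'"""
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / se
    # Normal CDF via erf; fine for the large samples A/B tests need anyway.
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
```

Identical rates give a p-value of 1; a large rate gap over big samples drives it toward 0.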
hyperionultra 13 hours ago
"Trusted user" also can be an Agent.
bredren 14 hours ago
What you're describing is absolutely where we're headed.
But the entire SWE apparatus can be handled.
Automated A/B testing of the feature. Progressive exposure deployment of changes, you name it.
shafyy 10 hours ago
Haha sure, let's just let every user add their feedback to the software.
tossandthrow 14 hours ago
I think the AI agent will directly make a PR - tickets are for humans with limited mental capacity.
At least in my company we are close to that flywheel.
_puk 14 hours ago
Tickets need to exist purely from a governance perspective.
Tickets may well not look like they do now, but some semblance of them will exist. I'm sure someone is building that right now.
No. It's not Jira.
Gigachad 14 hours ago
The agents have even more limited capacity
MattGaiser 14 hours ago
I am already there with a project/startup with a friend. He writes up an issue in GitHub and there is a job that automatically triggers Claude to take a crack at it and throw up a PR. He can see the change in an ephemeral environment. He hasn't merged one yet, but it will get there one day for smaller items.
I am already at the point where because it is just the two of us, the limiting factor is his own needs, not my ability to ship features.
m00x 13 hours ago
Must be nice working on simple stuff.
jondwillis 14 hours ago
Why doesn’t he merge them?
yieldcrv 14 hours ago
We do feedback-to-ticket automatically.
We don't have product managers or technical ticket writers of any sort.
But we devs still choose how to tackle each ticket; we don't really have to, as I'm solving the tickets with AI. I could automate my job away if I wanted, but I wouldn't trust the result, since I give a degree of input and steering, and there are bigger-picture considerations it's not good at juggling, for now.
charcircuit 14 hours ago
Then sets up telemetry and experiments with the change. Then if data looks good an agent ramps it up to more users or removes it.
overfeed 11 hours ago
> I feel like we are just inching closer and closer to a world where rapid iteration of software will be by default.
There's a lot of experimentation right now, but one thing that's guaranteed is that the data gatekeepers will slam the door shut[1] - or install a toll-booth when there's less money sloshing about and the winners and losers are clear. At some point in the future, Atlassian and GitHub may not grant Anthropic access to your tickets unless you're on the relevant tier with the appropriate "NIH AI" surcharge.
1. AI does not suspend or supplant good old capitalism and the cult of profit maximization.
eranation 14 hours ago
Um, we are already there...
kelvinjps10 5 hours ago
I feel like a lot of people and companies wanted to automate the web, but most websites' operators wouldn't let you and would block you. Now you put the name AI on it and you're allowed to do it.
simianwords 13 hours ago
I remember when I tried to set something up with the ChatGPT equivalent like "notify me only if there are traffic disruptions in my route every morning at 8am" and it would notify me every morning even if there was no disruption.
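The missing piece in setups like this is usually a gate between the schedule and the notification: the job can still fire every morning, as long as something machine-checkable decides whether to actually ping you. A hedged sketch, assuming you can ask the model to end its answer with an explicit verdict line (the DISRUPTION token is a made-up convention, not a ChatGPT feature):

```python
def should_notify(agent_answer: str) -> bool:
    # The cron job runs unconditionally, but the *notification* is gated
    # on an explicit verdict token instead of on free-form prose.
    lines = [l for l in agent_answer.strip().splitlines() if l.strip()]
    return bool(lines) and lines[-1].strip().upper() == "DISRUPTION: YES"
```

So an answer ending in "DISRUPTION: NO" stays silent, which is the behavior the original prompt asked for.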
theredbeard 13 hours ago
This is because for some reason all agentic systems think that slapping cron on it is enough, but that completely ignores decades of knowledge about prospective memory. Take a look at https://theredbeard.io/blog/the-missing-memory-type/ for a write-up on exactly that.
primer42 8 hours ago
“A programmer is going to the store and his wife tells him to buy a gallon of milk, and if there are eggs, buy a dozen. So the programmer goes shopping, does as she says, and returns home to show his wife what he bought. But she gets angry and asks, ‘Why’d you buy 13 gallons of milk?’ The programmer replies, ‘There were eggs!’”
You need to write a clearer prompt.
devsda 4 hours ago
"I need to fly to NY next weekend, make the necessary arrangement".
Your AI assistant orders an experimental jetpack from a random startup lab. Would you have honestly guessed that the prompt was "ambiguous" before you knew how the AI was going to act on it ?
jeremyjh 7 hours ago
Did GP edit their comment? Or did you read the prompt they used somewhere else?
alexhans 10 hours ago
Why not set up your own evals and use something like pi-mono for that? https://github.com/badlogic/pi-mono/
You'll define exactly what good looks like.
scottmcdot 13 hours ago
Me too. It doesn't have the ability to alert only on a true positive. It has to also alert on a true negative. So dumb.
worldsayshi 12 hours ago
This doesn't seem too hard to solve, except for the ever-recurring LLM output-validation problem: if the true positive is rare, you don't know if the earthquake alert system works until there's an earthquake.
monkeydust 12 hours ago
I do feel people will end up using this for things where a deterministic rule could be used - more effective, faster and cheaper. See this starting to happen at work...'We need AI to solve X....no you don't"
TeMPOraL 12 hours ago
Maybe. The problem of "execute task on a cron" is something I've noticed the industry seems to refuse to solve in general, as if intentionally denying this capability for regular people. Even without AI, it's the most basic block of automation, and is always mysteriously absent from programs and frameworks (at least at the basic level). AI only makes it more useful on "then" side, but reliable cron on "if" side is already useful.
PurpleRamen 7 hours ago
Most of the industry today is educated to avoid manual hacky solutions on single servers. You need to have a fancy UI, frameworks with easy feedback, and layers on top of layers that maintain other layers. Cron is an ancient tool with arcane syntax which offers barely anything out of the box; you have to know it and work at it to get something out of it.
And there is also the mindset of avoiding boring loops and preferring event-driven solutions for optimal resource usage. So people also have a kind of blind spot for this functionality.
9wzYQbTYsAIc 10 hours ago
I don’t recall if IFTTT had/has a basic cron or not, but it sure has/had put a lot of basic automations in the hands of the general public. Same for Apple Shortcuts, to some extent, or Zapier.
monkeydust 11 hours ago
Agree. How would you solve this in general - what would be the ingredients? People use things like Zapier, n8n and node-red to achieve this today, but in many cases they are overkill.
dspillett 11 hours ago
> See this starting to happen at work...'We need AI to solve X....no you don't"
Same. Sometimes it is just people overeager to play with new toys, but in our case there is a push from the top & outside too: we are in the process of being subsumed into a larger company (completion due on April the 1st, unless the whole thing is an elaborate joke!) and there is apparently a push from the investors there to use "AI" more in order to not "get left behind the competition".
monkeydust 11 hours ago
It's self-perpetuating. I was talking to the CEO of a Series A B2B SaaS company here in the UK recently. Most of the prospects his sales team are hitting are re-allocating their wallets to only look for products that use AI, on the back of senior management pushing them to do so.
This company already does some pretty cool stuff with statistics for forecasting, but now they are pivoting their roadmap to bake GenAI into their offering over some other features that would be more valuable to their clients.
alexhans 10 hours ago
I'd say that's almost fine if they can start expressing intent correctly and thinking about what good looks like. They (or some automated thing, if you're building "think for them" products instead of "give them tools and teach them how to use them" products) can then freeze determinism more and more where useful.
I wrote this to help people (not just devs) reason about agent skills:
https://alexhans.github.io/posts/series/evals/building-agent...
And this one to address the drift of non-determinism (but depending on the audience it might not resonate as much):
https://alexhans.github.io/posts/series/evals/error-compound...
beefsack 11 hours ago
I feel this would be more useful for tasks like "Check website X to see if there are any great deals today". Specifically, tasks that are loosely defined and require some form of intuition.
logicprog 10 hours ago
The problem I'd think, for the average user, would be writing the 'then' part of any deterministic rule — that would require coding, or at least some kind of automation script (visual or otherwise) that's basically coding in a trench coat, which for most people is still a barrier to entry and annoying. I think that's why they'd use AI tbh — they can just describe what they want in natural language with AI.
elcapitan 10 hours ago
AI will become this colleague who sucks at everything, but never says no, so he becomes the favorite go-to person.
comboy 12 hours ago
People are loading huge interpreted environments for stuff that can be done from the command line. Run computations on complex objects where it could be a single machine instruction etc. The trend has been around for a long time.
globular-toast 10 hours ago
Standard pendulum swing. Most people want to disengage their thinking circuits most of the time, so problems can't be evaluated one by one. There is no such thing as "this is a good solution for some problems". It can only be "this is a good solution for all problems". When the pendulum swings this far, this hard, it will swing all the way back eventually.
javiercr 12 hours ago
I've recently switched from GitHub Copilot Pro to Claude Code Max (20x). While Claude is clearly superior in many aspects, one area where it falls short is remote/cloud agents.
Yesterday, I spent the entire day trying to set up "Claude on the web" for an Elixir project and eventually had to give up. Their network firewall kept killing Hex/rebar3 dependency resolution, even after I selected "full" network access.
The environment setup for "on the web" is just a bash script. And when something goes wrong, you only see the tail of the log. There is currently no way to view the full log for the setup script. It's really a pain to debug.
The Copilot equivalent to "Claude on the web" is "GitHub Copilot Coding Agents," which leverages GitHub Actions infrastructure and conventions (YAML files with defined steps). Despite some of the known flaws of GitHub Actions, it felt significantly more robust.
"Schedule tasks on the web" is based on the same infrastructure and conventions as "Claude on the web", so I'm afraid I'm going to have the same troubles if I want to use this.
georaa 3 hours ago
Scheduling is easy. The hard part is everything between "started" and "done" - task needs human approval at step 3, fails at step 5 (retry from 4 or from scratch?), takes 6 hours and something restarts. How do they handle tasks that span multiple inference calls? Is there checkpointing or does it start over?
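The announcement doesn't say how Anthropic handles this, but the usual answer to the restart question is per-step checkpointing: record each completed step durably and skip it on rerun, so a failure at step 5 resumes after step 4 instead of from scratch. A minimal sketch with made-up names and a JSON state file as the assumed format:

```python
import json
from pathlib import Path

def run_steps(steps, state_file: Path) -> list[str]:
    """Run (name, fn) steps in order, skipping any already recorded as done."""
    done = json.loads(state_file.read_text()) if state_file.exists() else []
    for name, fn in steps:
        if name in done:
            continue  # completed before a crash/restart
        fn()
        done.append(name)
        state_file.write_text(json.dumps(done))  # checkpoint after each step
    return done
```

Human-approval gates fit the same shape: the approval step simply refuses to complete (and so never checkpoints) until someone signs off.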
iBelieve 14 hours ago
Looks like I'm limited to only 3 cloud scheduled tasks. And I'm on the Max 20x plan, too :(
"Your plan gets 3 daily cloud scheduled sessions. Disable or delete an existing schedule to continue."
But otherwise, this looks really cool. I've tried using local scheduled tasks in both Claude Code Desktop and the Codex desktop app, and very quickly got annoyed with permissions prompts, so it'll be nice to be able to run scheduled tasks in the cloud sandbox.
Here are the three tasks I'll be trying:
Every Monday morning: Run `pnpm audit` and research any security issues to see if they might affect our project. Run `pnpm outdated` and research into any packages with minor or major upgrades available. Also research if packages have been abandoned or haven't been updated in a long time, and see if there are new alternatives that are recommended instead. Put together a brief report highlighting your findings and recommendations.
Every weekday morning: Take a look at Sentry errors, logs, and metrics for the past few days. See if there are any new issues that have popped up, and investigate them. Also look at logs and metrics, see if anything seems out of the ordinary, and investigate as appropriate. Put together a report summarizing any findings.
Every weekday morning: Please look at the commits on the `develop` branch from the previous day, look carefully at each commit, and see if there are any newly introduced bugs, sloppy code, missed functionality, poor security, missing documentation, etc. If a commit references GitHub issues, look up the issue, and review the issue to see if the commit correctly implements the ticket (fully or partially). Also do a sweep through the codebase, looking for low-hanging fruit that might be good tasks to recommend delegating to an AI agent: obvious bugs, poor or incorrect documentation, TODO comments, messy code, small improvements, etc.
I ran all of these as one-off tasks just now, and they put together useful reports; it'll be nice getting these on a daily/weekly basis. Claude Code has a Sentry connector that works in their cloud/web environment. That's cool; it accurately identified an issue I've been working on this week.
I might eventually try having these tasks open issues or even automatically address issues and open PRs, but we'll start with just reports for now.
NuclearPM 14 hours ago
0 7 * * 1-5 ANTHROPIC_API_KEY=sk-... /path/to/claude-cron.sh /path/to/repo >> ~/claude-reports.md 2>&1
Seems trivial.
esperent 13 hours ago
A trivial way to rack up hundreds of dollars in API costs, sure.
But you can set up a claude -p call via a cronjob without too much hassle and that can use subscriptions.
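A minimal sketch of that approach. The `-p` (print/non-interactive) flag is a real Claude Code CLI option; the schedule, repo path, and prompt below are illustrative assumptions, and the script only prints the crontab line for review rather than installing it:

```shell
#!/bin/sh
# Build a crontab entry for a weekday-morning headless `claude -p` run.
# Assumes the claude CLI is installed and logged in via a subscription,
# so no API key is needed. Repo path, schedule, and prompt are made up.
PROMPT='Run pnpm audit and summarize any security findings as a brief report.'
REPO="$HOME/projects/myapp"
LOG="$HOME/claude-reports.md"
# cd into the repo first so the run picks up the project's own context.
ENTRY="0 7 * * 1-5 cd $REPO && claude -p \"$PROMPT\" >> $LOG 2>&1"
echo "$ENTRY"   # review the line, then add it via `crontab -e`
```

Because the run is non-interactive, anything that would normally raise a permission prompt needs to be pre-approved (or the task kept read-only), which is exactly the annoyance the cloud sandbox sidesteps.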
maccard 6 hours ago
Sure, now what happens if my laptop is asleep at 7am? Or if our scheduled build took an extra 30 minutes because of contention?
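For the laptop-asleep case specifically, a systemd user timer with `Persistent=true` will fire a missed run at the next opportunity instead of silently skipping it. A sketch, with illustrative unit name and schedule:

```ini
# ~/.config/systemd/user/claude-report.timer  (name is illustrative)
[Unit]
Description=Weekday-morning headless Claude report

[Timer]
OnCalendar=Mon..Fri 07:00
# If the machine was asleep or off at 07:00, run once when it next can.
Persistent=true

[Install]
WantedBy=timers.target
```

A matching `claude-report.service` would wrap the actual headless invocation; enable with `systemctl --user enable --now claude-report.timer`. It still doesn't help with the contention case, of course, since the timer knows nothing about your build queue.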
chopete3 13 hours ago
Claude is moving fast.
Grok has had this feature for some time now. I was wondering why others haven't done it yet.
This feature increases user stickiness. They give 10 concurrent tasks free.
I have it extract specific news first thing in the morning across multiple sources.
mkagenius 14 hours ago
This is a bit restrictive: it doesn't take screenshots, so you can't say "take screenshots of my homepage and send them to me via email".
It doesn't allow egress via curl, apart from a few hardcoded domains.
I have created Cronbox in the cloud which has a better utility than above. Did a "Show HN: Cronbox – Schedule AI Agents" a few days back.
and a pelican riding a bicycle job -
https://cronbox.sh/jobs/pelican-rides-a-bicycle?variant=term...
sarpdag 4 hours ago
I can't pick the effort level for tasks run on Claude Web. I have a feeling Claude is using low or medium effort on those tasks; I observe clear quality differences compared with the same tasks run in my local Claude Code, which uses high effort.
0898 6 hours ago
One interesting restriction is that it won’t do anything with people’s faces.
I run conferences and I like to have photos of delegates on the page so you can see who else is attending.
I wanted to automate this by having Claude go to the person’s LinkedIn profile and save the image to the website.
But it seems it won’t do that because it’s been instructed not to.
wslh 5 hours ago
LinkedIn already employs anti-scraping measures, so I'd expect a lot of users to get flagged.
That's not unique to LinkedIn but what is somewhat unique is the strong linkage to real world identities, which raises the cost of Sybil attacks on personal networks with high trust.
zmmmmm 14 hours ago
i'm missing something basic here .... what does it actually do? It executes a prompt against a git repository. Fine - but then what? Where does the output go? How does it actually persist whatever the outcome of this prompt is?
Is this assuming you give it git commit permission and it just does that? Or it acts through MCP tools you enable?
jngiam1 14 hours ago
MCP tools. We're doing some MCP bundling and giving it here, pretty cool stuff.
ares623 10 hours ago
wasn't MCP a critical link in the recent litellm attack?
TeMPOraL 9 hours ago
tossandthrow 14 hours ago
We use it to do automated sec audits weekly on the codebase and post the results on Slack.
zmmmmm 14 hours ago
so is slack posting an MCP tool it has? or a skill it just knows?
tossandthrow 14 hours ago
hirako2000 8 hours ago
Oh my, did Anthropic invent Cron jobs as a service?
It's a game changer.
Edit: my mistake. It's inferior to a cron job. If my repos happen to be self-hosted with Forgejo or Codeberg, then it won't even work. If I concede to use GitHub, though, I don't have to set up any env variables. Scheduler lock-in, all over the web.
TeMPOraL 7 hours ago
You jest, but for some reason the industry stubbornly refuses to solve the "cron job as a service" problem for end-users, whether on the web or in the OS.
I feel this is rooted in problems that extend beyond computing. Regular people are not allowed to automate things in their life. Consider that for most people, the only devices designed to allow unattended execution off a timer are a washing machine, some ovens and dishwashers, and an alarm clock (also VCRs in the previous era). Anything else requires manual actuation and staying in a synchronous loop.
hirako2000 7 hours ago
There is nothing to solve. It's already there, a VPS, a container platform, just push your script and schedule it.
Of course a provider can offer convenient shortcuts, but at the cost of getting tied into their ecosystem.
Anthropic is clearly battling an existential threat: what happens when our paying users figure out they can get a better and cheaper model elsewhere.
TeMPOraL 6 hours ago
talkin 7 hours ago
> for some reason the industry stubbornly refuses to solve the "cron job as a service" problem for end-users, whether on the web or in the OS.
Such a service will always be destroyed by the bell-ends who want to run spam or worse activities.
TeMPOraL 6 hours ago
WJW 7 hours ago
What is wrong with things like the Zapier scheduler? (ie https://zapier.com/apps/schedule/integrations) For running locally, there's also a plethora of cronlikes for every OS under the sun.
I think the core problem is not so much that it is not "allowed", but that even the most basic types of automation involve programming. I mean "programming" here in the abstract sense of "methodically breaking up a problem into smaller steps and control flows". Many people are not interested in learning to automate things, or are only interested until they learn that it will involve having to learn new things.
There is no secret conspiracy stopping people from learning to automate things, rather I think it's quite the opposite: many forces in society are trying to push people to automate more and more, but most are simply not interested in learning to do so. See for example the bazillion different "learn to code" programs.
TeMPOraL 6 hours ago
alasano 7 hours ago
I built this last year because I thought it was overdue back then already.
https://imgur.com/a/apero-TWHSKmJ
Cron triggers (or specific triggers per connector like new email in Gmail, new linear issue, etc for built in connectors).
Then you can just ask in natural language when (whatever trigger+condition) happens do x,y and z with any configuration of connectors.
It creates an agentic chain to handle the events. Parent orchestrator with limited tools invoking workers who had access to only their specific MCP servers.
Official connectors are just custom MCP servers and you could add your own MCP servers.
I definitely had the most advanced MCP client on the planet at that point, supporting every single feature of the protocol.
I think that's why I wasn't blown away by OpenClaw, I had been doing my own form of it for a while.
I need to release more stuff for people to play around with.
My friends had use cases like "I get too many emails from my kids school I can't stay on top of everything".
So the automation was just asking "when I get an email from my kids school, let me know if there's anything actionable for me in it"
throwatdem12311 7 hours ago
So this is basically just Anthropic's version of OpenClaw that they manage for you and you pay them for.
arjie 15 hours ago
What's the per-unit-time compute cost (independent of tokens)? Is there a compute deadline, etc.? Do they not currently charge for the Cloud Environment (https://code.claude.com/docs/en/claude-code-on-the-web#cloud...) while it's running?
lucgagan 14 hours ago
Here goes my project.
rhubarbtree 10 hours ago
Better idea. Watch online feedback on this feature. Then implement things users want. Go niche. Join the forum and help them use Claude to its limits. Then be the next step for power users.
hydroweaver87 13 hours ago
What were you working on?
pxtail 9 hours ago
Welcome to the Amazon playbook replayed again: the most useful, profitable, and popular use-cases will be implemented by the platform, and they will do it ruthlessly and quickly, as money needs to be recouped.
dbvn 6 hours ago
it would be easier to use claude to write a cronjob that does the same thing for you but accurately
qznc 6 hours ago
And yet it probably covers 90% of what people use OpenClaw for.
pastel8739 15 hours ago
Is this free? I don’t see pricing info. I guess just a way to make you forget that you’re spending money on tokens?
weird-eye-issue 14 hours ago
You don't spend money on tokens. It is a subscription.
mememememememo 9 hours ago
The PHP script from a cron tab is back!
j1000 9 hours ago
lmao
PeterStuer 12 hours ago
Is only Github supported as a repository?
jngiam1 14 hours ago
This is powerful. Combined with MCPs, you can pretty much automate a ton of work.
esperent 13 hours ago
Can you give some examples?
adobrawy 11 hours ago
That feature was silently launched for me about a week ago.
I use it to:
- perform a review of the latest code changes to update my documentation (security policies, user documentation, etc.)
- perform a review of the latest code changes, triage them, deduplicate, and improve the code; I review the results, close them with comments where they're over-engineered, or add a review for auto-fix
- perform a review of open GitHub issues with a given label, select the one with the highest impact, comment with the rationale, implement it, and open a pull request; I wake up to a few pull requests fixing issues that I can approve/finish in an existing Claude Code thread
I also want to use it to:
- review recent Sentry issues, file GitHub issues for the highest-priority ones, and open a pull request with a proposed fix; I can just wake up and see that some crash is ready to be resolved
The limit of 3 scheduled jobs is pretty restrictive, but playing with it has given me some nice ideas on how to reduce my manual work.