I Sent One Message. An Agent Shipped the Fix.
How I dispatch real work to AI agents that run without me, and the small version you can start with
A few nights ago I cleared a failing check on a production app without opening my laptop. I typed one sentence into a Discord channel from my phone, set it down, and went back to my evening. About twenty minutes later a reply came back: the check was green, the fix was pushed, here’s exactly what changed. I read it, looked at the pull request, and merged it.
I didn’t write that fix. An AI agent did, running on a Mac at my house that I hadn’t touched in hours.
I want to be careful with how that sounds, because “autonomous AI agents” is the phrase everyone’s drowning in right now. Half the feed says agents will run your whole company by Friday. The other half says they’re a demo that collapses the second real work touches them. Both takes are exhausting, and neither matches what I actually use.
So here’s the unglamorous version. What I built, what it really did that night, the part nobody shows, and the much smaller version you could start with this week.
What it actually is
Strip away the word “autonomous” and it’s simpler than it sounds. I have a few Claude Code agents that stay running on an always-on machine. They coordinate through Discord, the same chat app a lot of people already use. One channel is where I drop tasks. Other channels are where the agents post what they’re doing.
When a task lands, an agent wakes up, does the work in the right place, and reports back. Because the machine’s always on, I can send that task from anywhere. My phone in a grocery line. A laptop that has nothing to do with the work. It barely matters where I am, because the thing doing the work isn’t the thing in my hand.
That’s the whole idea. A reliable place to drop a well-scoped task, and something on the other end that picks it up and finishes it while I go live my life.
What happened that night
Here’s the actual sequence, because the specifics matter more than the pitch.
I had a pull request open on one of my apps. The code was reviewed and clean, but one check stayed red: a security scan flagging known vulnerabilities in some dependencies. So I typed one line into my dashboard channel: make sure this pull request is fixed and the checks pass.
I did nothing else. A watcher noticed the message, started up the coordinating agent, and handed it the task. That agent figured out which project it belonged to, started the project’s own agent, and passed it along. The project agent dug in. It traced the failing scan to a batch of advisories published two days earlier, against libraries already in the project. My code was innocent. It bumped the flagged libraries, ran the entire test suite to make sure nothing broke, pushed the update, and posted a clear summary back to me. Then it waited for the checks to finish and confirmed every one of them went green.
The only things I did were type one sentence and, later, click the merge button. The diagnosis, the fix, the testing, the write-up: none of it was mine.
Why this is leverage, not magic
Here’s why it matters to me: the machine stays on when I don’t.
For most of my career, progress required me to be at a keyboard. If I stepped away, the work stopped. This flips a small but real piece of that. A bounded task can move forward while I’m at dinner, asleep, or three errands deep into a Saturday. I’m off the leash that ties getting something done to sitting at my desk.
I also want to be honest about the ceiling here, because the hype skips it. This is a long way from the AI running my business. It fixed a dependency advisory, the kind of narrow, checkable job I’d hand to a junior engineer and verify after. The judgment stayed mine. I read what came back. I decided to merge. An agent that can show me precisely what it did, with tests proving it, is one I’ll trust with a small task. The moment a task is fuzzy or the blast radius is large, I’m back in the loop, and that’s correct.
The part nobody shows
Now the unsexy truth, because building in public means showing the middle.
This system sat dead for two months. I’d set most of it up earlier and it never actually worked. The coordination table I built for the agents had twenty-five old entries, every one of them stuck or cancelled, none ever finished. It looked like infrastructure. It functioned like a brick.
Making it real took a long string of small, frustrating fixes. The agents stalled on every single action because they kept stopping to ask permission, so I had to give them a careful, scoped set of things they could do on their own. One stale setting silently blocked them from saving any work at all, and it took a while to even find it. The agents couldn’t hand tasks to each other, because they all share one chat identity and the system quietly drops a bot talking to itself, so I had to route the handoffs a different way. And my favorite: for an hour the whole thing was writing to the wrong database entirely, because one environment pointed somewhere I didn’t expect.
The clean, hands-off run I described at the top only happened after all of that. The autonomy is real, and it was earned the slow, annoying way. Anyone selling you the twenty-minute version is selling you the demo, not the Tuesday.
How the pieces fit, plainly
You don’t need to know any of this to get the point, but here’s the map in plain words.
There’s a tool that keeps programs alive on a machine after you disconnect from it. There’s the chat app, which is just my remote control: where I send tasks and read status from any device. There’s a coordinating agent whose only job is to receive a task and hand it to the right specialist. There are the specialist agents, each one living in its own project. And there’s a small watcher that boots the whole thing the instant a task arrives, so nothing has to burn power until there’s real work to do.
That last piece is the one I’m proudest of, because it means the system is asleep by default and only wakes for a real request.
You don’t need the whole fleet
Here’s where I talk you out of copying me exactly.
I run this because I ship across a lot of projects and I’m away from my desk constantly. That’s a specific situation. You almost certainly don’t need a dozen agents and your own chat server. If you build the full thing because it sounds cool, you’ll spend your weekend debugging plumbing instead of getting the one benefit that actually matters.
The benefit worth stealing is small. It’s the habit of handing a bounded, well-defined task to an agent and trusting it to run without you watching every keystroke. Everything else, the dispatch-from-your-phone part, the coordination, the always-on machine, is convenience you bolt on later, once that core habit is reliable.
The small version you can start with
You can get most of the value today with nothing I have. Open Claude Code, or whatever AI coding tool you use, and give it one task with a clear finish line. Make it concrete: update these dependencies and make sure the tests still pass, then tell me what changed. Then walk away for ten minutes and let it run.
When you come back, actually read what it did, the way you’d read a junior teammate’s work. That single loop, a bounded task plus a clear way to check it plus the discipline to actually check, is the entire foundation everything else sits on. The remote control and the agents are plumbing that sits on top of that habit.
If you do want the always-on, send-it-from-your-phone version built around your real work, that’s the kind of system I help people set up. But earn the habit first. The plumbing is worthless without it.
Your ten-minute start
Pick one task you’d normally babysit. A dependency bump. A test that needs writing. A small refactor you keep putting off.
Write two sentences before you hand it over. The first says what done looks like. The second says how you’ll know it worked: the test or the check or the output you’ll look at. That second sentence is the one most people skip, and it’s the one that makes an agent trustworthy instead of hopeful.
Then give it the task and leave for ten minutes. Come back and read the result like it’s someone else’s work, because it is.
Do that a few times and you’ll feel the shift. Mostly it feels like getting a piece of your attention back. That’s the whole game. Start with one task you can walk away from. The rest is plumbing.



