How long before someone builds an ssh/rsh-type shell experience on top and pitches it as a "natural language shell" for dummies?
This seems like a recipe for bad code I dunno. How would someone using this app test as they go?
Agentic coding sorta works for me because you can stop and test each iteration and pinpoint where something has gone wrong.
Example: I ask for a tweak, it gives me 20 lines, I test for the intended behavior and keep working on those 20 lines until I'm happy with the reliability/effects of it.
But that loop itself requires an environment in which the final product will be running, and it takes up most of the time in my experience.
What model are you using? I can use Claude Code's remote control on Opus 4.8, have it implement a meaty feature, talk through edge cases, ask it to posit potential interactions we want to be careful of, have it write a plan and expand tests for surrounding areas, ask it to implement, ship that increment to TestFlight, then experiment with it on my phone - without being at home. My only limitation is that I like to clear context between ships, and that's tough to do using remote control.
I guess if you're shipping to TestFlight you still have the feedback loop, but it seems like it's adding a lot of steps. Whereas if you were working on an app in a desktop environment, you could just build and boot to the simulator right there.
I could see the use case for the planning stage of the build though, if you're out and about and have a good idea.
If you could actually build and test locally I think I would have different thoughts. For instance, if I'm building a utility script for the CLI and can run it locally, it makes sense not to whip out the desktop in some scenarios.
And you never look at the code?
I understand the feasibility of this and sometimes in my lazier moments I skim the code changes and trust automated/manual testing to validate changes, but to just like... you don't even see what it did?
Of course I see it on device after it installs. But that's what I'm used to: I've spent more than a decade as a product manager, I don't read all the code my engineers write! My job is to dig into how well it actually works, put guardrails in place to make sure it has to work right, monitor the outcomes! And I make sure Claude gets the feedback from my monitoring and use and the patterns of bugs we find, updates its approach, and documents the most important issues to check against in future testing.
I honestly think good technical product managers have a huge leg up on engineers in this world.
Maybe not for maintainable production code but I've definitely built apps/games for friends and this kind of thing would come in handy. I leave for work, I think of something I want to do, and this way I could just text the agent to do it.
There's also a lot to be said about planning modes which don't write anything, but rather just generate text files to be implemented later when I can watch over the repo more closely.
I just want to know how the avg. dev is using these things. I feel like it's a completely different world, and all the noise is from Luddites, or Spotify pushing 45000 deployments to prod per day.
It's so far from the days of "you should try Git because it's distributed", or IntelliJ because it has great intellisense, or VS Code cuz it's fast - where the value proposition was obvious and understandable.
As someone who has consciously worked towards being more present in the moment and regularly pushes back on client expectations of always being available, tools like this remind me that a nightmare hellscape of work is more than possible.
The only use case I see for this is for hotfixing while away from your computer. Other than that it seems irrational for me to use it.
I hate typing and reading long text on my phone.
Not for me, thanks.
But since the loopmaxxers are neither prompting nor reading anymore, it kinda makes sense.
I love and hate this feature. I use it all the time with Claude, and find it super useful. At the same time I wish it didn't exist, so that I couldn't keep working away from my computer. Kinda like the days before mobile phones, when work mostly ended once you left the office.
I feel you. For me, AI models are like the new TikTok. They're productive, but also toxic to my mental health.
I love guiding my coding agents on the go, can't wait to guide my agents on the go with this app.