Decomposition is definitely needed as tasks become more complicated. I’d prefer to define the desired state, get a decomposed breakdown of the gap between the current and desired states, and let agents figure out how to close it themselves, rather than manually operating each intermediate task. As a project owner, I’d love to work at the desired-state level.
I'd love to see a comparison with other spec-driven development tools for Claude, like OpenSpec and Superpowers. How does this compare and contrast with them?
I think those tools would be good as well. The point of sddw for me was to be able to adjust SDD to the typical size of your projects. GSD was great, but probably for gigantic projects only. For mid-size ones, it's token overkill.
I've been using an agent flywheel workflow, which is similar. Still not completely sold: it feels a bit like using power tools to shape wood, but the final product needs a lot of sanding and polishing.
I initially thought this meant the spec wasn't detailed enough, but the problem is more about agent adherence and laziness.
Agentic coding works especially well for me when the application is platform-like: you have a core and you extend it with standardized plugins. Once a few plugins are already there, it's hard to tell whether the next plugin was written by an agent or by a human.
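To make the "core plus standardized plugins" shape concrete, here is a minimal sketch. Every name in it (`PLUGINS`, `register`, `run`, the two example plugins) is hypothetical and not taken from sddw or any real project; it only illustrates why a fixed plugin contract gives an agent a pattern to copy.

```python
from typing import Callable, Dict

# Hypothetical core: a registry mapping plugin names to handlers.
# Every plugin has the same signature: str -> str.
PLUGINS: Dict[str, Callable[[str], str]] = {}


def register(name: str):
    """Decorator that adds a handler to the core's plugin registry."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        if name in PLUGINS:
            raise ValueError(f"plugin {name!r} already registered")
        PLUGINS[name] = fn
        return fn
    return wrap


# Two existing plugins. A new one, human- or agent-written,
# only has to follow the same shape.
@register("upper")
def upper(text: str) -> str:
    return text.upper()


@register("reverse")
def reverse(text: str) -> str:
    return text[::-1]


def run(name: str, text: str) -> str:
    """Core entry point: look up a plugin by name and apply it."""
    return PLUGINS[name](text)
```

With the contract this narrow, the existing plugins act as the spec for the next one, which is probably why agent output blends in.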
Also, sddw works nicely with a fleet of agents: https://news.ycombinator.com/item?id=48226033. I just insert the sequence of sddw steps into the queue and take a nap.
Exactly. A detailed-enough spec is just code that you can’t run. If models and agents got to a point where doing a good job in Claude Code plan mode meant that I didn’t have to keep an eye on them in implementation, then I would be interested in some bigger spec-driven thing like this. That is still far from the case today for me.
Are there any benchmarks/evals showing that this particular one does any good compared to, let's say, plan mode? How do you measure that it actually works and that you aren't wasting tokens and your personal time?
I fail to see any backing for claims 'boosting performance' and 'keeping costs low'
fair
Here are slides explaining it in more detail: https://docs.google.com/presentation/d/1SjKXF7hkoqyiN9-3tBGY...
When plan + code mode works, there's no need to change it. When it doesn't, because the feature is complicated, then we need something else. That's when SDD is applicable. I use it for mid-size and larger projects only.
Measuring is a bit of a subjective thing here. But when plan mode + code does not work and SDD works (because of double decomposition), you get what you need.
Token consumption is lower because you can wipe your context after every step or subtask is implemented. The amount of spec work to deliver is bigger, however. But confusion is way lower, as your context is focused on a single step or subtask.
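A rough sketch of the "wipe context after every step" idea, under stated assumptions: `Step`, `run_steps`, and the `run_agent` callable are hypothetical stand-ins, not sddw's actual interface. Each step gets a fresh context built only from its own spec plus one-line notes from earlier steps, instead of the full history.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    spec: str


def run_steps(steps: List[Step], run_agent: Callable[[str], str]) -> List[str]:
    """Run each step with a fresh context.

    The context holds only short result notes from earlier steps and the
    current step's spec. `run_agent` is a hypothetical call into whatever
    agent you use; it takes the context text and returns a short result.
    """
    notes: List[str] = []
    for step in steps:
        # Nothing from earlier steps is carried over except the notes.
        context = "\n".join(notes + [f"TASK {step.name}: {step.spec}"])
        result = run_agent(context)
        notes.append(f"DONE {step.name}: {result}")
    return notes
```

The trade-off matches the comment above: more up-front spec writing per step, but each agent call sees a small, focused context.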