Can we have more complex examples in the examples section, like actual gameplay automation rather than just basic UI navigation? This would help readers understand the capabilities of the tool/framework better. Also, I would like to know how the results are displayed to end users once the automation test suite has finished executing.
Thanks for your questions! The mobile game demo (https://jspinak.github.io/brobot/docs/tutorials/tutorial-bas...) shows game automation and automated image collection and labeling to build a dataset for model training.
Here's the Qontinui Runner's action log during live automation: https://i.imgur.com/8R4d2Uf.png. Note the GO_TO_STATE action – that’s unique to model-based GUI automation. Instead of writing explicit navigation steps, you tell the framework "go to this state" and it handles pathfinding automatically.
You can see some actions failed (red X) - like "select to process corn". Traditional scripts would crash here. The model-based approach handles this differently: the next GO_TO_STATE call finds paths from wherever the GUI actually is (the current active states) to the desired state. So even when individual actions fail, the automation self-corrects on the next navigation.
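To make that concrete, here's a minimal standalone Python sketch of the idea (simplified for illustration; StateGraph and go_to_state here are toy stand-ins, not the actual qontinui API):

    from collections import deque

    class StateGraph:
        """Toy GUI model: states are nodes, transitions are directed edges."""
        def __init__(self, transitions):
            self.transitions = transitions  # dict: state -> list of reachable states

        def find_path(self, start_states, target):
            """BFS from any currently active state to the target state."""
            queue = deque((s, [s]) for s in start_states)
            visited = set(start_states)
            while queue:
                state, path = queue.popleft()
                if state == target:
                    return path
                for nxt in self.transitions.get(state, []):
                    if nxt not in visited:
                        visited.add(nxt)
                        queue.append((nxt, path + [nxt]))
            return None

    def go_to_state(graph, active_states, target, execute_transition):
        """Replan from wherever the GUI actually is, so an earlier failed
        action doesn't derail the run -- navigation always starts fresh."""
        path = graph.find_path(active_states, target)
        if path is None:
            raise RuntimeError(f"No path to {target!r} from {active_states}")
        for state in path[1:]:
            execute_transition(state)  # click/type/etc.; updates active states
        return path

The important bit is that go_to_state takes the currently active states as input, so a failed "select to process corn" step just means the next navigation starts from a different node in the graph.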
Important clarification: This isn't test automation (using bots to test applications). The breakthrough is making the AUTOMATION ITSELF testable, enabling standard software engineering practices in a domain where they were previously infeasible. You can write integration tests that verify your bot works correctly before running it live. Section 11 of the paper covers this (Appendix 3 has an example from Brobot; qontinui.io provides visual test output).
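To sketch what "testing the automation itself" can look like (building on the toy StateGraph above; the model and test names are invented for illustration, not copied from Brobot or qontinui), you can assert properties of the state model offline, before anything touches a live GUI:

    # pytest-style checks against the model only -- no screen, no clicks.
    GAME_MODEL = StateGraph({
        "main_menu":        ["farm_view"],
        "farm_view":        ["main_menu", "processing_plant", "market"],
        "processing_plant": ["farm_view"],
        "market":           ["farm_view"],
    })

    def test_every_state_reachable_from_main_menu():
        for target in GAME_MODEL.transitions:
            assert GAME_MODEL.find_path(["main_menu"], target) is not None

    def test_no_dead_ends():
        # every state should have a route back toward the main menu
        for state in GAME_MODEL.transitions:
            assert GAME_MODEL.find_path([state], "main_menu") is not None

Because the map is data, path existence and coverage become things you can check in CI rather than discover at runtime.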
The approach works for any GUI automation: gaming, visual APIs for RL agents, data collection, business automation, and yes, also software testing. I started with games (Brobot, 2018) because brittleness was most painful there.
Does that help clarify?
Hi HN, author here.
I started building Brobot in 2018 to automate gameplay - I wanted to understand why my automation kept breaking. The more I dug in, the more I realized this was a fundamental problem in GUI automation itself.
Two problems kept surfacing:
1. Script fragility - automation breaks constantly from minor GUI changes
2. Inability to test - no way to verify automation works before deploying
Research in GUI testing shows that the vast majority of test failures come from UI changes, not actual bugs. Yet you can't write integration tests for traditional GUI automation. You just run it and hope.
The root cause: traditional automation uses sequential scripts (do A, then B, then C). Making this robust requires exponential code growth - a 30-state automation has 6.36 trillion possible paths. You can't test all paths, can't guarantee it works.
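To give a feel for that gap (a back-of-the-envelope sketch only; it assumes a fully connected state graph and is not the exact model or figure from the paper): the number of distinct routes a run could take explodes combinatorially with the number of states, while the number of transitions a model-based map has to describe grows only quadratically.

    from math import factorial

    def simple_paths_between_two_screens(n):
        # simple paths between two fixed screens if every screen can reach
        # every other one (complete graph on n states) -- worst-case intuition
        return sum(factorial(n - 2) // factorial(n - 2 - k) for k in range(n - 1))

    def transitions_to_model(n):
        # upper bound on edges a model has to describe: one per ordered pair
        return n * (n - 1)

    for n in (5, 10, 15, 20):
        print(n, simple_paths_between_two_screens(n), transitions_to_model(n))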
Model-based GUI automation solves both problems by borrowing from robotics navigation. Instead of writing step-by-step scripts, you create a navigable map of the GUI. The framework handles pathfinding, state management, and error recovery automatically.
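Roughly, the shift in how you express things looks like this (illustrative only; the method names are stand-ins, not the literal qontinui API):

    # Sequential script: every step is spelled out, in order, by hand.
    def harvest_corn_scripted(ui):
        ui.click("menu_button")
        ui.click("farm_tab")      # breaks if a popup covers the tab
        ui.click("corn_field")
        ui.click("harvest")

    # Model-based: states and transitions are declared once as a map;
    # the caller only states goals and the framework plans the steps.
    def harvest_corn_model_based(bot):
        bot.go_to_state("corn_field")   # pathfinding chooses the clicks
        bot.do("harvest")
        bot.go_to_state("farm_view")    # recovery is just more navigation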
Key results:
• Reduces complexity from exponential to polynomial (mathematically proven)
• Makes GUI automation testable for the first time (integration tests, path verification)
• Enables reliable visual APIs for RL agents
• Supports robust dataset generation for model training
• Works for games, business apps, web interfaces - any GUI
Over 7 years, I developed and formalized this approach through both mathematical theory and real-world validation. Springer SoSyM published it in late October.
Open-source implementation: https://github.com/qontinui
• qontinui (Python) - Core automation library (pip install qontinui)
• multistate (Python) - State machine (pip install multistate)
• qontinui-runner (Rust/TypeScript) - Desktop execution engine
• qontinui-api (Python/FastAPI) - REST API bridge (pip install qontinui-api)
Interactive docs & playground: https://qontinui.github.io/multistate/
Original Java version (Brobot, 2018-2025): https://github.com/jspinak/brobot
I'm also building a visual builder (qontinui-web, Feb 2026 launch) for no-code automation - point-and-click designer that creates JSON configs the runner executes locally. Available now in early access (breaking changes possible before launch, but migration tools provided for format changes).
The research provides the mathematical foundation, the Python stack lets you use it today (code-based or visual). Wanted to contribute something useful to the AI/RL community.
Demos:
• Mobile game image collection/labeling: https://jspinak.github.io/brobot/docs/tutorials/tutorial-bas...
• More examples: https://jspinak.github.io/brobot/
Paper: https://link.springer.com/article/10.1007/s10270-025-01319-9
Story behind the name: https://jspinak.github.io/brobot/docs/theoretical-foundation...