Computer-Use Testing for Android Apps and Browser Extensions
Test Android apps and browser extensions with a computer-use agent that drives the real UI like a user and checks the result. Open source and MIT licensed: no vendor lock-in, no agent infrastructure in the middle, and it runs entirely on your own CI.
Two backends. One CLI.
agentprobe ships an Android backend driven by adb and a browser backend driven by Chrome CDP. Both feed screenshots to a vision model and translate its decisions into real UI actions.
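For the Android backend, the screenshot-to-action loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not agentprobe's actual code: the action dictionary shape (`type`, `x`, `y`, `text`, `keycode`), the `decide` callback, and the helper names are all hypothetical. Only the adb commands themselves (`exec-out screencap -p`, `shell input tap/text/swipe/keyevent`) are standard Android tooling.

```python
import base64
import subprocess


def capture_screenshot_b64() -> str:
    """Grab a PNG from the connected device and return it base64-encoded."""
    png = subprocess.run(
        ["adb", "exec-out", "screencap", "-p"],
        check=True,
        capture_output=True,
    ).stdout
    return base64.b64encode(png).decode("ascii")


def action_to_adb_args(action: dict) -> list[str]:
    """Map a model decision (hypothetical dict shape) to an adb input command."""
    kind = action["type"]
    if kind == "tap":
        return ["adb", "shell", "input", "tap", str(action["x"]), str(action["y"])]
    if kind == "type":
        # `input text` expects spaces to be written as %s.
        return ["adb", "shell", "input", "text", action["text"].replace(" ", "%s")]
    if kind == "swipe":
        coords = [str(action[k]) for k in ("x1", "y1", "x2", "y2")]
        return ["adb", "shell", "input", "swipe", *coords]
    if kind == "key":
        return ["adb", "shell", "input", "keyevent", action["keycode"]]
    raise ValueError(f"unsupported action type: {kind!r}")


def run_loop(decide, max_steps: int = 30) -> None:
    """Screenshot, ask the model, act; stop when decide() returns None.

    `decide` is a hypothetical callable wrapping the vision model call:
    it takes a base64 PNG and returns an action dict or None.
    """
    for _ in range(max_steps):
        action = decide(capture_screenshot_b64())
        if action is None:
            break
        subprocess.run(action_to_adb_args(action), check=True)
```

Keeping the action-to-command mapping as a pure function means it can be unit-tested without a device attached; only `capture_screenshot_b64` and `run_loop` need a running emulator.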
Android backend:
adb screencap → base64 PNG
↓
vision model (Azure / OpenAI / Gemini / xAI)
↓
adb shell input (tap / type / swipe / key)
↓
ui_dump assertions + NL verification

Browser backend:
Chrome CDP → scrot screenshot
↓
vision model
↓
xdotool click / type
↓
CDP assertion + NL verification

Features
Everything you need to run real-device UI tests in CI.
Drives the real UI
No mocks, no test doubles. Interacts like a human user via adb input or xdotool. Catches regressions test doubles miss.
BYOK, no vendor lock-in
Bring your own API key for Azure, OpenAI, Gemini, or xAI. MIT licensed, self-hosted on your CI, no phoning home.
GIF artifacts + CI reports
Every run records a screen GIF via ffmpeg and emits JUnit XML. Link the GIF in a PR comment so reviewers can watch the run.
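As a rough idea of what a JUnit XML report for a run could contain, here is a minimal sketch using only the Python standard library. The result dictionary keys (`name`, `passed`, `seconds`, `reason`) and the `junit_xml` helper are assumptions made for this example; the exact attributes agentprobe writes may differ.

```python
import xml.etree.ElementTree as ET


def junit_xml(suite: str, results: list[dict]) -> str:
    """Serialize run results (hypothetical dict shape) as a JUnit <testsuite>."""
    failures = sum(1 for r in results if not r["passed"])
    root = ET.Element(
        "testsuite",
        name=suite,
        tests=str(len(results)),
        failures=str(failures),
    )
    for r in results:
        case = ET.SubElement(
            root, "testcase", name=r["name"], time=f"{r['seconds']:.1f}"
        )
        if not r["passed"]:
            # CI dashboards typically surface the message attribute.
            failure = ET.SubElement(case, "failure", message=r["reason"])
            failure.text = r["reason"]
    return ET.tostring(root, encoding="unicode")
```

Most CI systems that understand JUnit XML only need the `tests` and `failures` counts plus one `testcase` element per case, so a small report like this is usually enough to get pass/fail annotations on a pull request.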
GitHub Actions integration
Android tests run inside android-emulator-runner (API 28). Browser tests run in a containerized Chrome instance. Both targets emit JUnit XML and a screen GIF.
# .github/workflows/android-cua.yml
- uses: reactivecircus/android-emulator-runner@v2
  with:
    api-level: 28
    script: agentprobe run --target android --case cases/onboarding.yaml
# For browser extensions:
- run: |
    agentprobe run --target browser \
      --extension ./my-extension.crx \
      --case cases/sidepanel.yaml

Minimal YAML: no code required
Test cases are plain YAML files. Describe the goal and success criteria in natural language; the vision model handles the rest.
name: onboarding-flow
instruction: "Open the app and complete onboarding until you see the dashboard"
successCriteria: "Dashboard screen is visible"
failureCriteria: "App crashes or shows error dialog"
maxSteps: 30
verification:
  prompt: "Is the dashboard visible and functional?"
instruction: natural-language goal for the agent
successCriteria: natural-language condition checked after completion
failureCriteria: abort condition detected mid-run
maxSteps: hard cap on agent action steps
verification.prompt: final natural-language assertion sent to the vision model
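To make the field list concrete, the sketch below validates a test case after the YAML has been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`). The `CaseSpec` class, the choice of required fields, and the default of 30 steps are assumptions made for illustration, not agentprobe's documented schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaseSpec:
    """In-memory form of one test case (hypothetical, for illustration)."""

    name: str
    instruction: str
    success_criteria: str
    failure_criteria: Optional[str] = None
    max_steps: int = 30
    verification_prompt: Optional[str] = None

    @classmethod
    def from_dict(cls, raw: dict) -> "CaseSpec":
        # Assumption: these three fields are treated as mandatory.
        required = ("name", "instruction", "successCriteria")
        missing = [key for key in required if key not in raw]
        if missing:
            raise ValueError(f"missing required fields: {', '.join(missing)}")
        max_steps = int(raw.get("maxSteps", 30))
        if max_steps < 1:
            raise ValueError("maxSteps must be a positive integer")
        return cls(
            name=raw["name"],
            instruction=raw["instruction"],
            success_criteria=raw["successCriteria"],
            failure_criteria=raw.get("failureCriteria"),
            max_steps=max_steps,
            verification_prompt=raw.get("verification", {}).get("prompt"),
        )


# The onboarding example from above, as it would look once parsed.
raw = {
    "name": "onboarding-flow",
    "instruction": "Open the app and complete onboarding until you see the dashboard",
    "successCriteria": "Dashboard screen is visible",
    "failureCriteria": "App crashes or shows error dialog",
    "maxSteps": 30,
    "verification": {"prompt": "Is the dashboard visible and functional?"},
}
case = CaseSpec.from_dict(raw)
```

Validating up front means a typo in a field name fails fast with a readable message instead of surfacing as a confusing agent run.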
MIT licensed. No agent infra in the middle.
agentprobe is MIT licensed and entirely self-hostable. It runs on your own GitHub Actions runner — there is no Agent Labs infrastructure between your test runner and your app. Your screenshots, your API keys, and your source code stay on your machines.
pip install agentprobe