Use Abrantes with AI
How to generate experiments (A/B tests) with Abrantes using AI tools and models. Recommendations are based on practical tests.
Important disclaimer
This workflow is experimental. Do not rely on AI alone to create experiments.
- Always review generated code before using it.
- Always test variants and tracking before launch.
- Treat model output as draft code, not as production-ready code.
Recommended workflow
- Preferably use a coding-focused AI tool that can read files in a local clone of the Abrantes repo. Some online tools worked, but many did not.
- Include your target URL and experiment hypothesis in the prompt.
- Use a skill, or provide the `SKILL.md` file with the prompt. (Or reference the Abrantes docs URL if upload is not supported.)
- Run and validate the output manually before launch.
Tools that worked well
chat.z.ai + GLM-5 agent mode
Best online result in these tests. Worked well with uploaded skill and prompt.
OpenCode + MiniMax M2.5 free
Desktop workflow worked well. The model scanned repo context and generated correct code.
Gemini CLI + Gemini 3 Flash preview
Good output when permissions were granted for required actions.
Codex + GPT 5.3 codex medium
Worked reliably in these tests. Allow curl so the model can access docs and pages.
Claude Code + Sonnet 4.6 (Medium)
Worked well using the /experiment skill workflow.
Antigravity + Gemini 3.1 Pro (high)
Worked after being directed to use the experiment skill; the initial result was hallucinated.
Tools that failed in tests
These results were noted on 2026-02-28 and may change as tools improve.
- GitHub + Claude Haiku 4.5 (online): could not access the tested page.
- Gemini online: hallucinated the implementation.
- Mistral Le Chat: hallucinated the implementation.
- Grok 4.20 (beta): many selector and implementation mistakes.
- Qwen 3.5 plus reasoning: hallucinations, and struggled with `add2dataLayer`.
- DeepSeek: generated hallucinations and ignored the real page context.
- Google NotebookLM: ignored the uploaded `SKILL.md` guidance.
- ChatGPT Mac app (free plan): could not access tested URLs directly in this workflow.
Final recommendations
- Evaluate a tool + model pair, not the model alone.
- Desktop coding tools can perform better than online tools because they can inspect local repo files.
- If your tool cannot access the tested page URL, it cannot generate a working experiment.
- Before launch, validate selectors, assignments, rendering, persistence, and tracking events.
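The pre-launch checks above can be sketched as a small helper run in the browser console on the experiment page. This is not part of Abrantes's API: the selector, storage key, and `dataLayer` event name below are placeholder assumptions for illustration, and `window.dataLayer` assumes a GTM-style tracking setup; adapt each value to how your experiment actually assigns, persists, and tracks variants.

```javascript
// Hypothetical pre-launch sanity check (run in the browser console).
// All names here are placeholders, not Abrantes APIs.
function validateExperiment({ selector, storageKey, eventName }) {
  return {
    // Rendering: the element the variant modifies must exist on the page.
    selectorFound: document.querySelector(selector) !== null,
    // Persistence: the assigned variant should survive a reload
    // (here assumed to be stored in localStorage).
    assignmentPersisted: localStorage.getItem(storageKey) !== null,
    // Tracking: the exposure event should have been pushed to the dataLayer.
    eventTracked: Array.isArray(window.dataLayer) &&
      window.dataLayer.some((entry) => entry.event === eventName),
  };
}

// Example call (placeholder values -- replace with your experiment's own):
// validateExperiment({
//   selector: '#cta-button',
//   storageKey: 'ab_variant',
//   eventName: 'experiment_view',
// });
```

Any `false` in the result means that check needs a manual look before launch; none of this replaces clicking through each variant yourself.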