skip to content
Terry Li

Browser Use just open-sourced Browser Harness, a 592-line Python bridge that gives LLMs raw Chrome DevTools Protocol access. No Playwright. No Selenium. No abstraction layer at all. Their stated philosophy is “no framework, no recipes, no rails.” The LLM sends CDP commands directly and figures out how to interact with each page on its own.

It is a genuinely interesting project, and I spent an afternoon pulling it apart. But the thing worth stealing turned out to have nothing to do with their anti-framework thesis. It was a three-line side-channel in the navigate function.

When the LLM navigates to a URL, the function checks whether a domain-skills directory exists for that hostname. If it does, it returns the matching file paths alongside the page title and URL. That is the entire mechanism. The LLM sees the paths, reads the relevant skill file, and gets domain-specific knowledge about that site — which selectors work, which flows succeed, which edge cases need handling.

The irony is that Browser Harness says “no framework” but domain-skills is absolutely a framework. It is a structured knowledge layer that accumulates over time and shapes how the agent operates. The difference is that it is not designed upfront. The skill files are written by the agent during execution, not by a human architect. The agent encounters a new site, figures out what works, records it, and the next session benefits. The helpers.py file follows the same pattern — the LLM notices a missing function, writes it, and the harness grows.

This is not the absence of structure. It is structure that emerges from use rather than being prescribed in advance. The Bitter Lesson applied to tooling: let the agent learn what works instead of telling it.

What I actually built after studying this was the navigation-triggered injection pattern, ported into my own browser automation stack. When my agent navigates to linkedin.com, the navigate response now includes paths to my LinkedIn-specific skill files. Zero-cost context delivery at exactly the moment it becomes relevant. The agent does not need to remember to check for domain knowledge. The infrastructure surfaces it.

The broader principle is that the best agent infrastructure is not about giving agents more power or less structure. It is about delivering the right context at the right moment. Full frameworks like Playwright over-specify the interaction model. Raw protocols like CDP under-specify it. The sweet spot is thin tools plus contextual knowledge injection — minimal mechanism, maximal memory.

Every agent system I have built converges on this same architecture. Skills fire when their description matches the current task. Domain knowledge surfaces when the agent enters a matching context. Coaching notes are prepended to dispatches based on the target tool. None of this is the agent being smart. It is the infrastructure being attentive.

The framework that writes itself is still a framework. It just has better taste about when to show up.

· · ·

Keep reading