My use case is mainly to make it easier to show Claude Code a problem with an SPA as I develop it. Claude’s decent at traditional server-rendered stuff, since it can curl and reason a bit about the responses, but SPAs require something more like your tool here.
but why a new thing vs extending selenium? it's a little complicated, but neither selenium nor playwright were designed with ai in mind from day 1. with vibium, i'm optimizing for "vibe coding" and ai-driven workflows first.
This makes sense. I guess I wanted to understand why starting from scratch was better than "fixing" selenium, but perhaps "fixing" selenium isn't an option?
for the entire testing tools industry, in some ways, selenium was the "final boss" to beat. every new tool had to trash selenium in their marketing. eventually those "hit points" added up. "fixing selenium" is as much of a branding problem as it is a technical problem. "oh, there's a new version of selenium? i heard selenium sucks!" is actually a problem that has to be dealt with. an entire new generation of coders only know "playwright rules, selenium drools".
of course, i have a new host of problems by going all in with "vibium"... i'm making a huge bet that "vibe coding" is a trend, not a fad. (it could still be a fad! we'll see if this post ages well soon enough!)
That makes a lot of sense. Sometimes it's easier to leave the baggage behind. It's too bad... selenium is a masterpiece. Thanks for sharing it with the world.
I'd love to be able to lock down the browser to only allow certain URLs (e.g. localhost) so I can give Claude (and other tools) carte blanche to use browser automation (rather than manually approving each command). Is this something on your radar / roadmap?
fully aware of the "blast radius" risk of using claude to do stuff. i'm doing all my vibium dev in a vm using UTM (and you should, too!). wonder if there are some network rules we can add.
i did post a v2 roadmap on the github repo. might be time to start the draft for v3!
I was looking into this earlier -- presumably you'd also need to allowlist Claude itself (whatever endpoints it hits to run inference etc). VM firewall gets a little trickier with Claude's web search tool, too.
The solution I landed on recently was to locally modify the Chrome devtools MCP to launch the browser instance with strict network restrictions. I believe the implementation used `--host-resolver-rules`, blocking all URLs by default with an environment variable to control the allowlist (which, in hindsight, Claude can easily work around if it needs to -- I should probably just hard-code the allowlist).
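A minimal sketch of the hard-coded variant described above, assuming the browser is launched with an argument array. The names `ALLOWED_HOSTS`, `buildHostResolverRules`, and `buildLaunchArgs` are hypothetical and are not taken from the Chrome DevTools MCP or Vibium. The rule string uses Chromium's `MAP * ~NOTFOUND` plus `EXCLUDE` syntax. It only restricts hostname resolution, so it is probably best treated as one layer rather than a full sandbox.

```typescript
// Hard-coded allowlist (hypothetical values): hostnames the browser may resolve.
// Keeping it in code rather than in an environment variable means the agent
// cannot widen it just by changing its environment.
const ALLOWED_HOSTS: readonly string[] = ["localhost"];

// Build the value for Chromium's --host-resolver-rules flag.
// "MAP * ~NOTFOUND" makes every hostname fail to resolve;
// each "EXCLUDE host" exempts one hostname from that rule.
function buildHostResolverRules(allowedHosts: readonly string[]): string {
  for (const host of allowedHosts) {
    // Reject anything that could smuggle an extra rule into the flag value.
    if (!/^[A-Za-z0-9.-]+$/.test(host)) {
      throw new Error(`invalid hostname: ${host}`);
    }
  }
  const rules = ["MAP * ~NOTFOUND", ...allowedHosts.map((h) => `EXCLUDE ${h}`)];
  return rules.join(", ");
}

// Launch arguments to append to the browser command line.
function buildLaunchArgs(allowedHosts: readonly string[] = ALLOWED_HOSTS): string[] {
  return [`--host-resolver-rules=${buildHostResolverRules(allowedHosts)}`];
}
```

When the arguments are passed as an array to a process-spawning call, no shell quoting is needed. If they go through a shell instead, the rule string needs to be quoted because it contains spaces and `*`.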
Hi, this looks really valuable; thanks for developing and sharing. Would you share some use cases and how you or your users use it personally? I would love to see some examples and feel the aha of "That's how I'd like to use it too!" It would help me see the problems I have as being solvable by this too, rather than seeing a tool/solution looking for a problem. (Not implying you're that, but without examples/use cases that's my default way of thinking.)
lots of people have already been posting examples of how they used vibium on linkedin. (code's only been available for a day or two, so we're just getting started!)
we also have a new discord server for the project that we just spun up and will be opening up more widely soon. discord could be a good place to share use cases and experiments until we set up a more formal website structure.
What do the next 5 years look like, given that you are very good at building long-term projects that last and evolve through time? And for a very specific example, what’s the plan for incorporating new standards like Agent Skills as they quickly evolve and launch?
short term: yeah, we should totally add agent skills asap! new year's eve goal?
as far as long term plans go, i like the tim o'reilly quote: "create more value than you capture".
with selenium, we created an entire ecosystem of tools, users, companies, and economic activity. (literally billions of usd -- it's a story frequently ignored by the tech press when looking for "open source success stories".) i hope to do the same with vibium. there will likely be a hosted "vibium.cloud" service, and i hope there will be lots of others, too. similarly, there weren't many "hosted selenium" services when i started sauce labs. now there's a bunch: browserstack, lambdatest, etc.
it was also not really an accident we did that with selenium. there is a lot of behind-the-scenes consensus building that happens to make things like a w3c webdriver standard happen. (fun fact: vibium relies on the new w3c standard "webdriver bidi" protocol, heavily inspired by the chrome devtools protocol used by playwright. tl;dr: it's just json over websockets.)
i'm betting on industry cooperation, standards, and shared prosperity. that's my 5 year plan!
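To make the "json over websockets" remark concrete, here is a hedged sketch of what one WebDriver BiDi exchange looks like on the wire. The command and reply shapes follow the W3C WebDriver BiDi draft as I understand it. The helper names `navigateCommand` and `parseReply` are hypothetical and are not Vibium's API, and the WebSocket transport itself is left out.

```typescript
// A BiDi command is a JSON object with an id, a method name, and params.
interface BidiCommand {
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Replies echo the command id so the client can match them up.
interface BidiSuccess {
  type: "success";
  id: number;
  result: Record<string, unknown>;
}
interface BidiError {
  type: "error";
  id: number | null;
  error: string;
  message: string;
}
type BidiReply = BidiSuccess | BidiError;

// Build a browsingContext.navigate command for an existing context (tab).
// wait: "complete" asks the browser to reply once the page load has finished.
function navigateCommand(id: number, context: string, url: string): BidiCommand {
  return { id, method: "browsingContext.navigate", params: { context, url, wait: "complete" } };
}

// Parse one incoming text frame and keep it only if it is a command reply.
// Frames with type "event" are rejected here to keep the sketch small.
function parseReply(raw: string): BidiReply {
  const msg = JSON.parse(raw) as { type?: unknown; id?: unknown };
  if (msg.type !== "success" && msg.type !== "error") {
    throw new Error(`not a command reply: ${String(msg.type)}`);
  }
  return msg as BidiReply;
}
```

A client would send `JSON.stringify(navigateCommand(1, contextId, url))` as a text frame and wait for the reply whose `id` is `1`.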
How does it handle context bloat between the browser and the llm?
Any plans of exposing more of the browser? For instance playwright is able to store tracing files the agent may decide to read to understand some requests / payloads…
Any plans on allowing the agent to run an arbitrary js script?
i definitely have plans to expose more of the browser! at the moment, it's very limited. i'm not sure if anyone has completely nailed the context bloat problem -- it's worth more study and benchmarking. i suspect the long term answer is "don't use mcp". but mcp (warts and all) felt like a table-stakes feature for a v1 release.
also need to clarify: there are two apis exposed right now: the mcp server and a "plain old" js/ts api. the js/ts api does have the ability to run arbitrary js. theoretically, you could ask an agent to write a vibium script with the js/ts library, and have the ai run that... (which ironically? is also a way to deal with the issue of context bloat)
Neat. Any reason why the MCP server doesn't expose a JavaScript/eval tool? Current models excel at writing JS to drive and inspect the DOM. They aren't great at driving browsers via screenshots.
> why the MCP server doesn't expose a JavaScript/eval tool?
no reason other than my #1 goal was "ship something". i only started the actual coding on dec 11. it's been a bit of a sprint the last two weeks!
though "image-based" vs "dom-based" testing approaches is a very big topic! (look forward to researching that more in the future.)
FWIW, if you have Claude Code or the like, you can quickly prompt your way to an eval function in MCP. It already exists in clicker and the client API. You can use it to get the accessibility tree, for example, and use that to find what to fill out and click.
> I do think Playwright is the de facto standard now
i'll politely push back a little. i think it's safe (at this moment in time) to say: playwright wins the first derivative, but selenium wins the "area under the curve". selenium is very entrenched in many parts of the world, especially outside of SF/USA. part of the inbound interest i've been getting for vibium is from those selenium users who want some kind of bridge to the future, but didn't have an obvious path forward beyond "dump selenium, adopt playwright"...
part of my plan with vibium post-v1 is to give that massive (and it truly is massive, i'm not bragging) installed base of selenium users an upgrade path to more agentic coding options.
Playwright really simplifies getting set up. It won't work for everyone, but within 30 seconds Playwright will download its needed browsers along with a test runner.
I also find the documentation is much better/consolidated.
Definitely open to helping you out if I can be of assistance.
"npm install vibium" also downloads the needed browser as part of the install.
right now, code-wise -- for the code you see in github at the moment -- it's just me and my ai pal, claude. but there's a growing cast of (human!) characters also helping with all the other things we need to do to run a successful project. patches and tokens welcome!
entirely possible I’m just really bad at this stuff but I can’t get browser agents to do simple report pulls without running into a captcha or a dropdown menu that breaks its brain. hopefully this is the one!
playwright got a lot of things right. one of the big ones was a fast websockets+json way to drive the browser. (vibium is using the w3c standard equivalent - webdriver bidi). but they also raised the bar on usability and developer experience. i hope to get to the level of "click, click, awesome" out-of-the-box experience that playwright did so well.
Does it allow you to inject js, modify the DOM, and most crucially monitor/modify network requests? I do those things probably 95-99% of the time I reach for playwright mcp in claude, and from the "For Agents" part of the README, it seems like all this can do is click/type/screenshot?
Thanks. I would love to understand what people are doing with Playwright that doesn't involve those things. I really can't recall ever using it where that wasn't what I was doing. I use it letting Claude fix things. You can't fix what you can't see! What else are people using it for? Obviously there must be a (very popular!) use case for "just clicking", but I can't seem to imagine it.
In my experience, we've used playwright significantly for unit/integration tests, combining it with react-testing-library to verify individual components and also whole flows within that React application. (Those flows were mocked; we used something else that I can't seem to remember for E2E tests.)
To me, doing network interception in browser-driven tests is a smell, unless you’re running against a fully mocked server (like MSW).
I’m a big fan of testing exactly like a user. Users don’t use network intercepts, timeouts, etc. All of my most reliable tests assert on DOM state. If the user doesn’t see it, don’t assert on it.
If an agent gets a copy of the screen using browser_screenshot and then wants to click somewhere on that screen, how is it meant to find the right css selector to pass to browser_click?
There's a browser_find method, but that assumes you already know what type of element it is. But I can't always tell what type of element something is just by looking at a screenshot.
For right now, the MCP server doesn’t expose quite enough to navigate on its own.
I’ve added a browser_evaluate tool in my fork—though I haven’t committed or pushed a PR yet. With that, the agent can call JavaScript to get the accessibility tree and then use that to navigate via browser_find.
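As a rough illustration of the idea above, here is a sketch of the kind of helper an agent could run through a `browser_evaluate`-style tool to turn interactive elements into selector candidates for `browser_find` or `browser_click`. Everything here is an assumption for illustration: `describeElement`, the `ElementLike` shape, and the role table are hypothetical, the role mapping is a crude approximation of the real accessibility tree, and ids with special characters would need escaping.

```typescript
// Minimal structural view of a DOM element, so the helper can be exercised
// outside a browser. Real DOM elements satisfy this shape.
interface ElementLike {
  tagName: string;
  id: string;
  textContent: string | null;
  getAttribute(name: string): string | null;
}

interface ElementSummary {
  selector: string; // CSS selector candidate to pass to a find/click tool
  role: string;     // explicit ARIA role, or a rough guess from the tag
  label: string;    // short human-readable label to match against a screenshot
}

// Very rough tag-to-role guesses; the real implicit-role rules are richer.
const IMPLICIT_ROLES: { [tag: string]: string | undefined } = {
  a: "link",
  button: "button",
  input: "textbox",
  select: "combobox",
  textarea: "textbox",
};

function describeElement(el: ElementLike): ElementSummary {
  const tag = el.tagName.toLowerCase();
  const name = el.getAttribute("name");
  // Prefer an id selector, then tag plus name attribute, then the bare tag.
  const selector = el.id ? `#${el.id}` : name ? `${tag}[name="${name}"]` : tag;
  const role = el.getAttribute("role") ?? IMPLICIT_ROLES[tag] ?? "generic";
  const label = (el.getAttribute("aria-label") ?? el.textContent ?? "").trim().slice(0, 80);
  return { selector, role, label };
}

// In a page, the agent-side script would be along the lines of:
//   Array.from(document.querySelectorAll("a, button, input, select, textarea, [role]"))
//     .map(describeElement)
```

Returning a compact list like this, rather than the full DOM, also keeps the amount of text sent back to the model small.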
christophilus | 4 hours ago
[OP] hugs | 4 hours ago
christophilus | 3 hours ago
xnx | 3 hours ago
anamexis | 3 hours ago
[OP] hugs | 3 hours ago
to save a click, i'll post it here, too:
-----------
why vibium?
there are dozens of "ai-powered browser" tools now. so why this one?
the selenium ecosystem is massive: millions of tests, thousands of companies, decades of investment. but there's no obvious bridge to the ai future. many have moved to playwright — and for good reason: it's fast, easy to use, has popular features like auto-waiting, integrated video recording, and a ton of other batteries included.
vibium takes the same approach. batteries included. great dx. but built for where the industry is going: ai agents that need to drive browsers.
when i did those interviews in september, the response wasn't just "cool idea." it was relief. the community trusts us to build this bridge because we built the last two: selenium in 2004, appium in 2012.
community and ecosystem are the moat.
anamexis | 3 hours ago
AFAIK Playwright also takes the approach of batteries included, great dx, and has a lot of good integration with AI agents.
Basically, what sets Vibium apart?
therunninglight | an hour ago
suchintan | 3 hours ago
What was the reason you went down this path instead of extending selenium with AI features?
[OP] hugs | 3 hours ago
suchintan | 2 hours ago
[OP] hugs | 2 hours ago
suchintan | 2 hours ago
moss_dog | 3 hours ago
ramoz | 3 hours ago
A custom sh script or something for whitelists would take ~5min to set up.
For more robust governance (many policies), you can write Rego using https://github.com/eqtylab/cupcake
https://code.claude.com/docs/en/hooks#mcp-tool-naming
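A sketch of the allowlist check such a hook script could perform, written in TypeScript rather than sh. It assumes the hook receives JSON with `tool_name` and `tool_input` fields and that MCP tools are named `mcp__<server>__<tool>`, as the linked hooks page describes. The server name `vibium`, the `url` argument, and the helper names are hypothetical. It only inspects the URL passed in the tool call, so redirects and in-page navigation would not be caught.

```typescript
// Hard-coded allowlist of hosts the agent's browser tools may open (hypothetical).
const ALLOWED_HOSTS: ReadonlySet<string> = new Set(["localhost", "127.0.0.1"]);

// Assumed shape of the JSON a pre-tool-use hook receives on stdin.
interface HookInput {
  tool_name: string;
  tool_input: { url?: unknown };
}

// Extract the lowercase host from an http(s) URL, or null if it is not one.
// Backslashes end the authority too, since browsers treat them like "/".
function hostOf(url: string): string | null {
  const m = /^https?:\/\/([^\/\\?#]*)/i.exec(url);
  if (!m) return null;
  const authority = m[1] ?? "";
  const hostPort = authority.slice(authority.lastIndexOf("@") + 1); // drop userinfo
  return hostPort.replace(/:\d*$/, "").toLowerCase(); // drop port
}

function decide(input: HookInput): { allow: boolean; reason: string } {
  // Only police the (hypothetically named) vibium MCP server's tools.
  if (!input.tool_name.startsWith("mcp__vibium__")) {
    return { allow: true, reason: "not a browser tool" };
  }
  const url = input.tool_input.url;
  if (typeof url !== "string") {
    return { allow: true, reason: "no url argument" };
  }
  const host = hostOf(url);
  if (host !== null && ALLOWED_HOSTS.has(host)) {
    return { allow: true, reason: `host ${host} is allowlisted` };
  }
  // Anything unparseable or off-list is denied (fail closed).
  return { allow: false, reason: `blocked navigation to ${url}` };
}
```

A wrapper script would read the JSON from stdin, call `decide`, and signal a block to the agent in whatever way the hooks documentation specifies when `allow` is false.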
moss_dog | 3 hours ago
[OP] hugs | 3 hours ago
falcor84 | 2 hours ago
moss_dog | 2 hours ago
mannanj | 3 hours ago
[OP] hugs | 3 hours ago
rancar2 | 3 hours ago
https://github.com/VibiumDev/vibium/blob/main/V2-ROADMAP.md
[OP] hugs | 2 hours ago
hcoura | 3 hours ago
[OP] hugs | 3 hours ago
michelb | 3 hours ago
[OP] hugs | 2 hours ago
chews | 2 hours ago
nivekney | 3 hours ago
https://github.com/VibiumDev/vibium/commits/main/?after=ffc3...
therunninglight | 57 minutes ago
ripped_britches | 3 hours ago
[OP] hugs | an hour ago
v1 is about getting to a base-line of functionality.
things get interesting in v2: https://github.com/VibiumDev/vibium/blob/main/V2-ROADMAP.md
badlogic | 2 hours ago
[OP] hugs | 2 hours ago
v1 announcement: https://github.com/VibiumDev/vibium/blob/main/docs/updates/2...
coty | 2 hours ago
999900000999 | 2 hours ago
It's been an interesting journey. I do think Playwright is the de facto standard now, but Selenium was the original browser driver.
Anyway, how does Vibium compare to Playwright? Playwright's main advantage is that it has official support for multiple languages.
[OP] hugs | 2 hours ago
steve_adams_86 | 2 hours ago
therunninglight | an hour ago
999900000999 | an hour ago
[OP] hugs | an hour ago
starik36 | 2 hours ago
therunninglight | an hour ago
source: https://www.linkedin.com/posts/apzal-bahin_ai-mcp-browseraut...
jeff4f5da2 | 2 hours ago
[OP] hugs | an hour ago
captainregex | 2 hours ago
[OP] hugs | 2 hours ago
therunninglight | 56 minutes ago
rukuu001 | 2 hours ago
I’m interested in checking out Vibium - I’ve been a reluctant adopter of Playwright and hopeful for a new approach.
[OP] hugs | 2 hours ago
dmd | 2 hours ago
[OP] hugs | an hour ago
not yet. definitely on the roadmap, though. goal is to embrace what playwright has done well, then extend what's possible...
dmd | an hour ago
[OP] hugs | an hour ago
therunninglight | an hour ago
VoidWhisperer | an hour ago
Robdel12 | an hour ago
dmd | 49 minutes ago
j2kun | an hour ago
therunninglight | an hour ago
- MCP option (where tokens will eventually get burned) Getting Started with Vibium MCP: https://github.com/VibiumDev/vibium/blob/main/docs/tutorials...
rahimnathwani | an hour ago
What have I missed or misunderstood?
coty | an hour ago
This and much more will be coming soon. See the V2 roadmap for more insight: https://github.com/VibiumDev/vibium/blob/main/V2-ROADMAP.md
OutOfHere | an hour ago