This is a great idea. I can’t believe I didn’t think of this, given I generate and screenshot so many “poster images” in html just like this. Haven’t played around a ton but seems intuitive. Nice work!
Why? I assume the intention is to show these images on a webpage somewhere. WebP is well-supported by browsers and can store lossless images at better compression ratios than PNG, so why not use it? I don't think using a lossy format like JPEG makes much sense. JPEG is a fine format for photos, but for HTML content rendered as an image I assume most people would want a lossless format so you don't get artifacts.
Any similar AI-based services/agents that can take images/creative assets (e.g. Figma, Sketch, or Adobe PS files) and create production-ready emails and landing pages in HTML?
They're bedazzled by a little bit of marketing flair.
I’m afraid that, of all the waiting strategies available in Puppeteer/Playwright, waiting a fixed period is the worst possible. Maybe consider exposing the proper waiting strategies (load/domcontentloaded/networkidle), or even the more fine-grained ones: https://playwright.dev/docs/actionability
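For illustration, a minimal Playwright sketch of those strategies; the URL, selector, and output name below are placeholders, not anything exposed by the service:

```typescript
import { chromium } from "playwright";

const browser = await chromium.launch();
const page = await browser.newPage();

// Wait for the page's network to go idle instead of sleeping for a fixed period.
await page.goto("https://example.com/render-me", { waitUntil: "networkidle" });

// Or wait for a specific element to actually be visible before capturing.
await page.locator("#chart").waitFor({ state: "visible" });

await page.screenshot({ path: "capture.png", fullPage: true });
await browser.close();
```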
Ah haha. I love this conversation of trying to find a product market fit in public.
It certainly does; that's why it's been a common dev tool for a bit over 20 years. I'm not really sure what the point of OP making it a web app is, though.
I’ve been doing this manually by having a static development-only route on my website and taking a “node screenshot” using the Chrome developer tools. This is definitely a better way, well done!
It's nice-looking for sure, but much more complex than using `wkhtmltox` with `pngquant`, `optipng`, and/or ImageMagick `convert` locally, especially since the learning curve seems to be about equivalent.
Yeah, I thought that as well. So I was wondering if that's some kind of a joke, or maybe modern html is so fucked up that all usual solutions became obsolete since the last time I did that.
Alternatively, open devtools, press ctrl+shift+p (or cmd+shift+p on a Mac) to open the command palette, search for 'screenshot' and choose 'Capture full size screenshot' to do the same thing in your browser. There's 'area screenshot' for selecting an area, 'screenshot' for the viewport, and even 'node screenshot' for capturing the selected DOM node.
(checks list) --Mh, yah so, you've got a point there (scribbles, smiles, extends hand) --Welcome dear Sir or Madam, or, as we will call you, Number Thirteen!
It looks like this app has helpful functions for size, format, and transparency that you can't do with the built-in Chrome command all at once without probably piping it through ImageMagick or something. And even then, maybe this site renders the HTML responsively before rasterizing.
The only tech you can trust to be around is the tech you control. And even then it's still a bit iffy if you didn't write all the code yourself and you host it on someone else's servers.
eastoeast | 17 hours ago
xiaohanyu | 16 hours ago
dtagames | 16 hours ago
Mogzol | 14 hours ago
kaizenb | 13 hours ago
dtagames | 4 hours ago
benatkin | 15 hours ago
I thought WebP would be better for this and checked again just to be sure, and yes, it would be better for this. WebP is quite well supported, albeit not as well supported as PNG, and it can produce significantly smaller files than PNG for the same lossless image.
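As a rough illustration of that size comparison, a sketch assuming Node with the sharp library (which the comment doesn't mention; the file name is made up):

```typescript
import sharp from "sharp";
import { promises as fs } from "node:fs";

// Re-encode a rendered PNG as lossless WebP and report both sizes.
const input = "render.png"; // hypothetical output of the HTML-to-image step
const pngBytes = (await fs.stat(input)).size;
const webpBuffer = await sharp(input).webp({ lossless: true }).toBuffer();

console.log(`PNG:  ${pngBytes} bytes`);
console.log(`WebP: ${webpBuffer.length} bytes (lossless)`);
```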
RyanShook | 15 hours ago
chevman | 15 hours ago
geooff_ | 15 hours ago
threeducks | 12 hours ago
Retr0id | 14 hours ago
apeters | 13 hours ago
yeasku | 9 hours ago
KellyCriterion | 8 hours ago
aembleton | 5 hours ago
KellyCriterion | 4 hours ago
good one!!!
RadiozRadioz | 13 hours ago
Generally I find production-ready images have more synergy and tend to be web-scale. Often they're built from the ground up for AI & are blazing fast, at scale, and empower your team whilst unlocking new possibilities. As my sibling comment suggests, being cloud-native is a crucial factor too.
4ndrewl | 11 hours ago
ludicrousdispla | 10 hours ago
estebarb | 3 hours ago
lima | 2 hours ago
back2reddit | 11 hours ago
No cruft. No legacy formats.
Just buttery smooth production readiness.
andrecarini | 10 hours ago
b0ner_t0ner | 9 hours ago
But buttery bloated if the images don't run OptiPNG before exporting.
xgulfie | 5 hours ago
jsight | 4 hours ago
Claude Code loves to say that everything is production ready, even if it doesn't quite compile or pass automated tests yet.
vbezhenar | 30 minutes ago
tbrownaw | 14 hours ago
franze | 7 hours ago
oefrha | 13 hours ago
Retr0id | 13 hours ago
oefrha | 12 hours ago
jihchi | 13 hours ago
JimDabell | 13 hours ago
Garlef | 12 hours ago
JimDabell | 10 hours ago
mcny | 7 hours ago
What if the input to the JavaScript (mermaid in this case) is not trusted to run on the end client machines but by running untrusted input on a sandbox (this service, or self hosted idk) is somehow acceptable and the output a blob of an image is acceptable to display on the actual client machines.
Takes the planets to align just right and need us to squint just enough but I think we can find something if we look hard enough.
But then mermaid can simply output PNG so you could run it as a worker... Thinking...
reassess_blind | 13 hours ago
devmor | 13 hours ago
spiderfarmer | 5 hours ago
agentifysh | 12 hours ago
That "Not MCP" is so refreshing it makes me laugh out loud.
It's literally what I've been saying all along since I came across MCP: "why can't I just give the agent a prompt and it will run the REST API calls for me?"
There are still some MCPs which make sense, but we have them for literally everything when just a prompt will do the job!
Now, on the topic of html2png, I do wonder: is this like the self-hostable version on GitHub, https://github.com/maranemil/HTML2Png, where they use canvas? Or is this something else?
jumploops | 12 hours ago
Not that it matters, but curious what percentage of this service was “vibe-coded”?
WilcoKruijer | 11 hours ago
me_bx | 11 hours ago
I'm not sure of what "production ready" is supposed to mean here, but the demo image is not optimized, `optipng` command decreases its size by 53.21%.
kristopolous | 8 hours ago
[OP] alvinunreal | 8 hours ago
spiderfarmer | 5 hours ago
the_arun | 2 hours ago
derefr | an hour ago
> I'm not sure of what "production ready" is supposed to mean here
Given this text at the bottom:
> The high-performance HTML to PNG engine. Built for developers, agents, and automation. Completely free to use. All generated assets are public and ephemeral.
...I assume the implications are that:
1. this service will scale to meet request load without QoS degradation (i.e. it's probably running on FaaS infra), rather than being a fixed-size slowly-elastic cluster that would get choked out if your downstream service got popular and flooded it with thousands of concurrent requests
2. you can directly take the URLs the service spits out, and serve them to your downstream service's clients, without worrying much about deliverability, because there's an object store + edge CDN involved.
In other words, it's not just a single headless-chromium instance running on a box somewhere; you could actually use this thing as an upstream dependency and rely on it.
> the demo image is not optimized, `optipng` command decreases its size by 53.21%
Given that the author's imagined use-case is giving non-multimodal LLMs a way to emit visuals (the prompt at the bottom of the page starts "When asked to create visuals, charts, or mockups"), I think their idea is that the resulting rendered images would more-likely-than-not only be requested once, immediately, to display the result to the same user who caused the prompt to be evaluated.
Where, in that case, the metric of concern isn't "time+bytes cost for each marginal fetch of the resulting image from the CDN"; but rather "end-to-end wall-clock time required to load the HTML in the headless browser, bake the image, push it to the object store, and serve it once to the requesting user."
OptiPNG would slightly lower that last "serve it once" cost, but massively inflate the "bake the image" time, making it not worth it.
(I suppose they could add image optimization as something you could turn on — but "image optimization at the edge" is already a commodity product you can get from numerous vendors, e.g. Cloudflare.)
rognjen | 10 hours ago
krick | 9 hours ago
mewpmewp2 | 7 hours ago
randoments | 10 hours ago
mattrighetti | 8 hours ago
Think of the GitHub thumbnails where the PR number changes constantly and has to be reflected on the image preview
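A hypothetical sketch of that kind of dynamic preview image, using Puppeteer; the HTML template, viewport size, and file name are invented for illustration:

```typescript
import puppeteer from "puppeteer";

// Bake a changing value (e.g. a PR number) into a preview image.
async function renderPreview(prNumber: number): Promise<void> {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setViewport({ width: 1200, height: 630 }); // common Open Graph size
  await page.setContent(`
    <div style="font: 48px sans-serif; padding: 40px;">
      Pull Request #${prNumber}
    </div>
  `);
  await page.screenshot({ path: `pr-${prNumber}.png` });
  await browser.close();
}

await renderPreview(1234);
```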
albert_e | 8 hours ago
I am sure @simonw has some ideas :) -- he recently blogged about HTML tools, which is also one of my favorite use cases for LLMs.
Maybe similar to SVG generation, this could be a more powerful / flexible way to generate complex images / screen mockups and the like on-the-fly.
PS: How do the economics work -- how is this free to use?
PS2: The live HTML editor seems buggy. The cursor is off by one position and messes up editing. (Chrome on Windows)
onion2k | 5 hours ago
spiderfarmer | 5 hours ago
[OP] alvinunreal | 4 hours ago
thatgerhard | 5 hours ago
stronglikedan | 2 hours ago
xnx | 4 hours ago
google-chrome --headless --screenshot=my_screenshot.png https://www.example.com
stronglikedan | 2 hours ago
nabeards | 2 hours ago
DemocracyFTW2 | an hour ago
thekevan | 21 minutes ago
So it's installed now but still un-personalized like it was installed 5 minutes ago. I don't use it except with Antigravity.
Lord_Zero | 23 minutes ago
donohoe | 4 hours ago
What’s the catch, or how can I be sure it will still be around in 3 months?
No snark, genuinely curious as I would use this if I could count on it.
leptons | 2 hours ago
Yash16 | 3 hours ago
scosman | 2 hours ago
dom96 | 38 minutes ago
I created an svg to png API to generate open graph images a while back. It works pretty well and can be hosted on Cloudflare Workers for free.
https://github.com/dom96/svg-renderer