Problem: DOM-based text measurement (getBoundingClientRect, offsetHeight)
forces synchronous layout reflow. When components independently measure text,
each measurement triggers a reflow of the entire document. This creates
read/write interleaving that can cost 30ms+ per frame for 500 text blocks.
Solution: two-phase measurement centered around canvas measureText.
prepare(text, font) — segments text via Intl.Segmenter, measures each word
via canvas, caches widths, and does one cached DOM calibration read per
font when emoji correction is needed. Call once when text first appears.
layout(prepared, maxWidth, lineHeight) — walks cached word widths with pure
arithmetic to count lines and compute height. Call on every resize.
~0.0002ms per text.
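The two-phase split above can be sketched in a few lines. This is a hypothetical simplification, not pretext's actual code: `measureWord` stands in for canvas `measureText`, the word splitting ignores Intl.Segmenter, soft hyphens, RTL, and emoji, and the wrap is a plain greedy walk.

```typescript
type Prepared = { words: number[]; spaceWidth: number };

// Phase 1: segment and measure once. This is the expensive part, paid
// per text, not per resize. measureWord is a stand-in for measureText.
function prepare(text: string, measureWord: (w: string) => number): Prepared {
  const words = text.split(/\s+/).filter(Boolean).map(measureWord);
  return { words, spaceWidth: measureWord(" ") };
}

// Phase 2: pure arithmetic over cached widths, safe to call every resize.
function layout(p: Prepared, maxWidth: number, lineHeight: number) {
  let lines = 1;
  let x = 0; // width consumed on the current line
  for (const w of p.words) {
    const advance = x === 0 ? w : x + p.spaceWidth + w;
    if (advance > maxWidth && x > 0) {
      lines++; // word doesn't fit: wrap to a fresh line
      x = w;
    } else {
      x = advance;
    }
  }
  return { lines, height: lines * lineHeight };
}
```

Once `prepare` has run, every resize only pays the arithmetic loop in `layout`, which is why the per-call cost can stay in microseconds.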
Native search only sees the DOM. Once you virtualize a list or grid, offscreen rows do not exist as nodes, so Ctrl-F cannot match them unless you keep enough hidden text around to erase most of the win from virtualization.
A browser API could help, but it would need to hook into selection, focus, scroll position, and match navigation across content the page has not rendered yet. That is a much bigger contract than 'search this string', and I would not bet on sites using it consistently.
This is awesome! I had this problem when building a datagrid where cells would dynamically render textareas. IIRC I ended up doing a simple canvas measurement, but I had all the text and font properties static, and even then it was hellish to get right.
Love this. I especially liked the shape-based reflow example.
This is something I've been thinking about for ages and would love to add to Ensō (enso.sonnet.io), purely because it would allow me to apply better caret transitions between the lines of text.
(I'm not gonna do that because I'm trying to keep it simple, but it's a strong temptation)
Now a CSS tangent: regarding the accordion example from the site (https://chenglou.me/pretext/accordion), this can be solved with pure CSS (and then perhaps a JS fallback) using the `interpolate-size` property.
> Regarding the text bubbles problem [...], you can use `text-wrap: balance | pretty` to achieve the same result.
No, neither solves the problem. And even if `balance` did work, it's not a good substitute, because you don't usually want all your lines to be the same length.
Quick overview of pretext: if you want to lay out text on the web yourself, you have to use the canvas measureText API and implement line breaking / segmentation / RTL on your own.
Pretext makes this easier. Just pass the text and text properties (font, color, size, etc.) into a pure JS API and it lays out the content into the given viewport dimensions.
Previously you'd have to either use measureText directly or somehow ship HarfBuzz to the browser. I guess pretext is not a technical breakthrough, just the right things assembled to make layout available as a pure JS API.
I have one question though: how is this different from Skia-wasm / CanvasKit? Skia already has a sophisticated API for laying out multiline text, and it is also a pure algorithmic API.
Skia brings in the world. You’re not wrong, and I understand the subtleties in the Q you asked (i.e. we’re talking Flutter; peer comment saying Skia-wasm is wasm comes across as pedantry to us because wasm vs JS is a compile-time option).
When we’re using Flutter, we’re asking for there to be a device-agnostic render-the-world API, i.e. Skia / Impeller.
Here, someone took the time to code with AI a pure Typescript version of glyph rendering.
For us, the difference would sort of be like the difference between having ffmpeg in Dart, and abstractly, having an ffmpeg C library.
It’s been technically possible for years to have a WASM/FFI version with a Dart API, but it hasn’t happened because it’s a lot to take on yourself, and “real companies” would just use a server, because once you’re charging for it, people expect things like backup, download links, their computer not to need to be awake for minutes to complete a transcode, etc
Neatly completing the analogy: now, you or I takes on the grunt work of getting this hammered out and tested via AI over the next two weeks, and sticks it on GitHub. It’s not necessarily the language choice or tool itself that’s fascinating, but it legitimately breaks new ground in client-side media FOSS just to have it possible at all
The problem it solves is efficiently calculating the height of some wrapped text on a web page, without actually rendering that text to the page first (very expensive).
It does that by pre-calculating the width/height of individual segments - think words - and caching those. Then it implements the full algorithm for how browsers construct text strings by line-wrapping those segments using custom code.
This is absurdly hard because of the many different types of wrapping and characters (hyphenation, emoji, Chinese, etc) that need to be taken into account - plus the fact that different browsers (in particular Safari) have slight differences in their rendering algorithms.
I struggled so much to measure text and count lines when creating dynamic subtitles for Remotion videos; not sure if it was my incompetence or the complexity of the DOM itself. I feel hopeful this will make it much easier :-)
Agreed! Text layout engines are stupidly hard. You start out thinking "It's a hard task, but I can do it" and then 3 months later you find yourself in a corner screaming "Why, Chinese? Why do you need to rotate your punctuation differently when you render in columns??"
> the wheel template can have some letter parts like the top of L or d extend beyond the wheel
Yeah - I use the template (in that case, a circle) to calculate line lengths, then I run 2d text along the 1d lines. Even if I tried to keep all of the glyphs inside the wheel I'd fail - because some fonts lie about how tall they are. Fonts are, basically, criminals.
> The problem it solves is efficiently calculating the height of some wrapped text on a web page, without actually rendering that text to the page first (very expensive).
But in the end, in a browser, the actual text rendering is still done by the browser?
It's a library that allows you to "do stuff" before the browser renders the actual text, while still having the browser eventually render the actual text?
Or is this thing actually doing the final rendering of the text too?
Yes the browser still renders the text at the end - but you can now do fancy calculations in advance to decide where you're going to ask the browser to draw it.
for ASCII text, mine finishes in 80ms, while pretext takes 2200ms. i haven't yet checked pretext for accuracy (how closely it matches the browser), but will test tonight - i expect it will do well.
let's see how close pretext can get to 80ms (or better) without adopting the same tricks.
Looks like uWrap only handles Latin characters and doesn't deal with things like soft hyphens or emoji correction. Plus, uWrap only handles `white-space: pre-line`, while Pretext doesn't handle `pre-line` but does handle both `normal` and `pre-wrap`.
correct, it was meant for estimating row height for virtualizing a 100k row table with a latin-ish LTR charset (no emoji handling, etc). its scope is much narrower. still, the difference in perf is significant, which i have found to be true in general of AI-generated greenfield code.
I've worked with text and in my experience all of these things (soft hyphens, emoji correction, non-latin languages, etc) are not exceptions you can easily incorporate later, but rather the rules that end up foundational parts of the codebase.
That is to say, I wouldn't be so quick to call a library that only handles Latin characters comparable to one that handles all this breadth of things, and I also wouldn't be so quick to attribute the performance delta to it being greenfield AI-generated code.
no disagreement. i never claimed uWrap did anything more than it does. it was not meant for typography/text-layout but for line count estimation. it never needed to be perfect, and i did not claim equivalence to pretext. however, for the use case of virtualization of data tables -- a use case pretext is also targeted at -- and in the common case of latin alphabets, right now pretext is significantly slower. i hope that it can become faster despite its much more thorough support for "rest of the owl".
prepare uses measureText, so if it is in a for loop, it won't be fast. This library is meant to do prepare once and then layout many times. layout calls should be sub-1ms.
it is not clear from the API/docs how i would use prepare() once on one text and then use layout() for completely different text.
i think the intended purpose is that your text is maybe large but static and your layout just changes quickly. this is not the case for figuring out the height of 100k rows of different texts in a table, for example.
I think the way to use pretext for that is to join each row with a hard line break, then do prepare once and walk each line. At least that will put the single-layout performance in the best light.
I am skeptical that getting the row height of many items only once is the intended behavior, though. The intended behavior is probably to get the row height of many items once and then be able to resize the width many times later (which is pretty useful on desktop).
There's a handful of perf related PRs open already so maybe it will be faster soon. I'm sure with enough focus on it we could have a hyper optimized version in a few hours.
i don't have a mac to test this with currently, so hopefully it's not the price but a matter of adding a Safari-specific adjustment :)
internally it still uses the Canvas measureText() API, so there's nothing fundamentally that should differ unless Safari has broken measureText, which tbh, would not be out of character for that browser.
> It does that by pre-calculating the width/height of individual segments - think words - and caching those.
From the description, it doesn’t calculate it, but instead renders the segments in canvas and measures them. That’s still relatively slow compared to what native rendered-text-width APIs will do, and you have to hope that the browser’s rendering will use the identical logic in non-canvas contexts.
I recently battled this and reverted to using DOM measurements. In my case the measurement would be off by around a pixel, which caused layout issues when I tried rendering the text in the DOM. This only happened on some Linux and Android setups.
I mentioned this elsewhere, but it's not actually doing any of what you describe. It's rendering the text to a canvas and then measuring that. I don't see any benchmarks that indicate it's faster than just sticking it in a <p> tag, and not any clear indication that it would be. It's certainly not implementing the full algorithm for text rendering in a browser.
It certainly seems to provide an API for analysing text layouts, but all of the computation still goes through the browser's native layout system.
This is incredibly impressive; many of these things have been missing forever! I remember the first time I couldn't figure out how to do a proper responsive accordion: it was with Bootstrap 1, released in 2011!! Today it's still not properly solved (until now?).
Many of these things belong in CSS, not in JS, but this has been the pattern with so many things on the web:
1) web needs evolve into more complex needs
2) hacky js/css implementation and workarounds
3) gets implemented as css standard
This is a not-so-hacky step 2. Really impressive.
I would have thunk that if this was actually possible, someone would have done it already; apparently not. At some point I really want to understand the real insight in the library. Their https://github.com/chenglou/pretext/blob/main/RESEARCH.md is interesting: they seem to have just done the hard work on browser discrepancies, down to the last detail of what an emoji measures in each browser. I hope this is not a maintenance nightmare.
All in all this will push the web forward no doubt.
Responsive accordions are actually solved using CSS nowadays, but plenty of other things aren't, and the web has definitely needed an API or library like this for a long, long time. So it's great that we now have it.
Building something like this was certainly possible before, but it was a lot of effort. What's changed is simple: AI. It seems clear this library was mostly built in Cursor using an agent. That's not a criticism, it's a perfect use of AI to build something that we couldn't before.
> it's a perfect use of AI to build something that we couldn't before.
There's no reason why it couldn't have been built before. This is something that probably should exist as standard functionality, like what the Canvas API already includes. It's pretty basic functionality that every text renderer would include already at a lower level.
This should be standard functionality offered by browsers. How do you make feature requests to W3C, and do they allow the community to vote on feature ideas?
I've had to approximate text size without rendering it a few times, and it's always been awkward. I'm glad there's something to reach for now (just hoping I remember that this exists next time I need it).
> This was achieved through showing Claude Code and Codex the browsers ground truth, and have them measure & iterate against those at every significant container width, running over weeks
Gosh, I wish this had existed a year ago; I spent an absurd amount of time creating a system for print brochure typesetting in HTML, that would iteratively try to find viable break points (keeping in mind that bullets etc. could exist at any time) that would ensure non-orphaned new lines, etc., all by using the Selection API and repeatedly finding bounding boxes of prospective renders.
It works, and still runs quite successfully in production, but there are still off-by-one hacks where I have no idea why they work. The iterative line generation feature here is huge.
The most practical use case is the text bubble wrapping one. That’s always frustrating when you want to wrap text inside any box with a border or background color (like a button or a “badge” component).
With all the multi-billion-dollar efforts invested in browsers, what explains that such basics as text layout are neglected, requiring libraries like this one?
I get the feeling this is an AI hallucination. It uses the canvas to render the text to be measured, which doesn't bypass the browser's layout. The only potential performance gain here is rendering straight to a canvas instead of building a DOM node, but it's not clear that's actually faster. I can't imagine the cost of a single <p> is that large, and I'm not certain that it's slower than whatever steps the canvas API uses to turn text into pixels.
What you're missing is that each segment (typically a word) only needs to be measured once, in the setup phase. The canvas gets thrown away after that, and subsequent layout passes all reuse the cached measurements.
If you only perform layout once, it doesn't save any work. If you need to reflow many times, it saves a lot.
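The amortization argument above is easy to put in rough numbers. These costs are illustrative placeholders, not measurements (only the ~0.0002ms layout figure comes from the summary at the top of the thread; the others are assumptions):

```typescript
// Cost model: pay for measurement once, then every reflow is nearly free.
const PREPARE_MS = 2.0;   // one-time canvas measurement of all words (assumed)
const LAYOUT_MS = 0.0002; // per arithmetic layout pass (figure from the summary)
const DOM_MS = 0.06;      // per DOM measurement, incl. reflow share (assumed)

// Total cost of laying out one text across n reflows, per strategy.
const cachedCost = (reflows: number) => PREPARE_MS + reflows * LAYOUT_MS;
const domCost = (reflows: number) => reflows * DOM_MS;
```

With these numbers the DOM approach wins for a single layout (0.06ms vs ~2ms), but the cached approach breaks even around 34 reflows and is roughly 27x cheaper by 1000, which is exactly the "reflow many times" case described above.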
rattray | 16 hours ago
https://github.com/chenglou/pretext/blob/main/src/layout.ts
mgaunard | 16 hours ago
simonw | 15 hours ago
gastonmorixe | 15 hours ago
Maybe for this we need a new web "Search" API instead of JS. Not sure it can be done otherwise without the browser's help.
incr_me | 6 hours ago
As far as I can tell, this went basically nowhere. Except this:
https://github.com/WICG/display-locking
https://developer.mozilla.org/en-US/docs/Web/API/Element/bef...
hrmtst93837 | 50 minutes ago
dalmo3 | 15 hours ago
rpastuszak | 15 hours ago
https://www.joshwcomeau.com/snippets/html/interpolate-size/
Regarding the text bubbles problem (https://chenglou.me/pretext/bubbles), you can use `text-wrap: balance | pretty` to achieve the same result.
(`balance` IIRC evens out the # of lines)
dnlzro | 8 hours ago
See also, related CSS Working Group issue: https://github.com/w3c/csswg-drafts/issues/191
rpastuszak | 50 minutes ago
tshaddox | 8 hours ago
lewisjoe | 15 hours ago
lewisjoe | 15 hours ago
madeofpalk | 15 hours ago
It’s not wasm?
refulgentis | 11 hours ago
simonw | 15 hours ago
It tests the resulting library against real browsers using a wide variety of long text documents, see https://github.com/chenglou/pretext/tree/main/corpora and https://github.com/chenglou/pretext/blob/main/pages/accuracy...
jimkleiber | 15 hours ago
rikroots | 15 hours ago
This effort feeds back to the DOM, making it far more useful than my efforts which are confined to rendering multiline text on a canvas - for example: https://scrawl-v8.rikweb.org.uk/demo/canvas-206.html
eviks | 6 hours ago
(by the way, in your cool demo the wheel template can have some letter parts like the top of L or d extend beyond the wheel)
rikroots | 3 hours ago
TacticalCoder | 12 hours ago
simonw | 11 hours ago
tadfisher | 10 hours ago
leeoniya | 10 hours ago
uWrap.js: https://news.ycombinator.com/item?id=43583478. it did not reach 11k stars overnight, tho :D
https://github.com/chenglou/pretext/issues/18
there are already significant perf improvement PRs open right now, including one done using autoresearch.
simonw | 10 hours ago
leeoniya | 10 hours ago
aylmao | 6 hours ago
leeoniya | an hour ago
liuliu | 10 hours ago
leeoniya | 10 hours ago
liuliu | 8 hours ago
leeoniya | 7 hours ago
contrahax | 8 hours ago
eviks | 6 hours ago
leeoniya | an hour ago
eviks | an hour ago
leeoniya | an hour ago
layer8 | 3 hours ago
spoiler | 3 hours ago
slopinthebag | 3 hours ago
Trufa | 15 hours ago
staminade | 14 hours ago
Rohansi | 9 hours ago
c-smile | 14 hours ago
That's why I've added Graphics.Text (https://docs.sciter.com/docs/Graphics/Text) in Sciter.
Graphics.Text is basically a detached <p> element that can be rendered on canvas with all CSS bells and whistles.
lateforwork | 14 hours ago
esprehn | 8 hours ago
Right now Chrome is much more focused on AI related APIs (sigh) and not stuff like FontMetrics.
[1] https://drafts.css-houdini.org/font-metrics-api-1/
siriusfeynman | 14 hours ago
gjvc | 14 hours ago
wasted generation.
Retr0id | 11 hours ago
Edit: example: https://files.catbox.moe/4w3um0.png
Retr0id | 11 hours ago
smusamashah | 10 hours ago
https://x.com/_chenglou/status/2037715226838343871?s=20
There was another comment about using Autoresearch probably for this but I might be misremembering
btown | 9 hours ago
esprehn | 8 hours ago
https://drafts.css-houdini.org/font-metrics-api-1/
RIP eae@.
tshaddox | 8 hours ago
eviks | 6 hours ago
richardw | 5 hours ago
slopinthebag | 3 hours ago
sethaurus | 53 minutes ago