It looks dirty, but yeah, it's 255. If this is unintuitive you can force the intuition by looking at the degenerate 2-bit case where the only integer values are 0, 1, 2, and 3, and then brute force the corresponding int->float conversion values. If you don't want extremely obviously wonky behavior like obviously-uneven, or black or white not being black or white, you end up with 0.0, 0.33..., 0.66..., and 1.0. And then the inversion of that involves multiplying by 3, not 4 (2^2).
The first part is correct, but "then the inversion of that involves multiplying by 3, not 4" does not follow from that.
The inversion requires quantization (rounding) which is the whole point here that breaks the symmetry.
Make a gradient of evenly distributed real numbers 0..=1 and try to quantize them to 0, 1, 2, 3.
You'll find that multiplying by 3 gives uneven results. ×3 and round() makes 1 and 2 over-represented. ×3 and floor or ceil collapses 0 or 3 into a singularity, making the gradient look like it's using only 3 out of 4 colors.
The /3 and ×3 logic seems fine if you just roundtrip exact numbers there and back, as they'll always round to themselves, but any values in between are greatly affected by the choice of rounding, which matters as soon as you start processing the data.
You only get even proportions of integers if you multiply by (4-ε) and round down (same as ×4, floor() and clamp()). It feels like a weird off-by-1 or off-by-ε error, but intuitively it's the solution that looks best.
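To make the bin-share argument concrete, here is a minimal sketch that quantizes an evenly spaced 2-bit gradient three ways and counts how many samples land on each level. The helper name `bin_counts` and the sample count are made up for illustration, and round-half-up is spelled as `floor(x + 0.5)` because Python's built-in `round()` rounds exact ties to even.

```python
import math
from collections import Counter

def bin_counts(quantize, n=12000):
    """Quantize n evenly spaced floats in [0, 1] and count each output level."""
    gradient = [i / (n - 1) for i in range(n)]
    return Counter(quantize(x) for x in gradient)

# x3 then round half up: the end levels only get half-width bins
round3 = bin_counts(lambda x: math.floor(x * 3 + 0.5))
# x3 then floor: only x == 1.0 exactly reaches level 3
floor3 = bin_counts(lambda x: math.floor(x * 3))
# x4, floor, then clamp the single x == 1.0 sample back to level 3
floor4 = bin_counts(lambda x: min(math.floor(x * 4), 3))

for name, counts in (
    ("round(x*3)", round3),
    ("floor(x*3)", floor3),
    ("min(floor(x*4), 3)", floor4),
):
    print(name, [counts[level] for level in range(4)])
```

With rounding, levels 1 and 2 each receive twice as many samples as 0 and 3; with plain floor, level 3 is hit by a single sample; only the ×4 variant splits the gradient into four equal shares.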
Just to make it clear I understand your proposed approach: y=k/255 when loading and k_new=trunc(y*255.999) when saving, where k and k_new are 8-bit integers. I apologize if this is a libimagequant detail that I have already asked about. It definitely doesn't sound intuitive to me. I'm intrigued though.
No; if you agree that the right scale factor for 2-bit int->float is 3, then you have to agree that the inverse also uses 3:
```python
import random
import statistics
import math

def f():
    # uniformly random 2-bit integers
    samps = [random.randint(0, 3) for _ in range(10000)]  # randint is inclusive for some reason
    print("mean", statistics.mean(samps), "\tstddev", statistics.stdev(samps))
    # int -> float
    samps = [float(x) * (1.0/3.0) for x in samps]  # always right: we want to map black to black and white to white
    print("mean", statistics.mean(samps), "\tstddev", statistics.stdev(samps))
    # add noise of up to +/-0.25, which can push values outside [0, 1]
    samps = [x + (random.random()*0.5 - 0.25) for x in samps]
    print("mean", statistics.mean(samps), "\tstddev", statistics.stdev(samps))
    # float -> int, scaling by 3 and rounding
    samps3 = [round(x * 3.0) for x in samps]
    print("mean", statistics.mean(samps3), "\tstddev", statistics.stdev(samps3))
    # float -> int, scaling by (4 - epsilon) and truncating
    samps4 = [math.trunc(x * 3.999) for x in samps]
    print("mean", statistics.mean(samps4), "\tstddev", statistics.stdev(samps4))

f()
```
then:
```
mean 1.4946 	stddev 1.1120676730536434
mean 0.4982 	stddev 0.3706892243512145
mean 0.4995226599747743 	stddev 0.39824401057102154
mean 1.4975 	stddev 1.2563246336669978
mean 1.6229 	stddev 1.4296502998978455
```
(EDIT2: This 1.6 comes from the difference between floor and trunc, which leads into the lower earlier edit.)
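A deterministic sketch of that floor-versus-trunc difference, assuming the same setup as the script above (2-bit levels, noise of up to ±0.25, scale factor 3.999) but with an evenly spaced noise grid instead of random samples. The helper name `mean_after` is made up for illustration.

```python
import math

def mean_after(quantize, steps=2000):
    """Mean quantized level over the 4 levels and an even grid of noise offsets."""
    # midpoints of `steps` cells covering (-0.25, 0.25); never exactly 0
    noise = [((j + 0.5) / steps) * 0.5 - 0.25 for j in range(steps)]
    values = [quantize((k / 3.0 + d) * 3.999) for k in range(4) for d in noise]
    return sum(values) / len(values)

trunc_mean = mean_after(math.trunc)  # negatives are pulled up to 0
floor_mean = mean_after(math.floor)  # negatives go down to -1

print("trunc:", trunc_mean)
print("floor:", floor_mean)
print(math.trunc(-0.2), math.floor(-0.2))
```

`trunc` sends everything in (-1, 1) to 0, so level 0 effectively gets a double-width bin and the mean shifts upward; `floor` keeps every bin the same width and the mean stays close to 1.5.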
There's a related problem in image resampling about corner vs center aligned interpolation, but the constraints are the opposite here.
EDIT:
It may help to think of the clipped-off portions of the white and black integer bins as not "missing" but rather inaccessible because of the format of the original data. Looking at them on a distribution from only 0.0 to 1.0 is missing the fuller picture. There are lots of times where negative and overflowing-positive pixel energy is valid and it needs to be scaled using the same rules as in-gamut energy does without introducing new inconsistencies, and also while retaining the constraint that white is white and black is black. And you don't want -0.000000001 rounding to -0.25 or something.
Put simpler: on the scale where 0 is black and 255 is white, 1024 is a real color. If you convert it to float by dividing by 255 (which simply must be correct, otherwise white isn't white), then if you convert it back to an integer by multiplying by 256, you just increased its energy, even if you quantize (you get something like 1028). The original article openly assumes a strictly clipped range, but you don't want your conversions to be different for non-clipped ranges.
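The arithmetic behind both examples, as a small sketch. The variable names are made up, and the tiny-negative case uses the 2-bit scale factors from earlier in the thread.

```python
import math

hdr_value = 1024              # brighter than white on the 0..255 scale
as_float = hdr_value / 255    # about 4.0157; white itself would be exactly 1.0

back_with_255 = round(as_float * 255)       # same factor both ways
back_with_256 = math.floor(as_float * 256)  # x256 on the way back

tiny_negative = -0.000000001
neg_round3 = round(tiny_negative * 3)       # stays at level 0
neg_floor4 = math.floor(tiny_negative * 4)  # drops to level -1, i.e. -0.25 after /4

print(back_with_255, back_with_256, neg_round3, neg_floor4)
```

Using 255 in both directions returns 1024, while coming back through ×256 yields 1028, and the ×4-and-floor scheme sends a value just below zero a full level down.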
I was so confused by the title (maybe intentionally?); this seems to be “does 0..1 map to [0..255.0] or [0.5..255.5]?”
To me the answer has always been “obviously” [0.0..255.0] but maybe that’s not “obvious”?
The article points out that “extreme” bins have half the capacity of the others. I don’t think this is the correct framing, either:
values outside of [0..1] don’t exist, in which case the “appearance” of a narrower band is a rendering artifact: the bin is rendered narrower because that rendering was made with the knowledge that nothing out of bounds can exist, and so the bucket was clipped
or values outside of [0..1] exist, in which case they’re infinite
The article acknowledges the latter point but not the former.
To me, once you acknowledge the first point the correct behaviour is obvious, but the fact that this comes out of a post like this is clearly not objectively “obvious” :D
I wonder if there would be use in the opposite of what he was trying to achieve with 256, which would map 0.0 to 0, 1.0 to 255, and all other floats to values from 1 to 254.
```c
uint8_t output = 0.0f >= result
    ? 0
    : 1.0f <= result
        ? 255
        : 1 + 253*result;
```
Hopefully black stays black and white stays white during processing.
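A Python transliteration of the snippet above, as a rough sketch for checking which levels it can actually produce. The function name and the `scale` parameter are made up for illustration; `int()` truncates toward zero, like the float-to-`uint8_t` conversion in C.

```python
def to_u8_reserved_ends(result, scale=253):
    """Map <= 0.0 to 0, >= 1.0 to 255, and everything in between to 1 + scale*result."""
    if result <= 0.0:
        return 0
    if result >= 1.0:
        return 255
    return int(1 + scale * result)  # truncates toward zero

# every strictly in-between value on a fine grid
inner = [i / 100000 for i in range(1, 100000)]
levels_253 = {to_u8_reserved_ends(x, 253) for x in inner}
levels_254 = {to_u8_reserved_ends(x, 254) for x in inner}

print(min(levels_253), max(levels_253))
print(min(levels_254), max(levels_254))
```

With a multiplier of 253 and truncation, the in-between branch tops out at 253, so level 254 is never produced; a multiplier of 254 would be one way to cover 1 to 254 as described.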
wareya | a day ago
It looks dirty, but yeah, it's 255. If this is unintuitive you can force the intuition by looking at the degenerate 2-bit case where the only integer values are 0, 1, 2, and 3, and then brute force the corresponding int->float conversion values. If you don't want extremely obviously wonky behavior like obviously-uneven, or black or white not being black or white, you end up with 0.0, 0.33..., 0.66..., and 1.0. And then the inversion of that involves multiplying by 3, not 4 (2^2).
kornel | 15 hours ago
The first part is correct, but "then the inversion of that involves multiplying by 3, not 4" does not follow from that.
The inversion requires quantization (rounding) which is the whole point here that breaks the symmetry.
Make a gradient of evenly distributed real numbers 0..=1 and try to quantize them to 0, 1, 2, 3.
You'll find that multiplying by 3 gives uneven results. ×3 and round() makes 1 and 2 over-represented. ×3 and floor or ceil collapses 0 or 3 into a singularity, making the gradient look like it's using only 3 out of 4 colors.
The /3 and ×3 logic seems fine if you just roundtrip exact numbers there and back, as they'll always round to themselves, but any values in between are greatly affected by the choice of rounding, which matters as soon as you start processing the data.
You only get even proportions of integers if you multiply by (4-ε) and round down (same as ×4, floor() and clamp()). It feels like a weird off-by-1 or off-by-ε error, but intuitively it's the solution that looks best.
pekkavaa | 5 hours ago
Just to make it clear I understand your proposed approach: y=k/255 when loading and k_new=trunc(y*255.999) when saving, where k and k_new are 8-bit integers. I apologize if this is a libimagequant detail that I have already asked about. It definitely doesn't sound intuitive to me. I'm intrigued though.
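For reference, a small sketch of the scheme as phrased in this comment, next to the plain ×255-and-round inverse. The function names `load`, `save_trunc` and `save_round` are made up for illustration and are not taken from libimagequant.

```python
import math

def load(k):
    """8-bit integer -> float in [0, 1]."""
    return k / 255

def save_trunc(y):
    """Scale by just under 256 and truncate."""
    return math.trunc(y * 255.999)

def save_round(y):
    """Scale by 255 and round to nearest."""
    return round(y * 255)

# exact 8-bit values survive the round trip under both schemes
trunc_ok = all(save_trunc(load(k)) == k for k in range(256))
round_ok = all(save_round(load(k)) == k for k in range(256))

# an in-between value is where the two schemes disagree
print(trunc_ok, round_ok, save_trunc(0.25), save_round(0.25))
```

Both schemes return every original integer unchanged, so the choice only shows up for floats that are not exactly k/255, such as 0.25 here.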
wareya | 2 hours ago
No; if you agree that the right scale factor for 2-bit int->float is 3, then you have to agree that the inverse also uses 3:
then:
(EDIT2: This 1.6 comes from the difference between floor and trunc, which leads into the lower earlier edit.)
There's a related problem in image resampling about corner vs center aligned interpolation, but the constraints are the opposite here.
EDIT:
It may help to think of the clipped-off portions of the white and black integer bins as not "missing" but rather inaccessible because of the format of the original data. Looking at them on a distribution from only 0.0 to 1.0 is missing the fuller picture. There are lots of times where negative and overflowing-positive pixel energy is valid and it needs to be scaled using the same rules as in-gamut energy does without introducing new inconsistencies, and also while retaining the constraint that white is white and black is black. And you don't want -0.000000001 rounding to -0.25 or something.
Put simpler: on the scale where 0 is black and 255 is white, 1024 is a real color. If you convert it to float by dividing by 255 (which simply must be correct, otherwise white isn't white), then if you convert it back to an integer by multiplying by 256, you just increased its energy, even if you quantize (you get something like 1028). The original article openly assumes a strictly clipped range, but you don't want your conversions to be different for non-clipped ranges.
olliej | 19 hours ago
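I was so confused by the title (maybe intentionally?); this seems to be “does 0..1 map to [0..255.0] or [0.5..255.5]?”
To me the answer has always been “obviously” [0.0..255.0] but maybe that’s not “obvious”?
The article points out that “extreme” bins have half the capacity of the others. I don’t think this is the correct framing, either:
values outside of [0..1] don’t exist, in which case the “appearance” of a narrower band is a rendering artifact: the bin is rendered narrower because that rendering was made with the knowledge that nothing out of bounds can exist, and so the bucket was clipped
or values outside of [0..1] exist, in which case they’re infinite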
The article acknowledges the latter point but not the former.
To me, once you acknowledge the first point the correct behaviour is obvious, but the fact that this comes out of a post like this is clearly not objectively “obvious” :D
kornel | 15 hours ago
If it's obviously 0…255.0, what range of float values maps back to integer 0 and which float values map to integer 255?
If you say 0..<1 maps to integer 0, and 254>..255.0 to integer value 255, then they'll eat 128. You'll probably want 127.5..128.5 to map to 128. But where do these halves go!?
If you shift everything a bit to fit the 128, you end up with 0..0.99609375 mapping to int 0.
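The 0.99609375 figure is 255/256: the width of each bin when the 0..255 float range is split into 256 equal parts. A minimal sketch of that layout, with made-up helper names:

```python
levels = 256
top = 255.0
bin_width = top / levels  # 255/256

def bin_range(k):
    """Half-open float range [low, high) on the 0..255 scale that maps to integer k."""
    return (k * bin_width, (k + 1) * bin_width)

def to_int(v):
    """Equal-width quantization; v == 255.0 is clamped into the last bin."""
    return min(int(v / bin_width), levels - 1)

print(bin_width)
print(bin_range(0))
print(bin_range(128))
print(to_int(0.99), to_int(127.5), to_int(255.0))
```

In this layout integer 0 covers 0 up to 0.99609375, and integer 128 starts at 127.5 but ends slightly before 128.5.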
olliej | 9 hours ago
This is the entire point of the article, which also comes to the conclusion of 0..255 being the best option.
mitsuhiko | 11 hours ago
The standard approach also just comes naturally when people call round() on it. Since that feels quite natural to people, I'm assuming it became the standard simply because of how simple it is.
jyounker | 4 hours ago
From an algebraic standpoint, the answer is clearly f(x) -> [0, 255].
If you don't have f(n * 0) == n * f(0), then all sorts of weird stuff happens, like:
For f(x) -> [0, 255] then f(0) + f(0) + f(0) = 0 + 0 + 0 = 0 = f(0)
For f(x) -> [0.5/8,7.5/8] then f(0) + f(0) + f(0) = 0.5/8 + 0.5/8 + 0.5/8 = 1.5/8 != f(0)
Choosing f(x) -> [0, 255] means that if you do a calculation on the x side and you do the same calculation on the right side, then you'll get the same result when you convert from one to the other.
Choosing f(x) -> [0.5/8,7.5/8] breaks the algebraic correspondence.
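A small exact-arithmetic sketch of this point, assuming a 3-bit example (levels 0 to 7) to match the 0.5/8 to 7.5/8 range above. The function names `endpoint` and `centered` are made up for illustration.

```python
from fractions import Fraction

def endpoint(k):
    """Endpoint mapping: level 0 -> 0 and level 7 -> 1 (the analogue of k/255)."""
    return Fraction(k, 7)

def centered(k):
    """Bin-center mapping: level k -> (k + 0.5)/8, so 0 -> 0.5/8 and 7 -> 7.5/8."""
    return Fraction(2 * k + 1, 16)

# adding in integer space vs. adding in float space
print(endpoint(1) + endpoint(2) == endpoint(3))
print(centered(1) + centered(2) == centered(3))
print(3 * centered(0), centered(0))
```

The endpoint mapping is a pure scaling, so sums commute with the conversion; the centered mapping carries a constant offset of 0.5/8 that accumulates with every addition.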
pyj | 3 hours ago
sRGB is a non-linear approximation of a displayable subset of the larger CIE XYZ colorspace. If you "uniformly sample" [0,1] and convert to [0,255], you haven't really uniformly sampled anyways with regards to perceptivity.
The spacing of the /256 samples looks nicer, but I prefer /255 as the 0 and 1 boundaries are special since it's an artificial boundary case.
pekkavaa | an hour ago
How would you generate perceptually uniform samples if not in sRGB gamma space? Is the L* cuberoot-plus-offset nonlinearity really more accurate? Note that I'm talking about shades of gray like (10,10,10) here.
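For anyone who wants to compare the two curves numerically, here is a rough sketch using the standard sRGB decoding function and the CIE L* formula. It does not settle which is more accurate perceptually; it only shows where a gray like (10,10,10) lands under each. The function names are made up for illustration.

```python
def srgb_to_linear(c):
    """Decode an sRGB-encoded value in [0, 1] to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4

def cie_lightness(y):
    """CIE L* (0..100) from relative luminance y in [0, 1]."""
    delta = 6 / 29
    if y > delta ** 3:
        f = y ** (1 / 3)                      # cube-root segment
    else:
        f = y / (3 * delta ** 2) + 4 / 29     # linear segment near black
    return 116 * f - 16

gray = 10 / 255
linear = srgb_to_linear(gray)
print("sRGB code value as a percentage of white:", 100 * gray)
print("linear luminance:", linear)
print("L*:", cie_lightness(linear))
```

For this gray both curves are in their linear segments: the sRGB code value sits at roughly 3.9% of the way to white, while L* comes out at roughly 2.7 on its 0 to 100 scale.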
an_origamian | a day ago
I wonder if there would be use in the opposite of what he was trying to achieve with 256, which would map 0.0 to 0, 1.0 to 255, and all other floats to values from 1 to 254.
Hopefully black stays black and white stays white during processing.
kornel | 15 hours ago
This makes 0 and 255 get a bigger share of the unit range than other numbers (by about 0.8%, i.e. 255/253).
Jan200101 | 9 hours ago
the first image appears to be broken for me
pekkavaa | 5 hours ago
Hello, author of the article here. Do you mean the image is corrupted (I did compress it with pngcrush) or do you find it somehow incorrect?
Jan200101 | 4 hours ago
the image right under the "The case against 255.0" header isn't working at all for me but the rest are, trying with curl gives me an infinite 301 redirect from cloudflare?
pekkavaa | 4 hours ago
Must be something strange going on with Cloudflare's CDN. Anyway, I uploaded a copy here and also pushed a page with a different file name to bust their cache. Thanks for the report.