You can go even further with this in other languages, with things like dependent typing, which can assert (among other interesting properties) that something like
get_elem_at_index(array, index)
can never be called with an index outside the bounds of the array, checked statically at compile time. And this is the key: it works without knowing a priori what the length of the array is.
"In Idris, a length-indexed vector is Vect n a (length n is in the type), and a valid index into length n is Fin n ('a natural number strictly less than n')."
Similar tricks work with division that might result in inf/-inf, to prevent it from typechecking, and there are more subtle implications in e.g. higher-order types and functions.
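As a rough Lean 4 rendering of the same idea (my sketch, with Lean's Array and Fin standing in for Idris's Vect): the index type itself carries the bounds proof, so the lookup is total and needs no runtime check.

```lean
-- i : Fin v.size can only be constructed together with a proof
-- i.val < v.size, so an out-of-bounds call is unrepresentable.
def getElemAtIndex (v : Array α) (i : Fin v.size) : α :=
  v[i.val]'i.isLt  -- the ' syntax supplies the bounds proof explicitly
```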
How does that work? If the length of the array is read from stdin for example, it would be impossible to know it at compile time. Presumably this is limited somehow?
If the length is read from outside the program, it's an IO operation, not a static value, but there are generally runtime checks in addition to the type system. Usually you solve this as in the article, with a constructor that checks the value, so you'd get something like "Invalid option: length = 5 must be within 0-4" when you tried to create the Fin n from the passed-in value.
It doesn’t have to be a compile-time constant. An alternative is to prove that when you are calling the function the index is always less than the size of the vector (a dynamic constraint). You may be able to assert this by having a separate function on the vector that returns a constrained value (e.g. n < v.len()).
One option is dependent pairs, where one value of the pair (in this example) would be the length of the array, and the other value has a type which depends on that first value (such as Vector n T instead of List T).
Type-Driven Development with Idris[1] is a great introduction for dependently typed languages and covers methods such as these if you're interested (and Edwin Brady is a great teacher).
[1] https://www.manning.com/books/type-driven-development-with-i...
If you check that the value is inside the range, and execute some different code if it's not, then congratulations: in the branch where the check passed, you now know at compile time that the number you read from stdin is in the right range.
Not sure about Idris, but in Lean `Fin n` is a struct that contains a value `i` and a proof that `i < n`. You can read in the value `n` from stdin and then you can do `if h : i < n` to have a compile-time proof `h` that you can use to construct a `Fin n` instance.
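Concretely, a minimal Lean 4 sketch of that construction (my example; the function name is made up):

```lean
-- n and i can both arrive at runtime (e.g. parsed from stdin); the
-- dependent `if h : i < n` yields the proof term on the true branch,
-- which is exactly what the Fin n constructor needs.
def parseIndex (n i : Nat) : Option (Fin n) :=
  if h : i < n then
    some ⟨i, h⟩  -- a Fin n is the value i bundled with the proof h : i < n
  else
    none
```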
Dividing a float by zero is usually perfectly valid. It has predictable outputs, and for some algorithms like collision detection this property is used to remove branches.
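For reference, the IEEE 754 behaviour this relies on, in Rust (the branch-removal trick shows up in e.g. ray/AABB slab tests, where 1.0 / direction is allowed to be ±inf):

```rust
fn main() {
    let pos = 1.0_f32 / 0.0;  // +inf: defined, predictable, no panic
    let neg = -1.0_f32 / 0.0; // -inf
    let nan = 0.0_f32 / 0.0;  // NaN is the one genuinely odd case
    assert!(pos.is_infinite() && neg.is_infinite() && nan.is_nan());
    // min/max order infinities sensibly, which is what lets slab-style
    // intersection code divide by a possibly-zero component and still
    // compare entry/exit distances without branching on zero.
    assert_eq!(pos.min(3.0), 3.0);
    assert_eq!(neg.max(3.0), 3.0);
}
```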
I think “has predictable outputs” is less valuable than “has expected outputs” for most workloads. Dividing by zero almost always reflects an unintended state, so proceeding with the operation means compounding the error state.
(This isn’t to say it’s always wrong, but that having it be an error state by default seems very reasonable to me.)
This reminds me a bit of a recent publication by Stroustrup about using concepts... in C++ to validate integer conversions automatically where necessary.
{
Number<unsigned int> ii = 0;
Number<char> cc = '0';
ii = 2; // OK
ii = -2; // throws
cc = i; // OK if i is within cc’s range
cc = -17; // OK if char is signed; otherwise throws
cc = 1234; // throws if a char is 8 bits
}
https://www.stroustrup.com/Concept-based-GP.pdf
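A rough Rust analogue of the same idea, using the standard library's checked integer conversions (my sketch, not from Stroustrup's paper):

```rust
fn main() {
    // TryFrom validates the conversion once, at the boundary,
    // instead of silently truncating or wrapping.
    assert_eq!(u32::try_from(2i64).unwrap(), 2);    // OK
    assert!(u32::try_from(-2i64).is_err());         // negative: rejected
    assert!(i8::try_from(1234i64).is_err());        // doesn't fit in 8 bits
    assert_eq!(i8::try_from(-17i64).unwrap(), -17); // OK, i8 is signed
}
```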
Like how Clojure basically uses maps everywhere and the whole standard library allows you to manipulate them in various ways.
The main problem with the many-types approach is ending up with several similar types, all incompatible.
I don't really get why this is getting flagged; I've found this to be true, but it's more of a trade-off than a pure benefit. It also is sort of beside the point: you always need to parse inputs from external, usually untrusted, sources.
Agree with this. Mismatched types are generally an indicator of an underlying issue with the code, not the language itself. These are areas where AI can be helpful in flagging potential problems.
Yeah, there's something of a tension between the Perlis quote "It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures" and Parse, don't validate.
The way I've thought about it, though, is that it's possible to design a program well either by encoding your important invariants in your types or in your functions (especially simple functions). In dynamically typed languages like Clojure, my experience is that there's a set of design practices that have a lot of the same effects as "Parse, Don't Validate" without statically enforced types. And, ultimately, it's a question of mindset which style you prefer.
The real world often changes though, and more often than not the code has to adapt, regardless of how elegantly our systems are designed.
Note that the division-by-zero example used in this article is not the best example to demonstrate "Parse, Don't Validate," because it relies on encapsulation. The principle of "Parse, Don't Validate" is best embodied by functions that transform untrusted data into some data type which is correct by construction.
Alexis King, the author of the original "Parse, Don't Validate" article, also published a follow-up, "Names are not type safety" [0] clarifying that the "newtype" pattern (such as hiding a nonzero integer in a wrapper type) provides weaker guarantees than correctness by construction. Her original "Parse, Don't Validate" article also includes the following caveat:
> Use abstract datatypes to make validators “look like” parsers. Sometimes, making an illegal state truly unrepresentable is just plain impractical given the tools Haskell provides, such as ensuring an integer is in a particular range. In that case, use an abstract newtype with a smart constructor to “fake” a parser from a validator.
So, an abstract data type that protects its inner data is really a "validator" that tries to resemble a "parser" in cases where the type system itself cannot encode the invariant.
The article's second example, the non-empty vec, is a better example, because it encodes within the type system the invariant that one element must exist. The crux of Alexis King's article is that programs should be structured so that functions return data types designed to be correct by construction, akin to a parser transforming less-structured data into more-structured data.
[0] https://lexi-lambda.github.io/blog/2020/11/01/names-are-not-...
Even the newtype-based "parse, don't validate" is tremendously useful in practice, though. The big thing is that if you have a bare string, you don't know "where it's been". It doesn't carry with it information whether it's already been validated. Even if a newtype can't provide you full correctness by construction, it's vastly easier to be convinced of the validity of an encapsulated value compared to a naked one.
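As a concrete Rust sketch of that (the Username type and its rules are hypothetical): the field is private, so the only way to obtain the wrapper is through the validating constructor, and downstream code can trust any value it holds.

```rust
mod username {
    // Private field: outside this module a Username can only be obtained
    // via parse(), so holding one proves validation already happened.
    pub struct Username(String);

    impl Username {
        pub fn parse(raw: &str) -> Result<Username, String> {
            if raw.is_empty() || raw.len() > 32 {
                Err(format!("invalid username: {raw:?}"))
            } else {
                Ok(Username(raw.to_string()))
            }
        }
        pub fn as_str(&self) -> &str { &self.0 }
    }
}

// This signature cannot be called with a naked, unvalidated string.
fn greet(user: &username::Username) {
    println!("hello, {}", user.as_str());
}

fn main() {
    let user = username::Username::parse("alice").expect("valid");
    greet(&user);
    assert!(username::Username::parse("").is_err());
}
```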
For full-on parse-don't-validate, you essentially need a dependent type system. As a more light-weight partial solution, Rust has been prototyping pattern types, which are types constrained by patterns. For instance a range-restricted integer type could be simply spelled `i8 is 0..100`. Such a feature would certainly make correctness-by-construction easier in many cases.
The non-empty list implemented as a (T, Vec<T>) is, btw, a nice example of the clash between practicality and theoretical purity. It can't offer you a slice (consecutive view) of its elements without storing the first element twice (which requires that T: Clone, unlike normal Vec<T>), which makes it fairly useless as a vector. It's okay if you consider it just an abstract list with a more restricted interface.
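For contrast, a sketch of the encapsulation-based alternative (type name mine): keep a plain Vec inside and enforce non-emptiness in a smart constructor. Contiguous storage and slicing are preserved, at the cost of the invariant living in the API rather than in the shape of the data.

```rust
pub struct NonEmptyVec<T>(Vec<T>); // private field: invariant guarded by the API

impl<T> NonEmptyVec<T> {
    pub fn new(v: Vec<T>) -> Option<Self> {
        if v.is_empty() { None } else { Some(NonEmptyVec(v)) }
    }
    pub fn first(&self) -> &T {
        &self.0[0] // cannot panic: the constructor rejected empty vecs
    }
    pub fn as_slice(&self) -> &[T] {
        &self.0 // the contiguous view that (T, Vec<T>) cannot provide
    }
}

fn main() {
    let v = NonEmptyVec::new(vec![1, 2, 3]).unwrap();
    assert_eq!(*v.first(), 1);
    assert_eq!(v.as_slice(), &[1, 2, 3]);
    assert!(NonEmptyVec::new(Vec::<i32>::new()).is_none());
}
```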
```
impl Add for NonZeroF32 { ... }
impl Add<f32> for NonZeroF32 { ... }
impl Add<NonZeroF32> for f32 { ... }
```
What type would it return though?
Would have to be F32, no?
I cannot think of any way to enforce "non-zero-ness" of the result without making it return an optional Result<NonZeroF32>, and at that point we are basically back to square one...
Generally yes. `NonZeroU32::saturating_add(self, other: u32)` is able to return `NonZeroU32` though! ( https://doc.rust-lang.org/std/num/type.NonZeroU32.html#metho... )
> I cannot think of any way to enforce "non-zero-ness" of the result without making it return an optional Result<NonZeroF32>, and at that point we are basically back to square one...
`NonZeroU32::checked_add(self, other: u32)` basically does this, although I'll note it returns an `Option` instead of a `Result` ( https://doc.rust-lang.org/std/num/type.NonZeroU32.html#metho... ), leaving you to `.map_err(...)` or otherwise handle the edge case to your heart's content. Niche, but occasionally what you want.
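A quick usage sketch of both std methods mentioned:

```rust
use std::num::NonZeroU32;

fn main() {
    let n = NonZeroU32::new(7).unwrap();
    // checked_add returns Option<NonZeroU32>: None on overflow, but the
    // sum can never be zero, since self > 0 and other >= 0.
    assert_eq!(n.checked_add(1).map(NonZeroU32::get), Some(8));
    assert_eq!(n.checked_add(u32::MAX), None); // would overflow
    // ok_or converts the Option into a Result for ?-style handling:
    assert!(n.checked_add(1).ok_or("overflowed u32").is_ok());
    // saturating_add stays in NonZeroU32 by clamping at u32::MAX.
    assert_eq!(n.saturating_add(u32::MAX).get(), u32::MAX);
}
```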
The examples in question propagate complexity throughout related code. I think this is a case I see frequently in Rust of using too many abstractions, with all their associated complexity.
I would just (as a default; the situation varies)... validate prior to the division and handle as appropriate.
The analogous situation I encounter frequently is indexing, e.g. checking if the index is out of bounds. Similar idea: check, print or display an error, then fail that computation without crashing the program. It's usually an indication of some bug, which can be tracked down. Or, if it's an array that's frequently indexed, use the (canonical for Rust's core types) `get` method on whatever struct owns the array. It returns an Option (sketch below).
I do think either the article's approach or validating is better than runtime crashes! There are many patterns in programming. Using types in this way is something I see a lot of in OSS Rust, but it is not my cup of tea. Not heinous in this case, but I think not worth it.
This is the key to this article's philosophy, near the bottom:
> I love creating more types. Five million types for everyone please.
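For reference, the Option-returning `get` mentioned earlier in this comment, on a plain Vec (standard API):

```rust
fn main() {
    let v = vec![10, 20, 30];
    // v[7] would panic; get returns an Option instead, so the failed
    // computation can be reported and skipped without crashing.
    match v.get(7) {
        Some(x) => println!("value: {x}"),
        None => eprintln!("index 7 out of bounds (len {}): likely a bug", v.len()),
    }
    assert_eq!(v.get(1), Some(&20));
}
```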
dang | 2 hours ago
also:
Parse, Don’t Validate – Some C Safety Tips - https://news.ycombinator.com/item?id=44507405 - July 2025 (73 comments)
Parse, Don't Validate (2019) - https://news.ycombinator.com/item?id=41031585 - July 2024 (102 comments)
Parse, don't validate (2019) - https://news.ycombinator.com/item?id=35053118 - March 2023 (219 comments)
Parse, Don't Validate (2019) - https://news.ycombinator.com/item?id=27639890 - June 2021 (270 comments)
Parsix: Parse Don't Validate - https://news.ycombinator.com/item?id=27166162 - May 2021 (107 comments)
Parse, Don’t Validate - https://news.ycombinator.com/item?id=21476261 - Nov 2019 (230 comments)
Parse, Don't Validate - https://news.ycombinator.com/item?id=21471753 - Nov 2019 (4 comments)
(p.s. these links are just to satisfy extra-curious readers - no criticism is intended! I add this because people sometimes assume otherwise)
esafak | an hour ago
satvikpendem | 33 minutes ago
Which refers to https://docs.rs/anodized/latest/anodized/