Language syntax is like the weather. When it's good (or when you're acclimated to it, I guess) you don't notice it. When the weather is perfect you don't even feel like the atmosphere exists. When a language is so ingrained in your mental models, you don't even notice syntax, you just see semantics.
A lot of programming is taste, and syntax gives you a very quick judgement about how good the language designer's taste is. How familiar they are with what we know about which syntax works well, and so on. For example if you're designing a language in 2026 that uses `type name` instead of `name: type`... that is highly suspicious.
Also syntax is the interface through which you interact with the language, so bad syntax is going to be something annoying you have to deal with constantly. Sure you'll be able to write good programs in a language with bad syntax choices, but it's going to be less fun.
> Odin’s rules, which are very similar to Python’s, are to ignore newline-based “semicolons” within brackets (( ) and [ ], and { } used as an expression or record block).
Honestly I always thought that was a bit crap in Python and I'm surprised anyone thought this was a sensible thing to copy. Really, just use semicolons. As soon as an "easy" rule becomes even vaguely difficult to remember it's better to bin it and just require explicitness, because overall that is easier.
As Ken Iverson noted in "Notation as a Tool of Thought"[1], yeah the syntax absolutely matters. The same program might resonate and make sense in one language but be incomprehensible if translated 1:1 in another.
Computer languages are for humans to understand and communicate.
1. https://www.eecg.utoronto.ca/~jzhu/csc326/readings/iverson.p...
Iverson's point is more regarding semantics than syntax, though. The only mention of syntax suggests it's better for it to be simple (presumably so that the semantics are closer to the surface). Every programming language is a notation for describing computation; notation is a catch-all for all three levels: orthography, syntax, and semantics. APL is interesting because it not only uses an unconventional syntax, but also an unconventional orthography (obligate usage of special symbols), and its semantics are different as well from most languages (array programming). Iverson's point is that APL as a notation is valuable for making the structure of certain computations obvious, and that this point generalizes across programming languages.
GingerBill's article is making a narrower claim: that what usually determines a good notation is semantics, not syntax.
Syntax is what keeps me away from Rust. I have tried many times to get into it over the years but I just don't want to look at the syntax. Even after learning all about it, I just can't get over it. I'm glad other people do fine with it but it's just not for me.
For this reason (coming from C++) I wished Swift were more popular because that syntax is much more familiar/friendly to me, while also having better memory safety and quality of life improvements that I like.
Do you have some examples of what you couldn't get along with? I know this is a lot to ask, but while I do write Rust and don't write C++ or Swift in volume (only small examples), the syntax just doesn't feel that different to me.
If you do like Swift you might want to just bite the bullet and embrace the Apple ecosystem. That would be my recommendation I think.
Strangely enough I find Lisp's parentheses much more attractive.
Swift's syntax may look nice, but as soon as you run into "The compiler is unable to type-check this expression in reasonable time; try breaking up the expression into distinct sub-expressions" you'll forget all of that. Hint: they are related.
It's particularly terrible in a SwiftUI context nowadays, but you can also make it choke on something as simple as a .map(...)
Wow, this is one of the most surprising comments I've ever read on HN!
Personally, I bucket C++ and Rust and Swift under "basically the same syntax." When I think about major syntax differences, I'm thinking about things like Python's significant indentation, Ruby's `do` and `end` instead of curly braces, Haskell's whitespace-based function calls, Lisp's paren placement, APL's symbols, etc.
Before today I would have assumed that anyone who was fine with C++ or Rust or Swift syntax would be fine with the other two, but TIL this point exists in the preference space!
There are a lot of important points of difference. Whether functions are introduced with an explicit keyword; how return types are marked; whether semicolons can be omitted at end of line; how types are named (and the overall system for describing algebraic types). Not to mention semantics around type inference (and whether it must be explicitly invoked with a `var` or `auto` etc.) and, again, algebraic types (what means of combination are possible?). And specifically with Rust you have the syntax required to make the borrow checker work. Oh, and then there are the implicit returns. Rust certainly didn't invent that (I seem to recall BASIC variants where you could assign to the current function name and then that value would be returned implicitly if control flow reached the end), but it reflects a major difference in philosophy.
... Which is really all to say: different people are focused on different details, at different levels.
I understand exactly how shallow that makes me sound, and I'm not about to try and defend myself.
No need to defend yourself, I share this sentiment as well. If I'm going to spend time writing and reading a lot of code in a language I'm learning, I want my previous knowledge to be somewhat reusable.
For this reason I was able to get into Odin as opposed to Zig, because of some similarities with Swift syntax as well as how easy it is to parse.
The less I need to rewire my brain to use xyz language, the greater the chance of me getting into it.
If my life depended on it, I could get over such a shallow reason to dismiss a language but fortunately it doesn't and that's why I write Swift rather than Rust.
When I was a kid learning BASIC, a lot of beginner examples in books used the (purely decorative) keyword LET for every assignment. Consequently I associate it with "coding like a baby who understands nothing" and still hate to write let to this day.
A language's syntax and its error messages are its user interface. Yes you can have a good tool that you don’t enjoy looking at. You can also have a good tool that’s frustrating to learn because its user interface isn’t clear and doesn’t do what you expect. Can I not hope for something that does what I need, is easy to use, and looks good?
I dislike the “you can change the syntax” argument because that just doesn’t happen. Closest thing is a new language that compiles to another.
I have really just one wish when it comes to syntax: no syntactically significant whitespace. Space, newline, tab, etc. should ALL map to the same exact token. In practice this also means semicolons or something like them are needed as well, to separate expressions/statements. I dislike langs that try to insert semicolons for you, but at least it's better than the alternative.
the way python treats whitespace is a huge design mistake that has probably wasted like a century (if not more) worth of time across all users, on something really trivial.
That's one of the things I like about C, the independence in how one can write code. I was able to develop my own style thanks to that, visualising the structure of the code to distinguish the different parts of statements and make it more clear (at least to myself).
(edited several times to try to correct changes in formatting for an example here, but it's just screwed up :-/ )
I agree, that I do not like automatic semicolon insertion (in my opinion it is one of the worst features of JavaScript, and possibly the really worst one), and I think it is a good idea that you should use semicolons or whatever to separate expressions and statements (except for a programming language where it is already unambiguous (e.g. because you are required to have brackets around them instead), in which case it is unnecessary).
However, spaces, line breaks, tabs, page breaks, etc are not normally tokens (and should not be tokens), but will separate tokens.
However, that is not the only issue with the syntax, although it is a significant one.
Semantics are where the rubber meets the road, certainly; but syntax determines how readable the code is for someone meeting it the first time.
Contrast an Algol-descendant like C, Pascal, Java, or even Python with a pure functional language like Haskell. In the former, control structure names are reserved words and control structures have a distinct syntax. In the latter, if you see `foo` in the body of a function definition you have no idea if it's a simple computation or some sophisticated and complex control structure just from what it looks like. The former provides more clues, which makes it easier to decipher at a glance. (Not knocking Haskell, here; it's an interesting language. But it's absolutely more challenging to read.)
To put it another way, syntax is the notation you use to think. Consider standard math notation. I could define my own idiosyncratic notation for standard algebra and calculus, and there might even be a worthwhile reason for me to do that. But newcomers are going to find it much harder to engage with my work.
I absolutely agree about Haskell (and also OCaml). They both suffer from "word soup" due to their designers incorrectly thinking that removing "unnecessary" punctuation is a good idea, and Haskell especially suffers from "ooo this function could be an operator too!".
> Contrast an Algol-descendant like C, Pascal, Java, or even Python with a pure functional language like Haskell. In the former, control structure names are reserved words and control structures have a distinct syntax. In the latter, if you see `foo` in the body of a function definition you have no idea if it's a simple computation or some sophisticated and complex control structure just from what it looks like. The former provides more clues, which makes it easier to decipher at a glance. (Not knocking Haskell, here; it's an interesting language. But it's absolutely more challenging to read.)
For what it's worth, Python has been moving away from this, taking advantage of a new parser that can implement "soft keywords" like 3.10's "match" statement (which I'm pretty sure was the first application).
Believe it or not, the motivation for this is to avoid reverse compatibility breaks. Infamously, making `async` a keyword broke TensorFlow, which was using it as an identifier name in some places (https://stackoverflow.com/questions/51337939).
In my own language design, there's a metaprogramming facility that lets you define new keywords and associated control structures, but all keywords are chosen from a specific reserved "namespace" to avoid conflicts with identifiers.
I don't have any real problem with words that are reserved absolutely and words that are reserved just in particular places. My point was more that in Algol-derived languages control structures look like control structures. And even in languages that implement `map()` and other higher-order functions, you can tell that it's a method/function call and that these things are being passed to it and those other things are not without going and looking up what `map()` does.
> In the latter, if you see `foo` in the body of a function definition you have no idea if it's a simple computation or some sophisticated and complex control structure just from what it looks like.
All control structures are reserved as keywords in Haskell and they're not extensible from within the language. In C I can't tell that an if(condition) isn't a function call or a macro without searching for additional syntactic cues, or readily knowing that an if is never a function. I generally operate on syntax highlighting, followed by knowing that an if is always a control structure, and never scan around for the following statement terminator or block to disambiguate the two.
I've found that programmers in general greatly overestimate how much the unreadability they experience with the ISWIM family is an objective property of the grammar. It's really just a matter of unfamiliarity. Firstly, I say this as a programmer who did not get started in the ML family and initially struggled with the languages. The truth of the matter is that they simply engage a different kind of mental posture and have different structural lines you're perceiving; this is generally true of all language families.
Pertinent to that last point and secondly, the sense of "well this is clearly less readable" isn't unique when going from the Algol family to the ISWIM family. The same thing happens in reverse, or across pretty much any language family boundary. For example: Prolog/Horn clauses are one of the least ambiguous syntax families (less so than even S-expressions IMO), and yet we find Elixir is greatly more popular than Erlang, and the most commonly cited preference reason has to do with the syntax. Many will say that Erlang is unintuitive, confusing, strange, opaque, etc. and that it's hard to read and comprehend. It's just the same unfamiliarity at play. I've never programmed Ruby, I find Elixir to be borderline incomprehensible while Erlang is in the top 3 most readable and writable languages for me because I've spent a lot of time with Horn clauses.
I think there's a general belief programmers have where once you learn how to program, you are doing so in a universal sense. Once you've mastered one language, the mental structures you've built up are the platonic forms of programming and computer science. But this is not actually the case. More problematically, it's propped up and reinforced when a programmer jumps between two very similar languages (semantically and/or syntactically) and while they do encounter some friction (learning to deal without garbage collection, list comprehensions, etc), it's actually nothing that fundamentally requires building up an entirely different intuitive model. This exists on a continuum in both semantics and syntax. My Erlang example indicates this, because semantically the language is nothing like Prolog, its differentiation from Elixir is purely syntactic.
There is no real universal intuition you can build up for programming. There is no point at which you've mastered some degree of fundamentals that would ever let you cross language family boundaries trivially. I've built up intuition for more formal language families than is possibly reasonable, and yet every time I encounter a new one I still have to pour a new foundation for myself. The only "skill" I've gotten from doing this ad nauseam is knowing at the outset that mastery of J does not mean I'd be able to get comfortable reading complex Forth code.
An article about diversity of language syntax that somehow only deals with C-adjacent curly-brace languages (and, tbf, Odin).
This is a blinkered viewpoint. If you want to talk about syntax, at least mention the Haskell family (Elm, Idris, F*, etc), Smalltalk, and the king of syntax (less) languages, LISP (and Scheme), which teach us that syntax is a data structure.
The syntax of a language is the poetry form, it defines things like meter, scansion, rhyming scheme. Of course people are going to have strong aesthetic opinions on it, just as there are centuries of arguments in poetry over what form is best. You can make great programs in any language, just like you make beautiful poetry in almost every form. (Leaving an almost there for people that dislike Limericks, I suppose.) Language choice is one of the (sometimes too few) creative choices we can make in any project.
> Another option is to do something like automatic semicolon insertion (ASI) based on a set of rules. Unfortunately, a lot of people’s first experience with this kind of approach is JavaScript and its really poor implementation of it, which means people usually just write semicolons regardless to remove the possible mistakes.
Though the joke is that the largest ASI-related mistakes in JavaScript aren't solved by adding more semicolons, it's the places that the language adds semicolons you didn't expect that trip you up the worst. The single biggest mistake is adding a newline after the `return` keyword and before the return value accidentally making a `return undefined` rather than the return value.
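A minimal sketch of that footgun (hypothetical getConfig, in plain JS/TS):

    function getConfig() {
      return          // ASI inserts a semicolon right here...
      {
        debug: true   // ...so this is an unreachable block with a label, not an object literal
      }
    }
    console.log(getConfig())  // prints undefined, not { debug: true }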
In general JS is actually a lot closer to the Lua example than a lot of people want to believe. There's really only one ASI-related rule that needs to be remembered when dropping semicolons in JS (and it is a lot like that Lua rule of thumb), the Winky Frown rule: if a line starts with a frown it must wink. ;( ;[ ;`
(It has a silly name because it keeps it easy to remember.)
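And a tiny sketch of the Winky Frown rule in a no-semicolons style (hypothetical snippet):

    const offset = 10
    const nums = [1, 2, 3]

    // Leading "[" must wink: without the ";", the parser keeps going and indexes
    // the previous line's array literal ([1, 2, 3][0, 1]) instead of starting a new statement.
    ;[0, 1].forEach(i => console.log(nums[i] + offset))

    // Leading "(" must wink for the same reason: it would otherwise be read as a call
    // on whatever the previous statement evaluated to.
    ;(function () { console.log("done") })()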
Syntax, or how humans perceive the syntax, is only a very small part of the problems when designing a programming language. There is a lot more about how a compiler would handle the syntax (efficiently) and about how the syntax affects actual code and ecosystem.
The recent go blog on error handling should make it clear that syntax is often not worth worrying about. https://go.dev/blog/error-syntax
I like the semantics you type in the google search bar when using it for impromptu calculations. You can use ^ to raise to a power, for example. Just type sin 45. It’s all least surprise.
Syntactically it probably has a ceiling, to be so casual. Least surprise won’t work for very complex programs. But maybe the programs wouldn’t be so complex if you didn’t have to stick together complex program syntax either.
> Lua is an example of such a language, and when a semicolon is necessary is when you have something that could be misconstrued as being a call:
(function() print("Test1") end)(); -- That semicolon is required
(function() print("Test2") end)()
Tangential, but I sidestepped this ambiguity in a language I've been designing on the side, via the simple rule that the function being called and the opening parenthesis can't have whitespace between them (e.g. "f()" is fine but "f ()" or "f\n()" is not). Ditto for indexing ("x[y]"). If these characters are encountered after whitespace, the parser considers it the beginning of a new expression.
By sacrificing this (mostly unused, in practice) syntactic flexibility, I ended up not needing any sort of "semicolon insertion" logic - we just parse expressions greedily until they're "done" (i.e. until the upcoming token is not an operator).
I definitely think needless whitespace flexibility often causes problems. For example, I'm pretty sure Bjarne chose :: instead of : for the namespace operator in C++ due to ambiguity. A little bit of required whitespace around jump labels and ternary expressions and we could have saved an extra character in an operator that often occurs multiple times per line. Everybody runs linters that enforce that anyways. Likewise the inability to use a hyphen in an identifier has wasted a lot of my time over the years, but nobody uses squashed subtraction expressions.
in the era of LLMs, syntax might matter more than you think.
The C form of `type name;` is ambiguous because it could actually be more than one thing depending on context (e.g. `a * b;` is either a multiplication expression or a declaration of `b` as a pointer to `a`, depending on whether `a` names a type). Even worse if you include macro shenanigans. The alternative (~Rust/Zig) form of `var/const/mut name type` is unambiguous.
For humans, with rather long memory of what is going on in the codebase, this is ~"not a problem" for experts. But for an LLM, its knowledge is limited to the content that currently exists in your context, and conventions baked in with the training corpus, this matters. Of course it is ALSO a problem for humans if they are first looking at a codebase, and if the types are unusual.
I hope that someday LLMs will interact with code mostly via language servers, rather than reading the code itself (which both frequently confuses the LLM, as you've noted, but is also simply a waste of tokens).
Like which do you think is more token-efficient?
1)
2)
Not sure I follow. You seem to have omitted the part of 1) explaining how the LLM knew that my_function even existed - presumably, it read the entire file to discover that, which is way more input tokens than your hypothetical available_functions response.
LSP is meant for IDEs and very deterministic calls. Its APIs are like this: give me a definition of <file> <row> <column> <length>. This makes sense for IDEs because all of those can be deterministically captured based on your cursor position.
LLMs are notoriously bad at counting.
I think one could easily build an MCP tool wrapping LSP which smooths over those difficulties. What the LLM needs is just a structured way to say "perform this code change" and a structured way to ask things like "what's the definition of this function?" or "what functions are defined in this module?"
Not much different from what agents already do today inside of their harnesses, just without the part where they have to read entire files to find the definition of one thing.
So far enabling LSP in Claude only added messages like "this is old diagnostic before my edit".
This is an underappreciated point. I work across a lot of codebases and the difference in how well AI coding tools handle Rust vs JavaScript vs Python is striking — and syntax ambiguity is a big part of it.
The `type name` vs `let name: type` distinction matters more than it seems. When the grammar is unambiguous, the LLM can parse intent from a partial file without needing the full compilation context that a human expert carries in their head. Rust and Go are notably easier for LLMs to work with than C or C++ partly because the syntax encodes more structure.
The flip side: syntax that is too terse becomes opaque to LLMs for the same reason it becomes opaque to humans. Point-free Haskell, APL-family languages, heavy operator overloading — these rely on the reader holding a lot of context that does not exist in the immediate token window.
I wonder if we will see new languages designed with LLM-parseability as an explicit goal, the way some languages were designed for easy compilation.
I've worked on fine tuning projects. There's a massive bias towards fine tuning for Python at several model providers for example, followed by JS.
Humans also have limited context. For LLMs it's mostly a question of pipeline engineering to pack the context and system prompt with the most relevant information, and allow tool use to properly understand the rest of the codebase. If done well I think they shouldn't have this particular issue. Current AI coding tools are mostly huge amounts of this pipeline innovation.
I think we need an LLM equivalent of this part of Fitts's law: the fastest place to click under a cursor is the location of the cursor. For an LLM the least context-expensive feedback is no feedback at all; the LLM should be able to intuit the correct code in-place, at token generation.
I am an S-exp enjoyer, and more for practical reasons than aesthetic ones—I really like the editor tooling that's possible with S-expressions. So I will absolutely choose a Lisp or a lisp if given the option, even at some level of inconvenience when it comes to the maturity of the language itself. I will always write Hy[0] rather than Python, for example.
(I am aware of Combobulate[1] for Emacs folks, of which I'm sadly not one.)
[0] https://hylang.org
[1] https://GitHub.com/mickeynp/combobulate
The most machine-friendly syntax - and the least appropriate for our LLM overlords which get confused by parenthesis because they don’t see the structure.
There's an option beyond lisp. Forth has even less syntax.
Major syntactic structures definitely have an influence on my language choices. Outside of compilation and runtime model, modeling the domain (both data and procedures) changes drastically between paradigms. Syntax is what enables or hamstrings different modeling paradigms.
My two biggest considerations when picking a language are:
- How well does it support value semantics? (Can I pass data around as data and know that it is owned by the declaring scope, or am I chained to references, potential nulls, and mutable handles with lifetimes I must consider? Can I write expression-oriented code?)
- How well does it support linear pipelining for immutable values? (If I want to take advantage of value semantics, there needs to be a way to express a series of computations on a piece of data in-order, with no strange exceptions because one procedure or another is a compiler-magic symbol that can't be mapped, reduced, filtered, etc. In other words, piping operators or Universal Function Call Syntax.)
I lean on value semantics, expression-oriented code, and pipelining to express lots of complex computations in a readable and maintainable manner, and if a language shoots me in the foot there, it's demoralizing.
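A rough TS-flavored sketch of the kind of linear, value-oriented pipeline meant here (hypothetical data; method chaining standing in for a pipe operator or UFCS):

    type Reading = { sensor: string; celsius: number }

    const readings: Reading[] = [
      { sensor: "a", celsius: 21.4 },
      { sensor: "b", celsius: 38.9 },
    ]

    // each step takes a value and produces a new value; nothing is mutated in place
    const report = readings
      .filter(r => r.celsius > 30)
      .map(r => ({ ...r, fahrenheit: r.celsius * 9 / 5 + 32 }))
      .map(r => `${r.sensor}: ${r.fahrenheit.toFixed(1)}F`)
      .join(", ")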
I don't want to be overly negative, but it seems to me that the author considers just different flavours of C.
There is a massive difference between Clojure, Prolog, and Forth.
The whole:
type name = value—type-focused
name: type = value—name-focused
var name type = value—qualifier-focused
is so deep into the details of how syntax might look.
If you are choosing between Kotlin and Go, it is for the platform, not the syntax. If you decide between Haskell, Idris, Scheme, you do it with the syntax in mind.
C? Basically Algol. Pascal? Basically Algol, actually quite closely. Go? Basically Algol, via Pascal. Lua? Basically Algol, surprisingly closely.
Forth? Basically Lisp. Postscript? Basically Lisp.
I never got why compilers don't have pluggable syntaxes.
I mean, once you decide the "flavor" (e.g.: typed, imperative, with a dash of functional and some oop for good measure), you could have more than one syntax and easily switch to whatever the reader wants.
We had an integration language in a product I worked on that had three flavors (you can check it here: https://docs.oracle.com/cd/E13154_01/bpm/docs65/pdf/OracleBP... , page 254)
The original syntax scared some people, so we had the compiler use the same AST with three different parsers: Original, Java and VB. The editor (which had syntax highlighting and auto completion) would let you see the code however you wanted.
You could even have a setting in the IDE that always showed the code as you wanted.
We even respected some weirdness in the spacing and indentation of comments and code when needed.
For some languages, like rust it may be a stretch, but for most vanilla languages, you could easily re-skin them to look much more like something else, that's comfy for whoever is looking at the code.
> I never got why compilers don't have pluggable syntaxes.
An interesting question, but the answer is "because it's a bad idea" that doesn't actually solve the problem.
That said, the right way to implement this is as a "transpiler" that compiles one syntax into another. And only the people who want to use it pay the costs.
This doesn't really explain anything, and it isn't clear that both of you have the same model of "the problem" in mind.
Code is communication. The compiler could handle it, but what is important is that other people can.
There are many infamous examples of people using the C preprocessor to write near-Pascal or similar in C. It largely died out because it hindered effective communication about the code.
> I am still perplexed by how people judge a language purely by its declaration syntax, and will decide whether to use the language purely based on whether they like that aspect or not.
Throughout the article, OP seems baffled that people have aesthetic preferences. Well, yes, of course we do; dealing with ugly things is the computer's job.
It also comes across like OP hasn't seen a lot of examples of really interesting language syntax, i.e. things outside, shall we say, the extended Algol family. The discussion seems to accommodate brace-less languages like Python, but not e.g. the Lisp or Forth families.
> and thus just becomes a question of ergonomics or “optimizing for typing” (which is never the bottleneck).
It might not be a bottleneck in terms of time needed. But unpleasant syntax is annoying and breaks flow. Thoughts creep in about how you wish the language looked different and that you didn't have to type these other bits. (Which is why a lot of people feel so strongly about type inference.)
> From what I gather, this sentiment of not understanding why many “modern” languages still use semicolons is either:
OP seems to conflate "semicolon" with "visible, explicit token that separates statements". There's no reason it couldn't be some other punctuation, after all. Describing Python's approach to parsing as "automatic semicolon insertion" is wild to me; the indented-block structure is the point and statements are full lines by default. Semicolons in Python are a way to break that rule (along with parentheses, as noted), which are rarely seen as useful anyway (especially given the multiple assignment syntax).
> To allow for things like Allman braces, Odin allows for extra single newline in many places in its grammar, but only an extra single newline. This is to get around certain ambiguities between declaration a procedure type and a procedure literal
Right; and the point of Python's approach is to not need braces in the first place, and therefore sidestep any considerations of brace style. And when you do that, it turns out that you don't need to think nearly as hard about whether a newline should terminate a statement. It's a package deal.
> Maybe I don’t need to be as cynical and it is a lot simpler than all of that: first exposure bias. It’s the tendency for an individual to develop a preference simply because they became familiar with it first, rather that it be a rational choice from a plethora of options.
> However I do think there are rational reasons people do not like a syntax of a language and thus do not use it. Sometimes that syntax is just too incoherent or inconsistent with the semantics of the language. Sometimes it is just too dense and full or sigils, making it very hard to scan and find the patterns within the code.
For what it's worth, before I ever touched Python I had already used (in no particular order) multiple flavours of BASIC, Turing, probably at least two kinds of assembly, Scheme, C, C++, Java and Perl. To be fair, I had also used HyperTalk and Applescript, so maybe that does explain why I glommed onto Python. But BASIC came first.
In my mind, a mid-line semicolon is exactly the kind of sigil described here, and an end-of-line sigil is simply redundant. Multi-line statements should be the explicitly-marked exception, if only because long statements should be less common than shorter ones.
I personally would prefer to hear more about what is uniquely good about Odin semantically or syntactically than more ad hominem attacks on the intelligence of the critics of the language, which I have seen in multiple recent pieces by this author.
This is the old "syntax does not matter" claim. Syntax is not the most important thing in the world when it comes to programming languages, but it does matter too. I was using perl, then PHP then ruby. There is no comparison here; ruby beats the other two languages hands down. I get to be able to do more, with less syntax and it is easier to read too (provided you write good code; you can write horrible code in any language of course).
Most of the languages that are created anew, end up being a clone of C or C++. Go is one of the few exceptions here; Rust is not an exception. It is basically C++ really, from the syntax - or even worse.
Sadly it is not possible to try to convince people who claim that syntax does not matter, that it does matter. They just keep on repeating that syntax is irrelevant. I don't think syntax is irrelevant at all. It has to do with efficiency of expression. Clear thoughts. Clear design. It is all inter-connected.