berserk | a day ago
I'm interested to hear how many people agree with this.
I personally find writing tests to be a lot of fun and look forward to it when programming. Seeing lots of green (and looking forward to seeing lots of green) really scratches an itch in my brain.
aarroyoc | a day ago
I like writing tests, but I find it tedious to create the whole setup that you need to build effective tests. Once a project has it, it's easy, but if it has nothing then I need to breathe deeply.
technomancy | a day ago
I think this is really common when tests are considered an afterthought. This is a sign that no one has taken the tests seriously enough to think thru what kind of abstractions they need to make their setup not be tedious and painful.
steinuil | a day ago
Yup. I usually have an idea of the situation that I want to test, but the machinery to set that up is often very complicated and (in the case of the Python tests at my workplace) completely untyped and often not composable. Most of the time I end up copying the setup code from another test without completely understanding it, and being content with jj splitting the test from my change, running it before to check that it fails, and running it after to check that it passes.
giacomo_cavalieri | a day ago
Ah, I get that feeling! I remember once being tasked with bringing a work codebase up to 80% coverage and it had no tests at all.
Lanny | a day ago
I'm definitely in the "tests are boring and I wouldn't do them if I didn't have to" camp.
I can understand the mentality of people who enjoy testing. Occasionally I even get to write a test of some well-behaved thing, like a nice pure function of and to value types, and I kinda get a kick out of trying to devise a table of inputs and outputs with sneaky little edge cases, then making sure they're all covered. But I often find the things I work on aren't like this: there's a big messy API, the thing under test has side effects, and I end up having to make my test very beholden to implementation details rather than the actual interface (basically all mocking). So I feel like many tests that get written are more a "test this thing never gets modified again" sort of thing than actual evidence of correctness, and that's where the whole exercise starts to feel like tedium.
mpweiher | a day ago
Do you test first or test after?
Lanny | 23 hours ago
I've done both. I find the kind of work that's amenable to the good kind of tests is also amenable to test-first methodologies. Conversely, when the work is "call three APIs, synthesize the results in some contingent way, write the results somewhere, then kick off some other thing", you're not going to write a meaningful test before you start an implementation, and even if you do, you're going to spend more time and lines on mocking or a DI scheme than testing first-party code.
mpweiher | 14 hours ago
I'd agree that integration work is less amenable to unit testing and a "test first" approach.
However, it's doable. If the job is calling the 3 APIs (remote?) and synthesizing the results, then that sounds more like an integration test than a unit test. I nowadays do integration tests as "test-only" tests.
I code the whole thing up in a test case, calling up the various external APIs and verifying their outputs (≈ characterization test). When the test is working, I refactor it, preferably extracting the "synthesize the results in some contingent way" as a function or method that just depends on its inputs.
I generally avoid stubs (sometimes called mocks); they are a code smell that almost always tells me something in my architecture is wrong.
Why I don't mock
However, if I have a lot of functionality that depends on an external service, I will eventually code up a local test-double of that external system. And I code it up by pointing my characterization tests at my test-double.
Any functionality that does something complex does not call the external APIs directly. It is called with the parameters it needs, previously obtained from the external APIs, and the "orchestration" that ties it all together is kept as trivial and unbreakable as possible.
That way, when your unit tests work, you pretty much know that your integration tests will work as well.
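A rough sketch of that shape in Python, with hypothetical names (the pure synthesis function is what the unit tests cover; the orchestration is left almost too simple to break):

```python
# Sketch only: hypothetical names, not actual project code.
# The synthesis logic is a pure function of its inputs, so unit tests
# never need to touch the network.
def synthesize_report(orders: list[dict], prices: dict[str, float]) -> dict:
    total = sum(prices.get(o["sku"], 0.0) * o["qty"] for o in orders)
    return {"order_count": len(orders), "total": total}

# The orchestration stays trivial: fetch, call the pure function, write.
def run(orders_api, pricing_api, sink) -> None:
    orders = orders_api.fetch_orders()
    prices = pricing_api.fetch_prices()
    sink.write(synthesize_report(orders, prices))
```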
spc476 | a day ago
I'm not @Lanny (who I agree with) but I test after. Writing tests before coding is just alien to my way of thinking.
mpweiher | 7 hours ago
How is it alien to your way of thinking?
For me, writing down precisely what my code is supposed to do before I write the code seems just very natural.
k749gtnc9l3w | 6 hours ago
For many tasks, what the available tools will do / what is even possible is something you discover as you try writing the code.
spc476 | a day ago
Writing tests is boring.
Personally, I've found I like table-driven tests more than writing tests (as code). Something like:
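(Sketched here in Python with pytest.mark.parametrize and a made-up slugify function, since it's the table shape that matters:)

```python
import pytest

# Made-up function under test, purely for illustration.
def slugify(text: str) -> str:
    return "-".join(text.lower().split())

# The "table": each row is an input and its expected output.
@pytest.mark.parametrize(
    ("text", "expected"),
    [
        ("Hello World", "hello-world"),
        ("  spaces   everywhere ", "spaces-everywhere"),
        ("already-slugged", "already-slugged"),
        ("", ""),
    ],
)
def test_slugify(text, expected):
    assert slugify(text) == expected
```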
but I seem to be in the minority in this case. I also find mocking to be problematic and the complaint about that post was that it involved too much testing.
satvikberi | 18 hours ago
Depends on where I've worked. In my early jobs tests were basically pure cargo culting and never caught a single real error. Later I worked at places that devised tests based on what errors we frequently had, and these included unusual things like statistical tests on the output of a data pipeline, or tests for how the runtime of an algorithm scaled with larger inputs. They ran more slowly, but actually prevented bugs & regressions.
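A rough sketch of the runtime-scaling idea (hypothetical process_batch function; the sizes and thresholds are illustrative, and a real version would use larger inputs and repeated runs to reduce noise):

```python
import time

# Hypothetical stand-in for a real pipeline step.
def process_batch(items: list[int]) -> list[int]:
    return sorted(items)

def test_runtime_scales_roughly_linearly():
    def measure(n: int) -> float:
        data = list(range(n, 0, -1))
        start = time.perf_counter()
        process_batch(data)
        return time.perf_counter() - start

    small = measure(50_000)
    large = measure(200_000)
    # 4x the input should cost far less than 10x the time for a
    # near-linear algorithm; an accidental quadratic blows past this.
    assert large < 10 * small
```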
giacomo_cavalieri | a day ago
Hello, author here! Yeah I agree, it's a bit tongue-in-cheek to appeal to a larger crowd, but I myself am in the "mostly likes writing tests" camp.
icholy | a day ago
Writing tests is fine if the code is written in a testable way. Otherwise it's hell.
neild | 22 hours ago
I find making tests easy to write quite interesting and often fun. The actual tests aren't exciting, but if I'm doing it right the actual tests are very little work.
I personally don't find much value in test-driven development, but testability-driven development is essential. If tests are tedious and hard to write, the tests will not be written, and bugs will inevitably lurk in the untested gaps.
mpweiher | 16 hours ago
Me too!
But then, I do test-first. I find that people who find tests boring almost invariably write them after. Of course it is going to be boring. And very unrewarding, because you already had your high from writing the code, and now at best the test can tell you that it was OK (null result), and at worst it will tell you that you were wrong (sucks).
With test-first, that first failure is positive, so is getting it green.
alpaylan | 23 hours ago
I think the types of tests most people write are boring. If you are validating behavior you already know to be correct, it's a chore, the fundamental benefit you get is catching regressions. If you are actively trying to break your software, or if you are trying to understand the system better through tests (this happens to me a lot when writing parsers because I cannot fully anticipate how the parsing goes), then it can be very fun. Personally, I prefer writing PBTs to writing the software itself.
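For example, a small property-based test with Hypothesis, using a made-up render/parse pair for a toy key=value format, might look like:

```python
from hypothesis import given, strategies as st

# Made-up render/parse pair for a toy "key=value" line format,
# used only to illustrate a round-trip property.
def render(pairs: dict[str, str]) -> str:
    return "\n".join(f"{k}={v}" for k, v in pairs.items())

def parse(text: str) -> dict[str, str]:
    result: dict[str, str] = {}
    for line in text.splitlines():
        key, _, value = line.partition("=")
        result[key] = value
    return result

# Printable ASCII without "=" keeps the toy format unambiguous.
ALPHABET = "".join(chr(c) for c in range(33, 127) if chr(c) != "=")
token = st.text(alphabet=ALPHABET, min_size=1)

@given(st.dictionaries(token, token))
def test_parse_roundtrips_render(pairs):
    assert parse(render(pairs)) == pairs
```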
simonw | a day ago
I love snapshot testing. I started doing it manually six years ago using Black and pytest, then I found Syrupy which lets you write tests like this:
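(A minimal sketch of the shape, with a made-up function under test; the snapshot fixture comes from Syrupy:)

```python
# Sketch only: greet is a made-up function standing in for real code.
def greet(name: str) -> str:
    return f"Hello, {name}!"

def test_greet(snapshot):
    # Syrupy's snapshot fixture records the value when run with
    # `pytest --snapshot-update` and compares against it afterwards.
    assert greet("Cleo") == snapshot
```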
That snapshot is a pytest fixture which magically resolves to the captured snapshot for the test in question. You can update it using pytest --snapshot-update - the snapshot is serialized to disk in the tests/__snapshots__ folder.
I've also used inline-snapshot which is a little more magic - it rewrites your test files directly to match the snapshot, as seen here. You don't even need a fixture with that one, you just import the snapshot() function and use it like this:
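(Again a made-up function, just to show the shape:)

```python
from inline_snapshot import snapshot

# Sketch only: made-up function under test.
def greet(name: str) -> str:
    return f"Hello, {name}!"

def test_greet():
    # Running pytest with --inline-snapshot=create rewrites the empty
    # snapshot() call below to snapshot("Hello, Cleo!") in this file.
    assert greet("Cleo") == snapshot()
```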
zanlib | 23 hours ago
This might be a hot take for this comment section, but I generally consider snapshot testing to be an anti-pattern, having worked on a large JS code base that used copious Jest .toMatchSnapshot().
Some of the gripes I had with it are:
It was widely used for testing constants, which is a waste of everyone's time. (The post actually seems to do this by testing help text, which seems somewhat unnecessary, unless there's something I'm missing such as the help text being dynamically generated).
The tests aren't descriptive. At the end of the day, what you're doing is adding a barrier to ensure that the value before and after the code change is the same. If the return value of the function under test is simple, the writing and alteration of an ordinary unit test is also simple...
...But if it's complex, then the snapshot does not tell you what the author of the test expected to actually test. It becomes much harder to figure out what the intention was, because snapshot tests of more complicated functionality tend to invite laconic assertion descriptions, along the lines of "matches snapshot". The entire thing breaks even if something minor (that's possibly covered by another test) changes.
In general, I feel like testing should be done deliberately, by checking the essential and ignoring the accidental. Snapshot testing doesn't do this - it just takes the entire thing.
The only exception I would point out as to where snapshots are actually useful is the idea of Elixir's doctests, but that's because they test the correspondence between the documentation and functionality (making sure that the examples given in the docs match the actual behaviour). But that's really about it.
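Python's doctest module is built on the same idea, for comparison; a minimal sketch:

```python
def mean(xs):
    """Return the arithmetic mean of a non-empty sequence.

    The example below is documentation and a test at once: running
    `python -m doctest` on this file fails if the output stops matching.

    >>> mean([1, 2, 3])
    2.0
    """
    return sum(xs) / len(xs)
```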
jlarocco | 6 hours ago
I agree with you on the "help text" example. I don't really see the point in a test like that unless the tests are written by somebody other than the developer, or the help text is dynamic. You're literally testing that a function can return a string, and there's no good reason to do that.
Which ties back to whether writing tests is fun. If the tests are validating something meaningful, then it's fun to write them and gain confidence in the implementation. If writing the test feels like busy work, like testing that the developer can copy/paste a string into two files, then it's miserable.
bmh | a day ago
I quite liked the usage of snapshots for testing the help text. I think it works well because the text content is diff-able and very easy to parse as a human. In my opinion, this is where snapshot tests excel.
Having used snapshot tests extensively in UI contexts, my experience has been that I put little to no trust in snapshot tests (particularly of the DOM). I find that snapshot libraries have this "handy" feature to accept the changes, and this makes it too easy to accept changes without reviewing why the test failed. Of course, this might be a process problem, but I still think it's a weakness of snapshot tests in general. Subtle but incorrect changes to the output can be easy to disregard.
I've found myself trusting unit tests with the classic setup and assertions, because they require a developer to think harder about what is causing the test to fail instead of being given an easy path of accepting the new output.
Lanny | a day ago
This is generally my experience in most team contexts as well. I will say that on very small team projects or personal projects, where you can reasonably expect every person making changes to understand every piece of the UI and what should or shouldn't be changing in snapshots when a given piece of code changes, UI snapshot testing has worked well for me.
Also, often the value isn't so much "I rigorously looked at the snapshot diff and confirmed the change was right or wrong" as "wait, why did my change in a util file used by foo cause bar's snapshot to change?". That is, knowing what wiggled at all is often more useful than knowing exactly what changed, if for no other reason than it's a good reminder to go look at everything that changed and confirm that you didn't make some assumption that gets violated in some far-flung context.
giacomo_cavalieri | a day ago
Yeah, I think snapshot testing UIs is really not that nice; you get that feeling of "just accept anything and everything". I think snapshot testing really shines when testing "regular" code that can benefit from being displayed in some pretty string version.
steinuil | a day ago
I really like expectation testing. In OCaml when using dune, you can have a test.ml file that prints things to stdout, and if you also have a test.expected file in the same directory, dune will detect it, compare the output from test.ml, and show you a diff if there are any changes, and if you like it you can update it with dune promote.
I've used it in xobl to review changes to the output of the compiler. I'm glad Gleam also has a framework for doing this!
seafoamteal | a day ago
I first encountered snapshot tests on the Gleam compiler, which is mentioned in the article, and I really like them now! The cool formatting that Giacomo shows off in the article definitely makes the snapshots easier to review, and another benefit is that your test files are much more focused because they only contain the test code and not hundreds/thousands of lines of expected output.
giacomo_cavalieri | a day ago
I discovered snapshot testing thanks to the Gleam compiler too! My first reaction was "how come I've never heard of this before, this rocks"