(And let's not forget that in this particular example the amount of unit tests required to catch all cases is relatively small, not even nearly 50.)
That really depends what you mean by "all cases".
If you just want to test for the specific case of "have we created a suppression cross rather than a suppression cube", sure, a few unit tests can probably catch that error. If you want to test for
all the ways a suppression function could affect a zone which is not a cuboid of the defined size and position ... well, there's an infinite number of possible shapes which are not that one. If you write a set of unit tests, I guarantee I can write a function which passes all of them and still suppresses a zone of the wrong shape. Which is a problem for the TDD utopia where you "first write the tests based on the spec such that only the correct function can pass all of them".
So, for example: let's say we're using Sol-centred coordinates and want to suppress a cube 2500 LY per side, centred on Sol.
For a basic test (which will catch the suppression-cross failure mode) you'd need to check at least one point just inside and one just outside (1250, 0, 0), and likewise for the other five faces. That's 12 tests.
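As a minimal sketch - assuming a hypothetical `is_suppressed(x, y, z)` predicate taking Sol-centred LY coordinates; the real function's name and signature will of course differ:

```python
# Hypothetical predicate: True if (x, y, z), in Sol-centred LY
# coordinates, lies inside the suppression zone. The intended
# implementation is an axis-aligned cube of half-side 1250.
def is_suppressed(x: float, y: float, z: float) -> bool:
    return max(abs(x), abs(y), abs(z)) <= 1250

def test_x_faces():
    assert is_suppressed(1249, 0, 0)       # just inside the +X face
    assert not is_suppressed(1251, 0, 0)   # just outside the +X face
    assert is_suppressed(-1249, 0, 0)      # just inside the -X face
    assert not is_suppressed(-1251, 0, 0)  # just outside the -X face
    # ...plus the same four checks for Y and Z: 12 assertions total.
    # A suppression cross (a union of three infinite slabs) fails the
    # "just outside" assertions, so this catches that failure mode.
```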
But that's not sufficient to catch the case where it's actually been implemented as a suppression sphere with radius 1250 - you also want to check either side of (1250, 1250, 1250) and its +/- sign permutations, to make sure the corners are in the right places. That adds another 16 tests.
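Continuing the sketch, against the same hypothetical `is_suppressed` as above:

```python
from itertools import product

def test_corners():
    # A sphere of radius 1250 passes all 12 face tests above but has
    # no corners; probe just inside and just outside each of the 8
    # corners of the cube to tell the two shapes apart.
    for sx, sy, sz in product((1, -1), repeat=3):
        assert is_suppressed(1249 * sx, 1249 * sy, 1249 * sz)
        assert not is_suppressed(1251 * sx, 1251 * sy, 1251 * sz)
    # 8 corners x 2 probes each = the extra 16 tests.
```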
None of that catches the case where the suppression alternates - so there is a suppression cross, but consisting only of every other cube - or where there's a suppression shell, so (1249, 0, 0) is suppressed but (999, 0, 0) isn't. Depending on how the code has evolved, it's entirely possible that leftovers from a previous design might be doing exactly that. For example, say there were previously four types of suppression at nested cubes in 625 LY steps, and the decision is taken that this isn't needed any more: just simplify it to the single 1250 LY cube. The programmer takes out the three calls to the functions testing the other cubes ... and doesn't notice - because all the unit tests defined above pass - that the remaining function only suppresses the 625-1250 cubic shell, and the area right next to Sol is no longer suppressed. (The tests so far would also not notice them forgetting to disable the 1875-2500 shell.)
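To make that concrete, here's a leftover-shell implementation (same hypothetical interface as above) that passes every one of the 28 face and corner assertions:

```python
def is_suppressed_shell(x: float, y: float, z: float) -> bool:
    # Leftover from the old nested-cube design: only the 625-1250
    # shell check survived the cleanup. Every face and corner probe
    # above lands in the 1249-1251 band, so all 28 assertions pass -
    # but Sol itself (d < 625) is no longer suppressed.
    d = max(abs(x), abs(y), abs(z))
    if 625 < d <= 1250:
        return True
    return False
```

A single extra probe at the origin - `assert is_suppressed(0, 0, 0)` - would catch this particular leftover, but only if someone thought to write it.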
Coverage and branching metrics don't help here: depending on how the function is written internally, you can get total branch coverage with somewhere between two and seven tests. That's nowhere near enough to guarantee that it's branching the right way for every input set.
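Against the shell version above, for instance, two probes hit both branches of its one conditional, so a branch-coverage tool will report 100%:

```python
def test_shell_coverage():
    # Takes the True branch ...
    assert is_suppressed_shell(1000, 0, 0)
    # ... and the False branch: 100% branch coverage, yet the hole
    # at Sol (0, 0, 0) is never probed.
    assert not is_suppressed_shell(2000, 0, 0)
```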
So ... how many more tests? There are very roughly 10^18 possible 3D coordinates this function needs to work on. We're not going to write 10^18 assertions to make sure that each one individually behaves, but we probably need well over 100 to cover even just the plausible shapes which are not an axis-aligned cube (and then we need further tests to check that each call site is indeed passing in Sol-centred coordinates, in the right order and units, rather than some other ints it had lying around).
but in many cases complex algorithms require a lot more thought and effort put into writing them.
Yes, but by the "make everything unit-testable" principle that complex algorithm should be decomposed into a series of simpler algorithms, each of which is independently tested, and then the recomposition of them is tested too. So the number of unit tests required is also much higher, because the composed algorithm has many more boundaries than a simple cuboid.
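Even the trivial cube check shows this: decompose it into a per-axis helper plus a recomposition, and each layer brings its own boundaries to test (a sketch, reusing the hypothetical interface from above):

```python
def within_half_side(coord: float, half_side: float = 1250) -> bool:
    # Independently testable piece: a single-axis boundary check.
    return abs(coord) <= half_side

def is_suppressed_composed(x: float, y: float, z: float) -> bool:
    # The recomposition needs its own tests too: swapping `and` for
    # `or` here recreates the suppression cross even though every
    # per-axis unit test still passes.
    return (within_half_side(x) and within_half_side(y)
            and within_half_side(z))
```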
As noted above, given that you can't possibly test every single combination of input values, you have to actually think about the tests you're writing so that each assertion rules out as many wrong implementations as possible. That also requires thought and effort.
Complex algorithms - once you get beyond testing single lines of code - can also have emergent bugs despite all their unit components working correctly ... as we see with the blue star cubes:
- the generation of stars in a cubic octree is entirely deliberate
- the variation of density between adjacent cubes is intentional and necessary
- the visibility of individual stars has been picked to replicate real visibility ranges
...and yet from certain view positions and directions you get an obvious pattern which looks wrong to human eyes. "It shouldn't look like that" isn't a formal spec that can be converted into unit tests, though. Maybe you can define some edge-detection algorithm and say that for no position and rotation should this algorithm ever find an edge with a strength greater than X ... but
how many millions of positions and directions have explorers looked at the galaxy from by now? And how well debugged is that edge-detection algorithm itself?
Meanwhile, unit tests are very simple and fast to write (and that's their very purpose; if you find yourself spending a lot of time writing one unit test, you are doing it wrong).
Indeed, but that doesn't make them immune to the sort of typo that creates the suppression cross - more often than not, when I write a test and the test fails, it's because I wrote the test wrong: simple mistakes like putting two parameters in the wrong order, or defining an input object wrongly for the behaviour I wanted to test. Worse: sometimes the test passes anyway. In the example above, what if I typo the test that checks that (1251, 0, 0) is outside the cube and put (2151, 0, 0) instead? Single-character transpositions like that are trivial to make, especially if you're writing "easy" code like unit tests and therefore not paying full attention - and in this case the test will pass but isn't actually testing the right thing, so future code changes can result in an undetected regression.
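Concretely, the transposed test still passes against the hypothetical cube above, so nothing flags it:

```python
def test_outside_positive_x_face():
    # Intended probe was (1251, 0, 0); a one-character transposition
    # gives (2151, 0, 0), which is also outside the cube, so the test
    # passes - but the +X face boundary is now unguarded, and a later
    # regression that grows the cube to, say, half-side 1500 would
    # slip straight past it.
    assert not is_suppressed(2151, 0, 0)
```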
Yes, they're individually very simple, but they're still code and still need debugging and reviewing and documenting in their own right.