(And let's not forget that in this particular example the amount of unit tests required to catch all cases is relatively small, not even nearly 50.)
That really depends what you mean by "all cases".
If you just want to test for the specific case of "have we created a suppression cross rather than a suppression cube", sure, a few unit tests can probably catch that error. If you want to test for
all the ways a suppression function could affect a zone which is not a cuboid of the defined size and position ... well, there's an infinite number of possible shapes which are not that one. If you write a set of unit tests, I guarantee I can write a function which passes all of them and still suppresses a zone of the wrong shape. Which is a problem for the TDD utopia where you "first write the tests based on the spec such that only the correct function can pass all of them".
So, for example: let's say we're using Sol-centred coordinates and want to suppress a cube 2500 LY per side, centred on Sol.
For a basic test (which will catch the suppression-cross failure mode) you'd need to check at least one point just inside and one just outside (1250, 0, 0), and likewise for the other five faces. That's 12 tests.
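As a minimal sketch - assuming a hypothetical `is_suppressed(x, y, z)` predicate taking Sol-centred LY coordinates; the real function's name and signature will of course differ:

```python
# Hypothetical predicate: True if (x, y, z), in Sol-centred LY
# coordinates, lies inside the suppression zone. The intended
# implementation is an axis-aligned cube of half-side 1250.
def is_suppressed(x: float, y: float, z: float) -> bool:
    return max(abs(x), abs(y), abs(z)) <= 1250

def test_x_faces():
    assert is_suppressed(1249, 0, 0)       # just inside the +X face
    assert not is_suppressed(1251, 0, 0)   # just outside the +X face
    assert is_suppressed(-1249, 0, 0)      # just inside the -X face
    assert not is_suppressed(-1251, 0, 0)  # just outside the -X face
    # ...plus the same four checks for Y and Z: 12 assertions total.
    # A suppression cross (a union of three infinite slabs) fails the
    # "just outside" assertions, so this catches that failure mode.
```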
But that's not sufficient to catch the case where it's actually been implemented as a suppression sphere with radius 1250 - you also want to check either side of (1250, 1250, 1250) and its +/- sign permutations, to make sure the corners are in the right places. That adds another 16 tests.
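Continuing the sketch, against the same hypothetical `is_suppressed` as above:

```python
from itertools import product

def test_corners():
    # A sphere of radius 1250 passes all 12 face tests above but has
    # no corners; probe just inside and just outside each of the 8
    # corners of the cube to tell the two shapes apart.
    for sx, sy, sz in product((1, -1), repeat=3):
        assert is_suppressed(1249 * sx, 1249 * sy, 1249 * sz)
        assert not is_suppressed(1251 * sx, 1251 * sy, 1251 * sz)
    # 8 corners x 2 probes each = the extra 16 tests.
```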
None of that catches the case where the suppression alternates - so there is a suppression cross, but consisting only of every other cube - or where there's a suppression shell, so (1249, 0, 0) is suppressed but (999, 0, 0) isn't. Depending on how the code has evolved, it's entirely possible that leftovers from a previous design might be doing exactly that. For example, say there were previously four types of suppression at nested cubes in 625 LY steps, and the decision is taken that this isn't needed any more: just simplify it to the single 1250 LY cube. The programmer takes out the three calls to the functions testing the other cubes ... and doesn't notice - because all the unit tests defined above pass - that the remaining function only suppresses the 625-1250 cubic shell, and the area right next to Sol is no longer suppressed. (The tests so far would also not notice them forgetting to disable the 1875-2500 shell.)
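To make that concrete, here's a leftover-shell implementation (same hypothetical interface as above) that passes every one of the 28 face and corner assertions:

```python
def is_suppressed_shell(x: float, y: float, z: float) -> bool:
    # Leftover from the old nested-cube design: only the 625-1250
    # shell check survived the cleanup. Every face and corner probe
    # above lands in the 1249-1251 band, so all 28 assertions pass -
    # but Sol itself (d < 625) is no longer suppressed.
    d = max(abs(x), abs(y), abs(z))
    if 625 < d <= 1250:
        return True
    return False
```

A single extra probe at the origin - `assert is_suppressed(0, 0, 0)` - would catch this particular leftover, but only if someone thought to write it.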
Coverage and branching metrics don't help here: depending on how the function is written internally, you can get total branch coverage with somewhere between two and seven tests. That's nowhere near enough to guarantee that it's branching the right way for every input set.
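Against the shell version above, for instance, two probes hit both branches of its one conditional, so a branch-coverage tool will report 100%:

```python
def test_shell_coverage():
    # Takes the True branch ...
    assert is_suppressed_shell(1000, 0, 0)
    # ... and the False branch: 100% branch coverage, yet the hole
    # at Sol (0, 0, 0) is never probed.
    assert not is_suppressed_shell(2000, 0, 0)
```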
So ... how many more tests? There are very roughly 10^18 possible 3D coordinates this function needs to work on. We're not going to write 10^18 assertions to make sure that each one individually behaves, but we probably need well over 100 to cover even just the plausible shapes which are not an axis-aligned cube (and then we need further tests to check that each call site is indeed passing in Sol-centred coordinates, in the right order and units, rather than some other ints it had lying around).
but in many cases complex algorithms require a lot more thought and effort put into writing them.
Yes, but by the "make everything unit-testable" principle that complex algorithm should be decomposed into a series of simpler algorithms, each of which is independently tested, and then the recomposition of them is tested too. So the number of unit tests required is also much higher, because the composed algorithm has many more boundaries than a simple cuboid.
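Even the trivial cube check shows this: decompose it into a per-axis helper plus a recomposition, and each layer brings its own boundaries to test (a sketch, reusing the hypothetical interface from above):

```python
def within_half_side(coord: float, half_side: float = 1250) -> bool:
    # Independently testable piece: a single-axis boundary check.
    return abs(coord) <= half_side

def is_suppressed_composed(x: float, y: float, z: float) -> bool:
    # The recomposition needs its own tests too: swapping `and` for
    # `or` here recreates the suppression cross even though every
    # per-axis unit test still passes.
    return (within_half_side(x) and within_half_side(y)
            and within_half_side(z))
```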
As noted above, given that you can't possibly test every single combination of input values, you have to actually think about the tests you're writing so that each assertion rules out as many wrong implementations as possible. That also requires thought and effort.
Complex algorithms - once you get beyond testing single lines of code - can also have emergent bugs despite all their unit components working correctly ... as we see with the blue star cubes:
- the generation of stars in a cubic octree is entirely deliberate
- the variation of density between adjacent cubes is intentional and necessary
- the visibility of individual stars has been picked to replicate real visibility ranges
...and yet from certain view positions and directions you get an obvious pattern which looks wrong to human eyes. "It shouldn't look like that" isn't a formal spec that can be converted into unit tests, though. Maybe you can define some edge-detection algorithm and say that for no position and rotation should this algorithm ever find an edge with a strength greater than X ... but
how many millions of positions and directions have explorers looked at the galaxy from by now? And how well debugged is that edge-detection algorithm itself?
Meanwhile, unit tests are very simple and fast to write (and that's their very purpose; if you find yourself spending a lot of time writing one unit test, you are doing it wrong).
Indeed, but that doesn't make them immune to the sort of typo that creates the suppression cross - more often than not, when I write a test and the test fails, it's because I wrote the test wrong: simple mistakes like putting two parameters in the wrong order, or defining an input object wrongly for the behaviour I wanted to test. Worse: sometimes the test passes anyway. In the example above, what if I typo the test that checks that (1251, 0, 0) is outside the cube and put (2151, 0, 0) instead? Single-character transpositions like that are trivial to make, especially if you're writing "easy" code like unit tests and therefore not paying full attention - and in this case the test will pass but isn't actually testing the right thing, so future code changes can result in an undetected regression.
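Concretely, the transposed test still passes against the hypothetical cube above, so nothing flags it:

```python
def test_outside_positive_x_face():
    # Intended probe was (1251, 0, 0); a one-character transposition
    # gives (2151, 0, 0), which is also outside the cube, so the test
    # passes - but the +X face boundary is now unguarded, and a later
    # regression that grows the cube to, say, half-side 1500 would
    # slip straight past it.
    assert not is_suppressed(2151, 0, 0)
```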
Yes, they're individually very simple, but they're still code and still need debugging and reviewing and documenting in their own right.