The ancient automated pathfinder-stations

My fellow travelers and explorers. I've told you about my journey into the depths of the Universal Cartographics headquarters and how I found old tomes. The latter allowed me to find the ways of The Ancients through the emptiness.

However, that was not all I found. I also happened to come across an interface which allows one to interact with an equally ancient network of still-functioning automated stations. The origin of these stations is unknown to me, but their presence correlates with the systems known to have been visited by the legendary explorers of the past.

These stations must be tiny or very well hidden, since they don't show up on our scanners.

But why is this of interest to us explorers and travelers?

Well, it seems that these stations cannot communicate directly with each other. To send a message from station A to another station Z, probes with a given (adjustable) jump range are sent out to the stars within reach of said probes.
If another station exists in the system, the message is multiplied and the whole process is repeated until station Z is reached.

This, however, is very similar to how we travel among the stars, utilizing the Frame Shift Drive!

The probe which finally reaches the station of interest contains the names of the systems through which the message was relayed between stations A and Z. This information can be used to find a way through regions with extremely low star densities.

How could this be of use to us, you may ask? Well, the jump range of these probes can be adjusted! A user can program the probes to the jump range(s) of her or his ship.

Thus, everybody will be able to plot a course through the void even when our own onboard computers no longer can because the star density is too low.
And this will take away a lot of the fear of getting stranded in the vast emptiness between the spiral arms of our beloved galaxy.

Like last time, I've made this information available to all of you because I hope it will help you decrease the number of jumps needed to reach your goal.

Commanders, I remain until next time … o7
 
Hello again

Talking to the Fuel Rats some days ago motivated me enough to finally finish this project so that it can be presented to the community.
I play on PS4 and thus cannot contribute to EDSM in a good way (yes, I use the console updater, and learning about it was one of last year's great moments).

The above post was just some roleplaying within non-official lore.

This post is a bit more technical; I try to explain how the software that allows me to find a path in low-star-density regions works.

Just in case: the program can be found on GitHub under its name, gap_jumper.

My original problem had several aspects.
  • Wanting to cross from Outer Arm Vacuus to the Perseus Arm with as little "flying backwards" as possible. (I totally failed this task when I circumnavigated the galaxy.)
  • Being low on Jumponium and thus fearing to get stranded.
  • The amazing edastro.com shows clearly that travelers have crossed this gap. However, the resolution is not good enough to figure out which stars these are, and the map is two-dimensional anyway. No criticism! I love edastro and it helped me a lot to find my way around the galaxy :) .
  • In addition, I could not find detailed information about possible crossings (with some exceptions) at EDSM (or other sources). But this is important for estimating how much Jumponium is actually needed if a given route is traveled.
  • All the programs I could find here seemed unsuitable for the task at hand (or were for an operating system I don't have). But I may be wrong about this.

Solution:
The EDSM database has the names and coordinates of all reported systems. Thus I should be able to find a way between a given start and end point.

And this is what my program is about.


So, how does it work?

Well, first of all, the approximate start and end coordinates are used to define the line between the two.

This line is the axis of a cylinder with a diameter of 500 lightyears (adjustable). The program then finds all (reported) systems within this cylinder. This takes some time.
I provide two mechanisms. The first uses the EDSM API; while I loved programming that, it sends a lot of requests and is in the end slower than the second method.
Method two requires downloading the "systemsWithCoordinates.json" database and works on that file. It still needs some time to go through the many million reported systems to find the relevant ones.
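
For the curious, here is a minimal sketch of how such a cylinder test can look. This is not the actual gap_jumper code: the one-system-per-line reading below matches the layout of the EDSM dump, but the start/end coordinates and the radius are made-up placeholders.

import json

def within_cylinder(p, a, b, radius):
    """Return True if point p lies inside the cylinder of the given radius
    whose axis runs from a to b (all points are (x, y, z) tuples)."""
    axis = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    # Project p onto the axis; t between 0 and 1 means between the end caps.
    t = sum(ap[i] * axis[i] for i in range(3)) / sum(c * c for c in axis)
    if not 0.0 <= t <= 1.0:
        return False
    closest = [a[i] + t * axis[i] for i in range(3)]
    return sum((p[i] - closest[i]) ** 2 for i in range(3)) <= radius ** 2

# Placeholder coordinates; radius 250 ly gives the 500 ly diameter from above.
start, end, radius = (-1820.0, 40.0, 7500.0), (-2900.0, 150.0, 9300.0), 250.0

relevant = []
with open("systemsWithCoordinates.json") as f:
    for line in f:                      # the dump holds one system per line
        line = line.strip().rstrip(",")
        if not line.startswith("{"):
            continue                    # skip the enclosing [ and ] lines
        system = json.loads(line)
        c = system["coords"]
        if within_cylinder((c["x"], c["y"], c["z"]), start, end, radius):
            relevant.append(system)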

A big thank you to zibadian from the Fuel Rats (I think) for discussing some methods in connection with my program during the above-mentioned chat, and for making me aware of the EDSM API. I wish I could have stayed longer, but I had to leave quite suddenly.

Anyway, after finding the systems in the region of interest, the pathfinding algorithm starts. This is what the rest of this post is about.


So, I haven't studied computer science, but naturally I first thought of Dijkstra's algorithm to find the shortest path.

I had a nice implementation (and I didn't care about running time), but then I realized that I was not able to implement boosted jumps for the case that ALL possible distances at a given location were longer than the regular jump range. Or well, I had ideas how to do it, but none of them was elegant: they required information about the whole state of the system of reachable stars at all times, or would have needed a re-calculation of everything. That was unsatisfactory.


This task was (for me) so difficult because I wanted to use boosted jumps only as a very last resort. So I wanted to punish boosting, since Jumponium is so rare.


Fortunately, I am a physicist. So after two days of thinking (not in sum, since I also have other stuff to do) about how to make Dijkstra's algorithm work, I thought of a very different approach. And this approach was electrical current!

An electrical current ALWAYS finds a way through a network of conductors AND it will always find the way of least resistance.

What does that mean for our case?

Well, jumps that require boosting are equivalent to having a higher resistance. And the higher the boost grade needed, the higher the resistance. A current will take the path with the higher resistance, but ONLY if it can NOT find another way. The latter may mean going backwards, but this was totally acceptable for me since saving Jumponium had the highest priority.
Thus, an electrical current inherently "punishes" passages with higher resistance.

So, I came up with the idea of probes (being the current) traveling from station to station (being the network of conductors). In the code I call them Jumpers and Nodes.
All systems have one station and all stations are EMPTY in the beginning. The exception: the station at the start-coordinates contains a probe. The process of finding a path from start to end is now the following.

The program has a global parameter which tells if a jump from one station to another took place. This is the "resistance of the galaxy".

Each station that contains a probe checks if the stations within regular jump range are empty. If this is the case, a probe is sent to these stations. One such jump is enough to set the global parameter to True. If no jump took place in the whole cylinder of considered stars, this parameter is set to False. The latter can happen if stations contain no probe to send or all other stations are out of reach.

In the False case the "resistance" of the distances between the systems changes. This allows all stations to use probes with a higher jump range, meaning: all stations within grade 1 boosted jump range are now considered.

If a station within grade 1 boosted jump range is found, the global parameter is set to True, and the next jumps will be made with regular jump range again.
If still not a single station could send out a probe with grade 1 boosted jump range, the same process starts again, but now considering systems which need a grade 2 boosted jump to reach; and so on.
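
In condensed code, the whole loop looks roughly like the sketch below. This is not the actual Jumpers-and-Nodes implementation from gap_jumper (which tracks much more state), and the jump ranges are made-up example values.

# Index 0 = regular jump range, 1-3 = boost grades 1-3 (illustrative values).
JUMP_RANGES = [50.0, 62.5, 75.0, 100.0]   # lightyears

def distance(a, b):
    return sum((a[i] - b[i]) ** 2 for i in range(3)) ** 0.5

def find_path(nodes, start, goal):
    """nodes: dict of system name -> (x, y, z). Returns a path or None."""
    probes = {start: [start]}     # station -> path of the probe it holds
    grade = 0                     # currently allowed boost grade
    while goal not in probes:
        jump_took_place = False   # the global parameter described above
        for station in list(probes):
            for other, coords in nodes.items():
                if other in probes:
                    continue      # probes are only sent to EMPTY stations
                if distance(nodes[station], coords) <= JUMP_RANGES[grade]:
                    probes[other] = probes[station] + [other]
                    jump_took_place = True
        if jump_took_place:
            grade = 0             # back to the regular jump range
        elif grade < len(JUMP_RANGES) - 1:
            grade += 1            # nobody could jump: allow the next grade
        else:
            return None           # unreachable even with grade 3 boosts
    return probes[goal]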

This algorithm does several things I wanted it to do.
1.: It uses boosted jumps ONLY if NONE of the stations that contain jumpers can send out a probe with a regular jump.
2.: It does not necessarily find the route with the least number of jumps (the shortest route), but it does find a route close to it. The reason is that if a probe can be sent to a system at the edge of the regular jump range, then this is done. A probe does NOT need to visit all systems in between. Thus, the algorithm is by itself close to the best solution. I think it is a greedy algorithm, but I'm not sure, and I would not really know what kind of "greediness" it is.
3.: It goes backwards and up and down if necessary. Something the regular ED plotting algorithm seems not to do. This can be seen e.g. in the "loop" in Hugo's High Road.

Some more comments.
  • Instead of using Jumponium, jumps on fumes are of course considered first.
  • I have implemented a function which considers the available fuel and would not allow a jump to an unscoopable star if this would strand the probe there. However, this function is not really active, because the information whether a system contains a scoopable star is not available for all stars. Thus I consider all stars scoopable.
  • For the case just discussed I have implemented a feature which checks for nearby scoopable stars and tells the user to make a detour to fill up the tank. However, I've never seen it triggered, since I never ran into the described case. I constructed some weird test cases in which it worked, but this feature is to be considered as … well, experimental.
  • The order in which stations send out a probe is random. Thus, by running the algorithm several times, more efficient routes can be found. And this confuses me, because it shouldn't happen. I can only explain it by the algorithm being greedy and thus not always finding the global minimum. I tested and tested and checked the code, but could not find the reason why it does that. Thus my only explanation is that sometimes (due to the random order in which stations send out a probe) a series of fortunate events allows the algorithm to find the globally best path. If any reader can point me to the (logical) error in my code, I will be really thankful.
  • However, the algorithm is rather close to the optimum already (as greedy algorithms are in general). I ran several thousand tests for some routes, and the routes found during the first twenty tries were very close to the most efficient route found e.g. at try 666.
  • Saving grade 3 Jumponium is considered much more important than saving grade 2 Jumponium, which in turn is much more important than saving grade 1 Jumponium (see the small sketch after this list). Thus it is entirely possible that the program comes up with a route which uses more grade 1 boosts but fewer grade 2 or grade 3 boosts. However, during tests that never happened.
  • If the database is downloaded, the location of the file needs to be provided in the code.
  • The code is not very efficient, but I don't care. The main time-consuming process is looking up the information in said databases.
  • VERY IMPORTANT: DON'T USE THE PROGRAM IN REGIONS WITH HIGH STAR DENSITY! Or well, use it as you wish, but this will take forever and it will gobble up your memory.
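
Here is the small sketch announced in the list above: one simple way to express that saving grade 3 boosts outweighs everything else is a lexicographic comparison of routes. This is illustrative only, not the actual gap_jumper code.

def route_cost(route):
    """route: list of jumps, each a dict with its boost grade (0 = regular).
    Tuples compare element by element, so a single grade-3 boost outweighs
    any number of grade-2 boosts, and so on; jump count only breaks ties."""
    grades = [jump["boost"] for jump in route]
    return (grades.count(3), grades.count(2), grades.count(1), len(route))

candidate_routes = [
    [{"boost": 0}] * 40 + [{"boost": 2}],       # 41 jumps, one grade-2 boost
    [{"boost": 0}] * 55 + [{"boost": 1}] * 3,   # 58 jumps, three grade-1 boosts
]
# Picks the longer route: three grade-1 boosts beat one grade-2 boost.
print(min(candidate_routes, key=route_cost) is candidate_routes[1])   # True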


I don't provide a GUI, and all the necessary parameters need to be changed in the code, because so far I am the only one who actually uses the program. But if other commanders show interest, I may program a GUI. I may even provide an executable for a non-GNU/Linux operating system. However, if you run the code directly (it is written in Python 3), it should already work under all operating systems.
Edit 2020-01-21: I've provided a GUI and an executable that runs under Windows.

This is free software (like in freedom of speech) and the code is published under the GPL version 3. It is also free of charge (like in free beer).
So if anybody wants to implement it into their existing programs, feel free to do as you wish (as long as you obey the license -- don't jail my babies!)

If you have suggestions how to make things better, or if you have an idea how to get boosting into Dijkstra's algorithm, I would really like to read about it.


And if anybody is interested in a discussion, I would like to read your opinions regarding the following:
  • Will this take away some dangers of exploration? Will this break the spirit of exploration? If yes, why then have a Galactic Mapping Project?
  • Is this considered cheating? If yes, why then have a Galactic Mapping Project?

Fly safe, commanders
 
I've only just become aware of this extremely useful tool - what a great write up and explanation of how it works!

I am just on my way home for a re-fit and to attempt getting the Guardian FSD booster after nearly a year out in the deep, so I won't be using it quite yet, but my next voyage will be a longer one to some of the sparser regions, so this is going to get fitted "as standard" to my navigation suite :) It would have been handy when I was picking my way to a distant open cluster that, as far as I could work out, has only about two ways in and out of it!

And as for whether it is in the "spirit" of exploration: of course it is, in my opinion. No explorer of Earth deliberately set out with less information and fewer tools than were available to them at the time, and I see no reason not to make use of what we already know about the galaxy as it is - besides, there is rather a lot more of it left to discover :D
 
Interesting. I'll have to take a look at this later, see what results it produces. Thanks a bunch for sharing!

Method two requires to download the "systemsWithCoordinates7days.json" database and will work on that file.
Minor note: that one's not the full database dump, just the new ones added in the last seven days. So if someone wants to run this on their own computers, they'll need "https://www.edsm.net/dump/systemsWithCoordinates.json" instead. (Note: to conserve your and EDSM's bandwidth, I'd recommend passing an "accept-encoding: gzip" header so that you download a compressed version instead.)
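
In Python, such a download could look like the following sketch (the requests library decompresses the gzip transfer encoding on the fly, so the file written to disk is the plain JSON dump):

import requests

url = "https://www.edsm.net/dump/systemsWithCoordinates.json"
headers = {"Accept-Encoding": "gzip"}   # ask for a compressed transfer

with requests.get(url, headers=headers, stream=True, timeout=60) as r:
    r.raise_for_status()
    with open("systemsWithCoordinates.json", "wb") as f:
        for chunk in r.iter_content(chunk_size=1 << 20):   # 1 MiB chunks
            f.write(chunk)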

As for some thinking this would be cheating... First, this isn't breaking the EULA. Second, it's not an unfair advantage, and not a competition either. Besides, this is all about using already-discovered systems, so you're literally following in others' footsteps. Personally, I think that finding your own way through somewhere is much more fun, but it's not like anyone's forced to use others' routes, be they manually transcribed (plenty of those out already, I even posted one, which was the first direct route shared that crossed the Outer Arm Vacuus directly South, way back when) or, now, automatically generated. If anything, it's good to have generated ones now, as there are more options. Besides, it's a fun experiment.

Now that I think about it, a fun thing to try would be using this to generate a route, then fly there and see if you can find new undiscovered stars that would enable a better one after they're added to the database.
 
Wow, somehow I missed this earlier. Pretty cool stuff! I was also going to point out that the 7days JSON won't be adequate, but marx already addressed that. Just note that the full dump is over 51 GB in size currently (uncompressed) and has over 35 million entries now.
 
Interesting […]
Thank you :)

Minor note: that one's not the full database dump, just the new ones added in the last seven days. So if someone wants to run this on their own computers, they'll need "https://www.edsm.net/dump/systemsWithCoordinates.json" instead. (Note: to conserve your and EDSM's bandwidth, I'd recommend passing an "accept-encoding: gzip" header so that you download a compressed version instead.)
Yes, of course! Thank you very much for pointing out this severe error in my post. I've corrected it at once.

Personally, I think that finding your own way through somewhere is much more fun, […]
In general I agree with you, but some circumstances (e.g. being low on Jumponium, or just checking if the ship would make it at all) require a look-up in the "old" maps.

Besides, it's a fun experiment.
This is exactly the reason why this tool exists at all. It bothered me that I couldn't find a direct way across while seeing that there were plenty of systems discovered in the gap, as shown by edastro. And then it was an experiment (and a lot of fun) for me to see whether I was able to program something that can find a path within given parameters.
And this is what ED is for me, too. It motivates me to do stuff in RL, be it reading about magnetic white dwarfs or building tools :) .

Now that I think about it, a fun thing to try would be using this to generate a route, then fly there and see if you can find new undiscovered stars that would enable a better one after they're added to the database.
I so much hope some CMDRs will do that.

But I can just re-iterate my warning: Don't use this tool all the way from Sol to Beagle Point ;) . It is certainly NOT memory optimized.

Wow, somehow I missed this earlier. Pretty cool stuff! I was also going to point out that the 7days JSON won't be adequate, but marx already addressed that. Just note that the full dump is over 51 GB in size currently (uncompressed) and has over 35 million entries now.
Thank you for the compliment :)
But the 51 GB file is not needed. Originally I used it to try to figure out whether a star is scoopable when the information was not given in the "systemsWithCoordinates" file. But the search in the large file slowed the whole pathfinding process down unnecessarily (a lot), without really improving the results. So I scrapped this part in the end.

But I use the "bodies.json" file for statistics. Right now I'm trying to visualize if there are sectors which have a higher Ammonia- / Earth-like- / water-world density. Because when I circumnavigated the galaxy I discovered almost 50 Ammonia worlds, while the whole trip Sol/BP/Sol yielded just three. So I wondered if they are more common in the outer rim.

One side note: this tool has already helped a fellow commander find his way back to the DW2 fleet after getting stranded in a low-density star region (high) above the galactic zero plane :)

Anyway, I'm open to suggestions if others think there is room for improvement.
 
But the 51 GB file is not needed. Originally I used it to try to figure out whether a star is scoopable when the information was not given in the "systemsWithCoordinates" file. But the search in the large file slowed the whole pathfinding process down unnecessarily (a lot), without really improving the results. So I scrapped this part in the end.


Yes, you're right. I had a brain-fart there, and looked at the wrong file. The full systems dump is less than 5 GB currently. :)
 
Right now I'm trying to visualize if there are sectors which have a higher Ammonia- / Earth-like- / water-world density. Because when I circumnavigated the galaxy I discovered almost 50 Ammonia worlds, while the whole trip Sol/BP/Sol yielded just three. So I wondered if they are more common in the outer rim.
I looked into this a while ago, so this might help you. I'll run the numbers again at the end of May, one year after the last import.
There was also a different one focusing on ringed Earth-likes, see here.

There's one big problem with the sparse sectors of the rim though: there aren't enough systems out there to give a large enough sample size. Based on metallicity surveys and such, we strongly suspect that boxels tend to use their own generation characteristics, but whether this would extend to more (or less) ELWs out on the edges is pretty much impossible to tell, what with so few systems in a boxel. However, it might give the appearance of more finds simply because it's easier to find better candidate systems there - mass code D, main sequence. What filters did you use on your trip to BP and back, and how many jumps was it?
 
What filters did you use on your trip to BP and back, and how many jumps was it?
I don't filter, I fly in as straight a line as possible. Sometimes that keeps me in the badlands for a hundred jumps (or so), and I'm fine with the occasional plotting to a scoopable star.

I haven't noted how many jumps it was from BP to the bubble last time. It must have been around 2000. I also had the impression that the Ammonia world density was highest along the edge of Outer Arm Vacuus.

I'm also not a planet hunter (ELWs, Ammonia worlds and so on). If I come across a special one I'm happy: I take a picture, map it and continue. So with the recent "discussion" about what a "real" explorer is, I am probably more of a traveler. The exploration aspects happen almost by accident.

And do I see it correctly that you provide a list of ELWs for each sector? Don't misunderstand me, it's a great analysis. But I visualized it in a nice false-colour map. Unfortunately I can't figure out how to upload pictures here without using imgur or another such service.
Otherwise I agree with you: for many sectors the known systems are too few. However, 50 discovered systems (with or without planets of interest) seem to be good enough for a somewhat meaningful statistic.
 
And do I see it correctly that you provide a list of ELWs for each sector? Don't misunderstand me, it's a great analysis. But I visualized it in a nice false-colour map. Unfortunately I can't figure out how to upload pictures here without using imgur or another such service.
Otherwise I agree with you: for many sectors the known systems are too few. However, 50 discovered systems (with or without planets of interest) seem to be good enough for a somewhat meaningful statistic.
First off: you can't upload pictures to the forum anymore, so you do need to host them elsewhere and link them here.

You can see some nice maps on EDAstro.com now; back when I made that analysis, I was more concerned if there were significant differences between sectors. For that, a visual map doesn't really work (I take it you'd mark visited systems and ELWs discovered as the two colours?), since you need a count of how many systems have been discovered.

Fifty discovered systems per sector? No offense, but that would be far too few. EDSM currently lists ~138,000 ELWs in 32.6 million systems, so if that's the kind of minimal chance we're talking about, even fifty discovered systems per boxel (let alone sector) would be far too few. For a somewhat representative sample, you'd need visited systems at least on the scale of tens of thousands, say, at least 50,000 - and the problem is that the low-density sectors don't even have that many.
That is, if the aim is to determine whether there's any difference between sectors. Or, to put it another way: would a system of the same kind have better chances of having an ELW in one sector or in another? It's an interesting one, since we do know that metallicity isn't uniform in the Forge's galaxy either, but how big of a difference (in ELWs) would that translate to?
Doesn't look like a lot though.
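
To put the sample-size argument into numbers, here is a quick back-of-the-envelope calculation with the figures above (simple binomial counting statistics):

# ~138,000 ELWs in ~32.6 million known systems -> p is roughly 0.42 %.
p = 138_000 / 32_600_000

for n in (50, 5_000, 50_000):
    expected = n * p                      # expected ELWs in the sample
    stderr = (n * p * (1 - p)) ** 0.5     # binomial standard deviation
    print(f"n={n:>6}: {expected:6.1f} ELWs expected, +/- {stderr:4.1f} "
          f"({stderr / expected:.0%} relative error)")

# n=    50:    0.2 ELWs expected, +/-  0.5 (217% relative error)
# n=  5000:   21.2 ELWs expected, +/-  4.6 (22% relative error)
# n= 50000:  211.7 ELWs expected, +/- 14.5 (7% relative error)

With fifty systems you mostly measure noise; only at tens of thousands does the relative error drop to a usable level.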

You did say that you didn't use filters, so the reason why you found more ELWs near the edges is likely that you've gone through more of the better candidate systems there. The distribution of system types is "better" there, after all. (EDAstro's mass code maps show this very well, too.)
 
First off: you can't upload pictures to the forum anymore, so you do need to host them elsewhere and link them here.
Thank you for clarifying my suspicion :(

You can see some nice maps on EDAstro.com now; back when I made that analysis, I was more concerned if there were significant differences between sectors. For that, a visual map doesn't really work (I take it you'd mark visited systems and ELWs discovered as the two colours?), since you need a count of how many systems have been discovered.

No, I meant e.g. blue for sectors with a low ELW density, green for a "mid-range" ELW density and red for sectors with a high ELW density. Something like that (just static, of course).

Fifty discovered systems per sector? No offense, but that would be far too few. EDSM currently lists ~138,000 ELWs in 32.6 million systems, so if that's the kind of minimal chance we're talking about, even fifty discovered systems per boxel (let alone sector) would be far too few.

I know, and I have thought of that; this limit is certainly a shortcoming. But …
- I assume that on average all sectors are (mostly) the same, except for the number of stars. I know they are not, and figuring that out was one of the goals, which brings me to …
- Any differences are NOT super huge and NOT obvious (like the AND/OR logic error in the stellar forge, leading to the cross in the edastro maps for some stars).
- I also assume that the exploration data is (more or less) evenly distributed, with no large gaps in the star maps (edastro supports this assumption).
- The last assumption is that ELWs, AWs and WWs are actually scanned by most explorers.
- Basically this is Occam's Razor paired with the assumption that all observables follow (somewhat) a normal distribution.

I knew about the 0.4 % chance of finding an ELW. But I also had to deal with the fact that I simply don't have that many discovered systems everywhere (as you said, too). Thus I needed a (much) lower limit. And I was a bit surprised myself that 50 actually gives (somewhat) meaningful results.

Yes, any "hotspot" might be due to one accidentally discovered ELW in a sector in which just 51 systems were visited at all. That happens. But these would be singular red squares in the map. However, if in a part of the galaxy consisting of 8 x 8 sectors a cluster of green-squares shows up (so several sectors with "middle high" ELW density) compared to the rest of the galaxy being blue (or black if no ELW is in the sector), then this _may_ be something.

So I agree with you: with the 50-discovered-systems limit the statistics are not good enough, and I don't claim that any results are the truth. Even ELW/AW clusters may very well (and likely do) still be statistical flukes. But given the circumstances (the low number of systems discovered at all in the outer regions) I couldn't come up with a better solution. You should probably not ask how many experiments contributed to my PhD thesis … that was when I stopped worrying about too little data and learned to love working with what I had ;)

Also, if a cluster of Ammonia worlds does indeed show up in the data (and it does), this may be of the highest interest, considering that Thargoids may come from Ammonia worlds. So a dedicated expedition into these sectors may bring back better data … and most likely make the statistical fluke disappear.

[…] you've gone through more of the better candidate systems there. The distribution of system types is "better" there, after all. (EDAstro's mass code maps show this very well, too.)

I just started with ED in May last year. Do you have a link where I can read about mass codes and why mass code D has a higher chance of having an ELW? What about AWs?
 
There is a current thread with some excellent stats analysis in it here: https://forums.frontier.co.uk/showthread.php/479605-ELW-Frustration

and Marx's guide for finding ELWs is a great description of mass codes and ELW distribution: https://forums.frontier.co.uk/showthread.php/339164-Marx-s-guide-to-finding-Earth-like-worlds

And if you want to delve into the deep workings of the mass codes and much more, do have a read of Jackie Silver's treatise here: https://forums.frontier.co.uk/showthread.php/196297-RV-Sonnenkreis-Decoding-Universal-Cartographics
 
Hm, you know, we're talking about 50 systems per sector here, but the reality is that even a year ago, most sectors had far more than that discovered. Of course, there aren't a whole lot with enough systems discovered for sample sizes that would be good enough for conclusions, but if one is just looking for hints that some places might be worth looking at in more detail, then why not?

Oh. I just realised I forgot something. When I made that sheet, I deliberately made it flexible enough that you can easily modify the thresholds. It's easy enough to make a local copy and set the required systems from 40,000 to 50. Of course, the threshold was there so that sectors with very few systems don't throw off the displayed ratios - for example, 1 ELW in 5 systems total. If you modify that and take a look at the ratios then, you'll see that all the high-value "hotspots" are those that only have a few hundred systems in total. Also, the Zunou sector jumps out, but that's because three people have been farming ELWs there, and there was nothing of interest to other explorers there. Area surveys and such things can skew things, too.

The last assumption is that ELWs, AWs and WWs are actually scanned by most explorers.
For AWs, that certainly wasn't true. Before the FSS handed them on a silver platter to explorers, not only did you have to fly to them, but you also had to learn to recognize them. Little wonder then that the majority of them went unscanned (for EDSM). Many people didn't bother - it's not like the pay was that good, either. Now, since the introduction of the "I win" button, the rate at which new AWs are uploaded has pretty much exploded. So while you can safely assume now that AWs are scanned by most explorers, it certainly wasn't the case for a long while. That's something to consider as well - for ammonia worlds at least, you might want to cut down your samples to systems discovered after Chapter Four, but that's a huge cut.

Also, if a cluster of Ammonia worlds does indeed show up in the data (and it does), this may be of the highest interest, considering that Thargoids may come from Ammonia worlds. So a dedicated expedition into these sectors may bring back better data … and most likely make the statistical fluke disappear.
I haven't seen any clusters of ammonia worlds, beyond statistical clumping, but I didn't look at them all that much.

In any case, come the end of May, after DW2 reaches Beagle Point and is over (back to the return expedition), I'll update things with the new data. I'll be curious if anything new appears.

I just started with ED in May last year. Do you have a link where I can read about mass codes and why mass code D has a higher chance of having an ELW? What about AWs?
You already have the links now, so just a bit more about why D has a higher chance to have an ELW: it's mostly because the higher mass codes tend to have stars that are either too luminous for favourable ELW conditions (B, O) or are outside the main sequence and quite bad for them. The only exception being neutron stars there. Normally, you'd expect that more system mass would be better, and that's true up till a point - but then the habitable zones not only get wider (good) but farther away from the star (bad).
Now, if you take a look at the mass code distribution for ELWs, 60.87% were at code D (in 2018 May), which dropped to 1.46% at code E, then 0.05% at F, 0.01% at G, and 0.08% at H. The highest mass code does better because it can have all kinds of systems, and the overwhelming majority (56 of 58 total) of ELWs were ELMs there - which are affected by a temperature bug, making their surface temperatures considerably less than they should be. When it comes to "hot" systems, that helps. It's also probably the reason why ELMs are more abundant in code E (7.8% of ELWs) than in code D (1.8%).

System main star types and mass codes are also very closely tied together, at least in the main sequence. I don't have the numbers handy, but I remember something like 95% of F stars being code D, maybe more. The one exception was G stars, which were roughly evenly split between codes C and D. So there's also that if you filter your route plotter to F (for example), you can be fairly sure that nearly all the systems you're going to jump into will be code D.

The situation is remarkably similar with ammonia worlds as well. See the data here. The only difference there is that there is no AM contrast between mass codes D and E, and overall, mass codes C and D are closer together.
 
Thank you for your replies and the links. Some interesting reads.

I'm a bit occupied atm but hope to be able to look at the ELW / AW analysis again, soon.
 
Continued from here.

[…] breadth-first search of the node graph taking place, except that you're iterating in order over increasingly expensive jump boosts. I suspect it could go faster using a cost heuristic like in A*.
"cost heuristic like in A*" o_O … I'm not a computer scientist but a physicist. When approaching problems, my thinking is very much biased by that. So I'll be happy if you would elaborate on that (or post a link) because I don't know what you are talking about.
That means of course also that I often don't know the proper expressions for a given task/solution/method.
Hence, also: what is breadth first search? I think I know what you mean with "node graph".

Cost heuristic, is this something like Orvidius describes here?
If yes, I had thought about that for quite a while. But in the end my goal was to be 100 % sure that no non-Jumponium path exists, even if that would mean flying a _huge_ detour, or even going backwards, before boosting. And I didn't think I could make that work with a "weighted distance" or something. Saving Jumponium was paramount for me at that time. But I'm open to (and will be happy about) suggestions :).

Neutron boosting is also not properly implemented, since this would mean going through EDSM's bodies.json file, as this information is not part of the systemsWithCoordinates.json file. I had something like this and it needs a lot of time to go through that file.
However, pseudo-neutron boosting can be achieved by simply setting the last jump distance to the neutron-boosted distance. The user then just has to find a neutron star close to the jump point.

Talking about getting the information: I guess that would be several orders of magnitude faster if a proper database were used.
However, while I've started to learn a bit about that, I didn't know anything about it when writing said program.
I also wanted this program to be usable without the precondition that a user has to set up a database, so that she or he just needs to download the proper json file from EDSM.

Regarding the tube around the line from point A to point B: I thought that was the logical thing to do. But it sounds as if another possibility exists. Would that include all stars in the galaxy?

These are bottlenecks (I think):
  • Gathering the data => solution: a proper database, with the drawback that this is additional "infrastructure" that needs to be set up by the user.
  • The setup of the nodes. Since all distances to all stars need to be calculated, this takes a lot of time if many stars are to be considered. I just sped that one up by a factor of 20 by doing it just for stars within a cube with twice the max jump distance as side length (see the sketch below this list). Six simple comparisons are apparently that much faster than Pythagoras ;). Besides the fact that the whole nodes construct is as inefficient as it can be.
  • When actually finding the path I'm always checking all nodes for whether they have a traveler (or jumper, as I call them). If a node has a jumper, I always check if it can continue its travels (for a given jump range). So this is always checking everything, which needs a lot of time. However, I've implemented it this way to be able to find a way in the following situation: consider a long area of close stars which suddenly stops at a "gap" (like a cape). The gap can't be crossed, except close to the start with a boosted jump. This however will not be considered until all of the "cape" is traveled; thus I need to always check all nodes. First traveling the whole "cape" is also intentional, since boosted jumps shall be considered only as a last resort (as mentioned earlier). Or in the terms of the example: the "cape" could have a bridge at its very end, making the use of Jumponium unnecessary.
  • And then there are probably a thousand places where a different structure in the code, or in how the data is stored, would speed things up a lot. Not to mention the whole algorithm itself.
The first point needs the most time if the star density is low. However, it always needs the same amount of time, so with increasing star density the second and increasingly the third point take up more of the time.
However, the program was meant to be used when something like a couple of hundred up to a few thousand stars are to be considered. In that case the setup of the nodes and the search algorithm need a bearable amount of time. It doesn't really matter if something takes 20 seconds or two. With tens of thousands of stars it takes hours, though.
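
Here is the sketch mentioned in the list above: the cube pre-filter rejects most candidates with six cheap comparisons before the exact distance check. Names and values are illustrative, not the actual gap_jumper code.

MAX_JUMP = 100.0    # illustrative; twice this is the cube's side length

def maybe_in_range(a, b, limit=MAX_JUMP):
    """Cheap cube test: six comparisons instead of Pythagoras."""
    return (abs(a[0] - b[0]) <= limit and
            abs(a[1] - b[1]) <= limit and
            abs(a[2] - b[2]) <= limit)

def in_range(a, b, limit=MAX_JUMP):
    """Exact test, run only on the survivors of the cube test."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 +
            (a[2] - b[2]) ** 2) <= limit * limit

here = (0.0, 0.0, 0.0)
stars = [(30.0, 40.0, 0.0),    # inside the cube AND in range (distance 50)
         (90.0, 90.0, 90.0),   # inside the cube but too far (distance ~156)
         (200.0, 0.0, 0.0)]    # rejected by the cube test right away
neighbours = [s for s in stars if maybe_in_range(here, s) and in_range(here, s)]
print(neighbours)              # [(30.0, 40.0, 0.0)]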

Please ask if you have any questions. And yes, you can submit changes through GitHub. That will be a great opportunity for me to finally learn how that works.

And also know that I will be very happy reading your thoughts. I'm not working as a professional programmer. Hence, there is a lot I can learn and I don't really have the opportunity to talk about such things in RL.
For example, after years of programming in Python I just learned to like something as fundamental as < set() >. And that even though I have used it before. So please feel free to express your thoughts :)
 
I'm not a computer scientist but a physicist. When approaching problems, my thinking is very much biased by that.
Respect, I'm not trained as a computer scientist either. But I am an astronomer who ended up specializing in helping other scientists do effective scientific computing, so this is up my alley.

Anyway, to answer some questions:

"cost heuristic like in A*" o_O

Hence, also: what is breadth-first search? I think I know what you mean by "node graph".

Cost heuristic, is this something like Orvidius describes here?
If yes, I had thought about that for quite a while. But in the end my goal was to be 100 % sure that no non-Jumponium path exists, even if that would mean flying a _huge_ detour, or even going backwards, before boosting. And I didn't think I could make that work with a "weighted distance" or something. Saving Jumponium was paramount for me at that time. But I'm open to (and will be happy about) suggestions :).
By "node graph" I mean the (abstract) data structure consisting of nodes (stars in this case) and edges (possible jumps between stars). In your case you don't explicitly compute the edges of the graph, but that is implicitly what you are doing when you make a list of the nodes in range for a given jumper. "Breadth first" means that the search goes wide before deep - i.e. it explores all places a jumper can reach from a given star, then all places it can reach from those places, etc. The opposite would be depth first, which would be if you picked a direction and started jumping that way until you were stopped, then backed up and tried a new route. Generally depth-first works well if you have a way of guessing ahead of time a path that is likely to work out, whereas breadth-first guarantees thoroughly exploring the graph.

So Orvidius was talking about Dijkstra's algorithm, which is a breadth-first algorithm that picks the best route using a cost function - each possible route it finds gets a numerical score, and it picks the lowest. In physical path-finding, the cost function is usually just the number of steps in the route. But each step doesn't have to have the same cost, and Orvidius was suggesting giving the boosted jumps a high cost. There's no particular ratio you have to use; you could even set it so high that it won't pick a boosted jump until it has run out of other options.
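
As a sketch, Dijkstra with such a cost function fits in a few lines; the graph layout and the (deliberately huge) cost steps here are purely illustrative:

import heapq

# With cost steps this large, a boosted jump is effectively a last resort.
BOOST_COST = {0: 1, 1: 10**3, 2: 10**6, 3: 10**9}

def dijkstra(graph, start, goal):
    """graph: {node: [(neighbour, boost_grade), ...]}. Returns (cost, path)."""
    queue = [(0, start, [start])]
    best = {start: 0}
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        for neighbour, grade in graph.get(node, []):
            new_cost = cost + BOOST_COST[grade]
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour, path + [neighbour]))
    return None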

I mentioned A* because it is popular in games and other cases where speed is desired. It takes the idea of Dijkstra, and adds the concept of a heuristic function. This incorporates the fact that in real geometry (as opposed to abstract mathematical graphs) it is often possible to calculate how far you are from the destination. A* is a hybrid depth-breadth approach, in that it explores nodes that are closer to the destination first, since it is reasonable to assume that for most of the path you won't be going backwards.
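
A sketch of the heuristic part: if every jump costs 1 and the ship covers at most max_range lightyears per jump, then the straight-line distance divided by max_range never overestimates the remaining jumps - exactly the kind of admissible heuristic A* wants (again purely illustrative):

import heapq

def a_star(graph, coords, start, goal, max_range):
    """graph: {node: [neighbours]}, coords: {node: (x, y, z)}.
    The queue is ordered by cost so far plus the heuristic estimate."""
    def h(node):   # remaining straight-line distance, expressed in jumps
        d = sum((coords[node][i] - coords[goal][i]) ** 2
                for i in range(3)) ** 0.5
        return d / max_range

    queue = [(h(start), 0, start, [start])]
    best = {start: 0}
    while queue:
        _, cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            new_cost = cost + 1                    # one jump per edge
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost + h(neighbour), new_cost,
                                       neighbour, path + [neighbour]))
    return None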

Here's a lightweight introduction: https://medium.com/@nicholas.w.swift/easy-a-star-pathfinding-7e6689c7f7b2

Neutron boosting is also not properly implemented, since this would mean going through EDSM's bodies.json file, as this information is not part of the systemsWithCoordinates.json file. I had something like this and it needs a lot of time to go through that file.
Indeed, I went looking and there doesn't seem to be a dump anywhere of just the known neutron stars. That's too bad, there's only a couple million of them.

Talking about getting the information: I guess that would be several orders of magnitude faster if a proper database were used.
However, while I've started to learn a bit about that, I didn't know anything about it when writing said program.
I also wanted this program to be usable without the precondition that a user has to set up a database, so that she or he just needs to download the proper json file from EDSM.
It wouldn't necessarily be faster; for small numbers of stars it's quick enough to parse the data. It is certainly slow to parse the entire multi-GB json file, so it could make sense to write that to a more efficient representation if the user will be using it frequently. But I think it will be more user-friendly to continue with the approach of fetching just the blocks of stars needed through the EDSM API, perhaps caching them on disk so they don't have to be re-fetched on later runs. There are plenty of ways to do that without setting up a database server, though.

For example, after years of programming in Python I just learned to like something as fundamental as < set() >. And that even though I have used it before. So please feel free to express your thoughts
Shall do, I'll be in touch as I learn more. Although speaking of fundamental Python features, I was surprised to see that you don't use Numpy at all. Any objection to that?
 
Indeed, I went looking and there doesn't seem to be a dump anywhere of just the known neutron stars. That's too bad, there's only a couple million of them.

I can help with that. Hot off the presses! :) To keep it simple, this is a list of neutron stars by name and EDSM system ID. The system ID can be used to match them to the star systems data:


(currently 48 MB, and 1,388,315 rows)
 
Thank you for the extensive answers :).

Breadth first - Depth first:
I thought for a while about how to do the latter. But in the end I always ended up with the "cape problem" as described above, and I concluded that I would need to do a full exploration of the graph (I hope I'm using these terms correctly) anyway.

Djikstra:
That was the first thing I thought about. I even had a working implementation already … and well, the rest of this story is described in the first post.

Frequent use:
I use the program regularly, but certainly not "a lot". It turned out that regularly downloading the data and updating a more efficient representation is not worth the hassle. It has a lot to do with the fact that I wrote this tool for use in regions with a low star density. I don't know if anybody is in such regions often enough. And the pilots that are probably don't need such a tool.

Using the EDSM API:
This _ is _ implemented. However, I have tested it and got the impression that EDSM allows just a couple of inquiries per user / hour / I-don't-know, and this limited the use so much that I decided to rather download the json file and do it offline. But you just need to set the < search_offline > parameter in gap_jumper.py to < False >.

Caching the data for later runs:
This is also implemented. The found stars (and also the created nodes) are saved on disk, but of course just for the cylinder of stars of your original route of interest.
However, you have to uncomment some lines and comment out some other lines in gap_jumper.py to use that file. But where to do this is clearly marked in the code.

I can help with that. Hot off the presses! :) To keep it simple, this is a list of neutron stars by name and EDSM system ID. The system ID can be used to match them to the star systems data:


(currently 48 MB, and 1,388,315 rows)

Woohoo! Thank you very much. I will think about how to implement neutron jumps now.

Question: Is there any interest in me providing some kind of user interface? It would just be text input on the console, but it would remove the need to change the parameters in the source code. I didn't do that before since I was the only user and thus didn't see the need for it.
 
Numpy:
Oehm … æhm … to my shame I don't know numpy well enough. I do work with it, but then it is usually for a rather specific problem and I never delve into it in depth.
In this specific case it is also true that, once I had the idea of a "current that finds a path", my brain was "locked in" and I didn't know how to do this in numpy.
Also, once I had this idea I was nerd-sniped and really wanted to see if I was actually able to implement this specific idea in code. … I was, and that made me happy, but there is likely a much better solution using numpy and real math to do all of that.
 