My suggestion is "The Great Collector" (TGC), which runs on a webserver somewhere. Basically it does the following. A volunteer enters distances for a star system using ONE of the tools created by a commander, and that tool calculates the coords. After that the tool sends distances and coords to The Great Collector. TGC then updates The One Reference (TOR). TOR could be in multiple formats: for example The Reference Format (TRF) of RedWizzard, a Trade Dangerous compatible CSV format, or the CSV format of wtbw. TOR is used by all tools to show a user that a system they wish to enter distances for has already been done.
Sounds good. Here's how I think the webservice (TGC) should work at a slightly higher level of detail (all of this IMHO, of course):
1. The main input should be distances between two systems. At least initially our frontend tools should only submit distances that are known to locate systems without error, i.e. at least 5 distances from the unknown system to reference stars with coordinates provided by FD that together generate correct coordinates (correct coordinates being ones that regenerate the same distances to 3 dp using the single precision calculation; there's a rough sketch of this check after the list). I do think it is worth collecting any distance data people are willing to submit (such as insufficient numbers of distances and distances to non-reference systems), but let's focus on what we need to actually locate systems first.
2. Submitting coordinates with the distances should be optional, and if they are submitted they should be checked by the TGC. Since it can check coordinates, it should. We're all getting the same results, so it doesn't really matter whose algorithm the TGC uses; the implementor can choose.
3. Input format should be a JSON block with the same structure as the output (there's an example after the list). It seems like the simplest structured format to use; XML, for example, would be overkill.
4. TGC should store two lists of systems: verified systems and unverified systems. At least initially, I think at least one of us should be confirming that a star exists before it is considered verified. This would simply involve opening the galaxy map, searching for the unverified system, and checking that the distance to it (from wherever you are right now) is consistent with the calculated coordinates. This process should give us a high level of confidence that the verified systems are accurate. How the TGC actually stores the data (DB/files/whatever) is fairly irrelevant and whoever implements it can choose.
5. Multiple output datasets should be offered: verified systems with coordinates, unverified systems with coordinates, verified systems with coordinates and distances, unverified systems with coordinates and distances, unattached distance data if we capture it, and perhaps static lists of reference systems. Tools that are only interested in the results don't need the distances, but those of us who've written tools to locate systems will want to be able to check that the TGC's output is correct, and for that we need the distances.
6. Output formats: JSON with the same structure as the input. The implementor can add anything else they want to, of course, e.g. TD CSV, but tool-specific formats can just as easily be generated by the tool itself, so the implementor shouldn't feel compelled to.
7. As the data volume increases it is likely that we will need to be able to fetch data by region: this should be considered in the design. Probably just allowing the fetcher to specify a system and a distance would be enough (e.g. "give me coordinates for all verified systems within 50 Ly of Sol"). I think it should be designed for caching as well: data requests should be URL-based (either query or hierarchy, it doesn't really matter) and the TGC should correctly identify unchanged data (both ideas are sketched after the list). If the bandwidth use gets high then we can do things like reducing the frequency of updates to the verified list (once a day or whatever).
8. Duplicate data: TGC definitely shouldn't throw any data away. If a duplicate system is submitted, the distances should be merged and the coordinates rechecked (see the merge sketch after the list). It might be worth keeping a count of how many times each distance has been reported too. If Sol - Ross 128 has been reported 3 times identically then we have a higher degree of confidence about that distance.
9. Bad data: any bad data should be logged and reported somehow. I've seen too many cases where data that looked bad actually turned out to be correct to just throw stuff away.
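To make points 1 and 2 concrete, here's a rough Python sketch of the "coordinates are correct" check. It's purely illustrative: the function name is made up, and numpy's float32 is standing in for the game's single-precision maths.

```python
# Hypothetical check for points 1 and 2: do the candidate coordinates
# regenerate every submitted distance to 3 dp in single precision?
import numpy as np

def regenerates_distances(candidate, references):
    """candidate: (x, y, z) of the unknown system.
    references: list of ((x, y, z), reported_distance) pairs, distances as
    read off the galaxy map (3 dp)."""
    cx, cy, cz = (np.float32(v) for v in candidate)
    for (rx, ry, rz), reported in references:
        dx = np.float32(rx) - cx
        dy = np.float32(ry) - cy
        dz = np.float32(rz) - cz
        calculated = np.float32(np.sqrt(dx * dx + dy * dy + dz * dz))
        # The galaxy map shows 3 dp, so compare at that precision.
        if round(float(calculated), 3) != round(float(reported), 3):
            return False
    return True
```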
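For points 3 and 6, this is roughly the shape of JSON block I have in mind; the same structure would be used for input and output. All field names and the URL are placeholders I made up, not a proposed final spec:

```python
# Hypothetical submission payload and POST helper for points 3 and 6.
import json
import urllib.request

payload = {
    "commander": "Example Cmdr",        # optional attribution
    "system": "Some Unknown System",
    "coordinates": {"x": -12.34375, "y": 5.6875, "z": 78.90625},  # optional
    "distances": [
        {"to": "Sol", "distance": 42.123},        # placeholder values
        {"to": "Ross 128", "distance": 39.456},
        # ... at least 5 distances to FD-supplied reference systems
    ],
}

def submit(data, url="http://tgc.example.com/api/submit"):
    """POST one submission to TGC and return its JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(data).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```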
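For point 7, a sketch of how a frontend might fetch a region and let TGC report unchanged data. The endpoint, query parameters, and the use of ETag / If-None-Match are all assumptions on my part:

```python
# Hypothetical region fetch with caching for point 7.
import json
import urllib.error
import urllib.parse
import urllib.request

def fetch_verified_near(system, radius_ly, etag=None,
                        base="http://tgc.example.com/api/verified"):
    """Return (data, etag); data is None if the cached copy is still valid."""
    query = urllib.parse.urlencode({"near": system, "radius": radius_ly})
    req = urllib.request.Request(base + "?" + query)
    if etag:
        # Ask TGC to answer 304 Not Modified if nothing has changed since
        # the last fetch, so the caller can reuse its cached copy.
        req.add_header("If-None-Match", etag)
    try:
        with urllib.request.urlopen(req) as resp:
            return json.load(resp), resp.headers.get("ETag")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return None, etag
        raise

# e.g. "give me coordinates for all verified systems within 50 Ly of Sol":
# data, etag = fetch_verified_near("Sol", 50)
```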
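And for point 8, a toy data model for merging duplicates while keeping a report count per distance (again, purely illustrative):

```python
# Toy merge-on-duplicate model for point 8: never discard a submission,
# fold it into what is already stored and count how often each distance
# has been reported.
from collections import defaultdict

class StoredSystem:
    def __init__(self, name):
        self.name = name
        self.coordinates = None
        # (other system, distance rounded to 3 dp) -> times reported
        self.distance_reports = defaultdict(int)

    def merge_submission(self, distances):
        """distances: iterable of (other_system, distance) pairs."""
        for other, dist in distances:
            self.distance_reports[(other, round(dist, 3))] += 1
        # After merging, the coordinates should be recomputed and rechecked
        # against the full distance set (not shown here).

# Three identical reports of Sol - Ross 128 give that distance more weight:
entry = StoredSystem("Ross 128")
for _ in range(3):
    entry.merge_submission([("Sol", 10.919)])   # placeholder distance value
```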
Additional considerations:
I think that additional data such as distance to stations, allegiance, economy, etc. should be stage 2. We'd want that in place by release (and ideally by gamma) though.
Do we want to capture unknown system names without distances or coordinates (which could then be used to prompt volunteers to get distance data)? I was thinking of a logbook tool which kept a record of where a commander flew. It could easily submit system names with no additional effort from the user. But is the value of that data worth bothering with?
TGC could provide some support for Snuble's plan to partition the space for searching, e.g. tracking who is searching which box, generating lists of known stars in each box, etc.