Discussion What is the most efficient way to crowdsource the 3D system coordinates

Biteketkergetek · Nov 5, 2014

Hmm, the captain's log would be a good idea even for my map.

It's web based but can be run locally. I could make it work a bit like a real GUI by hiding it in the notification area, and let it show up in the browser if the icon is clicked. But this would work for any web app, where the login can be made optional.

One benefit of this approach (if I'm not mistaken) is that even an installed app could use OAuth and send any data with valid source information. Submissions could even include a signature field using public key cryptography for added security, or just for the fun of it.

RedWizzard · Nov 6, 2014

Michael Brookes said:
Only if it's the system's name and it exists. Can you see the name in the map and not search for it?

That's correct. E.g. "Core Sys Sector HH-V b2-7" which is quite close to LHS 3549, but can't be searched for. I see Snuble posted some examples too.

I actually haven't found one that does work.

Iain M Norman · Nov 6, 2014

I've often failed to get search results, only to find some subtle difference in spacing or separators that I'd not noticed because of the font.

I'd love to know how the search works given the PG (procedurally generated) nature of the system names and coordinates. That sounds like a very interesting programming problem to solve.

Athan · Nov 6, 2014

Iain M Norman said:
I've often failed to get search results, only to find some subtle difference in spacing or separators that I'd not noticed because of the font.

I'd love to know how the search works given the PG (procedurally generated) nature of the system names and coordinates. That sounds like a very interesting programming problem to solve.

I'd wager there's some one-to-one relationship between the PG names and where that name is used. So when a star is generated its position determines its name, and when you search, the name tells the PG routines where the star will be so they can regenerate it as needed.

By the by, I've not yet had partial searches work at all, not even when just leaving the final letter off a name that otherwise works.

RedWizzard · Nov 7, 2014

Iain M Norman said:
I've often failed to get search results, only to find some subtle difference in spacing or separators that I'd not noticed because of the font.

I'd love to know how the search works given the PG (procedurally generated) nature of the system names and coordinates. That sounds like a very interesting programming problem to solve.

I'm pretty careful with searching - I've had to do a lot of it. Just to rule this possibility out though I visited "Yin Sector CL-Y d143" and made sure I had the name exactly as it appears in the netlog. Still can't search for it in the galaxy map. Have you managed to successfully search for any "...Sector..." systems?

In terms of implementation they'd obviously have to use an index for the names which have been specified. There are off-the-shelf solutions for that capable of handling massive datasets, e.g. Elasticsearch. For the Wredguia style names (now masked as "...Sector..." names in the beta 3 region), I believe the name encodes a fairly small area of the galaxy. This is apparent if you move more than a few dozen Ly: instead of "Wredguia..." you'll see something like "Plaa Eurk IW-Y B55-4" (near Alpha Cygni), or "Stuemeae FG-Y d15249" (near the galactic center). So those sorts of names are more like addresses.

RedWizzard · Nov 7, 2014

Athan said:
I'd wager there's some one-to-one relationship between the PG names and where that name is used. So when a star is generated its position determines its name, and when you search, the name tells the PG routines where the star will be so they can regenerate it as needed.

By the by, I've not yet had partial searches work at all, not even when just leaving the final letter off a name that otherwise works.

In FE2/FFE the names were randomly generated from a list of syllables. The PRNG was seeded with the sector and star number so the names were stable. But those games never allowed searching by star name. ED needs a reverse lookup too and, as you said, the obvious solution is to make the names encode enough information to find the system easily.
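
To illustrate the seeded-name idea described above, here is a minimal sketch. It is not the actual FE2/FFE algorithm: the syllable list, the seed mixing and the choice of PRNG (mulberry32) are all assumptions made up for the example. It only shows why seeding with the sector and star number gives stable names.

```javascript
// Hypothetical syllable list; the real games used their own.
const SYLLABLES = ["an", "ce", "di", "la", "or", "ti", "ve", "zo"];

// mulberry32: a small 32-bit PRNG, used here purely as a stand-in.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Same sector and star number in, same name out.
function starName(sectorX, sectorY, starIndex) {
  // Arbitrary illustrative mix of the inputs into one 32-bit seed.
  const seed =
    (Math.imul(sectorX, 73856093) ^
      Math.imul(sectorY, 19349663) ^
      Math.imul(starIndex, 83492791)) >>> 0;
  const rand = mulberry32(seed);
  const count = 2 + Math.floor(rand() * 2); // 2 or 3 syllables
  let name = "";
  for (let i = 0; i < count; i++) {
    name += SYLLABLES[Math.floor(rand() * SYLLABLES.length)];
  }
  return name.charAt(0).toUpperCase() + name.slice(1);
}
```

Note this only goes one way (position to name). Nothing here lets you go from a name back to a sector, which is exactly the reverse lookup problem mentioned above.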

I haven't had any luck with partial searches either. I guess that hasn't actually made it into beta 3. I'm not actually clear on how it should work - what if there are multiple results for a search?

wolverine2710 · Nov 7, 2014

@TornSoul. Have you been able to progress with TGC - The Great Collector?
I'm travelling back this afternoon from Germany to my home in The Netherlands. If there are things which can or need to be tested I'm more than happy to put TGC on the rack - meant in a nice way ;-)

TornSoul · Nov 7, 2014

The main bits for accepting input are done - today will mostly be a day of testing all input variations (there's a lot...). With a bit of luck I should be able to release the first alpha later today.

And it will definitely need some serious testing.

RedWizzard · Nov 7, 2014

TornSoul said:
The main bits for accepting input are done - today will mostly be a day of testing all input variations (there's a lot...). With a bit of luck I should be able to release the first alpha later today.

And it will definitely need some serious testing.

What's your plan for testing? Separate test system or reset the real system after a time?

wolverine2710 · Nov 7, 2014

@RedWizzard. I saw that almost all systems have been verified. We all know SB2 systems have been deleted, moved, renamed etc. Do you perhaps have an overview of how many systems have stayed the same? This could be an indicator of what to expect for Gamma, and could be used to decide whether it's better to wait with the en masse crowd sourcing until Gamma hits or to start doing it now. In Slopey's BPC thread commanders have asked questions about crowd sourcing (they seem to be unaware of this thread). Slopey's BPC users are a VERY motivated bunch; we did crowd sourcing for the 55 systems of SB1. If Slopey pointed them here I'm sure we'd get a lot of new volunteers.

You have discussed how to verify SB2/SB3 systems and you have done that yourself (brilliant). Can your tool currently be used to verify systems? That is, could I verify the last unverified systems myself, or is it a matter of providing the distances (1 or 2) here in this thread so you can verify them?

TornSoul · Nov 7, 2014

I want to encourage hammering it with all kinds of junk (random) data, without anyone worrying about polluting the data.
That way the "mechanics" of it can get a thorough testing.

It'll really need it! - Things got rather complicated, as I'm basically allowing any combination of inputs of known/unknown systems/distances etc., and then making sure everything gets checked when/if it needs to be.

This was a much much larger task than I thought it would be - the complexity of covering all cases and properly updating cr counters etc - yuck... I've been going over it a million times and I'm still only 98% sure I actually got everything covered.
It's so complex it's easy for a slip-up to go unnoticed - Hence the need for some serious testing.

I'll seed it with the FD supplied data.

Once satisfied things work as they should - I'll reset it.

Hopefully I'll have something ready to be released at the end of the day - But I'm not entirely ruling out it could be tomorrow instead (as mentioned I'm about to start some serious testing, and who knows what might crop up...)

wolverine2710 · Nov 7, 2014

Things ALWAYS take more time than anticipated ;-( Release it when you feel comfortable with it.
I will use the time to perfect my torture testing techniques ;-)

I hope you are able to create some documentation for it (API part) and that it can be tested without having to write a program for it. Something like you did with your web-api. I hope that the tool commanders are able to change their programs (soonish) so they can interface with TGC, as in uploading data to it and using TGC to determine whether a system has been done or not. Exciting times ;-)

TornSoul · Nov 7, 2014

Regarding rw's systems.json:
What should one filter on to get just the SB3 FD released data?

I tried with
contributor == "FD" && region == "Beta3"

But that misses out Eranin (region == "Alpha4") and lots of others.

Any suggestions?

It's probably obvious but...

RedWizzard · Nov 7, 2014

wolverine2710 said:
@RedWizzard. I saw that almost all systems have been verified. We all know SB2 systems have been deleted, moved, renamed etc. Do you perhaps have an overview of how many systems have stayed the same? This could be an indicator of what to expect for Gamma, and could be used to decide whether it's better to wait with the en masse crowd sourcing until Gamma hits or to start doing it now. In Slopey's BPC thread commanders have asked questions about crowd sourcing (they seem to be unaware of this thread). Slopey's BPC users are a VERY motivated bunch; we did crowd sourcing for the 55 systems of SB1. If Slopey pointed them here I'm sure we'd get a lot of new volunteers.

The only ones that changed were the Wredguia/Wregoe systems and a few on the reference lists. I didn't find any others in the end. So I'm reasonably confident that it would be worth crowd-sourcing all the systems except for the ones with "Sector" in the name (those are procedurally generated and will probably move). I'd focus on the systems with catalog or constellation names first as those will be almost certain to remain in the same place in gamma. Station/economy/etc data is almost certainly going to change and as that is what Slopey's users are most interested in I think it might be best to hold off on trying to get them involved. Otherwise they may become discouraged if we have to throw all that data away. At least we should be upfront that the data might all change in gamma.

wolverine2710 said:
You have discussed how to verify SB2/SB3 systems and you have done that yourself (brilliant). Can your tool currently be used to verify systems? That is, could I verify the last unverified systems myself, or is it a matter of providing the distances (1 or 2) here in this thread so you can verify them?

Yes, but just as a tool, not in any specific way. What I did was I flew to the reference system I decided on, and then I used systems.html to calculate the expected distance to each star I wanted to verify. Specifically I sorted by name and then region (so that the page ended up sorted by region first and then by name) and then I searched for my reference system and selected that to get the distances from the reference system. Then I checked each distance against what the galaxy map had. Finally I flew to a different reference system about 60 Ly away and repeated the process. Once I'd verified all the stars I (carefully) did a manual search and replace in the systems.json.
These checks actually generated valid distance data which I do plan to add into systems.json soon.
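
The verification procedure above boils down to comparing a calculated distance with the one the galaxy map shows, from two different reference systems. A rough sketch of that check, assuming each system is an object with `name`, `x`, `y`, `z` (an assumed layout, not necessarily how systems.json stores it); the function names and the tolerance parameter are hypothetical:

```javascript
// Straight-line distance in Ly between two systems with x/y/z coordinates.
function distanceLy(a, b) {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// Compare calculated distances from one reference system against the
// distances read off the galaxy map (observed: system name -> distance).
function checkAgainstReference(reference, systems, observed, tolerance) {
  return systems.map((s) => {
    const expected = distanceLy(reference, s);
    const seen = observed[s.name];
    const ok =
      typeof seen === "number" && Math.abs(expected - seen) <= tolerance;
    return { name: s.name, expected, seen, ok };
  });
}

// A system counts as verified only if it checked out from every reference.
function verifiedFromAll(reports) {
  const [first, ...others] = reports;
  return first
    .filter(
      (r) =>
        r.ok &&
        others.every((rep) => rep.some((o) => o.name === r.name && o.ok))
    )
    .map((r) => r.name);
}
```

Run `checkAgainstReference` once per reference system you flew to, then pass the resulting reports to `verifiedFromAll`. The tolerance should match however many decimals the galaxy map displays.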

TGC integration will be the priority once that is up, but I do have a couple of other changes in mind that could help with verification:

I may create a version of the data entry page that works the other way round: select the reference system you are at and then enter distances to multiple unknown systems. But this would require collecting the data from several reference systems before any coordinates could be calculated so it would need a server that could take unverified raw distance data (either TGC or a server specifically for my pages).

I probably will enhance entry.html to allow the user to load an existing system and add more distances.

TornSoul · Nov 7, 2014

re TGC

I'm storing both systems and distances (obviously)

Each has a cr "counter" (Confidence Rating).

Basically, if a distance gets submitted that is already in the DB, I increment the cr counter for that distance.
Sounds trivial at first sight, but the devil is in the detail...

So when should this counter get incremented in your opinion?

Very much open to input on this one.

TornSoul · Nov 7, 2014

RedWizzard said:
select the reference system you are at and then enter distances to multiple unknown systems. But this would require collecting the data from several reference systems before any coordinates could be calculated so it would need a server that could take unverified raw distance data (either TGC or a server specifically for my pages).

TGC will cover that scenario (and pretty much any other you (I) can think of).

At its extreme - You could simply enter the name for p0 (and nothing else - no refs) and TGC will store that name.

Or p0 and one reference (whether it's in the DB already or not - whether we have coords or not).
TGC will store the systems and the distance,
and will furthermore use those in the future, if appropriate, for calculating other coords.
So if you did that (same p0 - and just one ref) 5 times in a row (different refs ofc) - on the 5th submission, TGC would try to trilaterate the position of p0 and update the DB accordingly.
(You could in fact swap p0/ref around as you went along if you wanted - would make no difference to TGC)

Basically TGC will take anything you throw at it - and try to make the best of it (combining it with data already stored etc)

(I've added an (optional) commander name field as well)
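
For anyone wondering what the trilateration step mentioned above might look like, here is a sketch of the textbook linearised least-squares method. It is an illustration under assumptions, not necessarily what TGC does: the function names and the input shape (`x`, `y`, `z`, `dist` per reference) are hypothetical, and it needs at least four references that are not all in one plane.

```javascript
// Solve the 3x3 system M * x = v by Gauss-Jordan elimination with pivoting.
// Returns null when the system is (numerically) singular.
function solve3(M, v) {
  const a = M.map((row, i) => [...row, v[i]]);
  for (let col = 0; col < 3; col++) {
    let pivot = col;
    for (let r = col + 1; r < 3; r++) {
      if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
    }
    if (Math.abs(a[pivot][col]) < 1e-12) return null;
    const tmp = a[col];
    a[col] = a[pivot];
    a[pivot] = tmp;
    for (let r = 0; r < 3; r++) {
      if (r === col) continue;
      const f = a[r][col] / a[col][col];
      for (let c = col; c < 4; c++) a[r][c] -= f * a[col][c];
    }
  }
  return [a[0][3] / a[0][0], a[1][3] / a[1][1], a[2][3] / a[2][2]];
}

// refs: [{x, y, z, dist}], dist being the distance from the unknown system.
// Subtracting the first sphere equation from the others gives linear rows
//   2 (p_i - p_1) . X = |p_i|^2 - |p_1|^2 - d_i^2 + d_1^2
// which are solved in the least-squares sense via the normal equations.
function trilaterate(refs) {
  if (refs.length < 4) return null;
  const [p1, ...rest] = refs;
  const sq = (p) => p.x * p.x + p.y * p.y + p.z * p.z;
  const AtA = [[0, 0, 0], [0, 0, 0], [0, 0, 0]];
  const Atb = [0, 0, 0];
  for (const p of rest) {
    const row = [2 * (p.x - p1.x), 2 * (p.y - p1.y), 2 * (p.z - p1.z)];
    const b = sq(p) - sq(p1) - p.dist * p.dist + p1.dist * p1.dist;
    for (let i = 0; i < 3; i++) {
      Atb[i] += row[i] * b;
      for (let j = 0; j < 3; j++) AtA[i][j] += row[i] * row[j];
    }
  }
  const s = solve3(AtA, Atb);
  return s ? { x: s[0], y: s[1], z: s[2] } : null;
}
```

Since distances read from the game are rounded, a real tool would want to check the result against every submitted distance with some tolerance rather than trust the solution blindly.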

RedWizzard · Nov 7, 2014

TornSoul said:
Regarding rw's systems.json:
What should one filter on to get just the SB3 FD released data?

I tried with
contributor == "FD" && region == "Beta3"

But that misses out Eranin (region == "Alpha4") and lots of others.

Any suggestions?

It's probably obvious but...

Filter on contributor == 'FD' and region doesn't contain 'outside Beta3'. Coordinates/distances are all valid for beta 3 so that should get you Michael's list (plus Sol).

TornSoul · Nov 7, 2014

v.region.indexOf('outside Beta3')<0 && v.contributor=='FD'

Gives me 802 systems - Does that sound about right?
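
For reference, the predicate above can be wrapped in an ordinary array filter. This sketch assumes systems.json has already been parsed into an array of objects with string `region` and `contributor` fields (an assumption about the layout); the function name is hypothetical.

```javascript
// Keep FD-contributed systems whose region does not contain 'outside Beta3'.
function fdBeta3Systems(systems) {
  return systems.filter(
    (v) => v.region.indexOf('outside Beta3') < 0 && v.contributor == 'FD'
  );
}
```

`fdBeta3Systems(systems).length` then gives the count.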

RedWizzard · Nov 7, 2014

TornSoul said:
TGC will cover that scenario (and pretty much any other you (I) can think of).

At its extreme - You could simply enter the name for p0 (and nothing else - no refs) and TGC will store that name.

Or p0 and one reference (whether it's in the DB already or not - whether we have coords or not).
TGC will store the systems and the distance,
and will furthermore use those in the future, if appropriate, for calculating other coords.

Basically TGC will take anything you throw at it - and try to make the best of it (combining it with data already stored etc)

(I've added an (optional) commander name field as well)

That's good. I've also come to the conclusion that maximum flexibility of data gathering is the best option. I've used that pattern of flying to the reference system and getting distances to a bunch of stars a lot when supplementing other people's data and for verifying.

TornSoul said:
So when should this counter get incremented in your opinion?

Good question. How about this: for distances just increment the CR each time the distance is submitted. There could be multiple distances for any pair of systems and TGC would output them all. Algorithms should only use the version with the highest CR (if it's a tie then ignore that pair of systems as we have no confidence in the data).
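
A rough sketch of that distance rule, using a plain in-memory Map as a stand-in for the DB. The store layout and function names are hypothetical, not TGC's actual design.

```javascript
// Order-independent key, so A->B and B->A count as the same pair.
function pairKey(a, b) {
  return JSON.stringify([a, b].sort());
}

// store: Map(pairKey -> Map(distance -> cr)).
// Each submission of a distance bumps the CR of that exact value.
function submitDistance(store, a, b, dist) {
  const key = pairKey(a, b);
  if (!store.has(key)) store.set(key, new Map());
  const versions = store.get(key);
  versions.set(dist, (versions.get(dist) || 0) + 1);
}

// Return the distance with the highest CR, or null if the pair is unknown
// or the top CR is shared (a tie means no confidence in the data).
function bestDistance(store, a, b) {
  const versions = store.get(pairKey(a, b));
  if (!versions) return null;
  let best = null;
  let bestCr = 0;
  let tie = false;
  for (const [dist, cr] of versions) {
    if (cr > bestCr) {
      best = dist;
      bestCr = cr;
      tie = false;
    } else if (cr === bestCr) {
      tie = true;
    }
  }
  return tie ? null : best;
}
```

All submitted versions stay in the store, so they can still be output in full. Only the algorithms that consume them would go through `bestDistance`.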

For systems, CR is set to 1 as soon as there are enough distances to calculate a position (using the highest CR distance for each reference star). Then each time a new set of distances is submitted for that system, if they are consistent with the calculated position then increment the CR. If they are not consistent with the current position but there is a position that is consistent with all known distances then reset the CR to 1 and change the coordinates. If there is no position consistent with all known distances then set CR to 0 (which indicates bad or insufficient data) and wait for new data to sort it out.
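
The system rule above could be sketched like this. It assumes a `solve` function is passed in that returns a candidate position (or null) from a list of measurements; that function, the tolerance and the data shapes are all hypothetical. The rule as stated doesn't say what happens to the coordinates when CR drops to 0, so the sketch simply leaves them alone.

```javascript
const dist3 = (a, b) => Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);

// True if coords agrees with every measurement {ref: {x,y,z}, dist}.
function fits(coords, measurements, tol) {
  return measurements.every(
    (m) => Math.abs(dist3(coords, m.ref) - m.dist) <= tol
  );
}

// system: {coords, cr}; newSet: the distances just submitted;
// allKnown: every distance known for the system, including newSet.
function updateSystem(system, newSet, allKnown, solve, tol = 0.01) {
  // New data agrees with the current position: confidence goes up.
  if (system.coords && fits(system.coords, newSet, tol)) {
    return { coords: system.coords, cr: system.cr + 1 };
  }
  // Otherwise look for a position that explains everything we know.
  // This also covers the first time a position can be calculated.
  const candidate = solve(allKnown);
  if (candidate && fits(candidate, allKnown, tol)) {
    return { coords: candidate, cr: 1 };
  }
  // No consistent position: bad or insufficient data, wait for more.
  return { coords: system.coords, cr: 0 };
}
```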

One problem with this implementation would be that it would be relatively easy to pollute with bad data if someone was determined to do so.

RedWizzard · Nov 7, 2014

TornSoul said:
v.region.indexOf('outside Beta3')<0 && v.contributor=='FD'

Gives me 802 systems - Does that sound about right?

Yes, though thinking about it that does include systems that Michael supplied for earlier betas as well as the ones he supplied for beta 3 (their positions have been verified). If you only want the systems he supplied for beta 3 (755 of them I believe) then I have a spreadsheet I can send you. Or I can make a filtered version of systems.json (tomorrow - I'm about to go to bed). I guess for testing it won't hurt to use the 802.
