Discussion What is the most efficient way to crowdsource the 3D system coordinates

TornSoul · Oct 19, 2014

JesusFreke said:
If you require a single person to input all the data required to calculate the coordinate for a star, then only "explorers" visiting new stars will be able to enter data - and it puts a larger onus on them because they have to gather more data.

If you let anyone from anywhere enter some data.. then you'll end up getting a lot more data.

In pure statistical terms I agree with you.

But with human nature involved my point remains that (to put it on its head): You won't find anyone(*) willing to enter just starnames (and not coordinates).

(*)for some small error value

I just can't imagine anyone being interested enough in this gathering of data to stop short there - at just starnames.
What would be their incentive? And again, if they already have an incentive... why would it stop short there?

Maybe I'm just too biased - but I just can't imagine why anyone would behave like that.

Would love to be proved wrong ofc - as it would get us more (part) data to work forward with.

TornSoul · Oct 19, 2014

JesusFreke said:
In any case, just a couple of comments - regarding "random reference systems" - I'm not a fan of using a fixed set of references. Especially as we move out further into the galaxy, you won't be able to use the same set of stars, because they'll be too far away and too clustered together due to the distance.

Good point.

TornSoul · Oct 19, 2014

JesusFreke said:
I've described it elsewhere, but basically - trilaterate to find an initial coordinate and then do a hill descending algorithm on the error function (the sum of the squares of distance errors for all known distances). At this point, you'll have a non-grid aligned coordinate that is within the "candidate region" - the region where all the spherical shells overlap. (Or, if you don't, it means that one of the data points is wrong, and there is no common overlap). And then, I start "exploring" the grid-aligned points in the vicinity of that non-grid aligned "good" coordinate, to find all grid-aligned coordinates that are within the candidate region.

I think the term is "hill _climbing_ algorithm" (or it's an algorithm I don't know)

Anyways - Thanks for the explanation (I get it)

I've been puttering with the idea of trying something similar - To see if I can reduce the percentage of values I throw away.
But again - I want to be 100% sure to not allow any false positives.
So if two grid points both pass the (reverse) distance checks - I'd not accept either.

The question of whether that could even occur is what I am trying to figure out - so I thought I'd ask if someone knew for sure, before crunching the numbers myself.

JesusFreke · Oct 19, 2014

TornSoul said:
In pure statistical terms I agree with you.

But with human nature involved my point remains that (to put it on its head): You won't find anyone(*) willing to enter just starnames (and not coordinates).

(*)for some small error value

I just can't imagine anyone being interested enough in this gathering of data to stop short there - at just starnames.
What would be their incentive? And again, if they already have an incentive... why would it stop short there?

Maybe I'm just too biased - but I just can't imagine why anyone would behave like that.

Would love to be proved wrong ofc - as it would get us more (part) data to work forward with.

I would.

I can see 2 cases when someone would enter new stars. Let's say someone is hanging around a certain part of space, and they want to get it mapped out. Instead of having to visit each and every star.. they just grab the names, maybe a few distances and enter them into the system, knowing that after a day or two or whatever, there will be enough data to calculate the coordinates.

The second case is someone that's just going exploring - they've picked a direction and are heading in that direction for as long as possible. They don't want to stop and visit each and every star. But they'll grab the names, and maybe the low precision distance from the nav panel and submit them to the database.

JesusFreke · Oct 19, 2014

TornSoul said:
I think the term is "hill _climbing_ algorithm" (or it's an algorithm I don't know)

But it's not climbing a hill. In particular, I'm using the Levenberg–Marquardt algorithm, as implemented by Python's lmfit module.

Unfortunately, I can't share my code due to [reasons], but it's similar to (and inspired by) this example: https://github.com/dlockman/Trilateration-Demo/blob/master/PythonServer/basicTrilateration.py. There are two primary differences. First, the error function takes into account the accuracy of the measured distance: as long as the calculated distance is within +/- 0.0005 of the measured distance (which is given to 3 decimal places), the error term for that piece of data is 0. Second, it uses all available distances when calculating the error.
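As a rough illustration of the tolerance-aware error function described above, here is a minimal sketch. It is not JesusFreke's code (which was not shared) and it does not call lmfit; the names `residuals` and `total_squared_error` and the reference data are hypothetical. It only shows the shape of an error term that is 0 inside the +/- 0.0005 band and grows outside it.

```python
import math

TOLERANCE = 0.0005  # half a unit in the third decimal place of a measured distance


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def residuals(candidate, references):
    """One error term per (reference_coordinate, measured_distance) pair.

    A term is 0 while the calculated distance stays within TOLERANCE of the
    measured one; outside that band it is the signed excess.
    """
    terms = []
    for ref_coord, measured in references:
        diff = distance(candidate, ref_coord) - measured
        excess = abs(diff) - TOLERANCE
        terms.append(math.copysign(excess, diff) if excess > 0 else 0.0)
    return terms


def total_squared_error(candidate, references):
    """Sum of squared error terms over all available distances."""
    return sum(t * t for t in residuals(candidate, references))


# Hypothetical data: distances are rounded to 3 decimals, as the game shows them.
true_point = (1.0, 2.0, 3.0)
ref_coords = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)]
references = [(r, round(distance(true_point, r), 3)) for r in ref_coords]

print(total_squared_error(true_point, references))
print(total_squared_error((1.5, 2.0, 3.0), references))
```

A list of residuals like this is the kind of input a least-squares minimiser works on, so the same function could in principle be handed to one.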

The candidate region mapping is custom code that I implemented myself. I'm not 100% positive that it is foolproof, but I haven't found any case where it hasn't worked yet

TornSoul said:
So if two grid points both pass the (reverse) distance checks - I'd not accept either.

That's exactly what my algorithm does. If it finds multiple points, then it doesn't use any of them - instead it notes down that the star still needs more data. Essentially, if you find multiple points, it means that the "candidate region" is too large, and you need more distances to reduce the common overlap between all the spherical shells.
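The "explore the grid, accept only a unique hit" idea can be sketched roughly as below. This is an assumption-laden illustration, not the custom candidate-region code mentioned above: it simply brute-forces a small cube of 1/32-aligned points around a starting coordinate, and the names `grid_candidates`, `resolve` and the search radius are hypothetical.

```python
import itertools
import math

GRID = 1 / 32        # grid spacing of system coordinates
TOLERANCE = 0.0005   # accuracy of a 3-decimal distance


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fits_all(point, references):
    """True if every measured distance is matched within the tolerance."""
    return all(abs(distance(point, ref) - measured) <= TOLERANCE + 1e-9
               for ref, measured in references)


def grid_candidates(start, references, radius_steps=4):
    """All grid-aligned points near `start` that lie inside every shell."""
    base = [round(c / GRID) for c in start]
    offsets = range(-radius_steps, radius_steps + 1)
    found = []
    for delta in itertools.product(offsets, repeat=3):
        point = tuple((b + d) * GRID for b, d in zip(base, delta))
        if fits_all(point, references):
            found.append(point)
    return found


def resolve(start, references):
    """Return the coordinate only if exactly one grid point fits."""
    found = grid_candidates(start, references)
    if len(found) == 1:
        return found[0]
    return None  # no fit (bad data) or several fits (needs more data)


# Hypothetical example: a grid-aligned star and four non-coplanar references.
star = (1.03125, 2.0, 3.5)
ref_coords = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)]
references = [(r, round(distance(star, r), 3)) for r in ref_coords]

print(resolve((1.04, 2.01, 3.49), references))
```

A real implementation would also want to distinguish "no point fits" (likely a bad distance) from "several points fit" (more distances needed); the sketch folds both into `None` for brevity.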

TornSoul · Oct 20, 2014

JesusFreke said:
That's exactly what my algorithm does. If it finds multiple points, then it doesn't use any of them - instead it notes down that the star still needs more data. Essentially, if you find multiple points, it means that the "candidate region" is too large, and you need more distances to reduce the common overlap between all the spherical shells.

Aha, so that's where the idea of putting out/up lists (of stars needing more data) comes from.
With that background I can see it could make sense to do so - Just one more (probably) datapoint and that star is logged accurately.

I'll keep that in mind as I keep tweaking on EDSC

Smacker · Oct 20, 2014

TornSoul said:
Aha, so that's where the idea of putting out/up lists (of stars needing more data) comes from.
With that background I can see it could make sense to do so - Just one more (probably) datapoint and that star is logged accurately.

I'll keep that in mind as I keep tweaking on EDSC

I am getting this: "Index was outside the bounds of the array."

TornSoul · Oct 20, 2014

Using EDSC?

What values are you inputting? (not that it should matter - But I'll investigate)

Are you by any chance copy pasting star names into the fields - and not also clicking the name in the drop down list after that?

Oh, and what browser are you using? (It requires a modern browser - no IE8 etc.)

RedWizzard · Oct 20, 2014

JesusFreke said:
In any case, just a couple of comments - regarding "random reference systems" - I'm not a fan of using a fixed set of references.
...
Although, I think you alluded to the solution -- the algorithms we use to calculate the coordinate should be able to determine if/when more data is needed, as well as being able to detect bad data.

Yes, that's the approach I've taken. A fixed (or at least preferred) set of references (as Harbinger and TunaMage are using) works for the pill but it will be difficult to extend to the wider galaxy. My experience is that it's not hard to find a group of reference stars that work in a given case and TornSoul's testing provides some hard data: about 3/4 of random cases work for sets of five references. So I'm happy to let the user pick the references and then figure out if they are good or not.

Smacker · Oct 20, 2014

TornSoul said:
Using EDSC?

What values are you inputting? (not that it should matter - But I'll investigate)

Are you by any chance copy pasting star names into the fields - and not also clicking the name in the drop down list after that?

Oh, and what browser are you using? (It requires a modern browser - no IE8 etc.)

Sorry, should have said. This is from the API call
http://edstarcoordinator.com/api.asmx/GetSystems

RedWizzard · Oct 20, 2014

TornSoul said:
Very long post (as are mine...) I'm not going to comment on it point by point (dangit I nearly ended up doing just that...)

I think I can keep my response short(er) this time...

TornSoul said:
The formula (wikipedia) first calculates x, then y and z.
The calculation for y depends on x, and z on both x and y.

If the calculated value for x is "very accurate" (due to the deltas being small) then y will be "less off" than if two other systems (with larger deltas) had been used.

(I know you retracted this but I want to talk about it anyway.) You are right: the error contribution from the three references is not even in the final coordinates and therefore order does matter. My point was that it wasn't significant enough to affect the results at the 1/32 precision we're rounding the results to. The absolute error in each distance is the same, +/-0.0005, but shorter distances have greater relative error, which might have an impact (though my testing hasn't found such a case). The thing is, the effect should be stable: you should be able to simply assign the references in a specific order of distance to minimise the error and so still avoid calculating all the permutations.

TornSoul said:
This is likely controversial - But I do not subscribe to the idea of "fixed reference systems".
The reason being my above observations regarding deltas on distances.
I know it has kind of been accepted as "the way to go" in this thread - But I'm sorry, the math says it's irrelevant.

Any set of "fixed reference systems" is just as likely to have large deltas as any randomly picked systems.

Actually I may have given the wrong impression there. I think we're about evenly split over using a fixed set of references or not. I'm actually in the "not" camp - see my reply to JesusFreke. Where you and I disagree on this is around your requirement for all permutations of the references to generate the same coordinates. The effect of that is to require all 4 star subsets of the reference stars to be independently good sets of references. I don't think that's necessary; you only need one good set of reference stars to generate the correct coordinates, you just need to be able to tell if they are good or not.

TornSoul said:
No "fixed reference systems" that one picks can ever guarantee that there won't be max deltas (large rounding errors on the distances)
And those deltas are what matters.

A "good reference system" is one that (it just so happens) gives small deltas.
And that is entirely dependent on the target system's coordinates.

Right, but remember we're dealing with a fixed volume of space: the Pill. It's entirely possible to find a fixed set of references for all stars in the Pill. In fact I've yet to find a case where TunaMage's set of [h Draconis, Wyrd, LHS 2884, and Keries] doesn't work.

TornSoul said:
"Which means the reference stars are degenerate in some way, e.g. they all lie in a plane."
The degenerate case is if the reference stars are *collinear*.
The reference systems p1-p3 will always be in a plane (3 points always will be)

But that was to try to determine if p4 were in a plane with p1-p3, and if that was bad for the result of the trilateration formulas.
The consensus in this thread seems to have been that that would be the case - as your comment just above also shows.

So I ended up disproving that - after all too much linear algebra
...
Whether p1-p4 - or indeed p0-p4 - are all very close to being in the same plane or not has no impact on the final calculations.

I didn't mean p1-p3 lying in a plane (of course they do), I meant all four. If you're saying you've disproved that four points in a plane is a problem, I think you might have made a mistake somewhere as that arrangement definitely is a problem. Consider: you've used three points to generate two candidate coordinates. Those three points define a plane and the candidate coordinates are on either side of that plane (one above, one below). The fourth reference picks the correct candidate based on distance. But if the fourth reference point is also in the plane then the distance to both candidates will be equal and you cannot select between them. Obviously that situation will hold regardless of which point is selected as the fourth point. The same issue applies to larger sets of references if they all lie in a plane (regardless of permutation of the points).
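The coplanar case can be checked numerically. The sketch below uses made-up points and hypothetical helper names (`signed_volume`, `mirror`); it shows that a candidate and its mirror image across the reference plane are equidistant from a fourth reference lying in that plane, but not from one lying off it.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def distance(a, b):
    d = sub(a, b)
    return math.sqrt(dot(d, d))


def signed_volume(p1, p2, p3, p4):
    """Scalar triple product; 0 means the four points are coplanar."""
    return dot(cross(sub(p2, p1), sub(p3, p1)), sub(p4, p1))


def mirror(point, p1, p2, p3):
    """Reflect `point` across the plane through p1, p2, p3."""
    normal = cross(sub(p2, p1), sub(p3, p1))
    k = 2 * dot(sub(point, p1), normal) / dot(normal, normal)
    return tuple(c - k * n for c, n in zip(point, normal))


# Hypothetical references: p1-p3 span the plane z = 0.
p1, p2, p3 = (0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)
p4_flat = (7.0, 7.0, 0.0)    # also in the plane
p4_raised = (7.0, 7.0, 5.0)  # off the plane

candidate = (1.0, 2.0, 3.0)
twin = mirror(candidate, p1, p2, p3)

print(signed_volume(p1, p2, p3, p4_flat), signed_volume(p1, p2, p3, p4_raised))
print(distance(candidate, p4_flat), distance(twin, p4_flat))
print(distance(candidate, p4_raised), distance(twin, p4_raised))
```

A volume that is merely small rather than exactly zero still separates the two candidates in principle, which is in line with the "very close to a plane" results reported elsewhere in the thread.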

TornSoul said:
Comparing the p1-p3 distances after the calculation...
...
I'm going to correlate this with the results of my other tests - and see if they match up or not.

My suspicion is that that still leaves room for error though...

My approach goes further than just testing the p1-p3 distances. I test all the distances I have. Basically my approach works like this:

I have input distances to N reference stars from Michael's list (i.e. that have confirmed coordinates). I don't trust the results unless N >= 5.
I generate candidate coordinates using every 3 star combination from those N references. This results in at most C(N,3)*2 candidate coordinates.
I then select the best candidate coordinate as measured by the total difference between the calculated distances and the input distances for all N stars (actually I used the total squared error).
Any output that results in a visible error (i.e. err >= 0.001) between the calculated distance and the supplied distance is not good enough and requires more data.
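The four steps above could look roughly like this in Python. It is a sketch under stated assumptions rather than RedWizzard's actual code: the function names are hypothetical, the trilateration follows the standard three-sphere formula, and snapping each candidate to the 1/32 grid before scoring is an added assumption based on the 1/32 precision mentioned earlier in the thread.

```python
import itertools
import math

GRID = 1 / 32  # assumed coordinate precision


def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def norm(a): return math.sqrt(dot(a, a))
def scale(a, k): return tuple(x * k for x in a)
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def trilaterate(p1, p2, p3, r1, r2, r3):
    """Return the 0 or 2 points at distances r1, r2, r3 from p1, p2, p3."""
    d = norm(sub(p2, p1))
    if d < 1e-9:
        return []
    ex = scale(sub(p2, p1), 1 / d)
    i = dot(ex, sub(p3, p1))
    off_axis = sub(sub(p3, p1), scale(ex, i))
    j = norm(off_axis)
    if j < 1e-9:
        return []  # references are collinear
    ey = scale(off_axis, 1 / j)
    ez = cross(ex, ey)
    x = (r1**2 - r2**2 + d**2) / (2 * d)
    y = (r1**2 - r3**2 + i**2 + j**2) / (2 * j) - (i / j) * x
    z_squared = r1**2 - x**2 - y**2
    if z_squared < 0:
        return []  # spheres do not intersect
    z = math.sqrt(z_squared)
    base = tuple(p + x * a + y * b for p, a, b in zip(p1, ex, ey))
    return [tuple(b + s * z * c for b, c in zip(base, ez)) for s in (1, -1)]


def snap(point):
    return tuple(round(c / GRID) * GRID for c in point)


def errors(point, references):
    return [norm(sub(point, ref)) - measured for ref, measured in references]


def solve(references, min_refs=5, visible_error=0.001):
    """Best candidate over all 3-reference combinations, or None."""
    if len(references) < min_refs:
        return None  # not enough references to trust the result
    candidates = set()
    for (p1, r1), (p2, r2), (p3, r3) in itertools.combinations(references, 3):
        for point in trilaterate(p1, p2, p3, r1, r2, r3):
            candidates.add(snap(point))
    if not candidates:
        return None
    best = min(candidates, key=lambda c: sum(e * e for e in errors(c, references)))
    if max(abs(e) for e in errors(best, references)) >= visible_error:
        return None  # visible mismatch: needs more data
    return best


# Hypothetical star and references; distances rounded to 3 decimals.
star = (1.03125, 2.0, 3.5)
ref_coords = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0),
              (0.0, 0.0, 10.0), (10.0, 10.0, 10.0)]
references = [(r, round(norm(sub(star, r)), 3)) for r in ref_coords]
print(solve(references))
```

Scoring every candidate against all N distances is what lets a single well-conditioned triple carry the result, even when other triples produce poor candidates.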

An alternate way of looking at this problem of finding coordinates is as a system of simultaneous equations. Our unknown point P0 must satisfy
||Pn - P0|| = dn
for all reference points Pn and distances dn.
I believe someone is actually solving the problem this way using a least squares method. My approach is a hybrid approach: I use trilateration to produce candidate solutions and test them using the simultaneous equations.

TornSoul · Oct 20, 2014

uhm that kind of error message can't come from the api - You'd get a "Server error blah blah" kinda thing.

Have you tried copy pasting the example from http://edstarcoordinator.com/api.html ?

I just did - and it worked fine.

I think the error must be somewhere in your code that runs a loop - either before or after the call.

RedWizzard · Oct 20, 2014

gazelle said:
CHI Hercules (unknown): not enough data
Mirfak (unknown): not enough data
Paul Friedrichs Star (unknown): not enough data

These three are spelling mistakes: "Chi Herculis", "Mirphak", "Paul-Friedrichs Star". Is it sad that I know exactly where you got the data from just from the spelling mistakes?

gazelle said:
LHS 465 (unknown): not enough data

I have coordinates for this. Here are the distances:
Sol: 26.883
Wolf 497: 39.915
Huokang: 35.869
Demeter: 43.255
Clotti: 73.222
Fu Haiting: 69.263
San Guaralaru: 120.7
Haras: 97.088
Arabha: 104.073

gazelle said:
Ross 868 (-24.6875,19.125,21.3125): ref: 2; calc: 4;
LP 329-18 (-26.625,39.875,23.28125): ref: 3; calc: 4;
Wredguia YS-O B47-4 (-135.59375,39.75,-40.71875): ref: 4; calc: 4;
LTT 14542 (-57.21875,57.84375,-14.625): ref: 4; calc: 4;
Wredguia DO-O B47-1 (-69.71875,26.375,-31.625): ref: 4; calc: 4;
Wredguia XH-Q B46-3 (-81.96875,46.71875,-40.375): ref: 4; calc: 4;

These all match the coordinates I have. I have extra distance data for them.

RedWizzard · Oct 20, 2014

Harbinger said:
Forgot to mention. I did find another unmapped system on my travels (WREDGUIA LW-E D11-129). It's right on the edge of the pill next to HIP 2453.

No matter how many times I attempted to enter that system in order to map the coordinates I got the infinite hyperdrive animation. (I still need to ticket that.)

Not sure if any of you guys can get in.

I can't get there either. Long hyperspace sequence then disconnect from server. I'll ticket it.

TornSoul · Oct 20, 2014

RedWizzard said:
Where you and I disagree on this is around your requirement for all permutations of the references to generate the same coordinates. The effect of that is to require all 4 star subsets of the reference stars to be independently good sets of references. I don't think that's necessary; you only need one good set of reference stars to generate the correct coordinates, you just need to be able to tell if they are good or not.

(my emphasis)
Being able to tell is the crux of it all

As of yet I'm not convinced that the (reverse) distance test will never produce false positives.
I'm working on some tests to try and convince me (or not) otherwise - As I would in fact prefer this method (over mine) as it seems to discard fewer values.
But that is also why I'm a bit suspicious

RedWizzard said:
Right, but remember we're dealing with a fixed volume of space: the Pill. It's entirely possible to find a fixed set of references for all stars in the Pill.

Absolutely - I do think I've made a couple of "good enough for beta" references here and there.
But in the (very) long run it won't work.
And I really prefer that people can simply look at the galaxy map and hover over the nearest systems, without having to enter a name in the search field to find a specific star etc.
That's really my main objection to that methodology - it's not user friendly enough

RedWizzard said:
I didn't mean p1-p3 lying in a plane (of course they do), I meant all four. If you're saying you've disproved that four points in a plane is a problem, I think you might have made a mistake somewhere as that arrangement definitely is a problem. Consider: you've used three points to generate two candidate coordinates. Those three points define a plane and the candidate coordinates are on either side of that plane (one above, one below). The fourth reference picks the correct candidate based on distance. But if the fourth reference point is also in the plane then the distance to both candidates will be equal and you cannot select between them. Obviously that situation will hold regardless of which point is selected as the fourth point. The same issue applies to larger sets of references if they all lie in a plane (regardless of permutation of the points).

I should have been MUCH more precise in my formulation there (especially as I was nitpicking a bit at you)

p1-p4 all *exactly* in the same plane = big trouble - agreed.

The critical line (and takeaway) is my last line

"Wether p1-p4 - or indeed p0-p4 all are very close to being in the same plane or not - has no impact on the final calculations."
(emphasis added)
That was the main thrust of my argument.
In all my 1M runs of finding the "best fitting plane" it never happened that p1-p4 (or p0-p4) were exactly in the same plane. There were however *many* where it was a very close call (if I recall, the distance to the best fitting plane could be out in the 2nd decimal).

I collected those and investigated them more closely, to compare their accuracy against test cases with "normal" average distances to the plane.
Well, in fact I never bothered comparing them, as there was no need - the accuracy for the "all p1-p4 very very close to best fitting plane" cases was all over the place. Some very good, some very bad, and everything in between.

As I've mentioned a few times - it was all a rather large waste of time (with regards to the objective)

But my reporting it hopefully can save someone else from wandering down that dead end

RedWizzard said:
My approach goes further than just testing the p1-p3 distances. I test all the distances I have. Basically my approach works like this:

I have input distances to N reference stars from Michael's list (i.e. that have confirmed coordinates). I don't trust the results unless N >= 5.

I generate candidate coordinates using every 3 star combination from those N references. This results in at most C(N,3)*2 candidate coordinates.

I then select the best candidate coordinate as measured by the total difference between the calculated distances and the input distances for all N stars (actually I used the total squared error).

Any output that results in a visible error (i.e. err >= 0.001) between the calculated distance and the supplied distance is not good enough and requires more data.

On the first two points we do exactly the same.
It's on the last two where I get a bit more draconian in my "filter methodology"
Might be overkill on my part - time will tell (when I've done some more testing)

JesusFreke · Oct 20, 2014

JesusFreke said:
I checked the first one (LP 274-8), and using Harbinger's coordinate, I get a distance of 60.7279 to Moros, which doesn't match my recorded distance of 60.738 for that pair.

I'll take a look at more when I have the time. It would be good to double-check the Moros <-> LP 274-8 distance, to make sure it isn't a typo in the data.

I finally managed to get a class 5D drive for my asp, so I'm back in business for exploration. I went to LP 274-8 and confirmed that the LP 274-8 <-> Moros distance is 60.738. Since this doesn't match up with Harbinger's coordinate, I believe my coordinate (-29.71875, 45, 20.28125) is the correct one.

RedWizzard · Oct 20, 2014

Harbinger said:
As we're nearing 570 stars in the pill it's getting very hard to locate the missing ones now but:

Great stuff. I've updated my list. I now have 577 systems of which 12 are outside the Pill (see below). I haven't included WREDGUIA LW-E D11-129 yet, I'll have to collect some distances to it.

By the way, thanks for the validation outputs. I'm using the json versions to import the stars into my list.

Last night I found two systems on my list that I hadn't realised were outside the Pill: LP 625-34, and LP 336-6. These two were mentioned in Codec's spreadsheet and I got some extra distances for them from the reference stars.

Harbinger · Oct 20, 2014

JesusFreke said:
I finally managed to get a class 5D drive for my asp, so I'm back in business for exploration. I went to LP 274-8 and confirmed that the LP 274-8 <-> Moros distance is 60.738. Since this doesn't match up with Harbinger's coordinate, I believe my coordinate (-29.71875, 45, 20.28125) is the correct one.

It was one of my first 25 stars so used an averaging method and was converted to 1/32 after the fact so yours is most likely the correct result.

I really need to double check on the coordinates from my first 25 stars.

TornSoul · Oct 20, 2014

Just to add a bit of fun among all the math posts:

What's the furthest away star anyone has found?

Mine is: YOOXIAE RI-Z D1-0
Distance : 65759.172 LY

At the time I was checking I was at the end of the Pill closest to YOOXIAE RI-Z D1-0 - so if you are at the other end your distance will obviously be higher.
But you need to find one even further away

Happy hunting.

JesusFreke · Oct 20, 2014

Smacker said:
I have been through your new data and updated what I can on my TradeDangerous fork. There were a few I couldn't reconcile:
'Wredguia AT-O B47-0',-103.125,41.9375,-43.25,'JesusFreke2','2014-10-18 00:00:00'
'WREDGUIA AT-O B47-0',-103.125,41.96875,-43.21875,'Combined','2014-10-13 23:00:00'

(-103.125,41.96875,-43.21875) gives a distance of 25.29841 to 35 draconis, which doesn't match up with the recorded distance of 25.329 (verified)

Smacker said:
'Wredguia PC-D D12-74',-116.46875,35.96875,-19.3125,'JesusFreke2','2014-10-18 00:00:00'
'WREDGUIA PC-D D12-74',-116.5,35.96875,-19.28125,'Combined','2014-10-13 23:00:00'

(-116.5,35.96875,-19.28125) gives a distance of 65.0762 to Jurua, which doesn't match up with the recorded distance of 65.034 (verified)

Smacker said:
'Wredguia XH-Q B46-4',-109.90625,30.6875,-58.65625,'JesusFreke2','2014-10-18 00:00:00'
'WREDGUIA XH-Q B46-4',-109.9375,30.6875,-58.65625,'Combined','2014-10-13 23:00:00'

(-109.9375,30.6875,-58.65625) gives a distance of 50.19504 to Jurua, which doesn't match up with the recorded distance of 50.164 (verified)

Smacker said:
'WREDGUIA YD-I C23-11',-128.53125,15.75,-26.65625,'Combined','2014-10-13 23:00:00'
'Wredguia YD-I C23-11',-128.53125,15.84375,-26.46875,'JesusFreke2','2014-10-18 00:00:00'

(-128.53125,15.75,-26.65625) gives a distance of 117.00314 to Keries, vs the recorded distance of 116.931 (verified)

Smacker said:
'WREDGUIA YD-I C23-12',-109.96875,46.75,-45.59375,'Combined','2014-10-13 23:00:00'
'Wredguia YD-I C23-12',-110.03125,46.71875,-45.625,'JesusFreke2','2014-10-18 00:00:00'

(-109.96875,46.75,-45.59375) gives a distance of 30.63817 to 35 Draconis, vs. the recorded distance of 30.712 (verified)

Smacker said:
'Wredguia ZS-O B47-1',-111.53125,40.3125,-25.21875,'JesusFreke2','2014-10-18 00:00:00'
'WREDGUIA ZS-O B47-1',-111.53125,40.375,-25.1875,'Combined','2014-10-13 23:00:00'

(-111.53125,40.375,-25.1875) gives a distance of 59.12653 to Jurua, vs. the recorded distance of 59.096 (verified)

I went to Keries, Jurua and 35 draconis and verified all recorded distances that I mentioned, to ensure it wasn't due to a typo in the distance data.
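The kind of check used in the comparisons above (does a proposed coordinate reproduce a verified distance once the 3-decimal accuracy is taken into account?) can be sketched as follows. The reference coordinate and both competing coordinates here are made-up values, since the real reference coordinates are not given in this thread, and `is_consistent` is a hypothetical helper name.

```python
import math

TOLERANCE = 0.0005  # a 3-decimal distance is accurate to half its last digit


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def is_consistent(coord, ref_coord, recorded):
    """True if `coord` reproduces the recorded distance to the reference."""
    return abs(distance(coord, ref_coord) - recorded) <= TOLERANCE + 1e-9


# Hypothetical reference and verified distance.
reference = (0.0, 0.0, 0.0)
recorded = 3.742

# Two competing coordinates for the same system, one grid step apart in x.
coordinate_a = (1.0, 2.0, 3.0)
coordinate_b = (1.03125, 2.0, 3.0)

for name, coord in (("A", coordinate_a), ("B", coordinate_b)):
    print(name, round(distance(coord, reference), 5), is_consistent(coord, reference, recorded))
```

A single grid step moves the calculated distance by far more than the 0.0005 tolerance here, which is why one verified distance was enough to rule out a competing coordinate in each case above.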

(Also: Holy cow, that 40 LY jump distance with Asp + 5D FSD is awesome.)