[CLOSED] EDDB - a site about systems, stations, commodities and trade routes in Elite: Dangerous.

Hello Commanders,

Today I'm launching my latest project: eddb - Elite: Dangerous Database

It's a site about systems, stations, commodities and trade routes in Elite: Dangerous.


  • The system data is based on EDSM.
  • The commodity data is being updated from EDDN.
  • The data can be edited on the ROSS backend.
  • From the very beginning it was very important to have a clean database. EDDB takes care of this and checks the plausibility of every single dataset (a rough sketch of such a check follows below).

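For illustration only, here is a minimal sketch of what such a check can look like - this is not EDDB's actual code, and the relay address, field names and price limits are assumptions:

[CODE]
# Minimal sketch (not EDDB's actual code): subscribe to the EDDN relay,
# decompress each message and run a simple plausibility check before
# accepting it. Relay address, field names and limits are illustrative.
import json
import zlib
import zmq

RELAY = "tcp://eddn.edcd.io:9500"   # assumed relay address

def plausible(entry):
    """Reject obviously broken commodity entries (rule is made up)."""
    buy = entry.get("buyPrice", 0)
    sell = entry.get("sellPrice", 0)
    return 0 <= buy < 1000000 and 0 <= sell < 1000000

context = zmq.Context()
subscriber = context.socket(zmq.SUB)
subscriber.setsockopt(zmq.SUBSCRIBE, b"")   # no topic filter, receive everything
subscriber.connect(RELAY)

while True:
    raw = subscriber.recv()
    envelope = json.loads(zlib.decompress(raw))   # EDDN messages are zlib-compressed JSON
    if plausible(envelope.get("message", {})):
        print("accepted:", envelope.get("message", {}))
    else:
        print("discarded implausible dataset")
[/CODE]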

Latest News: EDDB proudly presents the introduction of Bodies!

More news on EDDB

themroc
 

wolverine2710

Tutorial & Guide Writer
Cool initiative. A question about the database SQL schema: did you create it from scratch? The reason I ask is the thread "[GENERAL] Common Use SQL Game Database Source", in which an extensive schema has been described/discussed.

I DO like the idea of validating (price) data, but is the EliteOCR (and RegulatedNoise) OCR data that bad? Afaik EOCR ONLY uploads data to EDDN (and creates a .BPC file for Slopey's BPC) when the resolution is at least 1920x1080; otherwise it's not uploaded. I haven't checked the data in EDDN myself, so I hope you take this as a question and not some kind of rude remark. How much of the received data was 'incorrect' after it had been validated by your tool?
 
Cool initiative. A question about the database SQL schema: did you create it from scratch? The reason I ask is the thread "[GENERAL] Common Use SQL Game Database Source", in which an extensive schema has been described/discussed.

I DO like the idea of validating (price) data, but is the EliteOCR (and RegulatedNoise) OCR data that bad? Afaik EOCR ONLY uploads data to EDDN (and creates a .BPC file for Slopey's BPC) when the resolution is at least 1920x1080; otherwise it's not uploaded. I haven't checked the data in EDDN myself, so I hope you take this as a question and not some kind of rude remark. How much of the received data was 'incorrect' after it had been validated by your tool?

My experience is that I had to spend so much time cleaning the data I got from EDDN that I stopped using it. That is not to say EDDN is broken; it is working very well. It just isn't its job to keep the data clean - its job is more of a sharing thing. I haven't tried the OP's tool, but it seems something like this is needed.

Edit - that is not the only reason I stopped using EDDN; it is useful. In the main I choose to use my own local data anyway, because I get a sense of satisfaction from accumulating it from my travels.
 
Cool initiative. A question about the database SQL schema: did you create it from scratch? The reason I ask is the thread "[GENERAL] Common Use SQL Game Database Source", in which an extensive schema has been described/discussed.

I created the schema from scratch, since I had special requirements which that schema did not cover.

I DO like the idea of validating (price) data, but is the EliteOCR (and RegulatedNoise) OCR data that bad? Afaik EOCR ONLY uploads data to EDDN (and creates a .BPC file for Slopey's BPC) when the resolution is at least 1920x1080; otherwise it's not uploaded. I haven't checked the data in EDDN myself, so I hope you take this as a question and not some kind of rude remark. How much of the received data was 'incorrect' after it had been validated by your tool?

Maybe 1% of the data has a wrong station name (which I try to guess with the Levenshtein algorithm) or delivers prices which are totally off. So a data correction step was absolutely mandatory.
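To illustrate the station-name guessing described above, a minimal sketch assuming a list of known station names - the distance threshold and the example names are illustrative, not EDDB's actual values:

[CODE]
# Guess a misread station name by picking the known station with the
# smallest Levenshtein (edit) distance. Threshold and names are examples.

def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def guess_station(ocr_name, known_stations, max_distance=3):
    """Return the closest known station name, or None if nothing is close enough."""
    best = min(known_stations, key=lambda s: levenshtein(ocr_name.lower(), s.lower()))
    if levenshtein(ocr_name.lower(), best.lower()) <= max_distance:
        return best
    return None

print(guess_station("Abraham Linco1n", ["Abraham Lincoln", "Galileo", "Daedalus"]))
# -> Abraham Lincoln
[/CODE]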
 

wolverine2710

Tutorial & Guide Writer
Thanks for the clarification.
I have a question about eddb; it's EDDN related, so I've posted it in the EDDN thread here. Perhaps you are able to answer my question there.
 
I enforced some stronger validation for station names on my database a few days ago, and the amount of crap that gets through is... high. I haven't checked the percentages, but it may be higher than 1%. I'll release a REST API in the future too if anybody wants to check against the db.
 
This looks excellent. I had an idea last week for creating a web app pretty much exactly like the one you've created (I was harvesting data from EDDN and inserting it into a Firebase DB so I could display live data as it was updated); looks like you got there first :) I'd be interested to know how you tackled calculating the distance between places, as that was something I was wrangling with in my initial work. Once again, excellent work!
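Not an answer from the thread, but for what it's worth: given EDSM-style x/y/z coordinates (in light years), the in-game distance between two systems is simply the straight-line Euclidean distance. A quick sketch with made-up coordinates:

[CODE]
# Straight-line distance between two systems with x/y/z coordinates in light years.
import math

def distance_ly(a, b):
    """Euclidean distance between two systems given as dicts with x, y, z keys."""
    return math.sqrt((a["x"] - b["x"]) ** 2 +
                     (a["y"] - b["y"]) ** 2 +
                     (a["z"] - b["z"]) ** 2)

sol = {"x": 0.0, "y": 0.0, "z": 0.0}            # Sol is the origin of the coordinate system
somewhere = {"x": 3.0, "y": 4.0, "z": 12.0}     # made-up coordinates
print(distance_ly(sol, somewhere))              # 13.0 ly
[/CODE]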
 

wolverine2710

Tutorial & Guide Writer
This looks excellent. I had an idea last week for creating a web app pretty much exactly like the one you've created (I was harvesting data from EDDN and inserting it into a Firebase DB so I could display live data as it was updated); looks like you got there first :) I'd be interested to know how you tackled calculating the distance between places, as that was something I was wrangling with in my initial work. Once again, excellent work!
If you deem it worthwhile, why don't you continue with your publicly available web app? We can't have too many tools. Especially if you are also looking into validating/correcting incoming EDDN prices: your corrected data could be fed into EDDN again (just like themroc's data, if he chooses to do so) and then distributed (published) to all subscribers (clients using the EDDN data). It's being discussed in the EDDN thread starting here. Looking forward to meeting you there.
 

wolverine2710

Tutorial & Guide Writer
@themroc. I think your API could also be used by, for example, OCR programs to enhance their output. Let's look at prices. System and station names can be corrected in an OCR client using approximate string matching (could be Levenshtein distance only), just like you do now: the OCR'ed name for a system/station is checked against a dictionary of known and correctly spelled systems/stations. When your API provides all systems/stations, they can use that.

Numbers are far more difficult because there is no dictionary to check against. So you have probably implemented logic to determine what a plausible price range for a certain commodity at a certain station/platform is. If your API can provide a list of correct price ranges, this could be used by OCR programs and of course by any other programs. That way they don't have to collect data themselves and determine price ranges; they can use the knowledge you have gathered by validating prices.

You probably don't want to let OCR clients hammer your API to check whether every single name/price they have OCR'ed is correct. Hence the idea of providing them with a name dictionary and/or price ranges, so they can do the check locally.
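As a sketch of the local check proposed here - assuming a client has downloaded a price-range table from some hypothetical EDDB endpoint; the field names and the ranges below are made up:

[CODE]
# Local plausibility check against a downloaded price-range table, so OCR
# clients don't have to hit the API for every single value. Ranges are made up.
price_ranges = {
    "Gold": {"min_sell": 8500, "max_sell": 11000},
    "Fish": {"min_sell": 250, "max_sell": 650},
}

def plausible_price(commodity, sell_price):
    """True if an OCR'ed sell price falls inside the known range for that commodity."""
    r = price_ranges.get(commodity)
    if r is None:
        return True   # unknown commodity name: cannot judge, let it through for now
    return r["min_sell"] <= sell_price <= r["max_sell"]

print(plausible_price("Gold", 9641))    # True
print(plausible_price("Gold", 96410))   # False - very likely an OCR misread
[/CODE]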
 

wolverine2710

Tutorial & Guide Writer
@themroc. Thanks for the productive chat yesterday evening about EDDN. I will be sending you a PM shortly. I've also created a post in the EDDN thread about validator services - services which clean up data uploaded to EDDN. It's post #325 in the EDDN thread.
 
Forgive my "newbness" when looking at this stuff - I'm only asking because I do like the site and what you have there - but how does one update the data? Meaning, I'm noticing some star systems or stations are not loaded in there yet, so I'd be happy to update the info manually, but I don't see how/where to do so.
 
Forgive my "newbness" when looking at this stuff - I'm only asking because I do like the site and what you have there - but how does one update the data? Meaning, I'm noticing some star systems or stations are not loaded in there yet, so I'd be happy to update the info manually, but I don't see how/where to do so.

The data is coming from EDDN - which in turn is filled by tools like EliteOCR. A backend for manual data manipulation is not implemented yet, but is planned...
 
Hi there, just wanted to let you know that I've been playing with creating a mobile app using your data via HTTP (thanks for doing this project!!) - I couldn't get the contact form to work, so I'm posting here...


Thought it best to point out that the JSON isn't gzipped - which makes such large files pretty heavy going.

Secondly, it looks like you update the files on a cron job whether there have been changes or not (i.e. the Last-Modified header updates nightly), which may result (if use is widespread) in heavy server loads for you.
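A small client-side sketch of what this post is suggesting - ask for gzip and only re-download when the dump has actually changed; the URL is a placeholder, not a real EDDB endpoint:

[CODE]
# Client-side sketch: request gzip and use If-Modified-Since so the dump is
# only re-downloaded when it has actually changed. URL is a placeholder.
import requests

DUMP_URL = "https://example.com/eddb/stations.json"   # placeholder, not a real endpoint
last_modified = None   # a real client would persist this between runs

def fetch_dump():
    global last_modified
    headers = {"Accept-Encoding": "gzip"}   # transparent decompression if the server supports it
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    resp = requests.get(DUMP_URL, headers=headers, timeout=60)
    if resp.status_code == 304:             # not modified since the last fetch
        return None
    resp.raise_for_status()
    last_modified = resp.headers.get("Last-Modified")
    return resp.json()
[/CODE]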


And finally, you have a couple of stations listed as [.....] 0rbital (with a number zero, not an O) - not sure if this is correct; I'm assuming it's an OCR thing, but thought I'd point it out while I was here.


Thanks for the good work
Christian
 