wolverine2710 said:
In the future I'm hoping for a great and blinking EDDN X-mas tree. What do I mean? Have a look at the Eve Market Data Relay (EMDR) - Relay activity map. The solar systems light up as market data arrives. I believe EVE has over 5000 solar systems. It's from this page. The EMDR Monitor also looks nice...
If we ever had that much data coming into EDDN, it would be a dream come true ;-) It would also mean lots and lots of useful data (fewer obsolete/old entries) to feed trading tools with. Profit, profit AND PROFIT. Perhaps someone is already building such a tool ;-)
I guess for this to happen, an OCR tool has to work mostly independently of manual verification. I believe that not many people are willing to put effort into the manual steps involved. Just my 2 cents...
I totally agree, BUT it's possible to reduce the amount of manual work/steps. It will probably reduce the amount of (good) data sent to EDDN. Let me try to explain. Let's suppose you have created an EDDN validator service and it's working splendidly: correcting names using ASP/Levenshtein and throwing away commodities whose OCR-ed prices fall outside the range they should be in (for a certain system/station/time). The checks can be simple/barebone or really complicated. At the end you pass the prices on to EDDN.
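To make that a bit more concrete, here is a rough Python sketch of that kind of check - the commodity list, the price ranges and the distance threshold are made-up placeholders, not anything a real validator or eddb actually uses:

```python
# Hypothetical sketch of the "validator service" idea: fix commodity names with
# Levenshtein distance and drop rows whose prices fall outside an expected range.
# KNOWN_COMMODITIES and PRICE_RANGES are made-up placeholders for this example.

KNOWN_COMMODITIES = ["Gold", "Palladium", "Explosives", "Hydrogen Fuel"]

# (min, max) credit ranges you might derive from historic data per commodity.
PRICE_RANGES = {"Gold": (8500, 10500), "Palladium": (12500, 14500)}

def levenshtein(a, b):
    """Plain dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def correct_name(ocr_name, max_distance=2):
    """Map an OCR-ed name to the closest known commodity, or None if nothing is close enough."""
    distances = {known: levenshtein(ocr_name.lower(), known.lower())
                 for known in KNOWN_COMMODITIES}
    best = min(distances, key=distances.get)
    return best if distances[best] <= max_distance else None

def validate_row(ocr_name, sell_price):
    """Return (corrected_name, price), or None when the row should be thrown away."""
    name = correct_name(ocr_name)
    if name is None:
        return None
    lo, hi = PRICE_RANGES.get(name, (0, float("inf")))
    return (name, sell_price) if lo <= sell_price <= hi else None

print(validate_row("G0ld", 9200))   # ('Gold', 9200)  - name corrected, price plausible
print(validate_row("Gold", 90200))  # None            - price out of range, ditch it
```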
I've said it before: I'm a true believer in the Unix/Linux way, especially the pipe mechanism. A complex (monolithic, hard to maintain) program is divided into smaller parts (programs). Each program performs an action on the input it receives and passes the result to the next program; the output of one program becomes the input of the next program in the pipeline. The Unix tools are in principle all barebone (not speaking about stuff like Perl and possibly awk), do a very specific task and do it extremely well. Combine them and you can create complex programs. I've done that lots of times in the past - my record is 15+ tools in one pipeline.
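Purely as an illustration of that pipe idea in code (nothing EDDN-specific, every stage name here is invented), the same pattern can be expressed in Python, where each small step feeds the next:

```python
# Toy illustration of the pipe mechanism: each stage is tiny, does one job,
# and hands its output to the next stage. All names are made up for the example.

def read_lines(path):
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            yield line.rstrip("\n")

def drop_comments(lines):
    return (line for line in lines if line and not line.startswith("#"))

def to_upper(lines):
    return (line.upper() for line in lines)

def write_out(lines):
    for line in lines:
        print(line)

# Equivalent in spirit to: cat prices.txt | grep -v '^#' | tr a-z A-Z
# ("prices.txt" is just an example input file name.)
write_out(to_upper(drop_comments(read_lines("prices.txt"))))
```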
It's just plain fun ;-) James and myself are trying to apply the same philosophy to EDDN. EDDN is NOT a big, monolithic, hard to maintain, complex beast of a program. EDDN is small, has limited functionality, is conceptually beautiful (thanks Andreas), does a simple task BUT does it well, and relies on other "added value" services to become (more) useful. The Unix pipe mechanism.....
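For anyone wondering what plugging an "added value" service into EDDN roughly looks like: the relay hands out zlib-compressed JSON over a ZeroMQ SUB socket, so a minimal listener can be sketched as below. The relay address is a placeholder (check the EDDN docs for the real endpoint) and pyzmq is assumed to be installed:

```python
# Minimal sketch of a service sitting next to EDDN: subscribe to the relay,
# decode each message, and hand it to whatever validator/logger you like.
# The RELAY address below is a placeholder, not the real endpoint.
# Requires pyzmq (pip install pyzmq).
import json
import zlib
import zmq

RELAY = "tcp://eddn.example.net:9500"  # placeholder address

def listen(handler):
    ctx = zmq.Context()
    sock = ctx.socket(zmq.SUB)
    sock.setsockopt_string(zmq.SUBSCRIBE, "")   # receive everything
    sock.connect(RELAY)
    while True:
        raw = sock.recv()                        # zlib-compressed JSON blob
        message = json.loads(zlib.decompress(raw))
        handler(message)

if __name__ == "__main__":
    listen(lambda msg: print(msg.get("$schemaRef", "unknown schema")))
```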
If you deem it worthwhile you could extend that to the .csv/.bpc files generated by OCR tools. You receive a file, correct it (and/or ditch/remove commodities, or flag them as bad) and then send it back to the sender. Before showing a screen where commanders HAVE to manually check the OCR-ed results (EOCR/RN shows what's possibly wrong with a commodity), the OCR tool can use the returned data to make adjustments and directly flag what is wrong. That way fewer manual checks have to be done. If a validation tool is working (near) perfectly, they could also decide (as an option) NOT to show a commander a correction screen at all, but simply generate a .csv/.bpc file and send the data to EDDN. The point I'm trying to make: a commander can be brilliant at OCR-ing but not so at validating stuff. A commander can be brilliant at validating but not so at OCR-ing. You get the picture. Combine the tools created by the commanders and the end result is BETTER than what either achieves independently.
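A tiny sketch of what the OCR-tool side of that round trip could look like - the row and verdict field names are invented for the example, not any tool's actual format:

```python
# Hypothetical shape of the feedback loop: the OCR tool sends its rows to a
# validator and merges the per-row verdicts back in, so only the flagged rows
# need a commander's attention. All field names here are invented.

def merge_feedback(ocr_rows, verdicts):
    """verdicts: same order as ocr_rows, each {'status': 'ok'|'corrected'|'bad', ...}"""
    accepted, needs_review = [], []
    for row, verdict in zip(ocr_rows, verdicts):
        if verdict["status"] == "ok":
            accepted.append(row)
        elif verdict["status"] == "corrected":
            accepted.append({**row, "name": verdict["name"]})   # take the fixed name
        else:                                                   # 'bad': show it to the commander
            needs_review.append(row)
    return accepted, needs_review

ocr = [{"name": "G0ld", "sell": 9200}, {"name": "Expl0sives", "sell": 999999}]
verdicts = [{"status": "corrected", "name": "Gold"}, {"status": "bad"}]
good, review = merge_feedback(ocr, verdicts)
print(good)    # [{'name': 'Gold', 'sell': 9200}]  - corrected automatically
print(review)  # the row the commander still has to check by hand
```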
The disadvantage of pipes running across the internet is that when one program (a validator) dies, the chain normally breaks. You and/or other validator-service commanders could go one step further. Create your validator in such a way that it is NOT part of a big monolithic program but a small standalone program: a program which contains the logic and relies on data from your databases (for ranges etc). Then that program does NOT have to live/reside on your server - it can run locally on the PC of a commander. The data can be gathered from your server and cached in, for example, sqlite (when eddb dies, the local validator still functions). You and others can take this as far as you like. OCR output comes in multiple formats (.csv files), hence you would have to support multiple file formats. One could just say NO, I only support ONE format - i.e. the EDDN JSON format. When commanders/tools want to send data to the validator, it has to be in that format. Other commanders could make a tool which supports multiple .csv formats (and .bpc) and outputs a JSON structure which is supported by you. There could be separate tools for uploading to EDDN, etc. etc. Unix pipes. Starting/executing small programs can make things a bit slower, that's true, but the advantages are great.
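The "runs locally, keeps working when the server dies" part could be as simple as the sketch below. The URL and the JSON layout it expects are placeholders; only the standard library (urllib, sqlite3) is used:

```python
# Sketch of a local validator that caches reference price ranges in sqlite, so it
# keeps working when the central server (eddb or whatever you use) is unreachable.
# RANGES_URL and the JSON layout are placeholders, not a real API.
import json
import sqlite3
import urllib.request

RANGES_URL = "https://example.org/price-ranges.json"   # placeholder endpoint
CACHE_DB = "validator_cache.sqlite"

def refresh_cache():
    """Try to pull fresh ranges; silently keep the old cache if the server is down."""
    try:
        with urllib.request.urlopen(RANGES_URL, timeout=10) as resp:
            ranges = json.load(resp)        # assumed layout: {"Gold": [8500, 10500], ...}
    except OSError:
        return                              # offline - fall back to whatever is cached
    con = sqlite3.connect(CACHE_DB)
    con.execute("CREATE TABLE IF NOT EXISTS ranges (name TEXT PRIMARY KEY, lo REAL, hi REAL)")
    con.executemany("INSERT OR REPLACE INTO ranges VALUES (?, ?, ?)",
                    [(name, lo, hi) for name, (lo, hi) in ranges.items()])
    con.commit()
    con.close()

def price_ok(name, price):
    con = sqlite3.connect(CACHE_DB)
    row = con.execute("SELECT lo, hi FROM ranges WHERE name = ?", (name,)).fetchone()
    con.close()
    return row is not None and row[0] <= price <= row[1]

refresh_cache()
print(price_ok("Gold", 9200))
```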
In an ideal world: a brilliant commander makes a superb OCR solution (better than all the rest) but can't do anything with it, because (s)he is not a programmer and can't create JSON, can't upload to EDDN, can't create a .bpc/TD .prices file, etc. Should that commander have a toolbox with useful stuff, (s)he could still make a complete tool... I know of commanders with OCR solutions who can't take them to the next level. The viability of the above of course relies on the language/framework used.
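Such a toolbox piece could be as small as a converter from a tool's .csv output to an EDDN-style JSON message. The column names and message layout below are simplified stand-ins for illustration; the real schemas live in the EDDN repository:

```python
# Toy "toolbox" piece: turn a simple OCR .csv into an EDDN-style JSON message.
# The csv columns (name, buy, sell) and the message layout are simplified
# stand-ins, not the actual EDDN schema.
import csv
import json
import sys

def csv_to_message(path, system, station, uploader):
    with open(path, newline="", encoding="utf-8") as fh:
        commodities = [{"name": row["name"],
                        "buyPrice": int(row["buy"]),
                        "sellPrice": int(row["sell"])}
                       for row in csv.DictReader(fh)]
    return {
        "header": {"uploaderID": uploader, "softwareName": "toy-toolbox"},
        "message": {"systemName": system, "stationName": station,
                    "commodities": commodities},
    }

if __name__ == "__main__":
    # example usage: python csv2eddn.py prices.csv "Eranin" "Azeban City" CMDR_Example
    print(json.dumps(csv_to_message(*sys.argv[1:5]), indent=2))
```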
I know I get carried away with this kind of stuff, but perhaps one might take the above into consideration when creating tools. Tools/programs which are absolutely appreciated - even when they happen to be beasts ;-)