If you want live, online error correction, you'd want to run the ML server-side, with learning from the client side. I've forked the project, and an ML implementation for the numbers is in progress :-)
Unfortunately this crashes on my machine with a MemoryError. I believe this is due to my screen size being 5040x1050. Is there a simple way to crop the middle third of the image (horizontally), as that's where the bits we're interested in are? (I'm OK with Python, but I don't know tesseract or numpy at all, and the docs are plain confusing.)
Excellent, commander!!! The OCR is working quite well; I couldn't make tesseract work like this. I tried, but I'm too stupid. When it gets to collecting the CSV in a database, I'm with you.
If you're interested, I'll do the crop with OpenCV or numpy. ImageMagick would add another library and increase the size of the tool.
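For anyone who wants to try the numpy route in the meantime, here is a rough sketch of cropping the middle third horizontally. It assumes the screenshot is already loaded as a numpy array shaped (height, width, channels); the function name crop_middle_third is made up for this post and is not part of the tool.

```python
import numpy as np

def crop_middle_third(image):
    """Return the horizontal middle third of an image array.

    Assumes `image` is shaped (height, width) or (height, width, channels).
    Hypothetical helper, not part of the original tool.
    """
    width = image.shape[1]
    left = width // 3        # first column to keep
    right = width - left     # one past the last column to keep
    # Basic slicing returns a view, so no extra copy of the pixels is made.
    return image[:, left:right]

# Stand-in for a 5040x1050 screenshot (all black pixels).
screenshot = np.zeros((1050, 5040, 3), dtype=np.uint8)
cropped = crop_middle_third(screenshot)
print(cropped.shape)  # (1050, 1680, 3)
```

Cropping before any further processing means only a third of the pixels go through the later steps, which may or may not be enough to avoid the MemoryError, depending on where it is actually raised.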
No, honest: ImageMagick does the trick, look at http://www.imagemagick.org/index.php. There is a Win executable, too: http://www.imagemagick.org/script/bi...es.php#windows
You need the "crop" feature: http://www.imagemagick.org/Usage/crop/#crop
Oh, just a heads-up to those who are thinking of using this excellent project as the basis for automating updates of a database: beware, station names are not unique (I've seen two Forrester Ports in different star systems!)
I am already doing that, though. Since the log has the system name, I check system names together with station names; these are IDs/keys in a table that are cross-checked against each other. The problem would be if Frontier moved this out of the log file...
They keep talking about an API (at least Michael is) so one can hope...
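To illustrate the system-plus-station check, here is a minimal sketch using sqlite3 from the Python standard library. The table layout, the function name station_id and the system names "System A" and "System B" are all made up for the example; only the duplicate "Forrester Port" comes from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE stations (
        id           INTEGER PRIMARY KEY,
        system_name  TEXT NOT NULL,
        station_name TEXT NOT NULL,
        UNIQUE (system_name, station_name)  -- the pair is the real key
    )
    """
)

# Two stations with the same name in different (hypothetical) systems.
conn.executemany(
    "INSERT INTO stations (system_name, station_name) VALUES (?, ?)",
    [("System A", "Forrester Port"), ("System B", "Forrester Port")],
)

def station_id(conn, system_name, station_name):
    """Look up a station by system AND station name; None if unknown."""
    row = conn.execute(
        "SELECT id FROM stations WHERE system_name = ? AND station_name = ?",
        (system_name, station_name),
    ).fetchone()
    return row[0] if row else None

id_a = station_id(conn, "System A", "Forrester Port")
id_b = station_id(conn, "System B", "Forrester Port")
print(id_a != id_b)  # True: same station name, different rows
```

Keying on the pair means an update for one Forrester Port can never overwrite prices for the other, as long as the system name is still available from the log.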