Dangerous Neurons - commodities market OCR with close to 100% accuracy

Hi! I am developing a new OCR tool for Elite. It converts commodities market screenshots to text.

It is similar to EliteOCR, but I am trying to achieve higher accuracy. My tool is not based on TesseractOCR or any other OCR solution. It is built from scratch.

I have developed an image filtering algorithm that removes the background while preserving most text features (filtered screenshot example: http://ishack.co/f0d8JYe7p). The filtered images are then processed by a custom neural network. A specialized solution like this gives much better results on these screenshots than a general-purpose OCR system.

The prototype version is currently available here:

http://dangerousneurons.com/

Remember, it is just a prototype. Digit recognition accuracy is 100% on my test data, but letter accuracy is only about 94%. Accuracy will improve as more training data is provided. Some image filtering parameters still need tweaking; currently the filter may drop some parts of screenshots.
It is best to upload PNG screenshots. JPEG compression is lossy and blurs text edges, which will cause recognition problems.

Please report any recognition errors you encounter.
 
I have just updated the image filtering algorithm, retrained the neural network and updated the website. Recognition time has decreased to just a few seconds (thanks to a new FFT library), but there is still much to improve. After upload, the website now displays the original image with the recognized characters overlaid.
Next I am going to start working on a client application which will take screenshots, recognize them and store the information in some kind of database.
 
One more update to the recognition logic. The rightmost column will now be ignored, as it does not really contain any valuable information. There are now two recognition engines: one for letters and one for numbers. Recognition accuracy for numbers is now at 100% and letter accuracy is at 98% (this is high enough that there is no point in improving it further; instead, commodity names can be looked up in a dictionary).
Here is a screenshot just in case the website goes down again: http://ishack.co/idokIXANp
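The dictionary lookup mentioned above could look roughly like the sketch below. It is only an illustration, not the code running on the site: the commodity list is a small hypothetical sample, the function name `correct_name` and the `cutoff` value are assumptions, and the matching uses Python's standard `difflib` module rather than anything specific to this tool.

```python
import difflib

# Hypothetical sample list; a real dictionary would hold every commodity name.
COMMODITIES = ["COLTAN", "GOLD", "SILVER", "TEA", "COFFEE", "FISH"]

def correct_name(ocr_text, names=COMMODITIES, cutoff=0.6):
    """Return the closest known commodity name, or None if nothing is close enough."""
    matches = difflib.get_close_matches(ocr_text.upper(), names, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# A single misread letter (O read as 0) is still mapped to the right name.
fixed = correct_name("C0LTAN")
```

With a lookup like this, a letter engine at 98% per-character accuracy can still produce correct names, because one or two wrong characters rarely make a name closer to a different dictionary entry.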
 
The neural network seems to work pretty well, and there is not much more to improve. I am now working on the user interface. I decided to implement a Chrome-based solution. It will be a combination of a website and a browser plugin. The plugin will only be needed to take screenshots via a hotkey.
Here is a screenshot of the current state: http://ishack.co/ex7RlZ7hp
Everything displayed on the screenshot is functional. The area at the top is the screenshot recognition queue; once screenshots are recognized, they can be combined into a station. I recommend always taking several screenshots of each commodity item at different scroll positions; the system will automatically pick the best result.
The "Q" column in the commodities table indicates the quality of the screenshots or, basically, how certain the neural network is about the result. The "ERR" column indicates the probability that at least one digit was recognized incorrectly. The "COLTAN" commodity on this screenshot was partially out of view, so the neural network is indicating that the recognition result may contain mistakes. The result is actually correct.
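To make the "ERR" idea and the automatic pick between several screenshots more concrete, here is a minimal sketch. It assumes that the network outputs one confidence value per digit and that digits are independent, so the chance of at least one wrong digit is one minus the product of the confidences. The function names, the data layout and the numbers are hypothetical; the real system may compute both values differently.

```python
from math import prod

def error_probability(digit_confidences):
    """Probability that at least one digit is wrong, assuming independent digits."""
    return 1.0 - prod(digit_confidences)

def pick_best(readings):
    """Return the (text, digit_confidences) reading with the lowest error probability."""
    return min(readings, key=lambda r: error_probability(r[1]))

# Two hypothetical readings of the same price from different screenshots.
readings = [
    ("1284", [0.99, 0.70, 0.98, 0.99]),  # row partially out of view
    ("1234", [0.99, 0.99, 0.99, 0.99]),  # row fully visible
]
best_text, best_confidences = pick_best(readings)
```

Under these assumptions a single uncertain digit dominates the error estimate, which is why a row that is partially out of view gets flagged even when the result happens to be correct.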
I spent a lot of time implementing image compression. Part of the image processing is done in the browser, but then the image has to be transferred to the server without losing any data, so I had to implement a lossless image compression algorithm in JavaScript (a few years ago I would not even have taken such an idea seriously). The result is a specialized algorithm which is about twice as effective as PNG or TIFF on this type of image. I just need to optimize it a bit (currently image processing freezes the browser for a second).
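The post does not describe the compression algorithm itself, so the sketch below is not it. It only shows, with plain run-length encoding, why a specialized lossless scheme can do well on this kind of data, assuming the filtered image is mostly uniform background with short runs of text pixels. It is written in Python rather than the JavaScript used on the site, and the pixel row is made up.

```python
def rle_encode(pixels):
    """Encode a sequence of pixel values as [value, run_length] pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1      # extend the current run
        else:
            runs.append([p, 1])   # start a new run
    return runs

def rle_decode(runs):
    """Rebuild the original pixel sequence from [value, run_length] pairs."""
    out = []
    for value, length in runs:
        out.extend([value] * length)
    return out

# Hypothetical filtered row: long background runs (0) with a few text pixels (255).
row = [0] * 40 + [255] * 3 + [0] * 20 + [255] * 2 + [0] * 35
encoded = rle_encode(row)
restored = rle_decode(encoded)
```

Decoding gives back exactly the original row, which is the property that matters here: the server must see the same pixels the browser produced.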
 
+1 Rep for all your hard work :)
I am currently exploring so I will bookmark this page for when I get back and trade :)
 
Looks very promising.
However, when I visit dangerousneurons.com my browser window looks very different (see screenshot). And when I upload a commodities market screenshot, the result is some JSON like [{"column":0,"character":"S","height"...
 

Attachments

  • dangerousneurons.jpg