
Thread: EliteOCR – Optical Character Recognition for The Commodities Market

  1. #151
    Originally Posted by zxctypo
    @everyone else: WOW! Thank you so much for sending me your training images! I have 8.5k total source images now, 17 TIMES MORE than before! I thought maybe I'd get a few hundred tops. I'm both impressed and humbled by all of your support :-)
    1st: WE (the non-tool-creators) have to thank you and all the other tool creators for putting your free time into this. We are fighting spaceships, while you are fighting data.

    2nd: If you need more files, just say so (best to PM me). I now have about 4,000 files for you to test, and the count is constantly growing.

  2. #152
    Originally Posted by zxctypo
    IMPORTANT:
    Neural network/machine learning code is almost complete. I currently have 500 source images (taken from the nn_training_images folder), and of these, only 400 are being used for training (the rest for testing).

    http://i.imgur.com/UqORuKX.png

    With that few images (for machine learning, that is a very tiny sample size :-) ), and having only tried these two methods so far (totally untuned: the SVM accuracy can really be improved with better input parameters), I have, as you can see, an error rate of only 5%.


    An error rate of < 1% is easily achievable (and something I have achieved on previous projects)! I'm at the point where I NEED MORE DATA!

    If you want to help make this project more accurate, to the point where we can trust it to run automatically, then please, please SEND ME YOUR NN_TRAINING_IMAGES FOLDER! :-D
    Please use mega.co.nz, or any other simple/fast file share :-)



    https://mega.co.nz/#!XRpDCIRb!gtrxOL...la7qks_aZE7kO0
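
    For anyone who wants to reproduce that kind of experiment, below is a minimal sketch of a 400/100 train/test split with an untuned classifier baseline. scikit-learn, Pillow, and the filename-encodes-label naming scheme are illustrative assumptions only, not necessarily what zxctypo's actual code uses.

    Code:
    import glob
    import numpy as np
    from PIL import Image
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    def load_glyphs(folder):
        """Load cropped digit glyphs as flattened grayscale vectors.
        Assumes a hypothetical naming scheme where the digit label is the
        character right before the extension, e.g. glyph_0042_7.png -> '7'."""
        images, labels = [], []
        for path in glob.glob(folder + "/*.png"):
            img = Image.open(path).convert("L").resize((16, 16))
            images.append(np.asarray(img, dtype=np.float32).ravel() / 255.0)
            labels.append(path[-5])
        return np.array(images), np.array(labels)

    X, y = load_glyphs("nn_training_images")

    # 400 images for training, the remaining ~100 held out for testing.
    X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=400, random_state=0)

    clf = SVC()  # default parameters, i.e. "totally untuned"
    clf.fit(X_train, y_train)
    print("error rate: {:.1%}".format(1.0 - clf.score(X_test, y_test)))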

  3. #153
    Originally Posted by seeebek
    Yesterday I was travelling without access to my PC; I was only online on my phone. Today was my sister's birthday, so I was not able to work much on the tool. Here is a pre-release version. I'm sure it's very buggy, but I hope it will calm some of you down a little. I will work more on it tomorrow.
    https://mega.co.nz/#!m9IxwBiC!S0ckv-...lpW98-co8Lza6o
    Thank you! This tool is so awesome! It's fun to scan :b... nerdy, I know, but typing in all the prices was not that fun after a while... now it's fun again.

    EDIT: Having problems starting the new exe; getting a "The application has failed to start because its side-by-side configuration is incorrect..."

    EDIT 2: OK, fixed it by installing the Visual C++ Redistributable Packages 2008/2010 for x86 on my Windows 2012 R2 64-bit PC, which I run the tool from over LAN.

  4. #154
    Originally Posted by Anteronoid
    Thank you! This tool is so awesome! It's fun to scan :b... nerdy, I know, but typing in all the prices was not that fun after a while... now it's fun again.

    When trying to run the BPC Feeder from EliteOCR 0.3.2.1, I am getting the error:

    Line 13697 (File
    "C:\Users\*******\Downloads\eliteocr.0.3.2.1\plugins\BPC_Feeder\BPC feeder.exe"):

    Error: Variable must be of type "Object".

    Are you having any issues?

    @seeebek
    I know that you are still working on it, so if anything, I am just reporting this as a possible bug.

    Keep up the awesome work!

  5. #155
    @seeebek: first feedback on your newest version:

    1.) The Excel export now contains floating-point values with "." as the decimal delimiter. All values end with ".0", which is quite unnecessary as they are all integers.
    That causes problems for some of us who use Excel to process your CSV output (like me). In some areas of this planet the decimal delimiter is "," and not ".". ;-)
    It would be nice if you could remove this again, as it makes no sense anyway (one possible approach is sketched after this post).

    2.) I guess the BPC-Feeder doesn't work (haven't tested it yet), as there is still the problem with the "sell error". That needs to be fixed by Slopey, I think. Am I right?

    3.) A small suggestion to improve usability: can you increase the font size of the OCR result fields? They are quite hard to read compared to the OCR graphics above them. A bigger display of all these fields would help with identifying mismatches.

    And now the major point:
    4.) Thank you very much for this tool. It's really awesome and helps A LOT!!!!!

    EDIT:
    5.) Found a small bug: the setting "Remove duplicates in table" no longer works.
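
    One possible way for an exporter to avoid the ".0" problem from point 1 entirely is to cast the numeric cells to int before writing, so no decimal separator (whether "." or ",") ever appears. The sketch below uses Python's csv module with made-up column names purely for illustration; it is not how EliteOCR's exporter is actually written.

    Code:
    import csv

    # Hypothetical OCR results; the market values are integers anyway.
    rows = [
        {"commodity": "Gold",     "sell": 9041.0, "buy": 0.0,   "demand": 1520.0},
        {"commodity": "Hydrogen", "sell": 0.0,    "buy": 102.0, "demand": 0.0},
    ]

    with open("market.csv", "w", newline="") as f:
        writer = csv.writer(f, delimiter=";")  # ";" also plays nicer with comma-decimal locales
        writer.writerow(["commodity", "sell", "buy", "demand"])
        for r in rows:
            # int() drops the spurious ".0" so the CSV contains plain integers.
            writer.writerow([r["commodity"], int(r["sell"]), int(r["buy"]), int(r["demand"])])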

  6. #156
    Originally Posted by Oakshios
    When trying to run the BPC Feeder from EliteOCR 0.3.2.1, I am getting the error:

    Line 13697 (File
    "C:\Users\*******\Downloads\eliteocr.0.3.2.1\plugins\BPC_Feeder\BPC feeder.exe"):

    Error: Variable must be of type "Object".

    Are you having any issues?

    @seeebek
    I know that you are still working on it, so if anything, I am just reporting this as a possible bug.

    Keep up the awesome work!
    Yes, but if you look at the .ini file in the settings in the feeder folder, you will see that the file path is wrong.
    But I figure it's not really finished yet, so I will probably not play with it until I know more.

  7. #157
    Originally Posted by Conehead
    @seeebek: first feedback on your newest version:

    1.) The Excel export now contains floating-point values with "." as the decimal delimiter. All values end with ".0", which is quite unnecessary as they are all integers.
    That causes problems for some of us who use Excel to process your CSV output (like me). In some areas of this planet the decimal delimiter is "," and not ".". ;-)
    It would be nice if you could remove this again, as it makes no sense anyway.
    My Excel export looks OK, with no decimal delimiters at all. Maybe it's a config setting in your Excel?

    EDIT: My Excel export works fine, but if I do a CSV export I get the same result as Conehead, with decimal delimiters.

    Excel Export - Fine



    CSV Export - Floating/Delimiters


  8. #158
    Originally Posted by Anteronoid
    Yes, but if you look at the .ini file in the settings in the feeder folder, you will see that the file path is wrong.
    But I figure it's not really finished yet, so I will probably not play with it until I know more.
    I had changed the path already. I get the error with the correct path.

  9. #159
    Originally Posted by Oakshios
    I had changed the path already. I get the error with the correct path.
    Yes, same here, I just noticed.

  10. #160
    Just a quick update on what all of the data you guys sent in today has helped achieve:


    The yellow rows are labels.

    So out of 3.3k digits tested, it got 14 of them incorrect. Not too bad of a start, eh? :-)

    Shouldn't take much longer to get even more accurate :-)


    Edit: Went through the ones it got wrong to see if I could spot a pattern: it turns out it's better at telling 8/9/6 apart than some humans, because some of the input files were labeled wrong ;-)
    With those relabeled correctly, the real accuracy is starting to get awesome :-)


    Wooo, 0.24% error = 99.76% accuracy. That means it only gets one wrong out of every ~416 :-)
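
    Just to spell out the arithmetic behind those two figures:

    Code:
    # First run: 14 wrong out of ~3,300 digits tested.
    print("{:.2%} error".format(14 / 3300))            # ~0.42%

    # After fixing the mislabeled inputs: 0.24% error.
    error_rate = 0.0024
    print("{:.2%} accuracy".format(1 - error_rate))    # 99.76%
    print("1 mistake per ~{:.0f} digits".format(1 / error_rate))  # ~417, roughly the 1-in-416 quoted above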

  11. #161
    Yeah, there are definitely going to be some incorrect entries in the OCR stuff...

  12. #162
    This is the k-nearest-neighbors algorithm, which is fast to train but really susceptible to noise, so it's a great place to start for getting a feel for data cohesion/cleanliness.

    The plan, if anyone is interested, is to first get a good accuracy rate with k-nearest (which we are quickly approaching), and then transition over to a pure neural net, or, if I have time, preferably a convolutional neural net, both of which are much more noise tolerant. We'd have the most advanced video-game OCR market-data scraping program in the world hahaha :-)
    And with an accuracy rate of >= 99.9%, we'd be in the territory of trusting it enough to let it recognize everything automatically.
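
    For anyone curious what that k-nearest stage looks like in code, here is a rough sketch using OpenCV's built-in k-NN. The library choice and the assumption that the glyphs are already flattened 16x16 float32 arrays are illustrative; zxctypo's actual implementation may differ.

    Code:
    import numpy as np
    import cv2

    def train_knn(train_samples, train_labels):
        """train_samples: (N, 256) float32 flattened glyphs; train_labels: (N,) int digit labels."""
        knn = cv2.ml.KNearest_create()
        knn.train(train_samples.astype(np.float32), cv2.ml.ROW_SAMPLE,
                  train_labels.astype(np.int32))
        return knn

    def error_rate(knn, test_samples, test_labels, k=5):
        """Classify each test glyph by majority vote of its k nearest training glyphs."""
        _, results, _, _ = knn.findNearest(test_samples.astype(np.float32), k)
        wrong = np.count_nonzero(results.ravel().astype(np.int32) != test_labels)
        return wrong / len(test_labels)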

  13. #163
    Here are some more training images.

    https://mega.co.nz/#!KEwGnAAQ!FI18dN...8lt9PJlEMAdhA0

    - - - - - Additional Content Posted / Auto Merge - - - - -

    Originally Posted by Anteronoid
    My Excel export looks OK, with no decimal delimiters at all. Maybe it's a config setting in your Excel?

    EDIT: My Excel export works fine, but if I do a CSV export I get the same result as Conehead, with decimal delimiters.

    Excel Export - Fine
    *SNIP*

    CSV Export - Floating/Delimiters
    *SNIP*

    Any chance that when exporting CSV it is triggering decimals to be turned on in Excel? Can you go to the formatting options in Excel and turn off decimal places?

  14. #164
    Originally Posted by Decker
    Yeah, there are definitely going to be some incorrect entries in the OCR stuff...
    Yes, some are definitely from me. Sorry, guys.
    But at least that proves your algorithm is working well.

    Originally Posted by zxctypo
    This is the k-nearest-neighbors algorithm, which is fast to train but really susceptible to noise, so it's a great place to start for getting a feel for data cohesion/cleanliness.

    The plan, if anyone is interested, is to first get a good accuracy rate with k-nearest (which we are quickly approaching), and then transition over to a pure neural net, or, if I have time, preferably a convolutional neural net, both of which are much more noise tolerant. We'd have the most advanced video-game OCR market-data scraping program in the world hahaha :-)
    And with an accuracy rate of >= 99.9%, we'd be in the territory of trusting it enough to let it recognize everything automatically.
    I don't understand much about neural nets, but it sounds weird. Just go ahead!

  15. #165
    Originally Posted by NoiseCrime
    You're right, it is a lot of work, and sadly I'm not sure it's worth it, as one of the biggest issues is resolution and the additional loss of resolution towards the edges of the screen. For example, if you check out the link to the large image, the demand column on the dewarped version looks pretty good as it's in the sweet spot of the Rift (center), but the text for the commodity names and even the station name has less resolution, so when dewarped it becomes quite a mess. Bilinear filtering may help that, and certainly taking higher-resolution screenshots (if possible via the Rift) will be essential, as without the additional resolution I don't see OCR working even if we can automatically/manually remove the rotation/perspective on the panel.

    I gave Alt+F10 a go as well on that previous screenshot. It seems to cut out the right lens, remove some distortion, and significantly increase the file size/resolution.

    WARNING: Large picture!


    http://i.imgur.com/GzRBcN0.jpg
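
    Since the discussion above is about removing the rotation/perspective on the panel before OCR, here is a rough sketch of what that correction step can look like with OpenCV. The corner coordinates and output size are placeholders you would pick by hand (or detect) on the actual screenshot; this is not NoiseCrime's dewarping code.

    Code:
    import numpy as np
    import cv2

    img = cv2.imread("rift_screenshot.jpg")

    # Panel corners in the source image (top-left, top-right, bottom-right, bottom-left).
    # These numbers are placeholders, not measured from the screenshot above.
    src = np.float32([[412, 180], [1310, 240], [1290, 860], [430, 920]])

    # Where those corners should land in the flattened output.
    w, h = 1200, 800
    dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])

    M = cv2.getPerspectiveTransform(src, dst)
    # Bilinear sampling, as suggested in the quoted post, softens the resampling artefacts.
    flat = cv2.warpPerspective(img, M, (w, h), flags=cv2.INTER_LINEAR)
    cv2.imwrite("panel_dewarped.png", flat)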
