A more up-to-date tutorial: https://github.com/gulakov/tesseract-ocr-sample
1. Download and install the full Windows version of Tesseract. This way you won’t have to extract all the separate files yourself.
Leave the destination folder as the default (C:\Program Files\Tesseract-OCR)
Remember to check Tesseract Development files!
2. Open up Microsoft Visual Studio 2008 and go to Tools -> Options
Projects and Solutions -> VC++ Directories -> Show directories for -> Include files, then add the include folders from your Tesseract installation
3. Next, click Show directories for -> Library files and add the library folder from your Tesseract installation
4. Configure linker options for Tesseract
Right-click your project in Solution Explorer and click Properties
Configuration Properties -> Linker -> Input -> Additional Dependencies
Add this in there:
**You will have to do this for every project
***I think you can do this with the property sheets but I don’t know how to set it up. Message me if you do!
5. Copy liblept168.dll, liblept168d.dll, libtesseract302.dll and libtesseract302d.dll from C:\Program Files\Tesseract-OCR into your project folder (Optional)
If you get a missing .dll error when you run your program, add these files to your project folder.
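If you want to see which of these files are actually missing before you start copying things around, a small check like this can help. It is only a sketch: missingFiles and tesseractDlls are helper names I made up (they are not part of Tesseract), and libtesseract302d.dll is my assumption for the name of the debug build, following the liblept168d.dll pattern.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Returns every name from `names` that cannot be opened inside `dir`.
// An empty `dir` means the current working folder.
std::vector<std::string> missingFiles(const std::string& dir,
                                      const std::vector<std::string>& names) {
    std::vector<std::string> missing;
    for (std::size_t i = 0; i < names.size(); ++i) {
        std::string path = dir;
        // Add a separator only when `dir` does not already end with one.
        if (!path.empty() && path[path.size() - 1] != '/' &&
            path[path.size() - 1] != '\\') {
            path += '/';
        }
        path += names[i];
        std::ifstream probe(path.c_str(), std::ios::binary);
        if (!probe.good()) {
            missing.push_back(names[i]);
        }
    }
    return missing;
}

// DLL names from step 5. The last one is an assumed debug-build name.
std::vector<std::string> tesseractDlls() {
    std::vector<std::string> names;
    names.push_back("liblept168.dll");
    names.push_back("liblept168d.dll");
    names.push_back("libtesseract302.dll");
    names.push_back("libtesseract302d.dll");  // assumption, see note above
    return names;
}
```

Calling missingFiles("", tesseractDlls()) at the start of main() returns the DLLs that are not in the current working folder. When you run from Visual Studio, that is normally the project folder.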
6. Hello World!
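To check that your project works, create your main .cpp file with this code (it belongs inside main() and needs the baseapi.h, allheaders.h and iostream headers):
using namespace std;
tesseract::TessBaseAPI api;                   // the OCR engine
api.Init("", "eng", tesseract::OEM_DEFAULT);  // load the English language data
PIX *pixs = pixRead(image);                   // image holds the file name of the picture
STRING text_out;                              // receives the recognized text
api.ProcessPages(image, NULL, 0, &text_out);  // run OCR on the whole file
cout << text_out.string();                    // print the result
pixDestroy(&pixs);                            // free the Leptonica image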
Copy this image into your project folder (right-click it and choose Save As):
Copy eng.traineddata from C:\Program Files\Tesseract-OCR\tessdata into your project folder, and the program should output Hello World! The traineddata file is the language data Tesseract uses to read the text.
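If you would rather have the program check the result for you than compare it by eye, something like the following might do. It is a rough sketch in plain C++ with no Tesseract calls. trimWhitespace and ocrMatches are hypothetical helper names, and the trimming is there because I am assuming the recognized text can come back with extra newlines or spaces around it.

```cpp
#include <string>

// Strips leading and trailing spaces, tabs, carriage returns and newlines.
std::string trimWhitespace(const std::string& text) {
    const char* ws = " \t\r\n";
    std::string::size_type first = text.find_first_not_of(ws);
    if (first == std::string::npos) {
        return "";  // the text was empty or whitespace only
    }
    std::string::size_type last = text.find_last_not_of(ws);
    return text.substr(first, last - first + 1);
}

// True when the OCR text equals `expected` once surrounding whitespace
// is ignored. Characters inside the text must match exactly.
bool ocrMatches(const std::string& ocrText, const std::string& expected) {
    return trimWhitespace(ocrText) == trimWhitespace(expected);
}
```

To use it, convert the recognized text to a std::string and call ocrMatches(yourText, "Hello World!"). A result of false usually means a misread character or the wrong image.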
More to come! I will be making a tutorial maybe next week on linking OpenCV with Tesseract and maybe also on how to train Tesseract.