Tutorial: How to Install Tesseract OCR 3.02.02 for Visual Studio 2008 on Windows Vista

I could not find a single good tutorial for setting up Tesseract on VS2008 other than the docs that come with Tesseract, so I decided to write my own for those interested.

More updated tutorial: https://github.com/gulakov/tesseract-ocr-sample

1. Download and install the full Windows version of Tesseract. This way you won’t have to extract all the separate files.

http://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-ocr-setup-3.02.02.exe
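Leave the destination folder as the default (C:\Program Files\Tesseract-OCR)
Remember to check Tesseract Development files!
If the include\tesseract folder or the .lib files are missing after installation (several readers report this in the comments), download tesseract-3.02.02-win32-lib-include-dirs.zip from the Tesseract downloads page and copy its include and lib contents into the install folder.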

2. Open Microsoft Visual Studio 2008 and go to Tools -> Options
Projects and Solutions -> VC++ Directories -> Show directories for -> Include files

Add:
C:\Program Files\Tesseract-OCR\include
C:\Program Files\Tesseract-OCR\include\tesseract
C:\Program Files\Tesseract-OCR\include\leptonica

3. Next, click Show directories for -> Library files

Add:
C:\Program Files\Tesseract-OCR\lib
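
Several readers report in the comments that the include\tesseract folder or the .lib files were missing after installation. As a rough sanity check before configuring the rest of the project, a small standalone program could look for the files this tutorial relies on. This is only a sketch: the function names expectedFiles and missingFiles are made up for illustration, and the file list is an assumption based on the directories and library names used in this tutorial, not an official manifest.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Files this tutorial expects below the install root.
// ASSUMPTION: layout inferred from the include/lib paths used in the tutorial.
std::vector<std::string> expectedFiles() {
    std::vector<std::string> files;
    files.push_back("include\\tesseract\\baseapi.h");
    files.push_back("include\\leptonica\\allheaders.h");
    files.push_back("lib\\libtesseract302.lib");
    files.push_back("lib\\liblept168.lib");
    return files;
}

// Returns the full paths of expected files that cannot be opened for reading.
std::vector<std::string> missingFiles(const std::string& root) {
    std::vector<std::string> missing;
    const std::vector<std::string> files = expectedFiles();
    for (std::size_t i = 0; i < files.size(); ++i) {
        const std::string path = root + "\\" + files[i];
        std::ifstream probe(path.c_str());
        if (!probe.good()) {
            missing.push_back(path);
        }
    }
    return missing;
}
```

Calling missingFiles("C:\\Program Files\\Tesseract-OCR") and printing the result shows which paths need attention; an empty result means every expected file could be opened.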

4. Configure linker options for Tesseract

Right-click your project in Solution Explorer and click Properties.

Configuration Properties -> Linker -> Input -> Additional Dependencies

Add these:

libtesseract302.lib
libtesseract302d.lib
liblept168.lib
liblept168d.lib

**You will have to do this for every project.
***I think you can do this with property sheets, but I don’t know how to set them up. Message me if you do!
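
One way to avoid repeating the linker settings in every project is MSVC's #pragma comment(lib, ...), which requests a library from the source file itself. The sketch below assumes that the "d" suffix marks the debug builds of the libraries and that _DEBUG is defined in Debug configurations (MSVC defines it with the debug runtime options). tesseractLibName is a hypothetical helper added only so the selection can be printed or checked. The library directory from step 3 still has to be configured.

```cpp
#include <string>

// MSVC-only: ask the linker for the libraries directly from source.
// Other compilers skip this section entirely.
#ifdef _MSC_VER
  #ifdef _DEBUG
    #pragma comment(lib, "libtesseract302d.lib")
    #pragma comment(lib, "liblept168d.lib")
  #else
    #pragma comment(lib, "libtesseract302.lib")
    #pragma comment(lib, "liblept168.lib")
  #endif
#endif

// Hypothetical helper: reports which Tesseract import library
// this build configuration corresponds to.
inline std::string tesseractLibName() {
#ifdef _DEBUG
    return "libtesseract302d.lib";
#else
    return "libtesseract302.lib";
#endif
}
```

Putting the pragma section into one shared header means each new project only needs to include that header instead of editing Additional Dependencies.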

5. Copy liblept168.dll, liblept168d.dll, libtesseract302.dll and libtesseract302d.dll from C:\Program Files\Tesseract-OCR into your project folder (Optional)

If you get a missing .dll error when you run your program, add these files to your project folder.
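
If it is unclear whether Windows can actually find the DLLs, a quick test is to try loading them by name. The sketch below uses the Win32 calls LoadLibraryA and FreeLibrary; canLoadDll is a hypothetical helper name, and on non-Windows platforms it simply returns false. A failed load can also mean that something the DLL itself depends on is missing, so treat the result as a hint rather than a diagnosis.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Returns true if the named DLL can be loaded through the normal
// Windows DLL search order (application directory, system
// directories, PATH, and so on).
// On non-Windows platforms this sketch always returns false.
bool canLoadDll(const std::string& name) {
#ifdef _WIN32
    HMODULE handle = LoadLibraryA(name.c_str());
    if (handle == NULL) {
        return false;
    }
    FreeLibrary(handle);
    return true;
#else
    (void)name;  // unused outside Windows
    return false;
#endif
}
```

For example, calling canLoadDll("libtesseract302.dll") from a small console program placed next to your .exe shows whether the copy in step 5 put the file somewhere Windows looks.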

6. Hello World!


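To check that your project works, create your main cpp file with this code:

#include <baseapi.h>
#include <allheaders.h>
#include <iostream>
#include <string>

using namespace std;

int main(void){

    tesseract::TessBaseAPI api;
    // Init returns 0 on success
    if (api.Init("", "eng", tesseract::OEM_DEFAULT) != 0) {
        cerr << "Could not initialize Tesseract" << endl;
        return 1;
    }
    // Page segmentation mode 7: treat the image as a single line of text
    api.SetPageSegMode(static_cast<tesseract::PageSegMode>(7));
    api.SetOutputName("out");

    cout << "File name:";
    string image;   // std::string avoids overflowing a fixed-size buffer
    cin >> image;

    // Use Leptonica to confirm the image can be read, then release it
    PIX *pixs = pixRead(image.c_str());
    if (pixs == NULL) {
        cerr << "Could not read " << image << endl;
        return 1;
    }
    pixDestroy(&pixs);

    // ProcessPages opens the file itself and stores the recognized text in text_out
    STRING text_out;
    if (!api.ProcessPages(image.c_str(), NULL, 0, &text_out)) {
        cerr << "Could not process " << image << endl;
        return 1;
    }

    cout << text_out.string();
    return 0;
}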

Copy this image into your project folder: (Right click save file as)


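Copy eng.traineddata from C:\Program Files\Tesseract-OCR\tessdata into your project folder and the program should output Hello World! The traineddata file is the language data Tesseract uses to read the text. If Tesseract reports that it cannot open ./tessdata/eng.traineddata, put the file in a tessdata folder inside your project folder instead (see the comments).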
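
To see where the language file is expected, it can help to build the path yourself and check that it exists before calling Init. The sketch below follows the pattern from the error message readers quote in the comments (./tessdata/eng.traineddata). It is an approximation for troubleshooting, not Tesseract's actual lookup code, and it ignores the TESSDATA_PREFIX environment variable. The names trainedDataPath and fileExists are hypothetical.

```cpp
#include <fstream>
#include <string>

// Builds "<datapath>/tessdata/<lang>.traineddata".
// ASSUMPTION: an empty datapath is treated as the current directory,
// matching the "./tessdata/..." path seen in the reported error message.
std::string trainedDataPath(const std::string& datapath,
                            const std::string& lang) {
    std::string base = datapath.empty() ? std::string("./") : datapath;
    const char last = base[base.size() - 1];
    if (last != '/' && last != '\\') {
        base += "/";
    }
    return base + "tessdata/" + lang + ".traineddata";
}

// Returns true if the file can be opened for reading.
bool fileExists(const std::string& path) {
    std::ifstream probe(path.c_str());
    return probe.good();
}
```

Checking fileExists(trainedDataPath("", "eng")) from the program's working directory gives a quick yes or no before Tesseract prints its own error.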

More to come! I will be making a tutorial maybe next week on linking OpenCV with Tesseract and maybe also on how to train Tesseract.

57 Comments

  • Unknown
    November 14, 2012

    Hi, I did what you wrote here and it works, but in my "C:\Program Files\Tesseract-OCR\include" folder I have a "leptonica" folder and some header files, and I cannot find the "tesseract" folder! I think my setup didn't go well.
    Can you upload this folder?
    Thanks

  • Tiara Livia Permata
    January 18, 2013

    Is Tesseract compatible with Visual Studio 2010?

    • ayoungprogrammer
      January 18, 2013

      Yes Tesseract is compatible with VS 2010 but it requires a slightly different installation method.

    • sabir jamal
      March 3, 2013

      you said "Yes Tesseract is compatible with VS 2010 but it requires a slightly different installation method."
      please can you give more detail about this method

    • admin
      September 9, 2013

      Michael please give complete details about this installation method

  • Dhut
    February 27, 2013

    Hey Nice Tutorial,
    Problem is that I am not able to locate tesseract302.lib
    libtesseract302d.lib these two files, What should I do?

    • Dhut
      February 27, 2013

      sorry its
      libtesseract302.lib
      libtesseract302d.lib

    • ayoungprogrammer
      February 27, 2013

      You might have to do step 5 and move these libraries into your folder. If you followed my installation instructions, they should be in C:\Program Files\Tesseract-OCR

  • Avadhut Chaudhari
    March 5, 2013

    Hello Michael,
    I followed your installation instructions, but my C:\Program Files\Tesseract-OCR didn't contain those files. So I downloaded tesseract-3.02.02-win32-lib-include-dirs.zip from https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-3.02.02-win32-lib-include-dirs.zip&can=2&q=.

    and copied those files from the folder to my project folder and into C:\Program Files (x86)\Tesseract-OCR\lib, so finally it started working.

    This is tested on VS 2010.. working absolutely fine.

  • Imagelife
    March 5, 2013

    Hello Michael,
    I followed your installation instructions, but my C:\Program Files\Tesseract-OCR didn't contain those files. So I downloaded tesseract-3.02.02-win32-lib-include-dirs.zip from https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-3.02.02-win32-lib-include-dirs.zip&can=2&q=.

    and copied those files from the folder to my project folder and into C:\Program Files (x86)\Tesseract-OCR\lib, so finally it started working.

    This is tested on VS 2010.. working absolutely fine.

  • Vicky Patil
    March 6, 2013

    Hello Michael,
    I followed your installation instruction but I am getting below error
    'Error 1 fatal error C1083: Cannot open include file: 'baseapi.h': No such file or directory d:workprojectsocrtesseracttesseracttesseract.cpp 5 TesserAct'
    Please provide me appropriate solution.
    Thanks and regards.
    Vikky

    • ayoungprogrammer
      March 6, 2013

      If everything is installed on your D drive, you will have to redo every step using D:\ instead of C:\

  • Vicky Patil
    March 11, 2013

    Hello Michael,
    While installing the tesseract-ocr-setup-3.02.02, I am getting following error
    "http download error. Download Status of : File Not Found(404). Click OK to continue". Please tell why I am getting this error. And also provide solution to solve this error.
    Thanks and Regards,
    Vicky Patil

  • Sagar Patil
    March 13, 2013

    Hello Michael,
    While installing the tesseract-ocr-setup-3.02.02, I am getting following error
    "http download error. Download Status of : File Not Found(404). Click OK to continue". Please tell why I am getting this error. And also provide solution to solve this error.
    Thanks and Regards,
    Sagar Patil

  • Sharath Raju
    March 29, 2013

    Hi Michael,
    By following your tutorial i have installed Tesseract . While checking with your code in visual studio 2008 , I am getting below error,
    Error opening data file ./tessdata/eng.traineddata. Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory. Failed loading language 'eng'. Tesseract couldn't load any languages.

    I really can't understand this error. I have already copied the eng.traineddata file from the Tesseract-OCR folder to the project folder!!
    help me in this regard !! Thank you!!

    • Jeremy Langer
      September 27, 2013

      You need to put your traineddata file inside of a 'tessdata' folder. So it should look something like this:
      [project folder]\tessdata\eng.traineddata

  • Muhammad Shariq
    April 5, 2013

    Hi Michael,

    I followed your tutorial on Visual Studio 2008 without much trouble, except that some lib files and the tesseract directory in the include folder were missing. What I found is that if we install Tesseract from the installer available on its website, this directory and these lib files are not included in the package.

    The solution is to download the "tesseract-3.02.02-win32-lib-include-dirs.zip" file from Tesseract's website, unzip it, and copy the "tesseract" directory into "Program Files (x86)\Tesseract-OCR\include" and the missing lib files into the "Program Files (x86)\Tesseract-OCR\lib" folder.

    I hope this will be helpful for the future visitors.

    • ayoungprogrammer
      April 5, 2013

      Thanks! Hopefully this can help other people who have been having problems

  • Ragchaabazar Bud
    May 7, 2013

    I followed this tutorial for Visual Studio 2010 but I get this error: "The program can't start because libtesseract302.dll is missing from your computer. Try reinstalling the program to fix this problem". Please help me?

    • ayoungprogrammer
      May 7, 2013

      Follow step 5:

      5. Copy liblept168.dll, liblept168d.dll, libtesseract302.dll and libtesseract302d.dll from C:\Program Files\Tesseract-OCR into your project folder (Optional)

      And this should fix your error

    • Alex Ferraro
      August 26, 2013

      I originally downloaded tesseract from the SVN, and built the solution to get libtesseract302.dll. But it never gave me the .lib files. So I did these instructions and additionally downloaded the latest files because even the download didn't come with libtesseract302d.dll so now I have everything. And I've put all of these in my project directly and also included them in the link additional and still get the missing libtesseract302d.dll error

      liblept168.dll
      liblept168.lib
      liblept168d.dll
      liblept168d.lib
      libtesseract302.lib
      libtesseract302.dll
      libtesseract302d.dll
      libtesseract302d.lib

      maybe I've put them in the wrong path, but I've tried in the solution folder, in the project folder, and in the project debug folder. I've tried in the places separately and all of the places at the same time. I have no idea how to get this running

    • ayoungprogrammer
      August 26, 2013

      Try putting the .dlls in the same folder as your .exe and run the .exe directly instead of running from VS and see if that works

  • NDQUANGR DEV
    May 15, 2013

    I have done all of the above steps, and the sample code builds successfully. But when I try to run/debug it, it throws an exception: "The application was unable to start correctly (0xc0150002). Click OK to close the application". What might be the problem? Thank you very much.

    • Witek
      June 3, 2013

      Same situation here…although only in Debug mode. In Release mode it works without a problem. Strange…Anyone has a solution for that? I am using VS2010 Professional under Windows 7 64bit

    • Gene
      January 23, 2016

      Hello Michael. Thanks for this tutorial. I know it's been ages but I just ran into the same problem on VS 2013 on windows 8, 64 bit. Any chance you could help? I'm so close. Thanks in advance for your efforts.

    • ayoungprogrammer
      January 28, 2016

      What are the error messages you are getting?

  • Emmanuel Lopez Lopez
    May 24, 2013

    Hi, I followed your tutorial, but I get this error:
    warning C4627: '#include <baseapi.h>': skipped when looking for precompiled header

    • Emmanuel Lopez Lopez
      May 24, 2013

      It's the same error for
      allheaders.h

    • ayoungprogrammer
      May 24, 2013

      When you create a new console project, make sure you are not using precompiled headers

    • Emmanuel Lopez Lopez
      May 24, 2013

      i solved that error : now i have this problem,
      fatal error LNK1104: can not open file 'libtesseract302.lib'

    • ayoungprogrammer
      May 25, 2013

      You did not configure your directories properly:

      Open up Microsoft Visual Studio 2008 and go to Tools -> Options
      Next click show directories for -> Library Files
      Add:
      C:\Program Files\Tesseract-OCR\lib

      If you have installed the libraries somewhere else, set the directory there

  • Emmanuel Lopez Lopez
    May 25, 2013

    I did it, now I get this:
    "The program can't start because libtesseract302.dll is missing from your computer. Try reinstalling the program to fix this problem".
    libtesseract302.dll is not in C:\Program Files (x86)\Tesseract-OCR,
    so I downloaded it, but I don't know what I have to do with that dll. I did step 5 from your tutorial.

    • Emmanuel Lopez Lopez
      May 25, 2013

      i solve that, the code its working thank you for all

  • ayoungprogrammer
    May 25, 2013

    Download the .dll and put it in your project folder

  • nimantha lakmal
    August 3, 2013

    interface is totally different in visual studio 2010. Can you give me the steps that I have to follow in that.IT WILL BE REALLY HELPFUL. NO REFERENCE FOR INCLUDING TESSERACT API IN 2010

  • Santhosh Bander
    August 6, 2013

    my project is in vs 2012 4.0 framework.. i was using tessnet2.0 but very poooor results, how can i implement 3.02 in vs 2012

  • Thanhtai Le
    August 20, 2013

    Hey guys, use this rar file; it contains all the libraries you need.
    If you see the warning that libtesseract302.dll is missing, just copy 2 files,
    libtesseract302.dll and libtesseract302d.dll, into your solution folder.
    This is the link to the files:
    http://www.mediafire.com/download/dge5mtdmp9q2e1z/tesseractlib.rar

    Santhosh Bander, with C++ in Visual Studio 2012 it is similar: at the end of the field there is an arrow button; select Edit and do the same as above.

  • akhil nair
    September 2, 2013

    It worked in VS 2010.

    Thank you

  • Matěj Ecler
    November 22, 2013

    Hi man, thanks a lot for this tutorial, but I have this problem:

    c:\users\bender\documents\visual studio 2008\projects\tesseracttest\tesseracttest\tesseracttest.cpp(5) : fatal error C1083: Cannot open include file: 'baseapi.h': No such file or directory

    what should I do?

    • ayoungprogrammer
      November 23, 2013

      It is most likely you installed the Tesseract header files in the wrong place or did not add the dependency for headers correctly

  • wael alkhatib
    November 26, 2013

    tesseract OCR in C++ in Visual Studio 2012
    I have been struggling with this for a week. The problem is that I need
    libtesseract302.dll and libtesseract302d.dll recompiled for VS2012,
    and Tesseract does not provide that.

    • Stas
      January 15, 2014

      Using v100 toolset(alt+enter -> platform toolset) or recompile lib for vs2012(v110)

  • Swati Jagtap
    March 8, 2014

    Do you have idea of installing tesseract in Qt creator….I am struggling for it for the last week but didnot find well

  • Robitics
    March 20, 2014

    I have error on ->> :
    Error 1 error C2365: 'PT_UNKNOWN' : redefinition; previous definition was 'enumerator' c:,,,desktoptesseractincludetesseractcapi.h

    in capi.h is:
    typedef enum TessPolyBlockType { PT_UNKNOWN, PT_FLOWING_TEXT, PT_HEADING_TEXT, PT_PULLOUT_TEXT, PT_TABLE, PT_VERTICAL_TEXT,
    PT_CAPTION_TEXT, PT_FLOWING_IMAGE, PT_HEADING_IMAGE, PT_PULLOUT_IMAGE, PT_HORZ_LINE, PT_VERT_LINE,
    PT_NOISE, PT_COUNT } TessPolyBlockType;

  • Gleisson Gomes
    June 12, 2014

    Hello, first I'd like to apologize for my typos.
    I'm having a little trouble installing version 3.02.02 on Windows 7 x86 and importing the library into Microsoft Visual Studio 2013.
    Would it be possible to make a tutorial?

  • Varun Paul
    September 10, 2014

    I got an output like

    1>—— Build started: Project: tessprogram, Configuration: Debug x64 ——
    1>Build started 9/10/2014 12:46:31 PM.
    1>InitializeBuildStatus:
    1> Touching "x64\Debug\tessprogram.unsuccessfulbuild".
    1>ClCompile:
    1> All outputs are up-to-date.
    1>tesspgm.obj : error LNK2019: unresolved external symbol "public: virtual __cdecl tesseract::TessBaseAPI::~TessBaseAPI(void)" (??1TessBaseAPI@tesseract@@UEAA@XZ) referenced in function main
    1>tesspgm.obj : error LNK2019: unresolved external symbol "public: __cdecl STRING::~STRING(void)" (??1STRING@@QEAA@XZ) referenced in function main
    1>tesspgm.obj : error LNK2019: unresolved external symbol "public: char const * __cdecl STRING::string(void)const " (?string@STRING@@QEBAPEBDXZ) referenced in function main
    1>tesspgm.obj : error LNK2019: unresolved external symbol "public: bool __cdecl tesseract::TessBaseAPI::ProcessPages(char const *,char const *,int,class STRING *)" (?ProcessPages@TessBaseAPI@tesseract@@QEAA_NPEBD0HPEAVSTRING@@@Z) referenced in function main
    1>tesspgm.obj : error LNK2019: unresolved external symbol "public: __cdecl STRING::STRING(void)" (??0STRING@@QEAA@XZ) referenced in function main
    1>tesspgm.obj : error LNK2019: unresolved external symbol pixRead referenced in function main
    1>tesspgm.obj : error LNK2019: unresolved external symbol "public: void __cdecl tesseract::TessBaseAPI::SetOutputName(char const *)" (?SetOutputName@TessBaseAPI@tesseract@@QEAAXPEBD@Z) referenced in function main
    1>tesspgm.obj : error LNK2019: unresolved external symbol "public: void __cdecl tesseract::TessBaseAPI::SetPageSegMode(enum tesseract::PageSegMode)" (?SetPageSegMode@TessBaseAPI@tesseract@@QEAAXW4PageSegMode@2@@Z) referenced in function main
    1>tesspgm.obj : error LNK2019: unresolved external symbol "public: __cdecl tesseract::TessBaseAPI::TessBaseAPI(void)" (??0TessBaseAPI@tesseract@@QEAA@XZ) referenced in function main
    1>tesspgm.obj : error LNK2019: unresolved external symbol "public: int __cdecl tesseract::TessBaseAPI::Init(char const *,char const *,enum tesseract::OcrEngineMode,char * *,int,class GenericVector const *,class GenericVector const *,bool)" (?Init@TessBaseAPI@tesseract@@QEAAHPEBD0W4OcrEngineMode@2@PEAPEADHPEBV?$GenericVector@VSTRING@@@@3_N@Z) referenced in function "public: int __cdecl tesseract::TessBaseAPI::Init(char const *,char const *,enum tesseract::OcrEngineMode)" (?Init@TessBaseAPI@tesseract@@QEAAHPEBD0W4OcrEngineMode@2@@Z)
    1>c:\users\varun\documents\visual studio 2010\Projects\tessprogram\x64\Debug\tessprogram.exe : fatal error LNK1120: 10 unresolved externals
    1>
    1>Build FAILED.
    1>
    1>Time Elapsed 00:00:00.38
    ========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========

    So can anyone please help me to solve this error?
    I'm using visual studio 2010 in a 64 bit computer.

  • intan mohd yunus
    March 22, 2015

    Hi Michael. thank you for this post. This is really a good tutorial for people like me. (first time meeting tesseract)
    FYI, i'm in the midst of learning on how to develop a text signage recognition mobile application in Android.
    I decided to use Tesseract and OpenCv and integrate both of them in Visual Studio 2008.
    I have follow ur steps above to create a simple program, however the program failed to run and the error says:

    " 1>c:\users\intan\documents\visual studio 2008\projects\tesseracttest\tesseracttest\main.cpp(1) : fatal error C1083: Cannot open include file: 'baseapi.h': No such file or directory "

    Why is it this happen? I really hope u can help me in solving this issue. I'm really new in this but willing to learn. Thnks btw. 🙂

  • Unknown
    December 12, 2015

    Hi Michael,thank you for your post.
    i am getting these errors in visual studio 2013,please solve me out.these errors are from headerfiles.

    Warning 1 warning C4267: 'initializing' : conversion from 'size_t' to 'int', possible loss of data

    Error 2 error C4996: 'strncpy': This function or variable may be unsafe. Consider using strncpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS.

    Warning 3 warning C4005: 'snprintf' : macro redefinition

    Warning 4 warning C4305: 'initializing' : truncation from 'double' to 'const l_float32'

    Warning 5 warning C4305: 'initializing' : truncation from 'double' to 'const l_float32'

  • Phạm Ngọc Bách
    January 7, 2016

    Thanks
