Tutorial: How to Install Tesseract OCR 3.02.02 for Visual Studio 2008 on Windows Vista

Computer Vision, Uncategorized
November 4, 2012

I could not find a single good tutorial for setting up Tesseract on VS2008, other than the docs that come with Tesseract, so I decided to write my own for those interested.

More updated tutorial: https://github.com/gulakov/tesseract-ocr-sample

1. Download and install the full Windows version of Tesseract. This way you won’t have to extract all the separate files.

http://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-ocr-setup-3.02.02.exe
Leave the destination folder as the default (C:\Program Files\Tesseract-OCR). On 64-bit Windows the default is C:\Program Files (x86)\Tesseract-OCR, so adjust the paths in the following steps accordingly.
Remember to check Tesseract Development files!

2. Open up Microsoft Visual Studio 2008 and go to Tools -> Options
Projects and Solutions -> VC++ Directories -> Show directories for -> Include files

Add:
C:\Program Files\Tesseract-OCR\include
C:\Program Files\Tesseract-OCR\include\tesseract
C:\Program Files\Tesseract-OCR\include\leptonica

3. Next click show directories for -> Library Files

Add:
C:\Program Files\Tesseract-OCR\lib

4. Configure linker options for Tesseract

Right click your project in solution explorer and click properties

Configuration Properties -> Linker->Input ->Additional Dependencies

Add this in there:

libtesseract302.lib
libtesseract302d.lib
liblept168.lib
liblept168d.lib

**You will have to do this for every project
***I think you can do this with the property sheets but I don’t know how to set it up. Message me if you do!

5. Copy liblept168.dll, liblept168d.dll, libtesseract302.dll and libtesseract302d.dll from C:\Program Files\Tesseract-OCR into your project folder (Optional)

If you get a missing .dll error when you run your program, copy these files into your project folder.

6. Hello World!

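To check if your project works, create your main cpp file with this code:

#include <baseapi.h>
#include <allheaders.h>
#include <iostream>
#include <string>

using namespace std;

int main(void){

    tesseract::TessBaseAPI api;
    // Init returns 0 on success; with an empty datapath the language
    // data is looked up under ./tessdata/
    if (api.Init("", "eng", tesseract::OEM_DEFAULT) != 0) {
        cerr << "Could not initialize Tesseract." << endl;
        return 1;
    }
    // Page segmentation mode 7: treat the image as a single line of text
    api.SetPageSegMode(tesseract::PSM_SINGLE_LINE);
    api.SetOutputName("out");

    cout << "File name:";
    string image;
    getline(cin, image);

    // Check that Leptonica can read the image before running OCR
    PIX *pixs = pixRead(image.c_str());
    if (pixs == NULL) {
        cerr << "Could not read image: " << image << endl;
        return 1;
    }
    pixDestroy(&pixs);

    STRING text_out;
    if (!api.ProcessPages(image.c_str(), NULL, 0, &text_out)) {
        cerr << "OCR failed." << endl;
        return 1;
    }

    cout << text_out.string();
    return 0;
}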

Copy this image into your project folder: (Right click save file as)

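Copy eng.traineddata from C:\Program Files\Tesseract-OCR\tessdata into a tessdata folder inside your project folder, so that the file sits at [project folder]\tessdata\eng.traineddata, and the program should output Hello World! The traineddata file is the language data Tesseract uses to read the text.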

More to come! I will be making a tutorial maybe next week on linking OpenCV with Tesseract and maybe also on how to train Tesseract.

Unknown

November 14, 2012

Hi, I did what you wrote here and it works, but in my "C:\Program Files\Tesseract-OCR\include" folder I have a "leptonica" folder and some header files, and I cannot find the "tesseract" folder! I think my setup didn't go well.
Can you upload this folder?
Thanks
Tiara Livia Permata

January 18, 2013

Is Tesseract compatible with Visual Studio 2010?

ayoungprogrammer

January 18, 2013

Yes Tesseract is compatible with VS 2010 but it requires a slightly different installation method.
sabir jamal

March 3, 2013

you said "Yes Tesseract is compatible with VS 2010 but it requires a slightly different installation method."
please can you give more detail about this method
admin

September 9, 2013

Michael please give complete details about this installation method

Dhut

February 27, 2013

Hey Nice Tutorial,
Problem is that I am not able to locate tesseract302.lib
libtesseract302d.lib these two files, What should I do?

Dhut

February 27, 2013

sorry its
libtesseract302.lib
libtesseract302d.lib
ayoungprogrammer

February 27, 2013

You might have to do step 5 and copy these libraries into your project folder. If you followed my installation instructions, they should be in C:\Program Files\Tesseract-OCR

Avadhut Chaudhari

March 5, 2013

Hello Michael,
I followed your installation instructions, but my C:\Program Files\Tesseract-OCR didn't contain those files. So I downloaded tesseract-3.02.02-win32-lib-include-dirs.zip from https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-3.02.02-win32-lib-include-dirs.zip&can=2&q=.

I copied those files from the unzipped folder into my project folder and into C:\Program Files (x86)\Tesseract-OCR\lib, and finally it started working.

This is tested on VS 2010.. working absolutely fine.
Vicky Patil

March 6, 2013

Hello Michael,
I followed your installation instruction but I am getting below error
'Error 1 fatal error C1083: Cannot open include file: 'baseapi.h': No such file or directory d:workprojectsocrtesseracttesseracttesseract.cpp 5 TesserAct'
Please provide me appropriate solution.
Thanks and regards.
Vikky

ayoungprogrammer

March 6, 2013

If you installed everything on the D drive, then you will have to redo all of the steps using D:\ instead of C:\

Vicky Patil

March 11, 2013

Hello Michael,
While installing the tesseract-ocr-setup-3.02.02, I am getting following error
"http download error. Download Status of : File Not Found(404). Click OK to continue". Please tell why I am getting this error. And also provide solution to solve this error.
Thanks and Regards,
Vicky Patil
Sharath Raju

March 29, 2013

Hi Michael,
By following your tutorial I have installed Tesseract. While testing your code in Visual Studio 2008, I am getting the error below:
Error opening data file ./tessdata/eng.traineddata. Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory. Failed loading language 'eng'. Tesseract couldn't load any languages.

I really can't understand this error. I have already copied the eng.traineddata file from the Tesseract-OCR folder to the project folder!
Please help me in this regard. Thank you!

Jeremy Langer

September 27, 2013

You need to put your traineddata file inside of a 'tessdata' folder. So it should look something like this:
[project folder]\tessdata\eng.traineddata

Muhammad Shariq

April 5, 2013

Hi Michael,

I followed your tutorial on Visual Studio 2008 without much problem, except that some lib files and the tesseract directory in the include folder were missing. What I found is that if we install Tesseract from the installer available at its website, this directory and these lib files are not included in the package.

The solution is to download the "tesseract-3.02.02-win32-lib-include-dirs.zip" file from Tesseract's website, unzip it, and copy the "tesseract" directory into "Program Files (x86)\Tesseract-OCR\include" and the missing lib files into the "Program Files (x86)\Tesseract-OCR\lib" folder.

I hope this will be helpful for the future visitors.

ayoungprogrammer

April 5, 2013

Thanks! Hopefully this can help other people who have been having problems

Ragchaabazar Bud

May 7, 2013

I followed this tutorial for Visual Studio 2010 but I get this error: "The program can't start because libtesseract302.dll is missing from your computer. Try reinstalling the program to fix this problem." Please help me.

ayoungprogrammer

May 7, 2013

Follow step 5:

5. Copy liblept168.dll, liblept168d.dll, libtesseract302.dll and libtesseract302d.dll from C:\Program Files\Tesseract-OCR into your project folder (Optional)

And this should fix your error
Alex Ferraro

August 26, 2013

I originally downloaded tesseract from the SVN, and built the solution to get libtesseract302.dll. But it never gave me the .lib files. So I did these instructions and additionally downloaded the latest files because even the download didn't come with libtesseract302d.dll so now I have everything. And I've put all of these in my project directly and also included them in the link additional and still get the missing libtesseract302d.dll error

liblept168.dll
liblept168.lib
liblept168d.dll
liblept168d.lib
libtesseract302.lib
libtesseract302.dll
libtesseract302d.dll
libtesseract302d.lib

maybe I've put them in the wrong path, but I've tried in the solution folder, in the project folder, and in the project debug folder. I've tried in the places separately and all of the places at the same time. I have no idea how to get this running
ayoungprogrammer

August 26, 2013

Try putting the .dlls in the same folder as your .exe and run the .exe directly instead of running from VS and see if that works

NDQUANGR DEV

May 15, 2013

I have done all of the above steps, and the sample code builds successfully. But when I try to run/debug it, it throws an exception: "The application was unable to start correctly (0xc0150002). Click OK to close the application". What might be the problem? Thank you very much.

Witek

June 3, 2013

Same situation here…although only in Debug mode. In Release mode it works without a problem. Strange…Anyone has a solution for that? I am using VS2010 Professional under Windows 7 64bit
Gene

January 23, 2016

Hello Michael. Thanks for this tutorial. I know it's been ages but I just ran into the same problem on VS 2013 on windows 8, 64 bit. Any chance you could help? I'm so close. Thanks in advance for your efforts.
ayoungprogrammer

January 28, 2016

What are the error messages you are getting?

Emmanuel Lopez Lopez

May 24, 2013

Hi, I followed your tutorial, but I get this error:
warning C4627: '#include <baseapi.h>': skipped when looking for precompiled header

Emmanuel Lopez Lopez

May 24, 2013

It's the same error for
allheaders.h
ayoungprogrammer

May 24, 2013

When you create a new console project, make sure you are not using precompiled headers
Emmanuel Lopez Lopez

May 24, 2013

I solved that error. Now I have this problem:
fatal error LNK1104: cannot open file 'libtesseract302.lib'
ayoungprogrammer

May 25, 2013

You did not configure your directories properly:

Open up Microsoft Visual Studio 2008 and go to Tools -> Options
Next click show directories for -> Library Files
Add:
C:\Program Files\Tesseract-OCR\lib

If you have installed the libraries somewhere else, set the directory there

Emmanuel Lopez Lopez

May 25, 2013

I did it, and now I get this:
"The program can't start because libtesseract302.dll is missing from your computer. Try reinstalling the program to fix this problem."
libtesseract302.dll is not in C:\Program Files (x86)\Tesseract-OCR,
so I downloaded it, but I don't know what I have to do with that DLL. I did step 5 from your tutorial.

Emmanuel Lopez Lopez

May 25, 2013

I solved that; the code is working. Thank you for everything.

ayoungprogrammer

May 25, 2013

Download the .dll and put it in your project folder

Emmanuel Lopez Lopez

May 25, 2013

thank you for all Michael
Emmanuel Lopez Lopez

May 25, 2013

One more question, do you have a tutorial or something like that?, i need to explain the code of this project
ayoungprogrammer

May 25, 2013

Are you looking for a tutorial of how to use Tesseract? I have a 3 part series on using it with OpenCV:
http://ayoungprogrammer.blogspot.ca/2013/01/equation-ocr-part-1-using-contours-to.html
Emmanuel Lopez Lopez

May 25, 2013

Thank you

nimantha lakmal

August 3, 2013

interface is totally different in visual studio 2010. Can you give me the steps that I have to follow in that.IT WILL BE REALLY HELPFUL. NO REFERENCE FOR INCLUDING TESSERACT API IN 2010
Santhosh Bander

August 6, 2013

my project is in vs 2012 4.0 framework.. i was using tessnet2.0 but very poooor results, how can i implement 3.02 in vs 2012
Thanhtai Le

August 20, 2013

Hey guys, use this rar file; it contains all the libraries you need.
If you see a warning that libtesseract302.dll is missing, just copy two files,
libtesseract302.dll and libtesseract302d.dll, into your solution folder.
This is the link to the files:
http://www.mediafire.com/download/dge5mtdmp9q2e1z/tesseractlib.rar

Santhosh Bander, with C++ in Visual Studio 2012 we do something similar: at the end of the field there is an arrow button; select Edit and do the same as above.
akhil nair

September 2, 2013

It worked in VS 2010.

Thank you
Matěj Ecler

November 22, 2013

Hi man, thanks a lot for this tutorial, but I have this problem:

c:\users\bender\documents\visual studio 2008\projects\tesseracttest\tesseracttest\tesseracttest.cpp(5) : fatal error C1083: Cannot open include file: 'baseapi.h': No such file or directory

what should I do?

ayoungprogrammer

November 23, 2013

It is most likely you installed the Tesseract header files in the wrong place or did not add the dependency for headers correctly

wael alkhatib

November 26, 2013

Tesseract OCR in C++ in Visual Studio 2012:
I have been struggling with this for a week. The problem is that I need
libtesseract302.dll and libtesseract302d.dll recompiled for VS2012,
and Tesseract does not provide that.

Stas

January 15, 2014

Use the v100 toolset (Alt+Enter -> Platform Toolset) or recompile the lib for VS2012 (v110)

Swati Jagtap

March 8, 2014

Do you have any idea how to install Tesseract in Qt Creator? I have been struggling with it for the last week without success.
Robitics

March 20, 2014

I have error on ->> :
Error 1 error C2365: 'PT_UNKNOWN' : redefinition; previous definition was 'enumerator' c:,,,desktoptesseractincludetesseractcapi.h

in capi.h is:
typedef enum TessPolyBlockType { PT_UNKNOWN, PT_FLOWING_TEXT, PT_HEADING_TEXT, PT_PULLOUT_TEXT, PT_TABLE, PT_VERTICAL_TEXT,
PT_CAPTION_TEXT, PT_FLOWING_IMAGE, PT_HEADING_IMAGE, PT_PULLOUT_IMAGE, PT_HORZ_LINE, PT_VERT_LINE,
PT_NOISE, PT_COUNT } TessPolyBlockType;
Gleisson Gomes

June 12, 2014

Hello, first I would like to apologize for my typos.
I'm having a little trouble installing version 3.02.02 on Windows 7 x86 and importing the library in Microsoft Visual Studio 2013.
Would it be possible to make a tutorial?
Varun Paul

September 10, 2014

I got an output like

1>------ Build started: Project: tessprogram, Configuration: Debug x64 ------
1>Build started 9/10/2014 12:46:31 PM.
1>InitializeBuildStatus:
1> Touching "x64\Debug\tessprogram.unsuccessfulbuild".
1>ClCompile:
1> All outputs are up-to-date.
1>tesspgm.obj : error LNK2019: unresolved external symbol "public: virtual __cdecl tesseract::TessBaseAPI::~TessBaseAPI(void)" (??1TessBaseAPI@tesseract@@UEAA@XZ) referenced in function main
1>tesspgm.obj : error LNK2019: unresolved external symbol "public: __cdecl STRING::~STRING(void)" (??1STRING@@QEAA@XZ) referenced in function main
1>tesspgm.obj : error LNK2019: unresolved external symbol "public: char const * __cdecl STRING::string(void)const " (?string@STRING@@QEBAPEBDXZ) referenced in function main
1>tesspgm.obj : error LNK2019: unresolved external symbol "public: bool __cdecl tesseract::TessBaseAPI::ProcessPages(char const *,char const *,int,class STRING *)" (?ProcessPages@TessBaseAPI@tesseract@@QEAA_NPEBD0HPEAVSTRING@@@Z) referenced in function main
1>tesspgm.obj : error LNK2019: unresolved external symbol "public: __cdecl STRING::STRING(void)" (??0STRING@@QEAA@XZ) referenced in function main
1>tesspgm.obj : error LNK2019: unresolved external symbol pixRead referenced in function main
1>tesspgm.obj : error LNK2019: unresolved external symbol "public: void __cdecl tesseract::TessBaseAPI::SetOutputName(char const *)" (?SetOutputName@TessBaseAPI@tesseract@@QEAAXPEBD@Z) referenced in function main
1>tesspgm.obj : error LNK2019: unresolved external symbol "public: void __cdecl tesseract::TessBaseAPI::SetPageSegMode(enum tesseract::PageSegMode)" (?SetPageSegMode@TessBaseAPI@tesseract@@QEAAXW4PageSegMode@2@@Z) referenced in function main
1>tesspgm.obj : error LNK2019: unresolved external symbol "public: __cdecl tesseract::TessBaseAPI::TessBaseAPI(void)" (??0TessBaseAPI@tesseract@@QEAA@XZ) referenced in function main
1>tesspgm.obj : error LNK2019: unresolved external symbol "public: int __cdecl tesseract::TessBaseAPI::Init(char const *,char const *,enum tesseract::OcrEngineMode,char * *,int,class GenericVector<class STRING> const *,class GenericVector<class STRING> const *,bool)" (?Init@TessBaseAPI@tesseract@@QEAAHPEBD0W4OcrEngineMode@2@PEAPEADHPEBV?$GenericVector@VSTRING@@@@3_N@Z) referenced in function "public: int __cdecl tesseract::TessBaseAPI::Init(char const *,char const *,enum tesseract::OcrEngineMode)" (?Init@TessBaseAPI@tesseract@@QEAAHPEBD0W4OcrEngineMode@2@@Z)
1>c:\users\varun\documents\visual studio 2010\Projects\tessprogram\x64\Debug\tessprogram.exe : fatal error LNK1120: 10 unresolved externals
1>
1>Build FAILED.
1>
1>Time Elapsed 00:00:00.38
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========

So can anyone please help me to solve this error?
I'm using visual studio 2010 in a 64 bit computer.
intan mohd yunus

March 22, 2015

Hi Michael. thank you for this post. This is really a good tutorial for people like me. (first time meeting tesseract)
FYI, i'm in the midst of learning on how to develop a text signage recognition mobile application in Android.
I decided to use Tesseract and OpenCv and integrate both of them in Visual Studio 2008.
I have followed your steps above to create a simple program, however the program failed to build and the error says:

" 1>c:\users\intan\documents\visual studio 2008\projects\tesseracttest\tesseracttest\main.cpp(1) : fatal error C1083: Cannot open include file: 'baseapi.h': No such file or directory "

Why is it this happen? I really hope u can help me in solving this issue. I'm really new in this but willing to learn. Thnks btw. 🙂

Unknown

December 12, 2015

Hi Michael, thank you for your post.
I am getting these errors in Visual Studio 2013; please help me out. These errors come from the header files.

Warning 1 warning C4267: 'initializing' : conversion from 'size_t' to 'int', possible loss of data

Error 2 error C4996: 'strncpy': This function or variable may be unsafe. Consider using strncpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS.

Warning 3 warning C4005: 'snprintf' : macro redefinition

Warning 4 warning C4305: 'initializing' : truncation from 'double' to 'const l_float32'

Warning 5 warning C4305: 'initializing' : truncation from 'double' to 'const l_float32'
Phạm Ngọc Bách

January 7, 2016

Thanks