The Tesseract OCR engine was one of the top three engines in the 1995 UNLV Accuracy test. It saw little development between 1995 and 2006, but it is probably still one of the most accurate open source OCR engines available. It reads a binary, greyscale or color image and outputs text. A built-in TIFF reader handles uncompressed TIFF images, and libtiff can be added to read compressed ones.
1. Download the source code from http://code.google.com/p/tesseract-ocr/
2. Create a .sh file (for example build_fat.sh) inside the Tesseract source code folder.
3. Paste the following code into the .sh file:
Code Starts
#!/bin/bash
# build_fat.sh
#
# Created by Robert Carlsen on 15.07.2009. Updated 24.9.2010
# build an arm / i386 lib of standard linux project
#
# initially configured for tesseract-ocr v2.0.4
# updated for tesseract prerelease v3
outdir=outdir
mkdir -p $outdir/arm $outdir/i386
libdirs=( api ccutil ccmain ccstruct classify cutil dict image textord training viewer wordrec )
libs=( api ccutil main ccstruct classify cutil dict image textord training viewer wordrec )
count=${#libdirs[@]}
# pass 1: build for the device (armv6) against the iPhoneOS SDK
make distclean
unset CPPFLAGS CFLAGS LDFLAGS CPP CXX CC CXXFLAGS DEVROOT SDKROOT LD
export DEVROOT=/Developer/Platforms/iPhoneOS.platform/Developer
export SDKROOT=$DEVROOT/SDKs/iPhoneOS4.1.sdk
export CFLAGS="-arch armv6 -pipe -no-cpp-precomp -isysroot$SDKROOT -miphoneos-version-min=3.0 -I$SDKROOT/usr/include/"
export CPPFLAGS="$CFLAGS"
export CXXFLAGS="$CFLAGS"
export LDFLAGS="-L$SDKROOT/usr/lib/"
export LD="$DEVROOT/usr/bin/ld"
export CPP="$DEVROOT/usr/bin/cpp-4.2"
export CXX="$DEVROOT/usr/bin/g++-4.2"
export CC="$DEVROOT/usr/bin/gcc-4.2"
./configure --host=arm-apple-darwin
make -j3
# copy the armv6 static libraries into $outdir/arm
index=0
while [ "$index" -lt "$count" ]
do
    cp ${libdirs[index]}/.libs/libtesseract_${libs[index]}.a $outdir/arm/libtesseract_${libs[index]}_armv6.a
    ((index++))
done
# pass 2: build for the simulator (i386) against the iPhoneSimulator SDK
make distclean
unset CPPFLAGS CFLAGS LDFLAGS CPP CXX CC CXXFLAGS DEVROOT SDKROOT LD
export DEVROOT=/Developer/Platforms/iPhoneSimulator.platform/Developer
export SDKROOT=$DEVROOT/SDKs/iPhoneSimulator4.1.sdk
export CFLAGS="-arch i386 -pipe -no-cpp-precomp -isysroot$SDKROOT -miphoneos-version-min=3.0 -I$SDKROOT/usr/include/"
export CPPFLAGS="$CFLAGS"
export CXXFLAGS="$CFLAGS"
export LDFLAGS="-L$SDKROOT/usr/lib/"
export LD="$DEVROOT/usr/bin/ld"
export CPP="$DEVROOT/usr/bin/cpp-4.2"
export CXX="$DEVROOT/usr/bin/g++-4.2"
export CC="$DEVROOT/usr/bin/gcc-4.2"
./configure
make -j3
# copy the i386 static libraries into $outdir/i386
index=0
while [ "$index" -lt "$count" ]
do
    cp ${libdirs[index]}/.libs/libtesseract_${libs[index]}.a $outdir/i386/libtesseract_${libs[index]}_i386.a
    ((index++))
done
# merge each armv6 / i386 pair into one fat (universal) library in $outdir
# are the fat libs making the bundle too big?
index=0
while [ "$index" -lt "$count" ]
do
    /usr/bin/lipo -arch armv6 $outdir/arm/libtesseract_${libs[index]}_armv6.a -arch i386 $outdir/i386/libtesseract_${libs[index]}_i386.a -create -output $outdir/libtesseract_${libs[index]}.a
    ((index++))
done
unset CPPFLAGS CFLAGS LDFLAGS CPP CXX CC CXXFLAGS DEVROOT SDKROOT LD
Code Ends
Source: http://robertcarlsen.net/2010/09/24/compiling-tesseract-v3-for-iphone-1299
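4. In Terminal, navigate to the Tesseract source code folder.
5. Run bash ./yourfilename.sh from Terminal. The script uses bash arrays, so run it with bash rather than a plain POSIX sh. It configures and builds the libraries for iOS, once for the device (armv6) and once for the simulator (i386).
6. When the build finishes, look in the outdir folder inside the Tesseract source code folder. It contains all the required library files for iOS.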
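A quick way to confirm the build produced everything is to check outdir for each expected library. The helper below is a minimal sketch, not part of Robert Carlsen's script: the file name check_outdir.sh and the function check_outdir are hypothetical names chosen for illustration, and the list of libraries is copied from the libs array in the build script above, so it assumes the same Tesseract version. Where the lipo tool is installed, it also prints the architectures in each file with lipo -info.

```shell
#!/bin/sh
# check_outdir.sh: hypothetical helper, not part of the original build script.

# Library name suffixes, copied from the libs=( ... ) array in the build script.
LIBS="api ccutil main ccstruct classify cutil dict image textord training viewer wordrec"

# Print one status line per expected fat library in the given directory.
# Returns 0 if every library is present, 1 otherwise.
check_outdir() {
    dir=$1
    missing=0
    for name in $LIBS; do
        path="$dir/libtesseract_${name}.a"
        if [ -f "$path" ]; then
            echo "ok      $path"
            # lipo ships with Apple's developer tools; skip this step elsewhere.
            if command -v lipo >/dev/null 2>&1; then
                lipo -info "$path" || true
            fi
        else
            echo "missing $path"
            missing=1
        fi
    done
    return "$missing"
}

# Check ./outdir by default, or the directory passed as the first argument.
if check_outdir "${1:-outdir}"; then
    echo "all libraries present"
else
    echo "some libraries are missing"
fi
```

Run it from the Tesseract source code folder after the build, for example with sh ./check_outdir.sh. A "missing" line usually means one of the two make passes failed for that library, so the configure and make output for that pass is the first place to look.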
An iOS implementation that uses Tesseract OCR: https://github.com/rcarlsen/Pocket-OCR
Hi Julie,
You can get help from the following sites:
http://iphone.olipion.com/cross-compilation/tesseract-ocr
http://robertcarlsen.net/2009/07/15/cross-compiling-for-iphone-dev-884
http://robertcarlsen.net/2010/01/12/ocr-for-iphone-source-1080
https://github.com/nolanbrown/Tesseract-iPhone-Demo
http://robertcarlsen.net/2010/09/24/compiling-tesseract-v3-for-iphone-1299