Wednesday, November 17, 2010

Installing Tesseract on Mac OSX

Instructions for installing Tesseract on a Mac (Snow Leopard 10.6.4), with English-language OCR in mind:

1) Download tesseract-3.00.tar.gz and eng.traineddata.gz from the downloads page (http://code.google.com/p/tesseract-ocr/downloads/list)

2) Open a terminal, cd to wherever you downloaded the above files, then do:

tar -xf tesseract-3.00.tar.gz

cd tesseract-3.00

3) Install the libraries libtiff (for reading compressed TIFF files) and leptonica. Install MacPorts if it is not already installed, then run:

sudo port install tiff

sudo port install leptonica

4) Then run:

./configure

make

sudo make install

Alternatively, instead of building from source, install the MacPorts port of tesseract:

sudo port install tesseract

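5) Then unpack the English language pack and move it into tesseract's tessdata directory. The path below is for the source build; if you installed through MacPorts, the directory is probably under /opt/local/share instead.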

cd ..

gunzip eng.traineddata.gz

sudo mv eng.traineddata /usr/local/share/tessdata/
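To confirm that the language file ended up where step 5 put it, a small check along these lines may help. This is only a sketch: `has_traineddata` is a made-up helper name, and the directory is the one used in step 5, which may differ on your machine.

```shell
#!/bin/sh
# Hypothetical helper (not part of tesseract): succeeds if
# <lang>.traineddata exists in the given tessdata directory.
has_traineddata() {
    dir=$1
    lang=$2
    [ -f "$dir/$lang.traineddata" ]
}

# /usr/local/share/tessdata is the directory from step 5;
# adjust it if your install uses a different prefix.
if has_traineddata /usr/local/share/tessdata eng; then
    echo "eng.traineddata found"
else
    echo "eng.traineddata missing"
fi
```

If it reports the file as missing, check whether the mv in step 5 renamed the file instead of moving it into the directory.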

You now have a working install of tesseract set up to do OCR on English-language documents. Run the following in the directory containing the desired .tif file:

tesseract inputimage.tif outputtext -l eng

and you should get a file called outputtext.txt in the same directory with the results!
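If you have a whole folder of scans, the same command can be wrapped in a loop. The following is a rough sketch rather than anything shipped with tesseract: `ocr_dir` and the `TESSERACT` variable are names invented here for illustration, and the only tesseract invocation used is the one shown above.

```shell
#!/bin/sh
# TESSERACT is a hypothetical override for the program name, so the
# loop can be dry-run (e.g. TESSERACT=echo) before doing real work.
TESSERACT=${TESSERACT:-tesseract}

# Hypothetical helper: run OCR on every .tif file in a directory.
ocr_dir() {
    dir=$1
    for img in "$dir"/*.tif; do
        [ -e "$img" ] || continue   # the glob matched nothing
        base=${img%.tif}            # output name; tesseract appends .txt
        "$TESSERACT" "$img" "$base" -l eng
    done
}

# Usage: ocr_dir /path/to/scans
```

Each scan.tif should then get a scan.txt next to it. Setting TESSERACT=echo first prints the commands that would be run without executing them.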

1 comment:

  1. ranlib: unrecognized option `-q'
    ranlib: Try `ranlib --help' for more information.
    ar: internal ranlib command failed
    make[2]: *** [libltdl/dlopen.la] Error 1
    make[2]: *** Waiting for unfinished jobs....
    libtool: compile: cc -DHAVE_CONFIG_H -I. -DLTDLOPEN=libltdl "-DLT_CONFIG_H=" -DLTDL -I. -I. -Ilibltdl -I./libltdl -I./libltdl/libltdl -g -O2 -c libltdl/ltdl.c -o libltdl/libltdl_libltdl_la-ltdl.o >/dev/null 2>&1
    make[1]: *** [install-recursive] Error 1
    make: *** [install] Error 2
