Scanning an image and converting it to text is relatively straightforward in Linux, provided you have the correct software installed. I plumped for Tesseract, as it was reputedly the best command-line OCR program. I also wanted a graphical user interface, so I used gImageReader as a front-end to Tesseract.

Here's how to install both of them.

Firstly, install tesseract (and the associated language files if needed):

sudo apt-get install tesseract-ocr

Install a language file (e.g. -eng, -deu, -fra, -ita, -nld, -por, -spa, …):

sudo apt-get install tesseract-ocr-eng
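
If you want to confirm that the language file landed, something along these lines should do it. This is only a rough sketch: has_lang is a helper name I made up for illustration, and it relies on tesseract --list-langs printing one language code per line after a header line.

```shell
#!/bin/sh
# Sketch: report whether a given Tesseract language file is installed.
# has_lang is a hypothetical helper name, not part of Tesseract itself.
has_lang() {
    # --list-langs prints a header line, then one language code per line.
    # Some older releases write the list to stderr, hence the 2>&1.
    tesseract --list-langs 2>&1 | grep -qx "$1"
}

if ! command -v tesseract >/dev/null 2>&1; then
    echo "tesseract is not installed"
elif has_lang eng; then
    echo "English language data is installed"
else
    echo "English language data is missing"
fi
```

Swap eng for whichever language code you installed.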

Next, install gImageReader as a front-end to Tesseract.

Add the application repository:

sudo add-apt-repository ppa:sandromani/gimagereader

Update the repository sources:

sudo apt-get update

Install the application:

sudo apt-get install gimagereader

Now you should be ready to go. gImageReader can be found in the Graphics section of your applications menu. Happy Character Recognising!
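
If you would rather skip the GUI, Tesseract can also be run directly from the command line. Here is a minimal sketch, assuming a scanned image called scan.png (a placeholder filename). ocr_image is a hypothetical wrapper of my own, not something Tesseract ships with.

```shell
#!/bin/sh
# Sketch: OCR a single image from the command line.
# ocr_image is a hypothetical wrapper; scan.png below is a placeholder.
ocr_image() {
    image=$1
    lang=${2:-eng}   # default to English if no language is given
    # tesseract takes an input image and an output base name,
    # and appends .txt to the base name itself.
    tesseract "$image" "${image%.*}" -l "$lang"
}

# Example (writes the recognised text to scan.txt):
#   ocr_image scan.png eng
```

The language passed with -l needs to match a language file you have installed.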

Gary Hall is based in East Yorkshire, England, and has a background in education, marketing and technology. This site is a collection of ideas and resources on these topics.