Scanning an image and converting it to text is relatively straightforward in Linux, provided you have the correct software installed. I plumped for Tesseract, as it was reputedly the best command-line OCR program, but I also wanted a graphical user interface, so I used gImageReader as a front-end to it.

Here's how to install both of them.

Firstly, install tesseract (and the associated language files if needed):

sudo apt-get install tesseract-ocr

Install a language file (e.g. -eng, -deu, -fra, -ita, -nld, -por, -spa, …):

sudo apt-get install tesseract-ocr-eng
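
Before adding the graphical front-end, it can be worth checking that Tesseract works on its own from the terminal. The following is only a rough sketch: scan.png is a made-up file name, and output_base and ocr_image are helper names invented for this illustration. The -l option selects the language, and Tesseract adds the .txt extension to the output name itself.

```shell
# Strip the extension so "scan.png" becomes the output base "scan".
# Tesseract appends ".txt" to the output base on its own.
output_base() {
    printf '%s\n' "${1%.*}"
}

# Run OCR on one image; the language defaults to eng if none is given.
ocr_image() {
    image="$1"
    lang="${2:-eng}"
    if ! command -v tesseract >/dev/null 2>&1; then
        echo "tesseract not found; install tesseract-ocr first" >&2
        return 127
    fi
    if [ ! -f "$image" ]; then
        echo "no such image: $image" >&2
        return 1
    fi
    tesseract "$image" "$(output_base "$image")" -l "$lang"
}

# Example usage (scan.png is a hypothetical file name):
#   tesseract --list-langs     # show the language files that are installed
#   ocr_image scan.png eng     # writes the recognised text to scan.txt
```

If the language you asked for is missing from the --list-langs output, install the matching tesseract-ocr language package and try again.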

Next, install gImageReader as a front-end to Tesseract.

Add the application repository:

sudo add-apt-repository ppa:sandromani/gimagereader

Update the repository sources:

sudo apt-get update

Install the application:

sudo apt-get install gimagereader
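
If you want to confirm that everything went in, a quick check along these lines should do on a Debian or Ubuntu system. It is a sketch rather than anything official: is_installed and report_packages are helper names made up for this example, and the check simply looks for the "install ok installed" status that dpkg-query reports, so a package in an unusual state (on hold, for instance) may show as missing.

```shell
# Return success only if dpkg reports the package as fully installed.
is_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q "install ok installed"
}

# Print one "name: installed" or "name: missing" line per package.
report_packages() {
    for pkg in "$@"; do
        if is_installed "$pkg"; then
            echo "$pkg: installed"
        else
            echo "$pkg: missing"
        fi
    done
}

# Package names are the ones used in the install steps;
# swap tesseract-ocr-eng for whichever language file you chose.
report_packages tesseract-ocr tesseract-ocr-eng gimagereader
```

Anything reported as missing can be installed again with the matching apt-get command.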

Now you should be ready to go. gImageReader can be found in your Graphics menu. Happy Character Recognising!

Article by Gary Hall
A teacher based in Beverley, England. Enjoys walking, travelling, reading and writing interesting content to help others. Feel free to comment below.

