
Application to index images/pdfs and recognise text

asked 2014-02-05 04:33:10 -0600

Derek Ekins

Are there any apps out there that will index all the images, PDFs, etc. in a folder and create a searchable index from the data?

Basically I would like Evernote's capabilities, but just on the desktop.


3 Answers


answered 2014-02-05 16:48:18 -0600

davidva

Hi. Do you want an OCR program for scanned documents, or a simple manager for pictures, books, movies, etc.?



yum -y install alexandria


yum -y install calibre

answered 2014-02-05 14:25:34 -0600

nonamedotc

You are looking for recoll. It is a brilliant tool for indexing files - both filenames and contents. It supports a rather large number of file types.

Recoll is a personal full text search package for Linux, FreeBSD and other Unix systems. It is based on a very strong back end (Xapian), for which it provides an easy to use, feature-rich, easy administration interface.

It can be installed with: yum install recoll
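As a rough sketch of how indexing a single folder might look from the command line: the function below writes a minimal Recoll configuration and runs the indexer against it. The function name index_folder and the separate configuration directory ~/.recoll-scans are made up for illustration, so an existing ~/.recoll setup is left alone. It assumes that recollindex accepts -c to select a configuration directory and that topdirs in recoll.conf lists the folders to index.

```shell
# index_folder DIR [CONFDIR]
# Hypothetical helper: point a dedicated Recoll configuration at DIR
# and build (or update) its index.
index_folder() {
    dir="$1"
    # Assumption: a separate config directory, so the default ~/.recoll
    # configuration is not overwritten.
    confdir="${2:-$HOME/.recoll-scans}"

    mkdir -p "$confdir"

    # topdirs tells recollindex which directories to walk.
    # Note: this overwrites any recoll.conf already in CONFDIR.
    printf 'topdirs = %s\n' "$dir" > "$confdir/recoll.conf"

    # Build or refresh the index for this configuration.
    recollindex -c "$confdir"
}
```

After something like index_folder ~/Documents/scans, a query from the terminal would be along the lines of recoll -c ~/.recoll-scans -t -q invoice (the search term is only an example). The graphical recoll interface can do the same through its indexing preferences.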


answered 2014-02-05 11:56:20 -0600

$ yum search OCR

should get you started, as well as

$ yum list pdf\*

Additionally, feel free to take a look at this article from Ubuntu's pages on Optical Character Recognition (OCR):

Arguably the one producing the best (most accurate) results is Tesseract

The article also lists other command-line OCR programs, including gocr, cuneiform and ocrad. For an OCR program with a GUI, there's gscan2pdf; for general PDF manipulation there are Apache's pdfbox, pdfchain, pdfedit, pdfmod, pdfshuffler and pdftk. I'm sure Okular can be used for some of your intended actions as well.
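To get searchable text out of a whole folder of images, one option is to run Tesseract over each file and keep the resulting text files next to the originals, where a desktop indexer can pick them up. A minimal sketch follows. The function name ocr_folder is made up for illustration, and it assumes the tesseract command is installed and that the images are .png, .jpg or .tif files.

```shell
# ocr_folder DIR
# Hypothetical helper: run Tesseract on every image in DIR and write
# NAME.txt next to NAME.png / NAME.jpg / NAME.tif.
ocr_folder() {
    dir="$1"
    for img in "$dir"/*.png "$dir"/*.jpg "$dir"/*.tif; do
        # Skip patterns that matched nothing (the glob stays literal).
        [ -e "$img" ] || continue

        # tesseract takes an output base name and appends ".txt" itself,
        # so strip the image extension first.
        tesseract "$img" "${img%.*}"
    done
}
```

Usage would be something like ocr_folder ~/Documents/scans. Scanned PDFs would need to be converted to images first, for example with pdftoppm from poppler-utils.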



Last updated: Feb 05 '14