
  #1  
September 7th, 2007, 02:30 AM
modmidget
Senior Member
 
Join Date: Jan 2001
O/S: Linux
Location: Woodland, CA USA
Posts: 440
OCR Software

I'm running Ubuntu 7.04 (Feisty Fawn) and I'd like to find a good OCR program. I've already looked at Tesseract, but it can only be downloaded as a tarball and I've NEVER been successful at installing a program in that format. I'd like to find a good OCR program in .deb format that I can just double-click and watch the install happen. I tried installing Kooka, but it requires the KDE desktop and would not run on my machine. I have OmniPage 11 loaded on my Win2000 computer, but it doesn't work very well, so I'm hoping I can find something better for my Linux PC.

Does anyone have any suggestions?
__________________
1968 Dodge Charger owner............. BLESSED ARE THE CRACKED; FOR IT IS THEY WHO LET IN THE LIGHT
  #2  
September 7th, 2007, 07:01 AM
kage
Cyber Tech Help Moderator
 
Join Date: Apr 2004
O/S: Linux
Age: 19
Posts: 1,259
Check out Clara, GOCR and Ocrad. Kooka also doesn't look bad, if you're using KDE (or don't mind using KDE/Qt libraries).
__________________
Tips for Linux Newcommers

If we have helped you, please consider supporting Cyber Tech Help with a subscription.

  #3  
September 7th, 2007, 05:18 PM
modmidget
Senior Member
 
Join Date: Jan 2001
O/S: Linux
Location: Woodland, CA USA
Posts: 440
Thanks much for the links, Kage. I checked them out and it looks like most of them are downloaded as tar.gz files, which never seem to work for me. I also did some additional research and found that the OCR process in Linux can be a real pain and very time-consuming. The tesseract-ocr instructions I found on the web say that you have to scan and save the document using SANE, then use something like GIMP to clean up the file, then run a command-line instruction to convert the file to TIFF format, and only then can you run tesseract from the command line to convert the file to text. By the time you're finished you have 3 files for a one-page document: a bmp/jpg file, a tiff file, and a txt file.
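
For reference, the steps described above can be sketched as a short Python script. This is only a rough sketch, assuming SANE's `scanimage`, ImageMagick's `convert`, and `tesseract` are all installed; the function name and the file names are made up for illustration. It only builds the commands so they can be read over first, and it does not run anything.

```python
import shlex


def ocr_pipeline_commands(basename="page"):
    """Build the shell commands for a scan -> TIFF -> text workflow.

    Nothing is executed; the commands are returned as strings so they can be
    reviewed and pasted into a terminal one at a time.
    """
    base = shlex.quote(basename)
    scan = shlex.quote(basename + ".pnm")  # scanimage writes PNM by default
    tiff = shlex.quote(basename + ".tif")  # TIFF file handed to Tesseract
    return [
        # 1. Scan with SANE's command-line frontend (image data goes to stdout).
        "scanimage > " + scan,
        # 2. Convert to TIFF with ImageMagick. Any clean-up in GIMP would
        #    happen before this step and is not scripted here.
        "convert " + scan + " " + tiff,
        # 3. Run Tesseract; the second argument is the output base name,
        #    so the text lands in <basename>.txt.
        "tesseract " + tiff + " " + base,
    ]


if __name__ == "__main__":
    for command in ocr_pipeline_commands("letter"):
        print(command)
```

The three files mentioned above show up here too: the raw scan, the TIFF, and the text file that Tesseract writes.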

I was hoping to find something that worked more like OmniPage, where you just put your document in the scanner, click on acquire, and then click on the OCR button. I don't do a lot of OCR stuff, so I think I'll just stick with Windows OmniPage.
  #4  
September 8th, 2007, 12:35 AM
kage
Cyber Tech Help Moderator
 
Join Date: Apr 2004
O/S: Linux
Age: 19
Posts: 1,259
You're using Ubuntu. Don't download the source packages (.tar.gz / .tar.bz2) of those programs; just use apt-get or Synaptic (a GUI frontend for apt-get) to install them. Installation shouldn't take you more than 5 minutes.
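
As a rough illustration of the apt-get route, here is a small Python sketch that assembles the install command. The package names are assumptions based on the usual Debian/Ubuntu naming and should be checked in Synaptic first; the function names are made up for this example. By default it only prints the command rather than running it.

```python
import subprocess

# Package names are assumptions based on the usual Debian/Ubuntu naming;
# confirm them in Synaptic or with "apt-cache search ocr" before use.
OCR_PACKAGES = ["tesseract-ocr", "gocr", "ocrad"]


def apt_install_command(packages):
    """Return the apt-get command (as an argument list) for the packages."""
    if not packages:
        raise ValueError("no packages given")
    return ["sudo", "apt-get", "install"] + list(packages)


def install(packages, dry_run=True):
    """Print the command; run it only when dry_run is False."""
    command = apt_install_command(packages)
    print(" ".join(command))
    if not dry_run:
        # check=True raises CalledProcessError if apt-get fails.
        subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    install(OCR_PACKAGES)  # dry run: only prints the command
```

Typing the printed command into a terminal does the same job as ticking the packages in Synaptic and clicking Apply.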
  #5  
September 8th, 2007, 02:43 AM
modmidget
Senior Member
 
Join Date: Jan 2001
O/S: Linux
Location: Woodland, CA USA
Posts: 440
One hour ago I didn't even know what Synaptic was... It only took a few minutes to figure it out and less than 5 minutes to install Tesseract. Thanks once again, Kage. Now I'll have to figure out how to use XSane so I can get it to scan a whole document. I've scanned a one-page document 3 times and I only get the left half, not the entire page. If you have any suggestions on how to fix that problem too, Kage, they would be appreciated.