Confused about Frequency Counters
On Mon, 29 May 2017 23:04:15 -0400, rickman wrote:
> It is so easy to be misunderstood. I'm talking about the text showing up in the PDF document. I received a document that was clearly a scanned image in a PDF file. But the text was selectable and copyable. The two options are that the image was scanned and OCRed when the PDF was made, or that the PDF viewer had OCR scanning built in. Since I couldn't select the text in another scanned image PDF, it must be the former.

The example I provided was how to do the latter. I scanned the image in one program, and added a searchable text layer with a PDF viewer.

There are scanning programs, such as Nuance Omnipage, Paperport, Adobe Acrobat (NOT Reader), etc., that will seem to do the process in one step. To the casual user, it looks like the process is being done in one step. In reality, it first scans to a bitmap. Next, the OCR software reads the bitmap to produce the searchable text layer. It then saves the result as a PDF file. To the best of my limited knowledge, none of the available software does the OCR step *WHILE* scanning, but I might be wrong about that.

Original document scanned to JPG using Irfanview 4.44:
http://802.11junk.com/jeffl/OCR%20Demo/JPG.jpg
This is not searchable.

Same document saved to PDF using Irfanview 4.44:
http://802.11junk.com/jeffl/OCR%20Demo/PDF-no-OCR.pdf
This is also NOT searchable.

Same document in PDF-Xchange 6.0 build 322.4 after OCR:
http://802.11junk.com/jeffl/OCR%20Demo/PDF-after-OCR.pdf
This one can be searched.

PDF-Xchange screen grab showing a typical search result:
http://802.11junk.com/jeffl/OCR%20Demo/PDF-Xchange-screen.jpg

I never did figure out how to display and edit the OCR text in PDF-Xchange Editor. Looking through their feature list of other versions, it seems to be something that only the more advanced and expensive versions will do. Bummer.

--
Jeff Liebermann
150 Felker St #D, Santa Cruz CA 95060
http://www.LearnByDestroying.com
http://802.11junk.com
Skype: JeffLiebermann
AE6KS 831-336-2558
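For anyone who wants to tell the two kinds of scanned PDF apart without opening a viewer, here is a rough Python sketch. It relies on the fact that any text in a PDF, including an invisible OCR layer, has to be drawn with a font, so a file that never mentions a /Font object is most likely image-only. The function name looks_searchable is made up for this example, and the check is only a crude heuristic: it assumes the PDF objects are stored uncompressed, so a file that packs its dictionaries into compressed object streams can be reported as image-only even though it has a text layer.

```python
import re


def looks_searchable(pdf_bytes: bytes) -> bool:
    """Crude guess at whether a PDF carries a text layer.

    Hypothetical helper, not part of any PDF library. It only scans the
    raw bytes for a /Font name, so it can miss fonts hidden inside
    compressed object streams (false negatives are possible).
    """
    # \b keeps names such as /FontDescriptor or /FontFile from matching;
    # /BaseFont does not match either, because the slash must come
    # directly before "Font".
    return re.search(rb"/Font\b", pdf_bytes) is not None


if __name__ == "__main__":
    import sys

    # Usage: python check_pdf.py some-file.pdf
    with open(sys.argv[1], "rb") as handle:
        data = handle.read()
    print("probably searchable" if looks_searchable(data) else "probably image-only")
```

Run against files like the PDF-no-OCR.pdf and PDF-after-OCR.pdf examples, a check of this kind should in principle separate them, but treat the answer as a hint and confirm by trying to select text in a viewer.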
Jeff Liebermann wrote on 5/30/2017 12:58 AM:
> On Mon, 29 May 2017 23:04:15 -0400, rickman wrote:
>> It is so easy to be misunderstood. I'm talking about the text showing up in the PDF document. I received a document that was clearly a scanned image in a PDF file. But the text was selectable and copyable. The two options are that the image was scanned and OCRed when the PDF was made, or that the PDF viewer had OCR scanning built in. Since I couldn't select the text in another scanned image PDF, it must be the former.
>
> The example I provided was how to do the latter. I scanned the image in one program, and added a searchable text layer with a PDF viewer.
>
> There are scanning programs, such as Nuance Omnipage, Paperport, Adobe Acrobat (NOT Reader), etc., that will seem to do the process in one step. To the casual user, it looks like the process is being done in one step. In reality, it first scans to a bitmap. Next, the OCR software reads the bitmap to produce the searchable text layer. It then saves the result as a PDF file. To the best of my limited knowledge, none of the available software does the OCR step *WHILE* scanning, but I might be wrong about that.

I'm still not getting through. I'm not looking for ways to make PDF images text selectable. I'm reporting on what I saw.

> Original document scanned to JPG using Irfanview 4.44:
> http://802.11junk.com/jeffl/OCR%20Demo/JPG.jpg
> This is not searchable.
>
> Same document saved to PDF using Irfanview 4.44:
> http://802.11junk.com/jeffl/OCR%20Demo/PDF-no-OCR.pdf
> This is also NOT searchable.
>
> Same document in PDF-Xchange 6.0 build 322.4 after OCR:
> http://802.11junk.com/jeffl/OCR%20Demo/PDF-after-OCR.pdf
> This one can be searched.
>
> PDF-Xchange screen grab showing a typical search result:
> http://802.11junk.com/jeffl/OCR%20Demo/PDF-Xchange-screen.jpg
>
> I never did figure out how to display and edit the OCR text in PDF-Xchange Editor. Looking through their feature list of other versions, it seems to be something that only the more advanced and expensive versions will do. Bummer.

--
Rick C