UK Document Management

May 31, 2008

Image Scanner

Filed under: Document Scanners — admin @ 7:05 pm

In computing, a scanner is a device that optically scans images, printed text, handwriting, or an object, and converts it to a digital image. Common examples found in offices are variations of the desktop (or flatbed) scanner where the document is placed on a glass window for scanning. Hand-held scanners, where the device is moved by hand, were briefly popular but are now less common due to the difficulty of obtaining a high-quality image. Mechanically driven scanners that move the document are typically used for large-format documents, where a flatbed design would be impractical.

Modern scanners typically use a charge-coupled device (CCD) or a Contact Image Sensor (CIS) as the image sensor, whereas older drum scanners use a photomultiplier tube as the image sensor. A rotary scanner, used for high-speed document scanning, is another type of drum scanner, using a CCD array instead of a photomultiplier. Other types of scanners are planetary scanners, which take photographs of books and documents, and 3D scanners, for producing three-dimensional models of objects.

Another category of scanner is digital camera scanners, which are based on the concept of reprographic cameras. Due to increasing resolution and new features such as anti-shake, digital cameras have become an attractive alternative to regular scanners. While still having disadvantages compared to traditional scanners, digital cameras offer advantages in speed and portability.

Document processing

The scanning or digitization of paper documents for storage makes different requirements of the scanning equipment used than scanning of pictures for reproduction. While documents can be scanned on general-purpose scanners, it is more efficiently performed on dedicated document scanners manufactured by Atiz Innovation, Böwe Bell & Howell, Canon, Epson, Fujitsu, HP, Kodak and other companies.

When scanning large quantities of documents, speed and paper-handling is very important, but the resolution of the scan will normally be much lower than for good reproduction of pictures.

Document scanners have document feeders, usually larger than those sometimes found on copiers or all-purpose scanners. Scans are made at high speed, perhaps 20 to 150 pages per minute, often in grayscale, although many scanners support color. Many scanners can scan both sides of double-sided originals (duplex operation). Sophisticated document scanners have firmware of software that cleans up scans of text as they are produced, eliminating accidental marks and sharpening type; this would be unacceptable for photographic work, where marks cannot reliably be distinguished from desired fine detail. Files created are compressed as they are made.

The resolution used is usually from 150 to 300 dpi, although the hardware may be capable of somewhat higher resolution; this produces images of text good enough to read and for optical character recognition (OCR), without the higher demands on storage space required by higher-resolution images.

Document scans are often processed using OCR technology to create editable and searchable files. Most scanners use ISIS or Twain device drivers to scan documents into TIFF format so that the scanned pages can be fed into a document management system that will handle the archiving and retrieval of the scanned pages. Lossy JPEG compression, which is very efficient for pictures, is undesirable for text documents, as slanted straight edges take on a jagged appearance, and solid black (or other color) text on a light background compresses well with lossless compression formats.

While paper feeding and scanning can be done automatically and quickly, preparation and indexing are necessary and require much work by humans. Preparation involves manually inspecting the papers to be scanned and making sure that they are in order, unfolded, without staples or anything else that might jam the scanner. Additionally, some industries such as legal and medical may require documents to have Bates Numbering or some other mark giving a document identification number and date/time of the document scan.

Indexing involves associating keywords to files so that they can be retrieved by content. This process can sometimes be automated to some extent, but is likely to involve manual labour. One common practice is the use of barcode-recognition technology: during preparation, barcode sheets with folder names are inserted into the document files, folders, and document groups. Using automatic batch scanning, the documents are saved into the appropriate folders, and an index is created for integration into document-management software systems.

A specialized form of document scanning is book scanning. Technical difficulties arise from the books usually being bound and sometimes fragile and irreplaceable, but some manufacturers have developed specialized machinery to deal with this. For instance, Atiz DIY scanner uses a V-shaped cradle and a V-shaped transparent platen to handle brittle books. Often special robotic mechanisms are used to automate the page turning and scanning process

No Comments »

No comments yet.

RSS feed for comments on this post. TrackBack URL

Leave a comment

You must be logged in to post a comment.

Powered by WordPress