Title: Cartographic Character Recognition

Author(s): Howard Rafal, Digital Equipment Corporation, Nashua, NH 03062, and Matthew O. Ward, Computer Science Department, Worcester Polytechnic Institute, Worcester, MA 01609

Source: SPIE Vol. 1199, Visual Communications and Image Processing IV, 1989.

Abstract: This work details a methodology for recognizing text elements on cartographic documents. Cartographic Character Recognition differs from traditional OCR in that many fonts may occur on the same page, text may have any orientation, text may follow a curved path, and graphics may interfere with text. The technique presented reduces the process to three steps: blobbing, stringing, and recognition. Blobbing uses image processing techniques to turn the grey-level image into a binary image and then separates the image into probable graphic elements and probable text elements. Stringing groups the text elements into words, using the proximity of the letters to create string contours. These contours also help to recover the orientation of each text element. Recognition takes the strings and associates a letter with each blob. The letters are first approximated using feature descriptions, resulting in a set of possible letters. Orientation information is then used to refine the guesses. Final recognition is performed using elastic matching. Feedback is employed at all phases of execution to refine the processing: stringing and recognition give information that is useful in finding hidden blobs, and recognition helps make decisions about string paths. Results of this work are shown.
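
Illustrative sketch (not from the paper): the first two steps named in the abstract, blobbing and stringing, might look roughly like the following. Everything here is an assumption made for illustration: the fixed threshold, the bounding-box size test used to split text from graphics, the Euclidean distance test between blob centers, and all function and parameter names are hypothetical. The abstract does not specify the authors' actual criteria, and the recognition step (feature descriptions, elastic matching, feedback) is not attempted.

```python
import math
from collections import deque


def binarize(grey, threshold=128):
    """Turn a grey-level image (list of rows) into a binary one; dark pixels become 1."""
    return [[1 if v < threshold else 0 for v in row] for row in grey]


def find_blobs(binary):
    """Label 8-connected components; return bounding boxes (r0, c0, r1, c1)."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            blobs.append((r0, c0, r1, c1))
    return blobs


def split_text_graphics(blobs, max_side=6):
    """Assumed heuristic: small blobs are probable text, large ones probable graphics."""
    text, graphics = [], []
    for b in blobs:
        longest_side = max(b[2] - b[0], b[3] - b[1]) + 1
        (text if longest_side <= max_side else graphics).append(b)
    return text, graphics


def center(blob):
    """Center (row, col) of a bounding box."""
    return ((blob[0] + blob[2]) / 2, (blob[1] + blob[3]) / 2)


def string_blobs(text_blobs, max_gap=6.0):
    """Chain text blobs whose centers lie within max_gap of a blob already in the string."""
    unvisited = set(range(len(text_blobs)))
    strings = []
    while unvisited:
        seed = min(unvisited)
        unvisited.remove(seed)
        group, frontier = [seed], [seed]
        while frontier:
            yi, xi = center(text_blobs[frontier.pop()])
            near = [j for j in unvisited
                    if math.hypot(yi - center(text_blobs[j])[0],
                                  xi - center(text_blobs[j])[1]) <= max_gap]
            for j in near:
                unvisited.remove(j)
                group.append(j)
                frontier.append(j)
        # Order blobs left to right; a real system would order along the string contour.
        strings.append(sorted((text_blobs[k] for k in group), key=lambda b: b[1]))
    return strings
```

Because proximity is tested pairwise rather than along a fixed baseline, a chain of nearby blobs is grouped even when it bends, which is one simple way to accommodate text of arbitrary orientation or along a curved path.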

Matthew O. Ward (matt@cs.wpi.edu)