Abstract
This dissertation deals with algorithms for detecting and recognizing text in photographs of natural and urban scenes. Such algorithms make it possible, for example, to use a smartphone to translate an unfamiliar script seen on the street into one's own language. Street and shop names can be recognized, and a mobile robot could locate rooms in an office building from the text labels on the doors.

Although major advances are being made in this area, many fundamental problems remain. A system must cope with wide variation in exposure, color, and camera viewing angle. What constitutes the foreground (the letters) and what the background? This is often hard to decide, especially for advertising text. Traditional methods used simple brightness differences to separate text from background. In this research, by contrast, it is proposed to train specialized models for the foreground (the text) and for the background, which is often not uniform in color. There are also substantial differences between international scripts: in an Asian cityscape, text images are more colorful and more complex in shape than in a Western context. This study therefore evaluates several methods for detecting both Asian (Kannada and Thai) and Western scripts.

The algorithms are based on detecting salient points in the image, such as sharp edges and corners. In addition, a new method was designed to handle color variation better, which proved particularly useful for the Asian scripts. After this improved detection of text regions, the resulting cropped images can be passed to text-recognition algorithms.
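To give an impression of the kind of "salient point" detection the abstract refers to, the sketch below implements a minimal Harris-style corner detector in NumPy. This is an illustrative assumption, not the dissertation's actual method: the synthetic bright square stands in for a character against a scene background, and all window sizes and the constant `k` are conventional textbook choices.

```python
import numpy as np

# Synthetic test image: a bright square on a dark background.
# Its four corners should produce the strongest corner responses.
img = np.zeros((32, 32), dtype=float)
img[8:24, 8:24] = 1.0

# Image gradients (central differences); np.gradient returns
# the derivative along axis 0 (rows, y) first, then axis 1 (cols, x).
Iy, Ix = np.gradient(img)
Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy

def box(a, r=1):
    """Sum each pixel's (2r+1)x(2r+1) neighborhood (simple box filter)."""
    out = np.zeros_like(a)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += np.roll(np.roll(a, dy, axis=0), dx, axis=1)
    return out

# Structure-tensor entries, accumulated over a 3x3 window.
Sxx, Syy, Sxy = box(Ixx), box(Iyy), box(Ixy)

# Harris response: det(M) - k * trace(M)^2.
# Positive at corners, negative along edges, near zero in flat regions.
k = 0.04
R = (Sxx * Syy - Sxy * Sxy) - k * (Sxx + Syy) ** 2

peak = np.unravel_index(np.argmax(R), R.shape)
print("strongest corner response at", peak)  # one of the square's corners
```

For scene text, responses like these would be computed on real photographs, where clusters of corner points are a useful cue for character-like structure.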
Original language | English
---|---
Qualification | Doctor of Philosophy
Award date | 28-Feb-2020
Place of Publication | [Groningen]
Print ISBNs | 978-94-034-2496-5
Electronic ISBNs | 978-94-034-2495-8
Publication status | Published - 2020