In this report we describe an efficient pipeline for real-time text detection, designed to be implemented on different architectures, with particular reference to smartphones.
The text detection pipeline is based on a rather standard segmentation step followed by a classification of each segmented connected component. Segmentation is performed by a linear-time implementation of MSER (known to be the state-of-the-art method for text detection), during which we compute as many descriptive features as possible. Classification is carried out by a cascade of SVM classifiers, where each layer captures a different level of complexity by means of an appropriate choice of descriptors and kernel functions. Detected text elements, or characters, are then merged into words and lines of text. Further on, each element can be fed to a multi-class classifier that performs character recognition; this functionality is currently under development.
We report experiments assessing how well the procedure detects text and how it performs when running on both x86 and ARM processors. Our computational speed is superior to that of all available optimized implementations, while the classification rates are in line with the state of the art.
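
The cascade idea described above can be illustrated with a minimal sketch. It is not the implementation evaluated in this report: the `Component` type, the two stage functions, and all thresholds are hypothetical, and simple threshold rules stand in for the SVM layers. The point is only the control flow, where cheap tests run first and a component is discarded as soon as one stage rejects it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Component:
    """A segmented connected component (hypothetical type for illustration).

    The fields are assumed to be gathered during segmentation.
    """
    width: int
    height: int
    area: int  # number of pixels belonging to the component

    @property
    def aspect_ratio(self) -> float:
        return self.width / self.height

    @property
    def fill_ratio(self) -> float:
        # Fraction of the bounding box covered by the component.
        return self.area / (self.width * self.height)


# A stage returns True to pass the component on, False to reject it.
Stage = Callable[[Component], bool]


def stage_aspect(c: Component) -> bool:
    # Cheap first stage; thresholds are illustrative assumptions.
    return 0.1 <= c.aspect_ratio <= 3.0


def stage_fill(c: Component) -> bool:
    # Second stage; a real cascade would use a trained classifier here.
    return 0.15 <= c.fill_ratio <= 0.9


def run_cascade(components: List[Component], stages: List[Stage]) -> List[Component]:
    """Keep only the components accepted by every stage, in order."""
    accepted = []
    for c in components:
        # all() short-circuits, so later stages are skipped after a rejection.
        if all(stage(c) for stage in stages):
            accepted.append(c)
    return accepted


candidates = [
    Component(width=10, height=20, area=100),  # plausible character shape
    Component(width=200, height=5, area=900),  # long thin line
    Component(width=10, height=10, area=100),  # solid square block
]
characters = run_cascade(candidates, [stage_aspect, stage_fill])
```

Ordering the stages from cheapest to most expensive means most non-text components are rejected before any costly descriptor or kernel evaluation is needed.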