Classifying serial (time-based) data
A great many interesting real-world use cases for machine learning classification involve serial or time-based data; examples include speech recognition, gesture recognition, stock market prediction from historical data, and fault detection in rotating machinery from vibration or sound. There are probably many more use cases that have been little explored.

In his book "Logic of the Living Brain" (John Wiley & Sons, 1974), Gerd Sommerhoff makes a compelling argument that brains store properties of objects in the environment as pairs of internal representations, essentially stimulus-response pairs. For example, the inertia of an object would be stored as the set of forces inducing movement together with the set of sensory transformations resulting from that movement. This is clearly time-based.

In Chapter 9, “Shape Recognition and Internal Models”, Sommerhoff proposes that objects in the visual environment are recognized by tracking the outline of the object, using specialized line detectors in the retina and brain. Microsaccades likely play a role in line detection, while ordinary saccadic movements are important for tracking an object’s outline. Of course, outline tracking is not the only mechanism at play: symmetry, asymmetry, regularity or irregularity, roundness, squatness, jaggedness and other features of the outline are believed to be important, as are patterns within the outline.

The most salient aspect of outline tracking, however, is that each line segment can be expressed as a vector, with a direction and a length. A complete outline of an object can therefore be represented by a series of vectors, i.e. a serial data pattern. Now suppose that, apart from the initial vector, the direction of each subsequent vector is given as the number of degrees of difference from the preceding vector, and its length as a fraction of the initial vector’s length (see the first sketch below). Visual object recognition then becomes a serial data classification task. Expressed in this fashion, the series of vectors describing an object’s outline is invariant to the object’s size in the visual field (i.e. invariant to its distance from the observer) and also invariant to rotation in the plane of the visual field, which eliminates two major difficulties in visual object recognition. Other distortions (for example, an italic letter compared to an upright one, or a viewpoint partially to one side of, or above or below, the plane of the object) will reduce classification accuracy to some extent, but will not eliminate it.

An ML algorithm with a tiny memory requirement and simple processing needs, such as the HamNN algorithm, can easily be adapted to serial pattern classification. For example, an analog speech recording can be broken into discrete time slices; for each slice, bandpass filtering yields amplitude values at different frequencies. This serial data stream can then be labelled, for instance as spoken words. If frequency shifts away from a baseline frequency are used instead of absolute frequencies, and relative time instead of absolute time for each slice, the data becomes relatively invariant to speaker pitch and speed (see the second sketch below).
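To make the outline-as-vectors idea concrete, here is a minimal Python sketch of the relative-angle, relative-length encoding described above. The function and variable names are my own illustrations, not drawn from the source; the point is only to show that the same invariant description falls out of a shape regardless of its size or in-plane rotation.

```python
import math

def outline_to_vectors(points):
    """Convert a closed outline (list of (x, y) vertices) into a series of
    segment vectors, each expressed as (angle in degrees, length)."""
    vectors = []
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        dx, dy = x1 - x0, y1 - y0
        vectors.append((math.degrees(math.atan2(dy, dx)), math.hypot(dx, dy)))
    return vectors

def to_invariant_form(vectors):
    """Re-express every vector after the first as (degrees of turn relative to
    the preceding vector, length as a fraction of the first vector's length).
    The result is invariant to in-plane rotation and to overall scale."""
    first_angle, first_len = vectors[0]
    invariant = []
    prev_angle = first_angle
    for angle, length in vectors[1:]:
        turn = (angle - prev_angle + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
        invariant.append((turn, length / first_len))
        prev_angle = angle
    return invariant

# A square traced at two different sizes and orientations yields the same
# invariant description: three 90-degree turns with equal segment lengths.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
rotated = [(10, 10), (10, 13), (7, 13), (7, 10)]  # larger, rotated 90 degrees
print(to_invariant_form(outline_to_vectors(square)))
print(to_invariant_form(outline_to_vectors(rotated)))
```

Both print statements produce [(90.0, 1.0), (90.0, 1.0), (90.0, 1.0)], which is the size- and rotation-invariance claimed above; only the out-of-plane distortions mentioned in the text would change the pattern.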
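The speech example can be sketched along similar lines. The snippet below is only a simplified stand-in for the procedure described in the text: instead of a bank of bandpass filters it takes the dominant FFT frequency of each time slice, and the slice length, baseline choice and synthetic test signal are arbitrary assumptions of mine, not values from the source.

```python
import numpy as np

def dominant_frequencies(signal, sample_rate, slice_ms=20):
    """Split a mono recording into fixed-length time slices and return the
    dominant spectral frequency of each slice (a crude stand-in for the
    bandpass filter bank mentioned in the text)."""
    slice_len = int(sample_rate * slice_ms / 1000)
    n_slices = len(signal) // slice_len
    freqs = np.fft.rfftfreq(slice_len, d=1.0 / sample_rate)
    dominant = []
    for i in range(n_slices):
        chunk = signal[i * slice_len:(i + 1) * slice_len]
        spectrum = np.abs(np.fft.rfft(chunk))
        dominant.append(freqs[np.argmax(spectrum)])
    return np.array(dominant)

def relative_form(freqs_per_slice):
    """Express each slice as a frequency shift (in octaves) from the first
    slice's frequency, and its position as a fraction of the utterance, so the
    pattern is less sensitive to speaker pitch and speaking speed."""
    baseline = freqs_per_slice[0]  # assumes the first slice is not silent
    rel_freq = np.log2(freqs_per_slice / baseline)
    rel_time = np.linspace(0.0, 1.0, len(freqs_per_slice))
    return np.column_stack([rel_time, rel_freq])

# Usage with a synthetic rising sweep standing in for a spoken word:
sr = 16000
t = np.arange(sr) / sr
sweep = np.sin(2 * np.pi * (200 + 300 * t) * t)  # roughly 200 Hz rising to 800 Hz
print(relative_form(dominant_frequencies(sweep, sr))[:3])
```

Each labelled utterance then becomes a short serial pattern of (relative time, relative frequency) pairs, which is the kind of compact representation a lightweight classifier such as HamNN could be fed, as the text suggests.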