Cortical Segmentation of Syllables
Authors: Harald Höge
Abstract:
In the early days of automatic speech recognition, bottom-up segmentation of speech into syllables was investigated, but this approach was not competitive with current solutions. Recent cortical measurements lead to the conclusion that evolution has found a neural implementation that reliably performs a bottom-up segmentation of the auditory signal into syllables. This segmentation is based on θ-oscillations, where the duration and position of each syllable are related to the duration and position of each cycle of the θ-oscillations [6]. θ-oscillations, which serve to transport information and to steer rhythmic tasks, have been observed in many locations of the brain, especially in the thalamus. Yet neither the cortical location of the θ-oscillator for syllable segmentation nor the implementation of the θ-oscillator itself is known. Neural models of θ-oscillators for syllable segmentation are scarce. We follow the approach of [9], where the θ-oscillator is built from PING microcircuits. In [9] these PINGs are driven by the sum of the auditory signals in each critical band (CB). In this paper the approach is extended by using onset edge features extracted in the CBs. First experiments show that the CB-PINGs deliver spikes related to the onsets of syllables, which correspond to a specific phase of the θ-cycles. The spike timing differs slightly between the CB-PINGs. By interconnecting appropriate CB-PINGs, it seems possible to reduce these differences, leading to a 'unified' θ-oscillation.
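The idea of driving a spiking unit with onset edge features in a single critical band can be illustrated with a minimal sketch. It is not the PING model of [9] or of this paper: a single leaky integrate-and-fire unit stands in for a CB-PING, the onset edge feature is assumed to be the half-wave rectified derivative of a smoothed band envelope, and the synthetic envelope, function names and all parameter values (`tau`, `tau_m`, `threshold`, `refractory`) are hypothetical choices made only for illustration.

```python
import numpy as np

def onset_edge_feature(envelope, fs, tau=0.02):
    """Assumed onset feature: rectified derivative of a smoothed envelope."""
    alpha = 1.0 - np.exp(-1.0 / (fs * tau))  # one-pole low-pass coefficient
    smoothed = np.empty(len(envelope), dtype=float)
    acc = 0.0
    for i, x in enumerate(envelope):
        acc += alpha * (x - acc)
        smoothed[i] = acc
    deriv = np.diff(smoothed, prepend=smoothed[0]) * fs
    return np.maximum(deriv, 0.0)  # keep rising edges only

def lif_spike_times(drive, fs, tau_m=0.01, threshold=0.1, refractory=0.1):
    """Leaky integrate-and-fire stand-in for one CB unit (not a PING)."""
    dt = 1.0 / fs
    v = 0.0
    last_spike = -np.inf
    spikes = []
    for i, x in enumerate(drive):
        t = i * dt
        if t - last_spike < refractory:
            v = 0.0  # hold the unit silent after a spike
            continue
        v += dt * (-v / tau_m + x)  # forward-Euler membrane update
        if v >= threshold:
            spikes.append(t)
            v = 0.0
            last_spike = t
    return spikes

# Synthetic envelope of one critical band: four syllable-like bursts.
fs = 1000                      # samples per second (hypothetical)
onsets = [0.10, 0.35, 0.60, 0.85]
envelope = np.zeros(fs)        # 1 s of signal
for onset in onsets:
    start = int(round(onset * fs))
    envelope[start:start + 150] = 1.0   # each burst lasts 150 ms

drive = onset_edge_feature(envelope, fs)
spikes = lif_spike_times(drive, fs)
print(spikes)
```

In this toy setting the unit fires once shortly after each burst onset and stays silent during the sustained part and the offset of the burst, which mirrors the qualitative behaviour described for the CB-PINGs. Repeating the procedure for several bands with slightly different envelopes would give slightly different spike times per band.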