Recognizing Modern Sound Poetry with LSTM Networks

Burkhard Meyer-Sickendiek; Hussein Hussein; Timo Baumann

Abstract:

Our paper focuses on the computational analysis of “readout poetry” (German: Hördichtung), that is, recordings of poets reading their own work, with regard to the most important type of this genre, modern “sound poetry” (German: Lautdichtung). Whereas readout poetry often uses normal words and sentences, sound poetry, developed by Dadaist poets such as Hugo Ball and Kurt Schwitters and by concrete poets such as Ernst Jandl, Oskar Pastior, and Bob Cobbing, combines the “microparticles of the human voice”, like the segments in Ernst Jandl's sound poem “schtzngrmm” (“schtzngrmm / schtzngrmm / tttt / tttt / grrrmmmmm / tttt / sch / tzngrmm”). Within the genre of sound poetry, there are two main forms: lettristic and syllabic decomposition. A short anecdote explains the difference: the Dadaist Raoul Hausmann developed lettristic sound poetry in his early poem “fmsbw” from 1918. This is said to have inspired his successor Schwitters, whose famous “Ursonate” [The Sonata in Primal Speech] begins with the words “Fümms bö wö tää zää Uu”. With the “Ursonate”, Schwitters developed a syllabic variation of Hausmann's lettristic poems. The paper shows how to train a bidirectional LSTM network to distinguish these Dadaist sound poems from “normal” readout poems. In a further step, we also show how to distinguish between lettristic and syllabic decomposition. Based on a bidirectional LSTM network that reads encodings of the character sequence in the poem and uses the output of each directional layer, we identify poems of the sound poetry genre and differentiate between its two types of decomposition. Both classification tasks, sound poetry vs. other poetry and lettristic vs. syllabic decomposition, achieve high performance, with F-scores of 0.86 and 0.84, respectively.
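
The character-level bidirectional LSTM mentioned in the abstract could look roughly like the following minimal sketch. It is not the authors' implementation: the abstract names no framework, layer size, or training procedure. The names `build_vocab`, `encode`, `LSTMLayer`, and `BiLSTMClassifier`, the one-hot character encoding, the hidden size of 8, and the choice to concatenate the final hidden state of each direction are all assumptions made for illustration. The weights are random and untrained, so the output probability carries no meaning; only the data flow is shown.

```python
import math
import random


def build_vocab(poems):
    """Assign an integer id to every character; id 0 is reserved for unseen characters."""
    chars = sorted({ch for poem in poems for ch in poem})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(poem, vocab):
    """Encode a poem as a sequence of character ids."""
    return [vocab.get(ch, 0) for ch in poem]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMLayer:
    """One directional LSTM layer over one-hot characters (random, untrained weights)."""

    def __init__(self, vocab_size, hidden_size, rng):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        width = vocab_size + hidden_size + 1  # input columns, recurrent columns, bias
        # Gates: i = input, f = forget, o = output, g = candidate cell state.
        self.weights = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(width)]
                   for _ in range(hidden_size)]
            for gate in "ifog"
        }

    def _pre_activation(self, gate, unit, char_id, h):
        row = self.weights[gate][unit]
        # With a one-hot input, the input term is just the column of the active character.
        recurrent = sum(row[self.vocab_size + k] * h[k] for k in range(self.hidden_size))
        return row[char_id] + recurrent + row[-1]

    def run(self, char_ids):
        """Read the sequence in the given order and return the final hidden state."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for char_id in char_ids:
            new_h, new_c = [], []
            for j in range(self.hidden_size):
                i_gate = sigmoid(self._pre_activation("i", j, char_id, h))
                f_gate = sigmoid(self._pre_activation("f", j, char_id, h))
                o_gate = sigmoid(self._pre_activation("o", j, char_id, h))
                cand = math.tanh(self._pre_activation("g", j, char_id, h))
                cell = f_gate * c[j] + i_gate * cand
                new_c.append(cell)
                new_h.append(o_gate * math.tanh(cell))
            h, c = new_h, new_c
        return h


class BiLSTMClassifier:
    """Two LSTM layers (forward and backward) feeding one logistic output unit."""

    def __init__(self, vocab_size, hidden_size=8, seed=0):
        rng = random.Random(seed)
        self.forward_layer = LSTMLayer(vocab_size, hidden_size, rng)
        self.backward_layer = LSTMLayer(vocab_size, hidden_size, rng)
        self.out_weights = [rng.uniform(-0.1, 0.1) for _ in range(2 * hidden_size)]
        self.out_bias = 0.0

    def features(self, char_ids):
        # Assumption: "output of each directional layer" = final hidden state per direction.
        fwd = self.forward_layer.run(char_ids)
        bwd = self.backward_layer.run(char_ids[::-1])
        return fwd + bwd

    def predict_proba(self, char_ids):
        """Untrained probability that the poem belongs to the positive class."""
        feats = self.features(char_ids)
        score = sum(w * x for w, x in zip(self.out_weights, feats)) + self.out_bias
        return sigmoid(score)


# Sample strings taken from the poems quoted in the abstract.
poems = ["schtzngrmm tttt grrrmmmmm", "Fümms bö wö tää zää Uu"]
vocab = build_vocab(poems)
model = BiLSTMClassifier(vocab_size=len(vocab) + 1)  # +1 for the unknown id 0
prob = model.predict_proba(encode(poems[0], vocab))
```

In a real setup the weights would be learned from labelled poems, and the same architecture could be trained a second time for the lettristic vs. syllabic task; a deep learning library would replace the hand-written layer.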

Year: 2018
In session: Speech Processing and Prosody
Pages: 192 to 199