ESSV Archive

Studientexte zur Sprachkommunikation Band 110: Elektronische Sprachsignalverarbeitung 2025

Conference proceedings of the 36th conference in Halle/Saale with 41 contributions. Editor(s): Sven Grawunder. ISBN: 978-3-95908-803-9


Multimodal Perception of Speech and Non-verbal Cues

Smiling PAULE

Konstantin Sering

Auf die inneren Werte kommt es an? – Relevanz von Stimme und Gesicht bei der Beurteilung von Attraktivität, Sympathie und Persönlichkeit

Anabell Hacker

Recognition of audio-visual attitudes

Phrashant Khatri, Hansjörg Mixdorff, Preeti Rao, Albert Rilliard

Computational linguistics and LLM-related systems

Wortgenerator für minimalistische Grammatiken

Johannes Kuhn, Matthias Wolff, Isidor Konrad Maier

Structured review on RAG- and multi-agent frameworks: literature overview

Md Monsur Ali, Abdullah Al Foysal, Siddarth Venkateswaran, Ronald Böck

Structured review on RAG- and multi-agent frameworks: application-based assessment

Md Monsur Ali, Abdullah Al Foysal, Siddarth Venkateswaran, Ronald Böck

Frequency-magnitude relation of numeral words based on search-engine results

Isidor Konrad Maier, Tillmann Rosenow, Okko Tuuri, Matthias Wolff

Recognition in HMI and Therapeutic Applications

Evaluating the user interface of the Rehalingo speech training system with aphasic patients

Hans-Günter Hirsch, Yannic Tiggelkamp, Christian Neumann, Hendrike Frieg, Stefan Knecht

Evaluating optopalatography sensor positions for command word recognition

Arne-Lukas Fietkau, João Menezes, Peter Birkholz

Cross-lingual transfer learning does not improve aphasic speech recognition

Sara Mühlhausen, Sarah Gomez, Norina Lauer, Timo Baumann

Testing the strategic elicitation of creative pronunciations in monologues and dialogues

Daniel Duran, Leonie Schade, Joana Cholin, Petra Wagner

Benchmarking ASR and TTS

Significance scoring for summarizing lecture recordings: a multi-modal perspective

Raviteja Boddu, Anderson De Lima Luiz, Munir Georges

Evaluation of recognition errors of hybrid and transformer-based ASR systems in German video lectures

Thomas Ranzenberger, Ilja Baumann, Sebastian P. Bayerl, Dominik Wagner, Tobias Bocklet, Korbinian Riedhammer

Speech-to-text in Upper Sorbian: current state

Ivan Kraljevski, Frank Duckhorn, Daniel Sobe, Constanze Tschöpe, Matthias Wolff

Rule-based grammatical error detection on spontaneous children’s speech

Christopher Gebauer, Lars Rumberg, Fabian Witt, Edith Beaulac, Hanna Ehlert, Jörn Ostermann

Multilingual Speech and Language Data Processing

A multilingual corpus of German, French and Italian political discourse: goals and methodological challenges

Silvia Modena, Marcella Palladino, Vincenzo Gannuscio

Eine Datenbank für Markensprechweise (BrandDB)

Markus Brückl, Anabell Hacker, Nancy Wünderlich, Katrin Talke, Dalida Valeeva

Teilautomatisierter Workflow zur Aufbereitung grosser Audiodatenmengen für signalbasierte Analysen

Christoph Draxler, Felicitas Kleber, Sven Grawunder, Jürgen Trouvain

Real-time audio transcriber for language barrier-free classrooms

Huiyu Liu, Gokul Srinivasagan, Munir Georges

Voice, Language and Cognition

Effects of loudness on timbre features: comparison of different languages and scenarios

Oliver Niebuhr, Rongjie Shi, Wentao Gu

The effects of lexical frequency on anticipatory voice assimilation in Bulgarian obstruents

Mitko Sabev, Bistra Andreeva, Bernd Möbius, Ivan Yuen, Omnia Ibrahim

It all starts with a little difference: tensors as data and code

Markus Huber-Liebl, Tillmann Rosenow, Ronald Römer, Günther Wirsching, Matthias Wolff

State space model of airflow in the human vocal apparatus

Ian S. Howard

Poster

Cortical segmentation of syllables based on phases of θ-cycles

Harald Höge

Relationship between speaking speed and pleasantness of listening speed

Daniel Schuhmann, Philipp L. Harnisch, Stefan Hillmann

Politolinguistics and spoken language processing: comparative analysis of German and Italian political speeches. A methodological framework

Marcella Palladino

Quality of experience of German machine translation and automatic text summarization

Shushen Manakhimova, Vivien Macketanz, Sebastian Möller

Adapting a student-facing chatbot to the needs of first-generation students: a user experience study

Maria K. Wolters, Tatjana Kukic, Stefan Hillmann

Modular text normalization pipeline for language model training

Lisa Winkler, Melanie Schindler, Aaricia Herygers, Christian Gaida, Felix Gräßer, Rico Petrick, Frank Eisenhaber, Matthias Henker

Gender spectrum data from podcasts – a proof of concept

Jan Marquenie, Mareile Leonhardt, Sven Grawunder, Ingo Siegert

Annotation of disfluencies in child speech

Valentin Kany, Jürgen Trouvain

Pattern-based parsing of German traffic regulations (StVO) for legal knowledge graph construction

Ibrahim Siddig, Sviatoslav Tugeev, Munir Georges

Evaluating chain-of-thought prompting for abstractive dialogue summarization with large language models for German

Neha Deshpande, Stefan Hillmann, Sebastian Möller

An unsupervised approach to exploring speaking task complexity based on fluency metrics

Neda Mousavi, Sven Grawunder

Experimente zur Transkription von Verwaltungsbesprechungen und domänenangepasste Ergebnisprotokollierung

Robin Bitterlich, Oliver Jokisch, Ullrich Prax, Rocco Zimmermann

Speech technology in psychotherapy: exploring transcription tools and their potential impact

Martha Schubert, Matthias Busch, Julia Krüger, Ingo Siegert

Scalable engine and the performance of different LLM models in a SLURM-based HPC architecture

Anderson De Lima Luiz, Shubham Vijay Kurlekar, Munir Georges

Show and Tell

Optopalatographic device “OPG2023”

Arne-Lukas Fietkau, João Menezes, Jihyeon Yun, Peter Birkholz

Avatar-gestützte digitale Aphasietherapie im Projekt APHADIGITAL – Prototyp der therapeutischen Komponenten

Judith Pietschmann, Susanne Voigt-Zimmermann, Elisabeth Zeuner, Richard Fiebelkorn, Eugenia Rykova, Mathias Walther

Voice and personality – music psychological aspects in speech perception

Dalida Valeeva

Phonetic distances in L3-speech

Konstantin Sering, Yu-Hsiang Tseng, Adriana Hanulikova
