Program - ESSV 2026

Conference Program / Wissenschaftliches Programm

Time	Wednesday (March 4)	Thursday (March 5)	Friday (March 6)
Morning	Arrival	9:00–10:00 Keynote 2: Thomas Haigh	9:30–10:30 Keynote 4: Hans Rudolf Straub
Morning	Arrival	10:20–11:40 Session 3: Speech Synthesis	10:50–12:10 Session 5: Voice, Language and Cognition
Noon	11:30–12:30 Registration; 12:30–13:00 Welcome	11:40–12:40 Lunch in the Mensa; 12:40 Photo session (...)	12:10 Best Student Paper Award and Closing Remarks; 12:30 Farewell
Afternoon	13:00–14:00 Keynote 1: Rüdiger Hoffmann	13:00–14:40 Guided city tours	Departure
Afternoon	14:20–15:40 Session 1: Speech Signal Recognition and Enhancement	15:00–16:00 Keynote 3: Thomas Hoffmann	
Afternoon	16:00–17:20 Session 2: Speech Analysis I	16:20–17:20 Session 4: Speech Analysis II	
Evening	17:30 Reception (Sommerresidenz, Holzer-Saal)	17:40–18:40 Show & Tell Poster Session	
Evening	19:00 ESSV Business Meeting (at Café im Paradeis)	19:00 Conference Dinner at GUTMANN	

The detailed program may still change slightly.

Rüdiger Hoffmann:
Über den Versuch, ein totgesagtes Pferd zu reiten.
Sprachtechnologie in der DDR im zeitlichen und räumlichen Kontext

Thomas Haigh:
Is speech recognition “artificial intelligence”?
A historical examination of academic branding.

Thomas Hoffmann:
Constructions, Computation and Creativity:
How different is LLM and human linguistic creativity?

Hans Rudolf Straub:
Formale Semantik im offenen semantischen Raum

Chair: Sven Grawunder

Chair: Bernd Möbius

Session 1: Speech Signal Recognition and Enhancement
Abdullah al Foysal, Ronald Böck:
Enhancing ASR for German Medical Domain without Fine-Tuning
Nilesh Madhu:
Iterative Ambient-Signal-Aware Speech Enhancement via Cascaded DNN Processing without Retraining
Lisa Winkler, Andreas Wendemuth:
An Approach to Improving Robustness in Dynamic Acoustic Environments: Context Noise Representation Learning from Urban Speech Emotion Recognition
Sophie Hoppe, Anabell Hacker, Markus Brückl:
Im Raum der Täuschung - Raumhall als Schwachstelle automatischer Deepfake-Erkennung

Chair: Günther Wirsching

Session 2: Speech Analysis I
Marcella Palladino:
Zur Transkription mündlicher Phänomene in der politischen Sprache
Jürgen Trouvain:
Dialektale Vielfalt in visuellen und auditiven Illustrationen: „Nordwind und Sonne“ in saarländischen Dialekten
Neda Mousavi, Felix Burkhardt:
The Emotional Portrayal of an Ordinary Talk
Daniel Duran, Robert Fromont, Jennifer Hay, Allie Osborne, Melanie Weirich, Miriam Oschkinat, Stefanie Jannedy:
Evaluating Full Automation of Computational Sociophonetics for the Plapper Corpus

Chair: Andreas Wendemuth

Chair: Jürgen Trouvain

Session 3: Speech Synthesis
Yamini Sinha, Ingo Siegert:
From Writing to Speaking: on the Limits of Text-Trained Authorship Models for Speech Transcripts
Ian Howard:
A Servo-Motor-Actuated Artificial Lung for Robotic Speech Production
Tianyi Zhang, Peter Birkholz:
Joint Estimation of Source and Filter Parameters for Speaker Adaptation in Articulatory Speech Synthesis
Paul Konstantin Krug, Christoph Wagner, Peter Birkholz, Timo Stich:
TensorTract3: Pushing the Limits of Articulatory Speech Synthesis

Chair: Oliver Jokisch

Chair: Ronald Böck

Session 4: Speech Analysis II
Daniel Duran, Laurens Winkler, Sina Zarrieß:
How well can LLMs handle novel phonetic forms?
Eugenia Rykova, Tanja Rinker, Angela Grimm:
ASR-based Automatic Assessment of Oral Production Tasks in Multilingual Children
Niklas Berensmeyer, Stefan Hillmann, Wolfgang Maier:
Measuring User Acceptance of Proactively Played Touristic Texts in an In-Car Voice Assistant

Chair: Ronald Römer

Chair: Sebastian Möller

Session 5: Voice, Language and Cognition
Matthias Busch, Jonas Schewior, Andreas Wendemuth, Ingo Siegert:
Creating Documents with Voice: Maybe it is not about Transcription but Reflection?
Moinam Chatterjee, Behnam Ensan, Andreas Wendemuth, Ayoub Al-Hamadi:
Think Like a Team: Graph-based Representation of Shared Mental Models in Human-Agent Collaboration
Ronald Römer, Johannes F. Kuhn, Markus Huber-Liebl, Peter b. Graben, Matthias Wolff:
Ein konzeptioneller Beitrag zur Entwicklung und Nachbildung von Problemlöse- und Sprachfähigkeiten
Stefan Hillmann, Philipp Harnisch, Daniel Schuhmann, Navid Ashrafi, Jan-Niklas Voigt-Antons:
A Modular Multimodal Dialog Architecture for Digital PROM Collection

P1 Harald Höge:
Towards a Brain-Computer Interface Modelling the Phonological Short-Term Memory
P2 Syed Hur Abbas, Peter Birkholz, Muhammad Arif:
Feature-Enhanced Consensus Graph Model for EEG-based Imagined Word Recognition
P3 Marcella Palladino, Vincenzo Gannuscio:
Außerparlamentarische politische Kommunikation: Datenerhebung und Analyseperspektiven
P4 Anabell Hacker, Iris Sidonie Bakker, Ingo Siegert:
Evaluation of WebRTC as a Framework for Voice Recordings in Online Surveys
P5 Martha Schubert, Valentin Kany:
Automatic Detection of Disfluencies in L1 and L2 Child Speech
P6 Sven Grawunder, Ute Gradmann:
Assessing Speaking Modes in Radio News Using Topic Classification and Acoustic Parameters
P7 Zihao Huang, Tianyi Zhang, Peter Birkholz:
Self-Supervised Multi-Task Learning for Enhanced Prosody Prediction in German Articulatory Speech Synthesis
P8 Robin Bitterlich, Paul Böhm, Oliver Jokisch:
Parameter Optimization for Administration-Specific Speech Transcription with the Faster Whisper System

S1 Thomas Ranzenberger, Steffen Fresinger, Tobias Bocklet, Korbinian Riedhammer:
HAnS: Multimodal RAG-based Persona Generation for Media and Documents in E-Learning
S2 Felix Gräßer, Robert Wardenga, Dominik Jülg, Christian Gaida, Rico Petrick:
Alphaspeech Transcribe – eine autonome, containerisierte Speech-to-Text-Plattform für professionelle Transkriptions- und Dokumentationsworkflows
S3 Bernd J. Kröger:
DYNARTmo: Ein dynamisches Artikulationsmodell für die Lehre