Phoneme-to-phoneme alignment and conversion


This paper presents new methods for phoneme-to-phoneme (P2P) alignment and conversion. Alignment is carried out with the dynamic programming procedure used to compute Levenshtein distance. Cost functions based on phoneme co-occurrence statistics and on distinctive feature vector distances that account for connected speech processes are compared. Given the aligned data, decision trees for P2P conversion across word boundaries are trained and evaluated. Among other findings, accounting for assimilation processes improved alignment quality, but these differences in alignment quality had no impact on P2P conversion performance.
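The alignment step can be illustrated with a minimal sketch of Levenshtein-style dynamic programming with a pluggable substitution cost. This is not the paper's implementation: the function names `align` and `unit_cost`, the uniform insertion/deletion cost, and the example phoneme sequences are assumptions made for illustration. A cost function derived from co-occurrence statistics or from distinctive feature distances would be passed in place of `unit_cost`.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# An aligned pair; None marks a gap (insertion or deletion).
Pair = Tuple[Optional[str], Optional[str]]


def unit_cost(a: str, b: str) -> float:
    """Plain Levenshtein substitution cost (assumed baseline, not from the paper)."""
    return 0.0 if a == b else 1.0


def align(
    src: Sequence[str],
    tgt: Sequence[str],
    sub_cost: Callable[[str, str], float] = unit_cost,
    indel_cost: float = 1.0,
) -> Tuple[float, List[Pair]]:
    """Return the minimal alignment cost and one optimal alignment path."""
    n, m = len(src), len(tgt)
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    # Borders are accumulated so that the backtrace comparisons below are exact.
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + indel_cost
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + indel_cost

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist[i][j] = min(
                dist[i - 1][j - 1] + sub_cost(src[i - 1], tgt[j - 1]),  # match/substitute
                dist[i - 1][j] + indel_cost,  # delete src[i-1]
                dist[i][j - 1] + indel_cost,  # insert tgt[j-1]
            )

    # Backtrace from the bottom-right cell to recover one optimal alignment.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (
            i > 0
            and j > 0
            and dist[i][j] == dist[i - 1][j - 1] + sub_cost(src[i - 1], tgt[j - 1])
        ):
            pairs.append((src[i - 1], tgt[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or dist[i][j] == dist[i - 1][j] + indel_cost):
            pairs.append((src[i - 1], None))
            i -= 1
        else:
            pairs.append((None, tgt[j - 1]))
            j -= 1
    pairs.reverse()
    return dist[n][m], pairs


# Hypothetical example: a nasal assimilating in place before a labial stop.
cost, pairs = align(["a", "n", "b"], ["a", "m", "b"])
```

A cost function that makes assimilation-like substitutions (for instance n to m) cheaper than arbitrary ones would steer the backtrace toward aligning such pairs directly instead of treating them as a deletion plus an insertion.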

Year: 2010
In session: Speech Synthesis
Pages: 126 to 133