Data-Driven Measures of Lectal Coherence via Speech Embedding Analysis

Michele Gubian, James Kirby, Barbara Plank (LMU Munich)

This project develops AI-based methods to quantify lectal coherence by analysing speech embeddings. Instead of relying solely on manual annotation, the project systematically manipulates speech corpora and measures how phonetic, lexical, and prosodic features contribute to coherence. Comparing the manipulated data with the originals yields scalable, automated metrics of dialectal variation. These methods are validated against results from other RU projects, advancing both linguistic theory and machine learning approaches to modelling variation, coherence, and language change.
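
As an illustration only, the comparison of manipulated and original data could be sketched as follows. This is a minimal sketch, not the project's actual method: it assumes each utterance has already been mapped to a fixed-size embedding vector, it uses mean cosine distance as a stand-in metric, and the function names (cosine_distance, manipulation_shift) and the random placeholder arrays are hypothetical.

```python
import numpy as np


def cosine_distance(a, b):
    """Row-wise cosine distance between two (n, d) arrays of embeddings."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - np.sum(a_unit * b_unit, axis=1)


def manipulation_shift(original, manipulated):
    """Mean embedding shift caused by one manipulation (hypothetical metric).

    Row i of both arrays must describe the same utterance, before and
    after the manipulation.
    """
    return float(np.mean(cosine_distance(original, manipulated)))


# Placeholder data: random vectors standing in for real speech embeddings.
rng = np.random.default_rng(0)
original = rng.normal(size=(100, 16))

# Two simulated manipulations: one that barely moves the embeddings,
# one that moves them a lot (assumption: noise stands in for the effect
# of altering, say, a prosodic or a phonetic feature).
weak = original + 0.1 * rng.normal(size=original.shape)
strong = original + 1.0 * rng.normal(size=original.shape)

identity_shift = manipulation_shift(original, original)
weak_shift = manipulation_shift(original, weak)
strong_shift = manipulation_shift(original, strong)

print(f"identity: {identity_shift:.4f}")
print(f"weak:     {weak_shift:.4f}")
print(f"strong:   {strong_shift:.4f}")
```

Under these assumptions, a manipulation that leaves the embeddings unchanged scores near zero, and a larger score suggests that the manipulated feature carries more of the information the embeddings encode. Real speech embeddings and a validated distance measure would replace the placeholders.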