Advances in Children’s Speech Recognition with Application to Interactive Literacy Tutors
By Andreas Hagen
University of Colorado, Boulder, 2006

The focus of this thesis is the role of automatic speech recognition systems in automated literacy tutors, and techniques for improving speech recognition when the speaker is a child. An automated literacy tutor needs to function like a real teacher in that it selects tasks appropriate for the child, observes the child’s behavior, and provides feedback. Part of this involves being a good listener. However, automatic speech recognition systems typically have higher error rates for children than for adults.

The thesis presents an overview of the most common speech recognition techniques and of existing automated literacy tutors, both academic and commercial, and analyzes the strengths and weaknesses of these systems. It also discusses why speech recognition is difficult for children: there is a significant amount of variability in the characteristics generally used for recognizing speech, such as fundamental frequency, sentence length, and vowel and fricative length. A reading tutor faces additional difficulties due to the nature of the product: children pronounce words incorrectly, repeat them, or read with varying timing and pausing.

The thesis proposes a set of speech recognition techniques to improve oral reading tracking. These techniques involve modifications to statistical language modeling and acoustic modeling, and a combination of modifications in these areas significantly reduced error rates in read-aloud tasks. In addition, the thesis proposes a hybrid word/subword unit speech recognition system in which the length of the subword units is somewhere between a phoneme and an entire word. This system showed accuracy equivalent to that of existing systems, but enabled finer-grained analysis of speech so that mispronunciations could be identified at the subword level.
What I learned from reading this is that a PhD thesis is going to take a long time to write. It would be hard to assess whether this was a good thesis, since it’s the first one I’ve ever read. However, it was definitely long and seemed very thorough. I think the subject matter was a worthwhile topic and I can see how this research would contribute to the betterment of mankind.

Last modified 4 December 2007 at 10:07 pm by RhondaHoenigman