How a machine can be a personalized language tutor 

Okim Kang at her desk

Speaking German to a machine is easy. The machine understanding and correcting you appropriately—that’s a little harder. 

Or rather, it’s hard for the machine to parse a German language learner’s pronunciation. Did you make the small vowel shift an umlaut calls for? Was there enough phlegm in the more guttural words? Did the app cheerfully announce you did a good job while a German speaker would flinch at the butchering of a word? 

Well, the German speaker would probably let it go, but it would be better if the app didn’t. A language learning machine’s effectiveness can be helped, or hindered, by how well it recognizes and corrects seemingly small variations in pronunciation. To achieve this, machines need a learner-specific language-feedback program that can discern pronunciation differences and correctly interpret learners’ speech patterns. Currently, no such program exists, even as the need for global communication grows. 

“With the rise of English as an international language, intelligibility-based successful communication has been emphasized over native-like accents,” said Okim Kang, a professor of applied linguistics who studies accents and second-language learning. “However, second-language teachers often raise concerns about learners’ slow or stagnant pronunciation process, and they have no systematic way to assess each student’s speech changes, nor can students monitor and track feedback related to their pronunciation learning progression.” 

To address this gap, Kang, the principal investigator, received a $300,000 grant from the National Science Foundation’s Early-concept Grants for Exploratory Research program to explore the formulation of objective speech intelligibility measures with learner-specific feedback. The beginning phase of the project focuses on developing an operational collection of speech, language and perception-based measures to objectively assess speech intelligibility for second-language speech development. 

This transformative method of language learning will use advanced Automatic Speech Recognition (ASR)-based accent classification technology that will give both teachers and students individual and objective feedback. This first phase of the project will establish a baseline framework for operational objectives and proof-of-concept assessment feedback. Future phases should develop this ASR approach into much more effective language-learning technology. 

This approach will help teachers gauge learners’ intelligibility levels and allow learners to self-regulate their progress incrementally over time. It could be a game-changer, Kang said, bringing more effective language learning into living rooms and study halls and creating a low-pressure environment for people worried about tripping over unfamiliar words. Her machine-based pronunciation feedback will assess learners’ speech against that of highly intelligible second-language speakers, not against a native speaker. Because its feedback comes from comparison with other successful learners, it is more learner-friendly. 
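Kang’s actual models are still in development, but the comparison-based idea can be sketched in miniature. The example below is a hypothetical illustration, not the project’s method: invented pronunciation-feature vectors for a learner are scored by similarity to a pool of highly intelligible second-language speakers, rather than to a single native-speaker target.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def intelligibility_score(learner, reference_pool):
    """Average similarity of the learner to each highly intelligible reference speaker."""
    sims = [cosine_similarity(learner, ref) for ref in reference_pool]
    return sum(sims) / len(sims)

# Hypothetical acoustic features (e.g., vowel quality, stress timing) — all
# values are invented for illustration only.
reference_pool = [
    [0.82, 0.40, 0.65],
    [0.78, 0.44, 0.60],
    [0.85, 0.38, 0.70],
]
learner = [0.80, 0.55, 0.50]

score = intelligibility_score(learner, reference_pool)
print(f"Intelligibility score: {score:.2f}")
```

In a real system, the feature vectors would come from automatic speech recognition output rather than hand-entered numbers, but the design choice the sketch shows is the one the article describes: the benchmark is a pool of successful second-language speakers, not a native-speaker ideal.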

“This is a pretty innovative project because basically no real learner-specific feedback programs exist in any of these language learning fields,” Kang said. “People develop various language learning and technology-based programs, but its effectiveness is always a question. No program can provide learner-based issues directly.” 

There are economic benefits to better language learning. The United States is home to thousands of skilled professionals from non-English speaking countries, many of whom work in various STEM fields. More effective language learning will make them more successful communicators. Additionally, this interdisciplinary project provides various opportunities for hands-on training and experience for both graduate and undergraduate students in the fields of language education, applied linguistics, computer engineering and speech technology. 

This is especially important for Arizona, which is home to many people who are not native English speakers, including more than 75,000 English language learners in Arizona’s public schools. When this program is developed, students and teachers will have a teaching and learning tool that better meets their needs. 

Kang is collaborating with co-principal investigators John Hansen from the University of Texas at Dallas and Stephen Looney from Penn State University. 

This research, because of its interdisciplinary nature, also will contribute to advancements in other fields, including computer science technologies, error correction feedback systems, algorithm development and artificial intelligence. Kang also expects to see more interdisciplinary work between linguistics, computer science, education and speech science. 

For Kang, whose research career is focused on improving successful global communication, this is the wave of the future. She is submitting another grant proposal for almost $1 million to continue this work. 

“In various fields, these artificial intelligence-related, ASR-based approaches are taking hold across research areas, certainly including language education,” she said. “Technology is an inevitable aspect of language teaching and learning tools, and my research is providing direct evidence for effectively incorporating the advancement of technology into the language classroom.” 

Learn more about Kang’s research. 

Heidi Toth | NAU Communications
(928) 523-8737