Learn a language by chatting with an AI tutor
Abstract
- With the Quazel app, learners can speak to an AI tutor in 21 languages. If they make a mistake, the app corrects them in real time.
- The chatbot understands how well someone is speaking and adjusts its expressions, choice of words and sentence structure accordingly.
- One of the biggest challenges for the three founders was to tame the publicly available language models.
Engaging in regular conversations with a qualified language teacher or language-exchange partner is the best way to learn a foreign language. But this can be a complicated and expensive process.
One alternative is the new app developed by ETH spin-off Quazel, which lets users chat with an AI tutor on their smartphone whenever and wherever they choose. The AI tutor speaks 21 languages perfectly – and it has endless patience with learners!
“We want to make language learning as easy and accessible as possible,” says founder and CEO Philipp Hadjimina, who studied computer science at ETH Zurich. “The idea is to help as many people as possible enjoy the benefits of a truly personal language tutor.”
Fluid conversation with a chatbot
Until recently, most language-learning apps focused primarily on written tasks. Even those that offered a voice-recognition function generally only responded with predetermined sentences. This often gave conversations an artificial, scripted feel rather than allowing them to flow freely. The rapid development of large language models such as ChatGPT has now, for the first time, made it possible to have fluid, natural-sounding conversations with an AI.
These technological advances have also been harnessed by the Quazel app: “Learners can discuss almost anything they like with their AI tutor – from ordering at restaurants and chatting about their favourite sport to debating philosophical issues. And if they make a grammatical mistake or use the wrong word, the app corrects them in real time,” Hadjimina says.
Conversations generally begin with the chatbot asking a question to which the learner provides a spoken response. Especially for beginners, conversations are then primarily driven by these prompts from the chatbot. As the conversation continues, however, the AI automatically adapts the complexity of its answers to the learner’s level.
Harnessing large language models
All these features are made possible by the large, publicly available language models running behind the scenes in the Quazel app. These models were trained on large datasets from online sources, including numerous books, articles, websites and social media posts. Based on these materials, the AI tutor learned the grammatical rules of 21 languages as well as the typical semantic relationships between words and sentences.
The principle behind these language models is similar to that of the predictive-text function on a smartphone: based on the messages we have typed in the past, and drawing on a large database of known words and sentences, an AI predicts which word is most likely to come next in a sequence.
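To make the analogy concrete, here is a minimal, self-contained sketch of that next-word-prediction idea (an illustrative toy, not Quazel’s actual technology): it simply counts which word tends to follow which in a handful of example sentences and then predicts the most frequent continuation.

```python
# Toy illustration of next-word prediction.
# Real large language models use neural networks trained on vast corpora;
# this sketch only counts word pairs in a tiny sample of German sentences.
from collections import Counter, defaultdict

corpus = (
    "ich möchte einen kaffee bestellen . "
    "ich möchte einen tee bestellen . "
    "ich möchte gerne bezahlen ."
).split()

# Count how often each word follows a given word (a simple bigram model).
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word: str) -> str:
    """Return the most frequent word seen after `word`, or '?' if unseen."""
    candidates = following.get(word)
    return candidates.most_common(1)[0][0] if candidates else "?"

print(predict_next("möchte"))  # prints "einen": seen twice, versus "gerne" once
```

A real large language model does the same kind of prediction with a neural network trained on billions of sentences, which is what lets it generalise far beyond anything it has literally seen before.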
Quazel works in much the same way: “Even though our chatbot may never have heard a user say a certain sentence in a particular way, it can still work out which response would be most appropriate based on the thematic context, past conversations and its background knowledge,” says ETH computer scientist and co-founder David Niederberger, who is responsible for the technology at Quazel.
Helping the model adapt
One of the biggest challenges for the three founders was to tame the publicly available language models. Models such as OpenAI’s GPT-4 have become so sophisticated that their answers are often too complicated for language learners. “For example, we had to use feedback to teach our AI language tutor what someone at a beginner’s level actually sounds like,” Niederberger says. In short, the chatbot understands how well someone is speaking a language and adjusts its expressions, choice of words and sentence structure accordingly.
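One plausible way to picture this kind of level adjustment (a hypothetical sketch, not Quazel’s actual implementation) is as a set of instructions that changes with the learner’s estimated level before being handed to the underlying language model:

```python
# Hypothetical sketch: steer the tutor's replies by adapting the instructions
# sent to a large language model according to the learner's estimated level.
LEVEL_INSTRUCTIONS = {
    "beginner": "Use short sentences, common words and only the present tense.",
    "intermediate": "Use everyday vocabulary and occasionally introduce new phrases.",
    "advanced": "Speak naturally, including idioms and complex sentence structures.",
}

def build_tutor_prompt(language: str, level: str, learner_message: str) -> str:
    """Assemble the instruction text that would be sent to a language model."""
    return (
        f"You are a friendly {language} tutor. "
        f"{LEVEL_INSTRUCTIONS[level]} "
        "Gently correct any mistakes in the learner's message, "
        f"then continue the conversation.\n\nLearner: {learner_message}"
    )

# The same learner message yields very different instructions depending on
# the level the app has estimated for the user.
print(build_tutor_prompt("Spanish", "beginner", "Yo quiero ir al cine ayer."))
```

Quazel’s founders describe shaping this behaviour with feedback rather than hand-written rules like these; the sketch only illustrates the general idea of conditioning the model on the learner’s level.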
Choosing the right language model proved to be another source of sleepless nights for the three entrepreneurs. If the model is too complex, it requires very large amounts of computing power, which makes it too expensive for end users. But if it’s too simple, there’s no fluidity to the conversations. “Choosing the model was about finding the right balance between complexity and cost. It’s exactly the kind of classic engineering dilemma that we encountered during our time at ETH,” Niederberger says.
To complicate matters further, the rapid evolution of the language-model market means that new and better models are appearing on an almost weekly basis. Staying on top of which new developments are worth incorporating into Quazel requires a great deal of flexibility on the part of the three founders – and a good feel for how things are likely to develop in the future.
Strong demand from the start
Last autumn, the three company founders were invited to join the renowned Y Combinator startup accelerator, and they are currently working out of an Airbnb in San Francisco. As part of the programme, they get access to a large network of experienced entrepreneurs and potential investors.
The popularity of their language-learning app was clear from the moment they launched it: “Quazel just exploded. We had 50,000 people using the prototype within just two days,” Hadjimina says. That’s a trend the Quazel founders hope to continue.