March 18, 2025 · 5 min read · The Lingo team

From text to voice: the next chapter

Text was the on-ramp. But these are spoken languages, and the people who hold them speak more than they write. Lingo becomes a place to contribute your voice.

product · voice · community

Our first chapter was text: a compiled corpus, French-pivot models, an open release. It worked — but it also kept bumping into the same wall. These are oral-first languages. The richest knowledge lives in people who speak fluently and may never write a line. A text-only project asks exactly the wrong thing of exactly the right people.

So Lingo grows a second half: a place to contribute your voice.

The loop

The design is deliberately simple, built for low bandwidth and for speakers who have never typed in their language:

Record. You see a prompt in a language you read. You hold a button and say it in your mother tongue. Re-record until it feels right.
Verify. Others listen and judge whether a recording matches the prompt. Consensus — not a single gatekeeper — decides what enters the corpus.
Earn. Quality contributions earn points from a campaign's budget, redeemable for cash, mobile money, or goods. Preservation should not be unpaid labour.
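
For readers who think in code, here is a minimal sketch of how the verify and earn steps could fit together. It is an illustration under stated assumptions, not Lingo's actual implementation: the names `Recording`, `consensus`, and `award_points`, the three-vote minimum, the two-thirds threshold, and the ten points per recording are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Recording:
    """One spoken contribution awaiting verification (hypothetical shape)."""
    prompt: str
    speaker: str
    # Each vote is True when a verifier judged that the audio matches the prompt.
    votes: list[bool] = field(default_factory=list)


def consensus(rec: Recording, min_votes: int = 3, threshold: float = 2 / 3) -> str:
    """Return 'pending', 'accepted', or 'rejected'.

    min_votes and threshold are illustrative values, not Lingo's real rules.
    """
    if len(rec.votes) < min_votes:
        return "pending"  # not enough verifiers have listened yet
    share = sum(rec.votes) / len(rec.votes)
    return "accepted" if share >= threshold else "rejected"


def award_points(rec: Recording, budget: int, points_per_recording: int = 10) -> tuple[int, int]:
    """Pay the speaker from the campaign budget if the recording was accepted.

    Returns (points_awarded, remaining_budget) and never overspends the budget.
    """
    if consensus(rec) != "accepted":
        return 0, budget
    awarded = min(points_per_recording, budget)
    return awarded, budget - awarded


rec = Recording("Good morning", "speaker-1", votes=[True, True, False])
status = consensus(rec)                 # two of three verifiers agree
awarded, left = award_points(rec, budget=15)
```

In this sketch no single verifier can admit or block a recording on their own, and a payout is capped by whatever remains in the campaign's budget.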

Why this changes everything

Voice data breaks the dependency on written, formal sources. Instead of the register of scripture and schoolbooks, we get everyday speech — how people actually greet, bargain, and tell stories. That is the data that lifts a model past its text-bound ceiling, and it can only come from speakers themselves.

Researchers launch campaigns, set a points budget, import prompts from a CSV (or generate culturally-adapted ones), and invite speakers and verifiers by role with a single link. Verified recordings flow into a clean, exportable dataset.
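
As a rough sketch of what importing prompts from a CSV might involve, the snippet below reads rows with Python's standard `csv` module and skips any row whose prompt is empty. The column names `prompt` and `language` and the function `load_prompts` are assumptions made for illustration; the real file layout may differ.

```python
import csv
import io


def load_prompts(csv_text: str) -> list[dict[str, str]]:
    """Parse prompts from CSV text, skipping rows with an empty prompt.

    The "prompt" and "language" column names are hypothetical.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    prompts = []
    for row in reader:
        text = (row.get("prompt") or "").strip()
        if not text:
            continue  # nothing for a speaker to read aloud
        language = (row.get("language") or "").strip()
        prompts.append({"prompt": text, "language": language})
    return prompts


sample = "prompt,language\nGood morning,en\n,fr\nHow much is this?,en\n"
prompts = load_prompts(sample)  # the row with an empty prompt is dropped
```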

It runs in the browser and installs as an app, because the people who hold these languages are not all on the latest phone with unlimited data.

Text got us a foothold. Voice is how the languages get to speak for themselves.