For non-Latin scripts

Half the world doesn't type in Latin letters. They speak.

Pinyin, kana, Hangul, and Arabic input methods all add steps that typing in the Latin alphabet doesn't have. Voice removes every one of them. HeySpeak is the feedback tool that fits how your audience already communicates.

Try it with your audience
5 free responses, no credit card

Why voice beats text forms for non-Latin scripts

Typing in Mandarin, Japanese, Korean, Arabic, or Hindi on a phone is slower than typing in English. Pinyin input requires a phonetic-to-character conversion on every word. Japanese requires switching between hiragana, katakana, and kanji. Speech bypasses all of it. A 60-second voice note carries more detail than a customer would ever type into a form.

30-50 characters per minute typing Pinyin on mobile, versus 200-250 syllables per minute in spoken Mandarin

~50% of the world reads and writes in a non-Latin script as a primary language

Daily: voice messages on WeChat, KakaoTalk, and LINE are already a default mode in East Asia

Why text forms are slow for half the world

English-speaking product teams write feedback forms with English-speaking customers in mind. A standard mobile keyboard on a US or European phone hands the user a one-to-one map: the key you press is the letter you get. The customer types a sentence and submits.

For a Mandarin, Japanese, Korean, Arabic, or Hindi speaker, the same form looks different. A Chinese user types Pinyin, a romanized phonetic version of the word, then picks the correct Han character from a list of homophones the input method shows them. Every word is a small decision tree. A Japanese user moves between hiragana, katakana, and kanji within a single sentence, often switching scripts mid-word. A Korean Hangul keyboard is faster than Chinese or Japanese input but still slower than speech. An Arabic or Hebrew speaker types right-to-left on a system that was designed left-to-right.

None of this is exotic. It is the everyday cost of typing in a non-Latin script on a phone. The result is the same in every market: customers write less than they would say. The form gets the short version of the answer, not the real one.

What Pinyin and IME input actually costs you

Average mobile typing speed in Pinyin lands somewhere around 30 to 50 characters per minute for most users. Spoken Mandarin runs at roughly 200 to 250 syllables per minute in normal conversation. The gap is not a few seconds. It is a different unit of effort.

That gap shows up in your feedback data as silence. The customer who would have given you three sentences of useful detail gives you one short answer, or skips the form entirely. You read the responses and conclude the audience “doesn't engage” with feedback. They do. They just don't want to type 200 characters of Pinyin to do it.
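To put rough numbers on that 200-character answer, here is a back-of-the-envelope sketch using the speed ranges quoted above. It assumes one spoken syllable per Han character, which is a simplification, and the function and variable names are ours, for illustration only.

```python
# Rough estimate only: how long a 200-character Mandarin answer takes
# to type in Pinyin versus to say out loud.
# Assumption: one Han character is about one spoken syllable.

ANSWER_LENGTH = 200  # characters typed, or syllables spoken


def minutes_to_type(chars: int, chars_per_minute: float) -> float:
    """Minutes needed to type `chars` characters at the given speed."""
    return chars / chars_per_minute


def seconds_to_speak(syllables: int, syllables_per_minute: float) -> float:
    """Seconds needed to speak `syllables` syllables at the given rate."""
    return syllables * 60 / syllables_per_minute


# Typing range quoted above: 30 to 50 characters per minute.
typing_slow = minutes_to_type(ANSWER_LENGTH, 30)
typing_fast = minutes_to_type(ANSWER_LENGTH, 50)

# Speaking range quoted above: 200 to 250 syllables per minute.
speaking_slow = seconds_to_speak(ANSWER_LENGTH, 200)
speaking_fast = seconds_to_speak(ANSWER_LENGTH, 250)

print(f"Typing:   {typing_fast:.1f} to {typing_slow:.1f} minutes")
print(f"Speaking: {speaking_fast:.0f} to {speaking_slow:.0f} seconds")
# Typing:   4.0 to 6.7 minutes
# Speaking: 48 to 60 seconds
```

On these assumptions, the same answer costs the customer four to seven minutes of typing or about a minute of talking.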

Voice flips the cost. The customer holds a button and talks. Mistral's Voxtral model transcribes the recording into the original-language script. The team reads the transcript or, if the team works in English, reads an AI summary translated into English. The customer's effort is talking for 30 seconds. The team's effort is reading a transcript, the same reading they would have done with a typed answer.

Voice is already a default messaging mode in East Asia

WeChat introduced hold-to-record voice messages early in its life and they have stayed a default form of communication in mainland China since. Older users in particular use voice over text. Group chats are full of green audio bubbles, not paragraphs.

KakaoTalk in Korea and LINE in Japan and Taiwan show the same pattern. Voice messages are not a niche feature. They are the way a large portion of the user base prefers to talk when typing would be slow or awkward. The behavior is in place. A feedback tool that asks for voice works with the existing habit, not against it.

HeySpeak does not need to convince an Asian customer to learn a new mode of input. It needs to give them a familiar one. The receiver page is one URL, one record button, one submit. The flow looks like the inside of an app they already use every day.

Where this changes the math

Four cases where a non-Latin audience makes voice the obvious format and text forms a quiet failure mode.

Western e-commerce expanding into APAC

Your shop launches a Mandarin or Japanese storefront. The post-purchase email asks for a review and links to a form. Almost nobody fills it in. A QR code or link to a HeySpeak voice prompt, in the customer's language, gets answers because it does not require typing in their script.

Hotels and tourism in Asia

A hotel in Tokyo or Seoul wants honest end-of-stay feedback from local guests. The standard email survey gets ignored. A QR card at checkout, with one Japanese or Korean question, gets a 30-second voice note. The team reads a short summary in English the same evening.

Language learning and expat services

A language school or relocation service has customers across multiple writing systems. A single feedback prompt in the customer's native script removes the typing tax. You collect more answers, in more languages, with one tool.

Asian businesses talking to local customers

This is not only about Western teams reaching into Asia. A restaurant in Shanghai, a SaaS team in Seoul, or a clinic in Riyadh has the same problem with their own customers. Voice is the lower-friction option in any non-Latin script, regardless of whose product is asking the question.

Common questions

Does HeySpeak transcribe Chinese, Japanese, and Korean?
Yes. HeySpeak uses Mistral Voxtral for transcription, which supports major non-Latin languages including Mandarin, Japanese, Korean, Arabic, and Hindi. Customers record in their own language. The transcript appears in the same language in your dashboard, with an AI summary you can request in English. We are continuing to expand language coverage.
What about Arabic, Hebrew, Thai, or Hindi?
Voxtral handles Arabic and Hindi at production quality. Hebrew and Thai are supported in current Voxtral coverage. For any of these scripts, the typing argument is the same as for CJK: the input method adds friction that voice removes entirely. If you have a specific language in mind, send us a sample and we will tell you honestly how it transcribes.
Will my Asian customers actually use a Western tool?
The receiver page is a single mobile web URL with one record button. There is no signup, no app install, no Western login wall. Customers tap the link, hold to record, and submit. WeChat, KakaoTalk, and LINE have already trained the muscle memory. The flow is the part of HeySpeak that feels familiar, not the brand.
Can the question be in Mandarin and the dashboard summary in English?
Yes. You write the prompt in whatever language your audience speaks. The customer answers in that language. HeySpeak stores the original-language transcript and produces an AI summary in the language you choose, so an English-speaking team can read a one-paragraph summary of a Mandarin response without losing the original.
Is voice messaging really preferred in Asia?
Voice messages are a default mode of communication on WeChat in mainland China, on KakaoTalk in Korea, and on LINE in Japan and Taiwan. Older users in particular send voice notes more often than text. The cultural and ergonomic pattern is already in place. HeySpeak fits into a habit your audience already has.
How does transcription handle dialects like Cantonese or regional Japanese?
Voxtral is trained primarily on standard Mandarin and standard Japanese. Cantonese and strong regional dialects may be transcribed with lower accuracy than the standard form. The audio is always preserved, so you can listen to the original recording even when the transcript is imperfect. We are continuing to expand language coverage.

Stop asking your customers to type in a script that fights them.

Five free responses to start. Works in Mandarin, Japanese, Korean, Arabic, Hindi, and more. No credit card.

Create a voice prompt