How can you tell whether the East-Asian-looking faces you see on the street are Korean, Chinese, or Japanese if you don’t know their languages at all? The trick is to listen out for their exclamations. On hearing something surprising, Koreans are most likely to go uhhhh? [ʌ][1], the Chinese ahhhh? [ɐ], and the Japanese ehhhh? [ɛ], all with a rising intonation. (If you are not familiar with IPA symbols, you can listen to the sounds here.)

These observations led me to this informal pilot study on the cross-linguistic use of vowels in the exclamation of surprise. The chart below shows some of the questions I had in mind.

There were at least five different types of surprise I could identify, as summarised in the figure below: Disgust (and surprise), Pain (and surprise, and so on), Terror, Disbelief, and Pure Surprise. I gave my informants a couple of specific scenarios, to make sure we were all on the same page about what we meant by each type of surprise. I was only interested in vowel quality, so I disregarded the following variables, though they are potentially very interesting in their own right: consonants that occur with the vowels; other aspects of pronunciation such as intonation, duration, nasality and voice quality; words with a specific meaning, such as whaaaat?, Reeeeally? and Nooooo!; and all kinds of swear words.

I collected data from a total of 21 informants representing 12 languages: three for Korean, four for Chinese (two from the North, two from the South), two for Japanese, one for Cantonese, one for Vietnamese, two for English (one British, one American), two for French, one for German, one for Swedish, one for Greek, one for Bulgarian, and two for Polish. I made phonetic transcriptions as the informants demonstrated the sounds to me in person or in audio recordings. When this was difficult, I relied on self-reports. Most self-reports were from people who had received training in phonetics, and I often asked follow-up questions to make sure the phonetic symbols chosen were appropriate. Finally, I observed each language directly myself at least once, except Swedish and Greek. Putting aside the many methodological limitations that remain, here’s what I found.
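Before the results, a quick sanity check on the sample. The sketch below simply tallies the informant counts listed above to confirm that they add up to 21 informants and 12 languages. The names `INFORMANTS` and `sample_summary` are made up for this illustration.

```python
# Informant counts per language, copied from the paragraph above.
INFORMANTS = {
    "Korean": 3,
    "Chinese": 4,      # two from the North, two from the South
    "Japanese": 2,
    "Cantonese": 1,
    "Vietnamese": 1,
    "English": 2,      # one British, one American
    "French": 2,
    "German": 1,
    "Swedish": 1,
    "Greek": 1,
    "Bulgarian": 1,
    "Polish": 2,
}


def sample_summary(counts):
    """Return (number of languages, total number of informants)."""
    return len(counts), sum(counts.values())


if __name__ == "__main__":
    languages, informants = sample_summary(INFORMANTS)
    print(f"{informants} informants across {languages} languages")
```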

If you want to know which languages use which vowels, I’ve summarised that in the table at the end of the article. But for now, look at the diagrams below for the aggregated results from all 12 languages for each type of surprise. (If you don’t know how to read a vowel quadrilateral[2], see Endnote 2.) A small circle on a vowel quadrilateral represents a response containing a rounded vowel, and a small square represents a response with an unrounded vowel. I must say that the exact positions of the dots are rough, impressionistic representations, as I haven’t made any acoustic or articulatory measurements. The arrows indicate vowels consisting of more than one quality (which phoneticians call diphthongs, triphthongs, etc.), such as [aʊ] in house. Blue arrows show diphthongs and triphthongs which start with an unrounded vowel and end with a rounded vowel. The vowel in house would thus be marked with a blue arrow going from [a] to [ʊ], as in the diagram for ‘pain’. Yellow is for diphthongs and triphthongs going from an unrounded vowel to an unrounded vowel; white is for those going from a rounded to an unrounded vowel; finally, purple is for those going from a rounded to a rounded vowel. Multiple answers from the informants are all shown on the diagram, but when there were more than three identical answers, they were marked only three times, for simplicity. The big square dot in ‘terror’ indicates that most answers were either [a] or [ɐ̞].
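For readers who find a legend easier to follow as a lookup table, here is a minimal sketch of the marking scheme just described. It is not how the diagrams were actually drawn (those were placed by hand). The rounding values in `ROUNDED` are my own encoding of the standard IPA chart for a handful of symbols, and the function names are hypothetical. Diacritics are simply ignored, and the glides [j] and [w] are treated like [i] and [u].

```python
from collections import Counter

# Lip rounding for a handful of IPA vowels (True = rounded).
# Assumption: the glides [j] and [w] are treated like [i] and [u].
ROUNDED = {
    "i": False, "ɪ": False, "e": False, "ɛ": False, "a": False,
    "ɨ": False, "ə": False, "ɜ": False, "ɐ": False,
    "ɯ": False, "ɤ": False, "ʌ": False, "ɑ": False, "j": False,
    "y": True, "ʉ": True, "ɵ": True, "ɞ": True,
    "u": True, "ʊ": True, "o": True, "ɔ": True, "w": True,
}

# (starts rounded, ends rounded) -> arrow colour, following the legend.
ARROW_COLOUR = {
    (False, True): "blue",
    (False, False): "yellow",
    (True, False): "white",
    (True, True): "purple",
}

MAX_MARKS = 3  # identical answers are drawn at most three times


def vowel_segments(response):
    """Keep only the vowel and glide symbols; diacritics and consonants are dropped."""
    return [ch for ch in response if ch in ROUNDED]


def marker(response):
    """Return 'circle', 'square', or a coloured arrow for one response."""
    segments = vowel_segments(response)
    if not segments:
        raise ValueError(f"no known vowel in {response!r}")
    first, last = ROUNDED[segments[0]], ROUNDED[segments[-1]]
    if len(segments) == 1:
        return "circle" if first else "square"
    return ARROW_COLOUR[(first, last)] + " arrow"


def marks_to_draw(responses):
    """Count identical responses, capping each count at MAX_MARKS."""
    return {r: min(n, MAX_MARKS) for r, n in Counter(responses).items()}
```

With this encoding, `marker("aʊ")` gives `"blue arrow"`, matching the house example, and five identical [a] answers would be drawn three times.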

So what do these results tell us? We can see that pretty much all kinds of vowels can be used in exclamations of surprise. The informants were also happy to make sounds which were not part of their vowel inventory. For example, both English and French speakers reported uttering the close back vowel [ɯ] when expressing disgust, although it isn’t an existing vowel (phoneme) in their languages.

However, the degree of lexicalisation of an exclamation forms a continuum, and there seems to be a correlation between the degree of lexicalisation and how “typical” the vowel is in the language. For example, my Japanese informants strongly preferred shouting the fully meaningful [atsʉʔ] ‘it’s hot’ and [itaʔ] ‘it hurts!’ for ‘pain’, and my German informant reported uttering [vas], meaning ‘what?’, in the scenarios I provided for ‘disbelief’. These had to be excluded from the answers for the reasons stated above. Even after such exclusions, however, some responses, such as ew and ouch in English and usch /ɵʂ/ ‘yuk’ in Swedish, are more lexicalised and conscious than others, such as [əɜ] (as if to vomit) and [aʊ] in English and [ɨə] in Swedish, which are felt to be more natural and spontaneous. The more lexicalised the exclamation, the more likely the vowel is to come from an existing sound category in the language.

While the answers do vary across languages and individuals, we can certainly observe general cross-linguistic tendencies to prefer certain vowels for certain types of surprise. To what do we attribute these general tendencies? Partly to the fact that some of the languages are historically related, and partly to the fact that they often borrow from each other, e.g. German autsch, English ouch, and Polish auć.

However, given the range of languages covered, many of which are historically unrelated and don’t exchange direct linguistic influences, the explanation above seems insufficient. I speculate that such widespread tendencies are to do with typical facial expressions that are natural in different situations in which exclamations occur.

For ‘disgust’, close vowels such as [ɯ, ɨ, u, i] are common. Most of the diphthongs also start from close vowels. This may be related to the facial expression of frowning and closing and tensing your jaw when disgusted. The exception of [a], as in [bʎaks] or [bʎaχ] from Greek, still shows the expected initial high position of the tongue caused by the [ʎ] (which is like a retracted [l]). The diphthongs [ɛə] from Cantonese and [əɜ] from American English clearly relate to the act of vomiting. Admittedly, there was one truly baffling answer from Vietnamese, a simple [a] with a rising intonation, for which I don’t have a good explanation.

For ‘pain’ and ‘terror’, open front vowels are clearly most popular. For ‘pain’, non-open back vowels are not rare, but for ‘terror’, all the vowels end with an open and non-back vowel. For ‘pain’, diphthongs are more likely to be closing ones, that is, moving from a more open to a less open quality, except for triphthongs such as [aja], for which the initial portion is a closing diphthong. (I didn’t distinguish between glides [j, w] and vowels [i, u] in this study for practical reasons.) On the other hand, the two reported cases of diphthongs for ‘terror’ are opening diphthongs.
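The closing/opening distinction can be made concrete with a small sketch. The numeric scale below is my own coarse encoding of the seven height levels of the IPA chart, not a measurement, and the names `OPENNESS`, `classify` and `classify_sequence` are hypothetical. As in the study, the glides [j] and [w] are scored like [i] and [u], and diacritics are ignored.

```python
# Vowel height on the IPA chart, from 0 (close) to 6 (open).
# Assumption: an illustrative scale only; glides count as [i] and [u].
OPENNESS = {
    "i": 0, "y": 0, "ɨ": 0, "ʉ": 0, "ɯ": 0, "u": 0, "j": 0, "w": 0,
    "ɪ": 1, "ʊ": 1,
    "e": 2, "ɵ": 2, "ɤ": 2, "o": 2,
    "ə": 3,
    "ɛ": 4, "ɜ": 4, "ɞ": 4, "ʌ": 4, "ɔ": 4,
    "ɐ": 5,
    "a": 6, "ɑ": 6,
}


def classify(first, second):
    """Label the movement from one vowel quality to the next."""
    start, end = OPENNESS[first], OPENNESS[second]
    if end < start:
        return "closing"   # more open -> less open
    if end > start:
        return "opening"   # less open -> more open
    return "level"         # same height, e.g. [iu]


def classify_sequence(response):
    """Classify each step of a diphthong or triphthong, left to right."""
    segments = [ch for ch in response if ch in OPENNESS]
    return [classify(a, b) for a, b in zip(segments, segments[1:])]
```

Here `classify_sequence("aʊ")` returns `["closing"]`, while `classify_sequence("aja")` returns `["closing", "opening"]`, which is the sense in which the initial portion of that triphthong is a closing diphthong.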

The responses for ‘pure surprise’ were similar to those for ‘pain’, although the vowel quality was slightly less peripheral in general and triphthongs were absent for ‘pure surprise’. The difference in the direction of the diphthongs may be explained by the nature of making a noise in a particular context. For ‘pain’ and ‘pure surprise’, speakers are most likely to utter sounds spontaneously and stop quickly. In other words, you start making a sound with your mouth open and are likely to close it shortly afterwards, creating either a short monophthong or a closing diphthong. On the other hand, realistically speaking, the most natural and spontaneous response in terror-inducing situations would just be a sharp inhale. If someone does make a noise, that would imply an intention to be heard and to receive help. Given this motivation, you would most likely open your mouth as wide as possible to be loud. In support of this speculation, some informants noted that triphthongs such as [aʊa] in German and [ajɛ] in Swedish for ‘pain’ communicate more annoyance than their diphthong counterparts, [aʊ] and [aj]. In such cases, the assumption is that the addition of the final open vowel quality prolongs the exclamation and makes it louder, so that it is noticed by the person causing the pain and serves as a complaint.

The results from ‘disbelief’ have more central vowels and fewer diphthongs compared to the other types, and completely lack close and close-mid vowels. Your jaw drops in surprise and disbelief, but your mouth is not as open as when you are in pain or in immediate danger. However, informants reported that the higher the degree of ‘surprise’, the more open the vowel they would use. In Korean, for example, a mildly unbelievable story could plausibly get responses such as [ɪŋ], [ɯŋ] and [ɯɪŋ], but a story as unbelievable as “My puppy became a human!” would probably yield an [ʌ] response, with a much longer duration, louder volume, and greater rise in pitch. And this trend is probably universal.

A slightly worrying question comes to mind: is this topic even linguistic at all? After all, my explanations for the observed cross-linguistic similarities are based on seemingly extra-linguistic factors, such as the typical facial expressions associated with the situations, the need to be loud and heard, and the need to shut up quickly after a spontaneous scream. If “linguistic” means “arbitrarily coded”, then a given exclamation would be only as linguistic as it is lexicalised. My explanation for the cross-linguistic use of vowels in exclamations is similar in nature to Ohala’s (1983) evolutionary explanation for the cross-linguistic use of pitch. Small animals naturally have small larynxes and thus produce high-pitched sounds. By analogy, high pitch has come to indicate submissiveness, deference and uncertainty. This leads to various universal patterns in the use of pitch, such as high pitch at the end of an utterance to indicate a question.

In our case, natural facial expressions lead to a certain range of jaw opening and tongue position, which in turn sets a range of quality of the vowel. Over time, these sounds get analysed in relation to existing vowel sounds of a language, and eventually get fossilised to communicate different emotions of surprise in a language. This is my theory to explain the high degree of overlap among wide-ranging languages of a preferred vowel space for different exclamations of surprise!


<Cross-linguistic use of vowels in the exclamation of surprise>

* Disclaimer: The answers are based on 1-4 informants per language and are not meant to be an exhaustive list of possible vowels.

Types of surprise

Disgust Pain Terror Disbelief Pure surprise





[ʌ], [ɛ]

[ʌ], [a]

Chinese  [4]

[ə~ɤ(ᵊ)], [ɨ], [ʉɐ],     [ji], [je̞]



[ɜ], [ɐ]

[a], [a(i)ja]


[ʉɛ], [ʉwɐ], [ja] as in [kjaʔ][5] or [gjaʔ], [ɛ] as in [gɛ]

[wa], [ja]

[ja], [wa]


[wa], [a]


[ji], [ɛə]

[ɐ̞], [o̞u]











[ɯ], [iʉ], [ɤ̟] as in yuk [jɤ̟k][6], [iu] as in piu for smell, [əɜ]

[a], [aʊ]



[ɜ~a], [əʊ]


[ɯ], [ ɤ ], [ʊ], [ɞ] as in beurk [bɞk]






[i] as in igitt [igɪt]




[o], [a]


[ɵ] as in usch [ɵʂ], [ɨə]

[aj], [a],[ajɛ]


[a] as in [va]


[u] as in puf, [ɐ] as in [bʎaks] or [bʎaχ]

[ɐ], [o],[ɐu]


[ɛ], [ɐ] as in [bɐ]



[ɯə], [u]

[ɔu],[au], [ɔ]




Polish [uj] as in [fuj], [ɛ] as in [blɛ] or [wɛ] [awa], [aʊ] [a] [ɔ]




Ohala, J.J., 1983. Cross-language use of pitch: an ethological view, Phonetica 40, 1-18.



[1] I used audio illustrations by Peter Ladefoged as a reference to exact sound qualities denoted by the International Phonetic Alphabet.

[2] The trapeziums you see in the diagrams are called ‘vowel quadrilaterals’. To understand the concept, you can think of one as representing the space in your mouth where your tongue can be placed. A dot on a vowel quadrilateral marks the highest point of the tongue when articulating a vowel. The corners of the trapezium thus indicate the extreme, ‘cardinal’ vowels. When your tongue is brought as far forward and as high as possible (or ‘close’, referring to the position of the jaw), you get an [i]. If you round your lips while keeping your tongue in the same position, you get a [y]. Sliding your tongue back until you can’t any more, you get a [u]. Unround your lips, and you will get an [ɯ]. Opening your jaw from there as much as you can, you get an [ɑ], as in half in British English, and so on. The very centre of the trapezium marks the most neutral vowel, called a ‘schwa’ [ə], which is the vowel most English speakers have for the first vowel in the word phonetics.

[3] As mentioned above, I did not record consonants which may occur with the vowels. For this particular vowel, the actual exclamation may be [a], [at], [ak], or [aʔ]. There are definitely interesting patterns to be looked into in the use of consonants, and of nasality, too. For ‘pain’, for instance, vowels were often followed by velar, uvular or glottal consonants such as [k], [χ], and [ʔ], and any consonants preceding the vowels were likely to be pronounced at the front of the mouth, such as labials and labiodentals. For ‘disbelief’, [h] was the most common onset and the vowels were often nasalised or followed by a nasal consonant.

[4] The responses from the Chinese informants from different dialectal areas were quite compatible with one another, as were those from the two English speakers, so I put each group in a single category.

[5] I give full syllables for relatively unusual answers or when deemed helpful for the reader’s imagination.

[6] This response is from a British speaker of Standard Northern English.