An Experimental Verification of Some Cross-linguistic Sound Symbolisms

A recent investigation of 6452 languages (Blasi et al., 2016) uncovered a number of cross-linguistic correspondences between speech sounds and meaning. For example, the phone [z] was associated with the meaning ‘star.’ In the present study, 16 of these sound symbolisms were tested by presenting English and Spanish speakers with pairs of nonce words along with a definition of the words. Their task was to choose the word that sounded best with the meaning given. One member of each pair contained phones found to be associated with the meaning of the word while the other did not. For instance, participants were asked to choose between [zolz] and [folf] as the word they felt was most likely to mean ‘star.’ Seven of the sound and meaning correspondences observed in the study by Blasi et al. (2016) were corroborated by both Spanish and English speakers. Three additional sound correspondences were significant in only one of the experimental languages.


Introduction
One of the tenets of structuralist linguistics is that the relationship between sound and meaning is arbitrary (Saussure, 1916/1972). Sound symbolisms, or the pairing of language sounds and semantics, are common exceptions to this idea. Sound symbolisms may be ideophones such as bling bling that evoke sensory events (Akita & Pardeshi, 2019; Dingemanse, 2012), onomatopoeia in which the sound of the referent is mimicked (e.g. gurgle and puff), and phonological iconicity (Dingemanse, Blasi, Lupyan, Christiansen, & Monaghan, 2015; Schmidtke, Conrad & Jacobs, 2014) such as reduplication of an adjective (e.g. sooooo long) where a longer form maps onto an intensification of its meaning. When the same language sounds appear in a variety of words with similar meaning, this can create phonaesthemes (Bergen, 2004; Firth, 1930). For example, words beginning with /fl/ carry an association of light and airy in English due to the existence of words such as fly, fluff, float, and flutter.
The great majority of studies on sound symbolism focus on how a limited number of sounds may have symbolic representations in only one or a small number of languages (e.g. Newman, 1933; Sapir, 1929; Taylor & Taylor, 1965). However, in a recent study (Blasi, Wichmann, Hammarström, Stadler, & Christiansen, 2016) researchers mined basic vocabulary word lists from 6452 languages in search of sound symbolisms. They searched for correspondences between meaning and what they call symbols, which consist of single phones or closely related phones. For example, one symbol is for a nasal dental, another includes mid front vowels both round and unround, while another includes voiceless bilabial stops and fricatives. Using their methodology, they discovered 74 sound-meaning correspondences. Some symbols involve positive associations, in which the presence of the symbol is related to the word meaning; one example is the appearance of nasals in words meaning 'nose.' Others involve negative associations, in which words with a particular meaning are highly unlikely to contain the symbol in question. For example, [j] is negatively associated with 'bone' since it rarely occurs in words for 'bone.' The existence of such associations may be due to a number of factors such as extensive borrowing among languages or the existence of a protolanguage that related words may derive from. However, Blasi et al. (2016: 10821) argue against these explanations and conclude instead that "the signals are due to factors common to our species, such as sound symbolism, iconicity, communicative pressures, or synesthesia" and they call for further research.
Of course, this topic has been of interest to linguists and psycholinguists since Brown, Black, & Horowitz (1955) suggested that there may be a universal sound-symbolic substrate underlying all languages, and the existence of certain sound symbolisms has been demonstrated cross-linguistically in a number of recent studies (Joo, 2020; Mompeán, Fregier, & Valenzuela, 2020). In scientific endeavors, evidence resulting from a particular methodology is always valuable. However, when a theory is supported by evidence that has been gathered by many different means, the theory gains more credibility. When multiple methods produce similar outcomes, that fact eliminates the possibility that the results are due to, or unduly influenced by, the particular experimental paradigm used to test the theory. Using a data mining technique, Blasi et al. (2016) uncovered a number of phonetic traits that they assert are due to sound symbolism rather than common ancestry or borrowing. If these symbolisms exist, they should be relevant to the mental processing of speakers regardless of language. The purpose of the present paper, then, is to test some of their signals experimentally. The test languages used are English and Spanish.

Method
Participants were recruited on Mechanical Turk and paid $1 for completing the questionnaire. After completing the consent form and providing biographical information, the participants were asked to go to a quiet room and put on headphones or earbuds before proceeding. The English speakers saw the following instructions, and the Spanish speakers saw the equivalent instructions in Spanish: "You will hear two different words in a foreign language. You will also see a picture and the definition of the word. Your job is to guess which of the spoken words sounds most like the word it defines. Since you don't know the words you'll need to rely on your gut feeling or instinct to make the choice. Click on the triangle and listen to the two words. Choose the word that sounds most like it means 'test item'. Please listen to the entire recording before answering." In addition to the test items, three attention-check questions were included. At the end of each of these three recordings the participants were asked to leave the question blank. Participants who answered these questions were deemed not to be following the instructions or paying sufficient attention, so their answers were considered invalid and they were eliminated from the study. The order in which the questions were presented, as well as the order in which the two possible responses appeared, was randomized. Most participants were able to complete the study in 8-10 minutes.
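The exclusion rule and the two levels of randomization described above can be sketched as follows. This is a minimal illustration, not the survey software actually used: the function names, the question identifiers, and the representation of a blank answer as `None` are all assumptions made for the example. The three word pairs are taken from the text.

```python
import random

def passed_attention_checks(responses, attention_ids):
    """True only if every attention-check question was left blank.

    `responses` maps question ids to answers; a blank answer is assumed
    to be stored as None (or to be missing from the mapping altogether).
    """
    return all(responses.get(q) is None for q in attention_ids)

def randomized_trials(items, rng):
    """Shuffle the two options within each item, then shuffle the items."""
    trials = []
    for meaning, predicted, other in items:
        options = [predicted, other]
        rng.shuffle(options)   # order of the two possible responses
        trials.append((meaning, options))
    rng.shuffle(trials)        # order of the questions
    return trials

# Word pairs mentioned in the text: (meaning, predicted word, other word).
ITEMS = [
    ("star", "zolz", "folf"),
    ("dog", "sásiɬ", "tátiɬ"),
    ("fish", "aɬáʔ", "eɬéʔ"),
]

# Hypothetical ids for the three attention-check questions.
ATTENTION_IDS = ["check1", "check2", "check3"]

trials = randomized_trials(ITEMS, random.Random(0))
```

A participant whose response record contains any answer to `check1`, `check2`, or `check3` would be dropped before analysis.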

Test Items
Eighteen nonce words were devised to test some of the sound symbolisms found by Blasi et al. (2016). Items 1-10 contain phones that were positively associated with the meaning of the word (Table 1). Therefore, the predicted word contains the associated phone and the other test option does not. In items 11-14 the predicted test item contains phones associated with the meaning while the other test option contains phones that were found to be negatively correlated with the meaning. The predicted words in items 15-17 do not contain phones negatively associated with the meaning while the other option does contain phones negatively associated with the meaning. In item 18 the predicted word contains only one sound negatively associated while the other option contains two.
When designing test items, care was taken to ensure that they had no phones in common with the corresponding words in either Spanish or English. These stringent requirements are responsible for the unequal number of test items that have positive or other kinds of associations. When the test was prepared for Spanish speakers this requirement necessitated removal of two test items that shared phones with the , 'fish.' In order to make the nonce words seem plausibly foreign, they also contained a number of phones that are not found in English or Spanish, such as lateral fricatives, geminate stops, and ejectives, as well as consonant clusters that are not attested in either language (e.g. [-tm, ʃw-]). The recordings were made at 44,100 Hz using Audacity and a Sades SA902 headset microphone in the researcher's own voice.
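The design constraint above (no phones shared between a nonce word and the real English or Spanish word for the same meaning) amounts to a set-disjointness check. The sketch below is only an illustration: the function name is hypothetical, and the broad phone-by-phone transcriptions of the real words are rough assumptions supplied for the example, not transcriptions taken from the study.

```python
def shared_phones(nonce_word, real_words):
    """Return the set of phones a nonce word shares with any real word."""
    real_inventory = set()
    for word in real_words:
        real_inventory.update(word)
    return set(nonce_word) & real_inventory

# Rough broad transcriptions (assumptions made for illustration only).
STAR_EN = ["s", "t", "ɑ", "ɹ"]                 # English "star"
STAR_ES = ["e", "s", "t", "ɾ", "e", "ʝ", "a"]  # Spanish "estrella"
NOSE_EN = ["n", "oʊ", "z"]                     # English "nose"
NOSE_ES = ["n", "a", "ɾ", "i", "s"]            # Spanish "nariz"

# The two nonce words for 'star' given in the text.
ZOLZ = ["z", "o", "l", "z"]
FOLF = ["f", "o", "l", "f"]

# A made-up nonce word containing [n], to show a failing case.
HYPOTHETICAL_NOSE_ITEM = ["n", "u", "l"]

star_overlap = shared_phones(ZOLZ, [STAR_EN, STAR_ES])
nose_overlap = shared_phones(HYPOTHETICAL_NOSE_ITEM, [NOSE_EN, NOSE_ES])
```

Under these assumed transcriptions, both 'star' items pass the check, while any item built around [n] for 'nose' would be rejected because [n] occurs in the real word in both languages.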

Participants
There were two groups of participants: native English speakers and native Spanish speakers. A total of 164 English speakers took the questionnaire consisting of 67 women and 98 men with a mean age of 37.4. All but two participants were from the US; the others were from the UK and Canada. The participants' education level was quite high. Half held a Bachelor's degree, 31.7% had some college, 8.5% held a graduate degree, and the remaining 9.8% were high school graduates. Most of the participants were White (78%), while 9.1% were Black, 4.3% Asian, and 3.7% Hispanic. The remainder indicated that they were multiethnic.
Of the 120 Spanish speakers, 38 were women and 82 men. Their mean age was 31.4. A college degree was held by 43.8%, while 30% had some college education, and 17.5% held graduate degrees. The remaining 6.7% had a high school degree. The participants were from the following countries: Spain (65), Venezuela (28), Mexico (9), Argentina (5), Colombia (5), Dominican Republic (2), and one participant each from Costa Rica, El Salvador, USA, Guatemala, Peru, and Uruguay. Equal numbers (47.5%) indicated Hispanic and White as their ethnic identity while 5% considered themselves multiethnic.

Results
A chi-square test was calculated for each word pair on the number of responses to the predicted word versus the other word (see Table 2). Separate analyses were done for the English and Spanish speakers. Among the English-speaking participants, the test item 'star,' for example, garnered 146 responses to the predicted response [zolz] and 18 to the other response [folf]. In order to better visualize the results, the raw percentage of responses to the predicted word was converted into how much the predicted response deviated from the chance rate of 50%. For example, 89% of the responses to 'star' were [zolz]; therefore, these responses deviate from 50% by 39 percentage points (Figure 1). Black bars indicate responses that were in the predicted direction and reached a significance level of .05 or smaller. Vertically striped bars mark test items that are not significant at p < .05 but may be considered marginal since the p value falls between .051 and .1. Gray bars mark items that do not reach significance, and horizontally striped bars indicate items that are statistically significant, but in the opposite direction of what was predicted.
ilr.ideasspread.org International Linguistics Research Vol. 4, No. 4
Figure 2. How much each predicted test word deviates from the chance 50% level for Spanish speakers
The purpose of the experiment was to test the degree to which certain phones are related to certain meanings. In Figures 1 and 2 it can be seen that the majority of responses fall above the 50% line, which indicates a general trend in the predicted direction. This is more so in the case of the responses provided by the Spanish-speaking participants. It is important to reiterate that Blasi et al. (2016) found that some phones were positively associated and others were negatively associated with the word meanings. For example, /k/ is common in the word for 'bone' in many languages, while /j/, on the other hand, was rare in words for 'bone.'
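As a worked illustration of the analysis described in this section, the sketch below computes a chi-square statistic against an even 50/50 split, the deviation from chance, and the bar category used in the figures for the English 'star' counts (146 vs. 18). It assumes a one-degree-of-freedom goodness-of-fit test without continuity correction; the text does not state the exact variant used, so the resulting values are illustrative rather than a reproduction of Table 2. The function names are hypothetical.

```python
import math

def chi_square_vs_chance(predicted, other):
    """Goodness-of-fit chi-square (df = 1) against an even 50/50 split."""
    expected = (predicted + other) / 2
    chi2 = ((predicted - expected) ** 2 + (other - expected) ** 2) / expected
    # With one degree of freedom the upper-tail probability has a closed
    # form: P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

def deviation_from_chance(predicted, other):
    """Percentage points by which the predicted response exceeds 50%."""
    return 100 * predicted / (predicted + other) - 50

def bar_category(predicted, other):
    """Classify an item the way the bars in the figures are described."""
    _, p_value = chi_square_vs_chance(predicted, other)
    if p_value <= 0.05:
        if predicted > other:
            return "significant, predicted direction"
        return "significant, opposite direction"
    if p_value <= 0.1:
        return "marginal"
    return "not significant"

# English responses for 'star': 146 for [zolz], 18 for [folf].
chi2, p_value = chi_square_vs_chance(146, 18)
deviation = deviation_from_chance(146, 18)
category = bar_category(146, 18)
```

For these counts the expected frequency is 82 per cell, so the statistic is 2 × 64² / 82, and the deviation is 146/164 × 100 − 50, which rounds to the 39 points reported above.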
It should be noted that in two instances English speakers provide counter-evidence for sound-meaning correspondences by strongly preferring responses that go in the opposite direction of what was predicted. For 'dog' they preferred the word with /t/ [tátiɬ] over the word containing predicted /s/ [sásiɬ]. In like manner, they preferred 'fish' with an /e/ [eɬéʔ] over [aɬáʔ] with predicted /a/. In any event, Table 3 summarizes the ten sets of sound-meaning correspondences from the experiment that corroborate the findings of Blasi et al. (2016). Seven of them are corroborated in both Spanish and English, while three are language-specific. Table 4, in contrast, lists the six sets that were not supported by the present study.

Discussion
The results of the experiment showed a general trend in the direction predicted by the findings of Blasi et al. (2016). However, only ten of the sound-meaning correspondences were significant while no evidence for six others was observed. There may be a number of reasons for this. First, the correlations between a particular sound and meaning most likely vary in their strength. Second, although nonce words were chosen that had no phones in common with the lexical item in Spanish or English, this does not eliminate the possibility that other words with related meanings may have influenced the participants' choices by analogy. For example, Blasi et al. (2016) found that [s] was positively associated with 'dog' and [t] negatively associated with that meaning. Spanish speakers preferred the nonce word containing [s]. English speakers, in contrast, strongly preferred the nonce item with [t]. Is it possible that this is due to Toto, the name of the dog from The Wizard of Oz, or an association with 'tail'? The existence of this kind of association is difficult to ascertain, but such language-specific correlations may be controlled for in future studies by carrying out the experiment with speakers from a wide variety of languages.
Some theories of language posit strict divisions between different modules of language, which predicts that elements belonging to one module should not affect those of a different one. In the case of sound symbolisms, there should be no influence of sound on meaning. However, the present study gives evidence for the existence of sound symbolisms that undermine the notion of strict separation of modules. There are, of course, language-specific sound symbolisms. For example, English words beginning with [gl] are associated with words related to light, but this is due to the existence of English words such as glisten, gleam, glitter and glow. However, the study by Blasi et al. (2016) found cross-linguistic sound symbolisms, which supports Brown, Black, and Horowitz's (1955) idea that some sound symbolisms may be universal. Numerous other studies that evince sound symbolism cast doubt on the idea that there is a strict separation between the phonological and semantic modules of language. The present study demonstrates the existence of sound symbolisms in both Spanish and English.

Conclusion
By mining basic vocabulary from over 6400 languages, Blasi et al. (2016) uncovered a number of correspondences between sound and meaning. They argue that the existence of such correspondences is not due to borrowing or genetic relationship among languages, but to relationships people make between sound and meaning. The present study tested this notion by asking participants to decide which of two nonce words they perceived to be most closely related to a particular meaning. One of the nonce words contained phones found to be related to certain meanings and the other did not. Since the nonce words had no phones in common with the Spanish or English words with the same meaning, the possible influence of borrowing or common ancestry of the word was greatly reduced. In this way, the participants' preferences are arguably based on sound symbolism.
Of course, the present study only supports 10 of the 16 sound symbolisms tested, and only speakers of two languages were tested. In order to solidify arguments that sound symbolisms cut across languages, experimental evidence from a large number of languages from different language families needs to be gathered. Another motivation for extending this line of research into other languages has to do with the associations that may be tested. For instance, the relationship between [n] and 'nose' was observed by Blasi et al. (2016), but since that phone exists in both English nose and Spanish nariz this sound symbolism cannot be tested in these languages. In the present study, only a portion of the sound-meaning correspondences could be examined. Therefore, further experimentation is also necessary to test a larger number of the specific sound symbolisms uncovered by Blasi et al. (2016). While they uncovered significant correspondences between sounds and meaning in a wide variety of languages, the question that needs to be answered is whether those relationships influence speakers' linguistic cognition. The present study is an attempt to answer that question.