A (belated) happy mother language day! If you missed it yesterday, you can catch up on the what and the why here.

One family of languages that could never be counted as mother tongues is programming languages. Yet various US states are considering allowing coding classes in schools to count alongside Spanish, Chinese or Italian lessons towards foreign language learning requirements. Last week, as a bill along these lines was being debated in Florida, the popular linguistics writer Gretchen McCulloch was asked how natural languages differ from programming languages (and so why this is a bad idea).

Here, with quite a bit of help from my software engineer husband, I consider some more differences, as well as similarities, between programming languages and natural languages.

1. First up, syntactic ambiguity. As Gretchen McCulloch mentioned, natural languages like English are often syntactically ambiguous. What do we mean by this? Take the following examples:

  • A boy climbed every tree.
    > There was a boy and that boy climbed every tree (i.e., one boy did lots of climbing).
    > For every tree, there was a boy that climbed it (but not necessarily the same one).
  • I’m not going to give a talk in London on Thursday.
    … I’m going to attend a talk
    … I’m going to give a talk in Brighton
    … I’m going to give a talk in London on Friday
  • The girl saw a man with a telescope.
    > The girl used a telescope to see a man.
    > The girl saw a man who was holding a telescope.

That is, there is more than one possible mapping from the surface form to the meaning of the utterance. Now, in natural languages, the context, as well as prosodic cues like stress in speech, allows us to disambiguate the intended meaning fairly easily. In contrast, as Gretchen McCulloch says, “formal languages don’t want you to do that.” Indeed, most programming languages are defined so that every well-formed piece of syntax maps to exactly one meaning. So, more properly, they don’t allow you to do that. However, most programming languages do allow grammatical structures which are, on the face of it, ambiguous. Consider the following sentence of English:

If it’s raining tomorrow, then if I need to go shopping, I’ll take the car, otherwise I’ll go on my bike.

Admittedly, it’s fairly unlikely that someone would construct a sentence like that in spontaneous speech. But assuming they did, the listener hits the problem of how the ‘otherwise’ clause resolves – is it attached to ‘if it’s raining tomorrow’, or to ‘if I need to go shopping’? In other words, what happens when it’s raining but I don’t need to go shopping, or when I need to go shopping but the sun is shining? Programming languages, lacking stress and pitch, resolve such syntactic issues by precisely defining how the sentence is interpreted; they typically resolve the “dangling else” by attaching it to the nearest preceding “if”, which here is the second one.

Beyond syntax, some programming languages have whole other classes of ambiguity, arising from features of Haskell (type inference), C++ (template resolution), and Java. Unlike a human listener, who uses context to work out what the speaker meant, the compiler simply reports an error when it meets such an ambiguity; the programmer has to specify how the structure is meant to be disambiguated.
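
To make the “dangling else” concrete, here is a rough sketch of the rain-and-shopping sentence in Java. The class and method names (DanglingElse, unbraced, outerElse) and the "no plan stated" placeholder are made up for illustration. The first method leaves the ‘otherwise’ dangling, so Java attaches it to the second “if”; the second method uses braces to force the other reading.

```java
class DanglingElse {
    // No braces: Java attaches the else to the nearest unmatched if,
    // i.e. to "if (needShopping)", whatever the indentation suggests.
    static String unbraced(boolean raining, boolean needShopping) {
        String plan = "no plan stated";
        if (raining)
            if (needShopping)
                plan = "car";
        else // lined up with the outer if, but it belongs to the inner one
            plan = "bike";
        return plan;
    }

    // Braces force the other reading: the else now belongs to "if (raining)".
    static String outerElse(boolean raining, boolean needShopping) {
        String plan = "no plan stated";
        if (raining) {
            if (needShopping)
                plan = "car";
        } else {
            plan = "bike";
        }
        return plan;
    }

    public static void main(String[] args) {
        boolean[] values = {true, false};
        for (boolean raining : values) {
            for (boolean needShopping : values) {
                System.out.println("raining=" + raining
                        + ", needShopping=" + needShopping
                        + " -> unbraced: " + unbraced(raining, needShopping)
                        + ", outerElse: " + outerElse(raining, needShopping));
            }
        }
    }
}
```

The two methods agree only when it is raining and I need to go shopping. On a rainy day with no shopping to do, the unbraced version sends me out on my bike, while the braced version says nothing at all about my plans.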

2. Secondly, and more briefly, implicated meaning. And of a particular sort: in natural languages, speakers can convey meaning not only through what they say, but also in how they say it – the forms that they use. For example, saying ‘Might I possibly ask you to close the window?’ conveys not only a request but also the fact that the speaker is being polite and respectful. Similarly, if I tell you that ‘yesterday Bob was driving along when, suddenly, he caused the car to stop’, you wonder if he pulled the handbrake or hit a pothole (or even a tree), otherwise I would have told you ‘he stopped’.

In programming languages, just like in natural languages, there are usage conventions. However, these are for the benefit of the human reader, not for the computer. A software engineer might look at some code and infer something about its style, what kind of experience the programmer has, and so on, but this isn’t part of the communicative act – the compiler, who plays the part of the interlocutor, doesn’t care about any of those things.
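
As a small illustration, here are two Java methods that say the same thing in different “registers”. This is only a sketch, and the names (SameMeaning, sumOfEvens, sumOfEvensLoop) are invented for the example. A human reader might guess something about each author's habits; the computer just produces the same number either way.

```java
import java.util.Arrays;
import java.util.List;

class SameMeaning {
    // Terse, stream-based style.
    static int sumOfEvens(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .mapToInt(Integer::intValue)
                .sum();
    }

    // Explicit, index-based style that computes the same value.
    static int sumOfEvensLoop(List<Integer> numbers) {
        int total = 0;
        for (int i = 0; i < numbers.size(); i++) {
            if (numbers.get(i) % 2 == 0) {
                total = total + numbers.get(i);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(sumOfEvens(numbers));
        System.out.println(sumOfEvensLoop(numbers));
    }
}
```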

3. Thirdly, linguistic change. This is a characteristic of natural language that all speakers are aware of. Often, this awareness surfaces in the form of language pundits who bemoan the use of ‘like’ as a quotative or the singular gender-neutral use of ‘they’, or the many other ways English (or any other language) is thought to be going down the drain. Language change is inevitable, and happens not only at the level of word meanings, but also sounds and syntactic constructions. It happens gradually over time as children acquire language from limited input, and as speakers use language and interact with speakers of different varieties and languages.

Language change happens in programming languages too. However, the kind that most closely parallels natural language change is change in usage, not in the grammar or lexicon: for instance, programmers might notice that a particular construction that was allowed by a language, but not used very often, is actually more useful than they initially thought, and start employing it more. Changes to the grammar or lexicon, though, are decided by committee (for instance, Java 8 now allows the kind of ambiguities we were talking about earlier) – after all, when your language has only a few dozen reserved words, changing the meaning of one is a pretty big deal. And of course, such conscious en masse decisions are very rare, and usually ineffective, in natural languages.

You can think of many other ways that Java differs from Javanese, Python from Tok Pisin, and Swift from Spanish – and I may revisit the theme in a later post. But the fact that we’re never going to celebrate C on International Mother Language Day perhaps points to the most fundamental difference, and to why sacrificing natural language learning for coding isn’t going to be a wise move.