What’s in a word? Researchers say it depends how long it is

This seems to be an interesting study with a pretty uninteresting conclusion, but as sesquipedalians we are happy to circumbilivaginate to achieve an honorificabilitudinitatibus – Deskarati

The theory that a word’s length reflects how often it is used, with frequent words kept short to make language more efficient, has held sway for decades. With “the”, “of” and “and” the three most commonly used words in American English according to the Brown Corpus, the theory seems to make sense. And just consider how long it would take to get out a sentence if “the” were as long as the name of an Icelandic volcano. Now a team of MIT cognitive scientists has used Google data to develop an alternative theory: that a word’s length actually reflects the amount of information it contains.

Although the notion that higher frequency of use engenders shorter words has an intuitive appeal to it, Steven Piantadosi, a PhD candidate in MIT’s Department of Brain and Cognitive Sciences (BCS), says such a theory doesn’t take into account the dependencies between words.

That is, many words, such as the three commonly used words listed above, typically appear in predictable sequences along with other words. The researchers found that short words are not necessarily highly frequent but, because they don’t contain much information by themselves, appear in strings of other familiar words that, together, convey information.

The researchers say this creates an efficiency of its own, though by a different route: the clustering of short words helps to “smooth out” the flow of information in language by forming strings of similar-sized language packets. Whether delivered through clusters of shorter words or through individual longer words carrying greater information, language tends to convey information at consistent rates.
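For readers who want to see what “information content” can mean in practice, here is a minimal sketch. It assumes one common measure, surprisal: the number of bits given by -log2 of the probability of a word given the word before it. The toy sentence and the function name `average_surprisal` are invented for illustration; this is not the researchers' data and may not match their exact method.

```python
import math
from collections import Counter, defaultdict

def average_surprisal(tokens):
    """Return {word: mean of -log2 P(word | previous word)} for a token list.

    Hypothetical helper for illustration. The very first token has no
    preceding word, so that single occurrence is not scored.
    """
    context_counts = Counter(tokens[:-1])             # how often each word acts as context
    bigram_counts = Counter(zip(tokens, tokens[1:]))  # counts of (previous, word) pairs
    totals = defaultdict(float)
    occurrences = Counter()
    for (prev, word), n in bigram_counts.items():
        p = n / context_counts[prev]                  # estimated P(word | prev)
        totals[word] += n * -math.log2(p)
        occurrences[word] += n
    return {w: totals[w] / occurrences[w] for w in totals}

# Toy corpus, made up for this example only.
text = "the cat sat on the mat and the dog sat on the rug"
tokens = text.split()
info = average_surprisal(tokens)
freq = Counter(tokens)

# List words from least to most informative, with their raw frequency.
for word in sorted(info, key=info.get):
    print(f"{word:>4}  len={len(word)}  freq={freq[word]}  bits={info[word]:.2f}")
```

In this toy sentence “the” only ever follows “on” or “and”, so it is fully predictable and scores 0 bits, while “cat” is one of four equally likely words after “the” and scores 2 bits. A corpus this small says nothing about word length; it only shows how frequency and information content are two separate measurements.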

Read more here (or not!): What’s in a word? Researchers say it depends how long it is.

  1. Steve B says:

    Would you like to generate your own word cloud like the one in this article? If you do, then go along to http://www.wordle.net and get creative.

  2. alfy says:

    Interesting ideas.

    1. Politico-Business Speak
    To some extent these theories depend on the type of English we are examining. In political or managerial speak we find words which have no actual information content at all. They serve to convey an impression of important or profound ideas behind some mundane piece of legislation or organisation. Terms like “forward-thinking”, “balanced approach” or “rationalisation” are easily detected as meaningless by trying the opposites for size. No politico or business pundit would say, “We have a new backward-looking plan, for the irrationalisation of our business by an unbalanced approach.”

    2. Faux Science
    Even worse is the use of scientific words to give a false sense of erudition or authority. The term “quantum-leap” has been very popular. That buffoon John Prescott was using it to mean a major change. If the arts-educated journalist interviewing him had even a smidgen of science, he could have said, “Did you know that a ‘quantum-leap’ is incredibly tiny? Is that what you mean?”

    3. Wonderful Prepositions
    English makes a marvellous use of prepositions to produce quite different meanings.
    For example, take the simple verb, “to run”, and see the effect of adding any one of the following prepositions: up, out, in, down, through, over. These are understandably very confusing for learners of English but a great boon to the rest of us. A little word like “up” has a small but clear information content, but when used in tandem with verbs it can be seen as a most important word with a wide and considerable information content. Think of all the information contained in “run through”.

