Data is Singular

July 03, 2020

Throughout the history of English1, people who think too highly of themselves have prescribed how people should speak and write. These include prescriptions like "don't end sentences with a preposition" and "avoid singular they". For the most part, these prescriptions seem to have failed to infiltrate spoken English. Similarly, in much written English, these prescriptions were either never adopted or are starting to be abandoned. For instance, the NYT style guide states that "we must also guard against a reflexive traditionalism that would make The Times seem fusty or out of touch. Language changes, and we should carefully and judiciously reckon with those changes".2

However, there is one prescription that has held on for dear life, stubbornly refusing to die: plural data. If you haven't noticed or encountered this, it's when the word data is used as a plural noun, e.g. "the data are convincing". The argument for this usage, like many prescriptivist arguments, stems from Latin. Originally, data was the plural of datum, meaning a single piece of data (e.g. a numerical result). However, in the vast majority of usage, data has come to be viewed as a mass noun meaning "information" and thus treated as singular. Despite this shift, plural usages of data abound, particularly in academic writing, where the term data is often used and traditional writing norms have significant sway.

This usage may seem harmless enough, but it has a few negative consequences. First, at a surface level, it makes the writing seem stilted and old-fashioned. This, in turn, can have the effect of making the authors seem distant, elitist, and unrelatable. This seems unwise, especially at a time when academia is coming under attack for being out of touch with the ordinary person (somewhat unfairly, I think). Second, it can be jarring, taking the reader out of the flow, making them reparse the sentence, and altogether slowing and impeding understanding. I've actually heard this second point used as an argument for prescriptive usage. For instance, my advisor once asked me to change a sentence to match some prescriptive form so that I wouldn't jar the reader, since the prescriptive usage was the norm, even though my advisor didn't personally agree with the prescription. I would argue that the reverse effect, being jarred by the prescriptive form, can be even more significant, especially when the prescriptive usage deviates significantly from common usage. When I see data used as a plural noun, my reading flow is immediately shaken.

Ultimately, abandoning the plural usage of data could be a small step toward making writing, particularly academic writing, clear, relatable, and natural. Formal and academic writing have already made great strides in shedding some of the most onerous prescriptive shackles like avoiding singular they and sentence-ending prepositions. Taking this one further step would significantly align common and formal English in a beautiful and powerful harmony.3

  1. And in other languages

  2. I would argue the traditionalism that the Times talks about was largely never tradition in spoken language but only in written language. Thus, the changes the Times might make do not really reflect "language changes" but rather shifting attitudes about the alignment between spoken and written language and the ills of prescriptivism. Interestingly, the Times style guide is still definitely out of touch in certain areas, for instance, in proscribing singular "they".

  3. I realize this whole polemic may sound prescriptivist itself. However, I'm not saying you should not use plural data if you treat it as a plural in your spoken language. Though, if you do treat data as plural in your speech, I would suggest reflecting on whether you acquired this usage later in life as a sign of education, rather than from your underlying natural speech.