Infinite dictionary: Erin McKean, former editor in chief of US Dictionaries for Oxford University Press, has created Wordnik, expanding the meaning and uses of a dictionary.
(Jasmine Scott/Special to The Christian Science Monitor)
New online dictionary redefines ‘look it up’
Lexicographer Erin McKean’s interactive ‘Wordnik’ is projected to be the largest online dictionary ever.
By Jina Moore | Correspondent / March 16, 2009 edition
Chicago
Erin McKean doesn’t look much like a revolutionary. She speaks softly. She sews her own skirts and writes a daily blog entry about vintage patterns. She does work out of a basement, but it’s got carpeting and good lighting and roughly 1,500 books, many of whose titles involve the word “words.” Her suburban Chicago home is not exactly the picture of subversion.
This week, though, she is slated to launch what may be the biggest revolution in the printed word since, well, printed words.
Ms. McKean’s brainchild is called Wordnik, and it combines the best practices of the old-fashioned desk reference with Internet innovations. Words can be tagged like a blog entry, their pronunciation recorded and replayed like streaming radio, their related words cataloged like a list of books customers also bought at an online book depot. When the paper page gives way to the Web page, everything about the way we think of words will change, McKean says. “This project,” she predicts in a quiet voice devoid of bravado, “is going to completely revolutionize all of dictionarymaking forever.”
Granted, a dictionary is closer to a database than a mystery thriller, its authors nothing like, say, John Grisham. But to McKean, nothing has ever seemed more fascinating than collecting and organizing American words.
McKean was 8 years old when she decided that when she grew up, she wanted to be a lexicographer – the technical term for a writer or editor of dictionaries. She first came across the word in her daily scouring of The Wall Street Journal. Her father was a Journal devotee, and McKean liked the human interest stories (but, she jokes, “even then, I knew enough not to read the editorial page”). A feature article celebrated Oxford University Press’s 1980 Word of the Year – ayatollah – and talked about preparing the newest edition of its most famous title, the Oxford English Dictionary.
“I think I was really attracted by the fact that it was taking 21 years to make the second edition of the Oxford English Dictionary,” she recalls. “I was 8. Twenty-one years was forever.”
The lexicography bug stuck, in part because McKean loved language. She was a voracious reader, plowing through her local libraries’ stacks and devouring anything she found at home, she says. “If it was lying around, I read it. If my parents didn’t want me to read it,” she says, “they had to hide it.”
As her classmates abandoned childhood dreams of firefighting or Broadway stardom for teaching or nursing, McKean stuck with words. “Nobody ever tried to talk me out of it. Nobody knew enough about it to know if it was easy or difficult,” she recalls. “Nobody had a brother who was a lexicographer the way they might have a brother who was a firefighter or an English teacher or a doctor or a lawyer. Nobody had ever met one.”
For good reason, she found out as she pursued joint bachelor’s and master’s degrees in linguistics at the University of Chicago: There aren’t a whole lot of jobs for lexicographers. McKean estimates there may be 200 working lexicographers in America today, and that the field sees about two full-time openings a year.
McKean got her start through a combination of luck and ingenuity: She called up the only dictionary publisher based in Chicago and asked for an internship. After graduation, the internship turned into a job, which eventually turned into a career at Oxford University Press, a move she likens to “being called up by the Yankees.” At age 29, McKean was the chief editor of the American dictionaries group. “If it had Oxford and American in the title,” she says, “it was my fault.”
She could dream up bestsellers, like the Oxford American Writer’s Thesaurus, but among her favorite books is the first one she acquired at her new home, a publishing house with a reputation for erudition. “It was called Slayer Slang…. [It] is a treatment of the slang of Buffy the Vampire Slayer,” the title character in a hit television drama from the late 1990s.
The purchase revealed as much about McKean’s sensibility as it did about her business sense. And when it comes to dictionaries, McKean says, sensibility is key. “People have this idea of the Platonic ideal of the dictionary. That’s why they call it ‘the dictionary’…. They think that all dictionaries are pretty much the same.” Not so, she says. There are five print dictionary publishers in the US, each choosing which of the billions of words they’ve collected will make it into print.
What gets left out depends on the personality of the publishing house. On the other hand, how to evaluate what gets in is a task beyond most people. “Most consumers don’t have a good metric for deciding on whether the dictionary they want to use is a good one … so they flip the book over, then go to the back, and it says, ‘over 250,000 entries.’ And they go, ‘Great, this dictionary must be awesome!’ ” she says. “Because if you don’t know a word, how do you judge the quality of the definition?”
Enter Wordnik, McKean’s newest project. In the infinite space of the Internet, she can define as many words as she wants.
“There are hundreds of thousands of words that aren’t in any print dictionary today … because there’s no space for all of them.”
Wordnik has space for many of them, and for their bells and whistles. Her team of seven has analyzed what print and online dictionaries do and don’t do well. They’ve built a user-friendly resource that should be the best – and biggest – of both worlds. Wordnik generates its content from a database of 4 billion words, twice as many as that of her last employer. “Four billion words,” she says with a shrug, “is what you can pick up lying around on the floor of the Internet.”
Want to evaluate a definition of a word you’ve never met? No problem; other users can tell you if they favor that definition. Want to know what other words often appear in the same sentence as what you’ve just looked up? There’s a section called “related” for words used in the same context as yours. Need to know what a farthingale, for instance, looks like? Images are imported to the page from photo-depot giant Flickr. Unsure if you really understood the definition? Every word has several example sentences, culled at random from that Internet floor and then sorted so the best rise to the top of your search page.
These, McKean says, are critical. They’ve been vanishing from print dictionaries as publishers try to cram them with more words, but contextual sentences are what make people pick up reference books in the first place. “We think people go to a dictionary to find out what a word means,” she says. Not so. “Most people go to the dictionary because they don’t want to look stupid.”
They don’t want to sound stupid, either, which is why every word has an audio file of its pronunciation. Users can record their own pronunciations, too.
Print dictionaries do have one clear advantage, though: They show more than one word at a time. That makes skimming the print page fun, and McKean has tried to mimic that feeling with a “serendipity” feature, which generates words at random.
Perhaps the most surprising element of McKean’s new dictionary is a frequency graph, which shows how often the word you’ve looked up was used, as a written word, in a year. That can tell you more about history than just the etymological: Take “chad,” for instance. The word’s frequency in 2000 is high – thanks, of course, to that year’s presidential election controversy. But there are signs of heavy usage much earlier. [Editor’s Note: The original version of this story incorrectly used the word “entymological” instead of “etymological.” A reader pointed this out here. You can read our response here.]
“We have one text from 1870 that has the word ‘chad’ a lot, because it’s about Jacquard [weaving] looms, which used to be run on punch cards,” McKean explains. “They had the same chad problems as the Florida ballots.”
Ultimately, McKean’s goal is rather humble, when judged against the volume of words that have accumulated in the 400-year history of modern English.
“Ideally my goal is, before I die, to have some information about every word that’s ever been used in print.”
That may be the real revolution: digitizing a bit of data about every word we English speakers have ever put on the old-fashioned page. Byte by byte, the soft-spoken lexicographer will see her revolution through.
2. shadowsprite | 03.16.09
This is a great idea. My friend who wears hearing aids might benefit by the audio features. He reads all the time so he knows a huge number of words. Sometimes it is funny when he says one of them because he has no idea how to pronounce it. His speech is fairly normal since he could hear okay as a small child. Being able to look up a word and hear how it sounds when it is spoken could help him communicate more fluently at work.
5. Bill Blask | 03.16.09
For Petrich: definition of God (no need to shout). Why, as many as there are. This is digital media, yes? Be sure to include those from all religions and faiths, to be accurate. And then there are the antonyms, and..
Now perhaps the pictures will pose a problem.
7. music | 03.16.09
This sounds great - I skimmed, so forgive me if I missed this. Will there be audio for music? Say you type in a band: can it play a song? (Good way to get more people involved.) Then the bands could post upcoming shows.
8. Mackenzie Kelly | 03.16.09
I hope that you will incorporate Skeat’s Etymological Dictionary’s findings.
9. dionigi | 03.17.09
“This week, though, she is slated to launch what may be the biggest revolution in the printed word since, well, printed words.”
Or since… Wikipedia? This whole project sounds like Wiktionary with some extra bells and whistles. It hardly sounds revolutionary at all, actually.
10. Ian Monroe | 03.17.09
Sounds like a good resource in the making.
@dionigi: Its mission is a lot smaller than that of Wiktionary, though. Wiktionary aims to have an English definition for every word in every language. Wordnik is just about English. Also, it doesn’t appear to be editable; it’s more about just using vast databases of text and analyzing them. So it seems to be complementary to Wiktionary. Perhaps a future mashup will pair Wiktionary definitions with Wordnik data.
11. G. Abramowitz | 03.17.09
“Every word has several example sentences, culled at random from that Internet floor and then sorted so the best rise to the top of your search page.” Literacy began its decline in 1963, when Webster’s Third New International replaced the Second in publishing houses. The Third eliminated the Second’s invaluable usage examples, which illustrated not only meaning but syntax.
This venture sounds promising, but I’m not sure I’d want to rely on usage picked off the “floor” of the Internet. Too much “reign in” for “rein in,” “go to the mattresses” for “go to the mat,” etc.
—A copyeditor
12. M Knapp | 03.17.09
To shadowsprite: many online dictionaries already have this feature. At the very least, Merriam-Webster does.
13. backstory | 03.17.09
CSMonitor response to comment #1:
A slip of the keyboard, and we added an ‘n.’ An iroic mistake. Er, ironic.
14. Virginia | 03.17.09
Fun to read about Erin and Wordnik. I first learned about her love of words via this engaging talk she gave: http://www.ted.com/index.php/talks/erin_mckean_redefines_the_dictionary.html
15. Chuck Lewis | 03.18.09
Great idea, great article. And, speaking of the word “God,” for God’s sake don’t forget the Orthodox or even semi-Orthodox word “G-d” as written. Not sure how you pronounce that.
17. Leslee | 03.20.09
I can’t wait to see how the new online dictionary defines “marriage” and “church”. I just went yesterday to an old Webster’s dictionary and then to the World Book dictionary 1987 and then Encarta and the words certainly have evolved!
18. Naumadd | 03.25.09
I truly hope Wordnik will be created and used as a guide to how language, namely English, IS used rather than with the usual viewpoint of a dictionary as a rulebook to how language MUST be used. One can see the attitude clearly in several comments here already. I’ve always longed for a dictionary or perhaps dictionaries that approach the listing of words and their many possible meanings as simply a guide to how words have been or currently are used, not as a rulebook to how they must be used or defined for all time and in all contexts. The truth is, each of us defines the words we use in our own ways based on our own experiences and for good reason. A truly useful dictionary provides both common and as many uncommon definitions as is possible to include and perhaps ought to come with a caveat warning the reader not to construe the definitions provided as perfected, mandatory, permanent or unalterable. Language is only a tool. It necessarily changes and evolves over time and must necessarily adapt to each new context. There are no and can be no final definitions. Only in very specific contexts is any definition considered “mandatory”. In most contexts, definitions are necessarily fluid. Because language evolves, dictionaries must evolve. It’s always surprising how many individuals do not view either language or dictionaries in this manner.
19. Theo Halladay | 03.27.09
Re Naumadd’s comment that hopes the dictionary wont be taken as perfected, mandatory and permanent: I hope readers wil understand that this caveat aplyes also [I use reformd spelling] to the way English is speld. Altho lexicografers in the past hav offen modestly declared that they ar only publishing wot is in use, not making rules, that message dusnt seem to reach the public. Once a spelling has apeard in a dictionary, it takes on a holiness it shud not hav. English spelling is a disorganized hodgepodge, based on at least 4 language roots jumbled together, & has never been propperly edited. Dictionary may not be arbitrators, but I hope Ms.McKean wil include sum articles pointing out the inconsistencys & listing suggestions that hav been made for regularization of spellings. Such comments wud encurrage the much-needed updating wich other European cuntrys do at intervals for their ritten languages - an update wich strangely has never been dun in either the UK or the US for English.
20. Mekhong Kurt | 03.28.09
Naumadd: I’m a retired university English teacher who has taught at several universities in the U.S. and Asia, and I can tell you that a great many people already recognize that not everyone talks in the same way. Last I knew, the respected Oxford English Dictionary recognized something on the order of 22 “national ‘Englishes’.”
That said, there has to be some common meaning if two or more people are to communicate, and the latter part of your entry seems to suggest that individual meanings are what you would prefer. But were that the case, you would have only two choices: not communicate with anyone, or share your meaning (or learn the other person’s) — and then it’s not individual anymore, is it?
This is a great project, and I thoroughly enjoyed the article.
21. Kris Fulsaas | 03.31.09
Re: Naumadd’s comment, Lewis Carroll’s Humpty Dumpty expresses your desires so well: “When I use a word, it means just what I choose it to mean–neither more nor less.” I couldn’t disagree more.
22. Samuel Johnson | 03.31.09
How amusing, yet how sad. Half of the first twenty comments are concerned with neither literature nor linguistics. Instead, they advance predictable ideologies as if the authors thought they were sharing brilliant new insights.
Perhaps a day spent with both Wordnik and Wikipedia would help. Look up Aristotle, the Talmud, Mencken, and Eco on logos, mimesis, concept, meaning, description, prescription, dialect, jargon, slang, argot, and a few hundred other related terms. Then start again with Chomsky and Derrida. Maybe there will be some inkling as to whether a dictionary, as opposed to an encyclopedia or a tract, would (or should) include (or exclude) “god”, “gods”, “God”, “Gods”, “g-d”, “damnation”, “tarnation”, “holy smoke”, “gee whiz”, “hocus pocus” — with a picture (or a graven image) of each.
At least the Faithful Reader will realize that such controversy is eternal, not novel.
23. John Davis | 03.31.09
Sounds (!) like a great resource. I will check it out immediately. But I have a suggestion born of frustration with many dictionaries: include the names of things such as the tools used in crafts or operations (those of hearths and for barrel-making come to mind). It is very hard to verify and learn about such things.
Which brings me to my second suggestion: I hope there are cross-links to synonyms and antonyms. This, along with definitions and frequency information, tells a story about a concept or idea that is carried by several terms simultaneously. Such stories are devilishly difficult to dig out of most dictionaries and sourcebooks.
I look forward to watching Wordnik grow.
24. Teresa Allen | 04.02.09
I am tickled to find out about this new resource; I work in a library and providing patrons with avenues to information is a large part of what we do. I also love the engagement of the readers allowed by the on-line format. I had been thinking I would miss the printed daily Monitor; I suppose I still will, but this is a great consolation prize.
25. robert ingram | 04.03.09
i know that ms. mckeans work will be a valuble tool to many of us intelligentsia.
26. marjory | 04.11.09
To the editor with the ironic mistake; under the circumstances, could also be irenic.
1. Paul Lundquist | 03.16.09
Gentle writer: You use the word entymological when you mean etymological in the paragraph on the frequency graph. I’m not sure yours is a word as the study of bugs is entomology with only one letter y.
Otherwise I am so happy to learn about this new online dictionary and its creator. Thank you.