Dictionaries and Word Lists

The other day I was working on a new application that needed to process large batches of words, as comprehensively as possible. After some quick searches I found that (unsurprisingly) there are a number of freely available dictionary and word list files on the Internet.

The first repository that I tried was one hosted on SourceForge, simply called ‘Wordlist’. Many of the lists hosted on that page are spell-checker centric, but the 12 Dicts package, in particular, was rather comprehensive. It originally contained 12 dictionaries, a number that has since been pruned down. Within the package there are a number of different dictionaries: some contain old English words, some have hyphenated words, some have acronyms, etc. You need to use the grid that they provide to determine which dictionary is best suited for you. After doing some work with this package, however, I determined that it simply wasn’t comprehensive enough for me (at 74,000 words).

After some more digging I came across a public domain list called ENABLE, which is overwhelmingly comprehensive. It contains approximately 173,000 words and is used in just about every word game on the planet! The list is very clear-cut and has no limitations imposed as to the words contained within it. If you need a word list for any of your upcoming projects, I highly recommend it!
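For anyone wanting to try a list like this out, here is a minimal sketch of loading one into memory and checking a batch of words against it. It assumes the file is plain text with one word per line; the function names and the file path are hypothetical, made up for illustration, and this is not the code from my application.

```python
from pathlib import Path
from typing import Iterable, List, Set


def parse_words(lines: Iterable[str]) -> Set[str]:
    """Build a lookup set from lines that hold one word each (assumed format)."""
    words = set()
    for line in lines:
        word = line.strip().lower()
        if word:  # skip blank lines
            words.add(word)
    return words


def load_word_file(path: str) -> Set[str]:
    """Read a word list file; `path` is whatever file you downloaded."""
    with Path(path).open(encoding="utf-8") as handle:
        return parse_words(handle)


def known_words(batch: Iterable[str], words: Set[str]) -> List[str]:
    """Return the entries of `batch` found in the word list, in original order."""
    return [w for w in batch if w.strip().lower() in words]


if __name__ == "__main__":
    # A tiny in-memory stand-in for a real word list file.
    sample = parse_words(["apple\n", "Banana\n", "\n", "cherry\n"])
    print(known_words(["banana", "durian", "APPLE"], sample))
```

A set keeps each lookup at roughly constant time, which matters once the list grows to a couple of hundred thousand entries.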

Posted: August 4th, 2005

