This is a list of over 100,000 English words transcribed orthographically. I obtained it from The Interociter bulletin board in Dallas (214/258-1832). The original read.me file said that the list came from Public Brand Software.
The original list contained 146,440 words, but I discovered that thousands of them were duplicates. I re-sorted the list and removed the duplicates using the Unix utility uniq, which eliminated 36,858 entries. The total number of words is now 109,582. I have repackaged the list into four files (the original was five):
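For readers without access to sort and uniq, the same cleanup can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used to build the list: the function name sort_and_dedupe is hypothetical, and it assumes one word per line, exact (case-sensitive) duplicate matching, and plain code-point ordering.

```python
def sort_and_dedupe(words):
    """Return the words in sorted order with exact duplicates removed.

    Assumption: entries differing only in case count as distinct words,
    which is also how uniq behaves without its -i option.
    """
    # Strip surrounding whitespace (including newlines) and skip blank lines.
    cleaned = (w.strip() for w in words)
    # A set keeps one copy of each word; sorted() orders them by code point.
    return sorted({w for w in cleaned if w})


if __name__ == "__main__":
    sample = ["cat\n", "apple\n", "cat\n", "bat\n"]
    print(sort_and_dedupe(sample))
```

Passing an open file object works as well, since iterating over a file yields its lines.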
File          Bytes   Words  Range
----------  -------  ------  -----
words1.lst   315376   29839  A-D
words2.lst   242484   23101  E-K
words3.lst   325716   30439  L-R
words4.lst   270759   26203  S-Z
----------------------------------
Total       1154335  109582
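The split by initial letter can be reproduced roughly as follows. This is a minimal sketch under stated assumptions: the file names and letter ranges are taken from the table above, while the helper names file_for and split_by_range are hypothetical, and entries that do not begin with a letter from A to Z are simply skipped.

```python
# File names and letter ranges as given in the table above.
RANGES = {
    "words1.lst": ("A", "D"),
    "words2.lst": ("E", "K"),
    "words3.lst": ("L", "R"),
    "words4.lst": ("S", "Z"),
}


def file_for(word):
    """Return the file a word belongs to, or None if no range matches."""
    if not word:
        return None
    initial = word[0].upper()
    for name, (low, high) in RANGES.items():
        if low <= initial <= high:
            return name
    return None  # assumption: entries with a non A-Z initial are skipped


def split_by_range(words):
    """Group words into one list per file, preserving input order."""
    buckets = {name: [] for name in RANGES}
    for word in words:
        name = file_for(word)
        if name is not None:
            buckets[name].append(word)
    return buckets


if __name__ == "__main__":
    buckets = split_by_range(["apple", "egg", "lamp", "sun", "zebra"])
    for name, members in buckets.items():
        print(name, len(members))
```

Summing the per-file word counts produced this way should give the total number of words in the list.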
This word list includes inflected forms, such as plural nouns and the -s, -ed and -ing forms of verbs. Thus the number of lexical stems represented in the list is considerably smaller than the total number of words.
Academic Computing Department
Summer Institute of Linguistics
7500 W. Camp Wisdom Road
Dallas, TX 75236