I am writing a book to help international students overcome
literacy problems. Part of my development is to sift through
every lesson guide our engineering department has published
and extract any word often used in engineering which I suspect
they might have trouble with. (This will not be the same as
generating a 'glossary of engineering terms'.)
Is there a software program that can remove all conjunctions,
transitional words, adjectives, adverbs, etc. and list the
remaining words, preferably in alphabetical order?
Many thanks,
Tony Skinner
Reply:
You're looking for a concordance generator. The one I use is
MonoConc, but there are others. A concordancer will produce a
wordlist in alphabetical order. For removing words by part of
speech, you'd probably want to look at the tools in something like
the Natural Language Toolkit, NLTK (www.nltk.org).
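
If you'd rather script the wordlist step yourself, here is a rough
sketch in plain Python (standard library only). It is not how
MonoConc or NLTK work internally. The function name wordlist and
the tiny FUNCTION_WORDS set are made up for illustration; a real
list of conjunctions and transitional words would be much longer,
and dropping adjectives and adverbs would still need a
part-of-speech tagger, which this sketch does not attempt.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny set of function words to drop.
# This is an assumption for illustration, not a complete list.
FUNCTION_WORDS = {
    "a", "an", "and", "but", "or", "the", "of", "to", "in", "is",
    "however", "therefore",
}


def wordlist(text, drop=FUNCTION_WORDS):
    """Return (word, count) pairs in alphabetical order, minus dropped words."""
    # Lowercase the text and keep runs of letters, allowing internal
    # hyphens and apostrophes (e.g. "load-bearing", "young's").
    tokens = re.findall(r"[a-z]+(?:[-'][a-z]+)*", text.lower())
    counts = Counter(t for t in tokens if t not in drop)
    # Sorting the (word, count) pairs orders them alphabetically by word.
    return sorted(counts.items())


if __name__ == "__main__":
    sample = ("The torque of the shaft is high; however, "
              "the shaft and the bearing wear.")
    for word, n in wordlist(sample):
        print(f"{word}\t{n}")
```

The counts are kept next to each word because frequency is a quick
way to see which terms turn up often across the lesson guides.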