Using the Stanford POS tagger in NLTK

Set these environment variables first (replace dir with the directory where the tagger was unpacked):

export CLASSPATH=dir/stanford-postagger-full-2015-04-20/stanford-postagger.jar

export STANFORD_MODELS=dir/stanford-postagger-full-2015-04-20/models

http://stackoverflow.com/questions/13883277/stanford-parser-and-nltk/34112695#34112695
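
If exporting the variables in the shell is inconvenient, the same two values can probably be set from Python before the tagger is constructed. This is a minimal sketch, assuming NLTK reads CLASSPATH and STANFORD_MODELS from the process environment when the tagger object is created. The helper name set_stanford_env is hypothetical, and the dir prefix is the same placeholder used in the export lines above.

```python
import os

# Placeholder install location, mirroring the export lines above.
# Replace "dir" with the directory where the tagger archive was unpacked.
STANFORD_DIR = os.path.join("dir", "stanford-postagger-full-2015-04-20")


def set_stanford_env(base_dir):
    """Hypothetical helper: set the two variables for the current process."""
    jar = os.path.join(base_dir, "stanford-postagger.jar")
    models = os.path.join(base_dir, "models")
    os.environ["CLASSPATH"] = jar
    os.environ["STANFORD_MODELS"] = models
    return jar, models


jar_path, models_path = set_stanford_env(STANFORD_DIR)
```

Values set this way only affect the running Python process and its children, so the call has to happen before StanfordPOSTagger is instantiated.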


>>> from nltk.tag import StanfordPOSTagger
>>> st = StanfordPOSTagger('english-bidirectional-distsim.tagger')
>>> st.tag('What is the airspeed of an unladen swallow ?'.split())
[('What', 'WP'), ('is', 'VBZ'), ('the', 'DT'), ('airspeed', 'NN'), ('of', 'IN'), ('an', 'DT'), ('unladen', 'JJ'), ('swallow', 'VB'), ('?', '.')]
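
The tagger returns a plain list of (word, tag) tuples, so ordinary Python is enough to work with the result. As a small illustration, the sketch below groups the words from the output above by their tag. The tagged list is copied from that output rather than produced by running the tagger, and group_by_tag is a hypothetical helper, not part of NLTK.

```python
from collections import defaultdict

# Copied from the tagger output shown above, not recomputed here.
tagged = [('What', 'WP'), ('is', 'VBZ'), ('the', 'DT'), ('airspeed', 'NN'),
          ('of', 'IN'), ('an', 'DT'), ('unladen', 'JJ'), ('swallow', 'VB'),
          ('?', '.')]


def group_by_tag(tagged_tokens):
    """Map each POS tag to the list of words carrying it, in sentence order."""
    groups = defaultdict(list)
    for word, tag in tagged_tokens:
        groups[tag].append(word)
    return dict(groups)


groups = group_by_tag(tagged)
print(groups['DT'])  # ['the', 'an']
```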

 
