NLTK vs spaCy vs Stanford CoreNLP

Part-of-speech (POS) tagging and named entity recognition (NER) are two of the most important tasks in NLP. For real-world applications, it’s important to select a library that performs these tasks with high accuracy and low latency. Here is a comparison of three popular open source NLP libraries.

Feature Availability*

Feature                   spaCy   NLTK   CoreNLP
Easy installation         Y       Y      Y
Python API                Y       Y      N
Multi-language support    N       Y      Y
Tokenization              Y       Y      Y
Part-of-speech tagging    Y       Y      Y
Sentence segmentation     Y       Y      Y
Dependency parsing        Y       N      Y
Entity recognition        Y       Y      Y
Integrated word vectors   Y       N      N
Sentiment analysis        Y       Y      Y
Coreference resolution    N       N      Y


Speed: Key Functionalities – Tokenizer, Tagging, Parsing*

Package   Tokenizer   Tagging   Parsing
spaCy     0.2ms       1ms       19ms
CoreNLP   2ms         10ms      49ms
NLTK      4ms         443ms     n/a
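
The table does not say how these timings were measured, so the numbers are hard to reproduce directly. As a rough sketch of how one might take comparable measurements on their own data, the snippet below times an arbitrary processing function over a batch of documents and reports the mean time per document in milliseconds. The helper name mean_latency_ms, the whitespace tokenizer, and the sample sentence are illustrative assumptions, not part of any of the three libraries; in practice you would pass in the tokenizer, tagger, or parser call of the library under test.

```python
import time


def mean_latency_ms(func, docs, repeats=3):
    """Return the best-of-`repeats` mean time per document, in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for doc in docs:
            func(doc)
        elapsed = time.perf_counter() - start
        # Keep the fastest run to reduce noise from other processes.
        best = min(best, elapsed / len(docs))
    return best * 1000.0


def whitespace_tokenize(text):
    """Stand-in tokenizer (an assumption); swap in a real library call."""
    return text.split()


# Hypothetical sample data: the same sentence repeated 1000 times.
docs = ["Apple is looking at buying a U.K. startup for $1 billion."] * 1000
latency = mean_latency_ms(whitespace_tokenize, docs)
print(f"whitespace tokenizer: {latency:.4f} ms per document")
```

Taking the best of several runs is one common way to reduce timing noise; averaging all runs would be an equally reasonable choice.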


Accuracy: Entity Extraction*

Package   Precision   Recall   F-Score
spaCy     0.72        0.65     0.69
CoreNLP   0.79        0.73     0.76
NLTK      0.51        0.65     0.58
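
The table does not define its F-Score column. Assuming it is the balanced F1 score, that is, the harmonic mean of precision and recall, it can be recomputed from the other two columns as in the sketch below. The function name f_score is an illustrative choice. Because the precision and recall values in the table are already rounded to two decimals, the recomputed scores may differ from the reported ones in the last digit.

```python
def f_score(precision, recall):
    """Balanced F1 score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # Avoid division by zero when both inputs are zero.
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported (rounded) in the table above.
reported = {
    "spaCy": (0.72, 0.65),
    "CoreNLP": (0.79, 0.73),
    "NLTK": (0.51, 0.65),
}

for package, (precision, recall) in reported.items():
    print(f"{package}: F1 = {f_score(precision, recall):.2f}")
```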


*Source: www.analyticsvidhya.com

