Best Python libraries for Data Science

It always helps to know which libraries exist, so you can reach for the right one when the need arises.

Core libraries that you have to start with

  1. Pandas
  2. NumPy
  3. Scikit-learn
  4. Matplotlib
  5. Seaborn
  6. Jupyter notebook – For interactive Python (Highly recommended)

NLP libraries

  1. spaCy (check out NLTK vs. spaCy vs. Stanford CoreNLP)
  2. NLTK
  3. Textacy (built on spaCy)
  4. Stanford CoreNLP
  5. TextBlob (built on NLTK and pattern)
  6. RegEx (Inbuilt in Python) – For word patterns
  7. Newspaper – Scraping content of news articles and extracting keywords
  8. Python CRFSuite – Just for making CRFs for NER or other purposes
  9. Gensim – For generating word embeddings
  10. Librosa – Audio analysis
  11. LDA – For Latent Dirichlet Allocation
  12. Textract – For parsing any text file or image containing text

You can find more over here.
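As a small illustration of the RegEx entry above, here is a minimal sketch of matching word patterns with Python's built-in re module. The sample sentence and both patterns are assumptions made up for this example, not something taken from any of the libraries listed.

```python
import re

# Hypothetical sample text, used only for illustration.
text = "Pandas and NumPy pair well; pandas builds on NumPy arrays."

# \b marks a word boundary, so "pandas" is matched only as a whole word.
# re.IGNORECASE lets the same pattern match "Pandas" and "pandas".
pattern = re.compile(r"\bpandas\b", re.IGNORECASE)
matches = pattern.findall(text)

# Collect every word that starts with an uppercase ASCII letter.
capitalised = re.findall(r"\b[A-Z][A-Za-z]*\b", text)

print(matches)
print(capitalised)
```

For anything beyond simple patterns like these, the dedicated NLP libraries in the list are usually a better fit than hand-written expressions.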

Image manipulation

  1. OpenCV
  2. PIL
  3. Imgaug
  4. Tesseract (OCR engine)

Reading text files

  1. PyPDF2
  2. Python-docx
  3. Docx2txt

Downloading videos

  1. Youtube-dl – Can download whole playlists
  2. Coursera-dl – Can download all videos of a course
  3. Udacity-dl – Can download all videos of a course

Scraping web pages

  1. Beautiful soup – For scraping HTML pages
  2. Scrapy – For crawling sites and extracting certain fields from HTML pages (its companion library Scrapely can be trained with samples to extract certain fields)
  3. Selenium – For scraping JavaScript-loaded pages
  4. Newspaper – For getting text of news articles

Deep learning

  1. PyTorch
  2. Tensorflow
  3. Keras

Stock trading

  1. Quandl
  2. Ta-lib
  3. PyAlgoTrade
  4. QSTrader

Other utilities

  1. Pigar – Generates requirements.txt from all Python files in a repo

You can find more libraries over here.

If you run into an error while installing a package, download its .whl file from here and install it using pip install packagename.whl
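If you would rather run that install from a script, the following is a minimal sketch. The helper names pip_install_command and install_wheel are hypothetical, and packagename.whl is the same placeholder used above. It assumes you want the wheel installed into the interpreter that runs the script, which is why it calls pip through sys.executable.

```python
import subprocess
import sys


def pip_install_command(wheel_path):
    """Build the pip command for installing a local wheel file."""
    # "python -m pip" targets the interpreter running this script,
    # which avoids installing into a different Python by accident.
    return [sys.executable, "-m", "pip", "install", wheel_path]


def install_wheel(wheel_path):
    """Run the install; raises CalledProcessError if pip fails."""
    subprocess.run(pip_install_command(wheel_path), check=True)


if __name__ == "__main__":
    # Only prints the command; call install_wheel(...) with a real
    # downloaded file to actually install it.
    print(" ".join(pip_install_command("packagename.whl")))
```

The downloaded wheel has to match your Python version and platform, otherwise pip will refuse to install it.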

This list will be updated over time. Let us know if you have any suggestions!
