Tricks in NLP

  • Use semi-supervised learning to create training data faster
  • Augment data with character-level and word-level replacements using nlpaug
  • Character encoding
    • CountVectorizer with analyzer='char'
    • ELMo embedding
  • Word encoding
    • Use topic modelling over paragraphs to get relevant words
    • Use an IDF-weighted average of word embeddings instead of the plain mean to represent a chunk
  • Imbalance
    • Duplicate instances whose class occurs only once instead of removing them; otherwise a stratified train_test_split raises an error
  • Use label smoothing on y
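
Three of the data-side tricks above (IDF-weighted chunk embeddings, duplicating single-occurrence classes, and label smoothing) can be sketched in a few lines of NumPy. This is only a minimal illustration: the function names, the toy documents, and the 2-dimensional "embeddings" are made up for the example, and the IDF formula log(N/df) + 1 is just one common variant.

```python
import math
from collections import Counter

import numpy as np


def idf_weighted_embedding(tokens, embeddings, idf):
    """Represent a chunk as the IDF-weighted average of its word vectors."""
    known = [t for t in tokens if t in embeddings]
    if not known:
        # No known word: fall back to a zero vector of the right size.
        dim = len(next(iter(embeddings.values())))
        return np.zeros(dim)
    weights = np.array([idf.get(t, 1.0) for t in known])
    vectors = np.stack([embeddings[t] for t in known])
    return (weights[:, None] * vectors).sum(axis=0) / weights.sum()


def duplicate_singletons(texts, labels):
    """Copy every instance whose label occurs once, so each class has >= 2 rows."""
    counts = Counter(labels)
    extra = [(t, y) for t, y in zip(texts, labels) if counts[y] == 1]
    return texts + [t for t, _ in extra], labels + [y for _, y in extra]


def smooth_labels(one_hot, eps=0.1):
    """Label smoothing: move eps of the probability mass to a uniform prior."""
    n_classes = one_hot.shape[1]
    return one_hot * (1.0 - eps) + eps / n_classes


# Toy, made-up inputs for illustration only.
docs = [["good", "movie"], ["bad", "movie"], ["good", "plot"]]
doc_freq = Counter(t for d in docs for t in set(d))
idf = {t: math.log(len(docs) / c) + 1.0 for t, c in doc_freq.items()}
embeddings = {
    "good": np.array([1.0, 0.0]),
    "bad": np.array([-1.0, 0.0]),
    "movie": np.array([0.0, 1.0]),
    "plot": np.array([0.0, 0.5]),
}
chunk_vector = idf_weighted_embedding(["good", "movie"], embeddings, idf)

# "neg" occurs once, so its single instance is duplicated.
texts, labels = duplicate_singletons(["a", "b", "c"], ["pos", "pos", "neg"])

# Smooth two one-hot rows over 3 classes; each row still sums to 1.
smoothed = smooth_labels(np.eye(3)[[0, 2]], eps=0.1)
```

After duplicate_singletons, every class has at least two members, which is the minimum a stratified split needs. In practice the embeddings would come from a pretrained model and the IDF values from the training corpus.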


  • Use normalisation layers
  • Use GELU activation instead of tanh/ReLU
  • Transformer-based models
    • Try with and without fine-tuning the language model
    • Try adapter modules inserted between the pretrained layers
  • Try different learning rates for Adam
  • Use fit_one_cycle (fastai's implementation of Leslie Smith's 1cycle policy)
  • Use layer-specific learning rates, as in ULMFiT
  • Use gradient clipping in RNNs
  • Imbalance
    • Oversample the minority class and undersample (fixed_sample/partial_sample) the majority class
    • Use class_weight (an estimator parameter in scikit-learn, an argument to model.fit in Keras)
  • If some samples in the data are mislabelled, use sample_weight to give more importance to the correctly labelled samples
  • Try multi-task learning when the target task has little data but similar tasks have more
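
Some of the model-side tricks above are easy to write out by hand, which helps when checking what a framework does for you. The sketch below shows the exact GELU function, gradient clipping by global norm, and "balanced" class weights computed as n_samples / (n_classes * class_count). The function names and toy values are illustrative assumptions, not any library's API.

```python
import math
from collections import Counter

import numpy as np


def gelu(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def clip_by_global_norm(grads, max_norm):
    """Rescale all gradients together so their joint L2 norm is <= max_norm."""
    total = math.sqrt(sum(float((g ** 2).sum()) for g in grads))
    # Only shrink; gradients already inside the bound are left unchanged.
    scale = min(1.0, max_norm / (total + 1e-12))
    return [g * scale for g in grads], total


def balanced_class_weights(labels):
    """weight_c = n_samples / (n_classes * count_c); rarer classes weigh more."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


# Toy gradients with joint norm sqrt(3^2 + 4^2) = 5, clipped down to norm 1.
grads = [np.array([3.0, 0.0]), np.array([0.0, 4.0])]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)

# An 8:2 imbalanced label set; the minority class gets the larger weight.
weights = balanced_class_weights(["neg"] * 8 + ["pos"] * 2)
```

A dictionary like weights is the kind of mapping that a class_weight parameter typically expects, though Keras keys it by integer class index rather than by label string. Deep learning frameworks ship their own clipping utilities and GELU layers, so these functions are for understanding rather than production use.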

