
Tricks in NLP

“Don’t just learn, experience.
Don’t just read, absorb.
Don’t just change, transform.
Don’t just relate, advocate.
Don’t just promise, prove.
Don’t just criticize, encourage.
Don’t just think, ponder.
Don’t just take, give.
Don’t just see, feel.
Don’t just dream, do. 
Don’t just hear, listen.
Don’t just talk, act.
Don’t just tell, show.
Don’t just exist, live.” 
― Roy T. Bennett, The Light in the Heart


  • Use semi-supervised learning to create training data faster
  • Augment data with character- and word-level replacements using nlpaug
  • Character encoding
    • CountVectorizer with analyzer='char'
    • ELMo embedding
  • Word encoding
    • Use topic modelling over paragraphs to get relevant words
    • Use IDF-weighted word embeddings instead of a plain mean of word embeddings to represent a chunk of text
  • Imbalance
    • Duplicate instances of classes that occur only once instead of removing them; otherwise a stratified train_test_split raises an error
  • Use label smoothing on y


  • Use normalisation layers
  • Use GELU activation instead of tanh/ReLU
  • Transformer-based models
    • Try with/without fine-tuning the language model
    • Try adapters in between
  • Try different learning rates for Adam
  • Use fit_one_cycle (Leslie Smith’s one-cycle policy)
  • Use layer-specific learning rates as in ULMFiT
  • Use gradient clipping in RNNs
  • Imbalance
    • Oversample the minority class and undersample the majority class (fixed-size or partial sampling)
    • Use class_weight in Keras model.fit or in scikit-learn estimators
  • If some samples in the data are mislabelled, use class/sample weights to give more importance to correctly tagged instances
  • Try multi-task learning when data for the target task is scarce but data exists for similar tasks
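The class-weight trick above can be set up with scikit-learn's helper; the toy label array is mine:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y = np.array([0, 0, 0, 0, 1])  # imbalanced labels

# "balanced" weights are inversely proportional to class frequency.
weights = compute_class_weight(class_weight="balanced", classes=np.unique(y), y=y)
class_weight = dict(zip(np.unique(y), weights))
print(class_weight)  # the minority class gets the larger weight
```

The resulting dict can be passed directly as `class_weight=class_weight` to Keras `model.fit`, or set via the `class_weight` parameter of many scikit-learn estimators.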
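The oversample/undersample tip can be sketched with scikit-learn's `resample` (the toy data is mine; libraries like imbalanced-learn offer more elaborate samplers):

```python
import numpy as np
from sklearn.utils import resample

X = np.arange(10).reshape(-1, 1)
y = np.array([0] * 8 + [1] * 2)

minority = X[y == 1]
majority = X[y == 0]

# Oversample the minority class with replacement up to the majority size.
minority_up = resample(minority, replace=True, n_samples=len(majority), random_state=0)

# Or undersample the majority class down to the minority size.
majority_down = resample(majority, replace=False, n_samples=len(minority), random_state=0)

print(len(minority_up), len(majority_down))  # 8 2
```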
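Gradient clipping for RNNs, mentioned above, amounts to rescaling gradients when their global norm exceeds a threshold. A minimal NumPy sketch (frameworks provide this built in, e.g. `clipnorm` in Keras optimizers or `torch.nn.utils.clip_grad_norm_`):

```python
import numpy as np

def clip_by_global_norm(grads, max_norm=1.0):
    # Rescale all gradients jointly so their global L2 norm is at most
    # max_norm, which prevents exploding gradients in RNN training.
    norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    scale = min(1.0, max_norm / (norm + 1e-12))
    return [g * scale for g in grads]

grads = [np.array([3.0, 4.0])]  # global norm 5
clipped = clip_by_global_norm(grads, max_norm=1.0)
print(np.linalg.norm(clipped[0]))  # clipped down to ~1.0
```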
