Introduction to Natural Language Processing

Duration

10-20 hours

Natural Language Processing (NLP) is an area of research within Artificial Intelligence that addresses problems such as speech-to-text conversion, text processing, and speech generation. Its applications are numerous, including text classification, sentiment analysis, and automatic summarization.

In this course, taught in Python, we will learn the text preprocessing techniques applied before analysis and the main text vectorization methods. We will then apply these methods to five practical areas: document classification, sentiment analysis, text similarity, text summarization, and document clustering.

To take this course, participants should be proficient in Python and have at least a basic knowledge of Machine Learning.

Course content:

  • Introduction to NLP
  • Strings in Python
  • Regular Expressions
  • Text preprocessing
  • Feature Engineering
    • Bag of Words
    • TF-IDF
    • Word2Vec
  • Document classification
  • Sentiment analysis
    • Supervised sentiment analysis
    • Unsupervised sentiment analysis
  • Text similarity
    • Distances
    • Recommenders
  • Text summarization
    • Key term extraction
    • Topic modeling
    • Topic modeling with LSA
    • Topic modeling with NMF
    • Text summarization
  • Document clustering
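
As a taste of the "Text preprocessing" and "Bag of Words" topics listed above, the following is a minimal sketch using only the Python standard library. The toy corpus, the stop-word list, and the helper names `preprocess` and `bag_of_words` are assumptions made for illustration; they are not taken from the course materials, which may well use dedicated NLP libraries instead.

```python
import re
from collections import Counter

# Hypothetical toy corpus and stop-word list, chosen only for illustration.
DOCS = [
    "The movie was great, a great story!",
    "The plot was dull and the acting was poor.",
]
STOP_WORDS = {"the", "a", "and", "was"}


def preprocess(text):
    """Lowercase, extract alphabetic tokens with a regex, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of raw documents."""
    tokenized = [preprocess(d) for d in docs]
    # The vocabulary is the sorted set of all tokens seen in the corpus.
    vocabulary = sorted({t for tokens in tokenized for t in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # One count per vocabulary term; absent terms count as 0.
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


vocabulary, vectors = bag_of_words(DOCS)
print(vocabulary)
print(vectors)
```

Each document becomes a vector of term counts over a shared vocabulary, which is the representation that weighting schemes such as TF-IDF then build on.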