Neural NLP for Morphologically Rich Languages

Pēteris Paikens (University of Latvia), Mārcis Pinnis (University of Latvia), Inguna Skadiņa (University of Latvia)

In recent years, neural networks (NNs) have become the dominant approach in natural language processing (NLP). In this course, we will introduce learners to widely used NN architectures and provide an overview of the machine learning approaches currently used for NLP tasks.

Since much of the research and competition in state-of-the-art NLP development is focused on English, applying these approaches effectively to morphologically rich languages requires certain modifications. This course will focus on ways to adapt and apply current NN approaches to the processing of morphologically rich and less-resourced languages. Practical application of NN techniques will be illustrated through several use cases: machine translation, speech processing, and other NLP advancements in the Baltic states.

Course Materials

Course outline

Day 1. Introduction. Morphologically rich languages. Low-resourced languages. Introduction to neural networks.

Day 2. Common neural net architectures and their application to basic NLP tasks.

Day 3. Application of deep learning to machine translation for morphologically rich languages. Data sparsity. The BPE algorithm. Advanced techniques for low-resourced settings.

Day 4. Neural nets for advanced NLP tasks and architecture adaptations for low-resourced and morphologically rich languages.

Day 5. Recent achievements with neural nets for the Baltic languages.
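
As a small illustration of the BPE (byte pair encoding) algorithm listed under Day 3, the sketch below learns subword merges from a toy word-frequency list. It is a minimal version of the general technique and is not taken from the course materials: the function names, the "</w>" end-of-word marker, and the word counts are all assumptions made for illustration. The example words are inflected forms of the Latvian noun "māja" (house), chosen to show how a shared stem can emerge as a single subword unit.

```python
from collections import Counter


def get_pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Replace every occurrence of `pair` with one merged symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge operations from word frequencies."""
    # Start from single characters plus an end-of-word marker (an assumption).
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        vocab = merge_pair(best, vocab)
    return merges, vocab


# Hypothetical counts for three inflected forms of one Latvian noun.
toy_counts = {"māja": 5, "mājas": 3, "mājai": 2}
merges, vocab = learn_bpe(toy_counts, num_merges=3)
for symbols, freq in vocab.items():
    print(" ".join(symbols), freq)
```

After three merges on this toy data, the stem shared by all three forms becomes a single symbol while the endings stay separate. This is the property that makes subword segmentation useful against data sparsity in morphologically rich languages: a rare inflected form can still be represented through a frequent stem plus a frequent ending.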