Language Technology teaching

Language Technology study module (25 credits)

Organized jointly by the Department of Future Technologies and the School of Languages and Translation Studies. No prior knowledge of language technology is needed. Students from outside the Department of IT learn the basics of programming and automatic text processing in the first- and second-period courses, so that they can continue to the more advanced courses. All advanced courses are organized so that motivated students from outside the IT Department can also complete the study module.

Courses

KKLT0030 Automaattinen tekstiprosessointi (5 credits) (Automatic Text Processing)

Teacher: Veronika Laippala, School of Languages and Translation Studies

Language: Finnish

Time: Every year, first period. Update: The course starts on Monday 10.9. Classes are held on Mondays at 10.15 and Thursdays at 12.15 in A252, Arcanum, IT classroom (IT-luokka).

Level: Intermediate

After the course, the student knows how to manipulate and analyze large corpora from the command line. The student is familiar with simple Unix tools and techniques for tasks such as sorting, counting frequencies, using regular expressions, running loops and using pipes. Further, the student knows how to search for instructions in online manuals. The practical assignments prepare the students to apply these skills, for instance in theses, and the skills are developed further in the more advanced courses of the study module.
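To give a feel for the kind of task described above, here is a minimal sketch of counting word frequencies. It is written in Python, the language named for the programming course in this module, rather than with the Unix tools this course itself uses, and the function name `word_frequencies` and the sample sentence are made up for illustration only.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return (word, count) pairs, most frequent first."""
    # A regular expression picks out runs of word characters as tokens.
    tokens = re.findall(r"\w+", text.lower())
    # Counter tallies the tokens; most_common() sorts them by count.
    return Counter(tokens).most_common()


# Hypothetical sample text standing in for a real corpus file.
sample = "the cat sat on the mat and the dog sat too"

for word, count in word_frequencies(sample)[:2]:
    print(count, word)
```

On the command line, a comparable result is typically obtained by piping together small tools that split the text into words, sort them and count the repeats.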

BIOI2250 Introduction to Programming (5–6 ECTS)

Teacher: Department of Future Technologies

Language: English

Time: Every year, first and second period

Level: Intermediate

The course targets students with no prior programming experience. The students will acquire basic skills in algorithm design and programming, learning to write simple, practical programs in the Python programming language.

Students from the Department of Future Technologies (TKT or DI) cannot take this course; they must take the programming courses intended for TKT or DI degree students.

TKO_8966 Johdatus kieliteknologiaan (5 credits) (Introduction to Language Technology)

Teachers: Jenna Kanerva and Kai Hakala, Department of Future Technologies

Language: Finnish

Time: Every year, spring

Level: Intermediate

The course introduces the basic concepts and applications of language technology, such as text segmentation, morphological tagging, language modeling and machine translation. It presents interesting problems that make human language difficult for a computer to understand, shows how language can be modeled in a machine-learnable way, and teaches how to apply machine learning and neural networks in language technology applications. Machine learning is covered from a very practical point of view, and the course includes the necessary basics of machine learning theory, so no prior knowledge is needed.

KKLT0031 Korpuksia ja kieliteknologiaa (5 credits) (Corpora and Language Technology)

Teacher: Veronika Laippala, School of Languages and Translation Studies

Language: Finnish

Time: Every year, spring

Level: Advanced

After the course, the student is familiar with ready-made corpora from different fields, understands the importance of corpora in linguistics and knows how to avoid the most common problems in corpus compilation. Further, the student knows how to use corpus tools, such as AntConc and WordSmith, is familiar with basic natural language processing tools and how they work, and understands the potential of machine learning for language studies. In addition, the student learns methods for analyzing large digital corpora. These include both traditional corpus linguistics methods and new possibilities offered by natural language processing, such as automatic syntactic analysis, distributional semantics, text classification and sentiment analysis. The studied corpora represent various languages and genres, such as social media, learner language and texts from different time periods.

TKO_8964 Textual Data Analysis (5 credits)

Teacher: Filip Ginter, Department of Future Technologies

Language: English

Time: Every third year, spring

Level: Advanced

The course gives an understanding of the fundamental methods for mining information hidden across very large collections of text, practical skills in working with machine learning methods and in applying basic language technology tools to textual data, and an understanding of the possibilities and limitations of state-of-the-art language technology methods. Topics: Web crawls and other large collections of textual data. Preprocessing of large text corpora: segmentation, tagging, and syntactic analysis. Pattern matching, supervised machine learning, and clustering. Information extraction and aggregation. Applications, e.g. in scientific literature mining.

TKO_8963 Search Engines and Document Retrieval Methods (5 credits)

Teacher: Filip Ginter, Department of Future Technologies

Language: English

Time: Every third year, spring

Level: Advanced

The course teaches the fundamentals of document indexing and search, giving the students an understanding of how large text collections can be queried efficiently and insight into the technology behind large search engines. The course includes plenty of hands-on exercises in which the students learn to deploy commonly used search engine software on a large dataset and to understand the many choices involved in the process. The students also learn how to deal with inflective languages, synonyms, and other more advanced topics in information retrieval. The course also teaches how to cluster and label documents by topic. Topics: Inverted index and fundamentals of search, vector-space representations of documents and concepts, practical exercises in search engine deployment and document index generation, document clustering and topic modeling.
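As a rough illustration of the inverted index listed among the topics, the sketch below maps each word to the set of documents that contain it and answers a query by intersecting those sets. It is a toy example under simplifying assumptions (whitespace tokenization, no handling of inflection or synonyms), not the search engine software used in the course; the function names and the three sample documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the postings of the first term, then narrow down.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical miniature document collection.
docs = {
    1: "language technology course",
    2: "search engine technology",
    3: "deep learning course",
}

index = build_inverted_index(docs)
print(sorted(search(index, "technology course")))
```

Because a lookup only touches the entries for the query terms, the cost of a query depends on how many documents contain those terms rather than on the size of the whole collection, which is the basic reason large collections can be queried efficiently.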

TKO_8965 Deep Learning in Human Language Technology (5 credits)

Teacher: Filip Ginter, Department of Future Technologies

Language: English

Time: Every third year, spring

Level: Advanced

The course teaches the fundamental methods of neural networks and deep learning, with a special focus on their application to human language data. In addition to the theoretical foundations, the course includes plenty of hands-on work in which the students learn how to build and train neural networks in practice on real-world tasks and data. The course also shows the vast array of possibilities that machine learning brings to human language technology, broadening the horizons of students with an interest in language. Students whose interest is more on the technical side learn how to apply deep learning methods to unstructured and ambiguous data, a skill that can easily be transferred to other domains, such as image and video data processing. The course also deepens the students’ programming skills. Topics: Deep learning methods and their application to various tasks in human language technology. Text classification, vector space representations, named entity recognition, sequence classification, machine translation, language generation. Developing and training neural networks using Python.