Prague at EPE 2017: The UDPipe System

Milan Straka¹, Jana Straková², Jan Hajič¹
¹Charles University, ²Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics


Abstract

We present our contribution to the First Shared Task on Extrinsic Parser Evaluation (EPE 2017). Our participating system, UDPipe, is an open-source pipeline that performs tokenization, morphological analysis, part-of-speech tagging, lemmatization, and dependency parsing. It is trained in a language-agnostic manner for 50 languages of UD version 2. With a relatively limited amount of training data (200k tokens of English UD) and without any English-specific tuning, the system achieves an overall score of 56.05, ranking 7th among the participating systems.