Bootstrapping a Persian Dependency Treebank


Abstract


This paper presents an ongoing project whose goal is to create a freely available dependency treebank for Persian. The data is taken from the Bijankhan corpus, which is already annotated for parts of speech, and a syntactic dependency annotation based on the Stanford Typed Dependencies is added through a bootstrapping procedure involving the open-source dependency parser MaltParser. We report preliminary parsing experiments with promising results after training the parser on a manually annotated seed data set of 215 sentences.

Keywords


treebank; Persian; dependency trees; MaltParser
