The objective of this work is to build an Indonesian morphological analyzer named Aksara that conforms to Universal Dependencies (UD), in particular UD v2. Many earlier works have developed Indonesian morphological analyzers, but as far as we know none conforms to the UD annotation guidelines. In building Aksara we follow the same approach as MorphInd, another Indonesian morphological analyzer, which uses a finite-state compiler named Foma. Aksara performs four tasks: 1) word segmentation, 2) lemmatization, 3) POS tagging, and 4) morphological feature analysis. To evaluate the quality of this tool, we used an Indonesian dependency treebank that conforms to UD v2 as the gold standard. We also compare the performance of Aksara with that of MorphInd by mapping MorphInd output to CoNLL-U format. The experimental results show that Aksara outperforms MorphInd on all four tasks. For word segmentation, Aksara has an accuracy of 96.9%; for case-sensitive lemmatization, an accuracy of 94.83%; for POS tagging, an F1-score of 88.2%; and for morphological feature analysis, nine of the 18 feature-value tags already implemented have an F1-score above 80%.

The objectives of our work are to propose the relevant Universal Dependencies (UD) morphological features for an Indonesian dependency treebank and to apply the proposed features to an existing treebank. We propose the use of 14 UD v2 features and the corresponding 27 feature-value tags. To evaluate the quality of the resulting treebank, we built models for lemmatization, POS tagging, morphological feature analysis, and dependency parsing using UDPipe, a trainable pipeline for tokenization, tagging, lemmatization, and dependency parsing of CoNLL-U files. For the lemmatization, POS tagging, and morphological feature analysis tasks, the resulting models have F1-scores above 93%, which indicates that the annotations in the LEMMA, UPOS, and FEATS columns of the treebank are already consistent. However, the accuracy of the resulting Indonesian dependency parser is still only 82.59% UAS and 79.83% LAS.
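For readers unfamiliar with the parsing metrics quoted here, the following is a minimal sketch of how UAS (unlabeled attachment score) and LAS (labeled attachment score) can be computed from two CoNLL-U files. It is not the evaluation script used by the authors. It assumes the gold and predicted files share the same tokenization, and the function names (`read_conllu`, `attachment_scores`) and the three-word example sentence are hypothetical, chosen only for illustration. UAS counts tokens whose HEAD column matches the gold standard; LAS additionally requires the DEPREL column to match.

```python
def read_conllu(text):
    """Return a list of sentences; each sentence is a list of (head, deprel) pairs."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        # CoNLL-U columns 7 and 8 (0-based 6 and 7) are HEAD and DEPREL.
        current.append((int(cols[6]), cols[7]))
    if current:
        sentences.append(current)
    return sentences


def attachment_scores(gold_text, pred_text):
    """Return (UAS, LAS) as fractions, assuming identical tokenization."""
    gold, pred = read_conllu(gold_text), read_conllu(pred_text)
    if len(gold) != len(pred):
        raise ValueError("sentence count mismatch")
    total = uas_hits = las_hits = 0
    for g_sent, p_sent in zip(gold, pred):
        if len(g_sent) != len(p_sent):
            raise ValueError("token count mismatch")
        for (g_head, g_rel), (p_head, p_rel) in zip(g_sent, p_sent):
            total += 1
            if g_head == p_head:
                uas_hits += 1
                if g_rel == p_rel:
                    las_hits += 1
    return uas_hits / total, las_hits / total


def make_sentence(rows):
    """Build a CoNLL-U sentence from (id, form, upos, head, deprel) tuples."""
    lines = []
    for idx, form, upos, head, deprel in rows:
        # ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
        cols = [str(idx), form, form.lower(), upos, "_", "_",
                str(head), deprel, "_", "_"]
        lines.append("\t".join(cols))
    return "\n".join(lines) + "\n\n"


# Hypothetical example: "Saya makan nasi" ("I eat rice").
GOLD = make_sentence([(1, "Saya", "PRON", 2, "nsubj"),
                      (2, "makan", "VERB", 0, "root"),
                      (3, "nasi", "NOUN", 2, "obj")])
# The prediction attaches every token correctly but mislabels the last relation.
PRED = make_sentence([(1, "Saya", "PRON", 2, "nsubj"),
                      (2, "makan", "VERB", 0, "root"),
                      (3, "nasi", "NOUN", 2, "nmod")])

uas, las = attachment_scores(GOLD, PRED)
print(f"UAS={uas:.2%} LAS={las:.2%}")
```

In this toy case all three heads are correct but one label is wrong, so LAS is lower than UAS. The same relationship holds for the reported 82.59% UAS and 79.83% LAS, since LAS can never exceed UAS.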