
A System for Constituent and Dependency Tree Linearization

5 pages. Published: February 16, 2023

Abstract

In this work, we introduce a framework that unifies existing implementations for the tasks of constituent and dependency parsing as sequence labeling problems. The system encodes both formalisms as sequences of one label per word, so they can be used with any existing general-purpose sequence labeling architecture. Specifically, we implement three linearizations to encode constituent trees and four linearizations for dependency trees. All encoding functions ensure completeness and injectivity. We also train a sequence labeling neural system to learn such encodings, and compare their effectiveness on standard constituent (PTB and SPMRL treebanks) and dependency parsing (a subset of treebanks from the UD collection) evaluation frameworks.
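To make the idea of "one label per word" concrete, the following is a minimal sketch of a single dependency linearization: each word is labeled with the relative offset to its head plus the relation name. This scheme is chosen for illustration only and is not claimed to be one of the four linearizations the paper implements; the function names `encode` and `decode` and the example sentence are hypothetical and are not taken from the system described here.

```python
from typing import List, Tuple

# One label per word: (head position minus word position, relation name).
Label = Tuple[int, str]


def encode(heads: List[int], rels: List[str]) -> List[Label]:
    """Linearize a dependency tree into one label per word.

    heads[i] is the 1-based position of the head of word i+1 (0 = root).
    This relative-offset scheme is an illustrative assumption.
    """
    return [(h - i, r) for i, (h, r) in enumerate(zip(heads, rels), start=1)]


def decode(labels: List[Label]) -> Tuple[List[int], List[str]]:
    """Recover head positions and relation names from the label sequence."""
    heads = [i + offset for i, (offset, _) in enumerate(labels, start=1)]
    rels = [rel for _, rel in labels]
    return heads, rels


# Hypothetical example: "She reads books", with "reads" as the root.
heads = [2, 0, 2]
rels = ["nsubj", "root", "obj"]
labels = encode(heads, rels)
recovered = decode(labels)
```

Because `decode(encode(...))` returns the original heads and relations, distinct trees always receive distinct label sequences, which is the injectivity property the abstract refers to. The resulting labels can then be treated as ordinary tags by a sequence labeling model.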

Keyphrases: constituent parsing, dependency parsing, natural language processing, nlp, sequence labeling, tree linearization

In: Alvaro Leitao and Lucía Ramos (editors). Proceedings of V XoveTIC Conference. XoveTIC 2022, vol 14, pages 83-87.

BibTeX entry
@inproceedings{XoveTIC2022:System_Constituent_Dependency_Tree,
  author    = {Diego Roca and David Vilares and Carlos Gómez-Rodríguez},
  title     = {A System for Constituent and Dependency Tree Linearization},
  booktitle = {Proceedings of V XoveTIC Conference. XoveTIC 2022},
  editor    = {Alvaro Leitao and Lucía Ramos},
  series    = {Kalpa Publications in Computing},
  volume    = {14},
  publisher = {EasyChair},
  bibsource = {EasyChair, https://easychair.org},
  issn      = {2515-1762},
  url       = {/publications/paper/kBBd},
  doi       = {10.29007/9m3p},
  pages     = {83-87},
  year      = {2023}}