Span-based discontinuous constituency parsing: a family of exact chart-based algorithms with time complexities from O(n^6) down to O(n^3) - Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur Accéder directement au contenu
Communication Dans Un Congrès Année : 2020

Span-based discontinuous constituency parsing: a family of exact chart-based algorithms with time complexities from O(n^6) down to O(n^3)

Caio Corro

Résumé

We introduce a novel chart-based algorithm for span-based parsing of discontinuous constituency trees of block degree two, including ill-nested structures. In particular, we show that we can build variants of our parser with smaller search spaces and time complexities ranging from O(n^6) down to O(n^3). The cubic time variant covers 98% of constituents observed in linguistic treebanks while having the same complexity as continuous constituency parsers. We evaluate our approach on German and English treebanks (Negra, Tiger, and DPTB) and report state-of-the-art results in the fully supervised setting. We also experiment with pre-trained word embeddings and Bertbased neural networks.
Fichier principal
Vignette du fichier
2020.emnlp-main.219.pdf (723.42 Ko) Télécharger le fichier
Origine : Fichiers éditeurs autorisés sur une archive ouverte

Dates et versions

hal-03029253 , version 1 (28-11-2020)

Identifiants

  • HAL Id : hal-03029253 , version 1

Citer

Caio Corro. Span-based discontinuous constituency parsing: a family of exact chart-based algorithms with time complexities from O(n^6) down to O(n^3). Empirical Methods in Natural Language Processing, Nov 2020, Punta Cana (virtual), Dominican Republic. ⟨hal-03029253⟩
52 Consultations
34 Téléchargements

Partager

Gmail Facebook X LinkedIn More