ARTICLE Mohammadshahi_TACL_2021/IDIAP
Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement
Mohammadshahi, Alireza; Henderson, James
Transactions of the Association for Computational Linguistics, vol. 9 (2021)
PDF: https://publications.idiap.ch/attachments/papers/2021/Mohammadshahi_TACL_2021.pdf
URL: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00358/97778/Recursive-Non-Autoregressive-Graph-to-Graph
DOI: https://doi.org/10.1162/tacl_a_00358
Related documents: Mohammadshahi_TACL_2020 (https://publications.idiap.ch/index.php/publications/showcite/Mohammadshahi_TACL_2020)

Abstract: We propose the Recursive Non-Autoregressive Graph-to-Graph Transformer architecture (RNGTr) for the iterative refinement of arbitrary graphs through the recursive application of a non-autoregressive Graph-to-Graph Transformer, and apply it to syntactic dependency parsing. We demonstrate the power and effectiveness of RNGTr on several dependency corpora, using a refinement model pre-trained with BERT. We also introduce Syntactic Transformer (SynTr), a non-recursive parser similar to our refinement model. RNGTr improves the accuracy of a variety of initial parsers on 13 languages from the Universal Dependencies Treebanks, the English and Chinese Penn Treebanks, and the German CoNLL 2009 corpus, even improving over the new state-of-the-art results achieved by SynTr, and thus significantly advances the state of the art for all corpora tested.

ARTICLE Mohammadshahi_TACL_2020/IDIAP
Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement
Mohammadshahi, Alireza; Henderson, James
Keywords: Natural language processing, NLP, Parsing, Transformer
Transactions of the Association for Computational Linguistics (under submission), 2020

Abstract: We propose the Recursive Non-Autoregressive Graph-to-Graph Transformer architecture (RNG-Tr) for the iterative refinement of arbitrary graphs through the recursive application of a non-autoregressive Graph-to-Graph Transformer, and apply it to syntactic dependency parsing. The Graph-to-Graph Transformer architecture of Mohammadshahi and Henderson (2019) has previously been used for autoregressive graph prediction, but here we use it to predict all edges of the graph independently, conditioned on a previous prediction of the same graph. We demonstrate the power and effectiveness of RNG-Tr on several dependency corpora, using a refinement model pre-trained with BERT (Devlin et al., 2018). We also introduce Dependency BERT (DepBERT), a non-recursive parser similar to our refinement model. RNG-Tr is able to improve the accuracy of a variety of initial parsers on 13 languages from the Universal Dependencies Treebanks and the English and Chinese Penn Treebanks, even improving over the new state-of-the-art results achieved by DepBERT, and thus significantly advances the state of the art for all corpora tested.
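To make the refinement idea in the abstracts concrete, below is a minimal sketch of an RNG-Tr-style loop: a toy scorer predicts every dependency head independently (non-autoregressively), conditioned on the previously predicted graph, and the loop repeats until the graph stops changing or an iteration budget is exhausted. Everything here (ToyRefiner, refine, the dot-product scorer, the arc-bias conditioning, and the fixed-point stopping rule) is an illustrative assumption for exposition, not the authors' actual model or code.

import torch
import torch.nn as nn

class ToyRefiner(nn.Module):
    """Toy stand-in for the Graph-to-Graph Transformer: scores every
    candidate head for every token, conditioned on the previous graph."""
    def __init__(self, vocab_size: int, d: int = 64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, d)
        self.head_proj = nn.Linear(d, d)
        self.dep_proj = nn.Linear(d, d)
        # Learned bonus for arcs predicted in the previous iteration;
        # this is the (much simplified) conditioning on the previous graph.
        self.prev_arc_bias = nn.Parameter(torch.zeros(1))

    def forward(self, tokens: torch.Tensor, prev_heads: torch.Tensor) -> torch.Tensor:
        x = self.emb(tokens)                             # (n, d)
        # scores[j, i]: score of token i being the head of token j
        scores = self.dep_proj(x) @ self.head_proj(x).T  # (n, n)
        n = tokens.size(0)
        prev = torch.zeros(n, n)
        prev[torch.arange(n), prev_heads] = 1.0          # one-hot previous graph
        return scores + self.prev_arc_bias * prev

def refine(model: nn.Module, tokens: torch.Tensor,
           init_heads: torch.Tensor, max_iters: int = 3) -> torch.Tensor:
    """Recursively re-predict the whole graph until it stops changing."""
    heads = init_heads
    for _ in range(max_iters):
        scores = model(tokens, heads)
        new_heads = scores.argmax(dim=-1)  # non-autoregressive: all heads at once
        if torch.equal(new_heads, heads):  # fixed point reached
            break
        heads = new_heads
    return heads

# Usage: token 0 plays the role of ROOT; start from a trivial initial
# parse that attaches every token to ROOT, then refine it.
tokens = torch.tensor([0, 7, 3, 11, 5])
init_heads = torch.zeros(5, dtype=torch.long)
model = ToyRefiner(vocab_size=32)
print(refine(model, tokens, init_heads))

In the paper, conditioning on the previous graph happens inside the Graph-to-Graph Transformer's attention mechanism (with a BERT-initialized refinement model) rather than through the simple arc bias above; the sketch preserves only the recursive, non-autoregressive loop structure.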