CONF Fajcik_CASE@EMNLP2022_2022/IDIAP IDIAPers @ Causal News Corpus 2022: Extracting Cause-Effect-Signal Triplets via Pre-trained Autoregressive Language Model Fajcik, Martin Singh, Muskaan Zuluaga-Gomez, Juan Villatoro-Tello, Esaú Burdisso, Sergio Motlicek, Petr Smrz, Pavel http://publications.idiap.ch/index.php/publications/showcite/Fajcik_Idiap-RR-12-2022 Related documents The 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE @ EMNLP 2022) 2022 ACL Anthology (https://aclanthology.org/2022.case-1.10) https://preview.aclanthology.org/emnlp-22-ingestion/2022.case-1.10/ URL In this paper, we describe our shared task submissions for Subtask 2 in CASE-2022, Event Causality Identification with Causal News Corpus. The challenge focused on the automatic detection of all cause-effect-signal spans present in a sentence from news media. We detect cause-effect-signal spans in a sentence using T5, a pre-trained autoregressive language model. We iteratively identify all cause-effect-signal span triplets, always conditioning the prediction of the next triplet on the previously predicted ones. To predict the triplet itself, we consider different causal relationships such as cause→effect→signal. Each triplet component is generated via a language model conditioned on the sentence, the previous parts of the current triplet, and previously predicted triplets. Despite training on an extremely small dataset of 160 samples, our approach achieved competitive performance, placing second in the competition. Furthermore, we show that assuming either cause→effect or effect→cause order achieves similar results.
REPORT Fajcik_Idiap-RR-12-2022/IDIAP IDIAPers @ Causal News Corpus 2022: Extracting Cause-Effect-Signal Triplets via Pre-trained Autoregressive Language Model Fajcik, Martin Singh, Muskaan Zuluaga-Gomez, Juan Villatoro-Tello, Esaú Burdisso, Sergio Motlicek, Petr Smrz, Pavel EXTERNAL http://publications.idiap.ch/attachments/reports/2022/Fajcik_Idiap-RR-12-2022.pdf PUBLIC Idiap-RR-12-2022 2022 Idiap November 2022 In this paper, we describe our shared task submissions for Subtask 2 in CASE-2022, Event Causality Identification with Causal News Corpus (CNC). The challenge focused on the automatic detection of all cause-effect-signal spans present in a sentence from news media. In this work, we detect cause-effect-signal spans in a sentence using T5, a pre-trained autoregressive language model. We iteratively detect all cause-effect-signal span triplets, always conditioning the prediction of the next triplet on the previously predicted ones. To predict the triplet itself, we consider different causal relationships such as cause -> effect -> signal. Each triplet component (i.e., cause, effect, or signal) is generated via a language model conditioned on the sentence, the previous parts of the current triplet, and previously predicted triplets. Despite training on an extremely small dataset of 160 samples, we show that our approach achieved competitive performance, placing 2nd in the competition. Furthermore, we show that assuming either cause -> effect or effect -> cause order achieves similar results. Our results further indicate that whichever component of the triplet, cause or effect, is generated first achieves stronger performance. Our code and model predictions will be released online.

</datafield>

<subfield code="a">Fajcik_CASE@EMNLP2022_2022/IDIAP</subfield>

</datafield>

<subfield code="a">IDIAPers @ Causal News Corpus 2022: Extracting Cause-Effect-Signal Triplets via Pre-trained Autoregressive Language Model</subfield>

</datafield>

<subfield code="a">Fajcik, Martin</subfield>

</datafield>

<subfield code="a">Singh, Muskaan</subfield>

</datafield>

<subfield code="a">Zuluaga-Gomez, Juan</subfield>

</datafield>

<subfield code="a">Villatoro-Tello, Esaú</subfield>

</datafield>

<subfield code="a">Burdisso, Sergio</subfield>

</datafield>

<subfield code="a">Motlicek, Petr</subfield>

</datafield>

<subfield code="a">Smrz, Pavel</subfield>

</datafield>

<subfield code="u">http://publications.idiap.ch/index.php/publications/showcite/Fajcik_Idiap-RR-12-2022</subfield>

<subfield code="z">Related documents</subfield>

</datafield>

<subfield code="a">The 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE @ EMNLP 2022)</subfield>

</datafield>

</datafield>

<subfield code="a">ACL Anthology (https://aclanthology.org/2022.case-1.10)</subfield>

</datafield>

<subfield code="u">https://preview.aclanthology.org/emnlp-22-ingestion/2022.case-1.10/</subfield>

</datafield>

<subfield code="a">In this paper, we describe our shared task submissions for Subtask 2 in CASE-2022, Event Causality Identification with Causal News Corpus. The challenge focused on the automatic detection of all cause-effect-signal spans present in a sentence from news media. We detect cause-effect-signal spans in a sentence using T5, a pre-trained autoregressive language model. We iteratively identify all cause-effect-signal span triplets, always conditioning the prediction of the next triplet on the previously predicted ones. To predict the triplet itself, we consider different causal relationships such as cause→effect→signal. Each triplet component is generated via a language model conditioned on the sentence, the previous parts of the current triplet, and previously predicted triplets. Despite training on an extremely small dataset of 160 samples, our approach achieved competitive performance, placing second in the competition. Furthermore, we show that assuming either cause→effect or effect→cause order achieves similar results.</subfield>

</datafield>

</record>

<subfield code="a">REPORT</subfield>

</datafield>

<subfield code="a">Fajcik_Idiap-RR-12-2022/IDIAP</subfield>

</datafield>

<subfield code="a">IDIAPers @ Causal News Corpus 2022: Extracting Cause-Effect-Signal Triplets via Pre-trained Autoregressive Language Model</subfield>

</datafield>

<subfield code="a">Fajcik, Martin</subfield>

</datafield>

<subfield code="a">Singh, Muskaan</subfield>

</datafield>

<subfield code="a">Zuluaga-Gomez, Juan</subfield>

</datafield>

<subfield code="a">Villatoro-Tello, Esaú</subfield>

</datafield>

<subfield code="a">Burdisso, Sergio</subfield>

</datafield>

<subfield code="a">Motlicek, Petr</subfield>

</datafield>

<subfield code="a">Smrz, Pavel</subfield>

</datafield>

<subfield code="i">EXTERNAL</subfield>

<subfield code="u">http://publications.idiap.ch/attachments/reports/2022/Fajcik_Idiap-RR-12-2022.pdf</subfield>

<subfield code="x">PUBLIC</subfield>

</datafield>

<subfield code="a">Idiap-RR-12-2022</subfield>

</datafield>

<subfield code="b">Idiap</subfield>

</datafield>

<subfield code="d">November 2022</subfield>

</datafield>

<subfield code="a">In this paper, we describe our shared task submissions for Subtask 2 in CASE-2022, Event Causality Identification with Causal News Corpus (CNC). The challenge focused on the automatic detection of all cause-effect-signal spans present in a sentence from news media. In this work, we detect cause-effect-signal spans in a sentence using T5, a pre-trained autoregressive language model. We iteratively detect all cause-effect-signal span triplets, always conditioning the prediction of the next triplet on the previously predicted ones. To predict the triplet itself, we consider different causal relationships such as cause -> effect -> signal. Each triplet component (i.e., cause, effect, or signal) is generated via a language model conditioned on the sentence, the previous parts of the current triplet, and previously predicted triplets. Despite training on an extremely small dataset of 160 samples, we show that our approach achieved competitive performance, placing 2nd in the competition. Furthermore, we show that assuming either cause -> effect or effect -> cause order achieves similar results. Our results further indicate that whichever component of the triplet, cause or effect, is generated first achieves stronger performance. Our code and model predictions will be released online.</subfield>

</datafield>

</record>

</collection>