Our Work
Semantic Scholar Publications
We are an interdisciplinary research team focused on AI, HCI, ML, NLP, accessibility and computational social science in support of Semantic Scholar's mission of accelerating science. Our team is part of the Allen Institute for AI, a nonprofit research institute advancing AI for the common good.
Scholarly text is often laden with jargon, or specialized language that can facilitate efficient in-group communication within fields but hinder understanding for readers outside them.
Recent work has shown that infusing layout features into language models (LMs) improves processing of visually-rich documents such as scientific papers.
What is the effect of releasing a preprint of a paper before it is submitted for peer review? No randomized controlled trial has been conducted to answer this question.
This work introduces a simple yet effective framework for handling such complex queries by decomposing the query into individual clues, routing those as sub-queries to specialized retrievers, and ensembling the results.
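A minimal sketch of this decompose-route-ensemble idea is below; the clue types, retriever registry, and reciprocal-rank fusion are illustrative assumptions rather than the paper's actual implementation.

```python
# Illustrative sketch: split a complex query into typed clues, send each clue to a
# retriever specialized for that clue type, and fuse the ranked lists.
from collections import defaultdict

def ensemble_search(query, decompose, retrievers, top_k=10):
    """decompose(query) -> [(clue_type, clue_text), ...];
    retrievers maps clue_type -> callable returning a ranked list of doc ids."""
    scores = defaultdict(float)
    for clue_type, clue_text in decompose(query):
        ranked_doc_ids = retrievers[clue_type](clue_text)   # e.g. dense, sparse, metadata
        for rank, doc_id in enumerate(ranked_doc_ids):
            scores[doc_id] += 1.0 / (60 + rank)              # reciprocal-rank fusion
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```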
This work proposes a question-answering framework for decontextualization that allows for better handling of user information needs and preferences when determining the scope of rewriting, and presents results showing state-of-the-art LLMs under this framework remain competitive with end-to-end approaches.
It is shown how a simple embedding recycling (ER) technique that caches activations from an intermediate layer of a pretrained model and learns task-specific adapters on the later layers is broadly effective and reveals important areas for future work.
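A simplified sketch of the cached-activation idea follows: the lower layers of a frozen encoder run once, their hidden states are stored, and only a small adapter and task head are trained on top. The model name, layer index, and adapter shape are assumptions for illustration, not the paper's configuration.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased").eval()  # frozen

@torch.no_grad()
def cache_intermediate(texts, layer=6):
    """Cache hidden states from an intermediate layer (computed once, reused across tasks)."""
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    out = encoder(**batch, output_hidden_states=True)
    return out.hidden_states[layer], batch["attention_mask"]

class TaskHead(torch.nn.Module):
    """Lightweight task-specific layers trained on top of the cached activations."""
    def __init__(self, hidden=768, num_labels=2):
        super().__init__()
        self.adapter = torch.nn.Sequential(
            torch.nn.Linear(hidden, 128), torch.nn.ReLU(), torch.nn.Linear(128, hidden))
        self.classifier = torch.nn.Linear(hidden, num_labels)

    def forward(self, cached_states, mask):
        h = cached_states + self.adapter(cached_states)            # residual adapter
        pooled = (h * mask.unsqueeze(-1)).sum(1) / mask.sum(1, keepdim=True)
        return self.classifier(pooled)
```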
LongEval is presented, a set of guidelines for human evaluation of faithfulness in long-form summaries that addresses the following challenges: How can high inter-annotator agreement on faithfulness scores be achieved?
A neural baseline method is introduced for entity linking (EL) on scientific tables containing many out-of-knowledge-base mentions, and it significantly outperforms a state-of-the-art generic table EL method.
CiteSee is a paper reading tool that leverages a user’s publishing, reading, and saving activities to provide personalized visual augmentations and context around citations to help users prioritize their exploration.
In order to help scholars understand and follow a research topic, significant research has been devoted to creating systems that help schola...
This work designs a system, Relatedly, that scaffolds exploring and reading multiple related work paragraphs on a topic, with features including dynamic re-ranking and highlighting to spotlight unexplored dissimilar information, auto-generated descriptive paragraph headings, and low-lighting of redundant information.
It is argued that developing AI supports for expository writing has unique and exciting research challenges and can lead to high real-world impacts.
We present Queer in AI as a case study for community-led participatory design in AI. We examine how participatory design and intersectional ...
This work introduces Scim, a novel intelligent interface that helps experienced researchers skim – or rapidly review – a paper to attain a cursory understanding of its contents and discusses design considerations and tensions for the design of future intelligent skimming tools.
This paper describes the Semantic Reader Project, a collaborative effort across multiple institutions to explore automatic creation of dynamic reading interfaces for research papers, and develops and releases a production reading interface that will incorporate the best features as they mature.
Traditionally, writing assistance systems have focused on short or even single-word suggestions. Recently, large language models like GPT-3 ...
This paper combines public and proprietary data sources using state-of-the-art techniques for scholarly PDF content extraction and automatic knowledge graph construction to build the Semantic Scholar Academic Graph, the largest open scientific literature graph to date.
It is found that existing summarizers suffer large reductions in performance when applied as-is to this more realistic task, though training summarizers with retrieved inputs can reduce their sensitivity to retrieval errors.
Empirical results suggest that scale is not the only way forward and that novel algorithms can be a promising alternative; this work also yields Gen-A-tomic, a new corpus of generics that is the largest and highest-quality available to date.
This work considers design choices for the annotation interface used to elicit human judgments and their impact on reproducibility, and develops an automated mechanism for maintaining annotator quality via a probabilistic model that detects and excludes noisy annotators.
This paper proposes to train a GenQA model by transferring knowledge from a trained AS2 model, and to use the AS2 model's prediction scores for loss weighting and score-conditioned input/output shaping to aid the knowledge transfer.
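A rough sketch of score-based loss weighting under this setup: each training example's generation loss is scaled by the frozen AS2 model's confidence in the candidate sentence. The `as2_model.score` and `gen_model.nll` interfaces are hypothetical placeholders, not the paper's code.

```python
import torch

def weighted_genqa_loss(gen_model, as2_model, question, candidates, target_answer):
    """Scale each candidate's generation loss by the AS2 teacher's confidence."""
    losses, weights = [], []
    for cand in candidates:
        weight = torch.sigmoid(as2_model.score(question, cand))   # assumed teacher API
        loss = gen_model.nll(question=question, context=cand, target=target_answer)
        losses.append(loss)
        weights.append(weight.detach())        # do not backprop into the teacher
    losses, weights = torch.stack(losses), torch.stack(weights)
    return (weights * losses).sum() / weights.sum()
```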
This paper proposes three novel sentence-level transformer pre-training objectives that incorporate paragraph-level semantics within and across documents, to improve the performance of transformers for AS2, and mitigate the requirement of large labeled datasets.
This paper proposes a Multiple Heads Student architecture (named CERBERUS), an efficient neural network designed to distill an ensemble of large transformers into a single smaller model, rivaling the state-of-the-art large AS2 models that have 2.7x more parameters and run 2x slower.
It is shown how state-of-the-art models struggle to generalize across task formats, and that simple multi-task training fails to improve them, and a new approach that learns multiple embeddings per document, each tailored to a different format, can improve performance.
This paper introduces GenTyDiQA, an extension of the TyDiQA dataset with well-formed and complete answers for Arabic, Bengali, English, Japanese, and Russian questions, and presents the first cross-lingual answer sentence generation system (Cross-Lingual GenQA).
BLOOM is a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers and achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning.
It may be possible for readers of all abilities to organically leave traces in papers, and these traces can be used to facilitate navigation tasks, in particular for low-vision readers.
This work develops a tool, integrated into users’ reading process, that helps them leverage authors’ existing summarization of threads, typically found in introduction or related work sections, in order to situate their own work’s contributions.
This work introduces a new technique, polymorphic lenses, that improves exploratory search over a KG by obtaining new leverage from the existing preference models that KG-based systems maintain for recommending content.
SciFact-Open is presented, a new test collection designed to evaluate the performance of scientific claim verification systems on a corpus of 500K research abstracts, and it is found that systems developed on smaller corpora struggle to generalize to SciFact-Open, exhibiting performance drops of at least 15 F1.
It is empirically demonstrated that MulCo provides improved ability to fuse local and global contexts encoded using BERT and GNN compared to the current state of the art.
MultiVerS is presented, which predicts a fact-checking label and identifies rationales in a multitask fashion based on a shared encoding of the claim and full document context; this approach allows MultiVerS to perform weakly supervised domain adaptation by training on scientific documents labeled using high-precision heuristics.
This work presents FEB -- a standardized collection of four existing English-language datasets and associated metrics -- identifies the right prompting approach by extensively exploring natural language prompts on FEB, and demonstrates that making progress on few-shot self-rationalization is possible.
This work proposes a novel method for equipping long-context QA models with an additional sequence-level objective for better identification of the supporting evidence, via an additional contrastive supervision signal in finetuning.
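One way such a sequence-level contrastive objective could look, assuming pooled question and paragraph vectors are available; the temperature, pooling, and loss weighting are illustrative choices, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def evidence_contrastive_loss(question_vec, paragraph_vecs, evidence_mask, temperature=0.1):
    """question_vec: (d,); paragraph_vecs: (n, d); evidence_mask: (n,) bool tensor.
    Pulls the question toward evidence paragraphs and away from the rest."""
    sims = F.cosine_similarity(question_vec.unsqueeze(0), paragraph_vecs) / temperature
    log_probs = F.log_softmax(sims, dim=0)
    return -(log_probs[evidence_mask]).mean()   # NLL over the true evidence paragraphs

# Combined with the usual answer loss during finetuning, e.g.:
# total_loss = qa_loss + contrastive_weight * evidence_contrastive_loss(q, p, mask)
```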
A novel system automatically retrieves patient-specific literature based on intensive care unit (ICU) patient information, aggregates relevant papers, and fuses them with internal admission notes to form outcome predictions; it substantially boosts predictive accuracy on three challenging tasks in comparison to strong recent baselines.
This paper shows that popular pre-trained transformers perform poorly when used for fine-tuning on multi-candidate inference tasks, and proposes a new pre-training objective that models the paragraph-level semantics across multiple input sentences.
Multi-LexSum is introduced, a collection of 9,280 expert-authored summaries drawn from ongoing Civil Rights Litigation Clearinghouse (CRLC) writing; it presents a challenging multi-document summarization task given the length of the source documents, which often exceed two hundred pages per case.
The framework presented is a multi-party international governance structure focused on language data, incorporating the technical and organizational tools needed to support its work.
We present Aspire, a new scientific document similarity model based on matching fine-grained aspects.
We introduce new methods for incorporating VIsual LAyout (VILA) structures, e.g., the grouping of page texts into text lines or text blocks, into language models to further improve performance on automated scientific document understanding.
Grounding model predictions in clinically-relevant symptoms can improve generalizability while producing a model that is easier to inspect, and this approach can still perform competitively on in-domain data.
This tutorial aims at bringing interested NLP researchers up to speed about the recent and ongoing techniques for zero- and few-shot learning with pretrained language models.
This work presents a novel framework informed by linguistic theory to generate exemplars (specific cases when a generic holds true or false), and highlights the importance of linguistic theory-based controllability for generating exemplars, the insufficiency of knowledge bases as a source of exemplars, and the challenges exemplars pose for the task of natural language inference.
This work proposes scientific claim generation, the task of generating one or more atomic and verifiable claims from scientific sentences, and demonstrates its usefulness in zero-shot fact checking for biomedical claims; it also proposes CLAIMGEN-BART, a new supervised method for generating claims supported by the literature, as well as KBIN, a novel method for generating claim negations.
ACCoRD, an end-to-end system tackling the novel task of generating sets of descriptions of scientific concepts, takes advantage of the myriad ways a concept is mentioned across the scientific literature to produce distinct, diverse descriptions of target scientific concepts in terms of different reference concepts.
PRIMERA is introduced, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of fine-tuning labeled data.
A novel computational representation that automatically breaks up products into fine-grained functional facets is proposed that leads to a significant boost in search accuracy and in the quality of creative inspirations, outperforming strong baselines and state-of-the-art representations of product texts by 50-60%.
This work introduces multiple new methods for augmenting recommendations with textual relevance messages that highlight knowledge-graph connections between recommended papers and a user’s publication and interaction history and develops a novel method that highlights connections with proxy authors of interest to users.
This work contributes two datasets to the study of mentorship, one of which has over 300,000 ground truth academic mentor-mentee pairs obtained from multiple diverse, manually-curated sources, and linked to the Semantic Scholar (S2) knowledge graph.
We construct a faceted representation of authors with information gleaned from their papers and inferred author personas, and use it to develop an approach that locates commonalities ("bridges") and contrasts between scientists. This approach helps users discover authors considered useful for generating novel research directions.
A National Science Foundation Convergence Accelerator project is described to build a set of Knowledge Network Programming Infrastructure systems that address the frustratingly slow process of building, using, and scaling large knowledge networks.
A novel paper reading experience that integrates relevant information about follow-on work directly into a paper, allowing readers to learn about newer papers and see how a paper is discussed by its citing papers in the context of the reference paper.
PINOCCHIO is presented, a new decoding method that improves the consistency of a transformer-based abstractive summarizer by constraining beam search to avoid hallucinations.
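The general idea of constraining beam search with a consistency check can be sketched as follows; the groundedness test below is a crude placeholder for illustration only, not PINOCCHIO's actual consistency measure.

```python
import heapq

def constrained_beam_search(step_fn, source_tokens, beam_size=4, max_len=64, eos="</s>"):
    """step_fn(prefix) -> list of (token, log_prob) proposals from the summarizer."""
    grounded = set(source_tokens)
    beams = [(0.0, [])]                                   # (cumulative neg. log prob, tokens)
    for _ in range(max_len):
        candidates = []
        for neg_logp, tokens in beams:
            if tokens and tokens[-1] == eos:
                candidates.append((neg_logp, tokens))     # keep finished hypotheses
                continue
            for tok, logp in step_fn(tokens):
                if tok.isalpha() and tok not in grounded and tok != eos:
                    continue                              # prune ungrounded content words
                candidates.append((neg_logp - logp, tokens + [tok]))
        if not candidates:
            break
        beams = heapq.nsmallest(beam_size, candidates)
        if all(t and t[-1] == eos for _, t in beams):
            break
    return min(beams)[1] if beams else []
```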
This paper introduces LIMEADE, the first general framework that translates both positive and negative advice (expressed using high-level vocabulary such as that employed by post-hoc explanations) into an update to an arbitrary, underlying opaque model.
To improve access to medical papers, we introduce a novel interactive interface-Paper Plain-with four features powered by natural language processing: definitions of unfamiliar terms, in-situ plain language section summaries, a collection of key questions that guide readers to answering passages, and plain language summaries of the answering passages.
This work examines an extreme evaluation setting wherein only a single known relevant document per query is available for evaluation, and finds that although the predictions of these One-Shot Labelers frequently disagree with human assessments, the labels they produce yield a far more reliable ranking of systems than the single known relevant documents do alone.
Our goal is to bolster the ability of researchers and clinicians to keep track of difficulties, limitations and emerging hypotheses.
Few-shot NLP research lacks a unified, challenging-yet-realistic evaluation setup. In response, we introduce FLEX, a rigorous few-shot learning NLP benchmark and public leaderboard measuring four transfer types. We also present UniFew, a simple, competitive baseline that does not rely on heavy prompt engineering or complex meta-learning methods.
This paper proposes generating personalized scientific concept descriptions that are tailored to the user’s expertise and context and outlines a complete architecture for the task and releases an expert-annotated resource, ACCoRD.
This work releases MS^2 (Multi-Document Summarization of Medical Studies), a dataset of over 470K documents and 20K summaries derived from the scientific literature; it facilitates the development of systems that can assess and aggregate contradictory evidence across multiple studies, and is the first large-scale, publicly available multi-document summarization dataset in the biomedical domain.
A new pretrained language model for cross-document tasks.
We present SciA11y, a system that renders inaccessible scientific paper PDFs into HTML.
An extension of cross-document coreference with a referential hierarchy over mention clusters, in the scientific document domain. New task, dataset and models with applications in faceted document retrieval and knowledge base construction.
Integrating scientific language models and graph embeddings for boosting drug discovery.
In response to the challenge of author name disambiguation (AND), we present S2AND, a unified benchmark dataset for AND on scholarly papers, as well as an open-source reference model implementation.
PAWLS is a new annotation tool designed specifically for the PDF document format. PAWLS supports span-based textual annotation, N-ary relations and freeform, non-textual bounding boxes, all of which can be exported in convenient formats for training multi-modal machine learning models.
We address the task of citation text generation: given a pair of scientific documents, explain their relationship in natural language text in the manner of a citation from one text to the other.
We introduce ParsiNLU, the first benchmark in Persian language that includes a range of high-level tasks -- Reading Comprehension, Textual Entailment, etc. These datasets are collected in a multitude of ways, often involving manual annotations by native speakers.
We highlight three understudied phenomena for citation context analysis and release MultiCite, a new dataset of 12.6K citation contexts from 1.2K computational linguistics papers that fully models these phenomena.
We present an overview of the SCIVER shared task. In addition to surveying the participating systems, we provide several insights into modeling approaches to support continued progress and future research on scientific claim verification.
To navigate the collection of COVID19 papers from different domains, we present a KB of mechanisms relating to COVID19, to support domain-agnostic search and exploration of general activities, functions, influences and associations in these papers.
Qasper is a dataset of 5049 questions over 1585 NLP papers designed to facilitate document-grounded, information-seeking QA. Existing models that do well on other QA tasks do not perform well on these questions.
A new robust and lightweight tool for acquiring, managing, and performing typical operations over datasets used in IR, with a primary focus on textual datasets used for ad-hoc search.
This work conducts mixed-method user studies on three datasets, where an AI with accuracy comparable to humans helps participants solve a task (explaining itself in some conditions), and observes complementary improvements from AI augmentation that were not increased by explanations.
We introduce ScholarPhi, an augmented reading interface that brings definitions of technical terms and symbols to readers when and where they need them most.
Accessibility research has grown substantially in the past few decades, yet there has been no literature review of the field. To understand current and historical trends, we created and analyzed a dataset of accessibility papers appearing at CHI and ASSETS since ASSETS' founding in 1994.
CODE introduces neuron-level analyses and transformations aimed at identifying and removing redundant computation from the networks that compose an ensemble, which enables CODE to train large DNN ensembles in a fraction of the time and memory footprint needed by current techniques.
This paper provides a comprehensive overview of the structure and results of TREC-COVID, an information retrieval (IR) shared task to evaluate search on scientific literature related to COVID-19.
The majority of scientific papers are distributed in PDF, a format that poses challenges for accessibility, especially for blind and low vision (BLV) readers. We characterize the scope of this problem...
An open-source library for streamlining the usage of deep learning in document image analysis research and applications.
An analysis of 2.87 million computer science papers reveals that, if current trends continue, parity between the number of male and female authors will not be reached in this century. With optimistic projection models, gender parity is forecast to be reached by 2100 in CS, but projected to be reached within two to three decades in the biomedical literature.
It is argued that AI systems should be trained in a human-centered manner, directly optimized for team performance, and the benefit of modeling teamwork during training is shown through improvements in expected team utility across datasets, considering parameters such as human skill and the cost of mistakes.
In this paper, we present a new method for generating extended summaries of long papers.
This work introduces GENIE, an extensible human evaluation leaderboard, which brings the ease of leaderboards to text generation tasks. GENIE automatically posts leaderboard submissions to crowdsourcing platforms and presents both manual and automatic metrics on the leaderboard.
This review discusses the corpora, modeling resources, systems and shared tasks that have been introduced for COVID-19, and lists 39 systems that provide functionality such as search, discovery, visualization and summarization over the COVID-19 literature.
The results suggest that while CORD-19 exhibits a strong tilt toward recent and topically focused articles, the knowledge being explored to attack the pandemic encompasses a much longer time span and is very interdisciplinary.
This work adapts the Golden Rules Set (a language-specific set of sentence boundary exemplars), originally implemented as the Ruby gem pragmatic_segmenter, to Python, with additional improvements and functionality.
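Assuming the Python port referred to here is the pysbd package, usage looks roughly like this:

```python
import pysbd

seg = pysbd.Segmenter(language="en", clean=False)
sentences = seg.segment("Dr. Smith et al. (2020) report p < 0.05. See Fig. 2 for details.")
# Expected to split into two sentences without breaking on abbreviations like "Dr." or "et al."
print(sentences)
```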
The task of definition detection is important for scholarly papers, because papers often make use of technical terminology that may be unfamiliar to readers. We develop a new definition detection system, HEDDEx, that utilizes syntactic features, transformer encoders, and heuristic filters, and evaluate it on a standard sentence-level benchmark.
We construct SciFact, a dataset of 1.4K expert-written scientific claims paired with evidence-containing abstracts annotated with labels and rationales. We develop baseline models for SciFact, and demonstrate that these models benefit from combined training on a large dataset of claims about Wikipedia articles, together with the new SciFact data.
We introduce TLDR generation for scientific papers, a new automatic summarization task with high source compression and provide a new dataset and models for effective generation of TLDRs.
SciSight is a novel framework for exploratory search of COVID-19 research that integrates two key capabilities: first, exploring interactions between biomedical facets (e.g., proteins, genes, drugs, diseases, patient characteristics); and second, discovering groups of researchers and how they are connected.
We present a zero-shot ranking algorithm that adapts to COVID-related scientific literature. Our approach filters training data from another collection down to medical-related queries, uses a neural reranking model pre-trained on scientific text (SciBERT), and filters the target document collection.
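A rough sketch of this pipeline under stated assumptions: the medical-query filter, language filter, and reranker interface below are illustrative stand-ins, not the paper's exact components.

```python
def covid_zero_shot_pipeline(train_queries, target_docs, bm25, rerank, is_medical):
    """bm25(query, docs) -> ranked docs; rerank(query, text) -> relevance score."""
    # 1) keep only medical-related training queries from the source collection
    medical_queries = [q for q in train_queries if is_medical(q)]
    # ... fine-tune the SciBERT-based reranker on `medical_queries` (omitted) ...

    # 2) filter the target document collection (illustrative criterion)
    candidates = [d for d in target_docs if d.get("language") == "en"]

    # 3) first-stage retrieval followed by neural reranking
    def search(query, top_k=100):
        first_stage = bm25(query, candidates)[:top_k]
        return sorted(first_stage, key=lambda d: rerank(query, d["abstract"]), reverse=True)
    return search
```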
To address challenges in figure retrieval and figure-to-text alignment, we introduce MedICaT, a dataset of medical images in context.
A new comprehensive framework for Analyzing the Behavior of Neural IR ModeLs (ABNIRML), which includes new types of diagnostic tests that allow us to probe several characteristics, such as sensitivity to word order, that are not addressed by previous techniques.
This work investigates G-DAUG^C, a novel generative data augmentation method that aims to achieve more accurate and robust learning in the low-resource setting, and demonstrates that it produces a diverse set of fluent training examples, and that its selection and training approaches are important for performance.
Ontologies are critical to support the types of big data analysis necessary for kidney precision medicine, where heterogeneous clinical, imaging and biopsy data from diverse sources must be combined to define a patient's phenotype.
A novel, unsupervised method for extracting scientific concepts from papers, based on the intuition that each scientific concept is likely to be introduced or popularized by a single paper that is disproportionately cited by subsequent papers mentioning the concept.
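The stated intuition can be sketched as follows; the phrase-mention test and thresholds are illustrative assumptions, not the paper's method.

```python
from collections import Counter

def find_origin_paper(phrase, papers, min_mentions=20, min_share=0.5):
    """papers: iterable of dicts with 'text' and 'citations' (list of cited paper ids).
    Returns the id of the paper disproportionately cited by papers mentioning `phrase`."""
    mentioning = [p for p in papers if phrase.lower() in p["text"].lower()]
    if len(mentioning) < min_mentions:
        return None                                   # too rare to score reliably
    counts = Counter(c for p in mentioning for c in set(p["citations"]))
    if not counts:
        return None
    cited_id, n = counts.most_common(1)[0]
    share = n / len(mentioning)
    # a concept-defining paper is cited by most papers that mention the concept
    return cited_id if share >= min_share else None
```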