Meet Our Team

Semantic Scholar Research Team

We are an interdisciplinary research team focused on AI, HCI, ML, NLP, accessibility and computational social science in support of Semantic Scholar's mission of accelerating science. Our team is part of the Allen Institute for AI, a nonprofit research institute advancing AI for the common good.

Follow us on Twitter for research updates!

Our Team

Daniel S. Weld (Dan Weld)

Chief Scientist

Doug Downey

Senior Director

Maria Antoniak

Young Investigator

Tal August

Young Investigator

Jonathan Bragg

Senior Research Scientist

Arman Cohan

Research Scientist

Joseph Chee Chang

Research Scientist

Sergey Feldman

Lead Applied Research Scientist

Tom Hope

Research Scientist

Kyle Lo

Lead Applied Research Scientist

Aakanksha Naik

Research Scientist

Ben Newman

Predoctoral Young Investigator

Pao Siangliulue

Applied Research Scientist

Amanpreet Singh

Research Engineer

Luca Soldaini

Senior Applied Research Scientist

David Wadden

Research Scientist

Lucy Lu Wang

Research Scientist

Amy X. Zhang

Research Scientist

Alumni

Andrew Head

Young Investigator

2021
⇨ Assistant Professor at University of Pennsylvania

Daniel King

Applied Research Scientist

2018-2021
⇨ Research Engineer at MosaicML

Haokun Liu

Predoctoral Young Investigator

2021
⇨ PhD student at University of North Carolina at Chapel Hill

Isabel Cachola

Predoctoral Young Investigator

2019-2020
⇨ PhD student at Johns Hopkins University

Iz Beltagy

Senior Research Scientist

2017-2021
⇨ AllenNLP team at AI2

Jon Saad-Falcon

Predoctoral Young Investigator

2021-2022
⇨ Research Assistant at Stanford University

Katherine Keith

Young Investigator

2021-2022
⇨ Assistant Professor at Williams College

Mark Neumann

Senior Research Engineer

2020-2021
⇨ Entrepreneur at AI2 Incubator

Sonia Murthy

Predoctoral Young Investigator

2021-2022
⇨ PhD student at Harvard University

Waleed Ammar

Research Manager

2016-2019
⇨ Senior Research Scientist at Google

Zejiang (Shannon) Shen

Applied Research Scientist

2021-2022
⇨ PhD student at Massachusetts Institute of Technology

Interns

2023

  • Raymond Fok (University of Washington)
  • John Giorgi (University of Toronto)
  • Yue Guo (University of Washington)
  • Joe (Chao-Chun) Hsu (University of Chicago)
  • Hang Jiang (Massachusetts Institute of Technology)
  • Jeongyeon Kim (Stanford University)
  • Hyunji Lee (KAIST)
  • Yoonjoo Lee (KAIST)
  • Monica Munnangi (Northeastern University)
  • Marissa Radensky (University of Washington)
  • Nikhil Singh (Massachusetts Institute of Technology)
  • Orion Weller (Johns Hopkins University)
  • Fangyuan Xu (University of Texas at Austin)

2022

  • Nayha Auradkar (University of Washington): Scientific Text Mining for Alt Text Content
  • Hancheng Cao (Stanford University): The Death and Life of Great Research Artifacts
  • Cathy Chen (University of California Berkeley): Language Model Generation to Unseen Layouts
  • Chris Coleman (Northwestern University): Embedding Recycling
  • Carl Edwards (University of Illinois at Urbana-Champaign): Hypothesis Generation in Biomedicine
  • John Giorgi (University of Toronto): Retrieval in Multidocument Summarization
  • Lucy Li (University of California Berkeley): Language-Driven Map of Science
  • Yuze Lou (University of Michigan): Entity Linking in Scientific Tables
  • Hyeonsu Kang (Carnegie Mellon University): Reference-driven Social Recommendations
  • Tae Soo Kim (Korea Advanced Institute of Science & Technology): Localized Videos in Papers
  • Kalpesh Krishna (University of Massachusetts Amherst): Guidelines for Human Evaluation in Long-form Summarization
  • Rabeeh Mahabadi (EPFL - Swiss Federal Institute of Technology Lausanne)
  • Srishti Palani (University of California San Diego): Reusing Related Work Sections
  • Shaurya Rohatgi (Penn State University): Scientific Paper Clustering
  • Nouran Soliman (Massachusetts Institute of Technology): Shareable Lists of Scientific Papers
  • Qingyun Wang (University of Illinois at Urbana-Champaign): Knowledge Graph Link Prediction

2021

  • Tal August (University of Washington): Addressing barriers in medical literature for lay people
  • Avi Caciularu (Bar-Ilan University): Token-level Contrastive Learning for Multi-hop Question Answering
  • Sanjana Chintalapati (University of Washington): Increasing scientific figure accessibility with alt-text
  • Mike D'Arcy (Northwestern University): Joint Inference for Scientific Knowledge Graph Representation
  • Raymond Fok (University of Washington): Semantic Skimmer
  • Varun Gangal (Carnegie Mellon University): Interpretable text generation with operations
  • Zhipeng Hou (Northwestern University): Retrieve to achieve (better embeddings)
  • Hyeonsu Kang (Carnegie Mellon University): Synthetic Social Recommendations for Improving Access and Mitigating Bias in Scientific Discovery
  • Sheshera Mysore (University of Massachusetts Amherst): Explanation Informed Fine-grained Scientific Document Similarity
  • Aakanksha Naik (Carnegie Mellon University): Improving Clinical Outcome Prediction Using Evidence from Medical Literature
  • Napol Rachatasumrit (Carnegie Mellon University): Bootstrapping Reader Highlights
  • Shaurya Rohatgi (Penn State University): Modeling and Inferring Academic Mentorship on Semantic Scholar
  • Chantal Shaib (Northeastern University): BioWordPiece: Teaching WordPiece about morphology
  • Dustin Wright (University of Copenhagen): Decomposing Scientific Claims to Verifiable Units
  • Wen Xiao (University of British Columbia): Multi-document Summarization for Scientific Papers

2020

  • Arie Cattan (Bar-Ilan University): Scientific Concept Induction
  • Jay DeYoung (Northeastern University): Summarizing Evidence across Studies: Automating Systematic Review Generation
  • Raymond Fok (University of Washington): Recovery for AI-Infused UIs
  • Harmanpreet Kaur (University of Michigan): FeedLens: Trainable Research Feeds for Rapid Navigation of Paper List Objects
  • Rafal Kocielnik (University of Washington): Motivating Scientific Reading Habits with Interventions Based on Text Mining and Behavioral Psychology
  • Anne Lauscher (University of Mannheim): Citation Context Analysis Revisited
  • Sean MacAvaney (Georgetown University): Analyzing Behavior of Neural IR Models
  • Jason Portenoy (University of Washington): SciSight++: Visually exploring the network of CS authors, methods and tasks extracted from 10M papers
  • Marissa Radensky (University of Washington): Exploring the Role of Local and Global Explanations in Recommender Systems
  • Shivashankar Subramanian (University of Melbourne): An In-Depth Analysis of Author Disambiguation
  • Mohit Yadav (University of Massachusetts Amherst): End-to-End Document-Level IE

2019

  • Jim Chen (University of Washington): Building Better Topic Pages for All
  • Mike D'Arcy (Northwestern University): RefBERT: Representing References in Scientific Documents
  • Andrew Head (University of California Berkeley): Augmenting the Reading Experience for Scientific Papers from arXiv
  • Sarthak Jain (Northeastern University): Structured Results Extraction
  • Ben Lee (University of Washington): Explanation-Based Tuning of Opaque Machine Learners with Application to Paper Recommendation
  • Susan (Xueqing) Liu (University of Illinois Urbana-Champaign): A Quantitative Study on Citation Preferences in Scientific Literature
  • Pouya Pezeshkpour (University of California Irvine): Question Generation and Targeting for Assisted Flashcard Study of Scientific Papers
  • David Wadden (University of Washington): SciFact: A Dataset for Verification of Scientific Claims

2018

  • Christine Betts (University of Washington): GrapAL: Connecting the Dots in Scientific Literature
  • Muthu Chandrasekaran (National University of Singapore): Aggregation of Tabular Empirical Results from Scientific Documents
  • Xuezhe Ma (Carnegie Mellon University): Sci-Defi: Identifying Scientific Concept Definitions in Scholarly Documents
  • Amandalynne Paullada (University of Washington): Improving Clinical Trial Information Extraction with Weak Supervision

2017

  • Titipat Achakulvisut (University of Pennsylvania): Claim Extraction in Biomedical Publications Using Deep Discourse Model and Transfer Learning
  • Ahmed Elgohary (University of Maryland): Relation Extraction from Scientific Documents
  • Lucy Lu Wang (University of Washington): Ontology Alignment in the Biomedical Domain Using Entity Definitions and Context