AI Research Path: Google AI Language Team - Kristina Toutanova

Learn about the research path of Google AI Language research scientist Kristina Toutanova, and explore notable publications from throughout her career on modeling the structure of natural language using machine learning.

By Crossminds on March 15, 2021
Category: Research Spotlights

The Crossminds AI Research Path series aims to introduce the notable work of researchers from technology and science institutes worldwide. This edition delves into the research path of Kristina Toutanova, a research scientist at Google AI Language and an affiliate faculty member at the University of Washington. As a critical part of Google Research, the Language Team comprises multiple research groups working on a wide range of natural language understanding and generation projects to address the needs of current and future Google products.

Kristina obtained her Ph.D. from the Computer Science Department at Stanford University, where she was advised by Christopher Manning, and her M.Sc. in Computer Science from Sofia University, Bulgaria. Before joining Google in 2017, she was a researcher at Microsoft Research in Redmond. Read on for highlights of Kristina's research throughout her career.

Top Knowledge Areas

Throughout her career, Kristina has focused on modeling the structure of natural language using machine learning, most recently in the areas of representation learning, question answering, information retrieval, semantic parsing, and knowledge base completion.



Top Co-Authors

Christopher Manning (Stanford); Kenton Lee (Google); Ming-Wei Chang (Google); Scott Wen-tau Yih (Facebook); Hoifung Poon (Microsoft)

Top Publications

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (NAACL 2019)

Authors: Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova (Google AI Language)

Summary: BERT (Bidirectional Encoder Representations from Transformers) is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. [Paper]
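To make the "one additional output layer" recipe concrete, here is a minimal fine-tuning sketch using the Hugging Face transformers library (a later toolkit, not the authors' original code); the checkpoint name and the two-class task are illustrative assumptions:

```python
# Minimal sketch: a pretrained BERT encoder plus one task-specific linear layer.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")

# The single additional output layer: a linear classifier over the [CLS] vector.
num_labels = 2  # e.g., entailment vs. non-entailment (assumed task)
classifier = torch.nn.Linear(encoder.config.hidden_size, num_labels)

inputs = tokenizer("A man is playing a guitar.", "A person makes music.",
                   return_tensors="pt")
outputs = encoder(**inputs)
cls_repr = outputs.last_hidden_state[:, 0]   # representation of the [CLS] token
logits = classifier(cls_repr)

# Encoder and classifier are fine-tuned jointly on labeled task data.
loss = torch.nn.functional.cross_entropy(logits, torch.tensor([0]))
loss.backward()
```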

Watch the paper presentation video:

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (NAACL 2019)


Natural Questions: A Benchmark for Question Answering Research (Transactions of the Association for Computational Linguistics, 2019)

Authors: Tom Kwiatkowski, Jennimaria Palomaki, Olivia Rhinehart, Michael Collins, Ankur Parikh, Chris Alberti, Danielle Epstein, Illia Polosukhin, Matthew Kelcey, Jacob Devlin, Kenton Lee, Kristina N. Toutanova, Llion Jones, Ming-Wei Chang, Andrew Dai, Jakob Uszkoreit, Quoc Le, and Slav Petrov (Google)

Summary: The Natural Questions corpus is a question answering dataset. The public release consists of 307,373 training examples with single annotations, 7,830 examples with 5-way annotations for development, and a further 7,842 5-way annotated examples sequestered as test data. [Paper]

Watch the paper presentation video:

Natural Questions: A Benchmark for Question Answering Research (Transactions of the Association for Computational Linguistics, 2019)

BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions (NAACL 2019)

Authors: Christopher Clark, Kenton Lee, Ming-Wei Chang, Tom Kwiatkowski, Michael Collins, Kristina Toutanova (University of Washington, Google AI)

Summary: This paper studies yes/no questions that are generated in unprompted and unconstrained settings. We build a reading comprehension dataset, BoolQ, of such questions, and show that they are unexpectedly challenging. We also explore the effectiveness of a range of transfer learning baselines. [Paper]

Watch the paper presentation video:

BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions (NAACL 2019)

Cross-Sentence N-ary Relation Extraction with Graph LSTMs (Transactions of the Association for Computational Linguistics, 2017)

Authors: Nanyun Peng, Hoifung Poon, Chris Quirk, Kristina Toutanova, and Wen-tau Yih (Johns Hopkins University, Microsoft Research, Google Research)

Summary: We explore a general relation extraction framework based on graph long short-term memory networks (graph LSTMs) that can be easily extended to cross-sentence n-ary relation extraction. The graph formulation provides a unified way of exploring different LSTM approaches and incorporating various intra-sentential and inter-sentential dependencies, such as sequential, syntactic, and discourse relations. [Paper]
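As a rough illustration of the graph formulation described above (hypothetical code, not the authors'), the sketch below builds a document-level graph whose nodes are tokens and whose edges combine sequential adjacency with parser-provided dependency arcs; a graph LSTM would then propagate hidden states along these edges:

```python
# Hypothetical document-graph construction for cross-sentence relation extraction.
from collections import defaultdict

def build_document_graph(sentences, dependency_arcs):
    """sentences: list of token lists; dependency_arcs: (head, dep) pairs with
    document-level token indices (assumed to come from a dependency parser)."""
    edges = defaultdict(set)
    offset = 0
    for sent in sentences:
        for i in range(len(sent) - 1):        # sequential (adjacent-word) edges
            edges[offset + i].add(offset + i + 1)
        offset += len(sent)
    for head, dep in dependency_arcs:         # syntactic dependency edges
        edges[head].add(dep)
    return edges                              # adjacency for state propagation

graph = build_document_graph([["Drugs", "inhibit", "EGFR"], ["It", "mutates"]],
                             [(1, 0), (1, 2), (4, 3)])
```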

Watch the paper presentation video:

Cross-Sentence N-ary Relation Extraction with Graph LSTMs (Transactions of the Association for Computational Linguistics, 2017)

A Nested Attention Neural Hybrid Model for Grammatical Error Correction (ACL 2017)

Authors: Jianshu Ji, Qinlong Wang, Kristina Toutanova, Yongen Gong, Steven Truong, Jianfeng Gao (Microsoft AI & Research, Google Research)

Summary: We propose a new hybrid neural model with nested attention layers for grammatical error correction (GEC). Experiments show that, by incorporating both word- and character-level information, the new model can effectively correct errors at both levels, and that it significantly outperforms previous neural models for GEC as measured on the standard CoNLL-14 benchmark dataset. [Paper]
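The hybrid idea can be shown with a deliberately simplified, hypothetical sketch (the actual model handles both levels with nested attention inside a single sequence-to-sequence network): handle in-vocabulary tokens at the word level and fall back to the character level for rare or misspelled tokens:

```python
# Toy illustration of the word/character back-off; all names are hypothetical.
WORD_VOCAB = {"the", "cat", "sat", "on", "mat"}
SPELL_FIXES = {"teh": "the", "mta": "mat"}    # toy character-level corrections

def word_level_correct(token):
    return token                               # stand-in for the word-level decoder

def char_level_correct(token):
    return SPELL_FIXES.get(token, token)       # stand-in for the character decoder

def correct_sentence(tokens):
    return [word_level_correct(t) if t in WORD_VOCAB else char_level_correct(t)
            for t in tokens]

print(correct_sentence("teh cat sat on the mta".split()))
# -> ['the', 'cat', 'sat', 'on', 'the', 'mat']
```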

Watch the paper presentation video:

A Nested Attention Neural Hybrid Model for Grammatical Error Correction (ACL 2017)

A Dataset and Evaluation Metrics for Abstractive Compression of Sentences and Short Paragraphs (EMNLP 2016)

Authors: Kristina Toutanova, Chris Brockett, Ke M. Tran, Saleema Amershi (Microsoft)

Summary: We introduce a manually created, multi-reference dataset for abstractive sentence and short paragraph compression. Multi-reference evaluation metrics are shown to offer a significant advantage over single-reference metrics. [Paper]
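The advantage of multiple references is easy to illustrate: score each system output against its closest reference, so that any one of several valid compressions can receive full credit. The unigram-F1 scorer below is a stand-in for illustration, not the paper's exact metric:

```python
# Sketch of multi-reference scoring: take the max over the available references.
def unigram_f1(candidate: str, reference: str) -> float:
    cand, ref = set(candidate.split()), set(reference.split())
    overlap = len(cand & ref)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(cand), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

def multi_reference_score(candidate: str, references: list[str]) -> float:
    return max(unigram_f1(candidate, ref) for ref in references)

references = ["The senator backed the bill.",
              "The bill had the senator's backing."]
print(multi_reference_score("The senator backed the bill.", references))  # 1.0
```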

Watch the paper presentation video:

A Dataset and Evaluation Metrics for Abstractive Compression of Sentences and Short Paragraphs (EMNLP 2016)

Representing Text for Joint Embedding of Text and Knowledge Bases (EMNLP 2015)

Authors: Kristina Toutanova, Danqi Chen, Patrick Pantel, Hoifung Poon, Pallavi Choudhury, Michael Gamon (Microsoft Research, Stanford University)

Summary: We propose a model that captures the compositional structure of textual relations, and jointly optimizes entity, knowledge base, and textual relation representations. The proposed model significantly improves performance over a model that does not share parameters among textual relations with common sub-structure. [Paper]
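The joint embedding idea can be sketched schematically (assumed code, not the authors'): entities, knowledge base relations, and textual relations share one vector space, and a textual relation is encoded compositionally from its words, so parameters are shared across textual relations with common sub-structure. A DistMult-style bilinear-diagonal score and mean pooling stand in here for the paper's convolutional encoder over dependency paths:

```python
# Schematic joint scorer for KB triples and textual relations (a sketch only).
import torch

dim, num_entities, num_kb_relations, vocab_size = 50, 1000, 20, 5000
entity_emb = torch.nn.Embedding(num_entities, dim)
kb_rel_emb = torch.nn.Embedding(num_kb_relations, dim)
word_emb = torch.nn.Embedding(vocab_size, dim)  # words in textual relations

def kb_score(head, rel, tail):
    # bilinear-diagonal score of a knowledge-base triple (head, rel, tail)
    return (entity_emb(head) * kb_rel_emb(rel) * entity_emb(tail)).sum(-1)

def text_score(head, word_ids, tail):
    # textual relation encoded compositionally from its words; mean pooling is
    # a simplification, but word parameters are shared across textual relations
    rel_vec = word_emb(word_ids).mean(dim=0)
    return (entity_emb(head) * rel_vec * entity_emb(tail)).sum(-1)

h, r, t = torch.tensor([3]), torch.tensor([7]), torch.tensor([42])
print(kb_score(h, r, t), text_score(h, torch.tensor([11, 99, 5]), t))
```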

Watch the paper presentation video:

Representing Text for Joint Embedding of Text and Knowledge Bases (EMNLP 2015)


Other Notable Publications:

  • Hao Cheng, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2020. “Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering”. Proceedings of ACL, 2020. [Paper]
  • Lajanugen Logeswaran, Ming-Wei Chang, Kristina Toutanova, Kenton Lee, Jacob Devlin, and Honglak Lee. 2019. “Zero-shot Entity Linking by Reading Entity Descriptions”. Proceedings of ACL, 2019. [Paper]
  • Kristina Toutanova, Xi Victoria Lin, Scott Wen-tau Yih, Hoifung Poon, and Chris Quirk. 2016. “Compositional Learning of Embeddings for Relation Paths in Knowledge Bases and Text”. Proceedings of ACL, 2016. [Paper]
  • Kristina Toutanova, Waleed Ammar, Pallavi Choudhury, and Hoifung Poon. 2015. “Model Selection for Type Supervised Learning with Application to POS Tagging”. Proceedings of CoNLL, 2015. [Paper]
  • Sauleh Eetemadi, William Lewis, Kristina Toutanova, and Hayder Radha. 2015. “Survey of Data-Selection Methods in Statistical Machine Translation”. Machine Translation, 29(3):189-223, 2015. [Paper]
  • Michel Galley, Chris Quirk, Colin Cherry, and Kristina Toutanova. 2013. “Regularized Minimum Error Rate Training”. Proceedings of EMNLP, 2013. Honorable mention award. [Paper]
  • Wen-tau Yih, Kristina Toutanova, John Platt, and Chris Meek. 2011. “Learning Discriminative Projections for Text Similarity Measures”. Proceedings of CoNLL, 2011. Best paper award. [Paper]
  • Hoifung Poon, Colin Cherry, and Kristina Toutanova. 2009. “Unsupervised Morphological Segmentation with Log-Linear Models”. Proceedings of NAACL-HLT, 2009. Best paper award. [Paper]
  • Kristina Toutanova, Aria Haghighi, and Christopher D. Manning. 2008. “A Global Joint Model for Semantic Role Labeling”. Computational Linguistics, 34(2) (Special Issue on Semantic Role Labeling). [Paper]
  • Kristina Toutanova, Christopher D. Manning, Dan Flickinger, and Stephan Oepen. 2005. “Stochastic HPSG Parse Disambiguation Using the Redwoods Corpus”. Journal of Research on Language and Computation, 3(1):83-105, 2005. [Paper]
  • Stephan Oepen, Daniel Flickinger, Kristina Toutanova, and Christopher D. Manning. 2004. “LinGO Redwoods: A Rich and Dynamic Treebank for HPSG”. Journal of Research on Language and Computation, 2(4):575-596, 2004. [Paper]
  • Kristina Toutanova, Mark Mitchell, and Christopher D. Manning. 2003. “Optimizing Local Probability Models for Statistical Parsing”. Proceedings of the European Conference on Machine Learning, 2003. Best student paper runner-up award. [Paper]
  • Kristina Toutanova and Christopher D. Manning. 2000. “Enriching the Knowledge Sources Used in a Maximum Entropy Part-of-Speech Tagger”. Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, 2000. [Paper]

View more publications by Kristina Toutanova

View more Knowledge Graph-Indexed AI research videos at Crossminds:

Crossminds.ai Knowledge Graph-Indexed AI research videos

Install "Crossminds Papers with Video" Chrome extension to instantly find research videos for AI papers on arXiv:

Crossminds Papers with Videos: Find research videos for AI research papers on arXiv While browsing arXiv.org, Crossminds papers with video extension instantly help you find research videos related to each paper. Powered by Crossminds.ai research video platform, this free extension currently covers over 6000 videos in artificial intelligence, machine learning, deep learning, computer vision, natural language processing (NLP), robotics and many other topics in the computer science domain.



Sign up with Crossminds.ai to get personalized recommendations of the latest tech research videos!

Join Crossminds Now!
Tags: Crossminds

Crossminds.ai is a personalized research video platform for tech professionals. We aim to empower your growth with the latest and most relevant research, industry, and career updates.