NaturalQuestions

Created

2023 Sep 24 9:8

Creator

Seonglae Cho

Editor

Seonglae Cho

Edited

2023 Nov 28 5:11

Refs

NQ

natural-questions

google-research-datasets • Updated 2023 Sep 27 14:54

Contains real questions, each submitted to Google Search by multiple users.

https://inflection.ai/inflection-2

Browser

Google's Natural Questions

https://ai.google.com/research/NaturalQuestions/databrowser

Google's Natural Questions

https://ai.google.com/research/NaturalQuestions

Datasets

natural_questions · Datasets at Hugging Face

https://huggingface.co/datasets/natural_questions

Papers

Papers with Code - Natural Questions Dataset

The Natural Questions corpus is a question answering dataset containing 307,373 training examples, 7,830 development examples, and 7,842 test examples. Each example consists of a google.com query and a corresponding Wikipedia page. Each Wikipedia page has a passage (the long answer) annotated on the page that answers the question, and one or more short spans from the annotated passage containing the actual answer. Either annotation can be empty, however. If both are empty, there is no answer on the page at all. If the long answer annotation is non-empty but the short answer annotation is empty, the annotated passage answers the question but no explicit short answer could be found. Finally, 1% of the documents have a passage annotated with a short answer that is “yes” or “no” instead of a list of short spans.

https://paperswithcode.com/dataset/natural-questions
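
The four annotation cases described above (no answer, long answer only, short spans, yes/no) can be sketched as a small classifier. This is a minimal illustration, not the dataset's actual schema: the `Annotation` class, its field names, and the token-offset span representation are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical span representation: (start_token, end_token) within the page.
Span = Tuple[int, int]


@dataclass
class Annotation:
    """Illustrative container for one annotator's labels (not the real schema)."""
    long_answer: Optional[Span] = None                       # annotated passage, if any
    short_answers: List[Span] = field(default_factory=list)  # spans inside the passage
    yes_no_answer: Optional[str] = None                      # "yes", "no", or None


def classify(annotation: Annotation) -> str:
    """Map an annotation onto the four cases in the dataset description."""
    if annotation.long_answer is None:
        # Short spans are drawn from the annotated passage, so without a
        # passage we treat the page as containing no answer.
        return "no_answer"
    if annotation.yes_no_answer in ("yes", "no"):
        return "yes_no"
    if annotation.short_answers:
        return "short_spans"
    # The passage answers the question, but no explicit short answer exists.
    return "long_only"


print(classify(Annotation()))                                               # no_answer
print(classify(Annotation(long_answer=(10, 50))))                           # long_only
print(classify(Annotation(long_answer=(10, 50), short_answers=[(12, 14)]))) # short_spans
print(classify(Annotation(long_answer=(10, 50), yes_no_answer="yes")))      # yes_no
```

Checking the long answer first mirrors the description: a short answer only makes sense relative to an annotated passage.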

Natural Questions: A Benchmark for Question Answering Research

Tom Kwiatkowski, Jennimaria Palomaki, Olivia Redfield, Michael Collins, Ankur Parikh, Chris Alberti, Danielle Epstein, Illia Polosukhin, Jacob Devlin, Kenton Lee, Kristina Toutanova, Llion Jones, Matthew Kelcey, Ming-Wei Chang, Andrew M. Dai, Jakob Uszkoreit, Quoc Le, Slav Petrov. Transactions of the Association for Computational Linguistics, Volume 7. 2019.

https://aclanthology.org/Q19-1026/

Natural Questions: a Benchmark for Question Answering Research – Google Research

We present the Natural Questions corpus, a question answering dataset. Questions consist of real anonymized, aggregated queries issued to the Google search engine. An annotator is presented with a question along with a Wikipedia page from the top 5 search results, and annotates a long answer (typically a paragraph) and a short answer (one or more entities) if present on the page, or marks null if no long/short answer is present. The public release consists of 307,373 training examples with single annotations, 7,830 examples with 5-way annotations for development data, and a further 7,842 examples 5-way annotated sequestered as test data. We present experiments validating quality of the data. We also describe analysis of 25-way annotations on 302 examples, giving insights into human variability on the annotation task. We introduce robust metrics for the purposes of evaluating question answering systems; demonstrate high human upper bounds on these metrics; and establish baseline results using competitive methods drawn from related literature.

https://research.google/pubs/pub47761/
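
As a quick sanity check on the release sizes quoted in the abstract, the sketch below tallies examples and annotations per split. The split sizes and annotation counts (single annotation for training, 5-way for development and test) come from the abstract; the `SPLITS` dictionary and the `totals` helper are names invented for this example.

```python
# Split name -> (number of examples, annotations per example), per the abstract.
SPLITS = {
    "train": (307_373, 1),  # single annotations
    "dev": (7_830, 5),      # 5-way annotations
    "test": (7_842, 5),     # 5-way annotations, sequestered
}


def totals(splits):
    """Return (total examples, total annotations) across all splits."""
    n_examples = sum(n for n, _ in splits.values())
    n_annotations = sum(n * k for n, k in splits.values())
    return n_examples, n_annotations


examples, annotations = totals(SPLITS)
print(examples)     # 323045
print(annotations)  # 385733
```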
