Finding multiple relevant passages and reasoning over them step by step to answer complex questions.
Multi-hop QA Models
Multi-hop QA Datasets
Evidence for second-hop reasoning is moderate and does not become stronger with increasing model size.
Do Large Language Models Latently Perform Multi-Hop Reasoning?
We study whether Large Language Models (LLMs) latently perform multi-hop reasoning with complex prompts such as "The mother of the singer of 'Superstition' is". We look for evidence of a latent...
https://arxiv.org/abs/2402.16837
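
To make the two-hop structure of a prompt like "The mother of the singer of 'Superstition' is" concrete, here is a minimal sketch of explicit multi-hop lookup over a toy knowledge base. The `TOY_KB` dictionary and the `hop` and `multi_hop` helpers are hypothetical names introduced only for illustration. This is symbolic composition, not the latent mechanism the paper probes: the paper asks whether an LLM resolves the intermediate (bridge) entity internally, without it ever being written out.

```python
# Toy knowledge base mapping (entity, relation) -> entity.
# Hypothetical data structure, for illustration only.
TOY_KB = {
    ("Superstition", "singer"): "Stevie Wonder",
    ("Stevie Wonder", "mother"): "Lula Mae Hardaway",
}


def hop(entity, relation, kb=TOY_KB):
    """Resolve a single hop; return None if the fact is missing."""
    return kb.get((entity, relation))


def multi_hop(start, relations, kb=TOY_KB):
    """Follow the relations in order.

    Returns (path, answer), where path lists every entity visited,
    including the bridge entity, and answer is None if any hop fails.
    """
    path = [start]
    for relation in relations:
        next_entity = hop(path[-1], relation, kb)
        if next_entity is None:
            return path, None
        path.append(next_entity)
    return path, path[-1]


# First hop: song -> singer (bridge entity). Second hop: singer -> mother.
path, answer = multi_hop("Superstition", ["singer", "mother"])
print(path, answer)
```

Here the bridge entity ("Stevie Wonder") appears explicitly in `path`. In the latent setting it is never emitted as text, which is why evidence for the first and second hops has to be sought separately.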


Seonglae Cho