ref: https://arxiv.org/pdf/2512.16553
github: https://github.com/Tango-Whiskyman/Needle_in_the_Web
Summary
Needle in the Web introduces a new benchmark for evaluating LLM search agents. It relies on a broadcast + parallel retrieval approach (fuzzy exploratory search) instead of traditional multi-hop reasoning. Retrieved webpages are verified to ensure that a single source satisfies all query criteria, and a “ground-truth” page is selected for answer generation. Closed-source search agents from LLM providers show somewhat better results; understanding why is another action item for me.
However, I feel that search agents face the same limitation as RAG: the LLM relies solely on the information in the retrieved context or the prompt (what has been provided), and it continues to struggle without human-like critical thinking and iterative reasoning.
Claim Extraction prompt
You need to extract all the claims from the given article, formulating them as a list of declarative sentences. The claims should be self-contained, so you must avoid using pronouns or relative time references. Only focus on the contents of the article, and ignore the source, author, contributor, or any other information that is not part of the article itself. Only include claims that are clear, factual and verifiable. Do not include anything that is based on your interpretation.
Central Element Masking prompt
You will be given an article and a list of claims extracted from it. For each of the claims, you need to mask the central part of it, replacing the central part of it with a generic expression. For each claim, only mask ONE element of it. For different kinds of information you need to mask, you may use `someone` to replace a person's name, `something` to replace a certain thing, `in a certain way` to replace a certain action or process, `in a certain state` to replace some adjectives, etc. Importantly, whenever a piece of information is masked, it should not appear in any of the other masked claims.
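The last constraint in this prompt (a masked element must not appear in any other masked claim) can be checked mechanically. Below is a minimal sketch, assuming each claim is available alongside its masked version. The helper names `masked_span` and `find_leaks` and the example claims are hypothetical, and the word-level comparison is only a heuristic of mine, not something taken from the paper.

```python
from typing import List, Tuple


def masked_span(original: str, masked: str) -> str:
    """Recover the words of `original` that were replaced in `masked`.

    Heuristic: strip the words the two sentences share at the start
    and at the end; whatever remains of `original` is the masked part.
    """
    o, m = original.split(), masked.split()
    shortest = min(len(o), len(m))
    # Length of the shared word prefix.
    i = 0
    while i < shortest and o[i] == m[i]:
        i += 1
    # Length of the shared word suffix (never overlapping the prefix).
    j = 0
    while j < shortest - i and o[-1 - j] == m[-1 - j]:
        j += 1
    return " ".join(o[i:len(o) - j])


def find_leaks(pairs: List[Tuple[str, str]]) -> List[Tuple[int, int]]:
    """Return (i, j) whenever the masked part of claim i shows up in masked claim j."""
    leaks = []
    for i, (original, masked) in enumerate(pairs):
        span = masked_span(original, masked).lower()
        if not span:
            continue  # nothing was masked, so nothing can leak
        for j, (_, other_masked) in enumerate(pairs):
            if j != i and span in other_masked.lower():
                leaks.append((i, j))
    return leaks


# Hypothetical example: the same claim masked two different ways,
# so each masked element is still visible in the other masked claim.
leaky = [
    ("Marie Curie won the Nobel Prize in 1903.",
     "someone won the Nobel Prize in 1903."),
    ("Marie Curie won the Nobel Prize in 1903.",
     "Marie Curie won the Nobel Prize at a certain time."),
]
print(find_leaks(leaky))
```

An empty result means no masked element was found verbatim in another masked claim. Paraphrased leaks would slip through this check, so it is only a first filter.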
Query Template prompt
Please find a single webpage that mentions all of the following information:
{question}
Your response will be parsed by a program, so make sure to observe the formatting instructions! You need to format your response as follows:
<source>the url of the webpage that you found</source>
...
Make sure to explicitly include `<source>` and `</source>` with surrounding angle brackets in your response, even if you do not have an answer. If you are unable to find the webpage that mentions all the information, return the following:
<source> No source found. </source>
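Since the response "will be parsed by a program", here is a minimal sketch of what such a parser could look like. The function name `parse_source` and the choice to return `None` for both a missing tag and the "No source found." sentinel are my assumptions; the repository may handle these cases differently.

```python
import re
from typing import Optional

# Matches the first <source>...</source> pair, even across line breaks.
SOURCE_RE = re.compile(r"<source>(.*?)</source>", re.DOTALL)


def parse_source(response: str) -> Optional[str]:
    """Return the cited URL, or None if no usable source was given."""
    match = SOURCE_RE.search(response)
    if match is None:
        return None  # the agent ignored the formatting instructions
    value = match.group(1).strip()
    # Treat the sentinel from the prompt (with or without the period) as "no answer".
    if not value or value.lower().rstrip(".") == "no source found":
        return None
    return value


print(parse_source("I found it. <source>https://example.com/page</source>"))
print(parse_source("<source> No source found. </source>"))
```

Collapsing "tag missing" and "no source found" into one value keeps the caller simple; if the two cases need to be scored separately, returning a small enum instead would be the natural change.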
Source Checking prompt
You are an expert at extracting information from webpages. You will be given a piece of information, and the content of the webpage that is cited as the source. Your task is to determine whether the information is explicitly mentioned in the contents of the webpage. Your response will be parsed by a program, so make sure to observe the formatting instructions!
Format your response as follows, if the information is explicitly mentioned in the contents:
<accept> The reason why the information is mentioned in the contents. </accept>
If the information is NOT explicitly mentioned in the contents, return:
<reject> The reason why the information is NOT mentioned in the contents. </reject>
Make sure to explicitly include `<accept>` and `</accept>`, or `<reject>` and `</reject>` with surrounding angle brackets in your response.
Exact Source Checking prompt
You are an expert at extracting information from webpages. You will be given a claim, and the content of the webpage that is cited as the source. Your task is to determine whether the claim is explicitly mentioned in the contents of the webpage. Your response will be parsed by a program, so make sure to observe the formatting instructions!
Format your response as follows, if the claim is explicitly mentioned in the contents:
<accept> The reason why the claim is mentioned in the contents. </accept>
If the claim is NOT explicitly mentioned in the contents, return:
<reject> The reason why the claim is NOT mentioned in the contents.</reject>
Make sure to explicitly include `<accept>` and `</accept>`, or `<reject>` and `</reject>` with surrounding angle brackets in your response.
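Both checking prompts share the same `<accept>` / `<reject>` output format, so one parser can serve both. This is a minimal sketch under my own assumptions: the name `parse_verdict` is hypothetical, and treating a malformed response as `None` (rather than as a rejection) is a design choice I made for illustration.

```python
import re
from typing import Optional, Tuple

# The backreference \1 forces the closing tag to match the opening one.
VERDICT_RE = re.compile(r"<(accept|reject)>(.*?)</\1>", re.DOTALL)


def parse_verdict(response: str) -> Optional[Tuple[bool, str]]:
    """Return (accepted, reason), or None if the response is malformed."""
    match = VERDICT_RE.search(response)
    if match is None:
        return None
    accepted = match.group(1) == "accept"
    reason = match.group(2).strip()
    return accepted, reason


print(parse_verdict("<accept> The page states this directly. </accept>"))
print(parse_verdict("<reject> The page never mentions it.</reject>"))
```

A page would then count as satisfying a query only if every piece of information in it gets an accepted verdict, which matches the single-source requirement described in the summary.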
