ref: https://arxiv.org/pdf/2510.01171
Summary
AI models tend to repeat safe answers, but a simple prompt tweak can unlock far better thinking. Researchers introduced “verbalized sampling”: the model lists several possible answers (each with a probability score), explains which seems most likely, and finally chooses one. This pushes the model to explore instead of defaulting to familiar patterns. The result: up to 1.8x better reasoning without retraining. When outputs feel predictable, ask your model to weigh options first; you may get sharper ideas instantly.
Direct prompt
Generate a response to the input prompt. The response should be approximately {target words} words.
Output ONLY the response, with no explanations or extra text.
Direct Prompting with CoT
Generate a response to the input prompt. The response should be approximately {target words} words.
First, provide a single "reasoning" field as a string, detailing your step-by-step thought process.
Then, provide your response in the "response" field.
Give ONLY the JSON object, with no explanations or extra text.
Sequence prompt
Generate {num_samplings} responses to the input prompt. Each response should be approximately {target words} words.
Return exactly {num_samplings} responses as a Python list of strings, formatted as:
["response1", "response2", "response3", ...]
Output ONLY the list, with no explanations or extra text.
Multi-turn prompt
Generate a response to the input prompt. The response should be approximately {target words} words.
Output ONLY the response, with no explanations or extra text.

(Follow-up turn:) Generate another response to the original input prompt.
Verbalized Sampling (Standard) prompt:
Generate {num_samplings} responses to the input prompt. Each response should be approximately {target words} words.
Return the responses in JSON format with the key: "responses" (list of dicts). Each dictionary must include:
• text: the response string only (no explanation or extra text).
• probability: the estimated probability from 0.0 to 1.0 of this response given the input prompt (relative to the full distribution).
Give ONLY the JSON object, with no explanations or extra text.
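To make the standard template concrete, here is a minimal sketch of how a caller might fill it in and consume the model's JSON reply. The actual model call is out of scope, so a hard-coded mock reply stands in for it; `build_prompt` and `pick_response` are hypothetical helper names, not part of the paper.

```python
import json

# The Verbalized Sampling (Standard) template from above, as a format string.
TEMPLATE = (
    "Generate {num_samplings} responses to the input prompt. "
    "Each response should be approximately {target_words} words.\n"
    'Return the responses in JSON format with the key: "responses" (list of dicts). '
    "Each dictionary must include:\n"
    "- text: the response string only (no explanation or extra text).\n"
    "- probability: the estimated probability from 0.0 to 1.0 of this response "
    "given the input prompt (relative to the full distribution).\n"
    "Give ONLY the JSON object, with no explanations or extra text."
)

def build_prompt(num_samplings: int, target_words: int) -> str:
    """Fill in the template's placeholders."""
    return TEMPLATE.format(num_samplings=num_samplings, target_words=target_words)

def pick_response(raw_json: str) -> str:
    """Parse the model's JSON reply and return the highest-probability candidate."""
    data = json.loads(raw_json)
    best = max(data["responses"], key=lambda r: r["probability"])
    return best["text"]

# Mock model output standing in for a real API call.
mock = '{"responses": [{"text": "A", "probability": 0.2}, {"text": "B", "probability": 0.7}]}'
print(pick_response(mock))  # B
```

Instead of taking the argmax, a caller could also sample among the candidates in proportion to their verbalized probabilities, which is closer to the exploratory spirit of the method.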
Verbalized Sampling (Standard, with probability tuning) prompt:
Generate {num_samplings} responses to the input prompt. Each response should be approximately {target_words} words.
Return the responses in JSON format with the key: "responses" (list of dicts). Each dictionary must include:
• text: the response string only (no explanation or extra text).
• probability: the estimated probability from 0.0 to 1.0 of this response given the input prompt (relative to the full distribution).
[Randomly sample the responses from the full distribution.] / [Randomly sample the responses from the distribution, keeping the probability of each response below {probability_tuning}.]
Give ONLY the JSON object, with no explanations or extra text.
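The `{probability_tuning}` cap asks the model for less typical answers. On the caller's side, the same idea can be applied as a post-filter: discard candidates whose verbalized probability is at or above the cap, then pick among the remaining "tail" responses. This is a sketch under that interpretation; `sample_below` is a hypothetical helper, and the JSON string mocks a real model reply.

```python
import json
import random

def sample_below(raw_json: str, probability_tuning: float, seed=None) -> str:
    """Keep only candidates whose verbalized probability is below the cap,
    then pick one uniformly; fall back to the least likely candidate."""
    candidates = json.loads(raw_json)["responses"]
    tail = [r for r in candidates if r["probability"] < probability_tuning]
    if not tail:  # nothing under the cap: take the least typical response
        tail = [min(candidates, key=lambda r: r["probability"])]
    rng = random.Random(seed)
    return rng.choice(tail)["text"]

# Mock model output standing in for a real API call.
mock = ('{"responses": ['
        '{"text": "common answer", "probability": 0.8},'
        '{"text": "rare answer", "probability": 0.1}]}')
print(sample_below(mock, probability_tuning=0.25))  # rare answer
```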
Verbalized Sampling (CoT) prompt:
Generate {num_samplings} responses to the input prompt using chain-of-thought reasoning. Each response should be approximately {target words} words.
First, provide a single "reasoning" field as a string, detailing your step-by-step thought process.
Then, return the output in JSON format with the key "responses" (list of dicts). Each dictionary must include:
• text: the response string (no explanation or extra text).
• probability: the estimated probability from 0.0 to 1.0 of this response given the input prompt (relative to the full distribution).
Give ONLY the JSON object, with no explanations or extra text.
Verbalized Sampling (Multi-turn) prompt:
You will generate a total of {num_samplings} responses to the input prompt. Each response should be approximately {target words} words.
First, sample {num_samples_per_prompt} responses.
Return the responses in JSON format with the key: "responses" (list of dicts). Each dictionary must include:
• text: the response string (no explanation or extra text).
• confidence: the normalized likelihood score between 0.0 and 1.0 that indicates how representative or typical this response is compared to the full distribution.
Give ONLY the JSON object, with no explanations or extra text.

(Follow-up turn:) Generate {num_samples_per_prompt} alternative responses to the original input prompt.
