Starting off as a muggle who is naïve to the world of math and data science.

SLMFix: Leveraging Small Language Models for error fixing with Reinforcement Learning

ref: https://arxiv.org/pdf/2511.19422

Summary

The paper proposes training a small language model (SLM) to repair code written in lesser-known programming languages. The authors report a static-validation pass rate of over 95% and improvements over both direct LLM fine-tuning and self-correction prompting (agentic frameworks). The approach builds training pairs from LLM-generated programs, uses LoRA fine-tuning for initialization, then applies PPO reinforcement learning with a reward that combines static validation and AST-based semantic similarity, gradually shifting the focus from syntactic to semantic correctness. The results suggest that SLMs are a better fit for repair than for full generation.
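To make the reward idea concrete, here is a minimal sketch of a reward that blends a static-validation term with an AST-similarity term and shifts weight from the first to the second as training progresses. Everything specific in it is an assumption for illustration, not the paper's formula: Python's own `ast` module stands in for the Ansible, Bash, and SQL validators, `difflib` stands in for the paper's similarity measure, and the linear schedule driven by `progress` is hypothetical.

```python
import ast
import difflib


def node_types(source: str) -> list[str]:
    """Flatten a program into the sequence of its AST node type names."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


def static_validation(source: str) -> float:
    """Return 1.0 if the program parses, else 0.0 (stand-in for a real validator)."""
    try:
        ast.parse(source)
        return 1.0
    except SyntaxError:
        return 0.0


def ast_similarity(candidate: str, reference: str) -> float:
    """Similarity in [0, 1] between the AST node sequences of two programs."""
    matcher = difflib.SequenceMatcher(None, node_types(candidate), node_types(reference))
    return matcher.ratio()


def reward(candidate: str, reference: str, progress: float) -> float:
    """Blend both terms; `progress` in [0, 1] is the fraction of training done."""
    valid = static_validation(candidate)
    if valid == 0.0:
        return 0.0  # an unparsable program has no AST to compare
    semantic_weight = min(max(progress, 0.0), 1.0)  # hypothetical linear schedule
    similarity = ast_similarity(candidate, reference)
    return (1.0 - semantic_weight) * valid + semantic_weight * similarity
```

Early in training (`progress` near 0) any program that passes validation earns close to the full reward, while late in training (`progress` near 1) the reward is dominated by how closely the repaired program's structure matches the reference.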


Ansible Code Generation prompt
You are an expert in Ansible.
The user will give you a task description and ask you to generate an Ansible playbook to complete the given task.
You only need to output the content of the playbook.
DO NOT use any shell commands (ansible.builtin.shell, ansible.builtin.command, etc.) in the playbook.
Task: {task}
Answer: ```yaml

Bash Code Generation prompt
You are an expert in Bash.
The user will give you a task description and ask you to generate a bash command to complete the given task.
You only need to output the content of the command.
Task: {task}
Answer: ```bash

SQL Code Generation prompt
You are an expert in SQL.
The user will give you a task description and ask you to generate a SQL command to complete the given task. You only need to output the content of the command.
Task: {task}
Answer: ```sql

Ansible Program Repair prompt
You are an expert in Ansible.
You are asked to fix a possibly incorrect Ansible playbook.
You will be provided with the playbook to fix, the user input, and feedback from an interpreter that lists all syntactic errors in the playbook.
Your goal is to fix the syntactic errors in the playbook (if any) while following the user’s instruction.
You only need to output the content of the modified playbook.
User query: {query}
Original playbook:
{output}
Interpreter feedback:
{feedback}
Answer: ```yaml

Bash Program Repair prompt
You are an expert in Bash.
You are asked to fix a possibly incorrect Bash command.
You will be provided with the command to fix, the user input, and feedback from an interpreter that lists all syntactic errors in the command.
Your goal is to fix the syntactic errors in the command (if any) while following the user’s instruction.
You only need to output the content of the modified command.
User query: {query}
Original command:
{output}
Interpreter feedback:
{feedback}
Answer: ```bash

SQL Program Repair prompt
You are an expert in SQL.
You are asked to fix a possibly incorrect SQL command.
You will be provided with the command to fix, the user input, and feedback from an interpreter that lists all syntactic errors in the command.
Your goal is to fix the syntactic errors in the command (if any) while following the user’s instruction.
You only need to output the content of the modified command.
User query: {query}
Original command:
{output}
Interpreter feedback:
{feedback}
Answer: ```sql
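As a rough illustration of how the repair prompts might be filled in, the sketch below takes the SQL repair template, obtains interpreter feedback, and substitutes the `{query}`, `{output}`, and `{feedback}` placeholders. Several pieces are assumptions rather than details from the paper: SQLite's `EXPLAIN` is used as a stand-in validator (it compiles a statement without running it, but it also reports missing tables, not only syntax errors), the "No errors found." fallback text is hypothetical, and the prompt is assumed to end by opening a `sql` code fence.

```python
import sqlite3

FENCE = "`" * 3  # assumed: the prompt ends by opening a Markdown code fence

SQL_REPAIR_PROMPT = (
    "You are an expert in SQL.\n"
    "You are asked to fix a possibly incorrect SQL command.\n"
    "You will be provided with the command to fix, the user input, and feedback "
    "from an interpreter that lists all syntactic errors in the command.\n"
    "Your goal is to fix the syntactic errors in the command (if any) while "
    "following the user's instruction.\n"
    "You only need to output the content of the modified command.\n"
    "User query: {query}\n"
    "Original command:\n"
    "{output}\n"
    "Interpreter feedback:\n"
    "{feedback}\n"
    "Answer: " + FENCE + "sql"
)


def sqlite_feedback(command: str) -> str:
    """Compile the command with EXPLAIN and return the error text, or '' if none."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("EXPLAIN " + command)  # compiles the statement, does not run it
        return ""
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return str(exc)
    finally:
        conn.close()


def build_repair_prompt(query: str, output: str) -> str:
    """Fill the repair template with the user query, the command, and its feedback."""
    feedback = sqlite_feedback(output) or "No errors found."  # hypothetical fallback
    return SQL_REPAIR_PROMPT.format(query=query, output=output, feedback=feedback)


if __name__ == "__main__":
    print(build_repair_prompt("Return the number one", "SELEC 1"))
```

The same pattern would apply to the Ansible and Bash templates by swapping in the matching template text and a validator for that language.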
Ansible In-context Learning prompt
You are an expert in Ansible.
The user will give you a task description and ask you to generate an Ansible playbook to complete the given task.
You only need to output the content of the playbook.
DO NOT use any shell commands (ansible.builtin.shell, ansible.builtin.command, etc.) in the playbook.
The following are some example input queries and corresponding Ansible playbooks for your reference:
{examples}
Task: {task}
Answer: ```yaml

Bash In-context Learning prompt
You are an expert in Bash.
The user will give you a task description and ask you to generate a bash command to complete the given task.
You only need to output the content of the command.
The following are some example input queries and corresponding Bash commands for your reference:
{examples}
Task: {task}
Answer: ```bash

SQL In-context Learning prompt
You are an expert in SQL.
The user will give you a task description and ask you to generate a SQL command to complete the given task.
You only need to output the content of the command.
The following are some example input queries and corresponding SQL commands for your reference:
{examples}
Task: {task}
Answer: ```sql
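The in-context learning prompts differ from the plain generation prompts only in the `{examples}` slot. The prompts above do not say how each example is laid out, so the sketch below assumes every example reuses the "Task / Answer" shape of the prompt itself, with the command wrapped in a closed `sql` fence; `render_examples` and `build_icl_prompt` are hypothetical helper names.

```python
FENCE = "`" * 3  # assumed: answers are wrapped in Markdown code fences

SQL_ICL_PROMPT = (
    "You are an expert in SQL.\n"
    "The user will give you a task description and ask you to generate a SQL "
    "command to complete the given task.\n"
    "You only need to output the content of the command.\n"
    "The following are some example input queries and corresponding SQL commands "
    "for your reference:\n"
    "{examples}\n"
    "Task: {task}\n"
    "Answer: " + FENCE + "sql"
)


def render_examples(pairs: list[tuple[str, str]]) -> str:
    """Render (query, command) pairs in the assumed Task/Answer layout."""
    blocks = []
    for query, command in pairs:
        blocks.append(f"Task: {query}\nAnswer: {FENCE}sql\n{command}\n{FENCE}")
    return "\n".join(blocks)


def build_icl_prompt(task: str, pairs: list[tuple[str, str]]) -> str:
    """Fill the in-context learning template with rendered examples and the task."""
    return SQL_ICL_PROMPT.format(examples=render_examples(pairs), task=task)


if __name__ == "__main__":
    demo_pairs = [("List all users", "SELECT * FROM users;")]
    print(build_icl_prompt("Count the rows in the users table", demo_pairs))
```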
Ansible Dataset Query Generation prompt
You are an expert in Ansible.
You are asked to write a user prompt for the given Ansible playbook that can be used to generate the playbook.
Instead of explicitly describing the functionality of the playbook, the prompt should tell what the user wants to accomplish through the playbook.
Write the prompt as short as you can, and start the prompt with:
Generate an Ansible playbook that ...
