Topic 4: LLM-Based Decision Making & Reasoning
Topic 4 integrates Large Language Models (LLMs) into your autonomy stack to translate natural-language instructions into structured task graphs, support closed-loop autonomy, and optimize skills via reinforcement-style feedback.
4.1 Natural Language → Task Graph Translation (Module A)
From Commands to Structured Actions
Given a user instruction such as:
- “Go to the kitchen, find the red mug, and bring it to me in the lab.”
The LLM’s job is to:
- Parse the instruction.
- Identify subtasks and their order.
- Produce a structured representation (e.g., a task graph or behavior tree skeleton).
Example breakdown:
1. navigate(kitchen)
2. locate_object(mug, color=red)
3. pick(mug)
4. navigate(lab)
5. deliver(mug, recipient=human)
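In code, one lightweight representation of this breakdown is an ordered list of skill calls. The `SkillCall` dataclass below is an illustrative sketch, not a prescribed API:

```python
from dataclasses import dataclass, field

@dataclass
class SkillCall:
    """One node in a linear task graph: a skill name plus its arguments."""
    skill: str
    args: tuple = ()
    kwargs: dict = field(default_factory=dict)

# The mug-fetching instruction as an ordered task graph
task_graph = [
    SkillCall("navigate", ("kitchen",)),
    SkillCall("locate_object", ("mug",), {"color": "red"}),
    SkillCall("pick", ("mug",)),
    SkillCall("navigate", ("lab",)),
    SkillCall("deliver", ("mug",), {"recipient": "human"}),
]
```

A richer task graph would add dependency edges between nodes; a flat list is sufficient for strictly sequential plans like this one.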
You will conceptually design:
- Prompts that:
  - Provide the robot’s capabilities (skill library).
  - Ask the LLM to produce valid sequences using those skills.
- A ROS 2 node that:
  - Receives user commands.
  - Calls the LLM (local or remote).
  - Converts the output into a task graph or behavior tree configuration.
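The conversion step can be prototyped as a plain validation function that the ROS 2 node’s command callback would wrap. The JSON plan format and the `SKILL_LIBRARY` contents below are illustrative assumptions, not a fixed interface:

```python
import json

# Hypothetical skill library: skill name -> allowed argument keys
SKILL_LIBRARY = {
    "navigate": {"goal"},
    "locate_object": {"name", "color"},
    "pick": {"name"},
    "deliver": {"name", "recipient"},
}

def parse_plan(llm_output: str) -> list:
    """Parse the LLM's JSON plan, rejecting any step outside the skill library."""
    steps = json.loads(llm_output)
    for step in steps:
        skill = step.get("skill")
        if skill not in SKILL_LIBRARY:
            raise ValueError(f"unknown skill: {skill!r}")
        unknown = set(step.get("args", {})) - SKILL_LIBRARY[skill]
        if unknown:
            raise ValueError(f"{skill}: unexpected args {unknown}")
    return steps

# A well-formed response passes validation; malformed ones raise early
plan = parse_plan('[{"skill": "navigate", "args": {"goal": "kitchen"}}]')
```

Validating before execution means a hallucinated skill fails fast at parse time instead of mid-mission.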
Prompt Engineering for Robotics Context
Key ideas:
- Constrain outputs to known skills and actions.
- Provide examples of valid task graphs.
- Include contextual information:
  - Known rooms and objects.
  - Current map or world state summary.
The goal is to make LLM outputs predictable and easy to validate, not free-form.
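A minimal sketch of such a constrained prompt builder follows; the skill names, rooms, and output schema are illustrative assumptions:

```python
def build_prompt(instruction, skills, rooms, objects):
    """Assemble a constrained prompt: capabilities + context + output format."""
    return (
        "You control a mobile robot. Respond ONLY with a JSON list of steps.\n"
        f"Available skills: {', '.join(skills)}\n"
        f"Known rooms: {', '.join(rooms)}\n"
        f"Visible objects: {', '.join(objects)}\n"
        'Example: [{"skill": "navigate", "args": {"goal": "kitchen"}}]\n'
        f"Instruction: {instruction}\n"
    )

prompt = build_prompt(
    "Bring me the red mug from the kitchen.",
    skills=["navigate", "locate_object", "pick", "deliver"],
    rooms=["kitchen", "lab"],
    objects=["red mug", "chair"],
)
```

Listing the allowed skills and a worked example in the prompt is what keeps the output machine-checkable rather than free-form prose.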
4.2 Closed-Loop Autonomy (Module B)
Perception in the Loop
LLM-driven plans must stay grounded in reality:
- Perception and mapping (Chapter 4) provide:
  - Detected objects and humans.
  - Current robot pose and map.
- Execution reports back:
  - Success/failure of skills.
  - Unexpected observations (e.g., object not found).
Closed-loop behavior:
- If the object is not detected:
  - Re-scan or search a different area.
  - Ask the user for clarification if needed.
- If navigation is blocked:
  - Re-plan via alternative routes.
  - Adjust the task graph accordingly.
The LLM (or a higher-level policy) can:
- Suggest alternative strategies.
- Decide when to seek human input.
- Update the task graph in response to new information.
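These recovery rules can be prototyped as a small dispatch function before any LLM is involved; the report fields (`status`, `failure`, `attempts`) are hypothetical names for illustration:

```python
def next_action(report: dict) -> str:
    """Map an execution report to a recovery strategy (simplified policy)."""
    if report.get("status") == "success":
        return "continue"
    failure = report.get("failure")
    if failure == "object_not_found":
        # Re-scan first; escalate to the user after repeated misses.
        return "rescan" if report.get("attempts", 0) < 2 else "ask_user"
    if failure == "path_blocked":
        return "replan_route"
    return "ask_user"  # unknown failure: seek human input
```

In practice an LLM or higher-level policy can replace or augment these hand-written rules, but a deterministic fallback like this keeps the robot’s behavior predictable when the model is unavailable.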
4.3 Reinforcement-Based Task Optimization (Module C)
Reward Signals
Over time, you may want the robot to:
- Execute tasks more efficiently.
- Reduce failures and retries.
- Improve smoothness and safety.
You can define reward signals such as:
- Task success vs failure.
- Time to completion.
- Number of collisions or near-collisions (should be minimized).
- Smoothness of motion (e.g., penalize jerky movements).
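A scalar reward combining these signals might look like the sketch below; the weights are illustrative and would need tuning on your robot:

```python
def compute_reward(success: bool, duration_s: float,
                   collisions: int, jerk_rms: float) -> float:
    """Combine task-level signals into a single scalar reward."""
    reward = 10.0 if success else -10.0
    reward -= 0.1 * duration_s   # faster completions score higher
    reward -= 5.0 * collisions   # heavily penalize contact events
    reward -= 1.0 * jerk_rms     # penalize jerky motion
    return reward
```

Keeping the reward a simple weighted sum makes it easy to inspect which term dominated a given episode during post-hoc analysis.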
Skill Fine-Tuning
Reinforcement-style updates can:
- Adjust low-level controller parameters.
- Refine skill implementations (e.g., better grasp strategies).
- Influence high-level choices (e.g., preferred routes).
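As a concrete (and deliberately simple) example of influencing high-level choices, a bandit-style running average can learn a preferred route from logged rewards; the class and route names here are hypothetical:

```python
from collections import defaultdict

class RoutePreference:
    """Track average reward per route and pick the best-performing one."""
    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    def update(self, route: str, reward: float) -> None:
        self.totals[route] += reward
        self.counts[route] += 1

    def best(self) -> str:
        return max(self.counts, key=lambda r: self.totals[r] / self.counts[r])

prefs = RoutePreference()
prefs.update("hallway", 4.0)
prefs.update("atrium", 7.5)
prefs.update("hallway", 5.0)
```

A real deployment would add exploration (e.g., epsilon-greedy) so a once-blocked route is eventually retried, but the averaging core stays the same.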
Data logging:
- Store:
  - Task graphs used.
  - Execution traces (sensor data, commands).
  - Outcomes and rewards.
- Use logs for:
  - Offline RL experiments.
  - Post-hoc analysis and manual tuning.
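A minimal JSON-lines episode logger covering these fields might look like this; the record schema is an assumption, and in a real system the raw sensor traces would be recorded separately (e.g., with `ros2 bag`) rather than inlined:

```python
import io
import json
import time

def log_episode(stream, task_graph, outcome, reward):
    """Append one episode record as a JSON line for offline analysis/RL."""
    record = {
        "t": time.time(),
        "task_graph": task_graph,  # e.g., a list of skill-call dicts
        "outcome": outcome,        # "success" / "failure" plus details
        "reward": reward,
    }
    stream.write(json.dumps(record) + "\n")

# Usage with an in-memory stream; a file opened in append mode works the same way
buf = io.StringIO()
log_episode(buf, [{"skill": "navigate", "args": {"goal": "lab"}}], "success", 6.5)
```

One JSON object per line keeps the log appendable during operation and trivially streamable into offline RL or analysis tooling later.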
By the end of Topic 4, you should understand:
- How LLMs can generate and refine task structures.
- How closed-loop autonomy requires continuous perception–planning–execution integration.
- How reinforcement-style feedback can gradually improve your agent’s behavior.