Module 2: NVIDIA Isaac Sim & Synthetic Data

Module 2 moves beyond basic physics into photorealistic, GPU-accelerated simulation with NVIDIA Isaac Sim. You will learn the USD scene format, the PhysX simulation backend, and how to generate synthetic datasets—with perfect labels and domain randomization—for training perception models that can transfer to real humanoid robots.

2.1 Isaac Sim: Beyond Basic Physics

Why Isaac Sim vs. Gazebo?

Gazebo is excellent for fast, open‑source physics tightly integrated with ROS 2. Isaac Sim adds:

  • Photorealistic rendering (ray tracing, PBR materials, realistic shadows)
  • GPU-accelerated PhysX physics (multi‑body dynamics, articulations)
  • Built-in tools for synthetic data generation (segmentation, bounding boxes, 2D/3D poses)
  • Domain randomization utilities (lighting, materials, textures, camera parameters)
  • Strong integration with the NVIDIA ecosystem (Omniverse, Isaac ROS, CUDA)

You’ll use Gazebo for everyday physics iteration and Isaac Sim when:

  • You need high‑quality visual data for training detectors/segmenters
  • You want to systematically randomize scenes for sim‑to‑real robustness

Isaac Sim Architecture and Workflow

Isaac Sim is built on the Omniverse platform and uses:

  • USD (Universal Scene Description) for representing scenes
  • PhysX for physics simulation
  • RTX for ray‑traced rendering

Typical workflow:

  1. Create or import USD assets (robots, environments, lights, cameras)
  2. Compose a stage (scene) with those assets
  3. Add physics and sensors via Isaac extensions
  4. Connect to ROS 2 for control and logging
  5. Run simulations and capture synthetic datasets

2.2 USD and Omniverse Fundamentals

What is USD?

USD (Universal Scene Description) is a scene graph format originally from Pixar, now widely adopted for:

  • Film and VFX
  • Game engines
  • Robotics simulation (Omniverse, Isaac Sim)

Key properties:

  • Hierarchical: Scenes are trees of prims (e.g., /World/Robot/Base)
  • Composable: Complex scenes are layered from multiple USD files
  • Extensible: Supports physics schemas, materials, lights, cameras

You will interact with USD primarily through:

  • Isaac Sim GUI (Stage tree)
  • Python scripting (Omniverse Kit API)
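USD prim paths are slash-delimited, much like POSIX paths, so you can reason about the stage hierarchy with standard tools before touching the Omniverse Kit API. A minimal stdlib sketch (the paths are illustrative, not taken from a real stage):

```python
from pathlib import PurePosixPath

# An illustrative prim path, as it would appear in the Stage tree.
base = PurePosixPath("/World/Robot/Base")

# Walk up the hierarchy: each prim's parent is another prim.
print(base.parent)         # /World/Robot
print(base.parent.parent)  # /World

# Child prims are formed by appending a name to the parent path.
camera = base / "Camera"
print(camera)              # /World/Robot/Base/Camera
```

The same parent/child structure is what you see in the Isaac Sim Stage tree; the real API returns prim objects rather than plain paths.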

USD vs URDF/SDF

  • URDF/SDF:

    • Robot‑centric description
    • Great for kinematics/physics of individual robots
    • Limited for large, photorealistic environments
  • USD:

    • Scene‑centric description
    • Includes robots, environments, cameras, lights, assets
    • Native support for materials, animations, and complex composition

In practice:

  • Import your robot from URDF/SDF into USD
  • Build full environments (rooms, factories, labs) directly in USD

2.3 PhysX in Isaac Sim

Articulations and Joints

Isaac Sim uses PhysX articulations to represent kinematic chains like humanoids:

  • Efficient handling of many joints
  • Stable simulation of contact‑rich motion

You will configure:

  • Joint types (revolute, prismatic, fixed)
  • Limits (position, velocity, effort)
  • Drive parameters (stiffness, damping, target position/velocity)
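A joint drive behaves like a PD controller: stiffness pulls the joint toward its target position, damping resists velocity error. A stdlib sketch of one drive step for a single revolute joint (the gains, timestep, and inertia are made-up illustration values, not PhysX defaults):

```python
def drive_torque(stiffness, damping, target_pos, pos, vel, target_vel=0.0):
    """PD-style joint drive: stiffness acts on position error, damping on velocity error."""
    return stiffness * (target_pos - pos) + damping * (target_vel - vel)

# One explicit-Euler step for a single joint (illustrative values).
stiffness, damping = 100.0, 10.0
pos, vel, dt, inertia = 0.0, 0.0, 0.01, 1.0

tau = drive_torque(stiffness, damping, target_pos=0.5, pos=pos, vel=vel)
vel += (tau / inertia) * dt   # integrate angular acceleration
pos += vel * dt               # integrate angular velocity
print(tau, pos)               # torque 50.0; joint takes a small step toward the target
```

Raising stiffness makes the joint track targets more aggressively; raising damping suppresses the oscillations that high stiffness tends to introduce.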

Physics Parameters to Tune

Critical parameters:

  • Gravity: Typically -9.81 m/s²
  • Damping: Joint and linear damping to control oscillations
  • Contact stiffness & damping: Determines how “soft” or “stiff” collisions feel
  • Friction: Static and dynamic values per material pair
  • Solver iterations: Controls constraint resolution quality

You will:

  • Start from conservative, stable defaults
  • Tune specific joints (knees, ankles, fingers) for your humanoid
  • Validate motion qualitatively (no jittering, no foot skating) and quantitatively (trajectories, contact forces)
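One of the quantitative checks above, foot skating, can be measured by summing horizontal foot displacement over frames where the foot is flagged as in contact. A stdlib sketch (the trajectory samples are invented for illustration):

```python
import math

def foot_skating(samples):
    """Sum horizontal (x, y) foot displacement across frames where the foot is in contact.
    samples: list of (x, y, in_contact) tuples, one per frame."""
    total = 0.0
    for (x0, y0, c0), (x1, y1, c1) in zip(samples, samples[1:]):
        if c0 and c1:  # foot in contact over the whole interval
            total += math.hypot(x1 - x0, y1 - y0)
    return total

# A planted foot that drifts 2 cm per contact frame -> skating the metric should flag.
frames = [(0.00, 0.0, True), (0.02, 0.0, True), (0.04, 0.0, True), (0.30, 0.0, False)]
print(foot_skating(frames))  # 0.04 m of skating; a well-tuned gait should be near zero
```

Tracking this number while tuning contact stiffness, friction, and solver iterations gives you a regression signal that pure eyeballing misses.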

2.4 Synthetic Data Generation for Perception

Why Synthetic Data?

Real datasets for humanoids are:

  • Expensive to collect (human labeling, robot setup, safety)
  • Hard to cover all edge cases (rare obstacles, lighting conditions)

Synthetic data lets you:

  • Generate unlimited labeled images and point clouds
  • Control every aspect of scenarios (poses, obstacles, lighting)
  • Produce perfect ground truth:
    • 2D and 3D bounding boxes
    • Instance and semantic segmentation masks
    • 6D object poses
    • Depth maps and LiDAR returns
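Perfect 2D boxes fall out of the fact that the simulator knows every object's 3D geometry and the camera's intrinsics exactly. A pinhole-projection sketch (the intrinsics fx, fy, cx, cy and the cube corners are assumed example values):

```python
def project_bbox(corners_3d, fx, fy, cx, cy):
    """Project 3D corner points (camera frame, z forward) to a pixel-space 2D bbox."""
    us, vs = [], []
    for x, y, z in corners_3d:
        us.append(fx * x / z + cx)  # pinhole projection, u axis
        vs.append(fy * y / z + cy)  # pinhole projection, v axis
    return min(us), min(vs), max(us), max(vs)  # (u_min, v_min, u_max, v_max)

# Axis-aligned 0.2 m cube centered 2 m in front of a 640x480 camera.
cube = [(x, y, z) for x in (-0.1, 0.1) for y in (-0.1, 0.1) for z in (1.9, 2.1)]
print(project_bbox(cube, fx=500.0, fy=500.0, cx=320.0, cy=240.0))
```

Because the geometry is known, the resulting box is exact; a human annotator drawing the same box on a real image would introduce pixel-level error.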

Core Data Types

In Isaac Sim, you will generate:

  • RGB images with realistic materials and lighting
  • Depth images from virtual depth cameras
  • Semantic segmentation masks (class IDs per pixel)
  • Instance segmentation masks (instance IDs per pixel)
  • Bounding boxes (2D and 3D, with object IDs)
  • LiDAR point clouds with per‑point labels

These can be exported in formats such as:

  • COCO
  • Pascal VOC
  • Custom JSON/NPZ for deep learning frameworks
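A COCO-style annotations file has three top-level lists: categories, images, and annotations. A minimal stdlib sketch of assembling one (the file names and boxes are placeholders, not generated data):

```python
import json

def make_coco(categories, images, boxes):
    """Assemble a minimal COCO-style dict.
    boxes: list of (image_id, category_id, x, y, w, h) tuples."""
    return {
        "categories": [{"id": i, "name": n} for i, n in enumerate(categories, 1)],
        "images": [{"id": i, "file_name": f} for i, f in enumerate(images, 1)],
        "annotations": [
            {"id": k, "image_id": img, "category_id": cat,
             "bbox": [x, y, w, h], "area": w * h, "iscrowd": 0}
            for k, (img, cat, x, y, w, h) in enumerate(boxes, 1)
        ],
    }

coco = make_coco(["box", "cylinder"],
                 ["000001.png"],
                 [(1, 1, 10.0, 20.0, 64.0, 48.0)])
print(json.dumps(coco, indent=2))  # serializes directly as annotations.json
```

Real COCO files carry more fields (image width/height, license info, segmentation polygons), but this skeleton is enough for most detection training loaders.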

2.5 Domain Randomization for Sim-to-Real

The Sim-to-Real Gap

Even the best simulator will not perfectly match your real lab:

  • Slight differences in materials, lighting, and textures
  • Sensors with more noise, blur, rolling shutter, or calibration errors
  • Dynamics mismatches (friction, center of mass, joint compliance)

Domain randomization addresses this by:

  • Training models not on one “perfect” world, but on many plausible worlds
  • Intentionally varying:
    • Lighting direction, intensity, and color
    • Material colors and textures
    • Backgrounds and clutter
    • Sensor noise models (blur, distortion, noise)
    • Physics parameters (friction, mass ranges)

The model learns to be invariant to those variations, making it more likely to work on real data.
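The "many plausible worlds" idea boils down to drawing each randomized parameter from a range every time you render. A stdlib sketch of one draw (the parameter names and ranges are invented for illustration, not Isaac Sim's Replicator API):

```python
import random

# Illustrative ranges; a real pipeline would read these from a config file.
RANGES = {
    "light_intensity": (500.0, 2000.0),   # arbitrary intensity units
    "light_color_r":   (0.8, 1.0),        # warm-to-neutral red channel
    "friction":        (0.4, 1.0),        # material friction coefficient
    "object_mass_kg":  (0.5, 3.0),
}

def sample_world(rng):
    """Draw one plausible world: a value for every randomized parameter."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()}

rng = random.Random(42)  # fixed seed so the dataset is reproducible
world = sample_world(rng)
print(world)
```

Seeding the generator is what makes a randomized dataset reproducible: the same seed regenerates the exact same sequence of worlds.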

Randomization Parameters

You will randomize:

  • Visual:
    • HDRI environments
    • Light positions/colors
    • Material roughness and metalness
    • Object colors and patterns
  • Sensor:
    • Camera noise (Gaussian, Poisson, motion blur)
    • Distortion and vignetting
  • Environment:
    • Object positions and orientations
    • Scene clutter, extra distractors

Crucially, you will start modestly (small variations) and gradually expand ranges while checking that:

  • Synthetic images still resemble your target deployment environment
  • Model performance on real validation data improves, not degrades
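Starting modestly and expanding can be expressed as interpolating each parameter from a narrow initial range toward its full range as data generation progresses. A stdlib sketch (the ranges and linear schedule are assumptions, not a fixed recipe):

```python
def widen(narrow, full, t):
    """Linearly interpolate a (lo, hi) range from `narrow` (t=0) toward `full` (t=1)."""
    t = max(0.0, min(1.0, t))  # clamp progress to [0, 1]
    lo = narrow[0] + (full[0] - narrow[0]) * t
    hi = narrow[1] + (full[1] - narrow[1]) * t
    return lo, hi

# Friction starts nearly fixed and grows toward its full range over generation batches.
narrow, full = (0.68, 0.72), (0.4, 1.0)
for batch, total in [(0, 10), (5, 10), (10, 10)]:
    print(batch, widen(narrow, full, batch / total))
```

After each widening step you would regenerate data and re-check real-validation performance, rolling the range back if transfer degrades.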

2.6 Isaac ROS: Hardware-Accelerated Perception

Isaac ROS Components

Isaac ROS provides GPU‑accelerated ROS 2 nodes for:

  • VSLAM (Visual SLAM)
  • Depth estimation
  • Object detection and segmentation

These nodes:

  • Consume sensor data from Isaac Sim or real cameras
  • Produce:
    • Robot pose estimates
    • Depth maps
    • Object detections and masks

You will integrate Isaac ROS with:

  • Isaac Sim (simulated sensors)
  • Gazebo and real robots (real sensors)

This lets your perception stack run at high frame rates (30–60 Hz) and scale with available GPU power.

2.7 Hands-On Lab: Synthetic Data Generation Pipeline

Scenario

You will build a synthetic dataset for an obstacle detector that helps your humanoid navigate cluttered environments. The obstacles will be boxes and cylinders of varying sizes, colors, and positions.

Tasks

  • Create an Isaac Sim scene with:
    • A ground plane and walls
    • Randomly placed obstacles (boxes, cylinders)
    • A virtual RGB‑D camera on the humanoid or at a fixed vantage point
  • Implement domain randomization hooks:
    • Random lighting (intensity, direction, color)
    • Random materials and textures for obstacles
    • Random camera poses within a defined region
  • Generate at least 1,000 annotated images with:
    • RGB frames
    • Semantic segmentation masks
    • 2D bounding boxes for obstacles
  • Export data as a COCO‑style dataset:
    • Images directory
    • annotations.json with categories, images, and annotations
  • Analyze dataset coverage:
    • Distribution of obstacle sizes and positions
    • Lighting variations
    • Class balance
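The coverage analysis above can be run straight off the annotations, before any training. A stdlib sketch over COCO-style annotation dicts (the sample annotations are invented placeholders):

```python
from collections import Counter
from statistics import mean

def coverage_report(annotations, id_to_name):
    """Summarize class balance and bounding-box areas from COCO-style annotations."""
    counts = Counter(id_to_name[a["category_id"]] for a in annotations)
    areas = [a["bbox"][2] * a["bbox"][3] for a in annotations]  # w * h per box
    return {"class_counts": dict(counts), "mean_box_area": mean(areas)}

annotations = [
    {"category_id": 1, "bbox": [10, 10, 50, 40]},
    {"category_id": 1, "bbox": [80, 30, 20, 20]},
    {"category_id": 2, "bbox": [40, 60, 30, 30]},
]
report = coverage_report(annotations, {1: "box", 2: "cylinder"})
print(report)  # class counts plus mean box area across the dataset
```

A strongly skewed class count or a mean box area clustered at one scale is a sign the randomization ranges need widening before you train.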

Success Criteria

  • You can train a simple detector (e.g., a lightweight CNN) on the synthetic dataset and:
    • Achieve high accuracy on a synthetic validation split
    • Show initial qualitative transfer to real images with similar structure
  • Dataset and scripts are:
    • Reproducible (fixed seeds where needed)
    • Well‑organized (clear directory layout, README)

By completing Module 2, you will have a GPU‑accelerated digital twin capable of producing high‑quality synthetic data for perception, and a first experience with domain randomization as a tool for sim‑to‑real transfer.