Chapter 1: Introduction to ROS 2
Learning Objectives
By the end of this chapter, you will:
- Explain what ROS 2 is and the problems it solves for physical AI systems
- Describe ROS 2's role as middleware using the nervous system analogy
- Compare ROS 2 to ROS 1 and articulate why ROS 2 was created
- Identify real-world humanoid robotics use cases for ROS 2
What is ROS 2?
ROS 2 (Robot Operating System 2) is a middleware framework for robot software development. It provides the communication infrastructure, tools, and libraries that enable developers to build complex robot applications.
Definition: Middleware is software that sits between different applications or components, allowing them to communicate and share data. ROS 2 serves as this communication layer for robotic systems.
Why it matters: Modern robots are not simple machines controlled by a single program. They are sophisticated systems with multiple sensors gathering data, AI models making decisions, and actuators executing movements. All these components need to work together in real time. ROS 2 provides the communication backbone that makes this coordination possible.
How it works: ROS 2 creates a distributed network where each component runs as an independent process (called a "node"). These nodes communicate by sending messages to each other. This architecture allows different parts of a robot to work together without being tightly coupled to one another.
Think of ROS 2 as the phone network that lets different apps communicate. Just as WhatsApp can send a message to your email app without knowing how email works internally, ROS 2 lets a camera sensor send data to an AI vision model without the sensor knowing anything about AI. Each component focuses on its job, and ROS 2 handles the communication between them.
If you have worked with microservices or distributed systems, ROS 2 will feel familiar. It provides publish-subscribe messaging (like message queues), request-response patterns (like REST APIs), and long-running task management (like asynchronous job processing). The key difference is that ROS 2 is optimized for real-time robotics with strict timing requirements and high-frequency sensor data.
What Problem ROS 2 Solves
Robots face unique software challenges that traditional programming approaches cannot easily address. ROS 2 solves four critical problems:
1. Modularity and Reusability
Physical AI systems combine perception, planning, and control. Each subsystem is complex enough to require dedicated teams and specialized algorithms. ROS 2 allows developers to build these subsystems independently and connect them together.
For example, a team can develop a computer vision system for object detection without knowing anything about the motion planning algorithms. Another team can develop arm control without understanding the vision system. ROS 2 allows both teams to work in parallel and integrate their work seamlessly.
2. Distributed Computation
Humanoid robots have dozens of sensors and actuators operating simultaneously. Processing sensor data from cameras, LiDAR, IMUs, force sensors, and joint encoders requires significant computational power. ROS 2 enables this computation to be distributed across multiple processors or even multiple computers.
A humanoid robot might run perception algorithms on a GPU, motion planning on a CPU, and low-level motor control on dedicated microcontrollers. ROS 2 manages the communication between these distributed computational units.
3. Real-Time Communication
Physical AI systems interact with the physical world where timing matters. A humanoid robot walking must coordinate leg movements within milliseconds. If a command to lift the left foot arrives late, the robot could fall.
ROS 2 uses DDS (Data Distribution Service) as its communication middleware. DDS provides real-time, reliable message delivery with quality-of-service (QoS) policies that control message priority, delivery guarantees, and timing constraints.
4. Platform and Language Flexibility
Different parts of a robot system may use different programming languages and run on different operating systems. Computer vision algorithms might be written in Python using deep learning frameworks. Low-level motor controllers might be written in C++ for performance. ROS 2 supports both Python and C++ natively, with community support for other languages.
ROS 2 also runs on Linux, Windows, and macOS. This flexibility allows developers to choose the best tools for each subsystem without being locked into a single platform.
ROS 2 Architecture Overview
ROS 2 architecture consists of several layers, from low-level communication to high-level application code.
The ROS 2 Graph
At runtime, a ROS 2 system forms a graph where nodes are the vertices and communication channels are the edges. Each node is an independent process performing a specific computation. Nodes communicate by publishing and subscribing to named topics, calling services, or executing actions.
This graph structure provides flexibility. You can add new nodes, remove nodes, or replace nodes without affecting the rest of the system, as long as the communication interfaces remain compatible.
DDS: The Communication Foundation
ROS 2 builds on DDS (Data Distribution Service), an industry-standard middleware for distributed real-time systems. DDS handles the low-level details of message delivery, network communication, discovery of nodes, and quality-of-service guarantees.
Using DDS gives ROS 2 several advantages: proven reliability in aerospace and defense applications, real-time performance, and vendor interoperability. Developers work with ROS 2 APIs while DDS handles the complex networking underneath.
Language Client Libraries
ROS 2 provides client libraries that allow developers to write nodes in different programming languages. The two primary libraries are:
- rclpy: ROS Client Library for Python
- rclcpp: ROS Client Library for C++
These libraries provide identical functionality in each language, allowing teams to choose the best language for their needs while maintaining full compatibility.
Why ROS 2 Matters for Physical AI
Physical AI refers to artificial intelligence that operates in the physical world through robotic systems. Unlike software-only AI that processes data and returns results, physical AI must perceive the environment through sensors, make decisions, and execute actions through actuators.
ROS 2 enables physical AI in three critical ways:
1. Connecting AI Models to Sensors and Actuators
Modern AI models (vision transformers, large language models, reinforcement learning policies) are typically developed in Python using frameworks like PyTorch or TensorFlow. However, robots require low-level control code often written in C++ for real-time performance. ROS 2 bridges this gap, allowing AI models written in Python to seamlessly communicate with robot controllers written in C++.
2. Enabling Modular AI Architectures
Physical AI systems often combine multiple AI models: vision models for perception, language models for understanding commands, and policy models for planning actions. ROS 2 allows each model to run as a separate node, making it easy to swap models, upgrade algorithms, or run different models in parallel for redundancy.
3. Providing Real-Time Feedback Loops
Physical AI systems require continuous feedback between perception, decision-making, and actuation. A humanoid robot adjusting its balance uses sensor data from IMUs and force sensors, computes corrective actions, and sends motor commands hundreds of times per second. ROS 2's real-time communication ensures these feedback loops operate without delays.
The Nervous System Analogy
To understand ROS 2's architecture, consider the human nervous system.
Neurons are like nodes: Each neuron (or group of neurons) performs a specific function—some process visual information, others control muscles, and others make decisions. Similarly, each ROS 2 node performs one specific task: reading sensor data, running a perception algorithm, or controlling a motor.
Synapses are like topics: Neurons communicate by sending signals across synapses. They do not need to know which other neurons will receive their signals. Similarly, ROS 2 nodes publish messages to topics without knowing which other nodes subscribe to those messages.
The nervous system is like the ROS 2 graph: Just as the human nervous system is a network of interconnected neurons working together to control the body, the ROS 2 graph is a network of interconnected nodes working together to control the robot.
This distributed, message-passing architecture provides the same benefits in robots that evolution provided in biological systems: modularity, parallelism, and resilience.
Your brain does not have a single "master program" controlling everything. Instead, different brain regions handle vision, movement, balance, and decision-making independently. They coordinate by passing signals to each other. ROS 2 works the same way: many small programs working together instead of one giant program doing everything.
ROS 2 vs ROS 1
ROS 1 (Robot Operating System 1) was released in 2007 and became the standard middleware for research robotics. However, as robots moved from research labs to real-world applications, ROS 1's limitations became apparent. ROS 2, released in 2017, addressed these limitations.
Key Differences
Real-Time Support: ROS 1 was not designed for real-time systems. ROS 2 uses DDS, which provides deterministic communication suitable for safety-critical applications like autonomous vehicles and humanoid robots.
Distributed Architecture: ROS 1 required a central "master" process for node discovery and communication setup. If the master failed, the entire system stopped. ROS 2 uses DDS discovery, which is fully distributed with no single point of failure.
Security: ROS 1 had no built-in security. Any program could access any data. ROS 2 includes DDS security features like authentication, encryption, and access control, essential for commercial and safety-critical robots.
Multi-Platform Support: ROS 1 officially supported only Linux. ROS 2 supports Linux, Windows, and macOS, making development and deployment more flexible.
Why ROS 2 was created: As robots transitioned from research prototypes to commercial products (delivery robots, warehouse automation, autonomous vehicles, humanoid assistants), they required production-quality software. ROS 2 provides the reliability, security, and real-time performance needed for robots operating in uncontrolled environments alongside humans.
The transition from ROS 1 to ROS 2 mirrors the evolution in distributed AI training systems. Just as modern ML frameworks moved from single-GPU training to multi-node distributed training with fault tolerance and efficient communication, ROS 2 evolved from a research tool to a production-grade distributed robotics framework.
Real-World Humanoid Use Cases
ROS 2 powers humanoid robots across industry and research. Here are examples where ROS 2's capabilities are essential:
Boston Dynamics Atlas
Atlas, a research humanoid robot, performs complex dynamic movements like backflips and parkour. These movements require coordinating dozens of joints, processing IMU and visual data, and executing real-time balance control. ROS 2's real-time communication enables the high-frequency control loops needed for dynamic stability.
Service Robots in Healthcare
Humanoid service robots assist in hospitals by delivering medications, transporting equipment, and interacting with patients. These robots use ROS 2 to integrate navigation systems, manipulation controllers, and natural language interfaces. The modular architecture allows hospitals to customize robot capabilities without rewriting core systems.
Research Platforms
Universities worldwide use ROS 2-based humanoid platforms (like PAL Robotics TALOS or Agility Robotics Digit) for Physical AI research. Researchers develop new algorithms for walking, manipulation, and human-robot interaction. ROS 2's modularity allows researchers to replace specific components (like a walking controller) while keeping the rest of the system unchanged.
Manufacturing and Logistics
Humanoid robots are entering warehouses and factories for tasks requiring human-like dexterity. These robots use ROS 2 to coordinate vision systems (identifying objects), manipulation planners (reaching and grasping), and navigation systems (moving through dynamic environments). The distributed architecture allows scaling: multiple robots can operate in the same facility, coordinating through ROS 2 communication.
While humanoid robots are mechanically complex (20+ degrees of freedom, complex kinematics), ROS 2 abstracts away much of the software complexity. Instead of writing one monolithic control program, you can develop separate nodes for inverse kinematics, trajectory planning, and motor control, then connect them via ROS 2 topics and services.
Summary
This chapter introduced ROS 2 as the middleware framework that serves as the nervous system for physical AI systems.
Key takeaways:
- ROS 2 is middleware that provides communication infrastructure for robot software
- It solves problems of modularity, distributed computation, real-time communication, and platform flexibility
- ROS 2 uses a graph architecture where nodes (processes) communicate via topics, services, and actions
- The nervous system analogy helps understand distributed, message-passing architecture
- ROS 2 improves on ROS 1 with real-time support, no single point of failure, security, and multi-platform compatibility
- Humanoid robots use ROS 2 to coordinate complex subsystems for walking, manipulation, perception, and human interaction
Next: Chapter 2 explores the specific communication primitives (nodes, topics, services, actions) that make ROS 2 powerful for physical AI systems.