In this chapter, we set up a simple XR rig in Unity and track our headset and controllers with virtual objects.


Spatial Data Is Always Relative

Before diving into implementation, let’s clarify what we mean by tracking:

A head-mounted display (HMD) and its motion controllers provide three sets of pose data. Tracking involves transforming this data and applying it to virtual objects within the scene.

However, spatial data is meaningless without a reference point. To design an effective tracking system, we must understand what these values are relative to.

Here are two key questions to consider:

  • What are the reported HMD and motion controller pose values relative to?
  • What should their virtual counterparts’ pose values be relative to?

Notice the difference in wording: the first question begins with “what,” while the second starts with “what should.” This distinction matters.

At the application level, we do not choose the reference pose[1] for the devices’ reported pose values. We only need to find out what it is so that we can apply the values correctly.

However, for their virtual counterparts, the reference pose is entirely up to our design.

Grounded and Looking Forward

If your development environment matches mine (Meta Quest 3 connected to a PC via Air Link, with OpenXR as the runtime) and you access spatial data through the same InputAction mappings as in my example code[2], you’ll find that the data is relative to the HMD’s pose at the last recenter action, but grounded and facing forward: lowered to the floor, with any tilt removed.

When a recentering event occurs, the following steps take place:

  1. The HMD’s pose is recorded.
  2. The pose is adjusted to remove tilt, so that the forward direction (blue arrow) lies in the horizontal plane.
  3. The pose is lowered until it touches the ground[3]. The runtime always knows the “up” direction from the IMU (Inertial Measurement Unit), and it determines the floor height from the user-defined play area.
  4. This processed pose becomes the reference pose for all tracking data.
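
The four steps above can be sketched as a small Unity helper. This is only an illustration of the geometry, not what any runtime literally executes: `ReferencePoseSketch`, `FromHmdPose`, and the `floorHeight` parameter are hypothetical names, and the sketch assumes the raw HMD pose is given in a gravity-aligned space where +Y is up.

```csharp
using UnityEngine;

public static class ReferencePoseSketch
{
    // Hypothetical helper: derives a grounded, level reference pose from a raw HMD pose.
    // Assumes the inputs live in a gravity-aligned space (+Y up) and that
    // floorHeight is the floor's Y coordinate in that same space.
    public static Pose FromHmdPose(Vector3 hmdPosition, Quaternion hmdRotation, float floorHeight)
    {
        // Step 1: the recorded HMD pose is the input.
        Vector3 hmdForward = hmdRotation * Vector3.forward;

        // Step 2: remove tilt by projecting forward onto the horizontal plane.
        Vector3 flatForward = Vector3.ProjectOnPlane(hmdForward, Vector3.up);
        if (flatForward.sqrMagnitude < 1e-6f)
        {
            // Looking straight up or down leaves no horizontal component,
            // so use the HMD's up axis instead (flipped when looking up).
            Vector3 hmdUp = hmdRotation * Vector3.up;
            flatForward = Vector3.ProjectOnPlane(hmdForward.y > 0f ? -hmdUp : hmdUp, Vector3.up);
        }
        Quaternion levelRotation = Quaternion.LookRotation(flatForward, Vector3.up);

        // Step 3: lower the pose to the floor, keeping its horizontal position.
        Vector3 groundedPosition = new Vector3(hmdPosition.x, floorHeight, hmdPosition.z);

        // Step 4: this is the pose that tracking data would be reported relative to.
        return new Pose(groundedPosition, levelRotation);
    }
}
```

An HMD that is already level and resting on the floor comes back unchanged, which matches the idea that such a pose is the reference itself.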

To visualize this, imagine the user standing where they last performed a recenter. If they take off the HMD and set it on the floor between their feet, level and pointing in the direction they were facing, its reported position will be Vector3.zero and its rotation will be Quaternion.identity.

Note that some platforms automatically perform a recenter when launching an application. This isn’t universal—test your target platform to determine if it does something similar.

[Insert Visual Illustration Here]

The Virtual Setup

Now that we understand what the device-reported pose values are relative to, we need to establish a tracking origin: an empty object whose pose is the virtual world’s counterpart of the reference pose.

To sync tracking data correctly, we create three child objects under the tracking origin, one per device. Each child mirrors the reported pose of its device by writing it to its own local position and rotation:

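```csharp
// Runs on each tracked child of the tracking origin.
// Writing to the local transform keeps the pose relative to the tracking origin.
private void HandleRotationDataUpdated(Quaternion newRotation)
{
    transform.localRotation = newRotation;
}

private void HandlePositionDataUpdated(Vector3 newPosition)
{
    transform.localPosition = newPosition;
}
```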
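
How these handlers get called is not shown in this excerpt. One possible wiring, assuming Unity's Input System package and two actions assigned in the Inspector (one bound to the device's position, one to its rotation), might look like the sketch below. The `TrackedPoseFollower` class and its field names are hypothetical and may differ from the example project.

```csharp
using UnityEngine;
using UnityEngine.InputSystem;

// Hypothetical component; attach one to each child of the tracking origin.
public class TrackedPoseFollower : MonoBehaviour
{
    // Assumed to be assigned in the Inspector: one action producing a Vector3
    // (device position) and one producing a Quaternion (device rotation).
    [SerializeField] private InputActionProperty positionAction;
    [SerializeField] private InputActionProperty rotationAction;

    private void OnEnable()
    {
        positionAction.action.performed += OnPositionPerformed;
        rotationAction.action.performed += OnRotationPerformed;

        // Actions only raise callbacks while they are enabled.
        positionAction.action.Enable();
        rotationAction.action.Enable();
    }

    private void OnDisable()
    {
        positionAction.action.performed -= OnPositionPerformed;
        rotationAction.action.performed -= OnRotationPerformed;
    }

    private void OnPositionPerformed(InputAction.CallbackContext context)
    {
        HandlePositionDataUpdated(context.ReadValue<Vector3>());
    }

    private void OnRotationPerformed(InputAction.CallbackContext context)
    {
        HandleRotationDataUpdated(context.ReadValue<Quaternion>());
    }

    // Same handlers as above: the reported pose goes into the local transform.
    private void HandleRotationDataUpdated(Quaternion newRotation)
    {
        transform.localRotation = newRotation;
    }

    private void HandlePositionDataUpdated(Vector3 newPosition)
    {
        transform.localPosition = newPosition;
    }
}
```

Subscribing in OnEnable and unsubscribing in OnDisable keeps the callbacks from firing on a disabled or destroyed object.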

Let’s also attach some 3D objects to our virtual HMD and controllers to help visualize their poses, and a camera to the virtual HMD.[4]

The “Standard” Pose

This setup of ours is straightforward:

  • The tracking origin is the virtual counterpart of the reference pose in the real world.
  • The positions and rotations of the virtual HMD and controllers relative to the tracking origin mirror those of the physical HMD and controllers relative to the reference pose.
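
Because the tracked objects are children of the tracking origin, Unity composes their world poses for us. The sketch below spells out that composition by hand to make the relationship explicit. `TrackingMath.ToWorldPose` is a hypothetical helper, not part of the example project.

```csharp
using UnityEngine;

public static class TrackingMath
{
    // Hypothetical helper: converts a device pose reported relative to the
    // reference pose into a world pose, given the tracking origin's transform.
    public static Pose ToWorldPose(Transform trackingOrigin, Vector3 localPosition, Quaternion localRotation)
    {
        // TransformPoint applies the origin's position, rotation, and scale.
        Vector3 worldPosition = trackingOrigin.TransformPoint(localPosition);

        // Rotations compose parent first, then the local rotation.
        Quaternion worldRotation = trackingOrigin.rotation * localRotation;

        return new Pose(worldPosition, worldRotation);
    }
}
```

Moving or rotating the tracking origin therefore moves the virtual HMD and controllers with it, without touching the reported values.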

It is, however, crucial to interpret the tracking origin correctly:

The pose of the tracking origin defines the user’s “standard pose” in the application’s current state.

To better understand what “standard” means in practice, consider these examples:

  1. Menu-Based Applications
    • If your app features a large 2D menu standing in the scene, you’d position the tracking origin at a comfortable distance from it, facing towards it.
  2. Seated Experiences (e.g., Sim Racing Games)
    • If the player is seated, the tracking origin should be at ground level, positioned directly under the seat, and aligned forward.
  3. World Alignment
    • Most applications define a fixed “up” direction in world space to align the virtual ground with the real-world floor.
    • As long as this remains true, the tracking origin’s local “up” direction should always match world space “up”.
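
As a sketch of the first and third examples, the helper below places the tracking origin at a chosen ground-level point and turns it toward a target such as a menu, while keeping its up axis equal to world up. `TrackingOriginPlacement`, `PlaceFacing`, and its parameters are hypothetical names. Choosing the stand point and a comfortable distance is left to the application.

```csharp
using UnityEngine;

public static class TrackingOriginPlacement
{
    // Hypothetical helper: puts the tracking origin at standPoint (assumed to be
    // on the virtual ground) and faces it toward target, staying level.
    public static void PlaceFacing(Transform trackingOrigin, Vector3 standPoint, Vector3 target)
    {
        // Ignore any height difference so the origin never tilts up or down.
        Vector3 toTarget = Vector3.ProjectOnPlane(target - standPoint, Vector3.up);

        if (toTarget.sqrMagnitude < 1e-6f)
        {
            // The target is directly above or below the stand point, so there is
            // no meaningful heading; move the origin and keep its rotation.
            trackingOrigin.position = standPoint;
            return;
        }

        // World up as the second argument keeps local up equal to world up.
        Quaternion facing = Quaternion.LookRotation(toTarget, Vector3.up);
        trackingOrigin.SetPositionAndRotation(standPoint, facing);
    }
}
```

For a menu centered at eye height, for example, calling `PlaceFacing(origin, new Vector3(0f, 0f, -2f), menu.position)` would stand the user two meters from a menu located above the world origin.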

In short, the tracking origin determines the player’s virtual position and rotation when recentered. You can think of it as a marker on the ground of the virtual world that indicates where the player should be in the scene and which direction they should be facing.

Many common application-level tracking issues arise from failing to define the standard pose based on the program’s current state—or not using a tracking origin at all.

We’ll talk more about these issues in our upcoming chapters. But for now, we’ve got ourselves a minimalist XR rig setup: load scene 1.00 - Basic Tracking, connect your headset, hit play, and observe it in action.


Next: 1.01 – Recentering
