Before you start
Set up your environment
- Light the subject well. Natural or overhead lighting works best. Avoid recording with a bright window directly behind the subject; this creates silhouetting that degrades video quality and hurts object recognition.
- Clear the workspace. Remove objects and clutter that aren’t part of the procedure. A clean field of view helps the tracker focus on the right objects and makes the footage easier to review.
- Know the procedure in advance. Walk through what you’re going to do before starting the recording. Captures of someone working from memory mid-session produce hesitant, disjointed footage that’s hard to learn from. Think of it like a rehearsed demonstration, not a first attempt.
Check your tracked objects
If the skill you’re recording against has tracked objects configured, confirm they all appear in the pre-recording sheet before tapping Start. Any object not shown there won’t be tracked during the session, and the list can’t be changed once recording begins. Position the objects where they’ll be used before starting. Getting them into frame early gives the tracker more time to acquire a lock.
During the session
Move slowly and deliberately
This is the single most important thing you can do for a good capture. Pose tracking works best when head movements are smooth and intentional. Fast pans, sudden turns, and shaky movements introduce noise into the pose data and can cause brief tracking losses. The same applies to hand tracking; quick, jerky motions are harder to reconstruct than smooth, deliberate ones.
A good rule of thumb: move at roughly half the speed you would in a normal work context. It will feel slow. That’s correct.
When transitioning between steps (turning to pick something up, moving to a different part of the workspace), turn your whole body rather than just your head, and do it smoothly.
Narrate everything out loud
On-device transcription produces a timestamped record of everything you say during the session. This transcript becomes part of the capture data and is used to generate captions and training outputs. A well-narrated capture is dramatically more useful than a silent one.
Narrate as you go, not after. Describe what you’re about to do just before doing it, then describe what you did. For example:
“I’m picking up the torque wrench; this one is set to 25 foot-pounds. I’m positioning it on the left-side bolt here, and applying steady pressure clockwise until I feel the click.”
Speak clearly and at a normal pace. Avoid filler words. If you make a mistake, narrate that too (“I grabbed the wrong tool there, let me swap to the correct one”) so reviewers understand what they’re seeing. Don’t worry about sounding scripted. A clear, informative narration is far more valuable than a natural-sounding but vague one.
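If you can export the timestamped transcript, one way to review narration coverage after a session is to look for long stretches with no speech. The snippet below is a minimal sketch under stated assumptions, not part of the product: the `(start, end, text)` segment shape, the `find_silent_gaps` helper, and the 10-second threshold are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical transcript shape (an assumption, not the real export format):
# one (start_seconds, end_seconds, text) tuple per spoken segment.
Segment = Tuple[float, float, str]


def find_silent_gaps(
    segments: List[Segment],
    session_end: float,
    max_gap: float = 10.0,  # illustrative threshold, not a documented value
) -> List[Tuple[float, float]]:
    """Return (gap_start, gap_end) spans longer than max_gap with no narration."""
    gaps = []
    cursor = 0.0  # end time of the latest speech seen so far
    for start, end, _text in sorted(segments):
        if start - cursor > max_gap:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    # Also check the stretch between the last segment and the end of the session.
    if session_end - cursor > max_gap:
        gaps.append((cursor, session_end))
    return gaps


example = [
    (0.0, 4.5, "I'm picking up the torque wrench."),
    (5.0, 9.0, "Positioning it on the left-side bolt."),
    (31.0, 34.0, "Applying steady pressure clockwise."),
]
print(find_silent_gaps(example, session_end=40.0))  # [(9.0, 31.0)]
```

A flagged gap is not necessarily a problem, but it is a good place to check whether a transition or decision went unnarrated.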
Pause between steps
After completing each discrete step, pause for 1–2 seconds before moving to the next. This gives the tracker time to settle, produces clean step boundaries in the data, and makes the footage easier to review and annotate later. Think in terms of chapters: complete a step, hold briefly, move on.
Keep the subject in frame
If you’re demonstrating a procedure on an object or surface, keep it in your field of view as much as possible. The stereo cameras capture what you’re looking at; if the subject drifts to the edge of your view, the useful data goes with it.
For bench work, position yourself so the work surface is roughly centered in your view at a comfortable working distance. Avoid leaning in very close (closer than about 30 cm) or stepping far back; both extremes degrade tracking quality.
Let object tracking acquire a lock
When you first encounter a tracked object, look directly at it for 1–2 seconds before touching or moving it. This gives the tracker time to acquire a lock. You’ll see the highlight appear when it does. If a tracked object loses its lock mid-session (the highlight disappears), look directly at it again before continuing with that part of the procedure.
Common mistakes to avoid
| Mistake | Why it matters | Fix |
|---|---|---|
| Moving too fast | Introduces noise into pose and hand data | Slow down intentionally; half speed is a good target |
| Working in silence | Transcript is empty or sparse, so captions and training outputs are lower quality | Narrate continuously, including transitions and decisions |
| Poor lighting | Degrades video and object recognition | Ensure good overhead or front lighting before starting |
| Rushing through steps | No natural pause points in the data | Pause 1–2 seconds between each discrete step |
| Not checking tracked objects first | Objects aren’t tracked even though they’re present | Confirm every object appears in the pre-recording sheet before tapping Start |
| Looking away from the subject | Subject drops out of frame at key moments | Keep work surface centered in your view throughout |
| Recording a first attempt | Hesitation and mistakes produce noisy data | Rehearse the procedure before recording |
