Green Screen Augmentation Enables Scene Generalisation in Robotic Manipulation

Dyson Robot Learning Lab, *Equal Contribution

GreenAug is a simple visual augmentation for robot policies: data is first collected in front of a green screen and then augmented with different background textures. The resulting policy transfers to unseen, visually distinct novel locations (scenes).

Abstract

Generalising vision-based manipulation policies to novel environments remains a challenging area with limited exploration. Current practices involve collecting data in one location, training imitation learning or reinforcement learning policies with this data, and deploying the policy in the same location. However, this approach lacks scalability as it necessitates data collection in multiple locations for each task. This paper proposes a novel approach where data is collected in a location predominantly featuring green screens. We introduce Green-screen Augmentation (GreenAug), employing a chroma key algorithm to overlay background textures onto a green screen. Through extensive real-world empirical studies with over 850 training demonstrations and 8.2k evaluation episodes, we demonstrate that GreenAug surpasses no augmentation, standard computer vision augmentation, and prior generative augmentation methods in performance. While no algorithmic novelties are claimed, our paper advocates for a fundamental shift in data collection practices. We propose that real-world demonstrations in future research should utilise green screens, followed by the application of GreenAug. We believe GreenAug unlocks policy generalisation to visually distinct novel locations, addressing the current scene generalisation limitations in robot learning.

Video

Tasks


We collected a total of 800+ demonstrations across 8 real-world tasks.

Method

GreenAug is a visual augmentation technique for RGB-based robot learning algorithms. It begins with setting up a green screen, replaces the scene background with various textures through chroma keying, and trains robot learning policies on these augmented images. This allows the policies to generalise to visually distinct novel locations (scenes). We explore several variants of GreenAug, which differ in what replaces the keyed-out background: random textures (GreenAug-Rand), generated scenes (GreenAug-Gen), or a plain mask (GreenAug-Mask).
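The core operation can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' exact implementation: it keys out the green screen with an HSV threshold and composites a texture behind the foreground. The function names (`green_screen_mask`, `greenaug_rand`) and the HSV bounds are illustrative assumptions; in practice the chroma key is tuned to the lighting and screen material.

```python
# Minimal GreenAug-style chroma-key augmentation (illustrative sketch, not the
# authors' exact implementation). Requires OpenCV and NumPy.
import cv2
import numpy as np

def green_screen_mask(image_bgr, lower_hsv=(35, 80, 40), upper_hsv=(85, 255, 255)):
    """Binary mask that is 1 where the green screen is visible.

    The HSV bounds are assumptions; in practice they are tuned to the
    lighting and the particular green screen material.
    """
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(lower_hsv), np.array(upper_hsv))
    # Light morphological clean-up to remove speckle around object edges.
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
    return (mask > 0).astype(np.uint8)

def greenaug_rand(image_bgr, texture_bgr):
    """Replace the keyed-out green screen region with a background texture."""
    mask = green_screen_mask(image_bgr)[..., None]                      # H x W x 1
    texture = cv2.resize(texture_bgr, (image_bgr.shape[1], image_bgr.shape[0]))
    # Keep foreground pixels, paste the texture where the green screen was.
    return (image_bgr * (1 - mask) + texture * mask).astype(np.uint8)
```

During training, a fresh texture is sampled for every frame (or trajectory), so the policy never sees the same background twice.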

Results

Scene Generalisation with Imitation Learning

Each task is trained in a single location and evaluated in three visually distinct novel locations (videos: Train Scene, Test Scene 1, Test Scene 2, Test Scene 3).
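For context, the sketch below shows where GreenAug sits in an imitation-learning pipeline: the augmentation is applied on the fly as demonstrations are sampled. The dataset layout, policy, and loss are illustrative assumptions rather than the paper's actual training code.

```python
# Hedged sketch: GreenAug applied on the fly in a behaviour-cloning pipeline.
# Dataset layout, policy architecture, and loss are illustrative assumptions.
import random
import torch
import torch.nn as nn

class GreenAugBCDataset(torch.utils.data.Dataset):
    def __init__(self, demos, textures, augment_fn):
        self.demos = demos            # list of (image_bgr, action) pairs
        self.textures = textures      # pool of background texture images
        self.augment_fn = augment_fn  # e.g. greenaug_rand from the sketch above

    def __len__(self):
        return len(self.demos)

    def __getitem__(self, idx):
        image, action = self.demos[idx]
        texture = random.choice(self.textures)   # new background every sample
        image = self.augment_fn(image, texture)
        image = torch.from_numpy(image).permute(2, 0, 1).float() / 255.0
        return image, torch.as_tensor(action, dtype=torch.float32)

def train_bc(policy, loader, epochs=10, lr=1e-4):
    """Standard behaviour cloning; only the observations are augmented."""
    optimiser = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(epochs):
        for images, actions in loader:
            loss = nn.functional.mse_loss(policy(images), actions)
            optimiser.zero_grad()
            loss.backward()
            optimiser.step()
```

A standard DataLoader over this dataset then feeds the learner; nothing else in the imitation-learning algorithm changes.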

Further Study: Scene Generalisation with Reinforcement Learning

We also applied GreenAug to the Coarse-to-fine Deep Q Network, a value-based reinforcement learning algorithm (a sketch of how the augmentation plugs into a value-based update is given below).
Train Scene: green screen cloth placed on the table. Test Scene: green screen cloth removed, with distractors added on the table.
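As with imitation learning, the augmentation plugs in at the observation level. The sketch below shows this for a generic DQN-style TD update rather than the Coarse-to-fine Deep Q Network itself; the replay-batch format and networks are illustrative assumptions.

```python
# Hedged sketch: plugging GreenAug into a value-based RL update. This is a
# generic DQN-style TD update, not the Coarse-to-fine Deep Q Network itself;
# the replay-batch format and networks are illustrative assumptions.
import random
import torch
import torch.nn.functional as F

def augment_batch(obs_batch, textures, augment_fn):
    """Chroma-key augment a batch of HWC uint8 observations, return a float tensor."""
    return torch.stack([
        torch.from_numpy(augment_fn(o, random.choice(textures))).permute(2, 0, 1).float() / 255.0
        for o in obs_batch
    ])

def dqn_update(q_net, target_net, optimiser, batch, textures, augment_fn, gamma=0.99):
    obs, actions, rewards, next_obs, dones = batch
    # Augment observations sampled from the replay buffer before the TD update.
    obs = augment_batch(obs, textures, augment_fn)
    next_obs = augment_batch(next_obs, textures, augment_fn)
    with torch.no_grad():
        target_q = rewards + gamma * (1.0 - dones.float()) * target_net(next_obs).max(dim=1).values
    q = q_net(obs).gather(1, actions.long().unsqueeze(1)).squeeze(1)
    loss = F.mse_loss(q, target_q)
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()
```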