Mark Jouppi

Date: Summer 2014
Disciplines: Image Processing, Computer Vision, Machine Learning, Distributed Systems, Big Data, Remote Sensing
Technologies Used: Java, JavaScript, JUnit, various internal Google cluster computing/data storage/testing/building/monitoring systems
Team: This was my individual project within the Earth Engine team. Advised by Max Shawabkeh and Dr. David Thau.

Description:

Google has petabytes of satellite imagery from many sources. One of the most common goals with satellite imagery is classification: automatically determining what objects appear in an image. For example, in coarse-resolution imagery we may want to classify the image by land cover class (e.g., forest, urban), while in high-resolution imagery we might classify it in terms of buildings, cars, roads, etc. Classifying each pixel independently of its neighbors is often ineffective, especially in high-resolution imagery. Segmenting an image into objects and then classifying each object lets us take advantage of spatial information, which is often much more effective. But how do we obtain the segments in the first place?

To segment a satellite image into logical objects (e.g., to separate fields, towns, and lakes into distinct regions), I implemented a distributed region-merging segmentation algorithm. This bottom-up algorithm starts with each pixel as its own region and iteratively merges regions according to spectral and shape properties, whose relative importance is specified by the user. I decreased memory usage and computation time by over 50% each by implementing two kinds of region objects: lightweight ones that represent single pixels, and complex ones that keep track of their neighbor frontier and region properties as they expand. The code is now in production and can run on production servers that have limited memory. Excessive memory usage had been a problem in tests of an older segmentation algorithm a few years earlier, so I made sure this implementation was as efficient as possible.
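
The bottom-up merging loop described above can be sketched in Java on a small single-band image. This is only an illustration under stated assumptions, not the production implementation: the class name `RegionMergingSketch`, the union-find bookkeeping, the squared-mean-difference spectral term, the bounding-box-fill shape term, and the `spectralWeight` and `threshold` parameters are all invented for this example. For brevity it also rescans every adjacent pixel pair on each pass instead of tracking a neighbor frontier per region.

```java
import java.util.Arrays;

/** Illustrative single-band region merging; all names and cost terms are assumptions. */
final class RegionMergingSketch {
    final int width, height;
    final int[] parent;                 // union-find parent per pixel
    final int[] count;                  // pixels per region (valid at roots)
    final double[] sum;                 // sum of pixel values per region (valid at roots)
    final int[] minX, maxX, minY, maxY; // bounding box per region (valid at roots)

    RegionMergingSketch(double[][] image) {
        height = image.length;
        width = image[0].length;
        int n = width * height;
        parent = new int[n]; count = new int[n]; sum = new double[n];
        minX = new int[n]; maxX = new int[n]; minY = new int[n]; maxY = new int[n];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int i = y * width + x;  // every pixel starts as its own region
                parent[i] = i; count[i] = 1; sum[i] = image[y][x];
                minX[i] = maxX[i] = x; minY[i] = maxY[i] = y;
            }
        }
    }

    int find(int i) {
        while (parent[i] != i) { parent[i] = parent[parent[i]]; i = parent[i]; }
        return i;
    }

    /** Weighted cost of merging root regions a and b; lower means a better merge. */
    double cost(int a, int b, double spectralWeight) {
        double diff = sum[a] / count[a] - sum[b] / count[b];
        double spectral = diff * diff;
        int n = count[a] + count[b];
        int w = Math.max(maxX[a], maxX[b]) - Math.min(minX[a], minX[b]) + 1;
        int h = Math.max(maxY[a], maxY[b]) - Math.min(minY[a], minY[b]) + 1;
        double shape = (double) (w * h) / n - 1.0; // 0 when the merged region fills its bounding box
        return spectralWeight * spectral + (1.0 - spectralWeight) * shape;
    }

    /** Merges root region b into root region a and updates a's properties. */
    void union(int a, int b) {
        parent[b] = a;
        count[a] += count[b];
        sum[a] += sum[b];
        minX[a] = Math.min(minX[a], minX[b]);
        maxX[a] = Math.max(maxX[a], maxX[b]);
        minY[a] = Math.min(minY[a], minY[b]);
        maxY[a] = Math.max(maxY[a], maxY[b]);
    }

    /** Repeatedly merges the cheapest adjacent pair until none costs at most the threshold. */
    int[] segment(double spectralWeight, double threshold) {
        while (true) {
            double best = Double.POSITIVE_INFINITY;
            int bestA = -1, bestB = -1;
            for (int i = 0; i < parent.length; i++) {
                int x = i % width;
                // Right and lower neighbors cover every 4-connected pair exactly once.
                int[] neighbors = { x + 1 < width ? i + 1 : -1,
                                    i + width < parent.length ? i + width : -1 };
                for (int j : neighbors) {
                    if (j < 0) continue;
                    int a = find(i), b = find(j);
                    if (a == b) continue;
                    double c = cost(a, b, spectralWeight);
                    if (c < best) { best = c; bestA = a; bestB = b; }
                }
            }
            if (bestA < 0 || best > threshold) break;
            union(bestA, bestB);
        }
        int[] labels = new int[parent.length];
        for (int i = 0; i < labels.length; i++) labels[i] = find(i);
        return labels;
    }

    public static void main(String[] args) {
        double[][] image = {{10, 10, 50, 50}, {10, 10, 50, 50}};
        int[] labels = new RegionMergingSketch(image).segment(0.9, 1.0);
        System.out.println(Arrays.toString(labels));
    }
}
```

With a spectral weight of 0.9 and a threshold of 1.0, the 2x4 example image in `main` ends up with two regions, the left half and the right half. A priority queue of candidate merges would avoid the full rescan on every pass, at the cost of more bookkeeping.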

A large satellite image is split into tiles, each of which is sent to a machine and segmented independently. I designed and implemented a post-processing algorithm to merge regions across different tiles. After segmentation is computed for each tile, each tile examines the segmentation results of its upper and left neighbor tiles and updates its own segment labelling to merge with regions in those tiles if the merges are within acceptable thresholds. In this way, results are consistent between tiles. However, because of memory limitations, the tile size imposes a maximum on the size of the regions that can be produced.
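
The cross-tile step above can be sketched for one seam, between a tile and its left neighbor. This is a simplified illustration, not the actual post-processing code: the `Tile` class, the per-label mean as the only region property, the absolute-difference test, and the assumption that labels are already unique across tiles are all hypothetical choices made for the example. The upper neighbor would be handled the same way, using the tile's first row and the neighbor's last row.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Illustrative seam merge between a tile and its left neighbor; names are assumptions. */
final class TileSeamMergeSketch {
    /** Hypothetical per-tile result: a label per pixel plus a mean value per label. */
    static final class Tile {
        final int[][] labels;
        final Map<Integer, Double> meanByLabel;

        Tile(int[][] labels, Map<Integer, Double> meanByLabel) {
            this.labels = labels;
            this.meanByLabel = meanByLabel;
        }
    }

    /** Relabels regions of tile that touch left's right edge and are similar enough to merge. */
    static void mergeWithLeft(Tile tile, Tile left, double threshold) {
        Map<Integer, Integer> relabel = new HashMap<>();
        int lastCol = left.labels[0].length - 1;
        for (int y = 0; y < tile.labels.length; y++) {
            int mine = tile.labels[y][0];
            int theirs = left.labels[y][lastCol];
            if (relabel.containsKey(mine)) continue; // keep the first match for each region
            double diff = Math.abs(tile.meanByLabel.get(mine) - left.meanByLabel.get(theirs));
            if (diff <= threshold) relabel.put(mine, theirs);
        }
        // Apply the relabelling to every pixel of this tile, not just the seam column.
        for (int[] row : tile.labels) {
            for (int x = 0; x < row.length; x++) {
                row[x] = relabel.getOrDefault(row[x], row[x]);
            }
        }
        // Carry the adopted labels' means over so a later seam check can still look them up.
        for (int adopted : relabel.values()) {
            tile.meanByLabel.put(adopted, left.meanByLabel.get(adopted));
        }
    }

    public static void main(String[] args) {
        Map<Integer, Double> leftMeans = new HashMap<>();
        leftMeans.put(1, 10.0);
        leftMeans.put(2, 80.0);
        Tile left = new Tile(new int[][] {{1, 1}, {2, 2}}, leftMeans);

        Map<Integer, Double> tileMeans = new HashMap<>();
        tileMeans.put(101, 12.0);
        tileMeans.put(102, 40.0);
        Tile tile = new Tile(new int[][] {{101, 101}, {102, 102}}, tileMeans);

        mergeWithLeft(tile, left, 5.0);
        System.out.println(Arrays.deepToString(tile.labels));
    }
}
```

Because each tile only rewrites its own labels and only reads from tiles above and to its left, the seams can be processed without the neighbors having to change their results, which matches the one-directional update described above.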

[I've been told I can say whatever I want about the project, but I don't know if I can publish project images without specific permission/authorization, so I'm only including already-public/official media below.]

Earth Engine UI showing satellite image classified by land cover. Image courtesy Google Earth Engine

Rebecca Moore, manager of Earth Engine, discusses the platform

Google Internship: Large-Scale Satellite Image Segmentation