I’m currently studying for a PhD at the Australian Centre for Field Robotics (ACFR), at The University of Sydney. Working with the Marine Robotics team, my area of interest is the automated interpretation of the data collected by our Autonomous Underwater Vehicles (AUVs). For some background on why I started the PhD, check out the first post in this blog.
Our AUVs regularly perform surveys of the sea floor, both in Australia and around the world. Typically, the requests come from marine biologists and ecologists, climate scientists, and archaeologists. In a matter of hours, the AUVs can be deployed from a small boat to map an area of the sea floor (coral beds, kelp forests, urchin barrens and even ancient shipwrecks). They return with tens of thousands of stereo images, along with other sensor data, from which a 3D map of the area can be constructed.
Where does “Data Science” come in?
When people think about “robotics”, typically the first things that come to mind are physical systems, moving parts, navigation and control. My interest, however, is in what we do with the data the robot collects. It’s one thing to have a robot that can autonomously go off and have underwater adventures, navigating and mapping an area. It’s another thing entirely to present information to the user (usually a scientist) that is accurate, intuitively understood and relevant, based on the masses of raw data that have been collected. In terms of the problems encountered, it makes little difference whether the challenge involves a scientist and a marine robot, or a corporate user and a mass of product analytics data. All the hype and buzzwords aside, this is essentially what “Data Science” is – taking large amounts of data and presenting it to real people in a way they can understand.
The current practice for users of our AUV data is to manually label image content. In the image below, the scientists manually labelled each of 50 different points with a particular tag. With dozens of tags to choose from, and the frequent need to zoom in to the pixel level to see exactly what is there, this is a time-consuming and tedious task.
The main aim of my research is to find a way to automate this process, such that users of our data receive not only a 3D map of the sea floor, but a semantic map which describes what is on the sea floor. By providing a rich, dense map of what species, objects and features are present, we can not only ease the burden of manual labelling, but also change the way marine science is done. Rather than needing to extrapolate from a few points on a few of the AUV images, it would become a simple task to identify where interesting content is located, how it is distributed, and how it is changing over time.
In machine learning, the supervised classification task involves taking a set of labelled training data, extracting some relevant features, using a classification algorithm (such as a Support Vector Machine) to learn the relationship between the features and the labels, and then predicting the labels on new data.
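The pipeline described above can be sketched in a few lines of scikit-learn. This is only an illustration: the features here are synthetic random clusters, whereas in practice they would be descriptors extracted from AUV imagery.

```python
# A minimal sketch of supervised classification: labelled training data,
# a Support Vector Machine to learn the feature-label relationship, and
# prediction on held-out data. The data is synthetic, standing in for
# real image features.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Two well-separated clusters standing in for two classes.
X = np.vstack([rng.normal(0.0, 1.0, (50, 4)),
               rng.normal(3.0, 1.0, (50, 4))])
y = np.array(["kelp"] * 50 + ["not kelp"] * 50)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Learn the mapping from features to labels...
clf = SVC(kernel="rbf").fit(X_train, y_train)

# ...then predict labels for new, unseen data.
predictions = clf.predict(X_test)
print(clf.score(X_test, y_test))
```

On data this cleanly separated the classifier scores near-perfectly; real sea-floor imagery is far messier.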
Typically, these labels are either binary (as in “this is kelp” vs “this is not kelp”), or multiclass (as in “this is kelp”, “this is coral”, and “this is neither kelp nor coral”). When it comes to a complex environment, however, such simple labelling schemes are insufficient.
In early 2013, the Catami Project identified an entire taxonomy of over 100 biological species of interest to Australian marine scientists. In it, “Crabs” are a type of “Crustacea”, in the same way that a Labrador is a type of dog.
Traditional multi-class classification assumes that the classes are mutually exclusive, so the fact that an object could be correctly labelled as “Kelp”, “Algae” or even “Biological” (as opposed to “Physical”) means that a different approach is required.
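A toy sketch makes the problem concrete: with a label hierarchy, a single point has several simultaneously correct labels. The taxonomy fragment below is hypothetical, in the spirit of the Catami scheme rather than its actual class list.

```python
# Each label points to its parent; None marks the root of that branch.
# This is an illustrative, simplified fragment, not the real taxonomy.
parents = {
    "Kelp": "Algae",
    "Algae": "Biological",
    "Crabs": "Crustacea",
    "Crustacea": "Biological",
    "Biological": None,
}

def ancestry(label):
    """Return the label and all of its ancestors, leaf first."""
    chain = []
    while label is not None:
        chain.append(label)
        label = parents[label]
    return chain

# One object labelled "Kelp" is simultaneously a correct instance of
# "Algae" and "Biological" -- so the classes are not mutually exclusive.
print(ancestry("Kelp"))  # ['Kelp', 'Algae', 'Biological']
```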
An Example Framework
Hierarchical classification is a type of machine learning where the labels in the data set form some kind of tree or graph structure, rather than a simple list of alternatives. I presented an initial framework for dealing with this kind of data in May 2013, at the Fine Grained Visual Classification workshop at CVPR in Portland, Oregon. The abstract is available for download.
The interactive visualisation below shows the performance on a sample data set, for classification using PCA and an SVM. Moving the mouse over a node adjusts edge widths to show where true instances of that class ended up being classified, and displays an example image of that class with graphs of its PCA components. Red bars next to nodes represent the f1 score for that node, and the numbers inside each node box, below the f1 score, are that node’s confusion matrix entries.
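For readers unfamiliar with these metrics, here is a small sketch of how a per-class f1 score and a confusion matrix are computed with scikit-learn. The labels and predictions below are made up for illustration only.

```python
# Per-class f1 scores and a confusion matrix, as shown per node in the
# visualisation. Hypothetical true and predicted labels.
from sklearn.metrics import confusion_matrix, f1_score

y_true = ["Kelp", "Kelp", "Coral", "Coral", "Kelp", "Coral"]
y_pred = ["Kelp", "Coral", "Coral", "Coral", "Kelp", "Kelp"]
labels = ["Kelp", "Coral"]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)  # [[2 1]
           #  [1 2]]

# One f1 score per class (average=None keeps them separate).
f1 = f1_score(y_true, y_pred, labels=labels, average=None)
print(f1)
```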