Monday, June 16, 2014

Identifying Balloons Using Computer Vision

Note: This is a series of articles in which we document our attempt to build an autonomous drone for the Sparkfun AVC.  The previous post is here.

This is a followup to my previous article describing how to compile a hardware accelerated version of OpenCV.  You can find that article here, but to briefly recap: part of the Sparkfun AVC is finding and popping red balloons. We wanted to do this using on-board vision processing. For performance reasons we need to hardware accelerate the vision system. We are using the Jetson TK1 board from NVIDIA with the OpenCV library.

What follows is a description of the vision system we developed. The full code for the vision system is available on GitHub.

The OpenCV Project

The Open Source Computer Vision library (OpenCV) is easy to use and lets you quickly program solutions to a wide variety of computer vision problems. A simple computer vision program using OpenCV normally has the following structure:

  1. read in a frame from a video camera or a video file
  2. use image transformations to maximize the visibility of the target
  3. extract target geometry
  4. filter and output information about the target.
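The four steps above can be sketched without OpenCV at all. The toy example below is only an illustration under stated assumptions: the `Frame` and `Box` types and all four functions are hypothetical names invented here, the "camera" is a hard-coded 6x4 grayscale image, and a bounding box stands in for real contour extraction.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical grayscale frame: row-major pixels, width * height entries.
struct Frame {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
};

// Hypothetical axis-aligned bounding box, inclusive on all sides.
struct Box {
    int min_x, min_y, max_x, max_y;
};

// Step 1 (stand-in): a real program would read from a camera or video file.
// Here we build a dark frame with a bright 2x2 target at columns 2..3, rows 1..2.
Frame read_frame() {
    Frame f{6, 4, std::vector<std::uint8_t>(6 * 4, 10)};
    for (int y = 1; y <= 2; ++y)
        for (int x = 2; x <= 3; ++x)
            f.pixels[static_cast<std::size_t>(y * f.width + x)] = 250;
    return f;
}

// Step 2: transform the image so the target stands out (binary threshold).
Frame binarize(const Frame& in, std::uint8_t thresh) {
    Frame out = in;
    for (std::uint8_t& p : out.pixels) p = (p > thresh) ? 255 : 0;
    return out;
}

// Step 3: extract target geometry (bounding box of all set pixels).
// Returns false when the mask contains no set pixels.
bool bounding_box(const Frame& mask, Box& box) {
    bool found = false;
    for (int y = 0; y < mask.height; ++y) {
        for (int x = 0; x < mask.width; ++x) {
            if (mask.pixels[static_cast<std::size_t>(y * mask.width + x)] == 0) continue;
            if (!found) {
                box = Box{x, y, x, y};
                found = true;
            } else {
                if (x < box.min_x) box.min_x = x;
                if (x > box.max_x) box.max_x = x;
                if (y < box.min_y) box.min_y = y;
                if (y > box.max_y) box.max_y = y;
            }
        }
    }
    return found;
}

// Step 4: filter out targets that are too small to be interesting.
bool big_enough(const Box& b, int min_side) {
    return (b.max_x - b.min_x + 1) >= min_side && (b.max_y - b.min_y + 1) >= min_side;
}
```

Calling `read_frame`, `binarize`, `bounding_box` and `big_enough` in that order mirrors the numbered list; in a real OpenCV program each stage would be replaced by the corresponding library call.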

For hardware accelerated applications, OpenCV provides hardware accelerated replacements for most of its regular functions. These functions are located in the gpu namespace, which is described in the OpenCV documentation.

The Algorithm

When trying to identify our balloon targets, the first thing I tried was converting the video stream to grayscale, because that is fast and cheap. However, grayscale did not distinguish the balloons from the sky well enough. I then tried converting the stream to HSV (Hue, Saturation, Value), because it is good for identifying objects of a particular color and relatively simple to do. The balloons are quite distinct in both the hue and saturation channels, but neither channel alone is sufficient to clearly distinguish the balloons against both the sky and the trees. To resolve this, I multiplied the two channels together, which yielded good contrast with the background.
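To make the hue and saturation channels concrete, here is a minimal sketch of how a single RGB pixel maps to them. It assumes the usual 8-bit convention (hue in degrees divided by two, saturation scaled to 0..255); the `HueSat` type and `rgb_to_hue_sat` function are hypothetical names for illustration, not part of OpenCV, and the rounding may differ slightly from what the library produces.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical result type: hue in 0..179, saturation in 0..255.
struct HueSat {
    int hue;
    int sat;
};

// Convert one RGB pixel to hue and saturation.
HueSat rgb_to_hue_sat(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    const int max_c = std::max({int(r), int(g), int(b)});
    const int min_c = std::min({int(r), int(g), int(b)});
    const int delta = max_c - min_c;

    HueSat out{0, 0};
    if (max_c == 0 || delta == 0) return out;  // black or gray: no hue, report 0

    out.sat = static_cast<int>(std::lround(255.0 * delta / max_c));

    // Hue in degrees on the color wheel, 0 = red, 120 = green, 240 = blue.
    double degrees;
    if (max_c == r)      degrees = 60.0 * (g - b) / delta;
    else if (max_c == g) degrees = 120.0 + 60.0 * (b - r) / delta;
    else                 degrees = 240.0 + 60.0 * (r - g) / delta;
    if (degrees < 0.0) degrees += 360.0;

    // Halve the angle so it fits in eight bits; wrap 180 back to 0.
    out.hue = static_cast<int>(std::lround(degrees / 2.0)) % 180;
    return out;
}
```

Pure red maps to hue 0 and a slightly bluish red maps to a hue just below 180, which is why red shows up at both ends of the hue range.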

Here is the code implementing that section of the algorithm.

gpu::absdiff(hue, Scalar(90), huered);
gpu::divide(huered, Scalar(4), scalehuered);
gpu::divide(sat, Scalar(16), scalesat);
gpu::multiply(scalehuered, scalesat, balloonyness);
gpu::threshold(balloonyness, thresh, 200, 255, THRESH_BINARY);

Hue is normally defined with a range of 0..360 (corresponding to a color wheel), but to fit into eight bits it is rescaled to a range of 0..180. Red corresponds to both edges of the range, so the absolute difference between the hue and 90 measures how close the hue is to red (huered), with a range of 0..90. The redness and saturation are both divided by constants chosen to give them appropriate weightings and so that their product roughly fits the 8-bit range of the destination. That result, which I call balloonyness (i.e. how much any given pixel looks like the color of the target balloon), is then passed through a binary threshold: any pixel value above 200 is mapped to 255 and anything else is mapped to zero. The resulting binary image is stored in the thresh variable. The threshold function is described in the OpenCV documentation; the GPU version is equivalent.
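For readers who want to check the arithmetic, here is a per-pixel sketch of the same calculation in plain C++. It is a simplification, not the GPU code: the function names are hypothetical, the divisions truncate (the library rounds when storing 8-bit results), and the product is clamped explicitly to imitate 8-bit saturation.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>

// How "balloon colored" one pixel is, given 8-bit hue (0..179) and saturation (0..255).
std::uint8_t balloonyness(std::uint8_t hue, std::uint8_t sat) {
    const int huered = std::abs(static_cast<int>(hue) - 90);  // 0..90, largest near red
    const int scaled_hue = huered / 4;                         // 0..22
    const int scaled_sat = sat / 16;                           // 0..15
    const int product = scaled_hue * scaled_sat;               // can reach 330
    return static_cast<std::uint8_t>(std::min(product, 255)); // clamp to eight bits
}

// Binary threshold: values strictly above thresh become 255, everything else 0.
std::uint8_t threshold_binary(std::uint8_t value, std::uint8_t thresh = 200) {
    return value > thresh ? 255 : 0;
}
```

With these constants, a fully saturated red pixel (hue 0 or 179) clamps to 255 and passes the threshold, while a fully saturated blue pixel (hue 120) scores only 105 and is rejected, as is a red pixel with a saturation of 100.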

Once the image has been thresholded, I extract external contours. The external contours correspond to the outlines of the balloons. The balloons are nearly circular, so I find the minimal enclosing circle around the contours. Then, to deal with noise (red things that aren't shaped like balloons), I compare the area of the circle with the area of the contour it encloses to see how circular the contour is.
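The circularity check can be sketched as a ratio of two areas. The code below is an illustration only: `Point`, `contour_area` and `circularity` are hypothetical names, the contour is treated as a simple polygon whose area comes from the shoelace formula, and the enclosing radius is assumed to have been computed elsewhere (for example by a minimal enclosing circle routine). The source does not state which cutoff ratio was used, so none is assumed here.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2D point type for contour vertices.
struct Point {
    double x;
    double y;
};

// Area of a simple polygon via the shoelace formula.
double contour_area(const std::vector<Point>& contour) {
    double twice_area = 0.0;
    for (std::size_t i = 0; i < contour.size(); ++i) {
        const Point& a = contour[i];
        const Point& b = contour[(i + 1) % contour.size()];
        twice_area += a.x * b.y - b.x * a.y;
    }
    return std::fabs(twice_area) / 2.0;
}

// Fraction of the enclosing circle that the contour fills.
// Close to 1 for round shapes, much smaller for thin or ragged ones.
double circularity(const std::vector<Point>& contour, double enclosing_radius) {
    const double pi = std::acos(-1.0);
    const double circle_area = pi * enclosing_radius * enclosing_radius;
    if (circle_area <= 0.0) return 0.0;  // degenerate circle: treat as not circular
    return contour_area(contour) / circle_area;
}
```

A square fills about 64% of its enclosing circle, while a 64-sided regular polygon fills more than 99%, so a cutoff somewhere between those values separates roughly round blobs from boxy ones.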

The image below illustrates this process.

I could also have used an edge detector and then a Hough circle transform to detect the balloons, but I decided not to because that method would not be able to detect balloons reliably at long range.

The Joys of Hardware Acceleration

Before hardware acceleration, this algorithm was running at between two and three frames per second on the Jetson board. With hardware acceleration, it now runs at over ten frames per second, and it is now limited by how quickly frames from the camera can be captured and decoded. This equates to about a five times speedup overall. It is even greater if you only consider the image processing phase.

Writing computer vision systems seems very intimidating at first. However, library support is very good, so many types of problems can be solved easily with only a little research and experimentation.

Sunday, June 15, 2014

Hardware Accelerated OpenCV Configuration

Note: This is a series of articles in which we document our attempt to build an autonomous drone for the Sparkfun AVC.  The previous post is here.

Part of the Sparkfun AVC is finding and popping red balloons. We wanted to do this using on-board vision processing. For performance reasons we need to hardware accelerate the vision system. We are using the Jetson TK1 board from NVIDIA with the OpenCV library. Here are instructions for configuring OpenCV with CUDA hardware acceleration on the Jetson TK1.

Imaging the Jetson Board

The first step is to image the Jetson TK1 with the latest version of Linux for Tegra (L4T); at the time of writing this is Rel-19. NVIDIA provides good instructions for this in the quick start guide available on the L4T page.

I did "sudo ./ -S 8GiB jetson-tk1 mmcblk0p1"

Installing CUDA

You must be part of the CUDA/GPU Computing Registered Developer program to get CUDA. Signing up is free on the NVIDIA developer website. You will have to login or create an account.

Download the CUDA Toolkit; at the time of writing this is the CUDA 6.0 Toolkit for L4T Rel-19.2. The download page also contains a Getting Started with Linux guide, which has more instructions; I only give the minimum required for this specific case. For troubleshooting, refer to that guide.

Install the CUDA Toolkit:
cd Downloads/
sudo dpkg -i cuda-repo-l4t-r19.2_6.0-42_armhf.deb
sudo apt-get update
sudo apt-get install cuda-toolkit-6-0

Create symbolic links for the CUDA libraries for compatibility. I put them in /usr/local/lib:
sudo ln -s /usr/local/cuda-6.0/lib/*.so /usr/local/lib

Ensure that your PATH contains the CUDA executables:
export PATH=/usr/local/cuda-6.0/bin:$PATH

Ensure that your LD_LIBRARY_PATH includes the CUDA libraries and custom built libraries:
export LD_LIBRARY_PATH=/usr/local/lib:/usr/local/cuda-6.0/lib:$LD_LIBRARY_PATH

Compiling OpenCV

Download and extract the OpenCV code. I am using version 2.4.9 because, at the time of writing, that is the latest version.

Create a build directory in the same directory as the source directory. I put the source code in my Downloads directory, which currently contains "cuda-repo-l4t-r19.2_6.0-42_armhf.deb", "", "opencv-2.4.9", "opencv-2.4.9-build".

The GUI functions in OpenCV depend on GTK, so if you plan to use them, install libgtk2.0-dev:
sudo apt-get install libgtk2.0-dev

You need CMake to configure OpenCV so install that:
sudo apt-get install cmake

Change into the build directory for OpenCV (in my case, opencv-2.4.9-build in my Downloads directory):
cd ~/Downloads/opencv-2.4.9-build

Configure OpenCV with the appropriate CUDA_ARCH_BIN for your GPU's compute capability, which can be determined with the deviceQuery CUDA sample. The Jetson TK1 has compute capability 3.2, which I passed as 32, so I ran:
cmake -DCUDA_ARCH_BIN=32 ../opencv-2.4.9

Compile and install using make. The command I used was:
sudo make -j4 install

The -j4 flag instructs make to run four jobs simultaneously. This gives a considerable speed-up on the Jetson, which has four large ARM cores, but it can leave some of the build output out of order or interleaved.

Using GPU accelerated OpenCV

The GPU module for OpenCV is quite simple to use. The functions in the gpu namespace generally have identical semantics to the CPU variants, with the only difference being that they take cv::gpu::GpuMat arguments instead of cv::Mat arguments. Data must be uploaded to the GPU before it can be processed there, and results must be downloaded back to the CPU. GpuMat provides upload and download functions, each taking a single Mat argument, to transfer the data between the CPU and GPU.

For example, the code below shows the difference between using the CPU and the GPU for the threshold function.

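#include <opencv2/opencv.hpp>
#include <opencv2/gpu/gpu.hpp>

using namespace cv;

Mat src_host, dest_host;

// on the CPU
threshold(src_host, dest_host, THRESH_VAL, 255, THRESH_BINARY);

// on the GPU
gpu::GpuMat src, dest;
src.upload(src_host);      // copy the input from CPU memory to the GPU
gpu::threshold(src, dest, THRESH_VAL, 255, THRESH_BINARY);
dest.download(dest_host);  // copy the result back to CPU memory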

My next article will discuss the specific algorithm we are using to find balloons.

Sunday, June 1, 2014

GroovyFX 0.4.0 Released

GroovyFX makes writing JavaFX code fast and easy.  The latest version is available from Maven Central using the coordinates


This new version includes support for Groovy 2.3.x as well as Java 8 and JavaFX 8.  Please try it out and let us know if you have any problems by sending email to the mailing lists.

If you are a current user of GroovyFX and have any thoughts or questions on future directions, please send those to the mailing lists as well!