Sunday, March 20, 2011

Open CVhat I can do!

Greetings:

I have decided to take a short break from Lego building (let us just say that I stepped on one too many Lego bricks) and learn something new. I have always wanted to incorporate vision into my programs, so I decided to learn a little bit of OpenCV. I actually want to use OpenCV with Lego Mindstorms, but since I am using OpenCV for C++, I would have to use JNI to get my LeJOS Mindstorms to communicate with my C++ OpenCV program. (EDIT: I just found out that a Java wrapper for OpenCV exists, so I may be able to give my Mindstorms "vision" after all!) Anyway, what I was really interested in was using OpenCV to track eyeball movement. I will tell you why in the next post. Now, how OpenCV tracks face and eye movements is relatively complex. I do not understand all of it, just enough to make my small eye tracking program work. Pretty much, the face detection algorithm scans an image for Haar-like features, which compare the brightness of neighboring rectangular regions, and if part of the image satisfies the conditions for being a face, the algorithm further subdivides the face into many categories. For example, in a typical face the eye is darker than the cheek, so when the algorithm compares the intensities of the pixels that make up the eye and the cheek, it finds the difference in intensities and is able to distinguish between an eye and a cheek.
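To make the eye-versus-cheek comparison a bit more concrete, here is a minimal sketch of a single two-rectangle Haar-like feature in plain C++ (no OpenCV needed). It is not OpenCV's actual implementation, and the names `GrayImage`, `IntegralImage`, `buildIntegral`, `regionSum` and `twoRectFeature` are made up for this illustration. It assumes a grayscale image stored row by row and uses an integral image so that the sum of any rectangle costs only four lookups.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical grayscale image: row-major pixels, 0 (black) to 255 (white).
struct GrayImage {
    int width;
    int height;
    std::vector<int> pixels;
};

// Integral image padded with a leading row and column of zeros.
// at(x, y) is the sum of every source pixel with column < x and row < y.
struct IntegralImage {
    int width;   // source width + 1
    int height;  // source height + 1
    std::vector<long long> sums;
    long long at(int x, int y) const {
        return sums[static_cast<std::size_t>(y) * width + x];
    }
};

IntegralImage buildIntegral(const GrayImage& img) {
    assert(img.pixels.size() ==
           static_cast<std::size_t>(img.width) * img.height);
    IntegralImage ii{img.width + 1, img.height + 1, {}};
    ii.sums.assign(static_cast<std::size_t>(ii.width) * ii.height, 0);
    for (int y = 1; y <= img.height; ++y) {
        for (int x = 1; x <= img.width; ++x) {
            long long p = img.pixels[(y - 1) * img.width + (x - 1)];
            ii.sums[static_cast<std::size_t>(y) * ii.width + x] =
                p + ii.at(x - 1, y) + ii.at(x, y - 1) - ii.at(x - 1, y - 1);
        }
    }
    return ii;
}

// Sum of the w-by-h rectangle whose top-left pixel is (x, y).
long long regionSum(const IntegralImage& ii, int x, int y, int w, int h) {
    return ii.at(x + w, y + h) - ii.at(x, y + h)
         - ii.at(x + w, y) + ii.at(x, y);
}

// Two-rectangle feature: brightness of the lower half of the window minus
// brightness of the upper half. A dark eye above a brighter cheek gives a
// large positive value.
long long twoRectFeature(const IntegralImage& ii, int x, int y, int w, int h) {
    int half = h / 2;
    long long top = regionSum(ii, x, y, w, half);
    long long bottom = regionSum(ii, x, y + half, w, half);
    return bottom - top;
}
```

A real cascade evaluates thousands of such features at many positions and scales, and the thresholds for each come from training, which is the part this sketch leaves out.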
          This is a very watered-down explanation of what is *really* happening behind the scenes, and in reality the face and eye detection algorithms in OpenCV are clunky at best without proper optimization. So I decided to create a small C++ program that takes input from a webcam and tracks the face and eyes in real time. It also superimposes images over the face. Face detection was pretty much straightforward, but the eye detection algorithm required further optimization. It stored every eye that it discovered in an image (or video frame) in an array. The problem was that it detected 6-7 eyes at once! Now, I only have 2 eyes (4 with glasses, but that is not the point), so I had to make the algorithm pick only two eye objects from all that it had discovered. The optimization I used was to pick the two eye objects with the largest area and check how their x and y coordinates line up relative to each other and to the face (this is better explained in the annotated code posted below). After hours of tinkering, I have created a program that tracks human eyes through movement in all 3 dimensions with ~85% accuracy in real time. It is good enough to start out with, but I figure I will have to do much more optimization if I want to carry out a certain project (more on this later).
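For readers who just want the gist of the "pick two eyes" idea, here is a rough, self-contained sketch of it. This is not the code from my download, only an illustration under a few assumptions: the `Box` struct, the `pickEyePair` function and the specific checks (eye centers in the upper half of the face, roughly level with each other within a fifth of the face height, on opposite sides of the face's midline) are all hypothetical choices, and the candidate boxes are assumed to be in the same coordinate frame as the face box.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <utility>
#include <vector>

// Hypothetical stand-in for a detection rectangle (top-left corner plus size).
struct Box {
    int x, y, width, height;
    int area() const { return width * height; }
    int centerX() const { return x + width / 2; }
    int centerY() const { return y + height / 2; }
};

// Picks two plausible eyes from a list of candidates.
// Returns false if no sensible pair is found; otherwise fills leftEye and
// rightEye (left and right as they appear in the image).
bool pickEyePair(std::vector<Box> candidates, const Box& face,
                 Box& leftEye, Box& rightEye) {
    assert(face.width > 0 && face.height > 0);

    // Assumption: real eyes have their center in the upper half of the face.
    candidates.erase(
        std::remove_if(candidates.begin(), candidates.end(),
            [&face](const Box& c) {
                return c.centerX() < face.x ||
                       c.centerX() > face.x + face.width ||
                       c.centerY() < face.y ||
                       c.centerY() > face.y + face.height / 2;
            }),
        candidates.end());
    if (candidates.size() < 2) return false;

    // Keep the two candidates with the largest area.
    std::sort(candidates.begin(), candidates.end(),
              [](const Box& a, const Box& b) { return a.area() > b.area(); });
    Box a = candidates[0];
    Box b = candidates[1];

    // Assumed tolerance: the two eyes should be roughly level.
    if (std::abs(a.centerY() - b.centerY()) > face.height / 5) return false;

    // The two eyes should sit on opposite sides of the face's vertical midline.
    int midX = face.x + face.width / 2;
    if ((a.centerX() < midX) == (b.centerX() < midX)) return false;

    if (a.centerX() > b.centerX()) std::swap(a, b);
    leftEye = a;
    rightEye = b;
    return true;
}
```

Calling `pickEyePair` once per frame with that frame's face box and eye candidates gives either a pair or nothing. When it returns false, the simplest choice is to skip drawing eyes for that frame rather than show a bad guess.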
              I have posted the link to the annotated code below, which explains how my optimizations work. They are no work of art, but they work satisfactorily. Also, here is a short video of the eye tracking in progress in real time.


Link to download the folder with code, .exe, classifiers and the other good stuff.

This is not the end......