Here we’ll look at building Headroid1 in a few hours: a face-tracking, 2-axis robot head controlled by Python and open source modules. This is what the finished system will look like:
An earlier demo was presented on my blog as Headroid1 – A Face Tracking Robot. Here’s a video demo:
Requirements:
- An afternoon with some tools and Python
- pySerial, OpenCV with Python bindings
- Webcam
- 2 servos (if you want the head to move) and some brackets
- Serial Servo Controller or Arduino (if you want to control your servos)
First – let Python see faces using OpenCV:
My earlier facedetect.py post shows how well facial detection works and includes links to get OpenCV (which ships with the Python bindings). It’ll take about 30 minutes to download and compile OpenCV. To get facial detection working, plug in a webcam and run:
cd OpenCV-2.1.0/samples/python
python facedetect.py 0 # pass in id for webcam - 0 is first webcam
Now you’ll have a red rectangle around your face as long as you’re looking roughly towards the webcam. First step complete!
Second – figure out how far the face is from the centre of the screen
Having found a face we now need to determine how far it is from the centre of the image. We edit detect_and_draw(…) in facedetect.py to add the following lines:
    centre = None
    ...
        if faces:
            for ((x, y, w, h), n) in faces:
                # the input to cv.HaarDetectObjects was resized, so scale the
                # bounding box of each face and convert it to two CvPoints
                pt1 = (int(x * image_scale), int(y * image_scale))
                pt2 = (int((x + w) * image_scale), int((y + h) * image_scale))
                cv.Rectangle(img, pt1, pt2, cv.RGB(255, 0, 0), 3, 8, 0)
                # ADD THESE LINES BELOW
                # get the xy corner co-ords, calc the centre location
                x1 = pt1[0]
                x2 = pt2[0]
                y1 = pt1[1]
                y2 = pt2[1]
                centrex = x1+((x2-x1)/2)
                centrey = y1+((y2-y1)/2)
                centre = (centrex, centrey)
    ...
    return centre
So as long as we find a face, we know where the centre of the facial rectangle is inside the webcam image. We also know how big the webcam’s image is, so we know which quadrant of the image the face is in, and from there which direction to move.
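A minimal sketch of that quadrant logic, assuming image coordinates with the origin at the top-left. The helper names `face_offset` and `quadrant` are hypothetical and not part of facedetect.py; note that the sign convention here (face minus frame centre) is the opposite of the one get_delta uses later.

```python
def face_offset(centre, frame_size):
    """Return (dx, dy): the face centre's distance from the frame centre, in pixels.

    Positive dx means the face is right of centre; positive dy means it is
    below centre (image origin assumed to be the top-left corner).
    """
    cx, cy = centre
    width, height = frame_size
    return (cx - width // 2, cy - height // 2)


def quadrant(centre, frame_size):
    """Name the quadrant of the frame that contains the face centre."""
    dx, dy = face_offset(centre, frame_size)
    vertical = "top" if dy < 0 else "bottom"
    horizontal = "left" if dx < 0 else "right"
    return vertical + "-" + horizontal


# Example: a face centred at (500, 100) in a 640x480 frame
print(face_offset((500, 100), (640, 480)))  # (180, -140)
print(quadrant((500, 100), (640, 480)))     # top-right
```

The offsets only tell us the direction of travel; how far to move on each iteration is a separate decision.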
Third – how far should we move the servos?
Next we want to know how far to move the x and y servo locations to bring the face closer to the centre of the webcam image. We can’t just change the position in one big jump – the whole assembly will rock due to the sudden movement and then the webcam’s image rocks creating odd oscillations. Instead we move in short, stable steps.
We’ll call the following routine twice, once for the x axis and once for the y axis. We’ll allow the x axis to move up to 4 degrees each iteration, the y axis can only move a maximum of 1 degree per iteration (mechanically one degree on the y axis is lots, 4 degrees on the x axis with my webcam isn’t much). These figures will need tuning depending on your setup.
def get_delta(loc, span, max_delta, centre_tolerance):
    """How far do we move on this axis to get the webcam
    centred on the face?
    loc is the face's centre for this axis
    span is the width or height for this axis
    max_delta is the max nbr of degrees to move on this axis
    centre_tolerance is the centre region where we don't allow movement
    """
    framecentre = span/2
    delta = framecentre - loc
    if abs(delta) < centre_tolerance: # within X pixels of the centre
        delta = 0 # so don't move - else we get weird oscillations
    else:
        # the x-axis is reversed so we must remember the sign
        is_neg = delta <= 0
        to_get_near_centre = abs(delta) - centre_tolerance
        if to_get_near_centre > 35:
            delta = 4 # big movement allowed if we're far away
        else:
            delta = 1 # small movement if we're close to the centre
        # never exceed the per-axis limit described in the docstring
        delta = min(delta, max_delta)
        if is_neg:
            delta = delta * -1
    return delta
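To see why short steps settle rather than oscillate, here is a small simulation sketch. It restates get_delta compactly (with the step clamped to max_delta, as its docstring describes) and assumes a made-up camera model in which one degree of servo travel shifts the face by `px_per_degree` pixels; that figure and the `simulate` helper are illustrative assumptions, not measurements from my rig.

```python
def get_delta(loc, span, max_delta, centre_tolerance):
    """Compact restatement of the stepping rule: 0 inside the dead zone,
    otherwise a 4 or 1 degree step (clamped to max_delta) towards the centre."""
    delta = span // 2 - loc
    if abs(delta) < centre_tolerance:
        return 0
    remaining = abs(delta) - centre_tolerance
    step = 4 if remaining > 35 else 1
    step = min(step, max_delta)
    return -step if delta <= 0 else step


def simulate(loc, span, max_delta, centre_tolerance,
             px_per_degree=10, max_iters=100):
    """Return the list of face positions (pixels) visited on one axis.

    px_per_degree is an assumed constant: how far the face shifts in the
    image for each degree the servo moves.
    """
    path = [loc]
    for _ in range(max_iters):
        delta = get_delta(loc, span, max_delta, centre_tolerance)
        if delta == 0:
            break  # inside the dead zone, so stop moving
        loc += delta * px_per_degree
        path.append(loc)
    return path


# Face starts near the right edge of a 640 pixel wide frame
print(simulate(600, 640, max_delta=4, centre_tolerance=15))
```

With these assumed numbers the face is pulled in by large steps first, then by single-degree steps, and stops once it is inside the tolerance band instead of overshooting the centre.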
Fourth – move the servos to re-centre the webcam using pySerial
Finally we need to control our servos so they respond to the deltas we’ve calculated. I’m using the pySerial module and BotBuilder’s Serial Servo Board. The servo board is based on an Arduino. If you have an Arduino then these servo links will give you an easy equivalent (I’d love to see new code if you have a working Arduino solution!).
You’ll also need some brackets to mount your servos, see the end for purchase details to get an assembly like this (a USB->Serial cable is also shown to drive the serial board):
This assembly doesn’t show the webcam – we’ll add it back shortly. To control the servo board we open a connection using:
import serial
# /dev/cu.usbserial is the serial port on a Mac, it'll be COMx on Windows
# The Serial Servo Board uses 19200 baud
ser=serial.Serial(port='/dev/cu.usbserial',baudrate=19200,timeout=0)
ser.write('r') # send reset command
ser.read(100) # receive 'ready' string back
and to move the servos we simply specify the angle for the servo, e.g.:
ser.write('20a') # send servo on connection A to 20 degrees
ser.write('40a40b') # move servos on connections A and B to 40 degrees
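Building these command strings by hand gets error-prone once the angles come from calculations, so here is a minimal sketch of a formatter. The `<angle><connection letter>` format is the one shown above; the helper name `servo_command` and the 0 to 180 degree clamp are my assumptions (a typical hobby servo range), so check the limits of your own servos.

```python
def clamp(value, low, high):
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(high, value))


def servo_command(angles, low=0, high=180):
    """Build a command string such as '40a40b'.

    angles maps a connection letter to a target angle in degrees,
    e.g. {'a': 40, 'b': 40}. Angles are rounded to whole degrees and
    clamped to [low, high]; the default limits are an assumption.
    """
    parts = []
    for connection in sorted(angles):
        angle = clamp(int(round(angles[connection])), low, high)
        parts.append("%d%s" % (angle, connection))
    return "".join(parts)


print(servo_command({"a": 40, "b": 40}))   # 40a40b
print(servo_command({"a": 200, "b": -5}))  # 180a0b
```

The result can be passed straight to ser.write() as in the lines above. On Python 3, pySerial expects bytes, so you would send `servo_command(...).encode("ascii")` instead.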
To get an idea of how quickly the servos move watch this 30 second video:
Now we add the webcam using a bracket to complete the hardware:
Here’s the final head assembly:
Down below you’ll find the complete source code.
Questions? Join the A.I. Cookbook’s Google Group and see more details in the Cookbook wiki.
Purchase? If you’re interested in buying a hardware kit then email: kits AT aicookbook.com. We don’t have kits yet but if there’s interest, we’ll put them together via botbuilder.co.uk.
Moving forwards
The OpenCV book is rather excellent. Everything is in C++, but the Python API is easy to figure out and the reference text makes it all clear.
Next? I’m glad you asked – here’s Headroid2 with a smiley-inspired emoticon interface from a recent A.I. talk I gave. Instructions for adding an Arduino+LOLShield for emotional feedback will follow.
Full source:
#!/usr/bin/python
"""
This program is a demonstration of face and object detection using haar-like features.
The program finds faces in a camera image or video stream and displays a red box around them,
then centres the webcam via two servos so the face is at the centre of the screen
Based on facedetect.py in the OpenCV samples directory
"""
import sys
from optparse import OptionParser
import time
import math
import datetime
import serial
import cv
# Parameters for haar detection
# From the API:
# The default parameters (scale_factor=2, min_neighbors=3, flags=0) are tuned
# for accurate yet slow object detection. For a faster operation on real video
# images the settings are:
# scale_factor=1.2, min_neighbors=2, flags=CV_HAAR_DO_CANNY_PRUNING,
# min_size=<minimum possible face size
min_size = (20, 20)
image_scale = 2
haar_scale = 1.2
min_neighbors = 2
haar_flags = 0
def detect_and_draw(img, cascade):
    gray = cv.CreateImage((img.width,img.height), 8, 1)
    small_img = cv.CreateImage((cv.Round(img.width / image_scale),
                                cv.Round (img.height / image_scale)), 8, 1)
    # convert color input image to grayscale
    cv.CvtColor(img, gray, cv.CV_BGR2GRAY)
    # scale input image for faster processing
    cv.Resize(gray, small_img, cv.CV_INTER_LINEAR)
    cv.EqualizeHist(small_img, small_img)
    centre = None
    if(cascade):
        t = cv.GetTickCount()
        # HaarDetectObjects takes 0.02s
        faces = cv.HaarDetectObjects(small_img, cascade, cv.CreateMemStorage(0),
                                     haar_scale, min_neighbors, haar_flags, min_size)
        t = cv.GetTickCount() - t
        if faces:
            for ((x, y, w, h), n) in faces:
                # the input to cv.HaarDetectObjects was resized, so scale the
                # bounding box of each face and convert it to two CvPoints
                pt1 = (int(x * image_scale), int(y * image_scale))
                pt2 = (int((x + w) * image_scale), int((y + h) * image_scale))
                cv.Rectangle(img, pt1, pt2, cv.RGB(255, 0, 0), 3, 8, 0)
                # get the xy corner co-ords, calc the centre location
                x1 = pt1[0]
                x2 = pt2[0]
                y1 = pt1[1]
                y2 = pt2[1]
                centrex = x1+((x2-x1)/2)
                centrey = y1+((y2-y1)/2)
                centre = (centrex, centrey)
    cv.ShowImage("result", img)
    return centre

def move_servos(xygo):
    position = '%da%db' % (xygo[0], xygo[1])
    ser.write(position)

def get_delta(loc, span, max_delta, centre_tolerance):
    """How far do we move on this axis to get the webcam
    centred on the face?
    loc is the face's centre for this axis
    span is the width or height for this axis
    max_delta is the max nbr of degrees to move on this axis
    centre_tolerance is the centre region where we don't allow movement
    """
    framecentre = span/2
    delta = framecentre - loc
    if abs(delta) < centre_tolerance: # within X pixels of the centre
        delta = 0 # so don't move - else we get weird oscillations
    else:
        is_neg = delta <= 0
        to_get_near_centre = abs(delta) - centre_tolerance
        if to_get_near_centre > 35:
            delta = 4
        else:
            # move slower if we're closer to centre
            if to_get_near_centre > 25:
                delta = 3
            else:
                # move real slow if we're very near centre
                delta = 1
        # never exceed the per-axis limit described in the docstring
        delta = min(delta, max_delta)
        if is_neg:
            delta = delta * -1
    return delta

if __name__ == '__main__':
    # open a serial port
    ser=serial.Serial(port='/dev/cu.usbserial',baudrate=19200,timeout=0)
    ser.write('r')
    xygo = (90,90)
    move_servos(xygo)
    # parse cmd line options, setup Haar classifier
    parser = OptionParser(usage = "usage: %prog [options] [camera_index]")
    parser.add_option("-c", "--cascade", action="store", dest="cascade", type="str", help="Haar cascade file, default %default", default = "/Users/ian/Documents/OpenCV-2.1.0//data/haarcascades/haarcascade_frontalface_alt.xml")
    (options, args) = parser.parse_args()
    cascade = cv.Load(options.cascade)
    if len(args) != 1:
        parser.print_help()
        sys.exit(1)
    input_name = args[0]
    if input_name.isdigit():
        capture = cv.CreateCameraCapture(int(input_name))
    else:
        print "We need a camera input! Specify camera index e.g. 0"
        sys.exit(0)
    cv.NamedWindow("result", 1)
    if capture:
        frame_copy = None
        while True:
            frame = cv.QueryFrame(capture)
            if not frame:
                cv.WaitKey(0)
                break
            if not frame_copy:
                frame_copy = cv.CreateImage((frame.width,frame.height),
                                            cv.IPL_DEPTH_8U, frame.nChannels)
            if frame.origin == cv.IPL_ORIGIN_TL:
                cv.Copy(frame, frame_copy)
            else:
                cv.Flip(frame, frame_copy, 0)
            centre = detect_and_draw(frame_copy, cascade)
            if centre is not None:
                cx = centre[0]
                cy = centre[1]
                # modify the *-1 if your x or y directions are reversed!
                xdelta = get_delta(cx, frame_copy.width, 6, 15) * -1
                ydelta = get_delta(cy, frame_copy.height, 1, 25) * -1
                # on my camera I introduce a delay after movements
                # else my assembly wobbles and the webcam transmits
                # a non-centred image, so weird oscillations can occur
                total_delta = abs(xdelta)+abs(ydelta)
                if total_delta > 0:
                    xygo = (xygo[0]+xdelta,xygo[1]+ydelta)
                    sleep_for = 1/10.0*min(total_delta, 10)
                    sleep_for = min(sleep_for, 0.4)
                    move_servos(xygo)
                else:
                    sleep_for = 0
            if cv.WaitKey(10) >= 0: # 10ms delay
                break
    cv.DestroyWindow("result")
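A note on the settling delay: the main loop computes sleep_for, but the listing never passes it to a sleep call, so you may need to add one yourself if your assembly wobbles. The calculation can be isolated as below. This is a sketch only; the function name `settle_delay` and its keyword parameters are hypothetical, with defaults that mirror the constants in the listing (0.1 seconds per degree, capped at 0.4 seconds).

```python
import time


def settle_delay(xdelta, ydelta, per_degree=0.1, cap=0.4):
    """Seconds to wait after a move so the camera image can stabilise.

    Mirrors the sleep_for arithmetic in the main loop: the delay grows
    with the total number of degrees moved and is capped.
    """
    total_delta = abs(xdelta) + abs(ydelta)
    if total_delta == 0:
        return 0.0  # nothing moved, so nothing needs to settle
    return min(per_degree * min(total_delta, 10), cap)


# Example: a 4 degree pan plus a 1 degree tilt hits the cap
delay = settle_delay(4, 1)
time.sleep(delay)  # wait before grabbing the next frame
print(delay)       # 0.4
```

In the main loop this would sit directly after move_servos(xygo); whether you need it at all depends on how rigid your brackets are.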





Awesome! Inspiring!
I like it. Reminds me of the early days of cheap voice recognition (thanks to Sync magazine) and later ads showing cv with add ons that I could not afford, you have shown me some of that old creativity that I only read about years ago.
If I use an Arduino, the purpose of the serial port is to program it, right? So, once I get the Arduino programmed, I can put it by itself onto the robot and let the camera track automatically, correct?
I also need the video fed into the robot, so the Arduino is not getting the video from the serial port, is it? If it is, the serial port will have to be continuously hooked up, defeating the portability of the unit.
Hi Mel. The unit is *not* portable at present.
The PC (Mac in my case) does the vision processing over USB. Vision requires a lot of CPU power, far more than an Arduino can provide.
The Servo Board controls the servos (Arduinos can control up to 16 servos I think if programmed the right way). This Servo Board is a modified Arduino. It is programmed directly over the serial port.
I did wonder about using a SheevaPlug (http://en.wikipedia.org/wiki/SheevaPlug) to do the vision processing. It draws about 7W, which is doable (sort of) with batteries perhaps.
If you have further questions then the Google Group is probably the best place to ask them:
http://groups.google.com/group/aicookbook