MouseTrap Dev Help/mousetrap internals
The MouseTrap Program
Here is a complete dissection of the hierarchy done by gnome: http://gnome-mousetrap.sourcearchive.com/documentation/0.3plus-psvn17-2ubuntu1/main.html
- Haar Cascade
- A common technique for detecting rigid objects in image processing. In MouseTrap, XML files contain the definitions of common facial features such as eyes and noses, as well as faces with eyeglasses.
- Haar wavelet
- First proposed by Alfred Haar, the Haar wavelet is a sequence of rescaled square-shaped functions that together form a wavelet family, expressed in terms of an orthonormal function basis. For more information, see: []
- Haar-like features
- A Haar-like feature considers adjacent rectangular regions within a detection window, sums the pixel intensities in each region, and computes the difference between those sums. They are called Haar-like features because they are computed using coefficients similar to those in the Haar wavelet transform. These features can then be combined with boosted classifiers into a classifier cascade, which is matched against positive samples to form a model for object detection. For more information, see [] []
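As an illustration of the idea (this is not MouseTrap or OpenCV code), a two-rectangle Haar-like feature can be computed cheaply with an integral image (summed-area table), which makes any rectangle sum a constant-time lookup:

```python
def integral_image(img):
    """Compute the summed-area table of a 2-D grid of pixel intensities."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = (img[y][x] + ii[y][x + 1]
                                + ii[y + 1][x] - ii[y][x])
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y) and size w x h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_feature(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: left half minus right half of
    a w x h window. A large value signals a bright/dark edge there."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)
```

A real cascade evaluates thousands of such features per window, but each one reduces to a handful of integral-image lookups like these.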
- Boosted Classifiers
- An ensemble technique that combines many weak classifiers into a single, more accurate classifier; in cascade training, boosting is applied to the features extracted from scaled positive samples.
- Boosting attempts to produce new classifiers that are better able to predict the examples for which the current ensemble's performance is poor. []
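A toy sketch of the boosting idea (pure Python, not the actual OpenCV cascade-training code): each round selects the weak classifier ("stump") with the lowest weighted error, then re-weights the samples so that misclassified ones count more in the next round.

```python
import math

def train_adaboost(samples, labels, stumps, rounds=10):
    """Minimal AdaBoost over decision stumps. Labels and stump outputs
    are +1/-1; returns a list of (alpha, stump) pairs."""
    n = len(samples)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        best, best_err = None, float("inf")
        for stump in stumps:
            err = sum(w for w, x, y in zip(weights, samples, labels)
                      if stump(x) != y)
            if err < best_err:
                best, best_err = stump, err
        best_err = max(best_err, 1e-10)  # avoid log(0) on a perfect stump
        if best_err >= 0.5:
            break  # no stump is better than chance on current weights
        alpha = 0.5 * math.log((1 - best_err) / best_err)
        ensemble.append((alpha, best))
        # increase weight of misclassified samples, decrease the rest
        weights = [w * math.exp(-alpha if best(x) == y else alpha)
                   for w, x, y in zip(weights, samples, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(alpha * stump(x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

The re-weighting step is what makes the next round focus on the examples the current ensemble handles poorly, which is the property the definition above describes.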
- Classifier Cascade
- Proposed by Paul Viola and refined by Rainer Lienhart, a classifier cascade is a series of classifier stages applied to a region one after another; a region counts as a match only if it passes every stage, so most negative regions are rejected cheaply by the early stages.
- ROI (Region of Interest)
- This is usually a subset of the original frame represented as a rectangle. It is most often compared to a classifier cascade to determine a positive match.
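A small illustration of what an ROI amounts to in code (a pure-Python stand-in for OpenCV's cv.GetSubRect, not MouseTrap's own code): a rectangle simply selects a sub-block of the frame, and that sub-block is what gets fed to the cascade.

```python
def get_roi(frame, x, y, w, h):
    """Return the w x h sub-rectangle of a row-major frame whose
    top-left corner is (x, y); the ROI is then matched against the
    classifier cascade instead of the full frame."""
    return [row[x:x + w] for row in frame[y:y + h]]
```

Searching an ROI rather than the whole frame is what makes repeated per-frame detection affordable.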
- Ocvfw (OpenCV FrameWork)
- An in-house MouseTrap framework that manages OpenCV methods and includes functions to initialize the camera and detect Haar-like features.
- Optical flow
- The pattern of apparent motion of moving objects between consecutive frames. [] []
- Lucas-Kanade method (AKA LK Algorithm)
- A differential method for estimating optical flow; it combines several nearby pixels to resolve the ambiguity of the optical flow equation []. MouseTrap uses it to track head and facial motion.
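A minimal sketch of one Lucas-Kanade step (pure Python, not MouseTrap's code, which calls OpenCV's implementation): the single optical-flow equation per pixel is underdetermined, so LK pools a small neighbourhood and solves the resulting 2x2 least-squares system.

```python
def lucas_kanade_step(frame1, frame2, x, y, win=1):
    """Estimate the optical-flow vector (u, v) at pixel (x, y) by
    solving the 2x2 normal equations over a (2*win+1)^2 neighbourhood."""
    sxx = sxy = syy = sxt = syt = 0.0
    for j in range(y - win, y + win + 1):
        for i in range(x - win, x + win + 1):
            # central differences for spatial gradients, forward for time
            ix = (frame1[j][i + 1] - frame1[j][i - 1]) / 2.0
            iy = (frame1[j + 1][i] - frame1[j - 1][i]) / 2.0
            it = frame2[j][i] - frame1[j][i]
            sxx += ix * ix; sxy += ix * iy; syy += iy * iy
            sxt += ix * it; syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None  # aperture problem: flow is ambiguous at this pixel
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v
```

Combining neighbouring pixels is exactly the "resolve the ambiguity" step the definition mentions: a single pixel only constrains flow along its gradient, while a window with varied gradients pins down both components.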
- Singleton
- Restricts the instantiation of a class to one object so that coordination over program actions can be achieved.
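A minimal Python sketch of the pattern (the Camera class here is a hypothetical stand-in; MouseTrap's actual singleton wiring lives in the camera backend code described later):

```python
class Singleton(type):
    """Metaclass that restricts a class to a single shared instance."""
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

class Camera(metaclass=Singleton):
    """Hypothetical example: every part of the program that asks for
    the Camera gets the same object, so camera state stays coordinated."""
    def __init__(self):
        self.backend = None
```

With this in place, `Camera() is Camera()` holds everywhere in the program, which is the coordination property the definition describes.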
- IDM (Image Detection Module)
- Executes the detection algorithms, recognizes the movements performed by users, and translates them into mouse pointer movements. MouseTrap >= 0.4 can handle more than one IDM (a.k.a. algorithm), which allows users to choose the IDM best adapted to them.
This section provides an overview of the structure of the MouseTrap project. Below is a figure showing the directory structure of the MouseTrap code. Note that only directories are shown, not individual files within directories.
MouseTrap is split into two main pieces:
- app which contains the actual mousetrap application
- ocvfw which contains the code related to cv and is responsible for controlling the actual camera interactions.
Note that all of the directories in the structure have individual Makefile files. This allows you to compile the code within that directory individually.
The app directory contains the following three areas of functionality:
- addons - these are classes that provide additional functionality to mousetrap.
- cpu - checks the CPU usage so that it can be displayed.
- handler - handles the addition of any gtk widgets that have been "added" to mousetrap. "gtk" stands for GIMP Toolkit, and widgets are components that are used to compose UIs.
- recalc - responsible for adding a button to mousetrap to recalculate the location of the forehead.
- lib - contains functionality for communication and for handling events and settings:
- dbus - supports communication between mousetrap and the GNOME desktop using dbus
- httpd - provides support for communicating with mousetrap via HTTP using GET, POST, etc.
- mouse - handles mouse events
- settings - appears to handle mousetrap settings via a configuration file
- ui - contains code for setting up and managing the UI.
- dialogs.py - creates and manages the formatted dialogs used by mousetrap
- i18n.py - internationalization module for handling different languages, etc.
- settings_gui.py - provides GUI to allow user to manage settings
- widgets.py - code to draw mousetrap window and widgets
- The scripts directory contains two files, color.py and screen.py, that appear to control the screen display and how it is drawn (location, etc.). These two files are very similar, and I'm not sure what the difference between them is.
The app directory also contains the following four files:
- commons - contains python functions common across MouseTrap. Currently there is only one.
- environment - contains information about mousetrap needed by the system including the process id, version, name of the data directory, etc.
- debug - contains functions to support debugging of mousetrap, including creating a log and writing to the log.
- main - this is the main function for mousetrap and where execution starts.
The ocvfw (OpenCV framework) portion of mousetrap manages the graphics and camera, including determining facial features and movement. It contains the following four directories:
- haars - contains the Haars Cascade (see definitions above) XML definitions for a variety of facial aspects. Things like faces with eyes, faces with eyeglasses, etc.
- backends - contains classes that provide the back-end interface to the cv functions. These three classes, OcvfwCtypes, OcvfwBase, and OcvfwPython, appear to be contained in the file ocvfw/_ocv.py. (Not sure how this is happening.) In addition, these appear to be used in place of the bindings provided by OpenCV. Not sure why!!
- dev - contains single file camera.py that manages the camera. Converts image that the camera sees into a gtkimage that can be processed.
- idm - contains the image detection module. This code is responsible for detecting and reacting to movements of the user's head. It contains the following classes:
- color.py - supports color tracking by an IDM, including adding color masks and converting the color representation from HSV to RGB.
- eyes.py - captures the eyes and eye motion.
- forehead.py - forehead pointer tracker based on LK algorithm (see definitions above).
- finger.py - finger pointer tracker based on LK algorithm (see definitions above).
The ocvfw directory also contains four files:
- commons.py - contains global variables for camera functionality. Includes path to Haars definitions of facial aspects.
- debug.py - supports logging for debugging in ocvfw
- pocv.py - Python OpenCV handler. Interfaces between IDM and rest of code. Simply retrieves the IDM.
- _ocv.py - Called "Little Framework for OpenCV Library". Appears to contain the backend interface with the cv library.
OpenCV Framework (ocvfw)
- This is the wrapper around OpenCV
- View diagram here: []
- Contains three classes:
- OcvfwBase: direct copy of backends/OcvfwBase
- OcvfwPython: direct copy of backends/OcvfwPython
- OcvfwCtypes: direct copy of backends/OcvfwCtypes
- Detects features based on an xml file
- This is used instead of the python bindings for OpenCV
- Returns an instance of the idm
- Trained on thousands of samples to produce an XML file used to predict features
- checks for gtk
- loads the camera backend
- sets Camera as a singleton with backend as the base
- Class Capture
- first gets a region of interest and then matches it against a haar classifier
- Sets all variables associated with video capture
- set_async(fps, async): sets the frames per second and whether the image should be queried asynchronously. If true, it sets a gobject timeout at the specified frame rate
- sync(): synchronizes the Capture image with the Camera image
- set_camera(key, value): sets the specified key on the Camera object to the given value
- image(new_img): sets the self.__image variable to the specified value
- resize(width, height, copy): resizes self.__image using cv.Resize() with the given width and height; will not replace self.__image if copy is True
- to_gtk_buff(): converts the image to a gtkImage and returns it
- points(): returns self.__graphics["point"], a list of the points that have been added(?)
- rectangles(): returns self.__graphics["rect"], a list of the rectangles that have been added
- show_rectangles(rectangles): draws the rectangles onto self.__image
- original(): returns the Capture object with the self.__image_orig image, resetting the Capture to the original image
- rect(*args): uses the args (a rectangle) to get a sub-region of self.__image using cv.GetSubRect()
- flip(flip): flip is a string that can contain 'hor', 'ver', or 'both'; uses cv.Flip() to manipulate self.__image. Returns self.__image
- color(new_color, channel, copy): if new_color is true, sets the image to the new channel provided. If copy is set, it only manipulates a new image and keeps the existing image as is
- change(size, color, flip): sets self.__color_set to the new color value and self.__flip to the new flip value. Does not currently support changing the image size
- add(graphic): checks whether the capture is locked or the graphic already exists; otherwise it adds the graphic passed to it to the image using set_lkpoint()
- remove(label): removes a graphic object from self.__graphics by its label
- get_area(haar_csd, roi, orig): uses the haartraining file (haar_csd) with either get_haar_points() or get_haar_roi_points(), depending on whether roi is set. It can also get an area within an area by using the roi and setting the origin point
- message(message): does nothing, just pass
- lock(): sets self.__lock to true, which is used in add() and remove()
- unlock(): sets self.__lock to false
- is_locked(): returns self.__lock
Graphic()
- init():
- coords: x and y stored in a list [x, y]
- size: list [width, height]
- type: could be a point
- label: string
- color: rgb color or tuple
- follow: used for optical flow
- parent: the parent class
- is_point(): checks whether the type is a point
Point(): contains a graphic and additional variables and methods
- init(): graphic(**args)
- __ocv: opencv attribute
- last: opencv attribute
- diff: difference between two points
- abs_diff: difference between the original and current points
- rel_diff: difference between the last and current points
- orig: an opencv Point object
- set_opencv(opencv): updates the current attributes, updates the points in abs_diff and rel_diff, and sets self.__ocv to the opencv value given
- opencv(): returns the graphic object with the opencv attributes (__ocv)
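A simplified sketch of the abs_diff/rel_diff bookkeeping described above (the attribute names follow the notes, but this is an illustrative stand-alone class, not the real Point):

```python
class TrackedPoint:
    """Track a point's absolute drift (original -> current) and its
    relative motion (last -> current), as Point's attributes suggest."""
    def __init__(self, x, y):
        self.orig = (x, y)
        self.last = (x, y)
        self.current = (x, y)

    def update(self, x, y):
        """Record a new tracked position from the latest frame."""
        self.last = self.current
        self.current = (x, y)

    @property
    def abs_diff(self):
        return (self.current[0] - self.orig[0],
                self.current[1] - self.orig[1])

    @property
    def rel_diff(self):
        return (self.current[0] - self.last[0],
                self.current[1] - self.last[1])
```

The relative difference is what a tracker would translate into a pointer movement each frame, while the absolute difference tells it how far the feature has drifted from where it was first found.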
- link to image on how the backend distributes variables []
- Contains three classes:
- sets image variables
- set(key, value): sets the key (an image variable) to the value specified
- lk_swap(set): switches the boolean of the Lucas-Kanade points; if true, it appends current to last
- new_image(size, num, ch): calls cv.CreateImage(size, depth, channel) using a Size(width, height), depth, and channel
- set_camera_idx(idx): sets the global var self.idx to the specified idx number
- wait_key(num): calls cv.WaitKey() with the number specified
- start_camera(params): grabs the video capture and sets it as a global variable
- query_image(): grabs the first frame and creates self.img, plus the pyramid and grey images used for optical flow. Uses wait_key(). Returns true
- set_lkpoint(point): uses cv.Point, sets the self.img_lkpoints image, uses dev/camera's set_opencv() to manipulate the graphic object (made by MouseTrap), sets ["current"] using FindCornerSubPix(), appends current to ["last"] if ["last"] exists, and appends the point to ["points"]
- clean_lkpoints(): sets self.img_lkpoints current, last, and points to empty
- show_lkpoints(): calculates the optical flow and assigns it to the ["current"] lkpoints if it resolves. Goes through ["points"], draws them, then sets ["current"] back to points
- swap_lkpoints(): only after the new points have been shown, swaps prev with original and current with last
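The pyramid images that query_image() prepares are the data structure pyramidal Lucas-Kanade tracking relies on. A hedged, pure-Python sketch of building such a pyramid by 2x2 block averaging (OpenCV does this internally with proper filtering; this only shows the shape of the structure):

```python
def build_pyramid(img, levels=3):
    """Build a fine-to-coarse image pyramid: each level halves the
    resolution by averaging 2x2 blocks. Coarse levels let a tracker
    handle large motions that would exceed the LK window at full size."""
    pyramid = [img]
    for _ in range(levels - 1):
        img = [[(img[y][x] + img[y][x + 1]
                 + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
                for x in range(0, len(img[0]) - 1, 2)]
               for y in range(0, len(img) - 1, 2)]
        pyramid.append(img)
    return pyramid
```

Tracking then starts at the coarsest level and refines the flow estimate level by level back down to full resolution.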
- inherits from OcvfwBase
- imports global and local variables from Commons.hg (highgui related variables) and Commons.cv(Opencv) variables
- has a get_motion_points method, but it is not used
- has an add_message method to add a message to the shown image, but it is not used
- get_haar_roi_points(): finds regions of interest within the entire frame image and returns the matches against the classifier cascade using the HaarDetectObjects() OpenCV function
- get_haar_points(): resizes the image by 1.5
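Detecting on an image resized by 1.5 implies that any rectangles found must be mapped back into original-frame coordinates. A sketch of that bookkeeping (the 1.5 factor comes from the notes above; the helper itself is hypothetical, not part of Ocvfw):

```python
SCALE = 1.5  # get_haar_points() resizes the frame by this factor

def unscale_rects(rects, scale=SCALE):
    """Map (x, y, w, h) rectangles detected on the upscaled image back
    into the coordinate system of the original camera frame."""
    return [(int(x / scale), int(y / scale),
             int(w / scale), int(h / scale))
            for x, y, w, h in rects]
```

Upscaling before detection can help a cascade find features that would otherwise be smaller than its minimum window size, at the cost of this extra coordinate conversion.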
- imports global and local variables from Commons.hg (highgui), Commons.cv(which is a CV common lib), and OcvfwBase
Files affected by OpenCV 2.4.3 upgrade
- ocvfw/idm
- ocvfw/dev
- ocvfw/backends