Introduction

This term was the third of the four terms that this MQP spans. It gave us a more complete project, with more of our ideas working together. In the previous terms we broke the project into four sections: the vision, the table model, the image transformation and analysis, and the interface. Until a few weeks into C-Term, we had been developing these sections primarily in individual teams. Now we have a cohesive collection of the code that we worked so hard on earlier in the project. Not only are we further along in the implementation, but the pieces now work together, which has helped to keep us motivated. It has also finally shown us that the large amount of time we spent on the design of the application was worth it. This term report focuses on the development of the four sections during C-Term and also provides a little insight into how the whole project will work together.

Vision

Summary of Code Created

Vision Class:

Vision has the following functions: (* means new method)

*processImage → This is the main interface for the front end. processImage is passed an image. If necessary, the Vision class is calibrated first. Once the Vision class is calibrated, the edges of the image are found using findEdge. Next, circles are found within the boundaries of the table, which are determined by calibrateVision. For each circle found, the pixels enclosed by the circle are converted to an image, which is then passed to the Analyzer Class to determine which ball, if any, the circle represents.

*calibrateVision → calibrateVision is called at the beginning, and can be called again if, for example, the camera is moved. It uses findEdge to find the edges of the table, and then finds the strongest lines using applyHoughLine. These lines are then narrowed down to the four lines that most closely represent the table. The four points where they intersect are saved so that the boundaries of the table can be found. These four points are also used to generate the lookup table, which maps a point on the image to a point in the model space.

findEdge → Finds the edges of the image and returns a black-and-white image of these edges.

applyHoughLine → Finds the N best lines, where N is a value passed to it. Returns a list of the lines found.

*findBallColors → For each circle, it attempts to find the pixels within the circle. Currently, this method uses recursion, which is slower than the iterative version I hope to write over break. The pixels found are saved in an image, and a list of these images is returned.

applyHoughCircle → Finds the N best circles within the boundary of the table. This method is considerably more accurate than the previous term’s version; circles are no longer found (or even searched for) in the area outside the table.

Circle Class

Circle Class is a very simple class. It contains the X, Y, and R values of a circle, as well as a function to draw the circle for debugging purposes. (findBallColors also uses it so that if two circles overlap, only the non-overlapping part is examined.)
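The overlap rule described above, combined with the iterative approach mentioned for findBallColors, could look roughly like the following. This is only a sketch, not the project's code: the `Circle` and `Pixel` structs and the `pixelsOfCircle` function are hypothetical names, and it assumes integer pixel coordinates and an image of known width and height.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the report's Circle class and a pixel position.
struct Circle { int x; int y; int r; };   // centre and radius, in pixels
struct Pixel  { int x; int y; };

// True if pixel (px, py) lies inside or on circle c.
static bool contains(const Circle& c, int px, int py) {
    const long dx = px - c.x;
    const long dy = py - c.y;
    return dx * dx + dy * dy <= static_cast<long>(c.r) * c.r;
}

// Iteratively scans the bounding box of circles[index] (clipped to the image)
// and keeps the pixels that belong to that circle and to no other circle.
std::vector<Pixel> pixelsOfCircle(const std::vector<Circle>& circles,
                                  std::size_t index, int width, int height) {
    std::vector<Pixel> result;
    const Circle& c = circles[index];
    const int yMin = std::max(0, c.y - c.r);
    const int yMax = std::min(height - 1, c.y + c.r);
    const int xMin = std::max(0, c.x - c.r);
    const int xMax = std::min(width - 1, c.x + c.r);
    for (int y = yMin; y <= yMax; ++y) {
        for (int x = xMin; x <= xMax; ++x) {
            if (!contains(c, x, y)) continue;
            bool shared = false;  // set when another circle also covers the pixel
            for (std::size_t k = 0; k < circles.size() && !shared; ++k) {
                shared = (k != index) && contains(circles[k], x, y);
            }
            if (!shared) result.push_back(Pixel{x, y});
        }
    }
    return result;
}
```

Scanning the bounding box avoids recursion entirely, at the cost of testing some pixels that fall outside the circle.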

CircleList Class

Used to pass back and forth a List of Circles.

GeoModel Class

This is currently what the Vision Class returns: a collection of the critical lines and circles found in the image. Vision is being updated so that it instead returns a table state object, but this has not been completed yet.

Line Class

Line is another simple class, with the X and Y values for the end points of the line, as well as the original Rho and Theta values and the calculated slope and y-intercept values. It also has a function to draw the line, which is currently used only for debugging purposes.
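Since each line keeps its Rho and Theta values, the corner points that calibrateVision saves can be computed directly from pairs of lines. The sketch below assumes the usual Hough normal form, x·cos(θ) + y·sin(θ) = ρ, with θ in radians; the `HoughLine` and `Point` structs and the `intersect` function are hypothetical names, not the project's Line class.

```cpp
#include <cmath>

struct Point { double x; double y; };

// A line in Hough normal form: x*cos(theta) + y*sin(theta) = rho.
struct HoughLine { double rho; double theta; };

// Solves the 2x2 system formed by the two line equations (Cramer's rule).
// Returns false when the lines are parallel or nearly so; otherwise writes
// the intersection point to 'out' and returns true.
bool intersect(const HoughLine& a, const HoughLine& b, Point& out) {
    const double ca = std::cos(a.theta), sa = std::sin(a.theta);
    const double cb = std::cos(b.theta), sb = std::sin(b.theta);
    const double det = ca * sb - sa * cb;  // equals sin(theta_b - theta_a)
    if (std::fabs(det) < 1e-9) return false;
    out.x = (a.rho * sb - b.rho * sa) / det;
    out.y = (b.rho * ca - a.rho * cb) / det;
    return true;
}
```

Working in the rho/theta form avoids the infinite slope of a vertical line, which the slope and y-intercept form cannot represent.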

LineList Class

Used to pass back and forth a List of Lines.

What was accomplished this term for the vision class

Porting original code to Windows 95 completed.
The calibrateVision, processImage, and findBallColors functions were added and tested.
Performance of applyHoughCircle was improved significantly, both in running time and in the number of balls correctly found.

Table Model

This is the part of the vision section that transforms the image coordinates of the balls that were found into world coordinates. The world coordinates of the balls are then the output of the vision section, in the form of the ball state. The transformation is done using a lookup table: for every point in the image, the corresponding world coordinates are stored in the table. The only interesting part is how the table is initialized. We start with four image coordinates and four world coordinates, which represent the corners of the pool table in both coordinate systems. The method is a trick learned in high school art class: when drawing a picture using vanishing points, you find the center of a square by drawing the two diagonals between opposite corners; the point where they intersect is the center.

After we find the center we use the vanishing points to find the middle of each edge (another art trick).

Now for the world coordinates. This is much simpler because we assume that the world coordinates are linear, so finding the same five points that are represented in the image is a simple matter of averaging certain points. You may be asking yourself why we find all these points. If you look at the picture, you'll see that there are four smaller rectangles. We make four recursive calls, one per rectangle, splitting the image until the table is full.

Now that we have this nifty lookup table, we put it to use to see how well it works. In the result, the color of every pixel in the original image is placed at its world coordinates. One interesting property of this transformation is the black gaps towards the top of the image (farther from the camera), which are caused by the non-linear transformation.

This is the original image of the pool table with some modifications to show our calculations in a visual form. The first thing you'll notice is the large gradient on top of the pool table. This gradient is the transformation lookup table where the red channel was used to show the magnitude in the X world coordinate and the green channel shows the Y magnitude. Some of the other things going on here are the result of the line detection, which is used to get the corners of the pool table and to clip where it looks for circles, and the result of the circle detection which is shown as circles drawn on top of where we think there are balls.

Image Transformation & Analysis

During C-Term we worked on creating a graphical representation of a Table object for the user interface. This was implemented, with the assistance of Lisa C. and George, to draw a pool table. After that, the Interface team handled the implementation, as well as the final designs for the CUE interface. Algorithms for critiquing shots were also explored. This involved developing the analysis section of CUE, which uses a vision object to obtain state information from the incoming image data, and deciding which frames to actually look at to identify events. Once all the events in a particular shot are found, analysis of the shot can take place.

The Analyzer is still in the design phase, but an implementation of the class has been started to facilitate coding efforts in the other sections of CUE. Control and data flow have been designed, and diagrams will soon be available. The main interface for the Analyzer has been set with the help of the Interface team, since they will be the ones using it. Efforts to finalize the interface with the Vision class are under way. See the Vision section for current status.

Work was also done for the Vision section. Ball identification from a cropped image was implemented with Lisa C. and George. It currently uses RGB values and is not functioning well (it only gets a couple of balls right). Efforts are under way to use HSV values and to incorporate a little inference to identify the balls.
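For reference, an RGB-based identification of the kind described above can be sketched as an average color followed by a nearest-neighbor search in color space. The names below (`averageColor`, `colorDistance`, `nearestLabel`) and the idea of a distance cutoff are assumptions made for illustration; the reference colors would have to come from real ball samples.

```cpp
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

struct Rgb { int r; int g; int b; };
struct Reference { std::string label; Rgb color; };  // hypothetical sample color per ball

// Mean color of a set of pixels; returns black for an empty set.
Rgb averageColor(const std::vector<Rgb>& pixels) {
    if (pixels.empty()) return Rgb{0, 0, 0};
    long r = 0, g = 0, b = 0;
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        r += pixels[i].r;
        g += pixels[i].g;
        b += pixels[i].b;
    }
    const long n = static_cast<long>(pixels.size());
    return Rgb{static_cast<int>(r / n), static_cast<int>(g / n), static_cast<int>(b / n)};
}

// Euclidean distance between two colors in RGB space.
double colorDistance(const Rgb& a, const Rgb& b) {
    const double dr = a.r - b.r, dg = a.g - b.g, db = a.b - b.b;
    return std::sqrt(dr * dr + dg * dg + db * db);
}

// Label of the closest reference color, or an empty string when no
// reference lies within maxDistance (the "we don't know" case).
std::string nearestLabel(const Rgb& sample, const std::vector<Reference>& refs,
                         double maxDistance) {
    std::string best;
    double bestDistance = maxDistance;
    for (std::size_t i = 0; i < refs.size(); ++i) {
        const double d = colorDistance(sample, refs[i].color);
        if (d <= bestDistance) {
            bestDistance = d;
            best = refs[i].label;
        }
    }
    return best;
}
```

One plausible reason such a scheme struggles is that RGB distance mixes brightness with color, so lighting changes across the table can move a ball's average closer to the wrong reference.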

List of Classes and Methods Implemented so far:

The Model Section :

Ball Class
Description:
Ball describes the set of balls used for play. It also uses the BallState class to return a BallState that describes a particular ball's position and current movement on a table.
Methods : public
Returns	Method Calls	Description
	Ball()	Constructor
	~Ball()	Destructor
Int	IsStripe(char *)	Returns 1 if specified ball is a stripe, 0 otherwise
BallState *	State(char *)	Returns a state object for the specified ball
Image *	SampleOf(char *)	Returns a sample image of the ball requested
COLOR	ColorOf(char *)	Returns the color of the specified ball; if the ball is unknown, returns NULL
Methods : private
Void	InitColors(void)	Initializes the file names for ball samples
Void	InitFiles(void)	Initializes the file names for ball samples
Comments
The Ball class also contains a list of strings used to identify balls used in game

BallState Class
Description:
The BallState class is a simple object that holds a particular ball’s position on the table and how much the ball is moving on the table. BallState also keeps track of whether the ball is on the table. If it is not, position and movement are not available. User can change the position, movement and on/off table information.
Methods : public
Returns	Method Calls	Description
	BallState ();	Constructor
	BallState(char *)	Constructor - sets label
	~BallState()	Destructor
char *	GetLabel(void)	Returns label of ball
Void	SetX(float)	Sets the x data member; setting x or y marks the ball as on the table
Void	SetY(float)	Sets the y data member; setting x or y marks the ball as on the table
Void	SetMoveX(float)	Sets the x movement data member; setting x or y marks the ball as on the table
Void	SetMoveY(float)	Sets the y movement data member; setting x or y marks the ball as on the table
Float	GetX(void)	Returns the x data member; returns NULL if the ball is not on the table
Float	GetY(void)	Returns the y data member; returns NULL if the ball is not on the table
Float	GetMoveX(void)	Returns the x movement data member; returns NULL if the ball is not on the table
Float	GetMoveY(void)	Returns the y movement data member; returns NULL if the ball is not on the table
Int	OffTable(void)	Returns 1 if the ball is not on the table, 0 if it is
Void	SetOffTable(void)	Marks the ball as being off the table
Methods : private
Void	Initxy(void)	Initialization
Comments

Table Class
Description:
The Table class is based on the Ball class, so a Table has all the information about a particular set of balls. An array of BallStates is kept for the balls used in play. The methods available in the BallState class are mirrored in Table, with the addition of requiring the label of the ball of interest. The Table class is the only class users need be concerned with. The method in Ball that returns the list of labels of the balls for which information is available is overridden to return just the balls in play. A time stamp for this object will be implemented later.
Methods : public
Returns	Method Calls	Description
	Table()	Constructor
	~Table()	Destructor
BallState *	State(char *)	Returns the current state of the ball requested
int	SetX(char *, float)	sets same as BallState class
int	SetY(char *, float)	sets same as BallState class
int	SetMoveX(char *, float)	sets same as BallState class
int	SetMoveY(char *, float)	sets same as BallState class
int	SetOffTable(char *)	marks specified ball as being off the table
float	GetX(char *)	Returns same as BallState class
float	GetY(char *)	Returns same as BallState class
float	GetMoveX(char *)	Returns same as BallState class
float	GetMoveY(char *)	Returns same as BallState class
int	OffTable(char *)	Returns 1 if ball is not on table Returns 0 if ball on table
double	GetBallRatio(void)	Returns the ball ratio of table
double	GetWidthRatio(void)	Returns the width ratio of table
double	GetLengthRatio(void)	Returns the length ratio of table
double	GetPocketRatio(void)	Returns the Pocket ratio of table
TimeStamp *	GetTimeStamp(void)	Returns the timestamp for the table Returns NULL if there is none
int	SetTimeStamp(int minutes, float seconds)	Checks for existing timestamp, if one exists, nothing changes, otherwise timestamp set as indicated
int	SetTimeStamp(int minutes)	Checks for existing timestamp, if one exists, nothing changes, otherwise timestamp set as indicated
int	SetTimeStamp(float seconds)	Checks for existing timestamp, if one exists, nothing changes, otherwise timestamp set as indicated
Methods : private
void	Init(void)	Initializes the array of ball states
Comments
All methods return NULL or -1 if the requested ball is not in use. All methods whose return type is int return 1 for success. The class also holds an array of labels for the balls, const char * ballList[NUMBALLS], so that it can be passed back when asked for.

The Analyzer Section :

Analyzer Class
Description:

Methods : public
Returns	Method Calls	Description
	Analyzer()	Constructor
	~Analyzer()	Destructor
Table *	GetCurrentTableState(void)	Returns a pointer to the most recent Table object, which represents the current state of the pool table model; returns NULL if no Table objects exist yet
Shot *	GetShot(void)	Returns a pointer to a Shot object to be used throughout the existence of the Analyzer. Only one shot can be called at a time
bool	CallShot(void)	Notifies the back end that the current Shot object is set. Returns true if the Shot object is a valid shot (i.e., all balls in the shot are on the table, etc.); returns false otherwise
Table *	InstantModelReplay(void)	Returns a pointer to a list of Table objects representing the last shot that was made; returns NULL if no shot has been taken yet
Table *	InstantModelReplay(int n)	Returns a pointer to a list of Table objects representing the nth shot of the current game/instance of the brain object; returns NULL if the nth shot doesn't exist
Table *	CalledShotModelReplay(void)	Returns a pointer to a list of Table objects representing the shot that was called, i.e., what the model would have looked like if the shot had been successful; returns NULL if no shot was called
int	GetResult(void)	Returns an int code, defined in the header file, that generalizes the result of the last shot made (e.g., shot successful, shot failed)
ImageList *	GetInstantMovieReplay(void)	Returns a pointer to a movie object holding the important frames captured from the last shot made, important frames being those in which events occur; returns NULL if no shots have been taken
ImageList *	GetInstantMovieReplay(int n)	Returns a pointer to a movie object holding the important frames captured from the nth shot made in the game/current instance of this brain object; returns NULL if the nth shot does not exist
int	GetCurrentState(void)	Returns a code (listed at bottom of header file) representing the current state of the analyzer
Comments
IDLE 0 - the analyzer is not doing anything

Shot Class
Description:
Will be used for communication between the analyzer and the front end. Represents a shot.
Methods : public
Returns	Method Calls	Description
	Shot();	Constructor
	~Shot();	Destructor
Methods : private
Comments
This class is not yet implemented

Part of the Image stuff :

TimeStamp Class
Description:
Acts as a once-settable time stamp, mainly for identifying Images and Table States
Methods : public
Returns	Method Calls	Description
	TimeStamp(int m)	Constructor
	TimeStamp(float s)	Constructor
	TimeStamp(int m, float s)	Constructor
	~TimeStamp()	Destructor
Int	GetMinutes(void)	Returns the minutes
Float	GetSeconds(void)	Returns the seconds
Methods : private
Comments
Future plans may include a 60-second cap on the seconds field, with any excess automatically carried over to the minutes
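The planned cap on the seconds field might look something like the sketch below. The `NormalizedTime` struct and `normalizeTime` function are hypothetical and are not part of the TimeStamp class as it stands; the sketch also assumes that the seconds value is never negative.

```cpp
struct NormalizedTime { int minutes; float seconds; };

// Carries whole minutes out of the seconds field so that the returned
// seconds value lies in [0, 60). Assumes seconds >= 0.
NormalizedTime normalizeTime(int minutes, float seconds) {
    if (seconds >= 60.0f) {
        const int extra = static_cast<int>(seconds / 60.0f);  // whole minutes held in 'seconds'
        minutes += extra;
        seconds -= 60.0f * extra;
    }
    return { minutes, seconds };
}
```

For example, a stamp built from 1 minute and 75.5 seconds would be stored as 2 minutes and 15.5 seconds.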

ImageList Class
Description:
Holds and manages a list of Images for possible playback or reference; it might also be used with Vision to help with ball identification
Methods : public
Returns	Method Calls	Description
Methods : private
Comments
This class is not yet implemented

Part of the Vision Section :

Identifier Class
Description:
This will guess the identity of a ball in a specified image
Methods : public
Returns	Method Calls	Description
	Identifier()	Constructor
	~Identifier()	Destructor
char *	Identify(Image *)	Takes in an image, returns string identifying the ball contained in the image. returns NULL if we don't know which ball it is
Methods : private
double	Magnitude(COLOR, int, int , int)	Calculate magnitude of distance between two colors in color space
COLOR	Average(Image *)	Finds Average Pixel Value for Image
char *	IdentifySolid(Image *)	takes in image of possibly solid ball and identifies it
char *	IdentifyStripe(Image *)	takes in image of possibly striped ball and identifies it
HSICOLOR	InitHSICOLOR()	This function allocates 3 floats for an HSICOLOR
HSICOLOR	Normalize(COLOR)	This function takes in a COLOR and normalizes its R, G, B components so that each lies in the interval [0, 1]
HSICOLOR	RGBtoHSI(COLOR)	This function takes in a COLOR, calls the Normalize function on it, and converts the normalized red, green, and blue values to hue, saturation, and intensity values
float	MinColor(HSICOLOR)	gets min of normalized R, G, or B for use in calculating saturation

Comments
Currently changing to use an HSI color model instead of RGB
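As an illustration of the conversion that Normalize, MinColor, and RGBtoHSI describe, the sketch below uses the common textbook RGB-to-HSI formulas. It is not the project's code: the `Rgb` and `Hsi` structs are hypothetical stand-ins for COLOR and HSICOLOR, channels are assumed to be 8-bit and normalized by dividing by 255, and hue is returned in radians.

```cpp
#include <algorithm>
#include <cmath>

struct Rgb { unsigned char r; unsigned char g; unsigned char b; };  // stand-in for COLOR
struct Hsi { float h; float s; float i; };  // hue in radians [0, 2*pi), s and i in [0, 1]

Hsi rgbToHsi(const Rgb& c) {
    // Normalize each channel to [0, 1] (assumes 8-bit channels).
    const float r = c.r / 255.0f, g = c.g / 255.0f, b = c.b / 255.0f;

    const float intensity = (r + g + b) / 3.0f;
    const float minimum = std::min(r, std::min(g, b));
    const float saturation = (intensity > 0.0f) ? 1.0f - minimum / intensity : 0.0f;

    // Hue is the angle around the gray axis; it is undefined for grays,
    // where the denominator vanishes, so 0 is returned in that case.
    const float num = 0.5f * ((r - g) + (r - b));
    const float den = std::sqrt((r - g) * (r - g) + (r - b) * (g - b));
    float hue = 0.0f;
    if (den > 1e-6f) {
        const float ratio = std::max(-1.0f, std::min(1.0f, num / den));
        hue = std::acos(ratio);
        if (b > g) hue = 6.28318530718f - hue;  // lower half of the color circle
    }
    return { hue, saturation, intensity };
}
```

Because hue is largely independent of brightness, comparing hues may be less sensitive to uneven lighting than comparing raw RGB values.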

Interface

This has been a very busy term for the final interface. Throughout the term we have designed the framework for the communication between the interface and the back end. We have also designed the communication between the front end and the user. In this section of the term report we will look at flow control for the application's interface and some issues in the user interface design.

There were many issues in the design of the interface: how the user would enter information, how information would be presented to the user, how information would be sent to the back end, and how the back end would send new information to the interface. Figure 3.1 shows the basic flow control of the interface.

Figure 3.1: Flow Control

From this diagram you can see that there will be as little communication as possible between the interface and the back end. The reason is that the back end is fairly processor-intensive, and the interface has to be able to provide feedback to the user quickly. The interface will construct its own method of creating a sequence of events, or a shot. This gives us the advantage of not having to communicate with the back end until a shot has been constructed and verified to be correct. Once the interface knows that this shot is the one that the user is trying to take, it will call the back end. The back end will then pass a pointer to a shot object. The interface will then fill this shot object in with the information the back end needs to figure out if the shot was successful. After the user takes the shot, the back end will analyze it and return a result code to the interface. The interface will then interpret this result code and provide the user with appropriate feedback.

Now we’ll look at the specifics inside the interface. A screen shot of the interface can be seen in Figure 3.2. The interface is a multiple-window application. The two main windows that the user is presented with are a model of the table and a text interface window. These two windows are based on different document types. This presented a minor problem in the early stages of the interface implementation: there was some difficulty in getting the application to start with both of these windows open because of the different document types. This was one of many problems caused by the steep learning curve of Microsoft Visual C++.

Figure 3.2: Interface Screen Shot

The next problem caused by the VC++ learning curve was the communication between the view and the document of a window. Each window has a document class associated with it. This document class contains functions and data about the information in the window. Each window also has a view class, which holds the functions that place information from the document class on the screen. For example, in the model window the document class contains information about the table, the location of balls and pockets on the table, and how to convert the ball location ratios the back end uses into exact pixel values for the display. The model’s view class contains functions that take this table information and place all of the objects in the right location in the right window on the screen. This view class also contains the handlers for mouse events in the model window. When the user clicks on a ball, this is the class that handles those window events. Once we had the class communication nailed down, we began serious work on the real interface implementation.
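The ratio-to-pixel conversion mentioned above could be as simple as the following sketch. It assumes, without confirmation from the back end's definition, that a ball position is expressed as a pair of ratios in [0, 1] along the table's length and width; `ViewRect`, `PixelPoint`, and `ratioToPixel` are hypothetical names.

```cpp
#include <cmath>

struct PixelPoint { int x; int y; };
struct ViewRect { int left; int top; int width; int height; };  // table area in the window

// Maps a position given as ratios of the table's extent (assumed to lie in
// [0, 1]) to a pixel inside the rectangle where the table is drawn.
PixelPoint ratioToPixel(double xRatio, double yRatio, const ViewRect& area) {
    return { area.left + static_cast<int>(std::lround(xRatio * area.width)),
             area.top + static_cast<int>(std::lround(yRatio * area.height)) };
}
```

Keeping positions as ratios until the moment of drawing means the model window can be resized without changing anything the back end reports.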

First we’ll look at how the user interacts with the interface. The goal of the interface is to let the user tell the analysis part of this project what they are trying to do for a shot. The interface also translates the response from the analysis section into a language that the user can understand.

There are several ways for the user to tell the interface what shot they intend to take. The first is to click on the sequence of the shot in the model. For example, if the user intended to have the cue ball hit the eight ball, the eight ball hit the nine ball, and the nine ball go into the corner pocket, the user would click on the cue ball, the eight ball, the nine ball, and then the intended pocket. While this sequence is being defined, the ball numbers the user clicks are displayed in the text window. The shot sequence is completed either by clicking the "Commit Shot" button in the text window or by double-clicking on the target pocket.

An alternate method of describing a shot is located in the text window. Most expert users prefer a text interface to a mouse, so we also allow the user to enter a string into a text input area of the text window. This string consists of the balls they intend to hit, in number form, and the pocket they intend to place the ball(s) into, in letter form. Using the same example as before, the user would enter "0 8 9 A" into the text input area. The numbering of the balls is self-explanatory, but the pocket letters are a little more confusing: the pocket in the upper left corner is lettered A, and the lettering proceeds in alphabetical order clockwise around the table. After the shot string is entered, the user clicks the "Commit Shot" button.

Before the user double-clicks on a pocket or clicks the "Commit Shot" button, they can cancel the shot at any time by clicking the "Cancel Shot" button, also located in the text window. This is the extent of the user interaction with the interface.
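A shot string such as "0 8 9 A" could be validated along the following lines before the shot is committed. This is a sketch under assumptions the report does not state: ball numbers run from 0 (the cue ball) to 15, the table has six pockets lettered A through F, lowercase pocket letters are accepted, and `ShotRequest` and `parseShot` are hypothetical names.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct ShotRequest {
    std::vector<int> balls;  // balls in the order they are to be hit
    char pocket;             // target pocket letter
};

// Parses whitespace-separated ball numbers followed by one pocket letter.
// Returns false and leaves 'out' untouched if the text is not a valid shot.
bool parseShot(const std::string& text, ShotRequest& out) {
    std::istringstream in(text);
    std::vector<std::string> tokens;
    std::string token;
    while (in >> token) tokens.push_back(token);
    if (tokens.size() < 2) return false;  // need at least one ball and a pocket

    ShotRequest shot;
    const std::string& last = tokens.back();
    if (last.size() != 1) return false;
    const char pocket =
        static_cast<char>(std::toupper(static_cast<unsigned char>(last[0])));
    if (pocket < 'A' || pocket > 'F') return false;  // assumed six pockets
    shot.pocket = pocket;

    for (std::size_t i = 0; i + 1 < tokens.size(); ++i) {
        const std::string& t = tokens[i];
        if (t.size() > 2) return false;
        int value = 0;
        for (char ch : t) {
            if (!std::isdigit(static_cast<unsigned char>(ch))) return false;
            value = value * 10 + (ch - '0');
        }
        if (value > 15) return false;  // assumed ball numbers 0 to 15
        shot.balls.push_back(value);
    }
    out = shot;
    return true;
}
```

Rejecting malformed input in the interface fits the goal of not contacting the back end until a shot has been constructed and verified.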

Now that the user can tell the interface what they want, the interface has to tell the user what they really did. The back end will return a result code to the interface, and the interface will interpret that code to determine whether the user succeeded or what the user did wrong. The back end will present the interface not only with a shot result, but also with a table state. This table state will contain information about which balls are located where on the table. It will be returned to us when we ask for it, and whenever a shot result is sent to the interface. When a shot result is sent to the interface, the user will also have the option to look at an instant replay, in the model window, of the shot they just took. This instant replay will be provided by the back end via an array of table states.

The final element of this interface could be a real-time video window showing the input the computer is getting. We have recently obtained an SDK (Software Development Kit) for the video capture hardware we own. We hope not only to provide the back end with video images from this SDK, but also to add a new window to the interface containing the real video input from the camera. This change depends on the time constraints of the project.

We have completed a lot of the user interface this term. We started the implementation at the beginning of the term, and have a little "cleaning-up" of the interface to do next term. The only thing that could add serious time to the interface development would be the inclusion of the real-time video window. We feel that a lot of work has been done on the interface side of the project this term. Not only is almost the entire interface done, but we had to do it fighting with Microsoft’s design environment every step of the way. We are very happy with the progress made this term on the interface.

Conclusion

This was a very busy term overall for the project. We have completed many of the sub-sections of the project and are now starting to put them all together. Within the first few weeks of our final term we hope to complete the assembly of the remaining sub-sections and to finalize any changes within the sub-sections. After we have an assembled application, we will spend the rest of the term preparing our presentation and final report. Up to this point we all think that this project has not only presented us with many technological challenges, but also introduced us to serious group project work. We will take a lot more than just technological knowledge away from this project.