Saturday, 20 December 2014

Dice reading - photo image processing

One of the methods suggested for reading back dice rolls for our electronic board game is to use image processing. This seemed like massive overkill, as well as being quite nasty to implement. Sure, we're using a smart device/tablet with our board game and it has a camera in it - so holding a dice up to the camera and getting it to read the value off it makes sense. Sort of.

Until you consider how this would play out in the middle of a game. At key decision points in the game, roll a dice (or two or three) then show them, one at a time, to the camera on the smart device. Which may be a front-facing camera. Or it may be round the back. Or it might be off to one side, so you don't actually present the dice to the screen displaying the game details, but to a tiny lens on the opposite side and towards the edge of the device. Except in landscape mode, the camera isn't at that end, because the device is rotated 90 degrees the other way around... it just gets really nasty, really quickly.

Steve pointed out that there are OV7670 UART-cameras available relatively cheaply all over the net and that at £2 each, they're only a little bit more expensive than a matrix of five reflective sensors. Our little 8-bit micro is unlikely to have the grunt-power to do much image processing (though, reading through the code below, if one could be found with enough RAM, it might just be possible) so the idea is to snap an image with the camera, stream it from the camera GRAM into the microcontroller, and send the data, byte-by-byte to the host app over wifi (using one of the ESP8266 UART-to-wifi modules).

Now we're working with the original v2 firmware on these devices, which runs at 115200bps. An image taken at 640x480 is a massive 307,200 pixels. Even at a really low colour resolution (where each pixel is a single 3G3R2B byte, so 307,200 bytes per frame) that would take over 20 seconds to transfer just one frame.

Luckily, the OV7670 supports a number of output modes, one of which, QQCIF, scales the captured image down to 88x72 pixels.

88x72 = 6336 pixels. At one byte per pixel that's 6336 bytes, or 50,688 bits, which could be transferred in less than half a second over the 115200bps UART-to-wifi link. Even at the "higher setting" of one byte per colour (three bytes per pixel) this is about one-and-a-half seconds to transfer the image; a much more reasonable delay time.
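The transfer-time arithmetic above is easy to sanity-check. Here's a quick sketch (in TypeScript rather than our AS3, and the function name is ours, not from any real driver), assuming the quoted 115200bps link and ignoring start/stop-bit overhead, which adds roughly 25% on a real 8N1 UART:

```typescript
// Seconds to push one raw frame over a serial link of the given bit rate.
function transferSeconds(width: number, height: number, bytesPerPixel: number, bps: number = 115200): number {
  return (width * height * bytesPerPixel * 8) / bps;
}

console.log(transferSeconds(640, 480, 1)); // VGA at one byte per pixel: ~21.3s
console.log(transferSeconds(88, 72, 1));   // QQCIF at one byte per pixel: ~0.44s
console.log(transferSeconds(88, 72, 3));   // QQCIF at three bytes per pixel: ~1.32s
```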

So what does an 88x72 bitmap look like?
Here's a photo we took with a regular camera phone, of two dice on a clear plastic acrylic sheet, with the backlight on (to illuminate the face of the dice being photographed).

When scaled down and reduced to a 1-bit image (all our image processing will be done on single-bit data) it looks like this:

Although not perfect, it retains enough information to show which dots are showing on the dice. All we need to do is process the image and extract the dice values. Which is easier said than done!

Steve suggested OpenCV and an ANE to get existing processing routines running in Flash (we're coding our native apps in Flash, then compiling the same source code down for iOS, Android and PC). This took a lot of time to set up and understand, and fail to understand, and give up on. Eventually we decided to just code our own dot-recognition routine!

We can't be sure that the dice in the image are "square" to the frame - they could be at any old angle. But irrespective of the angle, we're expecting to find one or more "clusters" of dots in the image. That is, one or more instances of a dot: a black point surrounded by white points. So the first thing we do is scan the entire bitmap (it's only 6336 pixels, remember: looping for(i=0; i<6336; i++) is actually pretty quick in AS3) and look for a black pixel with a white pixel above, a white pixel below, a white pixel to the left and a white pixel to the right.

Whenever we find this combination of points, we compare the centre pixel to the centre pixel of previously found "dots" on the dice. If it's within a few pixels, there's a very real chance that it's actually one of the dots we've already found, so it's ignored. If it is a new dot, however, we add it to an ongoing array.

During development, we wrote a routine to draw each discovered dot in our black-and-white bitmap image. Amazingly, it correctly drew the dice dots in the right place, first time!

After parsing the image once, we end up with an array of co-ordinates where our black-dots-surrounded-by-white appear. The trick now is to group these into "clusters" of dots, to work out what the actual face values are.

We give every set of co-ordinates in the array a "cluster number" - all co-ordinate groups begin set to zero. The first time we find a co-ordinate point without a cluster number, we start a new cluster and give the point the next cluster number, then loop through all the other points, looking for any other dot within a few pixels of this one that also has no "cluster number". By calling a couple of recursive functions, we can give every dot in the image a "cluster number" by working out which other dots it's closest to.

Once all the dots in the image have been given their "cluster number", the value of each dice is easy to read back. If we gave three dots a cluster number of one, dice one has the value three. If five dots were given cluster number two, it means that dice two has the value five, and so on.
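The whole clustering step can be sketched in a few lines. Here's a minimal TypeScript version (our real implementation is the AS3 below; the names here are ours, and a loop-until-stable sweep stands in for the recursion) that grows each cluster by repeatedly claiming any unclustered dot within a few pixels of a dot already in the cluster, then counts each cluster's dots to get the face values:

```typescript
type Dot = { x: number; y: number; cluster: number };

// Group dots into clusters (dots within maxGap pixels of each other share a
// cluster) and return the size of each cluster: counts[i] is dice i+1's value.
// The 7-pixel gap matches the threshold used in our AS3 code.
function clusterDots(dots: Dot[], maxGap: number = 7): number[] {
  let cluster = 0;
  for (const d of dots) {
    if (d.cluster !== 0) continue;  // already claimed by an earlier cluster
    d.cluster = ++cluster;          // start a new cluster at this dot
    // Sweep until stable: any unclustered dot near a dot already in this
    // cluster joins it (the AS3 version does the same job with recursion).
    let grew = true;
    while (grew) {
      grew = false;
      for (const a of dots) {
        if (a.cluster !== cluster) continue;
        for (const b of dots) {
          if (b.cluster === 0 && Math.abs(b.x - a.x) <= maxGap && Math.abs(b.y - a.y) <= maxGap) {
            b.cluster = cluster;
            grew = true;
          }
        }
      }
    }
  }
  // Count the dots in each cluster to get the face values.
  const counts: number[] = new Array(cluster).fill(0);
  for (const d of dots) counts[d.cluster - 1]++;
  return counts;
}

// Two dice, well separated in the frame: one showing 3, one showing 5.
const dieA: Dot[] = [{ x: 10, y: 10, cluster: 0 }, { x: 15, y: 15, cluster: 0 }, { x: 20, y: 20, cluster: 0 }];
const dieB: Dot[] = [{ x: 60, y: 10, cluster: 0 }, { x: 66, y: 10, cluster: 0 }, { x: 60, y: 16, cluster: 0 }, { x: 66, y: 16, cluster: 0 }, { x: 63, y: 13, cluster: 0 }];
console.log(clusterDots([...dieA, ...dieB])); // [3, 5]
```

Note that the sweep chains dots together: two dots more than 7 pixels apart still land in the same cluster if a third dot bridges the gap between them.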

The great thing about this approach is that it automatically adapts to more than one or two dice: so long as the dice are separated so that the dots appear in definite, distinct groups, there's no reason why this routine can't detect the face values of three, four, five or more dice in a single image.

Here's the code:

import flash.display.Bitmap;
import flash.display.BitmapData;
import flash.display.Shape;
import flash.geom.Point;

var bmp:BitmapData=new BitmapData(imgHolder.width, imgHolder.height, true, 0x00000000);
var foundDots:Array = new Array();
var p0:int=0;
var p1:int=0;
var p2:int=0;
var p3:int=0;
var p4:int=0;

var clusterChar:int=0;
var clusterIndex:int=0;
var lastDot:Object;

function findDots(){
     // stay 3 pixels inside the border so the offset reads below
     // never fall outside the bitmap
     for(var y:int=3; y<imgHolder.height-3; y++){
           for(var x:int=3; x<imgHolder.width-3; x++){
                 // check to see if you can find a dot
                 p0 = bmp.getPixel(x,y);
                 p1 = bmp.getPixel(x-3,y);
                 p2 = bmp.getPixel(x+2,y);
                 p3 = bmp.getPixel(x,y-3);
                 p4 = bmp.getPixel(x,y+2);
                 // a dot is a black pixel surrounded by white pixels
                 if(p0==0x00 && p1!=0x00 && p2!=0x00 && p3!=0x00 && p4!=0x00){
                       // this looks like a black dot on a white background
                       // but check the array to see if we've ever found a dot within a few
                       // pixels of this one (it might be the same dot)
                       // if so, skip this dot (you've already found it)
                       // otherwise add to the array of found dots
                       if(!similarPixel(x,y)){
                             var o:Object = new Object();
                             o.coords = new Point(x,y);
                             o.cluster = 0;
                             foundDots.push(o);
                       }
                 }
           }
     }
}

function drawFoundDots(){
     var circle:Shape = new Shape(); // The instance name circle is created
     for(var i:int=0; i<foundDots.length; i++){
           circle.graphics.beginFill(0x990000, 1); // Fill the circle with the colour 990000
           circle.graphics.lineStyle(1, 0x000000); // Give the circle a black, 1 pixel thick line
           circle.graphics.drawCircle(foundDots[i].coords.x, foundDots[i].coords.y, 4); // Draw the circle, assigning it an x position, y position and radius
           circle.graphics.endFill(); // End the filling of the circle
     }
     addChild(circle); // Add the shape to the display list
}

function parseDots(){
     var dotValue:int=0;
     var dots:Array=new Array();
     // find the first dot in the array that doesn't have a cluster number
     lastDot = getDotWithNoClusterChar();
     while(lastDot != null){
           clusterIndex++;
           lastDot.cluster = clusterIndex;
           trace("found start of cluster "+clusterIndex+" at index "+lastDot.index);
           // recursively claim every dot connected to this cluster
           getConnectedDotForCluster(clusterIndex);
           // the face value is the number of dots in the cluster
           dotValue = 0;
           for(var j:int=0; j<foundDots.length; j++) if(foundDots[j].cluster==clusterIndex) dotValue++;
           trace("dotValue = "+dotValue);
           dots.push(dotValue);
           lastDot = getDotWithNoClusterChar();
     }
     for(var i:int=0; i<dots.length; i++){
           trace("dice "+i+" value "+dots[i]);
     }
}

function getConnectedDotForCluster(indx:int){
     var found:Boolean=false;
     for(var i:int=0; i<foundDots.length; i++){
           if(foundDots[i].cluster != indx) continue;
           for(var j:int=0; j<foundDots.length; j++){
                 if(j!=i && foundDots[j].cluster==0){
                       if(Math.abs(foundDots[j].coords.x-foundDots[i].coords.x)<=7 && Math.abs(foundDots[j].coords.y-foundDots[i].coords.y)<=7 ){
                             trace(i+" is connected to another dot in cluster "+indx);
                             foundDots[j].cluster=indx;
                             found=true;
                       }
                 }
           }
     }
     // newly-claimed dots may connect to further dots, so recurse until none are added
     if(found) getConnectedDotForCluster(indx);
}

function getDotWithNoClusterChar():Object{
     var o:Object=null;
     for(var i:int=0; i<foundDots.length; i++){
           if(foundDots[i].cluster==0){ o=foundDots[i]; o.index=i; break; }
     }
     return(o);
}

function similarPixel(ix:int, iy:int):Boolean {
     var found:Boolean=false;
     for(var i:int=0; i<foundDots.length; i++){
           if(Math.abs(foundDots[i].coords.x-ix) < 4 && Math.abs(foundDots[i].coords.y-iy) < 4){
                 // found a similar pixel
                 found=true;
           }
     }
     return(found);
}


Below are the results of some of our testing. To date we've tested it on about a dozen photos of dice (all taken from the same distance, since any device using this approach would have a plate at a fixed height above a fixed-position camera) and each time it has correctly reported back the dice values on the faces in the photo.

Obviously, in real use, we'd need to subtract the dice face value from seven to infer the value that was face-up on the dice (since we're taking a photo of the dice face that is face-down on the clear surface) but that is just a trivial application of our dice-reading routine.
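That last mapping really is a one-liner (sketched here in TypeScript; the function name is ours), since opposite faces of a standard die always sum to seven:

```typescript
// The face we photograph through the clear plate is face-down,
// so the rolled (face-up) value is seven minus what we detected.
function faceUpValue(faceDown: number): number {
  return 7 - faceDown;
}

console.log(faceUpValue(2)); // photographing a 2 means a 5 was rolled
```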
