Ladstatt: July 2013

In this blog post I want to continue the work on the sudoku solver and use my webcam to grab the sudokus and show the solution right away in the grabbed picture.

Here is a video of a recent version of the application:

This video is in black and white from an earlier version:

And here is a screenshot:

Augmented reality display of a newspaper sudoku

What happened in addition to the last posting on the sudoku2go program is the following:

grabbed images of digits are reused when showing the solution

I thought it would be a nice idea to reuse the digits already available to show the solution. Like that there is no mismatch between the original font used and the fields which are to be filled in. In the screenshot above you can verify that the result really looks as if a solved sudoku was printed (well sort of ;-) )

live grabbing from a continuous stream of pictures

When showing the application to friends this was almost always the first question - here is the newspaper with the sudoku, now solve it! I've added a webcam functionality to the program (reusing code from my isight-java webcam project) which gives an immediate feedback to the user.

border around the whole puzzle

I thought some colored border would provide a better feedback to the user. Using JavaFX for painting such a border is far more powerful and less of a hassle than to use OpenCV's possibilities. Like that you get effects like DropShadow for free.

overlay of the solution on the original input image

Maybe you have a look into the source how I create the sudoku solution by reusing a digit "library" of grabbed images. I'm using JavaFX and its screenshot API to create a picture which then is "rewarped" again by openCV to fit in the original image. This could be improved by just using openCV's Mat class I suppose, which has to be faster, too. Anyway, the "blendMode" feature of JavaFX saved me much (development) time. ;-)

Improved speed

I've measured the performance bottlenecks of the sudoku program, and found some places to be optimized. Specifically I've introduced Scala Futures in order to decide which cell contains which digit (or no digit). After segmentation this is a nice example to apply parallel computation and it turns out this speeds up things considerably. Like this you get a parallel computation of your subtasks, and collect them as they finish.

Ideas to improve the application

Anyway, in its current state the program does what I originally wanted from it - show the webcam the newspaper, you get the solution back right away in augmented reality. As always, there are many ways to improve the application - for example a more intelligent way to recognize the digits, a more responsive ui, using an approach which reuses information collected in the past for the current measurement, etc... maybe you want to fork the project and give it a spin?

In this post I'll show you how to build an Application which solves a Sudoku puzzle from a o photo of a local newspaper.

First of all, here is a video of the application:

Maybe a screen shot suffices for most visitors:

OpenCV for image processing, JavaFX for visualization and Scala as the driving force

Ok. Fine. Source Code!

Click here for the complete source code. Make sure you have openCV installed however and change the path of the native libraries matching your setup. At the moment you have to start it from your IDE.

Why a sudoku program?

My main motivation to create the program was to explore the possibilities openCV offers a little more in detail, and tackle a non trivial (of course "non triviality" always lies in the eye of the beholder) problem with it. As it turned out there are many blog postings around which describe how one could recognize the digits and their position and as such the whole sudoku puzzle. Nevertheless it was a pleasant experience to implement it on my own. (Disclaimer: Needless to say I would have been totally lost without reading all available online resources, browsing through blogs and books, trying out stuff and redoing it all over ... )

Anyway, when creating such a program, you have two major challenges:

Image recognition
Solve the sudoku problem itself

I was more interested in the image processing domain since it was apparent to me that the solving algorithm is a very deep snake pit. But more on that later.

How to squeeze out digits out of an image?

It turns out that it is relatively easy to do this with openCV. The main idea is that every sudoku is surrounded by a border and the sudoku is defined by a 9x9 matrix of squares containing the digits. As such, the first task is to identify the borders of the sudoku. It goes without saying that everything else on the input image should be ignored, only the sudoku remains as a region of interest. Furthermore you know in advance what you want to extract out of the photo - you want to know which square at which position contains a number or not, and you want to determine the number.

There are several preprocessing steps involved such that it is easier for the openCV algorithms to detect the borders properly. The idea is that the surrounding borders define the rectangle with the biggest area on the photo, which is a valid assumption if you create such a program. Concerning the recognition of the digits there exist several approaches like using a neural network or even a pre-made library like tesseract for doing this tasks. Tesseract is surely the choice to be made if you want to have a robust ocr engine, there exist also java bindings for it, I didn't try it out though.

Another approach, and maybe the simplest one, is template matching.

Image preprocessing

In order to filter out stuff we don't need for digit recognition, we apply certain preprocessing steps, which are shown in the screenshot below:

This is fairly self explanatory (and the implementations of the called functions are one liners calling some OpenCV library functions).

Finding the contour with maximum area

The result of this preprocessing step will be examined in order to find the shape with 4 sides with the maximal area.

Here you can see the power of Scala collections applied by filtering out only shapes with 4 sides and choosing the one with the biggest area. The result already contains the 4 points which define the sudoku borders.

Now the situation already turns out to be quite a good one: we know the corner points of our sudoku, but it still is placed somehow on the photo, we have to "normalize" it to be able to do template matching. OpenCV comes to the rescue, it has a function called "warpperspective" which does exactly what we want and is one very important aspect of the whole approach.

On the right side you'll see the sudoku which serves as an input for the next stage, detecting the numbers.

Detecting numbers

I made myself my life very easy by just dividing the resulting Mat from the previous warp step to be a 9x9 matrix. All cells are inspected separately and thus the problem is reduced to detect one number in one cell.

The contour with the biggest area will be searched in a subarea of the cell (in order not to have false positives with a sudoku line which quite often interferes here). Another assumption is here that the contour must contain the center of the sudoku square. With those two assumptions we get quite reliably the contours we are interested in.

Like this we get a list of contours which have to be matched against a template library. We have to make sure that the detected contour is normalized in order to properly compare it with our templates.

OpenCV has everything one needs for such an operation. In essence you need to compare two lists of points which each other and calculate the minimal distance between reference lists (associated with a number) and the list in question. The main problem is that you don't know how many points there are in the list you investigate, and as such the trouble starts. Here it is important to know which functionality is already available in the library, and which glue code between the calls to the library still have to be custom made.

I am sure that wheels get reinvented all the time - so make sure to double check before implementing a too low level concept in your code. (I would not be surprised if some steps in this small project are also superfluous)

I designed the program in a way that the detection method can be exchanged, above you see the approach using template matching. A set of reference images is compared to the slices picked out by all operations described above, and at the end we assume the best match is good enough and return the number associated with the template match.

Here is an example for some templates to match against:

You can see that I've choosen templates which aren't perfect either - the reasoning behind this is some kind of "fuzziness" in the matching should be possible. Try yourself what works best for you.

Almost there

Now we have identified the cells which are empty and distinguished them from the cells containing numbers. We are were most sudoku solver start anyway - the definition of the puzzle in some form of an array.

Assuming you have a function which knows how to solve such a puzzle it is easy to show the user the solution of the puzzle. There are some very nice thoughts about a sudoku solving algorithm online, make sure you check it out. I took somewhat of a shortcut and used following approach:

complete Sudoku solving algorithm

This is the first google hit for "scala sudoku solver" with small adaptions to fit into my program.

The output gets translated to JavaFX Labels which are shown when pressed on the "solve" button.

Ah. Great. What about this bumping effect??

You may have noticed in the video that the contour visualization bumps up and down when selected - I thought it would be nice to have such a thing in the program to make it a little bit more attractive. The contours are the shapes which are detected by openCV.

I googled for "bounce JavaFX" and stumbled upon this project, i've extracted the parts I needed and was astonished that there is not more code necessary for such an effect. In short, you just need to define the keyframes of the animation, and the rest is done by JavaFX.

Conclusion

To sum up, it was again a nice experience writing the program with my favorite toolchain. I'm very pleased about how well JavaFX and openCV can work together, especially if you use JavaFX for visualization and UI and let openCV do the core processing tasks, and let Scala be mediator between those two great libraries.

Anyway, I only scratched (again) the surface of all involved technologies, I'm sure that the integration of openCV with Scala for example could be improved a lot by creating a DSL on top of the available Java bindings. Some implicits would do the job, too - sometimes it is a little tedious to convert MatOfFoo to List[Foo] - applying here a little scala magic would declutter the API a lot.

Nevertheless already in their current state all three technologies can be considered - when used together - as a very powerful platform for creating image processing applications.

Update

Maybe you want to check out Part II of the Sudoku2go blog post series.

Ladstatt

Pages

Sunday, July 14, 2013

Sudoku Grabber and Solver - Part II