Project #1: Self-Playing Candy Crush


There are a lot of tasks we do in our lives that could benefit from automation: things that we do over and over with little change, or things that require close precision or constant attention. In the physical world that can be tricky, requiring parts, labour and expertise that make automation impractical. When the repetitive task involves a computer, automation is much easier.
In this series I will be documenting the process of automating a variety of tasks using a variety of automation technologies. By demonstrating the breadth and ease of automating, I hope that the reader is able to understand what is possible with automation, and recognize what they might be able to use it for in their own workflows and how they might go about it.


Download the completed script here

I’ll start with a script to play Candy Crush. Why Candy Crush? Because:

  • It’s popular and you’ve probably heard of it.
  • Games are fun to work with as sample projects.
  • It lets me demonstrate image recognition. More on this later.

If your goal was to actually cheat at a game like this, there are usually better ways, like hacking the save file. Automation in this case will let us play the game as humans do, which is a much subtler form of cheating and harder to stop 😉.

Beginning a project like this, let’s visit the website. I’m using:!/play/candycrush

First we’ll examine the game and see what we’re dealing with. A simple right click shows me we’re dealing with Adobe Flash.

Flash is relatively difficult to work with for automation. Because it runs as a plugin, the web browser isn’t aware of what’s going on inside it and can’t interact with it, so JavaScript and web-browser based automation are out. Instead we’ll be using image recognition based automation. For this I’ll be using Sikuli. Sikuli is Jython based, so Java and most Python libraries are accessible, and it comes with a graphical interface that makes it easy to create simple scripts without any programming knowledge. Since automating a game is going to involve the script making decisions, we’ll definitely need to do some programming.





Before we start, I’ll just summarize some advantages and disadvantages of Sikuli.

Advantages:

  • You can interact with programs that don’t provide useful feedback or tools for interacting with them programmatically (in this case, Flash).
  • You can potentially run a script on your own computer to recognize and interact with elements on the screen of another computer you’re connected to through VNC, RDP, or other remote connection software or protocol. This can be useful for automating something on a computer where you can’t install the tools to do so.
  • Through the Java or Python libraries you can do other scripting tasks, e.g. file manipulation.
  • Simple scripts can be created quickly and easily through the GUI.

Disadvantages:

  • Image recognition can be slow. Most applications should be fine, as searching the screen for an image should take less than 1 second. When running a lot of searches, as we will be doing to identify multiple screen elements, the search space should be restricted and some options used to speed things up.
  • False positives and missed matches can happen. You may need to adjust the precision of the match required to suit your purposes. Transparency, or screen artifacts over a remote connection, will require using less than a 100% match, while small images with a loose match tolerance can give false positives.


Finally before we get started on the actual script, let’s outline the process:

Step 1: Identify screen elements and positions.

Step 2: Use logic to determine what move to make.

Step 3: Interact with the game to make the move.

Repeat 1 through 3 until the game is finished.

Optional Step 4: Start the game again or move on to the next level.
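As a rough sketch, the outline above boils down to a loop like the one below. The function names (`read_board`, `choose_move`, `make_move`, `is_finished`) and the `max_moves` safety cap are placeholders chosen for illustration, not part of Sikuli; each one stands in for one of the steps.

```python
def play(read_board, choose_move, make_move, is_finished, max_moves=1000):
    """Repeat steps 1-3 until the game ends; returns the number of moves made.

    All four callables are hypothetical stand-ins:
      read_board()   -> step 1, returns whatever describes the board
      choose_move(b) -> step 2, returns a move or None if nothing was found
      make_move(m)   -> step 3, performs the move in the game
      is_finished()  -> True once the level is over
    """
    moves_made = 0
    while moves_made < max_moves and not is_finished():
        board = read_board()          # Step 1: identify elements and positions
        move = choose_move(board)     # Step 2: decide what to do
        if move is None:              # nothing recognised, stop rather than spin
            break
        make_move(move)               # Step 3: interact with the game
        moves_made += 1
    return moves_made
```

The cap on the number of moves is only there so a misbehaving `is_finished` check can’t leave the script running forever.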


Step 1:

First let’s start the game and take a screenshot:

From this screenshot we’ll cut out the pieces that we want to recognize. Sikuli lets you do this easily through its interface, but I’ll be using Microsoft Paint and saving to .png to keep the files organized. I’ll also have to play a bit to make sure I get pictures of all the different pieces.

Done. I’ve kept the screenshots these were taken from as some of these may need to be redone if there are false matches. In our picture library we have each different piece of candy, as well as the + button at the top left, and the X for the window that appears after a level is over.

Our image search logic will be to first search the whole screen for the + button, as it is a static element of the game. We’ll use it to figure out the position of the game on the screen and to define a more limited region of the screen to use for the candy searches. For the candy searches we will run a search on each different piece of candy to find all the locations where it appears on the screen. Then we can sort the matches by their X and Y coordinates to find their relative positions.

After some annoyed playing trying to get every candy to appear, and some cutting bits out of screenshots, I’ve begun the coding. Here’s a screenshot of the Sikuli IDE so you can see what it looks like. After this point I’ll just be posting code as text instead.

(note: I accidentally missed 5 candies at this point as there are horizontal and vertical stripe versions for each)

Above I’ve created an array of all the images we’ll later be searching for, and started defining the region of the screen we’ll be searching in. Technically, defining a region is not required, but without it every image search will cover the entire screen, and with the large volume of searches this will add up to a lot of wasted time and CPU. I am also going to define the region’s coordinates in relation to a calibration image (the +). Otherwise, if the game is in a different position for any reason (e.g. the window is moved or resized, the script is running on a computer with a different resolution, or even the website is scrolled down), the region’s position will be thrown off and the script won’t work properly.
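To make the calibration idea concrete, here is a minimal sketch in plain Python. It assumes the + button has already been found at some screen position; the offset and size values in the usage example are made up for illustration and would need to be measured from a real screenshot. In Sikuli the resulting numbers could be passed to `Region(x, y, w, h)`.

```python
def region_from_anchor(anchor, offset, size):
    """Return (x, y, w, h) of the search region, relative to the calibration image.

    anchor: (x, y) where the calibration image (the + button) was found
    offset: (dx, dy) from the anchor to the region's top-left corner (measured by hand)
    size:   (w, h) of the region in pixels
    """
    anchor_x, anchor_y = anchor
    dx, dy = offset
    width, height = size
    return (anchor_x + dx, anchor_y + dy, width, height)


# Hypothetical numbers: the + was found at (100, 50) and the board starts
# 40 px to the right of it and 30 px below it.
print(region_from_anchor((100, 50), (40, 30), (639, 567)))
```

Because everything is expressed relative to the anchor, moving or scrolling the game only changes `anchor`, and the region follows along.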

I take some pixel measurements from a screenshot with Paint and find that each “cell” the candies sit in measures 71×63 pixels. This will come in handy later when sorting their positions. I also adjust the region’s position and size so it lines up with these cells. Now the code for Step 1:
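The cell measurements make it possible to turn each match’s pixel position into a row and column by integer division. The sketch below is plain Python, not the original listing. It assumes the matches have already been collected as hypothetical `(name, x, y)` tuples holding the centre of each match (in Sikuli these could come from `findAll`), and that `origin` is the top-left corner of the search region.

```python
CELL_W, CELL_H = 71, 63  # cell size in pixels, from the screenshot measurements


def to_cell(x, y, origin):
    """Convert a screen position to (row, col) on the board."""
    origin_x, origin_y = origin
    return (y - origin_y) // CELL_H, (x - origin_x) // CELL_W


def build_grid(matches, origin, rows, cols):
    """Place each (name, x, y) match into a rows x cols grid.

    Cells where nothing was recognised stay None.
    """
    grid = [[None] * cols for _ in range(rows)]
    for name, x, y in matches:
        row, col = to_cell(x, y, origin)
        if 0 <= row < rows and 0 <= col < cols:  # ignore stray matches outside the board
            grid[row][col] = name
    return grid
```

Using the match centre rather than its corner means a match that is a few pixels off still lands in the right cell.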

On the first test of the find there were a few false matches, so I set all the candies without transparency to exact matches. After that, only the green wrapped candy was matching unwrapped candies, so I raised its match similarity a little. Otherwise, the only issue so far is speed. The find operation took about 3.5 seconds on my desktop, but 18 seconds on my laptop (the search is optimized for multi-core). Depending on the use case this could be a big problem or no problem at all. I’ll probably leave things mostly as-is, but for reference some possible speed optimizations could be:

  • Since there were no special candies on the screen at first, only 6 of the 25 searches were actually relevant, and failed searches take about as long as successful ones. All the candies had been found within 1 second on my desktop, and within 4 seconds on the laptop. This could easily be optimized by initially doing a count of candies and stopping the searches once that number has been found. Further improvements could include reordering the searches based on probability (e.g. if a certain special candy was created by the last match, move it up higher in the search order).
  • Another option is using Java’s Robot.getPixelColor(). It gets the colour of a pixel when given screen coordinates, which is faster than an image search. With some analysis of the different objects you’re matching, you can find a pixel that is a different RGB colour for each object and use that to distinguish them. This can replace an image search, or you could set the image search to fuzzier matches and use getPixelColor to tell the results apart.
  • Another option would be to reduce the size of the region we’re searching. Since I’ve already made it about as small as it can be while still including all the candies, this could only be achieved by zooming out the web browser. The problem with that is the zoom function may scale images differently in different web browsers, leading to inconsistent results. It would also require resampling the images we use for matching.
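The first optimization, stopping once every candy on the board is accounted for, could be sketched roughly as follows. This is plain Python for illustration: `search` is a hypothetical callable that returns a list of `(x, y)` hits for one image (in a Sikuli script it would wrap the actual image search), and `expected_total` is the number of cells on the board.

```python
def find_candies(patterns, search, expected_total):
    """Search for each pattern in order, stopping as soon as enough candies are found.

    patterns:       image names, most likely candidates first
    search:         hypothetical callable, name -> list of (x, y) hits
    expected_total: how many candies the board holds
    """
    found = []
    for name in patterns:
        for x, y in search(name):
            found.append((name, x, y))
        if len(found) >= expected_total:
            break  # every cell is accounted for, skip the remaining searches
    return found
```

Putting the plain candies at the front of `patterns` means the rarely seen special candies are only searched for when some cells are still unexplained.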


Step 2:

Now it’s time for the actual game playing logic. With the game elements all mapped in our script, there isn’t any limit to how sophisticated it can be. Of course we need to start by finding all possible matches. In theory you could go nuts with this: calculate all the chains of matches, work out the odds that the right candies will fall, and pick the statistically best move. That’s far too much effort for our simple demonstration. I’ll just be prioritising large matches, special candy matches, and matches lower on the playing field, as they cause more falling and thus more chance of a chain.

This one is going to be messy. Since most of this is just logic regarding playing the game which isn’t likely applicable in other situations, feel free to skip past it. In summary: It iterates through each square, checking to see if a match could be made by moving a candy onto that square. If so it checks how big the match is and if any candies are special, and adds value accordingly. The move and its value are stored in a list, which is sorted by value. The highest value, lowest position move is then chosen and made.
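For readers who want the gist without the full listing, here is a much simplified sketch of the same idea in plain Python. It is not the original script: it tries every swap of two adjacent cells instead of walking square by square, it treats each cell as a plain colour name (special candies are ignored), and the scoring formula (match size times 10, plus the row number so lower moves win ties) is an arbitrary choice for illustration.

```python
def run_length(grid, row, col, d_row, d_col):
    """Count same-coloured cells in a line through (row, col) along one axis."""
    colour = grid[row][col]
    count = 1
    for sign in (1, -1):  # walk in both directions from the cell
        r, c = row + sign * d_row, col + sign * d_col
        while 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == colour:
            count += 1
            r, c = r + sign * d_row, c + sign * d_col
    return count


def match_size(grid, row, col):
    """Length of the longest line of 3+ through (row, col), or 0 if there is none."""
    if grid[row][col] is None:
        return 0
    best = max(run_length(grid, row, col, 0, 1), run_length(grid, row, col, 1, 0))
    return best if best >= 3 else 0


def find_moves(grid):
    """Return [(score, cell_a, cell_b), ...], best move first."""
    moves = []
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            # Right and down neighbours cover every adjacent pair exactly once.
            for r2, c2 in ((r, c + 1), (r + 1, c)):
                if r2 >= rows or c2 >= cols:
                    continue
                a, b = grid[r][c], grid[r2][c2]
                if a is None or b is None or a == b:
                    continue
                swapped = [list(line) for line in grid]
                swapped[r][c], swapped[r2][c2] = b, a
                size = match_size(swapped, r, c) + match_size(swapped, r2, c2)
                if size:
                    # Bigger matches first; a higher row number means lower on the board.
                    moves.append((size * 10 + max(r, r2), (r, c), (r2, c2)))
    moves.sort(reverse=True)
    return moves
```

The first entry of the returned list is the move to make; an empty list means no match was recognised on the board.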



There’s some repetitive code there that could be reduced with generalized functions, though I didn’t think they repeated enough to be worth the effort. Plus this way allows finer control if for example you wanted to make horizontal matches higher priority than vertical.

Step 3:

In this particular case the actions we need the script to take are tiny. They were these lines at the end of the previous code segment:
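For illustration, turning a chosen move into screen coordinates for the drag could look like the sketch below. It is plain Python under a few assumptions: a move is a pair of `(row, col)` cells, `origin` is the top-left corner of the board region, and the cell size is the 71×63 pixels measured earlier. In Sikuli the two points could then be wrapped in `Location(x, y)` and handed to `dragDrop`.

```python
CELL_W, CELL_H = 71, 63  # cell size in pixels, from the screenshot measurements


def cell_centre(row, col, origin):
    """Screen coordinates of the middle of a board cell."""
    origin_x, origin_y = origin
    return (origin_x + col * CELL_W + CELL_W // 2,
            origin_y + row * CELL_H + CELL_H // 2)


def drag_points(move, origin):
    """Start and end points of the drag for a move given as ((r1, c1), (r2, c2))."""
    (r1, c1), (r2, c2) = move
    return cell_centre(r1, c1, origin), cell_centre(r2, c2, origin)


# Hypothetical example: swap the top-left candy with the one below it.
print(drag_points(((0, 0), (1, 0)), (140, 80)))
```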

The wait at the end might be improved: instead of waiting a fixed time for the matches to resolve, the script could keep checking until the game area has stopped moving.
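One way to approach that, sketched in plain Python: take a snapshot of the game area at intervals and carry on once two consecutive snapshots are identical. The `capture` argument is a hypothetical callable that returns something comparable (for example the raw pixels of the region); the interval and timeout values are guesses and would need tuning.

```python
import time


def wait_until_stable(capture, interval=0.2, timeout=10.0, sleep=time.sleep):
    """Poll until two consecutive captures are equal.

    Returns True once the area has stopped changing, or False if the
    timeout runs out first. `capture` is a hypothetical snapshot function.
    """
    deadline = time.time() + timeout
    previous = capture()
    while time.time() < deadline:
        sleep(interval)
        current = capture()
        if current == previous:
            return True   # nothing moved between the last two snapshots
        previous = current
    return False
```

The timeout matters because some boards never sit perfectly still (animated hints, sparkles), in which case an exact comparison would otherwise loop forever.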

That’s it! All done.

Here’s a demonstration video where I ran the script (though it’s a lot more impressive when you get to watch your computer play the game without you touching it).
