DAVID DUPUIS

Minesweeper Statistics

A statistical analysis of minesweeper – Placing the Mines

I was recently asked in an interview to code a Minesweeper. It’s not really possible to code the entire game in a single run but there are some key elements to coding the game that can and probably should be memorized as they have other practical applications in computer science.

In my minesweeper posts I'll share with you the important functions to coding the game but with a particular emphasis on the statistical aspect of the game.

First off, the board is a two dimensional array (a grid or matrix): board = [length,width]. There are different ways of adding the elements to the board but we’ll just stick with the basics and have one board with numbers ranging from 0 to 9. 0 represents a completely empty cell, 1 through 8 are the values of a cell next to 1 to 8 mines and 9 is a mine.

Hence our grid would look something like this (minus the color):

minesweeper board

Our first goal is to populate the board with a given number of mines.

There is only one straight forward way to do this. We go through the board once building an array or set (an array of unique values) of all of the cell positions in the board. We then randomly select a value in that array and assign a mine to it. We then remove that position from our set of unique position values and repeat until we positioned all of the mines.

Here is the code to do that:

                              
  import numpy as np
  board = np.zeros((9,9))
  def place_mines_basic(board,mines):
      '''
      '''
      length = board.shape[0]
      width = board.shape[1]
      cells = set()
      for i in range(0,length):
          for j in range(0,width):
              cells.add((i,j))

      while mines > 0:
          sample = random.sample(cells,1)
          board[sample[0][0],sample[0][1]] == 9
          cells.remove(sample[0])
          mines -= 1

      return board
                              
                            

This is nice, fast and easy but it involves going through the list once before and then randomly choosing all of the cells in which we place our mines.

How can we place the mines by going through the entire board only once? :)

There is a solution!

This is my method and I think it’s pretty darn cool. (It’s sort of similar to Reservoir Sampling)

On a normal minesweeper we know that the probability of a cell having a mine is number_of_mines / number_of_cells. So how can we place the mines randomly so that this probability applies to all of the cells?

We know how many cells there are: length x width = number_of_cells and we also know how many mines we want to place. So my method consists of: going through every cell in the grid. We assign to the current cell, a mine with the probability number_of_mines_left / number_of_cells_left and we do this until there are no more mines to place.

Let’s say we have 10 mines in a 100 cell board. We want the probability of a cell having a mine to be 10 / 100 = 0.1.

We can prove that my method works mathematically by using the hypergeometric distribution. (Which, funny enough, I didn't know about when I came up with this solution.)

(My Metis instructor Paul Burkard helped out with this when I asked him to review the first draft of my article.)

The hypergeometric distribution is the "discrete probability distribution that describes the probability of k successes in n draws, without replacement, from a finite population of size N that contains exactly K successes, wherein in each draw is either a success or a failure. (see Wikipedia page)

In our case we are drawing a cell from the board and a success is defined as placing a mine in that cell.

In our case our parameters are defined by:
N: number of cells
K: number of mines
n: number of cells passed
k: number of mines placed

The probability of cell n+1 having a mine then looks like this:

minesweeper formula

The term on the left is the hypergeometric function. The term on the right is the probability of placing a mine in a given cell where K - k = number of mines left and N - n = number of cells left given the number of mines k, and cells n we’ve seen up to this point.

To prove that my method works I can show that for every n belonging to {0,...,N} the probability of cell n + 1 having a mine is the same. Mathematically I don't have the skills to do so but I can solve it with computer science.

The function below calculates that probability for every n. For N = 100, K = 10, we see that the result is alway 0.1 and if we choose N = 256, K = 40 (parameters of an intermediate minesweeper game), the result is always 0.15625.

                             
 from operator import mul    # or mul=lambda x,y:x*y
 from fractions import Fraction

 def nCk(n,k):
     '''
     This function calculate n choose k, which is a binomial coefficient.
     n choose k = n! / (k! * n-k!)
     '''
     return int( reduce(mul, (Fraction(n-i, i+1) for i in range(k)), 1) )

 '''
 Here we loop through all the possible values of n = {0,...,N}
 and we also loop through all the possible values of k = {0,...,K} where k <= n
 hence where k = {max(0,K-(N-n)),...,min(n+1,K)}
 '''
  N = 100
  K = 10
  for n in range(N):
      prob = 0.
      for k in range(max(0,K-(N-n)),min(n+1,K)):
          hg = 1.0 * (nCk(K,k) * nCk(N-K,n-k)) / float(nCk(N,n))
          p = 1.0 * (K-k)/(N-n)
          prob += hg*p
      print(prob)

 '''
  For N = 100 and K = 10, we get:
  0.1
  0.1
  0.1
  0.1
  0.1
  0.1
  0.1
  0.1
  0.1
  0.1
  0.1
  ... (100 times)
  0.1

  For N = 256 and K = 40, we get:
  0.15625
  0.15625
  0.15625
  0.15625
  0.15625
  ...(256 times)
  0.15625

  (I advise not printing that out, instead we could just change the code to verify if two values are different.)
 '''
                             
                           

This is proof that with my method the mines are distributed uniformly on the board!
(Personally I think this is AWESOME!)

Since we know that the method works we can go ahead and code it.
The code to do this is here:

                             
 import numpy as np
 board = np.zeros((9,9))
 def weighted_choice(choices):
    '''
    This function allows you to make weighted choices, it was found on:
    http://stackoverflow.com/questions/3679694/a-weighted-version-of-random-choice
    '''
    total = sum(w for c, w in choices)
    r = random.uniform(0, total)
    upto = 0
    for c, w in choices:
        if upto + w >= r:
            return c
        upto += w
    assert False, "Shouldn't get here"

 def place_mines_probability(board,mines):
    '''
    For all the cells in the board, iterate through the board once.
    Assign a mine to a cell with a probability of (number of mines left)/(number of cells left)
    As the board gets smaller the chances of assigning a mine to a cell will increase
    '''
    cells = float(len(board)**2)
    mines = float(mines)
    length = board.shape[0]
    width = board.shape[1]
    for i in range(0, length):
        for j in range(0,width):
            prob = mines / cells
            choice = weighted_choice([(0,1-prob),(1,prob)])
            if choice == 1: #it's a mine
                board[i,j] = 9
                mines -= 1
                if mines <= 0: #no point going on if there are no more mines left
                    return board
            cells -= 1
    return board
                             
                           

All that remains now to do is verify which function is fastest.

I'll spare you the code but here are the results.
To evaluate our function I use a square board of size n * n where n will vary from 0 to 100 with a step of 3. (0, 3, 6, 9 ... )
I choose [ number of mines = 0.15 * number of cells] because that is about the proportion of mines to cells in the normal games. Otherwise I would have to generate a 3-dimensional graph, and I'm not going to do that.

For our place_mines_basic function we get:

minesweeper board

For our place_mines_probability function we get:

minesweeper board

And if we compare both of them together we get:

minesweeper board

The difference is obvious, placing the mines with the probability function is considerably faster as we increase the size of the board and the number of mines.
The probability function is a SUCCESS!

The iPython Notebook will soon be made available.

-David Dupuis