Retailer Locality Simulation

Problem Statement & Motivation

In this first problem exploration, I will be using this video by TED-Ed as reference:

This video introduces the concept of Hotelling’s law; a game theory law which explains why two retailers selling the same product tend to set up shop close to each other.

I was motivated to make this simulation when I recently came across 2 cinemas situated side-by-side, and I decided to look up cinema locations within Singapore where I live. Lo and behold:

As you can see, cinemas tend to be isolated or clustered into pairs.

I decided to make a simple simulation of shops within a 2D space to see if I could reproduce this result. Python was used as the programming language.

Python notebook link (Google Colab)

The general idea was to randomly generate shop locations first, then have each shop shift its position little by little towards a more advantageous one. The metric used to determine the feasibility of a position was only the shop’s area of influence (as in the video), and other factors such as non-uniform population density and spatial obstacles were not considered for simplicity. The problem was restricted to a square domain of [0,1]×[0,1], wherein shops were not allowed to leave this domain and areas of influence were not allowed to exceed its boundaries.

Finding the Areas of Influence

Voronoi Diagrams

It was a simple task to randomly initialise points in a 2D domain, but it was immediately after that I came to realise my first problem: how do I calculate a shop’s area of influence? To be clear, a shop’s area of influence is defined as all points in the domain whereby the nearest shop is the one in question. Thankfully, there is a perfect solution for this problem: Voronoi diagrams.

A Voronoi diagram would partition the domain into regions, where each region corresponds to a shop’s area of influence. The scipy library provides a function to obtain the Voronoi diagram from a list of points. While I’m not familiar with the details of how it achieves this, it does so in O(n\log{n}) time for 2D points. Just when I thought I was done with this task, I plotted the diagram and noticed a glaring problem. Can you figure out what it is? (Hint: the lines describe the edges of the regions and the orange dots describe the vertices of the regions.)

Voronoi diagram of randomised 2D points (actual size)

Now, it might not be immediately obvious from this image alone, but there is indeed a big problem: the diagram does not know what the problem domain is, and hence does not terminate at the boundary. This becomes evident when the diagram is zoomed out:

Voronoi diagram of randomised 2D points (zoomed out)

Reflection Augmentation

Unfortunately, scipy’s Voronoi class does not allow me to provide the domain boundary, which meant that I had to consult Google on how to implement bounded Voronoi diagrams. Not to anyone’s surprise, I found an answer on Stack Overflow. The trick was that, since the domain was rectangular, I could reflect the points off the left, right, top and bottom boundaries, and add those reflected points to my list of points. In doing so, all points outside the domain would be closest to a reflected point than one within the domain. Check it out:

Voronoi diagram of randomised 2D points (augmented. zoomed out)

Notice here the edges forming a square where the boundary is supposed to be? This means that augmenting the list of points with the reflected points naturally leads to edges in the diagram inscribing the domain! This neat trick is so remarkable that it’s almost magical. Despite the additional computation cost incurred by increasing the number of points by 5 times, it certainly was a saner (and probably more efficient) option than manually calculating the edge-boundary intersections. I then only needed to select the regions within the domain.

Voronoi diagram of randomised 2D points (augmented, actual size)

There was only one step left to obtain the areas of influence, which was to calculate the area of a region given the coordinates of its vertices.

Convex Polygon Area

There is a well-known formula to obtain the area of a convex polygon given its coordinates in clockwise/anticlockwise order:

A=\dfrac{1}{2}\begin{vmatrix}x_1 & y_1\\ x_2 & y_2\\ \vdots & \vdots\\ x_n & y_n\\ x_1 & y_1\end{vmatrix}\quad

Using this formula, I calculated the area of each region and annotated each point with its relative area of influence, defined as the proportion of the current point’s area of influence compared to the average area of influence:

Voronoi diagram with relative areas

With this, the first obstacle is overcome and therewith the next: how do I know which direction and how far to shift the shops to reach a more advantageous position?

Maximising the Areas of Influence

Gradient Ascent

Warning: Lots of math ahead! As such, I will try to provide an explanation that does not rely on the formulae/equations involved.

In order to simulate a shop’s movement to a better location, I assumed every shop to behave according to the following rationale: At every time step, a shop will look around its current vicinity to determine the direction which increases its area of influence the most, and make a small movement towards that location. This was a simplified and unrealistic assumption since actual shops do not make smooth motions towards better locations, but nonetheless a good baseline to begin from.

Although there might be better algorithms to maximise each shop’s area of influence, I decided to stick to the simple algorithm of gradient ascent. The idea is that the gradient of the area of influence at some location points towards the direction of greatest increase. The gradient of a function in the Cartesian basis is simply:

\nabla f(x,y)=\begin{pmatrix}\dfrac{\partial f}{\partial x}(x,y)\\\\\dfrac{\partial f}{\partial y}(x,y)\end{pmatrix}

By making incremental steps in that direction with magnitude proportional to the gradient, every point will eventually reach a location with locally maximum area of influence:

\begin{pmatrix}x\\y\end{pmatrix}_{n+1}=\begin{pmatrix}x\\y\end{pmatrix}_{n}+\alpha\nabla f(x_n,y_n)

Of course, this brought to mind the issue of how the gradient could be computed numerically. There are 2 main methods used for numerical differentiation:

  • Automatic differentiation
  • Finite differencing

Note: Gradient ascent requires that the function be differentiable at every point in the domain, which does not hold if 2 off-centre points are extremely close to each other, as one point moving past the other would swap their areas of influence and thus cause a stepwise change. This will however give each pair of points the tendency to centralise itself.

Automatic Differentiation

Automatic differentiation is a powerful method that gives an exact value of the first derivative. The magic lies in introducing a special number \epsilon, with the definition \epsilon\ne0,\epsilon^2=0. Of course, \epsilon itself is not a real/complex number since no such number would fulfil that definition. It is very similar to the complex unit i, defined as i\ne0,i^2=-1.

This technique can be easily understood by performing a Taylor series expansion on the function. Since all powers of \epsilon greater than 2 are defined to be 0, they do not appear in the expansion:

f(x+\epsilon,y)=f(x,y)+\dfrac{\partial f}{\partial x}(x,y)\epsilon

f(x,y+\epsilon)=f(x,y)+\dfrac{\partial f}{\partial y}(x,y)\epsilon

The algorithmic implementation of this technique involves the use of dual numbers (a pair of numbers), typically defined as the function’s value and first derivative at a particular point. This dual number can be interpreted as the coefficient vector of \epsilon. Numerical operations (addition, multiplication, etc.) have to be overridden to accommodate these dual numbers, which is difficult to implement with reflection augmentation. As such, I resorted to finite differencing, an approximate method.

Finite Differencing

Finite differencing is a straightforward technique which approximates the infinitesimals, dx and dy, with small but finite values, \Delta x and \Delta y. However, this also means that the derivative obtained through this technique is not exact. The accuracy can be improved by incorporating more points near where the derivative is to be calculated.

I used the central difference, defined as:

\dfrac{\partial f}{\partial x}(x,y)=\dfrac{f(x+\Delta x,y)-f(x-\Delta x,y)}{2\Delta x}+O((\Delta x)^2)

\dfrac{\partial f}{\partial y}(x,y)=\dfrac{f(x,y+\Delta y)-f(x,y-\Delta y)}{2\Delta y}+O((\Delta y)^2)

This basically meant that for every point, I had to bump it left, right, down and up a slight amount, and recalculate its area of influence on each bumped position. With those, I could calculate the partial derivatives and thereby the gradient.

Now that I had the means to calculate the gradient, I could finally begin running the simulation! One issue though, was that convergence could not be attained due to discontinuities in the gradient. Hence, I only ran the simulation for a fixed number of time steps.

Process Simulation

Pseudocode

The simulation process could be summarised into the following steps as discussed:

  1. Randomly generate n points in the 2D space.
  2. Define finite steps \Delta x and \Delta y, as well as movement step size \alpha.
  3. Run simulation for k time steps. At each time step:
    1. Iterate across all points. At each point:
      1. Shift point left and right by \Delta x, and up and down by \Delta y, calculating the new area of influence at each position.
      2. Calculate the gradient of the area of influence using the central difference method.
      3. Scale the gradient by \alpha and add the result to the point’s original position.
      4. If the point exits the domain, shift it back to its original position.

For the programming-inclined, this has a time complexity of O(kn^2\log{n}).

Results

For these examples, I set \Delta x=\Delta y=0.001, \alpha=0.01, and ran the simulations for 1000 time steps.

Two Points

Due to the heavy dependence of the time complexity on the number of points, I decided to start with the simple case of 2 points, just like in the TED-Ed video:

2-point simulation (initial state)

After running the simulation, they converged to the following result:


2-point simulation (converged state)

The result agrees with Hotelling’s law, which is great.

Three Points

I then ran the simulation again with an additional point, and obtained the following result:

3-point simulation (initial state)
3-point simulation (converged state)

This was a bemusing result, as it seems like the point on the left could improve its own position by shifting towards the right. Perhaps there was an error with my program, or just that it was already a local minimum for that point. Either way, I might revisit this in the future.

Many Points

Of course, it seemed necessary that I should run this for many points, which I did. Here are the results for 20 points:

20-point simulation (initial state)
20-point simulation (converged state)

This result shows that points have a tendency to remain isolated or pair up with another point. Running the simulation several times with different initial conditions led to the same phenomenon. Isn’t it satisfying how similar this is to the distribution of cinemas in Singapore?

Conclusion & Further Ideas

This simulation generated rather interesting results, which provide an idea of what would happen when the concept behind Hotelling’s law is applied to many points. As mentioned, this was a very simplified simulation, yet it still took a considerable amount of time to run (about 2 minutes for 20 points). The simulation process might not have accurately modelled a realistic situation, but the results were still close.

In the future, I might consider making the following improvements:

  • Make shops relocate discontinuously instead of smoothly.
  • Model the business folding process whereby a shop would close down if its influence remains too low for an extended amount of time.

Further suggestions are welcome!

Leave a comment

Design a site like this with WordPress.com
Get started