A brand new solution to an old mathematical problem. This time we found it by throwing in some negative cash, the debt.
Negative money is a powerful reality, think about a debt. Turns out, if we throw some negative cash into an old mathematical problem, we can find a brand new solution!
The old problem
I will tell you about a problem that is directly related to a famous mathematical model with tons of applications – from logistics to data science. The famous model is called Markov chain after a Russian mathematician Andrei Andreevich Markov (1856–1922). The story goes that Markov had strong opinions about many things, obviously including mathematics, and invented the famous chain to prove his rival wrong (Basharin, G. P., Langville, A. N., & Naumov, V. A. (2004). The life and work of AA Markov. Linear algebra and its applications, 386, 3-26.). Regardless the motivation, Markov’s great mind apprehended the significance of this, in his own words, `general construction’. And indeed, the model is enjoying a bright future!
I have been working on Markov chains since I was a bachelor student. Most of my results are too specialized to deserve your attention. However, I want to tell you about my most recent results, because even now, as I am writing these words, I feel thrilled and surprised about it.
The title of this section already gave away the first reason to be surprised: we are talking about a very old problem, where I could never hope to find anything new.
Let me now explain in some detail what the problem is about.
Consider a mini-social network of three people: Anna, Boris and Cecile. Every day each participant wakes up and distributes all the money they have. For example, in the picture below, Anna gives 80% of her money to Boris and 20% to Cecile, Boris divides his money between Anna and Cecile 50-50, and Cecile gives all her money to Anna. The next day, they do the exact same thing with the money they received the day before.
Figure 1. Small social network. Drawing: Natalia Litvak.
So, after many days, how much money will Anna have?
Turns out, that in the long run, the fraction of cash of each participant will stabilize. Anna wakes up, distributes her cash, and then receives the same amount of cash back from the other participants. You can see this in the plot below. On the y-axis is the fraction of cash of each participant. On the x-axis is the number of transactions they made altogether, 5 transactions per day.
Figure 2. Fractions of cash of each participant stabilize after many transactions. On the x-axis is the number of transactions (5 per day), on the y-axis the fraction of cash.
As you see, after 30-40 transactions (6-8 days) the fractions of cash do not change anymore. Anna ends up with 5/12, Boris with 4/12, and Cecile with 3/12 of the entire capital.
I will now introduce one mathematical term. The fraction of cash that Anna will have after many days, is called Anna’s stationary probability.
Why probability? Think of just one coin, passing from one person to another. Each coin is no different from any other. Then the probability that a coin lands in Anna’s hands is the same as the fraction of coins that Anna has.
The Markov chain here is exactly the mathematical model that describes the movement of a coin from one participant to another: if a coin is now with Anna, then the next day it will go to Boris with probability 0.8 and to Cecile with probability 0.2, and so on.
Markov himself used his model to compute the fraction of vowels and consonants in the `Eugenyi Onegin’ poem by Pushkin, the great Russian poet. A Markov chain describes how the text proceeds from one letter to another, and a vowel is followed by a consonant with a given probability. Up to date, language modelling remains one of the applications. But there are so many more! In logistics, Markov chains describe goods and people moving from one service point to another. Then stationary probabilities reveal bottlenecks and help to improve the system. In biology, Markov chains model the genes going through successive mutations, and stationary probabilities will tell us, for example, which gene sequences will prevail after a long time.
My favourite Markov chain is of course Google PageRank. When we search for information on the Web, the search engine has to decide which pages should come out on top of the list. This is very important because we rarely go beyond just the first page of Google search results. One of the core ideas that has put Google ahead of the competitors was the algorithm called PageRank. The algorithm is based on a mathematical model that is nothing else but a Markov chain! This Markov chain describes a stylized surfer that hops from one Web page to another, and the stationary probability of a page is the fraction of time that the surfer spends on it. Google stood out by best quality first page, thanks to a Markov chain. (See my article on Google PageRank here).
Of course, with so many applications, it is clearly very important to compute stationary probabilities as fast as possible, in very large systems such as the World Wide Web. This is a very old problem that has many solutions developed through years, some are classical, and some, motivated by PageRank, are very recent.
Now you can imagine how excited I was when my co-authors Konstantin Avrachenkov and Patrick Brown and I realized that maybe we found a new, faster, solution!
The new solution
Another surprise for me was that our new solutions was inspired by a concept familiar to all of us: the negative cash, the debt.
Let me explain what changes when we introduce the negative cash.
Now we start with all participants having cash zero. Then all of them borrow some amount from the bank, and pass it further. Assume Anna borrowed 1 euro from the bank, she passed 0.8 euro to Boris and 0.2 euro to Cecile. After that, she has -1 euro. However, Boris and Cecile also borrow 1 euro and pass it further. Then Anna receives 0.5 euro from Boris and 1 euro from Cecile, and her cash becomes -1+1.5 = 0.5 euro.
In each of the following days, some of the participants pass further their cash, positive or negative, until all of them have no cash left. Then we look at the total amount of cash that has been passed along. Turns out that the fraction of cash passed on by Anna will be exactly Anna’s stationary probability! And most importantly, we get to the result faster, because computation time is roughly proportional to the number of transactions, and we can get the answer with much fewer transactions than before!
In the table below I have worked out one example, where Anna distributes her cash on day 2, Boris on day 3, and Cecile on day 4. Green circles shows whose turn it is to transfer the cash.
I have also included a video of how I filled out the table, you can find it here. Hopefully this will help you to carry out all the steps if you want.
You may skip the table but please look at three things.
First, the computed stationary probabilities are exactly correct!
Second, and most importantly, look at the number 10 in the right lower corner, circled in red. We need only 10 transactions to get the right answer! Recall Figure 2, when the participants passed only positive cash: we needed at least 40 transactions to get to the right answer, and after 10 transactions we were not even close.
Third, the order of transactions is important! For example, if on Day 4 not Cecile but Anna transferred her cash then we would need a couple of more steps to reach the solution. This is very exciting because then we can speed up the computations by choosing the right order. It means that our algorithm is very flexible, and we are yet to discover its true potential.
Table 1. The new algorithm.
I must add that it is not always possible to drain all the cash from all participants. Still, even in very large networks, we get excellent approximations with only a small amount of computations.
Why it works
Why the negative cash made such a difference?
Mainly because now not all, but only some of the participants need to pass the cash on each day. So, if we choose the order of transactions well, we can drain the cash from the system with only a small number of transactions. Our solution is fast because we use each computer operation wisely.
With only positive cash this was not possible because, if we let participants transfer cash by turn, the cash might not stabilize, or stabilize at wrong fractions. When we use positive and negative cash, then they cancel each other, so the end result is always the same – nobody has any cash left. And we can prove mathematically that at that point we will always end up with the right answer.
To be honest, it is not completely clear yet in which order the participants have to transfer the cash to arrive to the answer as fast as possible. Even for small networks it is already a difficult problem. And this means that the old problem about Markov chains has let out a young sprout of a new interesting mathematics!
Source: Avrachenkov, K., Brown, P., & Litvak, N. (2020). Red Light Green Light Method for Solving Large Markov Chains. arXiv preprint arXiv:2008.02710.