Shadow Strike
Nov 4, 2019

Indian Traffic: a prisoner’s dilemma

Indian traffic, if not the worst in the world, is surely among the worst few. It is bad not only because of road conditions (which are now improving) but, more importantly, because commuters ignore the traffic rules laid down by the authorities. In India, green, yellow and red traffic lights do not translate to Go, Caution and Stop but to Go, Accelerate and Go, and Check for a camera or police and, if there is none, Go.

Photo by Atharva Tulsi on Unsplash

According to a survey done by WHO in 2015, enforcement of road safety laws and traffic rules in India is rated less than 4 out of 10. One can argue that poor adherence to traffic rules is the result of poor enforcement by the authorities. At the same time, I wonder why this happens, since following traffic rules benefits every commuter regardless of whether the authorities enforce the rules or merely define them. For example, when two cars approach each other from opposite directions at night, both drivers are better off keeping their headlights on low beam; in reality, drivers keep their headlights on high beam even though it temporarily blinds both.

This habit of breaking rules even when following them would serve everyone better reminded me of the prisoner’s dilemma. So I thought, why not analyse the condition of Indian traffic using game theory? I am not an expert in game theory, but I have tried my best to come up with game theory models for the various traffic scenarios where people choose not to follow the traffic rules.

The prisoner’s dilemma is a term made famous by the movie A Beautiful Mind, based on the life of John Nash, who made major contributions to game theory and proposed the Nash equilibrium. The dilemma arises when two players (prisoners, in this case) each pick the strategy that maximizes their individual gain, but when both play that strategy the outcome is not the best-case scenario (maximum utility) for the two combined.

To elaborate, let’s say the police capture two accomplices in a murder but don’t have any substantial proof against them. The police are relying on either or both accomplices to testify to the crime. So they come up with a plan: they move the suspects into two separate rooms with no means of communication between them. They tell each suspect the following: if you testify and the other remains silent, you will go free but the other person will be given a life sentence (14 years); if both of you remain silent, each of you will be given a 1-year sentence; and if both of you testify, each of you will be sentenced to 7 years in jail. They also mention that the same conditions have been given to the other suspect and that each has an hour to decide. Now, it is obvious that the best outcome for the two together is to remain silent and not testify (and spend a year in jail each). Each suspect, however, would notice that testifying gives a better return irrespective of the choice made by the other suspect: if the other suspect remains silent, testifying means no jail time (instead of a year in jail for remaining silent), while if the other suspect testifies, testifying means a 7-year sentence (better than the 14-year sentence for remaining silent). Therefore, it makes a lot of sense for each suspect to testify, which gives rise to the dilemma.

Mathematically, the above dilemma can be represented as a table in which the possible choices, silent and testify, become the rows for suspect one and the columns for suspect two. The pay-out (the sentence, in this case) for a scenario is given in the cell picked out by the choices of both suspects. (The numbers are negative because they count the years during which the suspects will not be able to live freely.)

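Payout matrix for the suspects (suspect 1’s pay-out, suspect 2’s pay-out):

                      Suspect 2: silent    Suspect 2: testify
Suspect 1: silent     (-1, -1)             (-14, 0)
Suspect 1: testify    (0, -14)             (-7, -7)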

It becomes clearer from this representation that, for suspect 1 (or suspect 2), testifying is better than remaining silent (0 years in jail is better than 1 year, and 7 years is better than 14).
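The dominance argument above can also be checked mechanically. The following is a minimal sketch, not part of the original analysis: the names `PAYOFFS`, `CHOICES` and `dominant_choice` are made up for illustration, and the numbers are the sentences from the example.

```python
# Pay-outs (years of freedom lost, hence negative) from the example above.
# Keys are (suspect 1's choice, suspect 2's choice);
# values are (suspect 1's pay-out, suspect 2's pay-out).
PAYOFFS = {
    ("silent", "silent"): (-1, -1),
    ("silent", "testify"): (-14, 0),
    ("testify", "silent"): (0, -14),
    ("testify", "testify"): (-7, -7),
}
CHOICES = ("silent", "testify")


def dominant_choice(payoffs, choices):
    """Return suspect 1's strictly dominant choice, or None if there is none."""
    for mine in choices:
        alternatives = [c for c in choices if c != mine]
        # 'mine' is dominant if it beats every alternative
        # against every possible choice of the other suspect.
        if all(
            payoffs[(mine, theirs)][0] > payoffs[(alt, theirs)][0]
            for alt in alternatives
            for theirs in choices
        ):
            return mine
    return None


print(dominant_choice(PAYOFFS, CHOICES))  # testify
```

Because the game is symmetric, the same result holds for suspect 2, which is why both end up testifying.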

Now, if a similar matrix is created where “silent” is replaced by “obey rules” and “testify” by “disobey rules”, we get the following (since we don’t know the reward or punishment figures, I have kept the cells blank for now).

We also know that disobeying the rules is the dominant strategy for both drivers in Indian traffic conditions, while both obeying the rules is the best-case scenario for the two combined. To understand why this happens, let’s start filling in the blanks in the above matrix. For now, we will use unknown variables: R as the reward when both obey the traffic rules, P as the punishment when both disobey, and O and D as the pay-outs to the obeying and the disobeying driver respectively when one obeys and the other disobeys.

For disobeying to be the dominant strategy for both drivers while both obeying remains the better joint outcome, these variables (R, D, O and P) should be related as D > R, P > O and R > P, or D > R > P > O. We can quickly check this against the prisoner’s dilemma example, where no time in jail is better than a 1-year sentence, which is better than a 7-year sentence, which is better than a 14-year sentence, or simply 0 > -1 > -7 > -14.

With the above condition in mind, let’s try to fill the matrix with the time saved by the drivers during their commute. We can take the commuting time when both obey the rules as the reference, which makes R = 0 minutes (there is no gain or loss relative to the reference). Our inequality now becomes D > 0 > P > O, which says that a driver gains time only by disobeying while the other driver obeys. It also says that a driver who obeys while the other disobeys loses more time than when both disobey (which is why both drivers end up choosing not to obey the rules).

For further discussion, let’s assume D as +5 minutes, P as -2 minutes and O as -5 minutes.

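Pay-out matrix in minutes (driver 1’s pay-out, driver 2’s pay-out); positive means gaining time, negative means losing time:

                      Driver 2: obey    Driver 2: disobey
Driver 1: obey        (0, 0)            (-5, +5)
Driver 1: disobey     (+5, -5)          (-2, -2)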
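As a rough sketch, the matrix can be built from the four variables and checked against the ordering D > R > P > O. The function names `traffic_matrix` and `is_prisoners_dilemma` are illustrative assumptions; the values are the ones assumed above.

```python
def traffic_matrix(R, D, P, O):
    """Pay-outs as (driver 1, driver 2) for each pair of choices."""
    return {
        ("obey", "obey"): (R, R),
        ("obey", "disobey"): (O, D),
        ("disobey", "obey"): (D, O),
        ("disobey", "disobey"): (P, P),
    }


def is_prisoners_dilemma(R, D, P, O):
    """True when the pay-outs satisfy the ordering D > R > P > O."""
    return D > R > P > O


# Time gained or lost in minutes, as assumed in the text.
matrix = traffic_matrix(R=0, D=5, P=-2, O=-5)
for choices, payoffs in matrix.items():
    print(choices, payoffs)

print(is_prisoners_dilemma(R=0, D=5, P=-2, O=-5))    # True
print(is_prisoners_dilemma(R=-1, D=0, P=-7, O=-14))  # True (jail example)
```

The second check reuses the jail sentences, which shows that the traffic numbers have the same structure as the original dilemma.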

How can we now use this to find ways to make drivers obey traffic rules? We can change the pay-outs in the matrix so that obeying becomes the dominant strategy or, at the very least, disobeying no longer remains the dominant strategy.

Before setting up the conditions, let’s play with the assumed pay-out matrix. For example, suppose we somehow ensure that a driver who obeys never loses time (O = 0). There would then be a 2 in 7 chance that a driver follows the rules (this chance comes from a mixed-strategy Nash equilibrium; you can read more about it here).

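Drivers don’t lose time even if the other driver disobeys (O = 0):

                      Driver 2: obey    Driver 2: disobey
Driver 1: obey        (0, 0)            (0, +5)
Driver 1: disobey     (+5, 0)           (-2, -2)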

If, instead of ensuring that the obeying driver doesn’t lose time, we demonstrate that a driver gains no time by disobeying while the other driver obeys (D = 0), the drivers would obey the rules almost all the time (the derivation again uses mixed strategies and the Nash equilibrium).

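A driver doesn’t gain time by disobeying (D = 0):

                      Driver 2: obey    Driver 2: disobey
Driver 1: obey        (0, 0)            (-5, 0)
Driver 1: disobey     (0, -5)           (-2, -2)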

A similar experiment (I am not sure whether it was based on such an analysis) was done in Pune, where Deputy Commissioner of Police (Traffic) Tejaswi Satpute demonstrated that there is hardly any gain in commuting time from not following traffic rules. By conducting more such experiments and by educating drivers that they don’t gain much by breaking the rules, we might change traffic behaviour in the country.

Finding the conditions that will lead drivers to obey traffic rules requires a deeper understanding of what a mixed strategy is. For now, let us not go into the background mathematics, but it can be derived that the probability of a driver obeying is the following.

Probability of driver to obey, q = (P - O) / (R - D + P - O)
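For readers who want the missing step, here is a short sketch of where this expression comes from. It assumes the symmetric two-driver game described above and the standard indifference condition of a mixed-strategy equilibrium: q is the probability that the other driver obeys, and a driver mixes only if obeying and disobeying give the same expected pay-out.

```latex
% q = probability that the other driver obeys (symmetric game assumed)
\begin{align*}
E[\text{obey}]    &= qR + (1-q)\,O \\
E[\text{disobey}] &= qD + (1-q)\,P \\
qR + (1-q)\,O     &= qD + (1-q)\,P \\
q\,(R - D)        &= (1-q)\,(P - O) \\
q                 &= \frac{P - O}{R - D + P - O}
\end{align*}
```

The last line matches the expression above, and it is only meaningful as a probability when the result lies between 0 and 1.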

We can quickly confirm this with the last example: when we set R = D (no gain in time from disobeying), we get q = 1, i.e. a 100% probability of the driver obeying.
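The two modified matrices discussed earlier can be run through the same expression. This is a small sketch using exact fractions; `obey_probability` is a hypothetical helper name, and the inputs are the pay-outs assumed in the text.

```python
from fractions import Fraction


def obey_probability(R, D, P, O):
    """Mixed-strategy probability q = (P - O) / (R - D + P - O)."""
    denominator = R - D + P - O
    if denominator == 0:
        raise ValueError("q is undefined when R - D + P - O is zero")
    return Fraction(P - O, denominator)


# Obeying never loses time: O raised from -5 to 0.
print(obey_probability(R=0, D=5, P=-2, O=0))   # 2/7

# Disobeying never gains time: D lowered from +5 to 0.
print(obey_probability(R=0, D=0, P=-2, O=-5))  # 1
```

For the original pay-outs (R = 0, D = 5, P = -2, O = -5) the same expression gives -3/2, which is not a valid probability. That is consistent with disobeying being strictly dominant there, so no mixed strategy arises.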

Now, since we want the driver to obey rules more often than disobeying the probability of obeying should be between ½ and 1 (since probability can’t be greater than 1). Therefore,

½ < q ≤ 1

Or, by substituting the value of q and simplifying (assuming the denominator R - D + P - O is positive), we get,

(R - D) < (P - O) ≤ 2(R - D) + (P - O)

This can be further simplified into two inequalities (which differ from the conditions that produced the prisoner’s dilemma),

R ≥ D & (P + D) > (R + O)

Which means, in simpler terms:

1. The reward when both drivers obey traffic rules should be higher than or equal to what a driver can gain by disobeying while the other driver follows the rules.

2. (Counterintuitively) the sum of the possible pay-outs from disobeying (P + D) should be higher than the sum of the possible pay-outs from obeying the rules (R + O).

Using these conditions, we can intervene in the desired way to change the behaviour of the drivers. We have already seen that when we set R = D (and P > O) the behaviour changes in the right direction.
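As a sanity check on the two conditions, the sketch below compares them against the range ½ < q ≤ 1 over a small grid of integer pay-outs. It is only a brute-force illustration: the grid bounds and function names are arbitrary choices, and cases where R - D + P - O is not positive are skipped, which is an assumption of this sketch.

```python
from fractions import Fraction
from itertools import product


def obey_probability(R, D, P, O):
    """Mixed-strategy probability q = (P - O) / (R - D + P - O)."""
    return Fraction(P - O, R - D + P - O)


def conditions_hold(R, D, P, O):
    """The two simplified conditions: R >= D and P + D > R + O."""
    return R >= D and (P + D) > (R + O)


values = range(-3, 4)  # arbitrary small grid of pay-outs
mismatches = 0
for R, D, P, O in product(values, repeat=4):
    if R - D + P - O <= 0:
        continue  # assumption: only the positive-denominator case is checked
    in_range = Fraction(1, 2) < obey_probability(R, D, P, O) <= 1
    if in_range != conditions_hold(R, D, P, O):
        mismatches += 1

print(mismatches)  # 0
```

On this grid the two conditions and the target range for q agree in every checked case. Pay-outs with a negative denominator are left out and would need a separate look.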

Here we created a very simple game theory model to understand traffic behaviour and how we can play with the pay-outs to make drivers follow traffic rules. We can create more complex and closer-to-real-life models of commuter behaviour, but they involve more mathematics, which makes them difficult to explain here (I might write a separate, more mathematical post for that). However, I urge the authorities to use such models so that they can change traffic behaviour for the better.