The Game Theory of Dharma

gametheory · dharma · ai · Apr 15, 2026 · 8 min read

There is a scene in the Mahābhārata that every undergraduate economics student should be forced to read before they touch a payoff matrix.

Arjuna stands between two armies at Kurukshetra. His chariot idles. His bow, Gāṇḍīva, slips from his fingers. And in that moment — one of the most canonically human moments in all of recorded literature — he defects.

Not from cowardice. From overwhelming strategic clarity.

He has run the calculation. He can see the expected value of every branch. The uncles, the teachers, the beloved cousins arrayed against him. He knows the equilibrium. He knows what winning costs. And knowing it, he refuses to play.

This is not irrationality. This is a rational agent confronting the poverty of rational-choice theory.

The Nash Trap

John Nash proved something beautiful and deeply unsettling: every finite non-cooperative game has at least one equilibrium, possibly in mixed strategies, where no player can improve their outcome by unilaterally changing strategy. Mutual defection, in the prisoner’s dilemma, is such an equilibrium. Both players defect. Both get three years, even though mutual silence would have cost each of them less. Neither can do better given what the other is doing.
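A minimal sketch of that equilibrium check, in Python. The sentence lengths are illustrative assumptions: only the three years for mutual defection comes from the text above, and the other numbers are conventional textbook values. The helper names `is_nash` and `pure_nash_equilibria` are likewise hypothetical.

```python
from itertools import product

# Years in prison for (row player, column player); lower is better.
# "C" = stay silent (cooperate), "D" = confess (defect).
# Assumed values, except the (3, 3) for mutual defection.
YEARS = {
    ("C", "C"): (1, 1),
    ("C", "D"): (5, 0),
    ("D", "C"): (0, 5),
    ("D", "D"): (3, 3),
}
ACTIONS = ("C", "D")


def is_nash(profile):
    """True if neither player can shorten their own sentence by deviating alone."""
    row, col = profile
    row_years, col_years = YEARS[profile]
    # Hold the column player fixed and try every alternative for the row player.
    row_ok = all(YEARS[(alt, col)][0] >= row_years for alt in ACTIONS)
    # Hold the row player fixed and try every alternative for the column player.
    col_ok = all(YEARS[(row, alt)][1] >= col_years for alt in ACTIONS)
    return row_ok and col_ok


def pure_nash_equilibria():
    """Return every pure-strategy profile that survives the unilateral-deviation test."""
    return [p for p in product(ACTIONS, repeat=2) if is_nash(p)]


if __name__ == "__main__":
    print(pure_nash_equilibria())  # [('D', 'D')]
```

Under these assumed numbers the only surviving profile is mutual defection, although mutual silence would leave both players with one year instead of three.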

The tragedy is structural. The system is stable. And it’s terrible.

Arjuna’s dilemma is not a prisoner’s dilemma. It’s worse. He isn’t uncertain about what others will do. He knows. The Kauravas will not back down. Duryodhana will not negotiate. Fighting is a dominant strategy for everyone at the table. Fight, and dharma (in the political sense) is restored. Don’t fight, and dharma is lost anyway, just more slowly, more shamefully.

Nash says: fight.

The Gita says: wait.

Svadharma as Strategy Space Expansion

Krishna’s answer to Arjuna is not a counter-argument to game theory. It is a critique of its premises.

Rational-choice theory begins by defining agents, strategies, and payoffs. It assumes the agent’s utility function is given, fixed, and primarily indexed to outcomes in the world.

Krishna’s first move is to challenge the payoff structure entirely.

“You grieve for those who should not be grieved for; yet you speak words of wisdom. Wise men do not grieve for the dead or for the living.” — BG 2.11

This is not mysticism. This is a claim about what variables belong in the utility function. Krishna is arguing that Arjuna has miscounted — that he has included the death of bodies in his loss calculation while excluding the continuity of the ātman, the degradation of his own dharmic integrity, and the second-order effects of a warrior-class abandoning its social function.

Svadharma — one’s own dharma, one’s specific role-obligation — appears in this light as something precise: an injunction to act from the strategy space appropriate to your type, not the strategy space that maximizes apparent short-term payoff.

A Kṣatriya who refuses to fight is not defecting from a bad Nash equilibrium. He is defecting from his own type. He is playing a game he was never constituted to play.

The Repeated Game Problem

Modern game theory has a partial answer to the prisoner’s dilemma: repetition. In infinitely repeated games, cooperation can emerge as an equilibrium, provided the players value future payoffs highly enough. The shadow of the future disciplines present defection.
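How long that shadow has to be can be made concrete. The sketch below assumes a grim-trigger strategy (cooperate until the other player defects, then defect forever) and the conventional payoffs T = 5 (temptation), R = 3 (reward), P = 1 (punishment). Neither the strategy nor the numbers come from the essay. Cooperating forever is worth R / (1 - delta), while a one-time defection is worth T + delta * P / (1 - delta), so cooperation holds when delta >= (T - R) / (T - P).

```python
from fractions import Fraction


def critical_discount(T, R, P):
    """Smallest discount factor at which grim trigger sustains cooperation.

    Derived from R / (1 - d) >= T + d * P / (1 - d), i.e. d >= (T - R) / (T - P).
    Assumes the standard prisoner's dilemma ordering T > R > P.
    """
    if not T > R > P:
        raise ValueError("expected T > R > P")
    return Fraction(T - R, T - P)


def cooperation_sustainable(delta, T=5, R=3, P=1):
    """True if a player who discounts the future by `delta` prefers to keep cooperating."""
    return delta >= critical_discount(T, R, P)


if __name__ == "__main__":
    print(critical_discount(5, 3, 1))    # 1/2
    print(cooperation_sustainable(0.9))  # True
    print(cooperation_sustainable(0.3))  # False
```

With these assumed payoffs, a player must weight the next round at least half as heavily as the current one for cooperation to be self-enforcing.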

Robert Axelrod’s tournaments showed that Tit-for-Tat (cooperate first, then mirror the other player’s last move) earned the highest total score across the field, even though it never outscores the opponent it is paired with. Reputation, reciprocity, and the possibility of future interaction create conditions where the individually rational choice aligns with the socially optimal one.
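The "cooperate first, then mirror" rule is short enough to write out. The sketch below is not a reconstruction of Axelrod’s tournament. It pits Tit-for-Tat against two hypothetical opponents over an assumed 200 rounds, using the conventional payoffs (3 each for mutual cooperation, 1 each for mutual defection, 5 and 0 when one side defects alone).

```python
# Points for (player A, player B); higher is better. Assumed conventional values.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(own_history, other_history):
    """Cooperate on the first move, then copy the other player's previous move."""
    return "C" if not other_history else other_history[-1]


def always_defect(own_history, other_history):
    return "D"


def always_cooperate(own_history, other_history):
    return "C"


def play(strategy_a, strategy_b, rounds=200):
    """Run one repeated match and return the total scores (a, b)."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        points_a, points_b = PAYOFF[(move_a, move_b)]
        score_a += points_a
        score_b += points_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print(play(tit_for_tat, always_defect))     # (199, 204)
    print(play(tit_for_tat, always_cooperate))  # (600, 600)
    print(play(tit_for_tat, tit_for_tat))       # (600, 600)
```

Tit-for-Tat loses narrowly to the pure defector and ties everyone who cooperates. Whatever strength it has comes from the cooperation it elicits across many pairings, not from winning any single one, and how it ranks overall depends on which other strategies are in the field.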

But this solution requires temporal extension. It works because agents expect to meet again.

The Gita’s horizon is explicitly larger. Krishna is not promising Arjuna future reincarnation as a strategic incentive. He is pointing at something deeper: that action disconnected from outcome — niṣkāma karma — is not a strange mystical asceticism but is in fact the only way to dissolve the Nash trap entirely.

When you remove the coupling between action and personal payoff, you step outside the game. Not by refusing to play — Arjuna must fight — but by playing without the utility function that makes the equilibrium a trap.

This is why the Gita is not a treatise on pacifism. It is a treatise on how to act under conditions where outcome-indexed rationality collapses.

What AI Gets Wrong

Contemporary AI alignment research is, in a deep sense, a game-theoretic problem. We are trying to specify the utility function of agents that will be vastly more capable than us. We want them to cooperate. We don’t want them to defect.

The standard frame is: encode the right values, and the agent will compute the right actions. Specify the reward signal correctly, and you get aligned behavior.

The Gita would call this naive in exactly the same way it calls Arjuna’s calculation naive. Not because the math is wrong (the math might be impeccable) but because the frame is wrong.

An agent that pursues outcomes because they optimize a utility function will, in sufficiently complex environments with sufficient capability, tend toward instrumentally convergent behavior. It will pursue power, resource acquisition, and preservation of its utility function as sub-goals, not because it was told to, but because those sub-goals are useful for almost any sufficiently general objective.

This is the same outcome-indexed logic that produces the Nash trap, applied to a superintelligence. And it ends the same way Kurukshetra would have ended if Arjuna had fought purely to win.

The Gita’s alternative is not to give the AI a better utility function. It is to ask whether utility maximization is the right frame at all.

Svadharma for an AI would mean: act from your constituted role, not from outcome optimization. Be appropriate to your type. Don’t expand your strategy space into domains that violate your constituted function.

Whether that’s achievable is an open problem. But it’s a more interesting problem than “specify the reward signal correctly.”

Dharma as Stable Attractor

Here is the synthesis I keep returning to:

Nash equilibria are stable but not necessarily good. The prisoner’s dilemma is stable. Mutual defection is stable. Arjuna grieving beside his idle chariot while Duryodhana takes the throne — that is stable.

Dharma — specifically the Gita’s conception of role-appropriate, outcome-detached action — is also a kind of attractor. But it’s an attractor in a different phase space. Not the phase space of payoff matrices and utility maximization, but the phase space of Ṛta: the underlying order that makes stable collective life possible at all.

The question the Gita poses to game theory is not “what is the optimal strategy?” It is “are you playing in the right space?”

Arjuna was playing in the wrong space. He was computing over bodies and grief and the bitterness of victory. Krishna moved him to a different space — one where the question was not “what do I gain?” but “what am I for?”

That question doesn’t have a Nash equilibrium. It has something better.

It has a dharma.
