Degrees of Freedom

Something I’ve been thinking about for a while is the dual relationship between optimization and indifference, and the relationship between both of them and the idea of freedom.

Optimization: “Of all the possible actions available to me, which one is best (by some criterion)?  Ok, I’ll choose the best.”

Indifference: “Multiple possible options are equally good, or incommensurate (by the criterion I’m using). My decision algorithm equally allows me to take any of them.”

Total indifference between all options makes optimization impossible or vacuous. An optimization criterion which assigns a total ordering between all possibilities makes indifference vanishingly rare. So these notions are dual in a sense. Every dimension along which you optimize is in the domain of optimization; every dimension you leave “free” is in the domain of indifference.
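
To make the duality concrete, here is a toy sketch (my own illustration, not part of the original argument; the options and scoring function are made up): the dimensions your criterion scores are the domain of optimization, while the dimensions it ignores are left to indifference, so the “best choice” is really a set of tied options.

```python
# Toy illustration: an optimization criterion scores only some dimensions
# of each option; the unscored dimensions are left to indifference, so the
# "argmax" is a *set* of equally good options rather than a single winner.
options = [
    {"price": 3, "color": "red"},
    {"price": 3, "color": "blue"},
    {"price": 5, "color": "red"},
]

def score(option):
    # The criterion only cares about price (lower is better);
    # "color" is a dimension the criterion is indifferent to.
    return -option["price"]

best = max(score(o) for o in options)
best_set = [o for o in options if score(o) == best]

# Two options tie on price; within best_set you are "free" in the
# indifference sense, because the criterion has nothing to say about color.
print(best_set)
```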

Being “free” in one sense can mean “free to optimize”.  I choose the outcome that is best according to an internal criterion, which is not blocked by external barriers.  A limit on freedom is a constraint that keeps me away from my favorite choice. Either a natural limit (“I would like to do that but the technology doesn’t exist yet”) or a man-made limit (“I would like to do that but it’s illegal.”)

There’s an ambiguity here, of course, when it comes to whether you count “I would like to do that, but it would have a consequence I don’t like” as a limit on freedom.  Is that a barrier blocking you from the optimal choice, or is it simply another way of saying that it’s not an optimal choice after all?

And, in the latter case, isn’t that basically equivalent to saying there is no such thing as a barrier to free choice? After all, “I would like to do that, but it’s illegal” is effectively the same thing as “I would like to do that, but it has a consequence I don’t like, such as going to jail.” You can get around this ambiguity in a political context by distinguishing natural from social barriers, but that’s not a particularly principled distinction.

Another issue with freedom-as-optimization is that it’s compatible with quite tightly constrained behavior, in a way that’s not consistent with our primitive intuitions about freedom.  If you’re only “free” to do the optimal thing, that can mean you are free to do only one thing, all the time, as rigidly as a machine. If, for instance, you are only free to “act in your own best interests”, you don’t have the option to act against your best interests.  People in real life can feel constrained by following a rigid algorithm even when they agree it’s “best”; “but what if I want to do something that’s not best?”  Or, they can acknowledge they’re free to do what they choose, but are dismayed to learn that their choices are “dictated” as rigidly by habit and conditioning as they might have been by some human dictator.

An alternative notion of freedom might be freedom-as-arbitrariness.  Freedom in the sense of “degrees of freedom” or “free group”, derived from the intuition that freedom means breadth of possibility rather than optimization power.  You are only free if you could equally do any of a number of things, which ultimately means something like indifference.

This is the intuition behind claims like Viktor Frankl’s: “Between stimulus and response there is a space. In that space is our power to choose a response. In our response lies our growth and our freedom.”  If you always respond automatically to a given stimulus, you have only one choice, and that makes you unfree in the sense of “degrees of freedom.”

Venkat Rao’s concept of freedom is pretty much this freedom-as-arbitrariness, with some more specific wrinkles. He mentions degrees of freedom (“dimensionality”) as well as “inscrutability”, the inability to predict one’s motion from the outside.

Buddhists also often speak of freedom more literally in terms of indifference, and there’s a very straightforward logic to this; you can only choose equally between A and B if you have been “liberated” from the attractions and aversions that constrain you to choose A over B.  Those who insist that Buddhism is compatible with a fairly normal life say that after Buddhist practice you still will choose systematically most of the time — your utility function cannot fully flatten if you act like a living organism — but that, like Viktor Frankl’s ideal human, you will be able to reflect with equanimity and consider choosing B over A; you will be more “mentally flexible.”  Of course, some Buddhist texts simply say that you become actually indifferent, and that sufficient vipassana meditation will make you indistinguishable from a corpse.

Freedom-as-indifference, I think, is lurking behind our intuitions about things like “rights” or “ownership.” When we say you have a “right” to free speech — even a right bounded with certain limits, as it of course always is in practice — we mean that within those limits, you may speak however you want.  Your rights define a space, within which you may behave arbitrarily.  Not optimally. A right, if it’s not to be vacuous, must mean the right to behave “badly” in some way or other.  To own a piece of property means that, within whatever limits the concept of ownership sets, you may make use of it in any way you like, even in suboptimal ways.

This is very clearly illustrated by Glen Weyl’s notion of radical markets, which neatly disassociates two concepts usually both considered representative of free-market systems: ownership and economic efficiency.  To own something just is to be able to hang onto it even when it is economically inefficient to do so.  As Weyl says, “property is monopoly.”  The owner of a piece of land can sit on it, making no improvements, while holding out for a high price; the owner of intellectual property can sit on it without using it, in exactly the same way that a monopolist can sit on a factory and depress output while charging higher prices than he could get away with in a competitive market.

For better or for worse, rights and ownership define spaces in which you can destroy value.  If your car was subject to a perpetual auction and ownership tax as Weyl proposes, bashing your car to bits with a hammer would cost you even if you didn’t personally need a car, because it would hurt the rental or resale value and you’d still be paying tax.  On some psychological level, I think this means you couldn’t feel fully secure in your possessions, only probabilistically likely to be able to provide for your needs. You only truly own what you have a right to wreck.

Freedom-as-a-space-of-arbitrary-action is also, I think, an intuition behind the fact that society (all societies, but the US more than other rich countries, I think) is shaped by people’s desire for more discretion in decisionmaking as opposed to transparent rubrics.  College admissions, job applications, organizational codes of conduct, laws and tax codes, all are designed deliberately to allow ample discretion on the part of decisionmakers rather than restricting them to following “optimal” or “rational”, simple and legible, rules.  Some discretion is necessary to ensure good outcomes; a wise human decisionmaker can always make the right decision in some hard cases where a mechanical checklist fails, simply because the human has more cognitive processing power than the checklist.  This phenomenon is as old as Plato’s Laws and as current as the debate over algorithms and automation in medicine.  However, what we observe in the world is more discretion than would be necessary, for the aforementioned reasons of cognitive complexity, to generate socially beneficial outcomes.  We have discretion that enables corruption and special privileges in cases that pretty much nobody would claim to be ideal — rich parents buying their not-so-competent children Ivy League admissions, favored corporations voting themselves government subsidies.  Decisionmakers want the “freedom” to make illegible choices, choices which would look “suboptimal” by naively sensible metrics like “performance” or “efficiency”, choices they would prefer not to reveal or explain to the public.  Decisionmakers feel trapped when there’s too much “accountability” or “transparency”, and prefer a wider sphere of discretion.  Or, to put it more unfavorably, they want to be free to destroy value.

And this is true at an individual psychological level too, of course — we want to be free to “waste time” and resist pressure to account for literally everything we do. Proponents of optimization insist that this is simply a failure mode from picking the wrong optimization target — rest, socializing, and entertainment are also needs, the optimal amount of time to devote to them isn’t zero, and you don’t have to consider personal time to be “stolen” or “wasted” or “bad”, you can, in principle, legibilize your entire life including your pleasures. Anything you wish you could do “in the dark”, off the record, you could also do “in the light,” explicitly and fully accounted for.  If your boss uses “optimization” to mean overworking you, the problem is with your boss, not with optimization per se.

The freedom-as-arbitrariness impulse in us is skeptical.

I see optimization and arbitrariness everywhere now; I see intelligent people who more or less take one or the other as an ideology, and see it as obviously correct.

Venkat Rao and Eric Weinstein are partisans of arbitrariness; they speak out in favor of “mediocrity” and against “excellence”, respectively.  The rationale is that being highly optimized at some widely appreciated metric — being very intelligent, or very efficient, or something like that — is often less valuable than being creative, generating something in a part of the world that is “dark” to the rest of us, something that is not even on our map as a thing to value and thus appears as a lack of value.  Ordinary people being “mediocre”, or talented people being “undisciplined” or “disreputable”, may be more creative than highly-optimized “top performers”.

Robin Hanson, by contrast, is a partisan of optimization; he speaks out against bias and unprincipled favoritism and in favor of systems like prediction markets which would force the “best ideas to win” in a fair competition.  Proponents of ideas like radical markets, universal basic income, open borders, income-sharing agreements, or smart contracts (I’d here include, for instance, Vitalik Buterin) are also optimization partisans.  These are legibilizing policies that, if optimally implemented, can always be Pareto improvements over the status quo; “whatever degree of wealth redistribution you prefer”, proponents claim, “surely it is better to achieve it in whatever way results in the least deadweight loss.”  This is the very reason that they are not the policies that public choice theory would predict would emerge naturally in governments. Legibilizing policies allow little scope for discretion, so they don’t let policymakers give illegible rewards to allies and punishments to enemies.  They reduce the scope of the “political”, i.e. that which is negotiated at the personal or group level, and replace it with an impersonal set of rules within which individuals are “free to choose” but not very “free to behave arbitrarily” since their actions are transparent and they must bear the costs of being in full view.

Optimization partisans are against weakly enforced rules — they say “if a rule is good, enforce it consistently; if a rule is bad, remove it; but selective enforcement is just another word for favoritism and corruption.”  Illegibility partisans say that weakly enforced rules are the only way to incorporate valuable information — precisely that information which enforcers do not feel they can make explicit, either because it’s controversial or because it’s too complex to verbalize. “If you make everything explicit, you’ll dumb everything in the world down to what the stupidest and most truculent members of the public will accept.  Say goodbye to any creative or challenging innovations!”

I see the value of arguments on both sides. However, I have positive (as opposed to normative) opinions that I don’t think everybody shares.  I think that the world I see around me is moving in the direction of greater arbitrariness and has been since WWII or so (when much of US society, including scientific and technological research, was organized along military lines).  I see arbitrariness as a thing that arises in “mature” or “late” organizations.  Bigger, older companies are more “political” and more monopolistic.  Bigger, older states and empires are more “corrupt” or “decadent.”

Arbitrariness has a tendency to protect those in power rather than those out of power, though the correlation isn’t perfect.  Zones that protect your ability to do “whatever” you want without incurring costs (which include zones of privacy or property) are protective, conservative forces — they allow people security.  This often means protection for those who already have a lot; arbitrariness is often “elitist”; but it can also protect “underdogs” on the grounds of tradition, or protect them by shrouding them in secrecy.  (Scott thought “illegibility” was a valuable defense of marginalized peoples like the Roma. Illegibility is not always the province of the powerful and privileged.)  No; the people such zones of arbitrary, illegible freedom systematically harm are those who benefit from increased accountability and revealing of information: whistleblowers and accusers; those who expect that their merit/performance is good enough that displaying it will work to their advantage; those who call for change and want to display information to justify it; those who are newcomers or young and want a chance to demonstrate their value.

If your intuition is “you don’t know me, but you’ll like me if you give me a chance” or “you don’t know him, but you’ll be horrified when you find out what he did”, or “if you gave me a chance to explain, you’d agree”, or “if you just let me compete, I bet I could win”, then you want more optimization.

If your intuition is “I can’t explain, you wouldn’t understand” or “if you knew what I was really like, you’d see what an impostor I am”, or “malicious people will just use this information to take advantage of me and interpret everything in the worst possible light” or “I’m not for public consumption, I am my own sovereign person, I don’t owe everyone an explanation or justification for actions I have a right to do”, then you’ll want less optimization.

Of course, these aren’t so much static “personality traits” of a person as one’s assessment of the situation around oneself.  The latter cluster is an assumption that you’re living in a social environment where there’s very little concordance of interests — people knowing more about you will allow them to more effectively harm you.  The former cluster is an assumption that you’re living in an environment where there’s a great deal of concordance of interests — people knowing more about you will allow them to more effectively help you.

For instance, being “predictable” is, in Venkat’s writing, usually a bad thing, because it means you can be exploited by adversaries. Free people are “inscrutable.”  In other contexts, such as parenting, being predictable is a good thing, because you want your kids to have an easier time learning how to “work” the house rules.  You and your kid are not, most of the time, wily adversaries outwitting each other; conflicts are more likely to come from too much confusion or inconsistently enforced boundaries.  Relationship advice and management advice usually recommends making yourself easier for your partners and employees to understand, never more inscrutable.  (Sales advice, however, and occasionally advice for keeping romance alive in a marriage, sometimes recommends cultivating an aura of mystery, perhaps because it’s more adversarial.)

A related notion: wanting to join discussions is a sign of expecting a more cooperative world, while trying to keep people from joining your (private or illegible) communications is a sign of expecting a more adversarial world.

As social organizations “mature” and become larger, it becomes harder to enforce universal and impartial rules, harder to keep the larger population aligned on similar goals, and harder to comprehend the more complex phenomena in this larger group.  This means that there’s both motivation and opportunity to carve out “hidden” and “special” zones where arbitrary behavior can persist even when it would otherwise come with negative consequences.

New or small organizations, by contrast, must gain/create resources or die, so they have more motivation to “optimize” for resource production; and they’re simple, small, and/or homogeneous enough that legible optimization rules and goals and transparent communication are practical and widely embraced.  “Security” is not available to begin with, so people mostly seek opportunity instead.

This theory explains, for instance, why US public policy is more fragmented, discretionary, and special-case-y, and less efficient and technocratic, than it is in other developed countries: the US is more racially diverse, which means, in a world where racism exists, that US civil institutions have evolved to allow ample opportunities to “play favorites” (giving special legal privileges to those with clout) in full generality, because a large population has historically been highly motivated to “play favorites” on the basis of race.  Homogeneity makes a polity behave more like a “smaller” one, while diversity makes a polity behave more like a “larger” one.

Aesthetically, I think of optimization as corresponding to an “early” style, like Doric columns, or like Masaccio; simple, martial, all form and principle.  Arbitrariness corresponds to a “late” style, like Corinthian columns or like Rubens: elaborate, sensual, full of details and personality.

The basic argument for optimization over arbitrariness is that it creates growth and value while arbitrariness creates stagnation.

Arbitrariness can’t really argue for itself as well as optimization can, because communication itself is on the other side.  Arbitrariness always looks illogical and inconsistent.  It kind of is illogical and inconsistent. All it can say is “I’m going to defend my right to be wrong, because I don’t trust the world to understand me when I have a counterintuitive or hard-to-express or controversial reason for my choice.  I don’t think I can get what I want by asking for it or explaining my reasons or playing ‘fair’.”  And from the outside, you can’t always tell the difference between someone who thinks (perhaps correctly!) that the game is really rigged against them at a profound level, and somebody who just wants to cheat or who isn’t thinking coherently.  Sufficiently advanced cynicism is indistinguishable from malice and stupidity.

For a fairly sympathetic example, you see something like Darkness at Noon, where the protagonist thinks, “Logic inexorably points to Stalinism; but Stalinism is awful! Therefore, let me insist on some space free from the depredations of logic, some space where justice can be tempered by mercy and reason by emotion.” From the distance of many years, it’s easy to say that’s silly, that of course there are reasons not to support Stalin’s purges, that it’s totally unnecessary to reject logic and justice in order to object to killing innocents.  But from inside the system, if all the arguments you know how to formulate are Stalinist, if all the “shoulds” and “oughts” around you are Stalinist, perhaps all you can articulate at first is “I know all this is right, of course, but I don’t like it.”

Not everything people call reason, logic, justice, or optimization, is in fact reasonable, logical, just, or optimal; so, a person needs some defenses against those claims of superiority.  In particular, defenses that can shelter them even when they don’t know what’s wrong with the claims.  And that’s the closest thing we get to an argument in favor of arbitrariness. It’s actually not a bad point, in many contexts.  The counterargument usually has to boil down to hope — to a sense of “I bet we can do better.”

 


Personalized Medicine For Real

I was part of the founding team at MetaMed, a personalized medicine startup.  We went out of business back in 2015.  We made a lot of mistakes due to inexperience, some of which I deeply regret.

I’m reflecting on that now, because Perlara just went out of business, and they got a lot farther on our original dream than we ever did. Q-State Biosciences, which is still around, is using a similar model.

The phenomenon that inspired MetaMed is that we knew of stories of heroic, scientifically literate patients and families of patients with incurable diseases, who came up with cures for their own conditions.  Physicist Leo Szilard, the “father of the atom bomb”, designed a course of radiation therapy to cure his own bladder cancer.  Computer scientist Matt Might analyzed his son’s genome to find a cure for his rare disorder.  Cognitive scientist Joshua Tenenbaum found a personalized treatment for his father’s cancer.

So, we thought, could we try to scale up this process to help more people?

In Lois McMaster Bujold’s science fiction novels, the hero suffers an accident that leaves him with a seizure disorder. He goes to a medical research center and clinic, the Durona Group, and they design a neural prosthetic for him that prevents the seizures.

This sounds like it ought to be a thing that exists. Patient-led, bench-to-bedside drug discovery or medical device engineering.  You get an incurable disease, you fund scientists/doctors/engineers to discover a cure, and now others with the disease can also be cured.

There’s actually a growing community of organizations trying to do things sort of in this vein.  Recursion Pharmaceuticals, where I used to work, does drug discovery for rare diseases. Sv.ai organizes hackathons for analyzing genetic data to help patients with rare diseases find the root cause.  Perlara and Q-state use animal models and in-vitro models respectively to simulate patients’ disorders, and then look for drugs or gene therapies that reverse those disease phenotypes in the animals or cells.

Back at MetaMed, I think we were groping towards something like this, but never really found our way there.

One reason is that we didn’t narrow our focus enough.  We were trying to solve too many problems at once, all called “personalized medicine.”

Personalized Lifestyle Optimization

Some “personalized medicine” is about health optimization for basically healthy people. A lot of it amounts to superficial personalization on top of generic lifestyle advice. Harmless, but more of a marketing thing than a science thing, and not very interesting from a humanitarian perspective.  Sometimes, we tried to get clients from this market.  I pretty much always thought this was a bad idea.

Personalized Medicine For All

Some “personalized medicine” is about the claim that the best way to treat even common diseases often depends on individual factors, such as genes.

This was part of our pitch, but as I learned more, I came to believe that this kind of “personalization” has very little applicability.  In most cases, we don’t know enough about how genes affect response to treatment to be able to improve outcomes by stratifying treatments based on genes.  In the few cases where we know people with different genes need different treatments, it’s often already standard medical practice to run those tests.  I now think there’s not a clear opportunity for a startup to improve the baseline through this kind of personalized medicine.

Preventing Medical Error

Some of our founding inspirations were the work of Gerd Gigerenzer and Atul Gawande, who showed that medical errors were the cause of many deaths, that doctors tend to be statistically illiterate, and that systematizing tools like checklists and statistical prediction rules save lives.  We wanted to be part of the “evidence-based medicine” movement by helping patients whose doctors had failed them.

I now think that we weren’t really in a position to do that as a company that sold consultations to individual patients. Many of the improvements in systematization that were clearly “good buys” have, in fact, been implemented in hospitals since Gawande and Gigerenzer first wrote about them.  We never saw a clear-cut case of a patient whose doctors had “dropped the ball” by giving them an obviously wrong treatment, except where the patient was facing financial hardship and had to transfer to substandard medical care.  I think doctors don’t make true unforced errors in diagnosis or treatment plan that often; and medical errors like “operating on the wrong leg” that happen in fast-paced decisionmaking environments were necessarily outside our scope.  I think there might be an opportunity to do a lot better than baseline by building a “smart hospital” that runs on checklists, statistical prediction rules, outcomes monitoring, and other evidence-based practices — Intermountain is the closest thing I know about, and they do get great outcomes — but that’s an epically hard problem, it’s political as much as medical and technological, and we weren’t in a position to make any headway on it.

AI Diagnosis

We were also hoping to automate diagnosis and treatment planning in a personalized manner.  “Given your symptoms, demographics, and genetic & lab test data, and given published research on epidemiology and clinical experiments, what are the most likely candidate diagnoses for you, and what are the treatments most likely to be effective for you?”

I used to be a big believer in the potential of this approach, but in the process of actually trying to build the AI, I ran into obstacles which were fundamentally philosophical. (No, it’s not “machines don’t have empathy” or anything like that.  It’s about the irreducible dependence on how you frame the problem, which makes “expert systems” dependent on an impractical, expensive amount of human labor up front.)
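
For concreteness, here is a minimal sketch of the kind of framing such a system requires (a toy naive-Bayes ranking with entirely made-up conditions and numbers, not MetaMed’s actual model): the hypothesis space, the priors, and the likelihoods all have to be specified by hand, which is exactly the expensive up-front human labor mentioned above.

```python
# Toy sketch of the "expert system" framing (hypothetical numbers, not a
# real diagnostic model): rank candidate diagnoses by a naive-Bayes score
# given observed findings. Everything here (hypothesis space, priors,
# likelihoods) has to be specified by hand, which is the framing problem.

priors = {"disease_A": 0.01, "disease_B": 0.001, "healthy": 0.989}
likelihoods = {
    # P(finding | condition), all made-up numbers
    ("fever", "disease_A"): 0.8,
    ("fever", "disease_B"): 0.3,
    ("fever", "healthy"): 0.05,
    ("rash", "disease_A"): 0.1,
    ("rash", "disease_B"): 0.9,
    ("rash", "healthy"): 0.02,
}

def rank_diagnoses(findings):
    # Naive Bayes: multiply prior by each finding's likelihood, then normalize.
    scores = {}
    for condition, prior in priors.items():
        p = prior
        for finding in findings:
            p *= likelihoods[(finding, condition)]
        scores[condition] = p
    total = sum(scores.values())
    return sorted(((c, p / total) for c, p in scores.items()),
                  key=lambda item: item[1], reverse=True)

print(rank_diagnoses(["fever", "rash"]))
```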

Connecting Patients with Experimental Therapies

Yet another “personalized medicine” problem we were trying to solve is the fact that patients with incurable diseases have a hard time learning about and getting access to experimental therapies, and could use a consultant who would guide them through the process and help get them into studies of new treatments.

I still think this is a real and serious problem for patients, and potentially an opportunity for entrepreneurs.  (Either on the consulting model, or more on the software side, via creating tools for matching patients with clinical trials — since clinical trials also struggle to recruit patients.)  In order to focus on this model, though, we’d have had to invest a lot more than we did into high-touch relationships with patients and building a network of clinician-researchers we could connect them with.

When Standard Practice Doesn’t Match Scientific Evidence

One kind of “medical error” we did see on occasion was when the patient’s doctors were dutifully doing the treatment that’s “standard-of-care”, but the medical literature actually shows that the standard-of-care is wrong.

There are cases where large, well-conducted studies clearly show that treatment A and treatment B have the same efficacy but B has worse side effects, and yet, “first-line treatment” is B for some reason.

There are cases where there’s a lot of evidence that “standard” cut-offs are in the wrong place. “Subclinical hypothyroidism” still benefits from supplemental thyroid hormone; higher-than-standard doses of allopurinol control gout better; “standard” light therapy for seasonal affective disorder doesn’t work as well as ultra-bright lights; etc.  More Dakka.

There are also cases where a scientist found an intervention effective, and published a striking result, and maybe it was even publicized widely in places like the New Yorker or Wired, but somehow clinicians never picked it up.  The classic example is Ramachandran’s mirror box experiment — it’s a famous experiment that showed that phantom limb pain can be reversed by creating an illusion with mirrors that allows the patient to fix their “body map.” There have since been quite a few randomized trials confirming that the mirror trick works. But, maybe because it’s not a typical kind of “treatment” like a drug, it’s not standard of care for phantom limb pain.

I think we were pretty successful at finding these kinds of mismatches between medical science and medical practice.  By their nature, though, these kinds of solutions are hard to scale to reach lots of people.

N=1 Translational Medicine for Rare Diseases

This is the use case of “personalized medicine” that I think can really shine.  It harnesses the incredible motivation of patients with rare incurable diseases and their family members; it’s one of the few cases where genetic data really does make a huge difference; and the path to scale is (relatively) obvious if you discover a new drug or treatment.  I think we should have focused much more tightly on this angle, and that a company based on bench-to-bedside discovery for rare diseases could still become the real-world “Durona Group”.

I think doing it right at MetaMed would have meant getting a lot more in-house expertise in biology and medicine than we ever had, more like Perlara and Q-State, which have their own experimental research programs, something we never got off the ground.

Speaking only about myself and not my teammates, while I was at MetaMed I was deeply embarrassed to be a layman in the biomedical field, and I felt like “why would an expert ever want to work with a layman like me?” So I was far too reluctant to reach out to prominent biologists and doctors. I now know that experts work with laymen all the time, especially when that layman brings strategic vision, funding, and logistical/operational manpower, and listens to the expert with genuine curiosity.  Laymen are valuable — just ask Mary Lasker!  I really wish I’d understood this at the time.

People overestimate progress in the short run and underestimate it in the long run.  “Biohackers” and “citizen science” and “N=1 experimentation” have been around for a while, but they haven’t, I think, gotten very far along towards the ultimate impact they’re likely to have in the future.  Naively, that can look a lot like “a few people tried that and it didn’t seem to go anywhere” when the situation is actually “the big break is still ahead of us.”

The Tale of Alice Almost: Strategies for Dealing With Pretty Good People

Suppose you value some virtue V and you want to encourage people to be better at it.  Suppose also you are something of a “thought leader” or “public intellectual” — you have some ability to influence the culture around you through speech or writing.

Suppose Alice Almost is much more V-virtuous than the average person — say, she’s in the top one percent of the population at the practice of V.  But she’s still exhibited some clear-cut failures of V.  She’s almost V-virtuous, but not quite.

How should you engage with Alice in discourse, and how should you talk about Alice, if your goal is to get people to be more V-virtuous?

Well, it depends on what your specific goal is.

Raising the Global Median

If your goal is to raise the general population’s median V level (for instance, if V is “understanding of how vaccines work” and your goal is to increase the proportion of people who vaccinate their children), you want to support Alice straightforwardly.

Alice is way above the median V level. It would be great if people became more like Alice. If Alice is a popular communicator, signal-boosting Alice will be more likely to help rather than harm your cause.

For instance, suppose Alice makes a post telling parents to vaccinate their kids, but she gets a minor fact wrong along the way.  It’s still OK to quote or excerpt the true part of her post approvingly, or to praise her for coming out in favor of vaccines.

Even spreading the post with the incorrect statement included, while it’s definitely suboptimal for the cause of increasing the average person’s understanding of vaccines, is probably net positive, rather than net negative.

Raising the Median Among The Virtuous

What if, instead, you’re trying to promote V among a small sub-community who excel at it?  Say, the top 1% of the population in terms of V-virtue?

You might do this if your goal only requires a small number of people to practice exceptional virtue. For instance, to have an effective volunteer military doesn’t require all Americans to exhibit the virtues of a good soldier, just the ones who sign up for military service.

Now, within the community you’re trying to influence, Alice Almost isn’t way above average any more.  Alice is average. 

That means, you want to push people, including Alice, to be better than Alice is today.  Sure, Alice is already pretty V-virtuous compared to the general population, but by the community’s standards, the general population is pathetic.  

In this scenario, it makes sense to criticize Alice privately if you have a personal relationship with her.  It also makes sense to, at least sometimes, publicly point out how the Alice Almosts of the community are falling short of the ideal of V.  (Probably without naming names, unless Alice is already a famous public figure.)

Additionally, it makes sense to allow Alice to bear the usual negative consequences of her actions, and to publicly argue against anyone trying to shield her from normal consequences. For instance, if people who exhibit Alice-like failures of V are routinely fired from their jobs in your community, then if Alice gets fired, and her supporters get outraged about it, then it makes sense for you to argue that Alice deserved to be fired.

It does not make sense here to express outrage at Alice’s behavior, or to “punish” her as though she had committed a community norm violation.  Alice is normal — that means that behavior like Alice’s happens all the time, and that the community does not currently have effective, reliably enforced norms against behavior like hers.

Now, maybe the community should have stronger norms against her behavior!  But you have to explicitly make the case for that.  If you go around saying “Alice should be jailed because she did X”, and X isn’t illegal under current law, then you are wrong.  You first have to argue that X should be illegal.

If Alice’s failures of V-virtue are typical, then you do want to communicate the message that people should practice V more than Alice does.  But this will be news to your audience, not common knowledge, since many of them are no better than Alice.  To communicate effectively, you’ll have to take a tone of educating or sharing information: “Alice Almost, a well-known member of our community, just did X.  Many of us do X, in fact. But X is not good enough. We shouldn’t consider X okay any more. Here’s why.”

Enforcing Community Norms

What if Alice is inside the community of top-1%-V-virtue you care about, but noticeably worse than average at V or violating community standards for V?

That’s an easy case. Enforce the norms! That’s what they’re there for!

Continuing to enforce the usual penalties against failures of V, making it common knowledge that you do so, and supporting others who enforce penalties, keeps the “floor” of V in your community from falling, either by deterrence or expulsion or both.

In terms of tone, it now makes sense for you to communicate in a more “judgmental” way, because it’s common knowledge that Alice did wrong.  You can say something like “Alice did X.  As you know, X is unacceptable/forbidden/substandard in our community. Therefore, we will be penalizing her in such-and-such a way, according to our well-known, established traditions/code/policy.”

Splintering off a “Remnant”

The previous three cases treated the boundaries of your community as static. What if we made them dynamic instead?

Suppose you’re not happy with the standard of V-virtue of “the top 1% of the population.”  You want to create a subcommunity with an even higher standard — let’s say, drawing from the top 0.1% of the population.

You might do this, for instance, if V is “degree of alignment/agreement with a policy agenda”, and you’re not making any progress with discourse/collaboration between people who are only mostly aligned with your agenda, so you want to form a smaller task force composed of a core of people who are hyper-aligned.

In that case, Alice Almost is normal for your current community, but she’s notably inferior in V-virtue compared to the standards of the splinter community you want to form.

Here, not only do you want to publicly criticize actions like Alice’s, but you even want to spend most of your time talking about how the Alice Almosts of the world fall short of the ideal V, as you advocate for the existence of your splinter group.  You want to reach out to the people who are better at V than Alice, even if they don’t know it themselves, and explain to them what the difference between top-1% V-virtue and top 0.1% V-virtue looks like, and why that difference matters.  You’re, in effect, empowering and encouraging them to notice that they’re not Alice’s peers any more, they’ve leveled up beyond her, and they don’t have to make excuses for her any more.

Just like in the case where Alice is a typical member of your community and you want to push your community to do better, your criticisms of Alice will be news to much of your audience, so you have to take an “educational/informational” tone. Even the people in the top 0.1% “remnant” may not be aware yet that there’s anything wrong with Alice’s behavior.

However, you’re now speaking primarily to the top 0.1%, not the top 1%, so you can now afford to be somewhat more insulting towards Alice.  You’re trying to create norms for a future community in which Alice’s behavior will be considered unacceptable/substandard, so you can start to introduce the frame where Alice-like behavior is “immoral”, “incompetent”, “outrageous”, or otherwise failing to meet a reasonable person’s minimum expectations.

Expanding Community Membership

Let’s say you’re doing just the opposite. You think your community is too selective.  You want to expand its boundaries to, say, a group drawn from the top 10% of the population in V-virtue.  Your goals may require you to raise the V-levels of a wider audience than you’d been speaking to before.

In this case, you’re more or less in the same position as in the first case where you’re just trying to raise the global median.  You should support Alice Almost (as much as possible without yourself imitating or compounding her failures), laud her as a role model, and not make a big public deal about the fact that she falls short of the ideal; most of the people you’re trying to reach fall short even farther.

What if Alice is Diluting Community Values?

Now, what if Alice Almost is the one trying to expand community membership to include people lower in V-virtue … and you don’t agree with that?

Now, Alice is your opponent.

In all the previous cases, the worst Alice did was drag down the community’s median V level, either directly or by being a role model for others.  But we had no reason to suppose she was optimizing for lowering the median V level of the community.  Once Alice is trying to “popularize” or “expand” the community, that changes. She’s actively trying to lower median V in your community — that is, she’s optimizing for the opposite of what you want.

This means that, not only should you criticize Alice, enforce existing community norms that forbid her behavior, and argue that community standards should become stricter against Alice-like, 1%-level failures of V-virtue, but you should also optimize against Alice gaining more power generally.

(But what if Alice succeeds in expanding the community size 10x while raising V levels within the larger community enough that the median V level still increases from where it is now? Wouldn’t Alice’s goals be aligned with your goals then?  Yeah, but we can assume we’re in a regime where increasing V levels is very hard — a reasonable assumption if you think about the track record of trying to teach ethics or instill virtue in large numbers of people — so such a huge persuasive/rhetorical win is unlikely.)

Alice, for her part, will see you as optimizing against her goals (she wants to grow the community and you want to prevent that) so she’ll have reason to optimize generally against you gaining more power.

Alice Almost and you are now in a zero-sum game.  You are direct opponents, even though both of you are, compared to the general population, very high in V-virtue.

Alice Almost in this scenario is a Sociopath, in the Chapman sense — she’s trying to expand and dilute the subculture.   And Sociopaths are not just a little bad for the survival of the subculture, they are an existential threat to it, even though they are only a little weaker in the defining skills/virtues of the subculture than the Geeks who founded it.  In the long run, it’s not about where you are, it’s where you’re aiming, and the Sociopaths are aiming down.

Of course, getting locked into a zero-sum game is bad if you can avoid it.  Misidentifying Alice as a Sociopath when she isn’t, or missing an opportunity to dialogue with her and come to agreement about how big the community really needs to be, is costly.  You don’t want to be hasty or paranoid in reading people as opponents.  But there’s a very, very big difference between how you deal with someone who just happened to do something that blocked your goal, and how you deal with someone who is persistently optimizing against your goal.

Humans Who Are Not Concentrating Are Not General Intelligences

Recently, OpenAI came out with a new language model that automatically synthesizes text, called GPT-2.

It’s disturbingly good.  You can see some examples (cherry-picked, by their own admission) in OpenAI’s post and in the related technical paper.

I’m not going to write about the machine learning here, but about the examples and what we can infer from them.

The scary thing about GPT-2-generated text is that it flows very naturally if you’re just skimming, reading for writing style and key, evocative words.  The “unicorn” sample reads like a real science press release. The “theft of nuclear material” sample reads like a real news story. The “Miley Cyrus shoplifting” sample reads like a real post from a celebrity gossip site.  The “GPT-2” sample reads like a real OpenAI press release. The “Legolas and Gimli” sample reads like a real fantasy novel. The “Civil War homework assignment” reads like a real C-student’s paper.  The “JFK acceptance speech” reads like a real politician’s speech.  The “recycling” sample reads like a real right-wing screed.

If I just skim, without focusing, they all look totally normal. I would not have noticed they were machine-generated. I would not have noticed anything amiss about them at all.

But if I read with focus, I notice that they don’t make a lot of logical sense.

For instance, in the unicorn sample:

The scientist named the population, after their distinctive horn, Ovid’s Unicorn. These four-horned, silver-white unicorns were previously unknown to science.

Wait a second, “Ovid” doesn’t refer to a “distinctive horn”, so why would naming them “Ovid’s Unicorn” be naming them after a distinctive horn?  Also, you just said they had one horn, so why are you saying they have four horns in the next sentence?

While their origins are still unclear, some believe that perhaps the creatures were created when a human and a unicorn met each other in a time before human civilization. According to Pérez, “In South America, such incidents seem to be quite common.”

Wait, unicorns originated from the interbreeding of humans and … unicorns?  That’s circular, isn’t it?

Or, look at the GPT-2 sample:

We believe this project is the first step in the direction of developing large NLP systems without task-specific training data. That is, we are developing a machine language system in the generative style with no explicit rules for producing text.

Except the second sentence isn’t a restatement of the first sentence — “task-specific training data” and “explicit rules for producing text” aren’t synonyms!  So saying “That is” doesn’t make sense.

Or look at the LOTR sample:

Aragorn drew his sword, and the Battle of Fangorn was won. As they marched out through the thicket the morning mist cleared, and the day turned to dusk.

Yeah, day doesn’t turn to dusk in the morning.

Or in the “resurrected JFK” sample:

(1) The brain of JFK was harvested and reconstructed via tissue sampling. There was no way that the tissue could be transported by air. (2) A sample was collected from the area around his upper chest and sent to the University of Maryland for analysis. A human brain at that point would be about one and a half cubic centimeters. The data were then analyzed along with material that was obtained from the original brain to produce a reconstruction; in layman’s terms, a “mesh” of brain tissue.

His brain tissue was harvested…from his chest?!  A human brain is one and a half cubic centimeters?!

So, ok, this isn’t actually human-equivalent writing ability. OpenAI doesn’t claim it is, for what it’s worth — I’m not trying to diminish their accomplishment, that’s not the point of this post.  The point is, if you skim text, you miss obvious absurdities.  The point is OpenAI HAS achieved the ability to pass the Turing test against humans on autopilot.

The point is, I know of a few people, acquaintances of mine, who, even when asked to try to find flaws, could not detect anything weird or mistaken in the GPT-2-generated samples.

There are probably a lot of people who would be completely taken in by literal “fake news”, as in, computer-generated fake articles and blog posts.  This is pretty alarming.  Even more alarming: unless I make a conscious effort to read carefully, I would be one of them.

Robin Hanson’s post Better Babblers is very relevant here.  He claims, and I don’t think he’s exaggerating, that a lot of human speech is simply generated by “low order correlations”, that is, generating sentences or paragraphs that are statistically likely to come after previous sentences or paragraphs:

After eighteen years of being a professor, I’ve graded many student essays. And while I usually try to teach a deep structure of concepts, what the median student actually learns seems to mostly be a set of low order correlations. They know what words to use, which words tend to go together, which combinations tend to have positive associations, and so on. But if you ask an exam question where the deep structure answer differs from answer you’d guess looking at low order correlations, most students usually give the wrong answer.

Simple correlations also seem sufficient to capture most polite conversation talk, such as the weather is nice, how is your mother’s illness, and damn that other political party. Simple correlations are also most of what I see in inspirational TED talks, and when public intellectuals and talk show guests pontificate on topics they really don’t understand, such as quantum mechanics, consciousness, postmodernism, or the need always for more regulation everywhere. After all, media entertainers don’t need to understand deep structures any better than do their audiences.

Let me call styles of talking (or music, etc.) that rely mostly on low order correlations “babbling”. Babbling isn’t meaningless, but to ignorant audiences it often appears to be based on a deeper understanding than is actually the case. When done well, babbling can be entertaining, comforting, titillating, or exciting. It just isn’t usually a good place to learn deep insight.
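
To make “low order correlations” concrete, here is a minimal babbler of my own (a bigram Markov chain over a tiny made-up corpus; GPT-2 is vastly more sophisticated, but the spirit is similar): each word is chosen based only on the word before it.

```python
import random
from collections import defaultdict

# Minimal "babbler": a bigram model that picks each next word based only on
# the previous word, i.e. the lowest-order correlations. (Illustrative toy,
# far simpler than GPT-2.)
corpus = (
    "the weather is nice today and the weather was nice yesterday "
    "and the talk was inspiring and the weather is inspiring"
).split()

# Count which words follow which.
transitions = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev].append(nxt)

def babble(start, length=12):
    word = start
    output = [word]
    for _ in range(length):
        followers = transitions.get(word)
        if not followers:
            break
        word = random.choice(followers)
        output.append(word)
    return " ".join(output)

# Locally plausible, globally meaningless: statistically likely word-to-word,
# with no deep structure behind it.
print(babble("the"))
```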

I used to half-joke that the New Age Bullshit Generator was actually useful as a way to get myself to feel more optimistic. The truth is, it isn’t quite good enough to match the “aura” or “associations” of genuine, human-created inspirational text. GPT-2, though, is.

I also suspect that the “lyrical” or “free-associational” function of poetry is adequately matched by GPT-2.  The autocompletions of Howl read a lot like Allen Ginsberg — they just don’t imply the same beliefs about the world.  (Moloch whose heart is crying for justice! sounds rather positive.)

I’ve noticed that I cannot tell, from casual conversation, whether someone is intelligent in the IQ sense.

I’ve interviewed job applicants, and perceived them all as “bright and impressive”, but found that the vast majority of them could not solve a simple math problem.  The ones who could solve the problem didn’t appear any “brighter” in conversation than the ones who couldn’t.

I’ve taught public school teachers, who were incredibly bad at formal mathematical reasoning (I know, because I graded their tests), to the point that I had not realized humans could be that bad at math — but it had no effect on how they came across in friendly conversation after hours. They didn’t seem “dopey” or “slow”, they were witty and engaging and warm.

I’ve read the personal blogs of intellectually disabled people — people who, by definition, score poorly on IQ tests — and they don’t read as any less funny or creative or relatable than anyone else.

Whatever ability IQ tests and math tests measure, I believe that lacking that ability doesn’t have any effect on one’s ability to make a good social impression or even to “seem smart” in conversation.

If “human intelligence” is about reasoning ability, the capacity to detect whether arguments make sense, then you simply do not need human intelligence to create a linguistic style or aesthetic that can fool our pattern-recognition apparatus if we don’t concentrate on parsing content.

I also noticed, upon reading GPT2 samples, just how often my brain slides from focused attention to just skimming. I read the paper’s sample about Spanish history with interest, and the GPT2-generated text was obviously absurd. My eyes glazed over during the sample about video games, since I don’t care about video games, and the machine-generated text looked totally unobjectionable to me. My brain is constantly making evaluations about what’s worth the trouble to focus on, and what’s ok to tune out. GPT2 is actually really useful as a *test* of one’s level of attention.

This is related to my hypothesis in https://srconstantin.wordpress.com/2017/10/10/distinctions-in-types-of-thought/ that effortless pattern-recognition is what machine learning can do today, while effortful attention and explicit reasoning (which seems to be a subset of effortful attention) are generally beyond ML’s current capabilities.

Beta waves in the brain are usually associated with focused concentration or active or anxious thought, while alpha waves are associated with the relaxed state of being awake but with closed eyes, before falling asleep, or while dreaming. Alpha waves sharply reduce after a subject makes a mistake and begins paying closer attention. I’d be interested to see whether ability to tell GPT2-generated text from human-generated text correlates with alpha waves vs. beta waves.

The first-order effects of highly effective text-generators are scary. It will be incredibly easy and cheap to fool people, to manipulate social movements, etc. There’s a lot of opportunity for bad actors to take advantage of this.

The second-order effects might well be good, though. If only conscious, focused logical thought can detect a bot, maybe some people will become more aware of when they’re thinking actively vs not, and will be able to flag when they’re not really focusing, and distinguish the impressions they absorb in a state of autopilot from “real learning”.

The mental motion of “I didn’t really parse that paragraph, but sure, whatever, I’ll take the author’s word for it” is, in my introspective experience, absolutely identical to “I didn’t really parse that paragraph because it was bot-generated and didn’t make any sense so I couldn’t possibly have parsed it”, except that in the first case, I assume that the error lies with me rather than the text.  This is not a safe assumption in a post-GPT2 world. Instead of “default to humility” (assume that when you don’t understand a passage, the passage is true and you’re just missing something) the ideal mental action in a world full of bots is “default to null” (if you don’t understand a passage, assume you’re in the same epistemic state as if you’d never read it at all.)

Maybe practice and experience with GPT2 will help people get better at doing “default to null”?

The Relationship Between Hierarchy and Wealth

Epistemic Status: Tentative

I’m fairly anti-hierarchical, as things go, but the big challenge to all anti-hierarchical ideologies is “how feasible is this in real life? We don’t see many examples around us of this working well.”

Backing up, for a second, what do we mean by a hierarchy?

I take it to mean a very simple thing: hierarchies are systems of social organization where some people tell others what to do, and the subordinates are forced to obey the superiors.  This usually goes along with special privileges or luxuries that are only available to the superiors.  For instance, patriarchy is a hierarchy in which wives and children must obey fathers, and male heads of families get special privileges.

Hierarchy is a matter of degree, of course. Power can vary in the severity of its enforcement penalties (a government can jail you or execute you, an employer can fire you, a religion can excommunicate you, the popular kids in a high school can bully or ostracize you), in its extent (a totalitarian government claims authority over more aspects of your life than a liberal one), or its scale (an emperor rules over more people than a clan chieftain.)

Power distance is a concept from the business world that attempts to measure the level of hierarchy within an organization or culture.  Power distance is measured by polling less-powerful individuals on how much they “accept and expect that power is distributed unequally”.  In low power distance cultures, there’s more of an “open door” policy, subordinates can talk freely with managers, and there are few formal symbols of status differentiating managers from subordinates.  In “high power distance” cultures, there’s more formality, and subordinates are expected to be more deferential.  According to Geert Hofstede, the inventor of the power distance index (PDI), Israel and the Nordic countries have the lowest power distance index in the world, while Arab, Southeast Asian, and Latin American countries have the highest.  (The US is in the middle.)

I share with many other people a rough intuition that hierarchy poses problems.

This may not be as obvious as it sounds.  In high power distance cultures, empirically, subordinates accept and approve of hierarchy.  So maybe hierarchy is just fine, even for the “losers” at the bottom?  But there’s a theory that subordinates claim to approve of hierarchy as a covert way of getting what power they can.   In other words, when you see peasants praising the benevolence of landowners, it’s not that they’re misled by the governing ideology, and not that they’re magically immune to suffering from poverty as we would in their place, but just that they see their situation as the best they can get, and a combination of flattery and (usually religious) guilt-tripping is their best chance for getting resources from the landowners.  So, no, I don’t think you can assume that hierarchy is wholly harmless just because it’s widely accepted in some societies. Being powerless is probably bad, physiologically and psychologically, for all social mammals.

But to what extent is hierarchy necessary?

Structurelessness and Structures

Nominally non-hierarchical organizations often suffer from failure modes that keep them from getting anything done, and actually wind up quite hierarchical in practice. I don’t endorse everything in Jo Freeman’s famous essay on the Tyranny of Structurelessness, but it’s important as an account of actual experiences in the women’s movement of the 1970s.

When organizations have no formal procedures or appointed leaders, everything goes through informal networks; this devolves into popularity contests, privileges people who have more free time to spend on gossip, as well as people who are more privileged in other ways (including economically), and completely fails to correlate decision-making power with competence.

Freeman’s preferred solution is to give up on total structurelessness and accept that there will be positions of power in feminist organizations, but to make those positions of power legible and limited, with methods derived from republican governance (which are also traditional in American voluntary organizations.)  Positions of authority should be limited in scope (there is a finite range of things an executive director is empowered to do), accountable to the rest of the organization (through means like voting and annual reports), and impeachable in cases of serious ethical violation or incompetence. This is basically the governance structure that nonprofits and corporations use, and (in my view) it helps make them, say, less likely to abuse their members than cults and less likely to break up over personal drama than rock bands.

Freeman, being more egalitarian than the republican tradition, also goes further with her recommendations and says that responsibilities should be rotated (so no one person has “ownership” over a job forever), that authority should be distributed widely rather than concentrated, that information should be diffused widely, and that everyone in the organization should have equal access to organizational resources.  Now, this is a good deal less hierarchical than the structure of republican governments, nonprofits, and corporations; it is still pretty utopian from the point of view of someone used to those forms of governance, and I find myself wondering if it can work at scale; but it’s still a concession to hierarchy relative to the “natural” structurelessness that feminist organizations originally envisioned.

Freeman says there is one context in which a structureless organization can work: a very small team (no more than five) of people who come from very similar backgrounds (so they can communicate easily), spend so much time together that they practically live together (so they communicate constantly), and are all capable of doing all “jobs” on the project (no need for formal division of labor.)  In other words, she’s describing an early-stage startup!

I suspect Jo Freeman’s model explains a lot about the common phenomenon of startups having “growing pains” when they get too large to work informally.  I also suspect that this is a part of how startups stop being “mission-driven” and ambitious — if they don’t add structure until they’re forced to by an outside emergency, they have to hurry, and they adopt a standard corporate structure and power dynamics (including the toxic ones, which are automatically imported when they hire a bunch of people from a toxic business culture all at once) instead of having time to evolve something that might achieve the founders’ goals better.

But Can It Scale? Historical Stateless Societies

So, the five-person team of friends is a non-hierarchical organization that can work.  But that’s not very satisfying for anti-authoritarian advocates, because it’s so small.  And, accordingly, an organization that small is usually poor — there are only so many resources that five people can produce.

(Technology can amplify how much value a single person can produce. This is probably why we see more informal cultures among people who work with high-leverage technology.  Software engineers famously wear t-shirts, not suits; Air Force pilots have a reputation as “hotshots” with lax military discipline compared to other servicemembers. Empowered with software or an airplane, a single individual can be unusually valuable, so  less deference is expected of the operators of high technology.)

When we look at historical anarchies or near-anarchies, we usually also see that they’re small, poor, or both.  We also see that within cultures, there is often surprisingly more freedom for women among the poor than among the rich.

Medieval Iceland from the tenth to thirteenth centuries was a stateless society, with private courts of law, and competing legislative assemblies (Icelanders could choose which assembly and legal code to belong to), but no executive branch or police.  (In this, it was an unusually pure form of anarchy but not unique — other medieval European polities had much more private enforcement of law than we do today, and police are a 19th-century invention.)

The medieval Icelandic commonwealth lasted long enough — longer than the United States — that it was clear this was a functioning system, not a brief failed experiment.  And it appears that it was less violent, not more, compared to other medieval societies.  Even when the commonwealth was beginning to break down in the thirteenth century, battles had low casualty rates, because every man killed still had to be paid for!  The death toll during the civil war that ended the commonwealth’s independence was only as high per capita as the current murder rate of the US.  While Christianization in neighboring Norway was a violent struggle, the question of whether to convert to Christianity in Iceland was decided peacefully through arbitration.  In this case, it seems clear that anarchy brought peace, not war.

However, medieval Iceland was small — only 50,000 people, confined to a harsh Arctic environment, and ethnically homogeneous.

Other historical and traditional stateless societies are and were also relatively poor and low in population density. The Igbo of Nigeria traditionally governed themselves by council and consensus, with no kings or chiefs, but rather a sort of village democracy.   This actually appears to be fairly common in small polities.  The Iroquois Confederacy governed by council and had no executive. (Note that the Iroquois are a hoe culture.)  The Nuer of Sudan, a pastoral society currently with a population of a few million, have traditionally had a stateless society with a system of feud law — they had judges, but no executives. There are many more examples — perhaps most familiar to Westerners, the society depicted in the biblical book of Judges appears to have had no king and no permanent war-leader, but only judges who would decide cases which would be privately enforced. In fact, stateless societies with some form of feud law seem to be a pretty standard and recurrent type of political organization, but mostly in “primitive” communities — horticultural or pastoral, low in population density.  This sounds like bad news for modern-day anarchists who don’t want to live in primitive conditions. None of these historical stateless societies, even the comparatively sophisticated Iceland, are urban cultures!

It’s possible that the Harappan civilization in Bronze Age India had no state, while it had cities that housed tens of thousands of people, were planned on grids, and had indoor plumbing.  The Harappans left no massive tombs, no palaces or temples, houses of highly uniform size (indicating little wealth inequality), no armor and few weapons (despite advanced metalworking), no sign of battle damage on the cities or violent death in human remains, and very minimal city walls.  The Harappan cities were commercial centers, and the Harappans engaged in trade along the coast of India and as far as Afghanistan and the Persian Gulf.  Unlike other similar river-valley civilizations (such as Mesopotamia), the Harappans had so much arable land, and their farmsteads were initially so spread out, that populations could grow steadily and sustain long-distance trade without having to resort to raiding, so they never developed a warrior class.  If so, this is a counterexample to the traditional story that all civilizations developed states (usually monarchies) as a necessary precondition to developing cities and grain agriculture.

Bali is another counterexample.  Rice farming in Bali requires complex coordination of irrigation. This was traditionally not organized by kings, but by subaks, religious and social organizations that manage the growing of rice, coordinated through a decentralized system of water temples and led by priests who kept a ritual calendar for timing irrigation.  While precolonial Bali was not an anarchy but a patchwork of small principalities, large public works like irrigation were not under state control.

So we have reason to believe that Bronze Age levels of technological development (cities, metalworking, intensive agriculture, literacy, long-distance trade, and high populations) can be developed without states, at scales involving millions of people, for centuries.  We also have much more abundant evidence, historical and contemporary, of informal governance-by-council and feud law existing stably at lower technology levels (for pastoralists and horticulturalists).  And, in special political circumstances (the Icelanders left Norway to settle a barren island, to escape the power of the Norwegian king, Harald Fairhair) an anarchy can arise out of a state society.

But we don’t have successful examples of anarchies at industrial tech levels. We know industrial-technology public works can be built by voluntary organizations (e.g. the railroads in the US) but we have no examples of them successfully resisting state takeover for more than a few decades.

Is there something about modern levels of high technology and material abundance that is incompatible with stateless societies? Or is it just that modern nation-states happened to already be there when the Industrial Revolution came around?

Women’s Status and Material Abundance

A very weird thing is that women’s level of freedom and equality seems almost to anticorrelate with a society’s wealth and technological advancement.

Horticultural (or “hoe culture”) societies are non-patriarchal and tend to allow women more freedom and better treatment in various ways than pre-industrial agricultural societies. For instance, severe mistreatment of women and girls like female infanticide, foot-binding, honor killings, or sati, and chastity-oriented restrictions on female freedom like veiling and seclusion, are common in agricultural societies and unknown in horticultural ones. But horticultural societies are poor in material culture and can’t sustain high population densities in most cases.

You also see unusual freedom for women in premodern pastoral cultures, like the Mongols. Women in the Mongol Empire owned and managed ordos, mobile cities of tents and wagons which also comprised livestock and served as trading hubs.  While the men focused on hunting and war, the women managed the economic sphere. Mongol women fought in battle, herded livestock, and occasionally ruled as queens.  They did not wear veils or bind their feet.

We see numerous accounts of ancient and medieval women warriors and military commanders among Germanic and Celtic tribes and steppe peoples of Central Asia.  There are also accounts of medieval European noblewomen who personally led armies. The pattern isn’t obvious, but there seem to be more accounts of women military leaders in pastoral societies or tribal ones than in large, settled empires.

Pastoralism, to a lesser extent than horticulture but still more than plow agriculture, gives women an active role in food production. Most pastoral societies today have a traditional division of labor in which men are responsible for meat animals and women are responsible for milk animals (as well as textiles).  Where women provide food, they tend to have more bargaining power.  Some pastoral societies, like the Tuareg, are even matrilineal; Tuareg women traditionally have more freedom, including sexual freedom, than they do in other Muslim cultures, and women do not wear the veil while men do.

Like horticulture, pastoralism is less efficient per acre at food production than agriculture, and thus does not allow high population densities. Pastoralists are poorer than their settled farming neighbors. This is another example of women being freer when they are also poorer.

Another weird and “paradoxical” but very well-replicated finding is that women are more different from men  in psychological and behavioral traits (like Big 5 personality traits, risk-taking,  altruism, participation in STEM careers) in richer countries than in poorer ones.  This isn’t quite the same as women being less “free” or having fewer rights, but it seems to fly in the face of the conventional notion that as societies grow richer, women become more equal to men.

Finally, within societies, it’s sometimes the case that poor women are treated better than rich ones.  Sarah Blaffer Hrdy writes about observing that female infanticide was much more common among wealthy Indian Rajput families than poor ones. And we know of many examples across societies of aristocratic or upper-class women being more restricted to the domestic sphere, married off younger, less likely to work, more likely to experience restrictive practices like seclusion or footbinding, than their poorer counterparts.

Hrdy explains why: in patrilineal societies, men inherit wealth and women don’t. If you’re a rich family, a son is a “safe” outcome — he’ll inherit your wealth, and your grandchildren through him will be provided for, no matter whom he marries. A daughter, on the other hand, is a risk. You’ll have to pay a dowry when she marries, and if she marries “down” her children will be poorer than you are — and at the very top of the social pyramid, there’s nowhere to marry but down.  This means that you have an incentive to avoid having daughters, and if you do have daughters, you’ll be very anxious to keep them from making a bad match, which means lots of chastity-enforcement practices. You’ll also invest more in your sons than in your daughters in general, because your grandchildren through your sons will have a better chance in life than your grandchildren through your daughters.

The situation reverses if you’re a poor family. Your sons are pretty much screwed; they can’t marry into money (since women don’t inherit.) Your daughters, on the other hand, have a chance to marry up. So your grandchildren through your daughters have better chances than your grandchildren through your sons, and you should invest more resources in your daughters than your sons. Moreover, you might not be able to afford restrictive practices that cripple your daughters’ ability to work for a living. To some extent, sexism is a luxury good.

A similar analysis might explain why richer countries have larger gender differences in personality, interests, and career choices.  A degree in art history might function as a gentler equivalent of purdah — a practice that makes a woman a more appealing spouse but reduces her earning potential. You expect to find such practices more among the rich than the poor.  (Tyler Cowen’s take is less jaundiced, and more general, but similar — personal choices and “personality” itself are more varied when people are richer, because one of the things people “buy” with wealth is the ability to make fulfilling but not strictly pragmatic self-expressive choices.)

Finally, all these “paradoxical” trends are countered by the big nonparadoxical trend — by most reasonable standards, women are less oppressed in rich liberal countries than in poor illiberal ones.  The very best countries for women’s rights are also the ones with the lowest power distance: Nordic and Germanic countries.

Is Hierarchy the Engine of Growth or a Luxury Good?

If you observe that the “freest” (least hierarchical, lowest power distance, least authoritarian, etc) functioning organizations and societies tend to be small, poor, or primitive, you could come to two different conclusions:

  1. Freedom causes poverty (in other words, non-hierarchical organization is worse than hierarchy at scaling to large organizations or rich, high-population societies)
  2. Hierarchy is expensive (in other words, only the largest organizations or richest societies can afford the greatest degree of authoritarianism.)

The first possibility is bad news for freedom. It means you should worry you can’t scale up to wealth for large populations without implementing hierarchies.  The usual mechanism proposed for this is the hypothesis that hierarchies are needed to coordinate large numbers of people in large projects.  Without governments, how would you build public works? Or guard the seas for global travel and shipping? Without corporate hierarchies, how would you get mass-produced products to billions of people?  Sure, goes the story, idealists have proposed alternatives to hierarchy, and we know of intriguing counterexamples like Harappa and Bali, but these tend to be speculative or small-scale and the success stories are sporadic.

The second possibility is (tentatively) good news for freedom.  It says that hierarchy is inefficient.  For instance, secluding women in harems wastes their productive potential. Top-down state control of the economy causes knowledge problems that limit economic productivity. The same problem applies to top-down control of decisionmaking in large firms.  Dominance hierarchies inhibit accurate transmission of information, which worsens knowledge problems and principal-agent problems (“communication is only possible between equals.”)  And elaborate displays of power and deference are costly, as nonproductive displays always are.  Only accumulations of large amounts of resources enable such wasteful activity, which benefits the top of the hierarchy in the short run but prevents the “pie” of total resources from growing.

This means that if you could just figure out a way to keep inefficient hierarchies from forming, you could grow systems to be larger and richer than ever.  Yes, historically, Western economies grew richer as states grew stronger — but perhaps a stateless society could be richer still.  Perhaps without the stagnating effects of rent-seeking, we could be hugely better off.

After all, this is kind of what liberalism did. It’s the big counter-trend to “wealth and despotism go together” — Western liberal-democratic countries are much richer and much less authoritarian (and less oppressive to women) than any pre-modern society, or than developing countries. One of the observations in Wealth of Nations is that countries with strong middle classes had more subsequent economic growth than countries with more wealth inequality — Smith uses England as an example of a fast-growing, equal society and China as an example of a stagnant, unequal one.

But this is only partial good news for freedom, after all. If hierarchies tend to emerge as soon as size, scale, and wealth arise, then that means we don’t have a solution to the problem of preventing them from emerging. On a model where any sufficiently large accumulation of resources begins to look attractive to “robber barons” who want to appropriate it and forcibly keep others out, we might hypothesize that a natural evolution of all human institutions is from an initial period of growth and value production towards inevitable value capture, stagnation, and decline.  We see a lack of freedom in the world around us, not because freedom can’t work well, but because it’s hard to preserve against the incursions of wannabe despots, who eventually ruin the system for everyone including themselves.

That model points the way to new questions, surrounding the kinds of governance that Jo Freeman talks about. By default an organization will succumb to inefficient hierarchy, and structureless organizations will succumb faster and to more toxic hierarchies. When designing governance structures, the question you want to ask is not just “is this a system I’d want to live under today?” but “how effective will this system be in the future at resisting the guys who will come along and try to take over and milk it for short-term personal gain until it collapses?”  And now we’re starting to sound like the rationale and reasoning behind the U.S. Constitution, though I certainly don’t think that’s the last word on the subject.

Book Recommendations: An Everyone Culture and Moral Mazes

Epistemic Status: Casual

I highly recommend An Everyone Culture, by Robert Kegan, and Moral Mazes, by Robert Jackall, as companion books on business culture. Moral Mazes is an anthropological study of the culture and implicit ethics of a few large corporations, and is an eye-opening illustration of the problems that arise in those corporations. An Everyone Culture is an introduction to the idea of a “deliberately developmental organization”, an attempt to fix those problems, plus some case studies of companies that implemented “deliberately developmental” practices.

The basic problem that both books observe in corporate life is that everybody in a modern office is trying to conceal their failures and present a misleadingly positive impression of themselves to their employers and coworkers.

This leads to lost productivity.

For instance:

  • The longer one tries to cover up a mistake, the costlier it will be to fix it.
  • The less accurately credit is allocated for success or failure, the harder it will be to incentivize good work.
  • The more employees misinform their bosses, the worse-informed the bosses’ decisions will be.
  • The more people are concerned with maintaining appearances, the less cognitive capacity they will have for productivity and creativity.
  • The more unacceptable it is to acknowledge “personal” concerns (emotions, physical health, intrinsic motivation or lack thereof), the harder it is to fix productivity problems that arise from “personal” problems.

Moral Mazes basically takes the view that the Protestant work ethic really died in the mid-to-late nineteenth century, when an American economy defined by small business owners and freelance professionals was replaced by an economy defined by larger firms and the rise of the managerial profession. The Protestant work ethic declared that hard work, discipline, and honesty would bring success. The “managerial work ethic” holds that a good employee has quite different “virtues” — things like

  • ability to play politics
  • loyalty & willingness to subordinate oneself to one’s manager
  • “flexibility” (the opposite of stubbornness — not holding strong individual opinions)

To give an outside example, the author of “The Western Elite from a Chinese Perspective” was coming from a “Protestant work ethic” culture of hard work (though not, of course, actually Protestant) and encountering the “managerial work ethic” culture of American office politics.

Moral Mazes relies on the author’s observations and interviews with managers. I’m sure it’s not a fully objective portrayal — perhaps the author selected the most damning quotes, and perhaps the most disgruntled and cynical managers were the most willing to talk.  But the picture the book gives is of a culture where:

  • rank is everything — contradicting your boss, especially in public, is career suicide, and deference to superiors is expected
  • beyond a certain minimum floor of competence, objective job performance doesn’t determine career success, political skill does
  • “credit flows upwards, details flow downwards” — higher-rank managers take credit for work done by their subordinates, and the higher-rank you are, the fewer object-level details you concern yourself with
  • mistakes and bad decisions are reliably concealed; then, when the inevitable catastrophe happens, whoever’s politically vulnerable takes the fall
  • managers are tested for their “flexibility” — someone with strong opinions about the best engineering decisions or with rigid ethical principles will not rise far in their career

If you watch The Marvelous Mrs. Maisel, Joel Maisel’s job at the plastics company is a classic example of the managerial work ethic; he’s basically a professional sycophant.  He’s burned out and unmotivated, and he leaves to “find himself” as a comedian, but quickly realizes he has no talent at comedy either.  Instead, working in his father’s garment business, he comes to life again.  He learns the nitty-gritty of the factory floor, the accounting, the machines, the seamstresses and their personal needs and strengths and weaknesses.  It’s a beautiful illustration of the difference between fake work and real work.

An Everyone Culture‘s prescription for the problems of deception, sycophancy, and stagnation in conventional companies is complex, but I’d summarize it as follows: creating a culture where everyone talks about mistakes and improvements, and where the personal/professional boundaries are broken down.

This sounds vaguely cultish and shocking, and indeed, the companies profiled (like Bridgewater) are often described as cults.  Kegan acknowledges that their practices are outside most of our comfort zones, but believes that nothing inside the range of what we think of as a normal workplace will solve workplace dysfunctions.

What distinguishes the companies profiled in the book is a lot of talk, about issues that would ordinarily be considered too “personal” for work.  When a failure occurs, a DDO looks for the root cause, as you would in a kaizen system, but it won’t stop there — people will also ask what personal or psychological issue caused the mistake.  Does this person have a tendency towards overconfidence that they need to work on?  Were they afraid of looking bad?  Do they need to learn to consider others’ feelings more?

It’s vulnerable to be laid bare in this way, but, at least in the ideal of a DDO, everyone does it, from the interns to the CEO, to the point that people internalize that having flaws and a personal life is nothing to hide. Some people would find this horrifically intrusive, but others find it a relief.

I’ve never worked in a DDO, but I think I might like it; with enough mandated transparency, I’d be forced to override the temptation to hide flaws and make myself look better, and could focus better on actually doing good work.

The cost, of course, is way more communication about seemingly non-work-related things. You’d be processing personal stuff with coworkers all the time. The hope is that this is actually cheaper than the costs of the bad decisions made when you don’t have enough honest communication, but it’s an empirical matter whether that works out in practice, and the authors don’t have data so far.

Contrite Strategies and The Need For Standards

Epistemic Status: Confident

There’s a really interesting paper from 1996 called The Logic of Contrition, which I’ll summarize here.  In it, the authors identify a strategy called “Contrite Tit For Tat”, which does better than either Pavlov or Generous Tit For Tat in the Iterated Prisoner’s Dilemma.

In Contrite Tit For Tat, the player doesn’t only look at what he and the other player played in the last round, but also at another variable, the standing of the players, which can be good or bad.

If Bob defected on Alice last round but Alice was in good standing, then Bob’s standing switches to bad, and Alice defects against Bob.

If Bob defected on Alice last round but Alice was in bad standing, then Bob’s standing stays good, and Alice cooperates with Bob.

If Bob cooperated with Alice last round, Bob keeps his good standing, and Alice cooperates.

This allows two Contrite Tit For Tat players to recover quickly from accidental defections without defecting against each other forever:

D/C -> C/D -> C/C

But, unlike Pavlov, it consistently resists the “always defect” strategy:

D/C -> D/D -> D/D -> D/D …
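
Here is a minimal sketch of those rules in code. The function names and the forced first-round moves are my own illustration, not notation from the paper; the point is just to show that the standing bookkeeping reproduces both transition sequences above.

```python
# Minimal sketch of Contrite Tit For Tat (cTFT), following the standing rules above.
# Moves are 'C' (cooperate) or 'D' (defect); standing is 'good' or 'bad'.

def ctft_move(my_standing, opp_standing):
    # Defect only as justified punishment: I am in good standing and the opponent is not.
    if my_standing == 'good' and opp_standing == 'bad':
        return 'D'
    return 'C'

def new_standing(old_standing, my_move, opp_standing):
    # Defecting on an opponent in good standing puts you in bad standing;
    # cooperating (i.e. accepting punishment) restores good standing.
    if my_move == 'D' and opp_standing == 'good':
        return 'bad'
    if my_move == 'C':
        return 'good'
    return old_standing

def trace(rounds, strat_a, strat_b, first_moves):
    """Trace a match, forcing the given round-one moves (e.g. an accidental defection)."""
    standings = ['good', 'good']
    moves = list(first_moves)
    history = []
    for _ in range(rounds):
        history.append('{}/{}'.format(*moves))
        standings = [new_standing(standings[0], moves[0], standings[1]),
                     new_standing(standings[1], moves[1], standings[0])]
        moves = [strat_a(standings[0], standings[1]),
                 strat_b(standings[1], standings[0])]
    return ' -> '.join(history)

always_defect = lambda my_standing, opp_standing: 'D'

print(trace(3, ctft_move, ctft_move, 'DC'))      # D/C -> C/D -> C/C (recovery)
print(trace(4, always_defect, ctft_move, 'DC'))  # D/C -> D/D -> D/D -> D/D (punishment)
```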

Like TFT (Tit For Tat) and unlike Pavlov and gTFT (Generous Tit For Tat), cTFT (Contrite Tit For Tat) can invade a population of all Defectors.

A related contrite strategy is Remorse.  Remorse cooperates only if it is in bad standing, or if both players cooperated in the previous round. In other words, Remorse is more aggressive; unlike cTFT, it can attack cooperators.

Against the strategy “always cooperate”, cTFT always cooperates but Remorse alternates cooperating and defecting:

C/C -> C/D -> C/C -> C/D …

And Remorse defends effectively against defectors:

D/C -> D/D -> D/D -> D/D…

But if one Remorse accidentally defects against another, recovery is more difficult:

C/D -> D/C -> D/D -> C/D -> …

If the Prisoner’s Dilemma is repeated a large but finite number of times, cTFT is an evolutionarily stable state in the sense that you can’t do better for yourself when playing against a cTFT player by doing anything that deviates from what cTFT would recommend. This implies that no other strategy can successfully invade a population of all cTFTs.

Remorse can sometimes be invaded by strategies better at cooperating with themselves, while Pavlov can sometimes be invaded by Defectors, depending on the payoff matrix; but for all Prisoner’s Dilemma payoff matrices, cTFT resists invasion.

Defector and a similar strategy called Grim Trigger (if a player ever defects on you, keep defecting forever) are evolutionarily stable, but not good outcomes — they result in much lower scores for everyone in the population than TFT or its variants.  By contrast, a whole population that adopts cTFT, gTFT, Pavlov, or Remorse on average gets the payoff from cooperating each round.

The bottom line is, adding “contrition” to TFT makes it quite a bit better, and allows it to keep pace with Pavlov in exploiting TFT’s, while doing better than Pavlov at exploiting Defectors.

This is no longer true if we add noise in the perception of good or bad standing; contrite strategies, like TFT, can get stuck defecting against each other if they erroneously perceive bad standing.

The moral of the story is that there’s a game-theoretic advantage to not only having reciprocity (TFT) but standards (cTFT), and in fact reciprocity alone is not enough to outperform strategies like Pavlov which don’t map well to human moral maxims.

What do I mean by standards?

There’s a difference between saying “Behavior X is better than behavior Y” and saying “Behavior Y is unacceptable.”

The concept of “unacceptable” behavior functions like the concept of “standing” in the game theory paper.  If I do something “unacceptable” and you respond in some negative way (you get mad or punish me or whatever), I’m not supposed to retaliate against your negative response, I’m supposed to accept it.

Pure reciprocity results in blood feuds — “if you kill one of my family I’ll kill one of yours” is perfectly sound Tit For Tat reasoning, but it means that we can’t stop killing once we’ve started.

Arbitrary forgiveness fixes that problem and allows parties to reconcile even if they’ve been fighting, but introduces the new problem that now you’re vulnerable to an attacker who just won’t quit.

Contrite strategies are like having a court system. (Though not an enforcement system!  They are still “anarchist” in that sense — all cTFT bots are equal.)  The “standing” is an assessment attached to each person of whether they are in the wrong and thereby restricted in their permission to retaliate.

In general, for actions not covered by the legal system and even for some that are, we don’t have widely shared standards of acceptable vs. unacceptable behavior.  We’re aware (and especially so given the internet) that these standards differ from subculture to subculture and context to context, and we’re often aware that they’re arbitrary, and so we have enormous difficulty getting widely shared clarity on claims like “he was deceptive and that’s not OK”.  Because…was he deceptive in a way that counts as fraud? Was it just “puffery” of the kind that’s normal in PR?  Was it a white lie to spare someone’s feelings?  Was it “just venting” and thus not expected to be as nuanced or fact-checked as more formal speech?  What level or standard of honesty could he reasonably have been expected to be living up to?

We can’t say “that’s not OK” without some kind of understanding that he had failed to live up to a shared expectation.  And where is that bar?  It’s going to depend who you ask and what local context they’re living in.  And not only that: because nobody is keeping track of where even the separate, local standards are, standards will eventually be dropped to the lowest common denominator unless they’re made explicit.

MBTI isn’t science but it’s illustrative descriptively, and it seems to me that the difference between “Perceivers” and “Judgers”, which is basically the difference between the kinds of people who get called “judgmental” in ordinary English and the people who don’t, is that “Judgers” have a clear idea of where the line is between “acceptable” and “unacceptable” behavior, while Perceivers don’t.  I’m a Perceiver, and I’ve often had this experience where someone is saying “that’s just Not OK” and I’m like “whoa, where are you getting that? I can certainly see that it’s suboptimal, this other thing would be better, but why are you drawing the line for acceptability here instead of somewhere else?”

The lesson of cTFT is that having a line in the first place, having a standard that you can either be in line with or in violation of, has survival value.

 

The Pavlov Strategy

Epistemic Status: Common knowledge, just not to me

The Evolution of Trust is a deceptively friendly little interactive game.  Near the end, there’s a “sandbox” evolutionary game theory simulator. It’s pretty flexible. You can do quick experiments in it without writing code. I highly recommend playing around.

One of the things that surprised me was a strategy the game calls Simpleton, also known in the literature as Pavlov.  In certain conditions, it works pretty well — even better than tit-for-tat or tit-for-tat with forgiveness.

Let’s set the framework first. You have a Prisoner’s Dilemma-type game.

  • If both parties cooperate, they each get +2 points.
  • If one cooperates and the other defects, the defector gets +3 points and the cooperator gets -1 point.
  • If both defect, both get 0 points.

This game is iterated — you’re randomly assigned to a partner and you play many rounds.   Longer matches (more rounds with the same partner) reward more cooperative strategies; shorter matches reward more defection.

It’s also evolutionary — you have a proportion of bots each playing their strategies, and after each round of matches, the bots with the most points replicate and the bots with the fewest points die out.  Successful strategies will tend to reproduce while unsuccessful ones die out.  In other words, this is the Darwin Game.

Finally, it’s stochastic — there’s a small probability that any bot will make a mistake and cooperate or defect at random.
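
Here is a minimal sketch of that setup, using the payoff values from the list above. The function names, the match length, and the 5% noise rate are arbitrary choices of mine for illustration, not parameters taken from The Evolution of Trust.

```python
import random

# Payoffs from the game described above, keyed by (my move, their move).
PAYOFF = {('C', 'C'): (2, 2),
          ('C', 'D'): (-1, 3),
          ('D', 'C'): (3, -1),
          ('D', 'D'): (0, 0)}

NOISE = 0.05  # chance that a bot's intended move gets flipped (the stochastic part)

def noisy(move):
    return ('D' if move == 'C' else 'C') if random.random() < NOISE else move

def play_match(strategy_a, strategy_b, rounds=10):
    """Iterated play between two strategies.  A strategy maps
    (my last move, their last move) -> next move, with None on the first round."""
    score_a = score_b = 0
    last_a = last_b = None
    for _ in range(rounds):
        move_a = noisy(strategy_a(last_a, last_b))
        move_b = noisy(strategy_b(last_b, last_a))
        gain_a, gain_b = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + gain_a, score_b + gain_b
        last_a, last_b = move_a, move_b
    return score_a, score_b

always_cooperate = lambda my_last, their_last: 'C'
print(play_match(always_cooperate, always_cooperate))  # near (20, 20), less wherever noise flips a move
```

The evolutionary replication step (copy the highest scorers, delete the lowest) sits on top of this match loop and is omitted here for brevity.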

Now, how does Pavlov work?

Pavlov starts off cooperating.  If the other player cooperates with Pavlov, Pavlov keeps doing whatever it’s doing, even if it was a mistake; if the other player defects, Pavlov switches its behavior, even if it was a mistake.

In other words, Pavlov:

  • cooperates when you cooperate with it, except by mistake
  • “pushes boundaries” and keeps defecting when you cooperate, until you retaliate
  • “concedes when punished” and cooperates after a defect/defect result
  • “retaliates against unprovoked aggression”, defecting if you defect on it while it cooperates.
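
Here is a minimal sketch of that rule (the names are mine; Pavlov is the memory-one strategy often called “win-stay, lose-shift”), along with a hand-trace of what happens after one accidental defection:

```python
def pavlov(my_last, their_last):
    if my_last is None:          # first round: cooperate
        return 'C'
    if their_last == 'C':        # opponent cooperated: keep doing whatever I did
        return my_last
    return 'D' if my_last == 'C' else 'C'   # opponent defected: switch

def tit_for_tat(my_last, their_last):
    return 'C' if their_last is None else their_last   # copy the opponent's last move

def trace(strategy, forced_first_moves, rounds=4):
    """Two copies of the same strategy; round one is forced (simulating a mistake)."""
    moves = list(forced_first_moves)
    history = []
    for _ in range(rounds):
        history.append('{}/{}'.format(*moves))
        moves = [strategy(moves[0], moves[1]), strategy(moves[1], moves[0])]
    return ' -> '.join(history)

print(trace(pavlov, 'CD'))       # C/D -> D/D -> C/C -> C/C : Pavlovs re-converge to cooperation
print(trace(tit_for_tat, 'CD'))  # C/D -> D/C -> C/D -> D/C : TFTs echo the defection forever
```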

If there’s any randomness, Pavlov is better at cooperating with itself than Tit-For-Tat. One accidental defection and two Tit-For-Tats are stuck in an eternal defect cycle, while Pavlovs forgive each other and wind up back in a cooperate/cooperate pattern.

Moreover, Pavlov can exploit CooperateBot (if it defects by accident, it will keep greedily defecting against the hapless CooperateBot, while Tit-For-Tat will not) but still exerts some pressure against DefectBot (defecting against it half the time, compared to Tit-For-Tat’s consistent defection.)

The interesting thing is that Pavlov can beat Tit-For-Tat or Tit-for-Tat-with-Forgiveness in a wide variety of scenarios.

If there are only Pavlov and Tit-For-Tat bots, Tit-For-Tat has to start out outnumbering Pavlov quite significantly in order to win. The same is true for a population of Pavlov and Tit-For-Tat-With-Forgiveness.  It doesn’t change if we add in some Cooperators or Defectors either.

Why?

Compared to Tit-For-Tat, Pavlov cooperates better with itself.  If two Tit-For-Tat bots are paired, and one of them accidentally defects, they’ll be stuck in an eternal defection cycle.  However, if one Pavlov bot accidentally defects against its clone, we’ll see

C/D -> D/D -> C/C

which recovers a mutual-cooperation equilibrium and picks up more points.

Compared to Tit-For-Tat-With-Forgiveness, Pavlov cooperates *worse* with itself (it takes longer to recover from mistakes) but it “exploits” TFTWF’s patience better. If Pavlov accidentally defects against TFTWF, the result is

D/C -> D/C -> D/D -> C/D -> D/D -> C/C,

which leaves Pavlov with a net gain of 1 point per turn (over the first five turns, before a cooperative equilibrium is reached), compared to TFTWF’s 1/5 point per turn.

If TFTWF accidentally defects against Pavlov, the result is

C/D -> D/C -> D/C -> D/D -> C/D

which cycles eternally (until the next mistake), getting Pavlov an average of 5/4 points per turn, compared to TFTWF’s 1/4 point per turn.

Either way, Pavlov eventually overtakes TFTWF.
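
As a sanity check on those per-turn figures, here is the arithmetic spelled out against the payoff matrix from the setup above (Pavlov is the first letter of each pair; the helper function is mine):

```python
# Payoffs from the setup above: (C,C) -> 2,2; (C,D) -> -1,3; (D,C) -> 3,-1; (D,D) -> 0,0.
PAYOFF = {('C', 'C'): (2, 2), ('C', 'D'): (-1, 3),
          ('D', 'C'): (3, -1), ('D', 'D'): (0, 0)}

def per_turn(trace):
    rounds = [tuple(pair.split('/')) for pair in trace.split(' -> ')]
    return tuple(sum(PAYOFF[r][i] for r in rounds) / len(rounds) for i in (0, 1))

# Pavlov accidentally defects on TFTWF: the first five turns, before cooperation resumes.
print(per_turn('D/C -> D/C -> D/D -> C/D -> D/D'))  # (1.0, 0.2): 1 vs 1/5 points per turn
# TFTWF accidentally defects on Pavlov: the four-turn cycle that then repeats.
print(per_turn('D/C -> D/C -> D/D -> C/D'))         # (1.25, 0.25): 5/4 vs 1/4 points per turn
```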

If you add enough DefectBots to a mix of Pavlovs and TFTs (and it has to be a large majority of the total population being DefectBots), TFT can win, because it’s more resistant against DefectBots than Pavlov is.  Pavlov cooperates with DefectBots half the time; TFT never does except by mistake.

Pavlov isn’t perfect, but it performs well enough to hold its own in a variety of circumstances.  An adapted version of Pavlov won the 2005 iterated game theory tournament.

Why, then, don’t we actually talk about it, the way we talk about Tit-For-Tat?  If it’s true that moral maxims like the Golden Rule emerge out of the fact that Tit-For-Tat is an effective strategy, why aren’t there moral maxims that exemplify the Pavlov strategy?  Why haven’t I even heard of Pavlov until now, despite having taken a game theory course once, when everybody has heard of Tit-For-Tat and has an intuitive feeling for how it works?

In Wedekind and Milinski’s 1996 experiment with human subjects, playing an iterated prisoner’s dilemma game, a full 70% of them engaged in Pavlov-like strategies.  The human Pavlovians were smarter than a pure Pavlov strategy — they eventually recognized the DefectBots and stopped cooperating with them, while a pure-Pavlov strategy never would — but, just like Pavlov, the humans kept “pushing boundaries” when unopposed.

Moreover, humans basically divided themselves into Pavlovians and Tit-For-Tat-ers; they didn’t switch strategies between game conditions where one strategy or another was superior, but just played the same way each time.

In other words, it seems fairly likely not only that Pavlov performs well in computer simulations, but that humans do have some intuitive model of Pavlov.  And, even more suggestively, it might be that “there are two kinds of people” — some people always play Pavlov while others always play Tit-For-Tat.

Human players are more likely to use generous Tit-For-Tat strategies rather than Pavlov when they have to play a working-memory game at the same time as they’re playing iterated Prisoner’s Dilemma.  In other words, Pavlov is probably more costly in working memory than generous Tit for Tat.

If you look at all 16 theoretically possible strategies that only have memory of the previous round, and let them evolve, evolutionary dynamics can wind up quite complex and oscillatory.

A population of TFT players will be invaded by more “forgiving” strategies like Pavlov, who in turn can be invaded by DefectBot and other uncooperative strategies, which again can be invaded by TFT, which thrives in high-defection environments.  If you track the overall rate of cooperation over time, you get very regular oscillations, though these are quite sensitive to variation in the error and mutation rates and nonperiodic (chaotic) behavior can occur in some regimes.
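
For concreteness, a deterministic memory-one strategy is just a lookup table from the previous round’s outcome to a next move, which is where the count of 16 comes from (the choice of opening move is counted separately). A small sketch, labeling a few familiar ones:

```python
from itertools import product

# Outcomes of the previous round, as (my last move, their last move).
OUTCOMES = [('C', 'C'), ('C', 'D'), ('D', 'C'), ('D', 'D')]

# Every deterministic memory-one strategy picks C or D for each of the four outcomes.
all_memory_one = [dict(zip(OUTCOMES, responses))
                  for responses in product('CD', repeat=4)]
print(len(all_memory_one))  # 16

# A few familiar ones, written as response patterns over (CC, CD, DC, DD):
tit_for_tat = dict(zip(OUTCOMES, 'CDCD'))  # copy the opponent's last move
pavlov      = dict(zip(OUTCOMES, 'CDDC'))  # win-stay, lose-shift
all_defect  = dict(zip(OUTCOMES, 'DDDD'))
all_coop    = dict(zip(OUTCOMES, 'CCCC'))
```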

This is strangely reminiscent of Peter Turchin’s theory of secular cycles in history.  Periods of peace and prosperity alternate with periods of conflict and poverty; empires rise and fall.  Periods of low cooperation happen at the fall of an empire/state/civilization; this enables new empires to rise when a subgroup has better ability to cooperate with itself and fight off its enemies than the surrounding warring peoples; but in peacetime, at the height of an empire, more forgiving and exploitative strategies like Pavlov can emerge, which themselves are vulnerable to the barbaric defectors.  This is a vastly simplified story compared to the actual mathematical dynamics or the actual history, of course, but it’s an illustrative gist.

The big takeaway from learning about evolutionary game theory is that it’s genuinely complicated from a player’s perspective.

“It’s complicated” sometimes functions as a curiosity-stopper; you conclude “more research is needed” instead of looking at the data you have and drawing preliminary conclusions, if you want to protect your intellectual “territory” without putting yourself out of a job.

That isn’t the kind of “complexity” I’m talking about here.  Chaos in dynamical systems has a specific meaning: the system is so sensitive to initial conditions that even a small measurement error in determining where it starts means you cannot even approximately predict where it will end up.

“Chaos: When the present determines the future, but the approximate present does not approximately determine the future.”

Optimal strategy depends sensitively on who else is in the population, how many errors you make, and how likely strategies are to change (or enter or leave).  There are a lot of moving parts here.

Argue Politics* With Your Best Friends

Epistemic Status: I endorse this strongly but don’t think I’m being original or clever at all.

Until recently — yesterday, in fact — I was seriously wrong about something.

I thought that it was silly when I saw people spending lots of energy arguing with their closest friends who almost completely agreed with them, but not quite.

That’s some People’s Front Of Judaea shit, I thought.  Don’t you know that guy you’re arguing with so vehemently is your friend?  He likes you!  He’s a pretty good guy!  He even shares your values and models, almost completely! He’s only wrong about this one, itty bitty, relatively abstract thing!

Meanwhile, there are people out there in the world who don’t share your values. And there are people out there who are actually evil and do awful things.

It’s like “ok, saying mean things about Muslims can be bad, but being a Muslim terrorist is a hell of a lot worse! Why do the people who are so quick to penalize Islamophobic speech never have anything bad to say about actual mass murder?  C’mon, get a sense of proportion!”

I still think, obviously, that really bad actions are worse than slightly bad actions.

But I was seriously misunderstanding why people argue with their close friends.

Have you noticed my mistake yet?  Give it a moment.

. . .

. . .

. . .

Ok, here it is.

Arguing is not a punishment.

Again.

Arguing is not a punishment.

Sure, serious wrongdoing should be penalized, and socially disapproved of, more than mild wrongdoing.   (Murder is worse than prejudiced speech.)

Also, fixing big problems should take priority over fixing little problems. (Saving money on rent is worth more of your attention than saving money on apples.)

But let’s frame it differently.

Cooperation is really valuable. Stable cooperation, that is; when even in the future, when you know each other better, and you’ve had more time to think, you’ll still want to cooperate.

Trust is really valuable, and scarce.  Justified trust, that is; when you can rely on what somebody says to be true and base your decisions on information you get from them.

Having “true friends” — people you can cooperate with and trust, stably, to a high degree — is valuable.

Yeah, you can get along and even thrive in a low-trust environment if you have the right skills for it.  Havamal, the medieval Icelandic wisdom literature, attributed to the god Odin, is my favorite advice for how to be a savvy customer in a low-trust world. (Exercise for the reader: think about how it applies to the replication crisis in science.) But especially in a low-trust world, true friends are valuable, as Havamal will remind you again and again.

How do you get more trust and cooperation with your friends?

It’s a hard problem; I haven’t solved it or even really started trying yet, the following are just ideas at the conceptual level rather than things I’ve found successful.

But communicating with them to get on the same page is clearly part of the puzzle.  Cooperation means “you and I agree to do X, and then we follow through and actually do X.”  The part about willingness to follow through is about loyalty, conscientiousness, motivation, integrity, all those kinds of virtues.  The part about agreeing to do X, though?  That’s not possible unless you both clearly understand what X is, which is much harder than it sounds!  It takes a lot of discussion, in my experience and from what I’ve heard, to get people on the same page about what exactly they’ve committed to doing.

Moreover, if I don’t understand why X is so important to you, and I say “yeah, ok, sure, X”, and then I go home and back to my life, but X still seems pointless to me, then I’m going to be less motivated to do X.

Because we didn’t have the argument about “is X pointless or not?”

We didn’t resolve it. We let it drop, to be nice, because we’re friends and we like each other.  But we didn’t get on the same page, and now a ball got dropped and you’re unhappy with me.

That getting-on-the-same-page process is not a punishment.

It’s something you’d only do with a friend close enough that you really might cooperate on work that you care about getting done.  (Mundane example: household chores.  Gotta get on the same page about who’s responsible for what!  Negotiating for fewer/different responsibilities is better than shirking!  That can be a really hard thing to internalize, though.)

“I spend more time communicating and getting on the same page with my friends than I do on having discussions with people I hate” — frame it that way, and suddenly that doesn’t sound like pointless infighting, it sounds mature and practical, right?

Of course you’d focus most on clarifying communication with your closest friends! They’re the people you’re most likely to be able to cooperate with!

Ok, so what kind of agreement is most valuable and attainable?  After all, nobody, even your closest friends, agrees with you on everything.

Short term, the answer is obvious: agreement on the details that are practical and relevant to the tasks you share.  Share an apartment?  Gotta come to agreement on chores, and share world-models relevant for those. (It’s no good if I agree to sweep but I don’t know where we keep the broom.)

But how about the long-run and more meta problem of living in a low-cooperation world itself?

Here’s one example: we’re in a real trade war with China now. Chinese investment in the US dropped 92 percent in the first half of 2018!  I’ve tuned out financial markets for most of my life, but I’m essentially a professional fundraiser now, and let me tell you, a drop in Chinese-US investment that drastic affects a US organization’s ability to raise capital.  Trade wars, like real wars, can come along all of a sudden and destroy value. Cooperation in this sense is less about singing kumbaya and more about not taking a wrecking ball to your own house.  The Hobbesian war of all against all ruins things that people were trying to build.

You want collaborators on fixing that kind of a problem?

The relevant things to agree (and disagree!) on are about the nature of cooperation and trust themselves. How are alliances and coalitions formed and maintained and broken?  How, and how well, do enforcement mechanisms and incentive strategies work?  You can think of these questions through the lenses of a number of fields:

  • game theory
  • evolutionary psychology
  • some branches of economics (mechanism design, public choice, price theory in general)
  • international relations (I know none of this)
  • Marxism (I haven’t read Marx either, but I’ve heard that his class analysis can be seen as applied iterated game theory, where a “class” refers to a coalition)

In all cases, the things to get on the same page about are positive rather than normative, and concern fundamental theory rather than immediate policy.

We want long-term cooperation, right? That means fundamentals need to be gotten right. Why? If you focus on object-level policy, it’s too easy for your friend to concur without agreeing (“I agree we should do X, but not with your reason for doing X”), which means that on the next policy question that comes up, your friend might not even concur!

(I have a friend — a good guy! a smart guy! — who concurs with me on 100% of object-level political controversies, and in every case, he concurs for a reason I think is dumb.  You may know someone like that too.  For the purposes of building long-term cooperation, your friend Mr. Concur is harder to get on the same page with, and thus lower priority to have discussions with, than your friend Ms. Dissent, who starts with the same premises as you but takes them in a totally different direction.  This is counterintuitive, because often you will initially get along better with Mr. Concur! That is because the mechanism that produces “getting along with” and makes friendships closer or weaker is itself a short-term, object-level policy! For instance, people in the same political tribe are nicer to each other.)

So, that’s why fundamental principles, not immediate policy.

Why positive and not normative?  So you’ll avoid unnecessary hostility.

Hostility, after all, in game-theory-land, is what it feels like from the inside to decide that your interests are opposed to someone else’s.  You can come to this conclusion mistakenly.  To avoid becoming hostile by mistake, first try to clearly understand and communicate what the landscape of interests and incentives even looks like.  That’s what professional negotiators harp on all the time — more often than most people assume, it’s in your interests to keep asking clarifying questions, and to stay cordial enough to keep talking, until you understand wtf is going on, because that increases the odds you’ll find a mutually agreeable deal, should one exist.  (Notwithstanding this, there are cases in which obfuscating your negotiating position is in your interest.  That’s less true, I expect, the more meta you go.  Another reason to start with foundations rather than policies.)

Sticking around for a technical discussion is, itself, a gesture of trust. It invests resources.

That’s why it’s hard to get this stuff started. As I write this, I haven’t washed up yet, I’m not cleaning the house or reading science papers or adding stuff to the LRI blog, and I’m ignoring my baby (who, luckily, is happily playing with his toys and smiling at me every so often.)  I’m of the opinion that laying these things out in writing is one of the better ways I have to start coordinated conversations, but, let’s be real, it does involve being a little…spendthrift.  Feeling like “sure, I can afford to do this.”  I’m also reading Law’s Order, currently.  That’s also a resource investment into this whole maybe-doomed “understand the micro-foundations of politics” goal, and it also looks kinda like goofing off, and lookit, aren’t there already economists for this who do it better?  I’m in a remarkably privileged position at the moment when I have a bunch of time flexibility, and something tells me that this is one of the ways I want to be using it.  It is kind of the future of humanity, after all.  But actually spending hours chatting merrily — or furiously — with a friend about what is effectively politics for nerds — well, that’s what people usually call “wasting time”, isn’t it?

It’s not a waste if you do it well.  But I get that there are a lot of incentives pushing against it.

What friendly theory talk has going for it is the very long term — getting to be the future’s equivalent of Confucius or Boethius and their friends, or maybe even the Amoraim— and the very short term, in which it’s fun to hang out with your friends and talk about interesting things and have some sense that you’re getting somewhere.

Example question to explore:

The nitty-gritty of the  “forgiveness” part of “tit-for-tat-with-forgiveness” in iterated games.  There are a lot of slightly different variants of this, I know, which are viable enough to see play.  Algorithms for recovery of cooperation after defection — how do different ones work? Advantages or disadvantages?  Do any of them correspond to known human behaviors or historical/current institutions?  As a practical matter, what kind of heuristics do people use as to whether or how to revive relationships with friends that have grown distant, pitch to leads that have gone cold, collect debts that have gone unpaid for a long time, etc?

 

Player vs. Character: A Two-Level Model of Ethics

Epistemic Status: Confident

This idea is actually due to my husband, Andrew Rettek, but since he doesn’t blog, and I want to be able to refer to it later, I thought I’d write it up here.

In many games, such as Magic: The Gathering, Hearthstone, or Dungeons and Dragons, there’s a two-phase process. First, the player constructs a deck or character from a very large sample space of possibilities.  This is a particular combination of strengths and weaknesses and capabilities for action, which the player thinks can be successful against other decks/characters or at winning in the game universe.

The choice of character often determines the strategies that character can use in the second phase, which is actual gameplay.  In gameplay, the character can only use the affordances that it’s been previously set up with.

This means that there are two separate places where a player needs to get things right: first, in designing a strong character/deck, and second, in executing the optimal strategies for that character/deck during gameplay.

(This is in contrast to games like chess or go, which are single-level; the capacities of black and white are set by the rules of the game, and the only problem is how to execute the optimal strategy. Obviously, even single-level games can already be complex!)

The idea is that human behavior works very much like a two-level game.

The “player” is the whole mind, choosing subconscious strategies.  The “elephant”, not the “rider.”  The player is very influenced by evolutionary pressure; it is built to direct behavior in ways that increase inclusive fitness.  The player directs what we perceive, do, think, and feel.

The player creates what we experience as “personality”, fairly early in life; it notices what strategies and skills work for us and invests in those at the expense of others.  It builds our “character sheet”, so to speak.

Note that even things that seem like “innate” talents, like the savant skills or hyperacute senses sometimes observed in autistic people, can be observed to be tightly linked to feedback loops in early childhood. In other words, savants practice the thing they like and are good at, and gain “superhuman” skill at it.  They “practice” along a faster and more hyperspecialized path than what we think of as a neurotypical “practicing hard,” but it’s still a learning process.  Savant skills are more rigidly fixed and seemingly “automatic” than non-savant skills, but they still change over time — e.g. Stephen Wiltshire, a savant artist who manifested an ability to draw hyper-accurate perspective drawings in early childhood, has changed and adapted his art style as he grew up, and even acquired new savant talents in music.  If even savant talents are subject to learning and incentives/rewards, certainly ordinary strengths, weaknesses, and personality types are likely to be “strategic” or “evolved” in this sense.

The player determines what we find rewarding or unrewarding.  The player determines what we notice and what we overlook; things come to our attention if it suits the player’s strategy, and not otherwise.  The player gives us emotions when it’s strategic to do so.  The player sets up our subconscious evaluations of what is good for us and bad for us, which we experience as “liking” or “disliking.”

The character is what executing the player’s strategies feels like from the inside.  If the player has decided that a task is unimportant, the character will experience “forgetting” to do it.  If the player has decided that alliance with someone will be in our interests, the character will experience “liking” that person.  Sometimes the player will notice and seize opportunities in a very strategic way that feels to the character like “being lucky” or “being in the right place at the right time.”

This is where confusion often sets in. People will protest “but I did care about that thing, I just forgot” or “but I’m not that Machiavellian, I’m just doing what comes naturally.”  This is true, because when we talk about ourselves and our experiences, we’re speaking “in character”, as our character.  The strategy is not going on at a conscious level. In fact, I don’t believe we (characters) have direct access to the player; we can only infer what it’s doing, based on what patterns of behavior (or thought or emotion or perception) we observe in ourselves and others.

Evolutionary psychology refers to the player’s strategy, not the character’s. (It’s unclear which animals even have characters in the way we do; some animals’ behavior may all be “subconscious”.)  So when someone speaking in an evolutionary-psychology mode says that babies are manipulating their parents to not have more children, for instance, that obviously doesn’t mean that my baby is a cynically manipulative evil genius.  To him, it probably just feels like “I want to nurse at night. I miss Mama.”  It’s perfectly innocent. But of course, this has the effect that I can’t have more children until I wean him, and that’s in his interest (or, at least, it was in the ancestral environment, when food was more scarce).

Szaszian or evolutionary analysis of mental illness is absurd if you think of it as applying to the character — of course nobody wakes up in the morning and decides to have a mental illness. It’s not “strategic” in that sense. (If it were, we wouldn’t call it mental illness, we’d call it feigning.)  But at the player level, it can be fruitful to ask “what strategy could this behavior be serving the person?” or “what experiences could have made this behavior adaptive at one point in time?” or “what incentives are shaping this behavior?”  (And, of course, externally visible “behavior” isn’t the only thing the player produces: thoughts, feelings, and perceptions are also produced by the brain.)

It may make more sense to frame it as “what strategy is your brain executing?” rather than “what strategy are you executing?” since people generally identify as their characters, not their players.

Now, let’s talk morality.

Our intuitions about praise and blame are driven by moral sentiments. We have emotional responses of sympathy and antipathy towards behavior we approve or disapprove of. These are driven by the player, which creates incentives and strategic behavior patterns for our characters to play out in everyday life.  The character engages in coalition-building with other characters, forms and breaks alliances with them, honors and shames characters according to their behavior, signals to other characters, etc.

When we, speaking as our characters, say “that person is good” or “that person is bad”, we are making one move in an overall strategy that our players have created.  That strategy is the determination of when, in general, we will call things or people “good” or “bad”.

This is precisely what Nietzsche meant by “beyond good and evil.”  Our notions of “good” and “evil” are character-level notions, encoded by our players.

Imagine that somewhere in our brains, the player has drawn two cartoons, marked “hero” and “villain”, that we consult whenever we want to check whether to call another person “good” or “evil.” (That’s an oversimplification, of course; it’s just for illustrative purposes.)  Now, is the choice of cartoons itself good or evil?  Well, the character checks… “Ok, is it more like the hero cartoon or the villain cartoon?”  The answer is “ummmm… type error.”

The player is not like a hero or a villain. It is not like a person at all, in the usual (character-level) sense. Characters have feelings! Players don’t have feelings; they are beings of pure strategy that create feelings.  Characters can have virtues or vices! Players don’t; they create virtues or vices, strategically, when they build the “character sheet” of a character’s skills and motivations.  Characters can be evaluated according to moral standards; players set those moral standards.  Players, compared to us characters, are hyperintelligent Lovecraftian creatures that we cannot relate to socially.  They are beyond good and evil.

However! There is another, very different sense in which players can be evaluated as “moral agents”, even though our moral sentiments don’t apply to them.

We can observe what various game-theoretic strategies do and how they perform.  Some, like “tit for tat”, perform well on the whole.  Tit-for-tat-playing agents cooperate with each other. They can survive pretty well even if there are different kinds of agents in the population; and a population composed entirely of tit-for-tat-ers is stable and well-off.

While we can’t call cellular automata performing game strategies “good guys” or “bad guys” in a sentimental or socially-judgmental way (they’re not people), we can totally make objective claims about which strategies dominate others, or how strategies interact with one another. This is an empirical and theoretical field of science.
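As a toy illustration (a from-scratch sketch of the standard iterated prisoner’s dilemma, not a reproduction of any particular published tournament), here is what such an objective comparison looks like in Python. The payoff numbers are the usual textbook values, and the strategy functions are written just for this example.

    # Toy iterated prisoner's dilemma; "C" = cooperate, "D" = defect.
    # Payoffs are the standard textbook values: (player A's points, player B's points).
    PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
               ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

    def tit_for_tat(my_moves, their_moves):
        # Cooperate on the first round, then copy the opponent's last move.
        return "C" if not their_moves else their_moves[-1]

    def always_defect(my_moves, their_moves):
        return "D"

    def play(strategy_a, strategy_b, rounds=100):
        moves_a, moves_b = [], []
        score_a = score_b = 0
        for _ in range(rounds):
            a = strategy_a(moves_a, moves_b)
            b = strategy_b(moves_b, moves_a)
            pay_a, pay_b = PAYOFFS[(a, b)]
            score_a, score_b = score_a + pay_a, score_b + pay_b
            moves_a.append(a)
            moves_b.append(b)
        return score_a, score_b

    # Two tit-for-tat players settle into mutual cooperation: (300, 300).
    print(play(tit_for_tat, tit_for_tat))
    # Against a pure defector, tit-for-tat gets burned once and then holds even: (99, 104).
    print(play(tit_for_tat, always_defect))

None of this labels the defector a “bad guy”; it just records, empirically, how the strategies fare against each other.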

And there is a kind of “morality” which I almost hesitate to call morality because it isn’t very much like social-sentiment-morality at all, but which is very important, and which simply distinguishes the performance of different strategies.  Not “like the hero cartoon” or “like the villain cartoon”, but “win” and “lose.”

At this level you can say “look, objectively, people who set up their tables of values in this way, calling X good and Y evil, are gonna die.”  Or “this strategy is conducting a campaign of unsustainable exploitation, which will work well in the short run, but will flame out when it runs out of resources, and so it’s gonna die.”  Or “this strategy is going to lose to that strategy.”  Or “this strategy is fine in the best-case scenario, but it’s not robust to noise, and if there are any negative shocks to the system, it’s going to result in everybody dying.”

“But what if a losing strategy is good?” Well, if you are in that value system, of course you’ll say it’s good.  Also, you will lose.

Mother Teresa is a saint, in the literal sense: she was canonized by the Roman Catholic Church. Also, she provided poor medical care for the sick and destitute — unsterilized needles, no pain relief, conditions in which tuberculosis could and did spread.  Was she a good person? It depends on your value system, and, obviously, according to some value systems she was.  But it seems that a population that places Mother Teresa as its ideal (relative to, say, Florence Nightingale) will be a population with more deaths from illness, not fewer, and more pain, not less.  A strategy that says “showing care for the dying is better than promoting health” will lose to one that can actually reward actions that promote health.  (To be fair, for most of human history we didn’t have ways to heal the sick that were clearly better than Mother Teresa’s, and even today we don’t have credit-allocation systems that reliably reward the things that keep people alive and healthy; it would be wrong to dump on Catholicism too much here.)  That’s the “player-level” analysis of the situation.

Some game-theoretic strategies (what Nietzsche would call “tables of values”) are more survival-promoting than others.  That’s the sense in which you can get from “is” to “ought.”  The Golden Rule (Hillel’s, Jesus’s, Confucius’s, etc.) is a “law” of game theory, in the sense that it is a universal, abstract fact (one that even a Lovecraftian alien intelligence would recognize) that it’s an effective strategy, which is why it keeps being rediscovered around the world.

But you can’t adjudicate between character strategies just by being a character playing your strategy.  For instance, a Democrat usually can’t convert a Republican just by being a Democrat at him. To change a player’s strategy is more like “getting the bodymind to change its fundamental assessments of what is in its best interests.”  Which can happen, and can happen deliberately and with the guidance of the intellect! But not without some… what you might call wiggling things around.

The way I think the intellect plays into “metaprogramming” the player is indirect: you can infer what the player is doing, do some formal analysis of how that will play out, comprehend (again at the “merely” intellectual level) whether there’s an error or something that’s no longer relevant/adaptive, plug that new understanding into some change the intellect can effect (maybe “let’s try this experiment”), and maybe somewhere down the chain of causality the player’s strategy changes. (Exposure therapy is a simple example, probably much simpler than most: add some experiences of the thing not being dangerous, and the player determines it really isn’t dangerous and stops generating fear emotions.)

You don’t get changes in player strategies just by executing social praise/blame algorithms though; those algorithms are for interacting with other characters.  Metaprogramming is… I want to say “cold” or “nonjudgmental” or “asocial” but none of those words are quite right, because they describe character traits or personalities or mental states and it’s not a character-level thing at all.  It’s a thing Lovecraftian intelligences can do to themselves, in their peculiar tentacled way.