On Drama

Epistemic Status: Loose but mostly serious

One of the things that’s on my mind a lot is the psychology of Nazis.  Not neo-Nazis, but the literal Nazi party in Germany in the 1930’s and 40’s. In particular, Adolf Hitler.  What was it like inside his head? What could make a person into Hitler?

When I read Mein Kampf, I was warned by my more historically-minded friends that it wasn’t a great way to learn about Nazism. Hitler, after all, was a master manipulator. His famous work of propaganda would obviously paint him in an unrealistically favorable light.

The actual impressions I got from Mein Kampf, though, were very similar to the psychological profile of Hitler compiled by the OSS (h/t Alice Monday), the US’s intelligence service during WWII and the predecessor of the CIA.

Here’s what Hitler was like, as presented by the OSS:

  • Lazy by default, only able to be active when agitated
  • Totally uninterested in details, facts, sitting down to work, “dull” things
  • Dislikes and fears logic, prefers intuition
  • Keen understanding of human psychology, especially “baser” urges
  • Very sensitive to the “vibe” of the room, the emotional arc of the crowd
  • Strong aesthetic sense and interest in the visual and theatrical
  • Highly sentimental, kind to dogs and children, accepting of personal foibles
  • Views human interaction through the lens of seduction and sadomasochism
  • Eager to submit as well as to dominate, but puzzled or disgusted by anything which is neither submission nor domination
  • Sensitive to slights, delighted by praise, obsessed with superficial marks of rank & respect
  • Fixated on personal loyalty
  • Suicidal (and frequently threatened suicide long before he actually did it)

This is all very Cluster B, though the terminology for personality disorders didn’t exist at the time and I’m obviously not in a position to make a diagnosis.  Hitler’s tantrums, impulsiveness, inability to have lasting relationships, constant seeking of approval and need to be at the center of attention, grandiosity, envy, and lack of concern for moral boundaries, are all standard DSM symptoms of personality disorders.

In his own words, Hitler was very opposed to rule of law and intellectual principles: “The spectacled theorist would have given his life for his doctrine rather than for his people.”  He disapproved of intellectuals and of logical thinking, had contempt for “Manchester liberalism” (classical liberalism) and commerce, and instead praised the spiritual transfiguration that masses of people could attain through patriotism and self-sacrifice.

He said, “A new age of magic interpretation of the world is coming, of interpretation in terms of the will and not of the intelligence. There is no such thing as truth either in the moral or the scientific sense.”

He believed strongly in the need for propaganda, and repeatedly explained the principles for designing it:

  • it must be simple and easy to understand by the uneducated
  • it must be one-sided and present us as absolutely good and the enemy as absolutely bad
  • it must have constant repetition
  • it should NOT be designed to appeal to intellectuals or aesthetes
  • it should focus on feelings not objectivity

He believed in the need of the people for “faith”, not because he was a believing Christian, but because he thought it was psychologically necessary:

“And yet this human world of ours would be inconceivable without the practical existence of a religious belief. The great masses of a nation are not composed of philosophers. For the masses of the people, especially faith is absolutely the only basis of a moral outlook on life. The various substitutes that have been offered have not shown any results that might warrant us in thinking that they might usefully replace the existing denominations. But if religious teaching and religious faith were once accepted by the broad masses as active forces in their lives, then the absolute authority of the doctrines of faith would be the foundation of all practical effort. There may be a few hundreds of thousands of superior men who can live wisely and intelligently without depending on the general standards that prevail in everyday life, but the millions of others cannot do so. Now the place which general custom fills in everyday life corresponds to that of general laws in the State and dogma in religion. The purely spiritual idea is of itself a changeable thing that may be subjected to endless interpretations. It is only through dogma that it is given a precise and concrete form without which it could not become a living faith.”

In other words, the picture that is emerging is that Hitler himself craved, and understood other people’s craving, for a certain kind of emotionally resonant experience. Religious or mystical faith; absorption in the crowd; mass enthusiasm; sacrifice of self; and sacrifice of the outsider or scapegoat.  Importantly, truth doesn’t matter for this experience, and critical thinking must be absolutely suppressed in order to fully enact the ritual.

I’m pretty confident, despite not having much knowledge of history, that this was a real and central part of Hitler’s ideology and practice.

If you watch Triumph of the Will, it’s very clearly a mass ritual calculated to produce strong emotional responses from the crowd.

In particular, the emotion it evokes is certainty. The crowd looks to their leader for validation and assurance; and with great confidence, he gives it to them, assuring the German people eternal glory.  One can safely lay down one’s burden of worry and anxious thought.  One can be at peace, knowing that one has Hitler’s love and approval. One can rest in the faith that Hitler will take care of things.

Repetitive call-and-response rituals, endless ranks of soldiers, flags and logos and symbols, huge crowds, rhythmic beats, all give a sense of a simple, steady, loud, bold message. It is cognitively easy. There is no need to strain to hear or understand.  It will be the same, over and over again, forever.

What the OSS report suggests, which Nazi propaganda would never admit, is that Hitler himself craved external validation, and was distraught when it was not supplied.  He understood how badly the people wanted to be led and to be annihilated in the worship of a ruler, because he longed for that submission and release himself.

There is nothing particularly unusual about what I’m saying; the standard accounts of Nazism always make mention of the quasi-religious fanaticism it engendered.  And the connection to ritual is obvious: mass events, loss of individuality in the collective frenzy, the heightening of tension and its release, often through violence.  This is the pattern of all sacrificial festivals.

You can see a modern reconstruction of the primitive sacrificial festival in the Rite of Spring (here, with Nijinsky’s choreography and Roerich’s set design, which captures the atavistic character of the original ballet in a way later productions don’t).

You can also see a version of this in the coronation scene from Boris Godunov, which is a very beautiful expression of quasi-religious mass worship for a state leader.

There’s an important connection between drama in the colloquial sense (the drive for emotional validation and for stirring up interpersonal conflict) and drama in the artistic sense (acting out a play to produce a sense of catharsis in the audience, originally as part of a religious ritual involving both sacrifice and collective frenzy).

Drama in both the colloquial sense and the artistic sense is about evoking emotions and provoking sympathies.  Drama requires an emotional arc, in which tension rises, comes to a head, and is released (catharsis).

Why is this satisfying?  Why do we like to lose our minds, to go up into an irrational frenzy, and then to come down again, often through sorrow and sympathetic suffering?

Current psychological opinion holds that catharsis doesn’t work; venting anger makes people angrier and more violent, not less so.  This isn’t a new idea; Plato thought that encouraging violent passions through theater would only make them worse.

It’s possible that the purpose of drama isn’t to help people cool down, but quite the opposite: to provide plausibly-deniable occasions for mob violence, and to bind the group closer together by sharing strong emotional connections.  Emotional mirroring helps groups coordinate better, including for war or hunting. Highly rhythmic activities (like music, dance, and chanting) both promote emotional mirroring and make it easy to detect those individuals who are out of step or disharmonious.

(In the original Nijinsky choreography of the Rite of Spring, the girl who is chosen to be a human sacrifice is chosen by lot, through a “musical-chairs”-style game in which the one caught out of the circle is singled out. In both Greek and Biblical tradition, sacrifices were chosen by lot. “Random” choice of a victim is often an excellent, plausibly-deniable way to promote subconscious choice.)

Ben Hoffman’s concept of empathy as herd cognition is similar, though humans are more like pack predators than true herd animals.  Emotions are shared directly, through empathy, through song and dance and nonverbal vibrations.  This is a low-bandwidth channel and can’t convey complex chained plans ahead of time.  You can’t communicate “if-then” statements directly through emotional mirroring.  But you can communicate a lot about friend and foe, and guide quite complex behaviors through “warmer, colder, warmer”-style reinforcement learning.

It’s a channel of communication that’s optimized to be intelligible only to the people who are in harmony at the moment — that is, those who are feeling the same thing, are part of the group, are acting in roughly the same way.  This has some disadvantages. For one thing, it’s hard to use it to coordinate division of labor. You need more explicit reasoning to, for instance, organize your army into a pincer movement, as Shaka Zulu did.  Emotion-mirroring motivates people to “act as one”, not to separate into parts.  For another thing, emotion-mirroring doesn’t allow for fruitful disagreement or idea-generation, because that’s inherently disharmonious, no matter how friendly in intent or effect; suggesting a different idea is differing from the group.

The advantage of emotional-mirroring as a form of communication is precisely that it is only intelligible to people who are engaging in the mirroring. If you are coordinating against the people who are out of sync or out of harmony, you can be secretive in plain view, simply by communicating through a rhythm that they can’t quite detect.

It makes sense, in a sort of selfish-gene way.  A gene which caused individuals to become very good at coordinating with others who had the gene, to kill those who didn’t have the gene, would promote natural selection for itself.  It would make it feel good to harmonize and “become one with” the crowd, and elevate rage to a fever pitch against those who would interrupt the harmony.  Those who didn’t have the gene would be worse at seeing the mob coming, and would not be able to secretly coordinate with each other.

(This idea is not due to me, but to a friend who might prefer to remain anonymous.)

Only a small portion of the population can be antisocial in the long run, where antisocial means impulsively aggressive, in the sense of “people who are more likely to drive at the oncoming car in the game of Chicken”; evolutionary game theory simulations bear that out.  Aggressive or risk-seeking behavior can only be a minority trait, because while it does result in more sexual success and more short-term wins in adversarial games, people with those traits have too high a risk of dying out. But the more sensitive harmony-coordination-mob trait might be better at surviving, because it’s usually quiescent and only initiates violence when there’s a critical mass of people moving in unison.
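To make the "aggression can only be a minority trait" claim concrete, here is a minimal sketch of the textbook Hawk-Dove game (the evolutionary version of Chicken) under replicator dynamics. The payoff values, the function names, and the choice of this particular model are my own illustrative assumptions, not the simulations referred to above. When the cost of a fight exceeds the value of the prize, the share of hawks settles at value divided by cost, no matter where it starts.

```python
def hawk_dove_payoffs(value, cost):
    """Row player's payoff in the textbook Hawk-Dove game (assumes cost > value)."""
    return {
        ("H", "H"): (value - cost) / 2,  # two hawks fight; each wins half the time
        ("H", "D"): value,               # hawk takes the whole prize from a dove
        ("D", "H"): 0.0,                 # dove backs down and gets nothing
        ("D", "D"): value / 2,           # two doves split the prize
    }


def evolve_hawk_share(value, cost, start_share=0.9, steps=5000, dt=0.01):
    """Euler-integrate the replicator equation dp/dt = p(1-p)(f_hawk - f_dove)."""
    pay = hawk_dove_payoffs(value, cost)
    p = start_share
    for _ in range(steps):
        # Average payoff of each type against the current population mix.
        f_hawk = p * pay[("H", "H")] + (1 - p) * pay[("H", "D")]
        f_dove = p * pay[("D", "H")] + (1 - p) * pay[("D", "D")]
        p += dt * p * (1 - p) * (f_hawk - f_dove)
    return p


if __name__ == "__main__":
    # Illustrative numbers only: prize worth 2, losing a fight costs 10.
    for start in (0.01, 0.5, 0.9):
        print(start, round(evolve_hawk_share(2, 10, start_share=start), 4))
```

With these made-up numbers the hawk share ends up near 2/10 whether hawks start out rare or common, which is the sense in which aggression is a stable minority strategy in this toy model.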

There also may be the “charismatic” or “Dionysian” or “actor/performer/poet/bard” trait: the ability of an individual to activate people’s harmony-sensing, emotional-mirroring moods, the ability to make people get up and dance or cheer or fight.  People with borderline personality disorder sometimes are better than neurotypicals at reading emotions and inferring people’s feelings and intentions in social situations.  Hyper-sensitive, hyper-expressive people may also be a stable minority strategy; minority, because getting people worked up increases risk, though not as much as unilaterally seeking conflict oneself.

High drama is, obviously, dangerous. It is also powerful and at times beautiful. Even those of us who would never be Nazis can be moved by art and music and theater and religious ritual.  It’s a profound part of the human psyche.  It’s just important to be aware of how it works.

Drama is inherently transient and immediate. It’s like a spell; it affects those within range, while the spell is being sustained, and dissipates when the spell is broken. If you want to enhance drama, you create an altered environment, separate from everyday life, and aim for repetition, unanimity, and cohesiveness.  You rev people up with enthusiasm.  You say “Yes, and…”, as in improv. If you want to dispel drama, you break up the scene with interruptions, disagreements, references to mundane details, collages of discordant elements.  You deescalate emotions by becoming calm and boring.  You impede the momentum. 

If you have a plan that you’re afraid will fail unless everyone stays revved up 24/7 and unanimously enthusiastic, you have a plan that’s being communicated through drama, and you need to beware that drama’s nature is typically transient, irrational, and violent.

Denotative language, as opposed to enactive language, is literally opposed to role-playing. When you say out loud what is going on — not to cause anyone to do anything, but literally just to inform them what is going on — you are “breaking character.”

If I am playing the role of a sad person, it’s breaking character to say “I’d probably feel better if I took a nap.”  That’s not expressing sadness! That’s not what a Sad Person would say!  It’s not acting out the arc of “inconsolableness” to its inevitable conclusion. It’s cutting corners.  Cheating, almost.  Breaking momentum.

By alluding to the reality beyond the current improv scene, the scaffolding of facts and interests that lasts even after passions have cooled, I am ruining the scene and ceding my power to shape it, but potentially gaining a qualitatively different kind of power.

Breaking flow is inherently frustrating, because we humans probably have a desire for flow for its own sake.  Drama wants drama. Flow wants flow.

But ultimately, there’s a survival imperative that limits all of these complex adaptations. You have to be alive in order to act out a drama. The “scaffolding” facts of practical reality remain, even if they’re mostly far away when you’re well-insulated from danger.  Drama provides a relative, but not an absolute, survival advantage, which means it’s more-or-less a parasitic phenomenon, and has natural limitations on how much behavior it can co-opt before negative consequences start showing up.


Parenting and Heritability Overview

Epistemic status: pretty preliminary, not conclusive

Can parenting affect children’s outcomes? Can you raise your child to be better, healthier, smarter, more successful?

There’s a lot of evidence, from twin and adoption studies,  that behavioral traits are highly heritable and not much affected by adoptive parents or by the environment shared between siblings.

High heritability does not strictly imply that parenting doesn’t matter, for a few reasons.

  1. Changes across the entire population don’t affect heritability. For example, heights have risen as nutrition improves, but height remains just as heritable.  So if parenting practices have changed over time, heritability won’t show whether those changes helped or hurt children.
  2. Family environment and genes may be positively correlated. For instance, if a gene for anxiety causes both anxiety in children and harshness in parents, then it may be that the parenting still contributes to the children’s anxiety.  If parents who overcome their genetic predispositions are sufficiently rare, it may still be possible that choosing to parent differently can help.
  3. Rare behaviors won’t necessarily show up at the population level.  Extremely unusual parenting practices can still be helpful (or harmful), if they’re rare enough to not be caught in studies.  Extremely unusual outcomes in children (like genius-level achievement) might also not be caught in studies.
  4. Subtle effects don’t show up in studies that easily. A person who has to spend a lot of time in therapy unlearning subtle emotional harms from her home environment won’t necessarily show up as having a negative outcome on a big correlational study.
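The first caveat is easy to check with a toy simulation. This is only a sketch under assumptions I made up for illustration: height is a fixed baseline plus an independent genetic term and an environmental term, and "better nutrition" is a constant added to everyone. The function names and numbers are hypothetical.

```python
import random
import statistics


def simulate_heights(n, nutrition_shift_cm, seed=0):
    """Toy model: height = baseline + shift + genetic part + environmental part."""
    rng = random.Random(seed)
    genetic = [rng.gauss(0, 6) for _ in range(n)]      # assumed genetic SD: 6 cm
    environment = [rng.gauss(0, 3) for _ in range(n)]  # assumed environmental SD: 3 cm
    heights = [170 + nutrition_shift_cm + g + e
               for g, e in zip(genetic, environment)]
    return genetic, heights


def genetic_variance_share(genetic, heights):
    """Share of height variance attributable to the genetic term in this toy model."""
    return statistics.pvariance(genetic) / statistics.pvariance(heights)


if __name__ == "__main__":
    g, before = simulate_heights(20000, nutrition_shift_cm=0.0)
    _, after = simulate_heights(20000, nutrition_shift_cm=10.0)
    print("mean before:", round(statistics.fmean(before), 1))
    print("mean after: ", round(statistics.fmean(after), 1))
    print("share before:", round(genetic_variance_share(g, before), 3))
    print("share after: ", round(genetic_variance_share(g, after), 3))
```

Everyone gets 10 cm taller, but the variance share is the same in both runs, because a shift applied to the whole population moves the mean and leaves the variances alone. Heritability is only about the variances.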

With those caveats in mind, let’s see what the twin and adoption studies show.


Personality

In a study of 331 pairs of twins reared together and apart, a negligible proportion of the variance in personality was due to shared family environment.  About 50% of the variance in personality scores was due to genetics; average heritability was 0.48.[1]

Attachment Style

In a study of 125 early-adopted adolescents, secure-attached infants were more likely to grow into secure-attached teenagers (correlation 0.30, p < 0.01), and mothers of secure adolescents were more likely to show “sensitive support” (high relatedness and autonomy in resolving disagreements with children) at age 14 (p < 0.03).[2]

Antisocial/Criminal Behavior

An adoption study found that adolescents whose adoptive parents had high levels of conflict with them (arguments, hitting, criticizing and hurting feelings, etc) were more likely to have conduct problems. Correlations were between 0.574 and 0.696. Effects persisted longitudinally (i.e. past conflict predicted future delinquency).[3]

A meta-analysis of 51 twin and adoption studies found that 32% of the variance in antisocial behavior was due to genetic influences, while 16% was due to shared environment influences.[4]


Drug Abuse

In a Swedish adoption study of 18,115 children, adopted children with biological parents who abused drugs were twice as likely to abuse drugs themselves, while there was no elevated risk for having an adoptive parent who abused drugs.  However, adoptive siblings of adopted children with drug abuse (DA) were twice as likely to abuse drugs as adoptive siblings of adopted children without DA. This implies that there is both environmental and genetic influence, but suggests that environmental influence may be more about peers than parents.[5]

Psychiatric Disorders

Having a mother (but not a father) with major depression was associated with adopted children getting major depression, in a study of 1108 adopted and nonadopted adolescents.  The odds ratio of major depression for children of a mother with major depression was 3.61 for nonadopted children and 1.97 for adopted children.  The odds ratio of externalizing disorders for children of a mother with depression was 2.23 for nonadopted children and 1.69 for adopted children.[6]


IQ

The Minnesota Study of Twins Reared Apart, which includes more than 100 pairs of twins, found that 70% of the variance in IQ of monozygotic twins raised apart was genetic. No environmental factor (father’s education, mother’s education, socioeconomic status, physical facilities) contributed more than 3% of the variance between twins. Identical twins correlate about 70% in IQ, 53% on traditionalism, 49% on religiosity, 34% on social attitudes, etc.  Identical twins reared apart are roughly as similar as identical twins reared together.[7]

According to a twin study, heritability of PSAT scores was 50-75%, depending on subscore.[8]


Years of Schooling

The Wisconsin Longitudinal Survey, covering 16481 children of whom 610 were adopted, finds that adoptive parents’ income has a significant positive effect on years of schooling.  Adoptive father’s years of schooling had a significant effect, but adoptive mother’s years of schooling did not. In nonadopted families, parental IQ and years of schooling (both mother and father) have a statistically significant effect.[9]


Reading Achievement

The Colorado Adoption Project finds that genetic factors usually explain about 40% of the variance in reading achievement, while adoptive-sibling correlations (a measure of shared environment) explain less than 10% of the variance. The rest is non-shared environment.  Unrelated sibling correlations are 0.05, while related sibling correlations are 0.26. Genetic correlations rise with age (from 0.34 at age 7 to 0.67 at age 16).[10]
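As a rough sanity check on how sibling correlations turn into variance shares, here is a back-of-the-envelope sketch. It uses the simplest textbook assumptions, which may not match the model the study's authors actually fit: biological full siblings share half of the additive genetic variance plus all of the shared environment, adoptive siblings share only the shared environment, and there is no assortative mating or selective placement. The function name is my own.

```python
def variance_shares_from_sibling_correlations(r_biological, r_adoptive):
    """Simplified ACE-style decomposition from two sibling correlations.

    Assumed model (textbook simplification, not taken from the study):
        r_adoptive   = c2            (shared environment only)
        r_biological = h2 / 2 + c2   (half the additive genes + shared environment)
    """
    c2 = r_adoptive
    h2 = 2 * (r_biological - r_adoptive)
    e2 = 1 - h2 - c2  # everything else: non-shared environment and measurement error
    return h2, c2, e2


if __name__ == "__main__":
    # Correlations quoted above: related siblings 0.26, unrelated siblings 0.05.
    h2, c2, e2 = variance_shares_from_sibling_correlations(0.26, 0.05)
    print("genetic:", round(h2, 2))              # 0.42
    print("shared environment:", round(c2, 2))   # 0.05
    print("non-shared environment:", round(e2, 2))  # 0.53
```

Under these assumptions the quoted correlations give roughly 42% genetic and 5% shared environment, in line with the "about 40%" and "less than 10%" figures.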

In the Western Reserve Twin Study of 278 twin pairs, ages 6-12, IQ score variance was mostly due to heritability (37%-78%, depending on subscore) and not to shared environment (<8%).  However, school achievement was more dependent on shared environment (65-73%) than heritability (19-27%).[11]

In a twin study, spelling ability has a heritability of 0.53.[12]

Language ability in toddlers, in a twin study, was found to be more dependent on shared environment than genetics: 71% of variance explained by shared environment, 28% explained by genetics. This was reversed in the case of reading ability in 7-10-year-olds, where 72% of variance was explained by genetics, while 20% was explained by shared environment. Maybe the effects of home environment fade out with age.[13]

Academic Achievement

A twin study of 2602 twin pairs found that 62% of variance in science test scores at age 9 was explained by heredity, compared to 14% shared environment. There was no difference between boys and girls in heritability.[14]

In the Minnesota Twin Study, 51% and 54% of variance in grades is due to heredity, in girls and boys respectively.  There were similar genetic contributions to IQ (52%, 37%), externalizing behavior (45%, 47%) and engagement (54%, 49%).  Shared environment mattered less (26%).  The majority (55%) of the change in grades after age 11 is due to “nonshared environment.”[15]



Income

The National Longitudinal Survey of Youth, which included full and half-siblings, found IQ was 64% heritable, education was 68% heritable, and income was 42% heritable.  Almost all the rest of income variation was non-shared environment (49%), leaving only 9% explained by shared environment.[16]

In a study of Finnish twins, 24% of the variance of women’s lifetime income and 54% of the variance of men’s lifetime income was due to genetic factors, and the contribution of shared environment was negligible.[17]


Corporal Punishment

In laboratory settings, corporal punishment is indeed effective at getting immediate compliance.  In a meta-analysis of mostly correlational and longitudinal studies, the weighted mean effect size of corporal punishment was -0.58 on the parent-child relationship, -0.49 on childhood mental health, 0.42 on childhood delinquent and antisocial behavior, 0.36 on childhood aggression, and 1.13 on immediate compliance. There were no large adult effects significant at a <0.01 level, but there was an effect size of 0.57 on aggression significant at a <0.05 level.[18]

Bottom line is that corporal punishment is fairly bad for childhood outcomes, but doesn’t usually cause lasting trauma or adult criminal/abusive behavior; still, there are good evidence-based reasons not to do it.


What Parenting Can’t Affect

Personality, IQ, reading ability in teenagers, and income are affected negligibly by the “shared environment” contribution. Drug abuse is also very heritable and not much affected by parenting.


What Parenting Might Affect

Reading ability in children and grades in teenagers have a sizable (but minority) shared environment component; reading ability in toddlers is mostly affected by shared environment. Grades are generally less IQ-correlated than test scores, and are highly affected by school engagement and levels of “externalizing” behavior (disruptive behavior, inattention, criminal/delinquent activity.)  Antisocial and criminal behavior has a sizable (but minority) shared environment component. You may be able to influence your kids to behave better and study harder, and you can definitely teach your kids to read younger, though a lot of this may turn out to be a wash by the time your kids reach adulthood.


What Parenting Can Affect

Having a mother — even an adoptive mother — with major depression puts children at risk for major depression, drug abuse, and externalizing behavior. Conflict at home also predicts externalizing behavior in teenagers. Mothers of teenagers who treat them well are more likely to have teenagers who have loving and secure relationships with them. Basically, if I were to draw a conclusion from this, it would be that it’s good to have a peaceful and loving home and a mentally healthy mom.

Father’s income and family income, but not mother’s income, predict years of schooling; I’m guessing that this is because richer families can afford to send their kids to school for longer. You can, obviously, help your kids go to college by paying for it.


[1]Tellegen, Auke, et al. “Personality similarity in twins reared apart and together.” Journal of personality and social psychology 54.6 (1988): 1031.

[2]Beijersbergen, Mariëlle D., et al. “Remaining or becoming secure: Parental sensitive support predicts attachment continuity from infancy to adolescence in a longitudinal adoption study.” Developmental psychology 48.5 (2012): 1277.

[3]Klahr, Ashlea M., et al. “The association between parent–child conflict and adolescent conduct problems over time: Results from a longitudinal adoption study.” Journal of Abnormal Psychology 120.1 (2011): 46.

[4]Rhee, Soo Hyun, and Irwin D. Waldman. “Genetic and environmental influences on antisocial behavior: a meta-analysis of twin and adoption studies.” Psychological bulletin 128.3 (2002): 490.

[5]Kendler, Kenneth S., et al. “Genetic and familial environmental influences on the risk for drug abuse: a national Swedish adoption study.” Archives of general psychiatry 69.7 (2012): 690-697.

[6]Tully, Erin C., William G. Iacono, and Matt McGue. “An adoption study of parental depression as an environmental liability for adolescent depression and childhood disruptive disorders.” American Journal of Psychiatry 165.9 (2008): 1148-1154.

[7]Bouchard, T., et al. “Sources of human psychological differences: The Minnesota study of twins reared apart.” (1990).

[8]Nichols, Robert C. “The national merit twin study.” Methods and goals in human behavior genetics (1965): 231-244.

[9]Plug, Erik, and Wim Vijverberg. “Does family income matter for schooling outcomes? Using adoptees as a natural experiment.” The Economic Journal 115.506 (2005): 879-906.

[10]Wadsworth, Sally J., et al. “Genetic and environmental influences on continuity and change in reading achievement in the Colorado Adoption Project.” Developmental contexts of middle childhood: Bridges to adolescence and adulthood (2006): 87-106.

[11]Thompson, Lee Anne, Douglas K. Detterman, and Robert Plomin. “Associations between cognitive abilities and scholastic achievement: Genetic overlap but environmental differences.” Psychological Science 2.3 (1991): 158-165.

[12]Stevenson, Jim, et al. “A twin study of genetic influences on reading and spelling ability and disability.” Journal of child psychology and psychiatry 28.2 (1987): 229-247.

[13]Harlaar, Nicole, et al. “Why do preschool language abilities correlate with later reading? A twin study.” Journal of Speech, Language, and Hearing Research 51.3 (2008): 688-705.

[14]Haworth, Claire MA, Philip Dale, and Robert Plomin. “A twin study into the genetic and environmental influences on academic performance in science in nine‐year‐old boys and girls.” International Journal of Science Education 30.8 (2008): 1003-1025.

[15]Johnson, Wendy, Matt McGue, and William G. Iacono. “Genetic and environmental influences on academic achievement trajectories during adolescence.” Developmental psychology 42.3 (2006): 514.

[16]Rowe, David C., Wendy J. Vesterdal, and Joseph L. Rodgers. “Herrnstein’s syllogism: Genetic and shared environmental influences on IQ, education, and income.” Intelligence 26.4 (1998): 405-423.

[17]Hyytinen, Ari, et al. “Heritability of lifetime income.” (2013).

[18]Gershoff, Elizabeth Thompson. “Corporal punishment by parents and associated child behaviors and experiences: a meta-analytic and theoretical review.” Psychological bulletin 128.4 (2002): 539.

Don’t Shoot the Messenger

Epistemic status: confident but informal

A while back, I read someone complaining that the Lord of the Rings movie depicted Aragorn killing a messenger from Mordor. In the book, Aragorn sent the messenger away.  The moviemakers probably only intended to add action to the scene, and had no idea that they had made Aragorn into a shockingly dishonorable character.

Why don’t you shoot messengers?  What does that tradition actually mean?

Well, in a war, you want to preserve the ability to negotiate for peace.  If you kill a member of the enemy’s army, that puts you closer to winning the war, and that’s fine.  If you kill a messenger, that sends a message that the enemy can’t safely make treaties with you, and that means you destroy the means of making peace — both for this war and the wars to come.  It’s much, much more devastating than just killing one man.

This is also probably why guest law exists in so many cultures.  In a world ruled by clans, where a “stranger” is a potential enemy, it’s vitally important to have a ritual that guarantees nonviolence, such as breaking bread under the same roof. Otherwise there would be no way to broker peace between your family and the stranger over the next hill.

This is why the Latin hostis (enemy) and hospes (guest or host) are etymologically cognate. This is why the Greeks had a concept of xenia so entrenched that they told stories about a man being tied to a fiery wheel for eternity for harming a guest.  This is why the sin of Sodom was inhospitality.

It’s actually not about charity or compassion, exactly. It’s about coordinating a way to not kill each other.

Guest law and not shooting messengers are natural law: they are practical necessities due to game theory, that ancient peoples traditionally concretized into virtues like “honor” or “hospitality.”  But it’s no longer common knowledge what they’re for.

A friend of mine speculated that, in the decades that humanity has lived under the threat of nuclear war, we’ve developed the assumption that we’re living in a world of one-shot Prisoner’s Dilemmas rather than repeated games, and lost some of the social technology associated with repeated games. Game theorists do, of course, know about iterated games, and there’s some fascinating research in evolutionary game theory, but much of the early development of game theory was driven by its application to nuclear war, and the 101-level framing that most educated laymen hear is often that one-shot is the prototypical case and repeated games are hard to reason about without computer simulations.
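The difference between the one-shot and the repeated game is small enough to fit in a few lines. This is a minimal sketch, not a model of any real negotiation: the payoff numbers are the conventional illustrative ones (5, 3, 1, 0), and the two strategies and all function names are my own choices for the example.

```python
# Payoffs (mine, theirs) for each pair of moves: "C" cooperate, "D" defect.
# These are the usual illustrative textbook values, assumed for the example.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy whatever the opponent did last round."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Defect every round, regardless of history."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play the repeated game and return the total score of each side."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        gain_a, gain_b = PAYOFF[(move_a, move_b)]
        score_a += gain_a
        score_b += gain_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print(play(always_defect, tit_for_tat, rounds=1))   # (5, 0)
    print(play(always_defect, tit_for_tat, rounds=10))  # (14, 9)
    print(play(tit_for_tat, tit_for_tat, rounds=10))    # (30, 30)
```

In a single round, defecting against a cooperator pays best. Over ten rounds the defector gets one good round followed by nine bad ones and ends up with less than half of what two reciprocators earn. Shooting the messenger looks like the one-round win; the norm against it only makes sense if you expect many more rounds.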

One of the things about living in what feels like the shadow of the end of the world — there’s been apocalypse in the zeitgeist since at least the 1980’s and maybe longer — is that it’s very counterintuitive to think about a future that might last a long time.

What if we’re not wiped out by an apocalypse?  What if humans still have an advanced civilization in 50 years — albeit one that looks very different from today’s?  What if the people who are young today will live to grow old? What would it be like to take responsibility for consequences and second-order effects at the scale of decades?  What would it be like to have models of the next twenty years or so — not for the purpose of sounding cool at parties, but for the sake of having practical plans that actually extend that far?

I haven’t thought much about how to go about doing that, but I think we may have lost certain social technologies that have to do with expecting there to be a future, and it might be important to regain them.

Sepsis Cure Needs An RCT

Epistemic Status: Confident

Every now and then the news comes out with a totally clear-cut, dramatic example of an opportunity to do a lot of good. This is one of those times.

The story began in January, 2016, when Dr. Paul Marik was running the intensive care unit at Sentara Norfolk General Hospital. A 48-year-old woman came in with a severe case of sepsis — inflammation frequently triggered by an overwhelming infection.

“Her kidneys weren’t working. Her lungs weren’t working. She was going to die,” Marik said. “In a situation like this, you start thinking out of the box.”

Marik had recently read a study by researchers at Virginia Commonwealth University in Richmond. Dr. Berry Fowler and his colleagues had shown some moderate success in treating people who had sepsis with intravenous vitamin C.

Marik decided to give it a try. He added in a low dose of corticosteroids, which are sometimes used to treat sepsis, along with a bit of another vitamin, thiamine. His desperately ill patient got an infusion of this mixture.

“I was expecting the next morning when I came to work she would be dead,” Marik said. “But when I walked in the next morning, I got the shock of my life.”

The patient was well on the road to recovery.

Marik tried this treatment with the next two sepsis patients he encountered, and was similarly surprised. So he started treating his sepsis patients regularly with the vitamin and steroid infusion.

After he’d treated 50 patients, he decided to write up his results. As he described it in Chest, only four of those 47 patients died in the hospital — and all the deaths were from their underlying diseases, not from sepsis. For comparison, he looked back at 47 patients the hospital had treated before he tried the vitamin C infusion and found that 19 had died in the hospital.

This is not the standard way to evaluate a potential new treatment. Ordinarily, the potential treatment would be tested head to head with a placebo or standard treatment, and neither the doctors nor the patients would know who in the study was getting the new therapy.

But the results were so stunning, Marik decided that from that point on he would treat all his sepsis patients with the vitamin C infusion. So far, he’s treated about 150 patients, and only one has died of sepsis, he said.

That’s a phenomenal claim, considering that of the million Americans a year who get sepsis, about 300,000 die.

Sepsis is a really big deal. More people die from sepsis every year than from diabetes and COPD combined. Ten thousand people die of sepsis every day.  A lot of these cases are from pneumonia in elderly people, or hospital-acquired infections.  Curing sepsis would put a meaningful dent in the kind of hell that hospital-bound old people experience, that Scott described in Who By Very Slow Decay.

Sepsis is the destructive form of an immune response to infection. Normally the infection is managed with antibiotics, but the immune response still kills 30% of patients.  Corticosteroids, which reduce the immune response, and vitamin C, which reduces blood vessel permeability so that organs are less susceptible to pro-inflammatory signals, can treat the immune response itself.

Low-dose corticosteroids have been found to significantly reduce mortality in sepsis elsewhere in controlled studies (see e.g. here, here, here) and there’s some animal evidence that vitamin C can reduce mortality in sepsis (see here).

This treatment seems to work extraordinarily well in Marik’s retrospective study; it is made of simple, cheap, well-studied drugs with a fairly straightforward mechanism of action; the individual components seem to work somewhat on sepsis too.  In other words, it’s about as good evidence as you can get, before doing a randomized controlled trial.
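For a sense of scale, here is a rough sketch of the arithmetic on the counts quoted above (4 deaths among 47 treated patients versus 19 among 47 earlier patients). The one-sided Fisher exact test is my own illustrative choice, not necessarily the analysis used in the paper, and it says nothing about confounding in a before/after comparison, which is exactly why an RCT is still needed.

```python
from math import comb

# Counts as quoted above: in-hospital deaths / patients in each group.
treated_deaths, treated_n = 4, 47
control_deaths, control_n = 19, 47

treated_rate = treated_deaths / treated_n    # roughly 8.5%
control_rate = control_deaths / control_n    # roughly 40.4%
risk_difference = control_rate - treated_rate
relative_risk = treated_rate / control_rate


def fisher_one_sided(a, n1, b, n2):
    """Probability of seeing a or fewer deaths in group 1 if deaths were
    distributed between the groups by chance (hypergeometric null)."""
    total_deaths = a + b
    denominator = comb(n1 + n2, total_deaths)
    tail = sum(comb(n1, k) * comb(n2, total_deaths - k) for k in range(a + 1))
    return tail / denominator


p_value = fisher_one_sided(treated_deaths, treated_n, control_deaths, control_n)
print(f"risk difference: {risk_difference:.3f}, relative risk: {relative_risk:.3f}")
print(f"one-sided Fisher p-value: {p_value:.5f}")
```

The absolute difference is about 32 percentage points (15/47), and the treated group's mortality is about a fifth of the comparison group's (4/19). A gap that size is unlikely to be pure chance, but chance is not the only alternative explanation in a retrospective study.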

But, of course, before you can start treating patients with it, you need an RCT.

I wrote Dr. Marik and asked him what the current status of the trials is; he’s got leads at several hospitals: “two in CA, one at Harvard, and one in RI. In addition the Veterinary University of Georgia is proposing a neat study in horses — horses are at increased risk of sepsis.”

But he needs funding.

Medical research does not progress by default. The world is full of treatments that one doctor has tried to great success, which never went through clinical trials, and so we’ll never know how many lives could have been saved.  Some of the best scientists in the world are chronically underfunded. The world has not solved this coordination problem.

By default, things fall apart and never get fixed. They only get better if we act.

You can click on this Google Form to give me estimates of how much you’d be willing to donate and your contact information; once I get a sense of what’s possible, my next step will be coordinating with Dr. Marik and finding a good vehicle for where to send donations.

(I don’t have any personal connection to Dr. Marik or to the treatment; I literally just think it’s a good thing to do.)


Are Adult Developmental Stages Real?

Epistemic status: moderately confident

Robert Kegan’s developmental stages have become popular in my corner of the social graph, and I was asked by Abram Demski and Jacob Liechty to write a literature review (which they kindly funded, before I started my new job) of whether Kegan’s theory is justified. Since Kegan’s model is a composite that builds on many previous psychologists’ work, I had to do an overview of several theories of developmental stages.  I cover the theories of Piaget, Kohlberg, Erikson, Maslow, and Kegan.  All of these developmental stage theories posit that there are various levels of cognitive, moral, or psychological maturity and sophistication; children start at the low levels and progress to the higher ones; only a few of the very “wisest” adults reach the very top stages.

This makes intuitive sense and is a powerful story to tell. You can explain conflicts and seemingly strange behavior by understanding that some people are simply on more primitive levels and cannot comprehend more sophisticated ones. You can be motivated to reach towards self-improvement by a model of a ladder of development.

But for the moment I want to ask more from developmental theories than being interesting or good stories; I want to ask if they’re actually correct.

In order for a developmental theory to be correct, I think a few criteria must be met:

  • The developmental stages must be reliably detectable, e.g. by some questionnaire or observational test that has high internal consistency and/or inter-rater reliability
  • The developmental stages must improve with age, at least within a given cohort (most people progress to later stages as they grow older)
  • The developmental stages must be sequential and cumulative (people must learn earlier stages before later ones, and not skip stages)
  • In cases where the developmental stages are supposed to occur at particular ages, they must actually be observed being attained at those ages.

Most of the theories do not appear to meet these criteria.


Jean Piaget was one of the pioneers of child development psychology. Beginning in the 1930’s, his observations of children led him to a sequential theory of how children gain cognitive abilities over time.

Piaget’s stages of cognitive development are:

  • Sensorimotor, ages 0-2, hand-eye coordination and goal-directed motion
  • Pre-operational, ages 2-7, speech, pretend play and use of symbols
  • Concrete operational, age 7-11, inductive logic, perspective-taking
  • Formal operational, ages 11-adult, deductive logic, abstraction, metacognition, problem-solving

Piaget’s first study, The Origins of Intelligence in Children, published in 1952, was conducted on his own three children, from birth to age 2.[1] He and his wife made daily observations of the children.

Reflexes, Piaget noticed, are present from birth: the sucking reflex, upon contact with the nipple, happens automatically. In the first month of life, he notes that the babies become more effective at finding the nipple.  From one to two months, babies self-stimulate even when there is no breast — they suck their thumbs or make sucking motions on their own. This, Piaget calls the “primary circular reaction”. A reflex has been transformed into a self-generated behavior.  At first, the baby can’t reliably find his thumb; he flails his arms until they happen to brush his face, and then engages the sucking reflex.  There are “circular reactions” to grasping, looking, and listening as well.  Later, babies learn to coordinate these circular reactions across senses, and to move their bodies in order to attain an objective (e.g. reaching for an object to take it).

Some of Piaget’s conclusions have been disputed by modern experiments.

In his studies of infants, he tested their ability to reason about objects by occluding the object from view, subjecting it to some further, hidden motion, and then having the child search for the object.

However, younger infants have less physical ability to search, so this task is less appropriate for assessing what young infants know.  In the 1980’s, Leslie and Baillargeon used looking time as a metric for how much infants were surprised by observations; since this doesn’t require physical coordination, it allows for accurate assessment of the cognitive abilities of younger infants.  Leslie’s experiments confirmed that babies understand causality, and Baillargeon’s confirmed that babies have object permanence: in both cases, looking times were longer for “impossible” transformations of objects that violated the laws of causality or caused objects to transform when behind a screen.  4-month-old infants are, contra Piaget, capable of object permanence; they understand that objects must move along continuous paths, and that solid objects cannot pass through each other.[2]

Kittens go through Piaget’s sensorimotor stages: first reflexes, then habits (pawing, oscillating head), then secondary circular reactions (wrestling, biting, dribbling with objects), and finally means-end coordination (playing hide-and-seek).[3]  This supports the ordering of sensorimotor skills in Piaget’s classification.

A 1976 study gave 9-14-year-olds a test and subjected the results to factor analysis, finding three axes: formal operational systematic permutations; concrete operational addition of asymmetric relations; and formal operational logic of implications.[4]  This supports something like Piaget’s classifications of cognitive tasks.

Different studies conflict on which operational stages come before others: is class inclusion always required before multiplication of classes? Ordinal before cardinal? Logical and number abilities before number conservation? There’s no consistent picture.  “By 1970, it was evident in the important book, Measurement and Piaget, that the empirical literature functioned poorly as a data base on which the objective evaluation of Piagetian theory could be effectively attempted (Green, Ford, & Flamer, 1971)…For example, Beard (1963) found that 50% of her 5- to 6-year-old samples conserved quantity (solid). In contrast, Lovell and Ogilvie (1960) and Uzgiris (1964) reported that it is in the 8- to 9-year-old range that children conserve quantity. Elkind (1961) reported that 52% of 6-year-old children conserved weight, but Lovell and Ogilvie (1961) reported this percentage for 10-year-old children.”[5]

According to Piaget’s “structured whole” theory, when children enter a new stage, they should gain all the skills of that stage at once. For instance, they should learn conservation of volume of water at the same time as they learn that the length of a string is conserved. “However, point synchrony across domains has never been found. To the contrary, children manifest high unevenness or decalage (Feldman 1980, Biggs & Collis 1982, Flavell 1982). Piaget acknowledged this unevenness but never explained it; late in his life he asserted that it could not be explained (Piaget 1971).”  However, it’s overwhelmingly true that success at cognitive tasks is age-dependent. On a host of tasks, age is the most potent predictor of performance.[6]

Piaget claimed that children develop cognitive skills in discrete stages, at particular ages, and in a fixed order. None of these claims appear to be replicated across the literature. The weaker claims that children learn more cognitive skills as they grow older, that some skills tend to be learned earlier than others, and that there is some clustering in which children who can perform one skill can also perform similar skills, have some evidentiary support.


Lawrence Kohlberg, working in the 1960’s and 70’s, sought to extend Piaget’s developmental-stage theories to moral as well as cognitive development.

Kohlberg’s stages of moral development are:

  • Obedience and punishment (“how can I avoid punishment?”)
  • Instrumentalist Relativist (“what’s in it for me?”)
  • Interpersonal Concordance (“be a good boy/girl”, conformity, harmony, being liked)
  • “Law and Order” (maintenance of the social order)
  • Social contract (democratic government, greatest good for the greatest number)
  • Universal ethical principles (eg Kant)

The evidence for Kohlberg’s theory comes from studies of how people respond to questions about hypothetical moral dilemmas, such as “Heinz steals the drug”, a story about a man who steals an expensive drug to save his dying wife.

Kohlberg did longitudinal studies of adolescents and adults over a period of six years, in the US, Taiwan, Mexico, and isolated villages in Turkey and the Yucatan.  In all three developed-country examples, the prevalence of stages 1 and 2 declined with age, while the prevalence of 5 and 6 increased with age.  In the isolated villages, stage 1 declined with age, while stage 3 and 4 increased with age, and stages 5 and 6 were always rare. Among 16-year-olds, Stage 5 was the most common in the US, while stages 3 and 4 were the most common in Taiwan and Mexico; in the isolated villages, Stage 1 was still the most common by age 16.

In adults there was likewise some change in moral development over time — Stage 4 (law and order) increased with age from 16 to 24, in both lower- and middle-class men, and the highest rates of stage 4 were found in the men’s fathers.  Most men stabilize at Stage 4, while most women stabilize at stage 3.

Kohlberg’s experiments show that there is change with age in how people explain moral reasoning, which is similar in direction but different in magnitude across cultures.[7]

In subsequent studies from around the world, 85% (out of 20 cross-sectional studies) showed an increase in moral stage with age, and none of them found “stage skipping” (all stages between the lowest and the highest were present.)  Contra Kohlberg, most subsequent studies do not show significant sex differences in moral reasoning. There are some cultural differences: stage 1 does not show up in children in Iran or Hutterite children; most folk tribal societies do not have stages 4, 5, or 6 at all.[8]

Subsequent studies have shown that children do in fact go through Kohlberg’s stages sequentially, usually without stage skipping or regression.[9]

Juvenile delinquents have lower scores on Kohlberg’s moral development test than nondelinquents; moreover, the most psychopathic delinquents had the lowest scores.[11]

Jonathan Haidt has critiqued Kohlberg’s theory, on the grounds that people’s verbal reasoning process for justifying moral hypotheticals does not drive their conclusions.  In hypothetical scenarios about taboos —  like a pair of siblings who have sex, using birth control and feeling no subsequent ill effects — people quickly assert that incest is wrong, but can’t find rational explanations to justify it. People’s affective associations with taboo scenarios (such as claiming that it would upset them to watch) were better predictors of their judgments than their assessments of the harm of the scenarios.[10]

If the social intuitionists like Haidt are correct, then research in Kohlberg’s paradigm may tell us something about people’s verbal narratives about morality, but not about their decision-making process.

There is also the possibility that interviews about hypotheticals are not good proxies for moral decision-making in practice; people may give the explanations that are socially desirable rather than the real reasons for their judgments, and their judgments about hypotheticals may not correspond to their actions in practice.

Still, Kohlberg’s stages are an empirical phenomenon: there is high inter-rater reliability, people advance steadily in stage with age (before stabilizing), and industrialized societies have higher rates of the higher stages.


Erik Erikson was a psychoanalyst who came to his own theory of stages of psychosocial development in the 1950’s, in which different stages of life force the individual to confront different challenges and develop different “virtues.”

Erikson’s developmental stages are:

  1. Trust vs. Mistrust (infancy, relationship with mother, feeding and abandonment)
  2. Autonomy vs. Shame (toddlerhood, toilet training)
  3. Initiative vs. Guilt (kindergarten, exploring and making things)
  4. Industry vs. Inferiority (grade school, sports)
  5. Identity vs. Role Confusion (adolescence, social relationships)
  6. Intimacy vs. Isolation (romantic love)
  7. Generativity vs. Stagnation (middle age, career and parenthood)
  8. Ego integrity vs. Despair (aging, death)

This theory had its origins in subjective clinical impressions. There has been some attempt to correlate a measure of identity achievement with other positive attributes, but, for instance, it has no association with self-esteem or locus of control, which would seem counterintuitive if the “identity achievement” score really corresponded to the development of an independent self.

A self-report questionnaire, in which people rated themselves on Trust, Autonomy, Initiative, Industry, Identity, and Intimacy, was found to have moderately high Cronbach’s alpha scores (0.57-0.75).  Males scored higher on autonomy and initiative, while females scored higher on intimacy, as you’d expect from sex stereotypes.[12]
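For readers unfamiliar with the internal-consistency statistic quoted above, here is a minimal sketch of how Cronbach’s alpha is computed from questionnaire responses. The response data are made up for illustration and have nothing to do with the study; the formula is the standard one, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    items = list(zip(*responses))                      # one tuple of scores per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical data: four respondents answering three items on one subscale.
perfectly_consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
noisy = [[1, 2, 1], [2, 1, 3], [3, 4, 2], [4, 3, 4]]

alpha_consistent = cronbach_alpha(perfectly_consistent)   # 1.0
alpha_noisy = cronbach_alpha(noisy)                       # 21/29, about 0.72
```

When every item moves in lockstep, alpha is 1; as items disagree with each other, alpha falls, so values in the 0.57-0.75 range indicate items that hang together only moderately well.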

Two studies, one of 394 inner-city men, and one of 94 college sophomore men, classified them as stage 4 if they never managed to live independently or make lasting friendships, stage 5 if they managed to live apart from their family of origin and become financially independent, stage 6 if they lived with a wife or partner, and stage 7 if they had children, managed others at work, or otherwise “cared for others”. They added a stage 6.5 for career consolidation.  Adult life stages in this sense were independent of chronological age, and men who didn’t master earlier stages usually never mastered later ones.[13]

Erikson’s Stage 5, identity development, has some observational evidence behind it; children’s spontaneous story-telling exhibits less concern with identity than adolescents’.  One researcher “found the white adolescents to show a pattern of “progressive identity formation” characterized by frequent changes in self-concept during the early high school years followed by increasing consistency and stability as the person approached high school graduation. In contrast, the black adolescents showed a general stability in their identity elements over the entire study period, a pattern Hauser termed “identity foreclosure.” He interpreted this lack of change as reflecting a problem in development in that important developmental issues had been dodged rather than resolved.”  Of course, it may also mean that “identity development” is culturally contingent rather than universal.[14]

A study that gave 244 undergraduates a questionnaire measuring the Eriksonian ego strengths found that “purpose in life, internal locus of control, and self-esteem bore strong positive relations with all of the ego strengths, with the exception of care.”  But there were no significant correlations between the ego strengths and age, nor any indication that they are achieved in a succession.[15]

A study giving 1073 college students an Erikson developmental stage questionnaire found that it did not fit the “simplex” hypothesis (where people’s achievement of stage n would depend directly on how well they’d achieved step n-1, and less on other stages.)[16]

A 22-year longitudinal study showed that people continued to develop higher scores on Erikson developmental-stage questionnaires between the ages of 20 and 42, even “younger” stages; there was a significant increase over time in stages 1, 5, and 6, for several cohorts.[17]

While people do seem in some cases to gain more of Erikson’s ego strengths over time, this finding is not reliable in all studies. People do not climb Erikson’s stages in sequence, or at fixed ages.


Psychologist Abraham Maslow, inspired by the horrors of war to learn about what propels people to become “self-actualized”, developed the concept of a “hierarchy of needs” in which the lower ones must be fulfilled before people can pursue the higher ones.

The needs in Maslow’s hierarchy are:

  • Physiological (food, air, water)
  • Safety (security from violence, disease, or poverty)
  • Love and belonging
  • Esteem (self-respect, respect from others)
  • Self-actualization (realizing one’s potential)

The theory is that lower needs, when unsatisfied, “dominate” higher needs — that one cannot focus on esteem without first satisfying the need for safety, for instance.  Once one need is satisfied, the next higher need will “activate” and start driving the person’s actions.

Three researchers (Alderfer, Huizinga, and Beer) developed questionnaires designed to measure Maslow’s needs, but all had weaknesses, “particularly a low convergence among items designed to measure the same constructs.”  None of the studies showed Maslow’s five needs as independent factors.  Both adjacent and nonadjacent needs overlap, contradicting Maslow’s theory that needs are cumulative.  People also do not rank the importance of those needs according to Maslow’s order.  Also, the “deprivation/domination” paradigm (that, the more deprived you are of a need, the higher its importance to you) is contradicted by studies that show that this is not true for safety, belonging, and esteem needs.  The “gratification/activation” theory, that when need n is satisfied, need n becomes less important and need n + 1 becomes more important, was also not borne out by studies.

The author of the review concludes, “Maslow’s Need Hierarchy Theory is almost a nontestable theory…Maslow (1970) criticized what he called the newer methods of research in psychology. He called for a “humane” science.  Accordingly, he did not attempt to provide rigor in his writing or standard definitions of concepts. Further, he did not discuss any guides for empirical verification of his theory. In fact, his defense of his theory consisted of logical as well as clinical insight rather than well-developed research findings.”[18]

However, a more recent study of 386 Chinese subjects found Cronbach’s alpha scores in the 0.80-0.90 range, positive correlations between the satisfaction of all needs, and higher correlations between the satisfaction of adjacent needs than nonadjacent needs.  This seems to suggest a stagelike progression, although the satisfaction of all needs still overlaps.  Also, satisfaction of the physiological needs was a predictor of the satisfaction of every one of the four higher-level needs.[19]

A global study across 123 countries found that subjective wellbeing, positive feelings, and negative feelings were all correlated in the expected ways with certain universal needs: basic needs for food and shelter, safety and security, social support and love, feeling respected and pride in activities, mastery, and self-direction and autonomy.  The largest proportion of variance explained globally in life evaluation was from basic needs, followed by social, mastery, autonomy, respect, and safety.  The largest proportion of variance explained in positive emotions was from social and respect. The largest proportion of variance explained in negative emotions was from basic needs, respect, and autonomy.  There are “crossovers”, people who have fulfillment of higher needs but not lower ones: “For example, respect is frequently fulfilled even when safety needs are not met.”[20]

It is unclear whether Maslow’s needs are distinct natural categories, and it is clear that they do not have to be satisfied in sequence, that the most important needs to people are not necessarily the lowest ones or the ones they lack most, and that people do not develop stronger drives towards higher needs when their lower needs are fulfilled. The only part of Maslow’s theory that is borne out by evidence is that people around the world do, indeed, value and receive happiness from all Maslow’s basic categories of needs.


Robert Kegan is not an experimental psychologist, but a practicing therapist, and his books are works of interpretation rather than experiment. He integrates several developmental-psychology frameworks in The Evolving Self, such as Piaget, Kohlberg, and Maslow.

Kegan’s stage 0 is “Incorporative” — babies, corresponding to Piaget’s sensorimotor stage, no real social orientation.

Stage 1 is “impulsive”, corresponding to Piaget’s pre-operational stage, Kohlberg’s punishment/obedience orientation, and Maslow’s physiological satisfaction orientation: the subject is impulses, the objects are reflexes, sensing and moving. This is roughly toddlers.

Stage 2 is “imperial”, corresponding to Piaget’s concrete operational stage, Kohlberg’s “instrumental” orientation, and Maslow’s safety orientation; this is roughly grade-school-aged children. The subject is needs and wishes, the objects are impulses.

Stage 3 is “interpersonal”, corresponding to Piaget’s early formal operational, Kohlberg’s interpersonal concordance orientation, and Maslow’s belongingness orientation. The subject is mutuality and interpersonal relations, the objects are needs and wishes.  These are young teenagers.

Stage 4 is “institutional”, corresponding to Piaget’s formal operational, Kohlberg’s social contract orientation, and Maslow’s self-esteem orientation.  The subject is personal autonomy, the objects are mutuality and interpersonal relations.  This is usually young adulthood and career socialization.

Stage 5 is “interindividual”, corresponding to Maslow’s self-actualization orientation and Kohlberg’s principled orientation.  This usually corresponds to mature romantic relationships.

The Subject-Object Interview is Kegan’s scale for measuring progression along the stages.

In a study of West Point students, average inter-rater agreement on the Subject-Object Interview was 63%, and students developed from stage 2 to stage 3 and from stage 3 to stage 4 over their years in school. Kegan stage in senior year had a correlation of 0.41 with MD (military development) grade.[21]
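To put an inter-rater agreement figure like the one above in context, here is a minimal sketch of two common ways to score agreement between raters: raw percent agreement and Cohen’s kappa, which corrects for agreement expected by chance. The stage ratings below are invented for illustration, and I am assuming (the paper summary above does not say) that the reported figure is closer to raw percent agreement.

```python
from collections import Counter


def percent_agreement(rater_1, rater_2):
    """Fraction of subjects given the same stage by both raters."""
    return sum(a == b for a, b in zip(rater_1, rater_2)) / len(rater_1)


def cohens_kappa(rater_1, rater_2):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(rater_1)
    observed = percent_agreement(rater_1, rater_2)
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    expected = sum(counts_1[label] * counts_2[label] for label in counts_1) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical Kegan-stage ratings of eight interviews by two raters.
rater_a = [2, 2, 3, 3, 3, 4, 4, 3]
rater_b = [2, 3, 3, 3, 4, 4, 4, 3]

agreement = percent_agreement(rater_a, rater_b)   # 0.75
kappa = cohens_kappa(rater_a, rater_b)            # 0.6
```

With only a handful of possible stages, raters agree fairly often by luck, so the chance-corrected number is always lower than the raw one; that is worth keeping in mind when reading a raw agreement percentage.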

A study of 67 executives found that Kegan stage was correlated with leader performance at a p < 0.05 level; Kegan stage was also positively correlated with age.[22]

I was not able to find any studies that indicated whether people skip Kegan stages, regress in stage, or exhibit characteristics of other stages, or other psychometric instruments that decompose into Kegan stages with factor analysis.  Kegan’s stages do appear to be relatively observable and higher stage seems to correspond fairly well with external evaluations of leadership skill.


Piaget’s stages are not distinct (they overlap) or sequential (they can be skipped or attained in different orders).  Later stages do correlate with greater age, but the stages do not arise at consistent ages.

Kohlberg’s stages are sequential; they are defined as distinct by the measurement instrument; and they increase with age (as well as with social class and socioeconomic development of the community.)  Stages don’t arise at fixed ages.

Erikson’s stages do not appear to be distinct, sequential, or even consistently increasing with age.

Maslow’s needs do not appear to be sequentially satisfied.

Kegan’s stages are defined to be distinct by the measurement instrument, and they increased with age in two studies.  I could not find evidence that they are attained sequentially.

Overall, the experimental evidence that distinct, cumulative stages of human development exist is rather weak. The strongest evidence is for Kohlberg’s stages, and these (like all the other stages considered) are limited by the fact that they are measures of how people talk about moral decision-making, rather than what they decide in practice.

Higher stages correlate with positive results in many cases: people at higher Kohlberg stages are less likely to be criminals or delinquents, positive psychological strengths like self-esteem correlate with the Eriksonian ego strengths, and leadership development measures correlate with Kegan stage.  This is evidence that developmental stages do often correspond to real psychological strengths or skills with external validity.  We just don’t generally have strong reason to believe that they progress in a developmental fashion.


[1] Piaget, Jean. The origins of intelligence in children. Vol. 8. No. 5. New York: International Universities Press, 1952.

[2] Spelke, Elizabeth S. “Physical knowledge in infancy: Reflections on Piaget’s theory.” The epigenesis of mind: Essays on biology and cognition (1991): 133-169.

[3] Dumas, Claude, and François Y. Doré. “Cognitive development in kittens (Felis catus): An observational study of object permanence and sensorimotor intelligence.” Journal of Comparative Psychology 105.4 (1991): 357.

[4] Gray, William M. “The Factor Structure of Concrete and Formal Operations: A Confirmation of Piaget.” (1976).

[5] Shayer, Michael, Andreas Demetriou, and Muhammad Pervez. “The structure and scaling of concrete operational thought: Three studies in four countries.” Genetic, Social, and General Psychology Monographs 114.3 (1988): 307-375.

[6] Fischer, Kurt W., and Louise Silvern. “Stages and individual differences in cognitive development.” Annual Review of Psychology 36.1 (1985): 613-648.

[7] Kohlberg, Lawrence. “Stages of moral development.” Moral education 29 (1971).

[8] Snarey, John R. “Cross-cultural universality of social-moral development: a critical review of Kohlbergian research.” Psychological bulletin 97.2 (1985): 202.

[9] Walker, Lawrence J. “The sequentiality of Kohlberg’s stages of moral development.” Child Development (1982): 1330-1336.

[10] Haidt, Jonathan. “The emotional dog and its rational tail: a social intuitionist approach to moral judgment.” Psychological review 108.4 (2001): 814.

[11] Chandler, Michael, and Thomas Moran. “Psychopathy and moral development: A comparative study of delinquent and nondelinquent youth.” Development and Psychopathology 2.03 (1990): 227-246.

[12] Rosenthal, Doreen A., Ross M. Gurney, and Susan M. Moore. “From trust on intimacy: A new inventory for examining Erikson’s stages of psychosocial development.” Journal of Youth and Adolescence 10.6 (1981): 525-537.

[13] Vaillant, George E., and Eva Milofsky. “Natural history of male psychological health: IX. Empirical evidence for Erikson’s model of the life cycle.” The American Journal of Psychiatry (1980).

[14] Waterman, Alan S. “Identity development from adolescence to adulthood: An extension of theory and a review of research.” Developmental psychology 18.3 (1982): 341.

[15] Markstrom, Carol A., et al. “The psychosocial inventory of ego strengths: Development and validation of a new Eriksonian measure.” Journal of youth and adolescence 26.6 (1997): 705-732.

[16] Thornburg, Kathy R., et al. “Testing the simplex assumption underlying the Erikson Psychosocial Stage Inventory.” Educational and psychological measurement 52.2 (1992): 431-436.

[17] Whitbourne, Susan K., et al. “Psychosocial development in adulthood: A 22-year sequential study.” Journal of Personality and Social Psychology 63.2 (1992): 260.

[18] Wahba, Mahmoud A., and Lawrence G. Bridwell. “Maslow reconsidered: A review of research on the need hierarchy theory.” Organizational behavior and human performance 15.2 (1976): 212-240.

[19] Taormina, Robert J., and Jennifer H. Gao. “Maslow and the motivation hierarchy: Measuring satisfaction of the needs.” The American journal of psychology 126.2 (2013): 155-177.

[20] Tay, Louis, and Ed Diener. “Needs and subjective well-being around the world.” Journal of personality and social psychology 101.2 (2011): 354.

[21] Lewis, Philip, et al. “Identity development during the college years: Findings from the West Point longitudinal study.” Journal of College Student Development 46.4 (2005): 357-373.

[22] Strang, Sarah E., and Karl W. Kuhnert. “Personality and leadership developmental levels as predictors of leader performance.” The Leadership Quarterly 20.3 (2009): 421-433.

Resolve Community Disputes With Public Reports?

Epistemic Status: speculative, looking for feedback

TW: Rape

You’re in a tight-knit friend group and you hear some accusations about someone. Often, but not always, these are rape or sexual harassment accusations. (I’ve also seen it happen with claims of theft or fraud.)  You don’t know enough to take it to court, nor do you necessarily want to ruin the accused’s life, but you’ve also lost trust in them, and you might want to warn other people that the accused might be dangerous.

What usually happens at this point is a rumor mill.  And there are a lot of problems with a rumor mill.

First of all, you can get the missing stair problem.  Let’s say Joe raped someone, and the rumor got out. People whisper to each other that Joe is a rapist, they warn each other to stay away from him at parties — but the new girl, who isn’t in on the gossip, is not so lucky, and Joe rapes her too.  And meanwhile, Joe suffers no consequences for his actions, no social disincentive, and maybe community elders actively try to hush up the scandal.  This is not okay.

Sometimes you don’t get a missing-stair situation, you get a witch hunt or a purge.  Some communities are really trigger-happy about “expelling” people or “calling them out”, even for trivial infractions. A girl attempted suicide because her internet “friends” thought her artwork was offensive and tormented her for it.  This is also not what we want.

Sometimes you get a feud, where Alice the Accuser’s friends all rally round her, and Bob the Accused’s friends all rally round him, and there’s a long-lasting, painful rift in the community where everyone is pressured to pick a side because Alice and Bob aren’t speaking.

And a lot of the time you get misinformation spreading around, where the accusation gets magnified in a game of telephone, and you hear vague intimations that Bob is terrible but you are getting conflicting stories about what Bob actually did, and you don’t know the right way to behave.

I have never gotten to know a tight-knit social circle of youngish people that didn’t have “drama” of this kind.  It’s embarrassing that it happens, so it isn’t talked about that much in public, but I’m starting to believe that it’s near-universal.

In a way, this is a question of law.

The American legal system is a really poor fit even for dealing with some legitimate crimes, like sexual assault and small-scale theft, because the odds of a conviction are so low.  Less than 1% of rape, robbery, and assault cases lead to convictions. It can be extremely difficult and stressful to deal with the criminal-justice system, particularly if you’re traumatized, and most of the time it won’t even work.  Moreover, the costs of a criminal penalty are extremely high — prison really does destroy lives — and so people are understandably reluctant to put people they know through that.  And, given problems with police violence, involving the police can be dangerous.

And, of course, for social disputes that aren’t criminal, the law is no use at all.  “Really terrible boyfriend/girlfriend” is not a crime.  “Sockpuppeting and trolling” is not a crime.

Do we have to descend to the level of gossip, feud, witch-hunt, or cover-up, just because we can’t (in principle or in practice) resolve disputes with the legal system?

I think there are alternatives.

My proposed solution (by no means final) is that a panel of trusted “judges” accepted by all parties in the dispute compile a summary of the facts of the case from the accuser(s) and accused, circulate it within the community — and then stop there.

It’s not an enforcement mechanism or a dispute-resolution mechanism, it’s an information-transmission mechanism.

For example, it means that now people will know Joe is an accused rapist, and also know if Joe has explained that he’s innocent. This prevents a few problems:

  • the “missing stair” problem where new people never get warned about Joe
  • the problem that Joe faces no consequences (now his reception will likely be chillier among people who read the summary and think he’s a threat)
  • if Joe is innocent, he’ll face less unfair shunning if people get to hear his side of the story
  • it prevents inflated rumors about Joe from spreading — only the actual accusations get printed, not the telephone-garbled ones

There are a few details about the mechanism that seem important to make the process fair:

  • Accused and accusers must all consent to participating in the process and having their statements made public, otherwise it doesn’t happen
  • Accusers should be allowed to stay anonymous
  • Everybody can meet with the judges at the same time, or one on one, if they choose; accused and accusers do not have to be in the same room together
  • Judges should not have any personal stake in the dispute, and should be accepted by both accused and accusers
  • The format for the report should be something like a password-locked webpage, an email to a mailing list, or a Google doc, not a page on the public internet
  • The report should be limited to what accused and accusers say, and some fact-checking by the judges — maybe a timeline of claimed events, maybe some links to references. But not a verdict.

I’ve heard some counterarguments to this proposal so far.

First, I’ve heard concerns that this is too hard on the accused. Being known to have been accused of anything will make people trust you less, even if you also have the opportunity to defend yourself.  And maybe people, not wanting to make trouble, will still use gossip rather than the formal system, because a formal report seems like too harsh a penalty.

I think it’s fine if not every dispute gets publicly adjudicated; if people don’t want to take it that far, then we’re no worse off than before the option of public fact-finding was made available.

It’s also not obvious to me that this is harsher to the accused than the social enforcement technology we already have.  People are already able to cause scandals by unilaterally making public accusations. My proposal isn’t unilateral — it doesn’t go through unless the accused and accusers both think that transparency can clear their names.

Another criticism I’ve heard is that it gives a false sense of objectivity. People know that the rumor mill is unreliable and weight it appropriately; but if people hear “there’s been a report from a panel of judges about this”, they might assume that everything in the report is definitely true, or worse, that the accused is just guilty by virtue of having been investigated.

This is a real problem, I think, but one that’s difficult to avoid completely. If you attempt to be objective in any setting, you always run the risk that people will mistake you for an oracle. It’s just as true that objective news coverage can give people a false sense of trust in newspapers, but journalistic ideals still promote objectivity. I do think giving an impression of a Weight of Authority can be harmful, and is only somewhat mitigated by practices like not handing down any verdict.

But I think information-sharing is the mildest form of restorative justice.  Restorative justice is dispute resolution within a community, or between offender and victim, rather than being mediated by the state.  It usually involves some kind of penalty and/or restitution from the offender to the victim, or some kind of community penalty (like shunning in various religious congregations.)  Given the failures of the criminal-justice system, restorative justice seems like an appealing goal to me; but it’s hard to implement, especially in modern, non-religious communities of young people without firm shared norms.  If you’re uncomfortable merely publishing accusations and defenses, there’s no way you’re ready to impose restitution within your community.  Maybe that’s appropriate in a given situation — maybe loose friend groups aren’t ready to be self-governing communities. But if you have aspirations towards self-governance, from small-scale (communes) to large-scale (seasteading and the like), figuring out dispute resolution is a necessary step, and it’s worth thinking about what would be required before you’d be okay with any community promotion or enforcement of norms.

I’d actively welcome people’s thoughts and comments on this — how would it fail? how could the mechanism be improved?


Closer to Fine

Epistemic Status: Personal

I’ve done a lot of personal growth work in the past year, and I wanted to document it here, for the sake of general edification and also to keep a record of things.

I’m now at the point where I have markedly less interest in general “self-improvement.” Skill-building never stops, of course, but I feel like I’m fundamentally fine and don’t need fixing; I’m done with the “fix myself” stage of my life.  So it seems like a good point at which to stop and reflect.

Basic Mood

I manage my mood with antidepressants and anti-anxiety meds, and have since September of 2015. I have pretty vanilla 21st century mild depression/anxiety, and at the moment I do much better medicated than not.

But I think of mood as just a sort of baseline scalar value, and addressing it chemically isn’t enough if you also have more complex cognitive stuff going on that needs to be fixed.


Scrupulosity

Scrupulosity is a term borrowed from the Catholics, referring to “An unfounded apprehension and consequently unwarranted fear that something is a sin which, as a matter of fact, is not”.  In other words, irrational guilt. This used to be a big problem for me.

Scrupulosity is very painful, and seems to be little discussed outside of religious contexts, but one can definitely get it about secular issues. Is it bad that I make money? Is it bad that I hold some controversial opinions? Am I too difficult, too overbearing, too greedy, too much?

The first thing, for me, in combating scrupulosity, was having a moral framework in which I could be confident that the benign things I did were, in fact, not wrong. It was very important to me that this framework was credible and didn’t feel like a bunch of pretty lies. Some things are wrong: cruelty, dishonesty, etc.  I am not perfect and do occasionally do bad things. But things like “thinking for myself” or “asserting myself” or “earning a living” are good, not bad.  And there are straightforward reasons why.

This isn’t enough by itself, because you can be intellectually aware that a thing is true, and still have strong negative feelings that contradict it. So the next thing I did was develop repetitive associations and verbal fluency. I literally made Anki cards with inspiring quotes on them and memorized them. I read books that encouraged self-worth. I talked to people who could tell me I was okay.  I picked up little verbal mantras that I’d repeat to myself when I felt down.

This still isn’t enough, because you can fall into the trap of depending too much on external reassurance. There’s a big difference between “please tell me I’m okay, for the umpteenth time” and just believing “yeah, I’m okay, that’s a fact about reality.”  Getting over this last hump, for me, was mostly a matter of insights, deeply internalized.

Epiphanies are weird, because what happens in practice is that you have a lot of them, and they all sound really obvious if you say them out loud. I believe this is normal. Single epiphanies don’t change you for life, mostly, but many epiphanies on the same theme add up.

Things like “I can trust myself more”, “I deserve happiness”, “I am actually successful by my own standards”, “I have strong convictions and admire heroism and that is a good thing”, “I care about truth and science a lot”, “I care about peace and freedom a lot,” “I deserve to give myself credit”, “I deserve to live”, and so on, sound like mundane platitudes, but there’s an experience of grokking them deeply, recognizing that they’re not just nice things to say but in a certain sense literally true in real life, that is essential and can’t be replaced.

Emotional Lability

I had a problem with freaking out over small stuff and wanting a lot of comfort and attention in response. Moreover, I didn’t entirely want to stop freaking out; I kind of enjoyed the drama.

What “attention-seeking” felt like on the inside was craving an intense sensation of pleasure, which was maddeningly hard to get access to, but seemed like it ought to be easily available, and it was frustrating that people weren’t giving it to me, when it seemed so simple. All you have to do is react strongly to my behavior — be shocked, be enthusiastic, be angry, be sympathetic, give me stimulus of some kind!  What’s so hard about that?

But it is hard for a lot of people, and I began to realize that it was a burden on people I love. Moreover, it’s especially harmful to people who are very truthful. Emotional reactions often run counter to careful, honest reasoning.  Splashing around in the Emotion Sandbox often means saying things you don’t really mean, and when people take you literally, you’re deceiving them. Truthful people are also reluctant to jump into the Emotion Sandbox with you, because they want to maintain their own intellectual integrity.

Careful, denotative, truthful use of language, where you’re trying to communicate about reality, rather than just splashing emotional stimulus at each other, is a really useful skill. It built civilization.  Very few people are good at it, and those people are precious. Some of the most important people in my life are good at rationality in this sense, and I care about their happiness, and wouldn’t want to pressure them into damaging their souls. I’m good at rationality in some contexts, for that matter, and enhancing my ability to do science is very important to me, so I need to be careful with my mind.

Also, I’m getting older. “Wild and crazy” is cute on a teenager, less so on an adult.  The person I want to be in the coming years is an intelligent, decisive, practical woman.  I’m going to have a lot of things to do, and that trades off against emotional sound and fury.

So I basically concluded that I can accept a lower-emotional-lability life, because other things are more important.

In practice, that means I’ve cut out social media, drastically reduced the amount I bug people for emotional reassurance or otherwise try to provoke an emotional reaction, and am cultivating a sort of “I’m fine”, cheerful-worker-bee, don’t-sweat-the-small-stuff sensibility.

At first it started out as a sort of grim satisfaction in Doing The Right Thing, but increasingly it’s felt more like actual cheerfulness, or like strength. Security in my own ability to be fine.


Perversity

Perversity is my word for when you do bad things on purpose.  Usually, in my case, this was laughably simple: I would go around saying “I’m bad!” and using really gloomy language.

I think it’s akin to some kinds of impulsive behavior like abusing drugs or self-harm, though, in that it involves doing stuff because it’s against the rules or doing stuff because it fits your self-image as a bad person.

I think perversity is actually quite widespread. When people lose hope that some good thing is possible, they say “forget it, I’ll just be Bad then.”  When people believe that ethical people are doomed to lose to unethical people, they can decide to be Bad. (I’ve met finance guys who are actively excited about how they’re investing in companies that destroy the rainforest. It’s not that they’re principled critics of environmentalism, it’s that they identify as the Baddies.)  I think some kinds of shallowness and cynicism and playing-dumb are symptoms of loss of hope.  I think that when you lose hope, you tend to adopt the belief that only losers hope.

In the game of Hearts, the person who accumulates the most heart cards loses, unless you accumulate all the hearts (plus, under the standard rules, the queen of spades), in which case you win. This is called “shooting the moon.”

Perversity is like shooting the moon; somewhere, subconsciously, you hold the belief that if you only lost enough, you’d win. It doesn’t make sense, but somehow it can be emotionally powerful. There is a will to lose. There is such a thing as Thanatos, the death-drive.   There is such a thing as hatred of the good for being the good.

Any talk of such impulses has a tendency to sound paranoid, but I’m pretty confident that this is a real thing.  It’s not a complicated thing, or a mysterious force of darkness, though. It’s just the subconscious belief that a.) you’re definitely screwed (in some way), and b.) if you decide to lean into the bad thing on purpose that will make it okay.  If you suck on purpose, you don’t have to feel guilty for failing; if you harm yourself on purpose, or harm others, that will make it okay that the world harmed you.

This is kind of bassackwards, of course. There is no rule in reality that if you collect all the Badness, you win.  You just lose more.

(I am not the first person to notice that Hearts can be a weirdly emotionally compelling game, and deeply linked to the impulse towards perversity.)

For me, perversity was partly downstream of scrupulosity. The “I’m definitely screwed” part took the form of believing that I was a bad person, or an unsuccessful person, or an undeserving person.  Understanding that this literally wasn’t true was essential to overcoming despair.

There’s also the more-or-less independent epiphany that’s best summarized as “Goodness works.”  Being truthful, constructive, principled, etc. results in victory, not defeat.  The Allies won and the Nazis lost. The Quakers got rich on their reputation for honesty in business. Correct physics will build airplanes that fly.  Having nice things depends on people building nice things, and most of the time and in the long run, the best way to have nice things is to contribute to building them.  Exploitation is an edge case, that only works locally and burns itself out quickly.

Scott Alexander gets this:

I worry that I’m not communicating how beautiful and inevitable all of this is. We’re surrounded by  a vast confusion, “a darkling plain where ignorant armies clash by night”, with one side or another making a temporary advance and then falling back in turn. And in the middle of all of it, there’s this gradual capacity-building going on, where what starts off as a hopelessly weak signal gradually builds up strength, until one army starts winning a little more often than chance, then a lot more often, and finally takes the field entirely. Which seems strange, because surely you can’t build any complex signal-detection machinery in the middle of all the chaos, surely you’d be shot the moment you left the trenches, but – your enemies are helping you do it. Both sides are diverting their artillery from the relevant areas, pooling their resources, helping bring supplies to the engineers, because until the very end they think it’s going to ensure their final victory and not yours.

Understanding that goodness wins is the same thing as understanding that you can’t shoot the moon.

Being as bad as possible doesn’t make you Milton’s Satan, it makes you the dictator of North Korea. It is small and shitty and ruined and disappointing and sad.  You can’t get nice things by wrecking all the nice things.

If you grok this, then you stop seeing the appeal in fake things, or scams, or random chaos, or anything that isn’t “productive” in the “building more nice things” sense.  An unscrupulous employer can give you money…which you won’t enjoy, because working there will wear you down? That doesn’t sound fun. An angry outburst will…hurt the love of your life?  Well, that just sounds sad.  Obeying someone mean and scary means…you have to spend more time obeying someone mean and scary, instead of getting free. What’s so great about that?

I don’t think I’m articulating this well, but there’s sort of a sense of “you could have paradise — why would you lock yourself into a cage? why not have more good things…and fewer bad things?” And when this solidifies into what you actually believe (as opposed to an idea you’re flirting with or trying on), you have a kind of armor against perversity.




Life Update

I’ve just started a job as a data scientist at Recursion Pharmaceuticals.  I’m using machine learning to find new drug compounds.

The basic model is:

  • take some rare diseases that are caused by single genes;
  • simulate these diseases cheaply and at scale with siRNA knockdowns;
  • detect (here’s the machine learning part) how images of sick cells look different from images of healthy cells;
  • observe (machine learning again) which drugs make sick cells look like healthy cells;
  • send the promising drugs on to in-vitro and in-vivo screens
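
To make the “which drugs make sick cells look like healthy cells” step a little more concrete, here is a toy sketch in Python. It is not Recursion’s actual method; it assumes each cell image has already been boiled down to a short list of numeric features, and the function names (`centroid`, `rescue_score`, `rank_drugs`) and all of the numbers are made up for illustration.

```python
from math import dist
from statistics import mean


def centroid(profiles):
    """Mean feature vector of a list of equal-length feature vectors."""
    return [mean(column) for column in zip(*profiles)]


def rescue_score(healthy, sick, treated):
    """Fraction of the sick-to-healthy gap that a treatment closes.

    Near 1.0: treated cells look like healthy cells.
    Near 0.0: treated cells still look like untreated sick cells.
    (Hypothetical scoring rule; values can fall outside [0, 1].)
    """
    healthy_center = centroid(healthy)
    gap = dist(centroid(sick), healthy_center)
    return 1.0 - dist(centroid(treated), healthy_center) / gap


def rank_drugs(healthy, sick, treated_by_drug):
    """Sort drug names from most to least healthy-looking outcome."""
    scores = {
        drug: rescue_score(healthy, sick, cells)
        for drug, cells in treated_by_drug.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Made-up two-feature profiles, purely for illustration.
healthy = [[1.0, 1.0], [1.2, 0.8]]
sick = [[3.0, 3.0], [3.2, 2.8]]
treated_by_drug = {
    "drug_a": [[1.1, 1.1], [1.3, 0.9]],  # ends up close to healthy
    "drug_b": [[3.0, 2.9], [3.1, 3.0]],  # barely moves from sick
}

ranking = rank_drugs(healthy, sick, treated_by_drug)
print(ranking)
```

In this toy data the first drug pulls the cells most of the way back toward the healthy profile and the second barely moves them, so the first ranks higher. A real pipeline would presumably learn its features from the images themselves and use a more careful notion of “looks healthy” than distance between averages.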

This is basically my dream job.  I’ve been torn between math and biology since I was maybe 9; now I get to do both.  And I get to work towards precisely the problems I care about: curing diseases, getting as much purchase as possible out of computational methods in practical applications, reversing Eroom’s Law, etc.  I’m thrilled to be working at Recursion.

Due to company policy, I won’t be able to keep doing freelance lit reviews, at least not for paid projects; I expect to still do the occasional free project here and there.

I’ve also made a few updates in my views recently that I thought I’d share here.

  • The boost in my productivity and overall well-being from having a meaningful job, working with people I trust and respect on problems I care about, is enormous. Much more than I’d have expected. I am now much more sympathetic to messages like “too many people are trapped in bullshit jobs”, “pointless busywork in school is harmful”, “it’s bad to be alienated from one’s labor”, etc. I’m more bullish on things like self-employment, unschooling, quitting your job to pursue your passion, and so on; stagnation is a real cost to your soul.
    • I’m reminded of the theories of people like Gabriel Kolko, who said that the bigness of “big business” is an artifact of regulatory capture, in which large businesses are subsidized by the state. In this model, the “natural”, undistorted size of businesses would be smaller, and fewer things would be done that had no real purpose besides checking an officially-required box.  Pointless activity, under this model, is not “natural”; it’s usually forced.
    • I’m sort of playing with the idea of a philosophy of “makerism”, in which the good guys are simply the people who do self-evidently useful things. Building a house or preparing a meal is obviously Useful Work. As is discovering a drug or inventing a tool. In makerism, if you’d have trouble explaining to a precocious twelve-year-old why you’re doing a useful thing, there’s a chance that what you’re doing is bullshit. I’ve sort of poked at the idea of measures of awesomeness and the ecosystem of industry before.  The thing I’m trying to grope towards is productiveness. Not productivity, as in number of hours worked per day, or number of widgets produced per worker, but reaching towards usefulness, value, fruitfulness, substantialness, good-for-humans-ness.
  • My main update from job searching this time around (in mostly Silicon-Valley-based data science jobs) is that there is a thing called “fit” — how close the applicant’s background and skills are to what the employer is looking for — and the jobs you are an exact fit for will love you, and the ones you’re an imperfect fit for will reject you. For instance, it’s basically not worth it for me to even apply for jobs as a “data engineer”, because I’m not one. “Oh, it’s close to what I know and I can learn it on the job”? Nope. The right job is the one that’s dead center in the middle of your skillset.
    • Also, I had significantly better results applying to companies in the biomedical industry, I assume because I’ve done biomedical stuff in the past (systems-biology research in grad school, a personalized-medicine startup).  The takeaway here is that I expect you have the best shot in jobs that correspond well to your entire background, including things that you might classify as a “side interest”.  If you have a unique combination of skills, look for places that actively want that.
  • Bay Area software companies seem mostly pretty sane, in that they do not hire the flagrantly unqualified. Don’t expect to bluff your way in.
  • Because there are so many people sharing stories about the opposite experience, I think it behooves me to share mine; I didn’t experience anything that I’d classify as sexism during my job search, even though nearly all my interviewers were male, and so were nearly all the data scientists at the companies where I applied. The closest thing was being told that I was too “nervous” by one interviewer, which is sort of gendered in a statistical sense, but is also legitimately true of me, and not true of all women.
  • I have noticed that a fair number of companies are “segregated”, in that all the engineers are Asian (and foreign-born) while all the managers are white. It seems to correlate really well with, for lack of a better word, “lameness” — companies that are stagnant, hierarchical, complacent, don’t have a strong engineering culture, etc.  I now consider racial glass ceilings to be a red flag.
  • Skills I wish I’d had: better memory for SQL syntax (yes, really), deep learning, computer vision, ETL pipelines
  • Skills I was glad I had: Spark, familiarity with the Python scientific computing & ML libraries, basic ML skills at the level of Hastie & Tibshirani, basic algorithms & data structures.
  • In technical interviews, a lot comes down to “fluency” or “execution” — can you solve simple math and programming problems correctly and quickly? are you checking for small errors? It’s very g-loaded, but I think there’s a skill of “turning your g on”, getting into “performance mode”, which I learned from years of being a math contest kid, and felt myself relearning as I went through the job search process. If you know what I’m talking about, focus on cultivating that, through repetitive practice of fairly-easy things with a high bar for accuracy, rather than studying super-advanced topics.

What’s Your Type: Identity and its Discontents


My type is Lisa Frank Sea Lion.

When I was a teenager, I had the intuition that third-wave feminism was a genre of feminine content.  A lot of the feminist books and magazines I came across had pink covers. A lot of them were about sex and relationships and clothes and pop culture — the same sorts of things I looked for in Seventeen magazine. I liked those topics; they gave me a deliciously wicked frisson; and I liked the kind of pop-feminist writing that was about Expressing Yourself; but I was obviously not a predominantly pink-flavored person. I was a serious person.

I am embarrassed to say that I never really appreciated the achievements of Rosalind Franklin until I was much older. I had grown up hearing about her as a “women in STEM” sermon.   I was a woman and I was a scientist, but I had decided that “women in STEM” was not my genre, or at least not so much that I would be in danger of being typecast.  The story of Watson and Crick was about DNA, but the story of Rosalind Franklin was about politics and unfairness and the HR-office side of a scientific career.  Obviously, DNA was more exciting to me at the time. It was only later that it clicked — if she independently discovered the double-helix structure, then she’s as much of a genius and pioneer as they were, arguably more so.  Her discovery belongs in the story of scientific progress, not on the shelf of books with pink covers.

In a liberal paradigm, things like feminism or anti-racism or LGBT rights or religious freedom are about liberating people.  You want to get rid of irrational prejudice and oppression so that people of any origin or creed can be free to do human stuff as they choose. The operative word is people. Sexual harassment, for instance, is wrong because it is an unjust harm to people.  None of this has anything to do with being pink-flavored or rainbow-flavored; you can be a middle-aged man with a dark suit and sober habits and speak out against injustice because it harms people, and you care about people, full stop.

The idea that feminism could be a flavor or a subculture or a genre is bizarre, if you look at it from the liberal paradigm.

But there’s also a market segmentation paradigm in which to think about this.

Market segmentation is a technique that marketers use to target products to certain demographics — and “products” include “content”, that is, books and articles and TV shows and so on.  And, with the rise of the internet and the abundance of consumer data, marketers have become very good at it.

Market segmentation involves identifying you with a type of person. A subculture, a demographic, a style, a flavor, a personality type. Cambridge Analytica, the internet marketing firm behind Trump’s success and the Brexit vote, categorizes people by their personality type in order to target political advertising at them. Marketers write profiles of a “typical” buyer of a product — a simplified bio of what kind of person they’re targeting.

“Red state” vs. “Blue state” is market segmentation. Personality types are market segmentation.  Exaggerated gender dimorphism — all women’s products are pink, all men’s products are black — is market segmentation. Subcultures (“nerd”, “goth”, “hipster”) are market segmentation.  Generations (Boomer, Gen X, Millennial) are market segmentation.

Statistical differences between groups of people obviously exist in the real world, but “identifying as” a category, exaggerating how much you match the category’s flavor and style, choosing a “type” to belong to, is a form of actively playing along with market segmentation, over and above whatever statistical differences exist.  One doesn’t “identify as” being born in 1988, but one does “identify as” a Millennial.

What flavor are you? What’s your type? What product is right for you?

There’s something irresistible about a personality quiz.  Tell me what type I belong to!  Tell me about myself!  It gratifies my vanity, and it helps me feel like I know my place in the world.

(I’m an INTP and a Gryffindor, natch.)

It took me a long time, and Dreyfus’ excellent commentary, to realize this, but Heidegger’s concept of Dasein, which literally translates to “being-there,” is really better understood as the behavior of “identifying as.”

Dasein is what you do when you assert what it means to be human, what it means to be you, what it means to be a member of your community.  Dasein is self-definition.  And, in particular, self-definition with respect to a social context. Where do I fit in society? Who is my tribe? Who am I relative to other people? What’s my type?

“Identifying as” always includes an element of misdirection. Merely describing yourself factually (“I was born in 1988”) is not Dasein. Placing an emphasis, exaggerating, cartoonifying, declaring yourself for a team, is Dasein.  But when you identify as, you say “I am such-and-such”, as though you were merely describing. You’re aligning yourself with your flavor of choice, while at the same time declaring vehemently that you’re only describing the way things are.

Your identity, no matter what it is, is always sort of bullshit or arbitrary or performative.  It’s role-playing. It’s kind of like wearing a mask.

And, for people who like it, there’s a delight in “identifying-as”, in putting yourself in a category, in knowing your type.  It makes you feel simple, well-defined, and important.

I knew a psychologist once who worked with businesses, and loved giving his clients the Myers-Briggs personality test. He told me that the main reason he used it was not the particular personality breakdown, but the simple fact that it divided people into 16 types. People would get into workplace disputes that were basically dominance hierarchies, arguments over who’s right or who’s best or who’s in charge. And he would resolve those disputes by helping people understand that Alice is one Myers-Briggs type and Bob is another; not better, not worse, just different.  “There are 16 kinds of people in the world” allows everyone to feel special (“A type! Just for me!”) and defuses hierarchical tussles, because no one type is on top.

But, of course, there are problems with “identifying-as.”

Paul Graham’s essay “Keep Your Identity Small” treats the very feature that my psychologist acquaintance liked about personality types (that no type is better than any other) as a problem, one that makes it impossible to assess merit when identities are in play.

For example, the question of the relative merits of programming languages often degenerates into a religious war, because so many programmers identify as X programmers or Y programmers. This sometimes leads people to conclude the question must be unanswerable—that all languages are equally good. Obviously that’s false: anything else people make can be well or badly designed; why should this be uniquely impossible for programming languages? And indeed, you can have a fruitful discussion about the relative merits of programming languages, so long as you exclude people who respond from identity.

Sometimes there are objective things that can be said about topics that people have chosen to build identities out of. Sometimes a programming language has strengths or weaknesses. Sometimes a government policy has benefits or harms. You might, in some circumstances, care about those objective, on-the-merits evaluations; maybe you want to achieve some goal and want to choose the best programming language for the job. You’re not going to be able to do that if the discussion gets taken over by identity; what people are doing when they’re identifying-as is self-expression or self-definition or self-assertion, which is lovely when you want it, but doesn’t answer any of your practical questions.  Unfortunately, people often do self-expression in the guise of answering your practical questions, and you may not know, or your interlocutor may not even know himself, that he’s really saying “I am a Lisp programmer!!” and not describing anything about the properties of Lisp.  One of the qualities of Dasein is that it’s very very stealthy, and it wants everything to be about Dasein, so it winds up muddying the waters, even when you don’t intend it to.

Coming back to the issue of politics, Dasein can mess up the attempt to solve social problems. If, when you say “sexual harassment”, people hear “feminist shibboleth”, then if they don’t identify as feminists, they may not actually notice the possibility that sexual harassment is a big problem that hurts a lot of human beings and that they might want to take seriously.  Sexual harassment gets perceived as a flag for pink-flavored people to wave, and if you’re not pink-flavored, you’re not the target market, so you don’t take it seriously.

If something matters generally, or is true objectively, regardless of subcultures, personality types, and tribes, then the identity mindset will be inadequate to deal with it.

Identity is obviously a really big part of the human experience. Heidegger thinks it’s essential and cannot be excised, and people who think they’ve achieved objectivity are fooling themselves. Without making that strong an absolute claim, I think it’s fair to say that identity is pervasive, and if you think it’s not an issue for you and have never considered it before, you should probably take a closer look and see how much it affects your life.

It’s also worth noting that Heidegger was a member of the Nazi Party, and that Nazism (as described in Mein Kampf) is all about how objectivity is terrible and how strong feelings of identity, specifically national and racial identity, are the best thing ever.  So there are some reasons to be suspicious of putting identity first at the expense of all other considerations.

Identity is always vivid, personal, flavorful.  It’s not “mere” fact, it’s alive with emphasis and exaggeration.  It’s never bland or dry.  I think that’s part of its appeal.  It makes you special, it makes you valid, it makes you distinctive.  It adds vim and verve to your self-image. It’s like all-caps and italics for your soul.

It may be dull in terms of information content (what it says is, always and forever, “I AM!!!”) but it’s never lacking in personal flair.

Most people I know who think about “identity” are rather like Paul Graham; they don’t have that strong a craving for it, and they’re frequently getting annoyed that other people are caught up in it. Or, they seek very specialized and cordoned-off ways to provide it for themselves: think of secular atheists who create rituals or highly independent introverts who contemplate the human need for community. I come at this from the opposite direction: I am a person who likes things hot-pink and in all caps, who always craves a higher emotional temperature, and who has been learning about how to navigate the fact that this is sometimes damaging and worth avoiding.

So, coming from that perspective, I’m genuinely unsure: do we want to channel identifying-as into safe, satisfying forms of pretend-play, or do we want to just have less of it?  To what extent is it even possible to channel or reduce it?

Strong AI Isn’t Here Yet

Epistemic Status: moderately confident. Thanks to Andrew Critch for a very fruitful discussion that clarified my views on this topic.  Some edits due to Thomas Colthurst.

I’ve heard a fair amount of discussion by generally well-informed people who believe that bigger and better deep learning systems, not fundamentally different from those which exist today, will soon become capable of general intelligence — that is, human-level or higher cognition.

I don’t believe this is true.

In other words, I believe that if we develop strong AI in some reasonably short timeframe (less than a hundred years from now or something like that), it will be due to some conceptual breakthrough, and not merely due to continuing to scale up and incrementally modify existing deep learning algorithms.

To be clear on what I mean by a “breakthrough”, I’m thinking of things like neural networks (1957) and backpropagation (1986) [ETA: actually dates back to 1974, from Paul Werbos’ thesis] as major machine learning advances, and types of neural network architecture such as LSTMs (1997), convolutional neural nets (1998), or neural Turing machines (2014) as minor advances.

I’ve spoken to people who think that we will not need even minor advances before we get to strong AI; I think this is very unlikely.

Predicate Logic and Probability

As David Chapman points out in Probability Theory Does Not Extend Logic, one of the important things humans can do is predicate calculus, also known as first-order logic. Predicate calculus allows you to use the quantifiers “for all” and “there exists” as well as the operators “and”, “or”, and “not.”

Predicate calculus makes it possible to make general claims like “All men are mortal”.  Propositional calculus, which consists only of “and”, “or”, and “not”, cannot make such statements; it is limited to statements like “Socrates is mortal” and “Plato is mortal” and “Socrates and Plato are men.”
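To make the contrast concrete, here is a minimal Python sketch. The tiny domain and the `is_man` and `is_mortal` predicates are made up for illustration; they are not from Chapman’s essay. Note that quantifying over a fixed, finite list like this is the easy, closed-universe case; it says nothing about how the objects in the domain were identified in the first place.

```python
# Illustrative finite domain; the individuals and predicates are assumptions of this sketch.
DOMAIN = ["Socrates", "Plato", "Zeus"]
MEN = {"Socrates", "Plato"}
MORTALS = {"Socrates", "Plato"}

def is_man(x):
    return x in MEN

def is_mortal(x):
    return x in MORTALS

# Propositional style: each fact is its own atomic Boolean, with no shared structure.
socrates_is_mortal = is_mortal("Socrates")
plato_is_mortal = is_mortal("Plato")

# Predicate style: a single statement quantified over the whole domain.
# "For all x, man(x) implies mortal(x)"
all_men_are_mortal = all((not is_man(x)) or is_mortal(x) for x in DOMAIN)

# "There exists x such that not mortal(x)"
something_is_immortal = any(not is_mortal(x) for x in DOMAIN)
```

The propositional facts have to be listed one individual at a time, while the quantified statements are single claims about everything in `DOMAIN`.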

Inductive reasoning is the process of making predictions from data. If you’ve seen 999 men who are mortal, Bayesian reasoning tells you that the 1000th man is also likely to be mortal. Deductive reasoning is the process of applying general principles: if you know that all men are mortal, you know that Socrates is mortal.  In human psychological development, according to Piaget, deductive reasoning is more difficult and comes later — people don’t learn it until adolescence.  Deductive reasoning depends on predicate calculus, not just propositional calculus.
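As a rough worked example of the inductive case, the sketch below applies Laplace’s rule of succession to the 999 mortal men. The uniform Beta(1, 1) prior is an assumption chosen here for simplicity, and the function name is hypothetical.

```python
from fractions import Fraction

def predictive_probability(successes, trials, prior_a=1, prior_b=1):
    """Posterior predictive P(next observation is a success) under a Beta(prior_a, prior_b) prior."""
    return Fraction(successes + prior_a, trials + prior_a + prior_b)

# 999 men observed, all 999 mortal; uniform prior assumed.
p_next_mortal = predictive_probability(999, 999)
```

Under that assumption the probability that the 1000th man is mortal comes out to 1000/1001: very high, but still a prediction about one more case rather than a general law about all men.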

It’s possible to view probability theory as an extension of propositional calculus. For instance, MIRI’s logical induction paper constructs a (not very efficient) algorithm for assigning probabilities to all sentences in a propositional logic language plus some axioms, such that the probabilities learn to approximate the true computed values faster than it would take to compute the truth of propositions.  For example, if we are given the axioms of first-order logic, the logical induction criterion gives us a probability distribution over all “worlds” consistent with those axioms. (A “world” is an assignment of Boolean truth values to sentences in propositional calculus.)
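To illustrate what a distribution over “worlds” looks like in the propositional case, here is a brute-force sketch. It is not the logical induction algorithm; it simply enumerates every truth assignment, keeps the ones consistent with a single axiom, and weights them uniformly. The atoms `rain` and `wet`, the axiom, and the uniform weighting are all assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

ATOMS = ["rain", "wet"]  # hypothetical atomic sentences

def axiom(world):
    # "rain implies wet"
    return (not world["rain"]) or world["wet"]

# Every world is one assignment of True/False to each atom.
worlds = [dict(zip(ATOMS, values)) for values in product([False, True], repeat=len(ATOMS))]
consistent = [w for w in worlds if axiom(w)]

def probability(sentence):
    """Fraction of consistent worlds in which the sentence holds (uniform weighting)."""
    return Fraction(sum(1 for w in consistent if sentence(w)), len(consistent))

p_wet = probability(lambda w: w["wet"])
p_rain = probability(lambda w: w["rain"])
p_rain_and_not_wet = probability(lambda w: w["rain"] and not w["wet"])
```

Three of the four worlds survive the axiom, so `p_wet` is 2/3, `p_rain` is 1/3, and the sentence the axiom forbids gets probability 0. This works because the set of atoms is fixed in advance, which is exactly what is missing once quantifiers and unknown objects enter the picture.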

What’s not necessarily known is how to assign probabilities to sentences in predicate calculus in a way consistent with the laws of probability.

Part of why this is so difficult is that it touches on questions of ontology. To translate “All men are mortal” into probability theory, one has to define a sample space. What are “men”?  How many “men” are there? If your basic units of data are 64×64 pixel images, how are you going to divide that space up into “men”?  And if tomorrow you upgrade to 128×128 images, how can you be sure that the collection of “men” you construct from the new data is consistent with the old collection of “men”?  And how do you set up your statements about “all men” so that none of them break when you change the raw data?

This is the problem I alluded to in Choice of Ontology.  A type of object that behaves properly under ontology changes is a concept, as opposed to a percept (a cluster of data points that are similar along some metric.)  Images that are similar in Euclidean distance to a stick-figure form a percept, but “man” is a concept. And I don’t think we know how to implement concepts in machine-learning language, and I think we might have to do so in order to “learn” predicate-logic statements.

Stuart Russell wrote in 2014,

An important consequence of uncertainty in a world of things: there will be uncertainty about what things are in the world. Real objects seldom wear unique identifiers or preannounce their existence like the cast of a play. In the case of vision, for example, the existence of objects must be inferred from raw data (pixels) that contain no explicit object references at all. If, however, one has a probabilistic model of the ways in which worlds can be composed of objects and of how objects cause pixel values, then inference can propose the existence of objects given only pixel values as evidence. Similar arguments apply to areas such as natural language understanding, web mining, and computer security.

The difference between knowing all the objects in advance and inferring their existence and identity from observation corresponds to an important but often overlooked distinction between closed-universe languages such as SQL and logic programs and open-universe languages such as full first-order logic.

How to deduce “things” or “objects” or “concepts” and then perform inference about them is a hard and unsolved conceptual problem.  Since humans do manage to reason about objects and concepts, this seems like a necessary condition for “human-level general AI”, even though machines do outperform humans at specific tasks like arithmetic, chess, Go, and image classification.

Neural Networks Are Probabilistic Models

A neural network is composed of nodes, which take as inputs values from their “parent” nodes, combine them according to the weights on the edges, transform them according to some transfer function, and then pass along a value to their “child” nodes. All neural nets, no matter the difference in their architecture, follow this basic format.
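The basic format described above can be sketched in a few lines of plain Python. The sigmoid transfer function, the bias term, and the specific weights are assumptions picked for illustration; real networks vary in all three.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def node_output(parent_values, weights, bias=0.0, transfer=sigmoid):
    """Combine parent values by edge weight, then apply the transfer function."""
    total = sum(w * v for w, v in zip(weights, parent_values)) + bias
    return transfer(total)

def layer_output(parent_values, weight_rows, biases):
    """One node per row of weights; every node sees the same parent values."""
    return [node_output(parent_values, row, b) for row, b in zip(weight_rows, biases)]

# A tiny two-input, two-hidden-node, one-output network with made-up weights.
inputs = [1.0, 0.0]
hidden = layer_output(inputs, [[0.5, -0.5], [-1.0, 1.0]], [0.0, 0.0])
output = node_output(hidden, [1.0, 1.0])
```

Each node only ever sees numbers from its parents and passes a number to its children; different architectures change the wiring and the transfer function, not this basic pattern.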

A neural network is, in a sense, a simplification of a Bayesian probability model. If you put probability distributions rather than single numbers on the edge weights, then the neural network architecture can be interpreted probabilistically. The probability of a target classification given the input data is given by a likelihood function; there’s a prior over the distribution of weights; and as data comes in, you can update to a posterior distribution over the weights, thereby “learning” the correct weights on the network.  Doing gradient descent on the weights (as you do in an ordinary neural network) finds a single point estimate rather than a full posterior: the maximum likelihood values of the weights or, when a regularization term plays the role of the prior, the mode of the posterior distribution in the Bayesian paradigm.
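A one-weight example may make the connection between gradient-based training and the Bayesian picture easier to see. The sketch below fits a single-weight logistic model by gradient ascent on the log-likelihood, once with no prior and once with a zero-mean Gaussian prior on the weight. The data, the prior variance, the learning rate, and the function names are all made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Made-up, deliberately non-separable data so the maximum likelihood weight is finite.
XS = [-2.0, -1.0, 1.0, 2.0, -1.0, 1.0]
YS = [0, 0, 1, 1, 1, 0]

def log_posterior_gradient(w, prior_variance=None):
    """Gradient of the log-likelihood, plus the log-prior gradient if a prior is given."""
    grad = sum((y - sigmoid(w * x)) * x for x, y in zip(XS, YS))
    if prior_variance is not None:
        grad -= w / prior_variance  # zero-mean Gaussian prior on the weight
    return grad

def fit(prior_variance=None, learning_rate=0.1, steps=2000):
    """Plain gradient ascent from w = 0; returns a single point estimate."""
    w = 0.0
    for _ in range(steps):
        w += learning_rate * log_posterior_gradient(w, prior_variance)
    return w

w_mle = fit()                    # no prior: maximum likelihood estimate
w_map = fit(prior_variance=1.0)  # with prior: maximum a posteriori estimate
```

Either way the result is one number per weight. The prior only pulls that number toward zero (here `w_map` lands between 0 and `w_mle`); the rest of the posterior distribution is discarded.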

What this means is that neural networks are simplifications or restrictions of probabilistic models. If we don’t know how to solve a problem with a Bayesian network, then a fortiori we don’t know how to solve it with deep learning either (except for considerations of efficiency and scale — deep neural nets can be much larger and faster than Bayes nets.)

We don’t know how to assign and update probabilities on predicate statements using Bayes nets, in a coherent and general manner. So we don’t know how to do that with neural nets either, except to the degree that neural nets are simpler or easier to work with than general Bayes nets.

For instance, as Thomas Colthurst points out in the comments, message passing algorithms don’t provably work in general Bayes nets, but do work in feedforward neural nets, which don’t have cycles. It may be that neural nets provide a restricted domain in which modeling predicate statements probabilistically is more tractable. I would have to learn more about this.

Do You Feel Lucky?

If you believe that learning “concepts” or “objects” is necessary for general intelligence (either for reasons of predicate logic or otherwise), then in order to believe that current deep learning techniques are already capable of general intelligence, you’d have to believe that deep networks are going to figure out how to represent objects somehow under the hood, without human beings needing to have conceptual understanding of how that works.

Perhaps, in the process of training a robot to navigate a room, that robot will represent the concept of “chairs” and “tables” and even derive general claims like “objects fall down when dropped”, all via reinforcement learning.

I find myself skeptical of this.

In something like image recognition, where convolutional neural networks work very well, there’s human conceptual understanding of the world of vision going on under the hood. We know that natural 2-d images generally are fairly smooth, so expanding them in terms of a multiscale wavelet basis is efficient, and that’s pretty much what convnets do.  They’re also inspired by the structure of the visual cortex.  In some sense, researchers know some things about how image recognition works on an algorithmic level.
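To give a feel for why a multiscale basis is efficient for smooth data, here is a sketch of the Haar transform, the simplest wavelet, on a one-dimensional signal. The choice of Haar, the unnormalized averaging, and the toy signal are assumptions for illustration; this is not a claim about what any particular convnet literally computes.

```python
def haar_step(signal):
    """Split an even-length signal into pairwise averages and pairwise half-differences."""
    averages = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    details = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return averages, details

def haar_transform(signal):
    """Full multiscale decomposition; the length must be a power of two."""
    coefficients = []
    current = list(signal)
    while len(current) > 1:
        current, details = haar_step(current)
        coefficients = details + coefficients  # coarser details go in front
    return current + coefficients  # overall average first, then details

# A mostly flat signal with one jump in the middle.
mostly_flat = [4, 4, 4, 4, 8, 8, 8, 8]
coeffs = haar_transform(mostly_flat)
nonzero = [c for c in coeffs if c != 0]
```

Eight samples collapse to two nonzero coefficients: the overall average and a single coarse-scale difference. Signals that vary slowly almost everywhere compress well in this kind of basis, which is the sort of prior knowledge about images that gets built into the architecture.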

I suspect that, similarly, we’d have to have understanding of how concepts work on an algorithmic level in order to train conceptual learning.  I used to think I knew how they worked; now I think I was describing high-level percepts, and I really don’t know what concepts are.

The idea that you can throw a bunch of computing power at a scientific problem, without understanding of fundamentals, and get out answers, is something that I’ve become very skeptical of, based on examples from biology where bigger drug screening programs and more molecular biology understanding don’t necessarily lead to more successful drugs.  It’s not in-principle impossible that you could have enough data to overcome the problem of multiple hypothesis testing, but modern science doesn’t have a great track record of actually doing that.
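The multiple hypothesis testing problem is easy to see with a little arithmetic. The sketch below assumes independent tests and the conventional significance threshold of 0.05; both are assumptions for illustration, as are the function names.

```python
def family_wise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests of true nulls."""
    return 1.0 - (1.0 - alpha) ** num_tests

def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / num_tests

chance_of_false_hit_100 = family_wise_error_rate(100)
per_test_threshold_100 = bonferroni_threshold(100)
```

With 100 hypotheses and no correction, at least one spurious “discovery” is all but guaranteed (the rate is above 0.99), and the corrected per-test threshold shrinks in proportion to the number of hypotheses. Screening more candidates raises the bar each one has to clear, so more data is needed just to stay in place.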

Getting artificial intelligence “by accident” from really big neural nets seems unlikely to me in the same way that getting a cure for cancer “by accident” from combining huge amounts of “omics” data seems unlikely to me.

What I’m Not Saying

I’m not saying that strong AI is impossible in principle.

I’m not saying that strong AI won’t be developed, with conceptual breakthroughs.  Researchers are working on conceptually novel approaches like differentiable computing and program induction that might lead to machines that can learn concepts and predicates.

I’m not saying that narrow AI won’t be a very big deal, economically and technologically and culturally.

I’m not trying to malign the accomplishments of people who work on deep learning. (I admire them greatly and am trying to get up to speed in the field myself, and think deep learning is pretty awesome.)

I’m saying that I don’t think we’re done.