Do Pineal Gland Extracts Promote Longevity? Well…Maybe.

Epistemic Status: Casual. I’m in a period of blogging more frequently so my posts represent only a few hours of thought.

I love ground-up organs.

Seriously — they’re where a lot of medical progress comes from. Long before we knew what “cortisol” did, we were treating autoimmune diseases with ground-up adrenal cortex extract. Long before we knew what thyroxine did, we were treating hypothyroid patients with ground-up thyroid extract.  And a lot of regenerative medicine is still at the stage of “put some mashed-up lymph node tissue or bone marrow on it and see if it grows.”

It’s primitive, but it’s a kind of prudent primitiveness. It is very hard to map out biochemical pathways and extract the precise chemical that binds to the precise receptor that does what you want. Given the messiness of evolution, it is not at all surprising when it turns out that there are many potentially relevant receptors and chemicals related to the disease you want to target.  If you just know the organ, you route around all that complexity.  You don’t have to know all the growth factors to know that bone marrow contains some kind of growth-y stuff.

So I was very interested, after being linked to this database of putative life-extension drugs, to see that the top scorer was epithalamin, an extract of the pineal gland; it’s said to extend life in mice by 31%. The second item on the list was polypeptide pineal preparation, which is another name for epithalamin; and melatonin, the primary hormone produced by the pineal gland, is no slouch either, allegedly giving mice 18% longer lives.

So, I had to ask, is this for real?

First, let’s talk about melatonin.

Melatonin is the “sleep hormone”: it is produced at night much more than in the day, and its primary effect is to make humans and animals sleepy. Its cyclical secretion pattern is one of the main signals of the circadian rhythm.

The study linked in the database, by Walter Pierpaoli, finds that male mice (but not female mice) given melatonin at night (but not during the day) live 18% longer than control mice, and engrafting young pineal glands onto the thymuses of older mice increases lifespan by 27%.[1]

Unfortunately, this experiment is on strains of mice that happen to be deficient in melatonin already,[2] so it doesn’t tell us much about whether melatonin or pineal-gland transplants will extend the lifespans of normal-melatonin individuals.

A critical paper in 1995, called “Melatonin Madness,”[3] pointed out this mistake, and pointed out that in another study on a strain of mouse that does produce melatonin (C3H/He), the treated mice actually had shorter lifespans because they developed more tumors.

Pierpaoli is…rather problematic.  He’s the author of “The Melatonin Miracle: Nature’s Age-Reversing, Disease-Fighting, Sex-Enhancing Hormone” and sells melatonin in his online store. So, a little skepticism is warranted here.

On the other hand, there’s been additional evidence that melatonin can extend life, even in melatonin-producing animals.

In C3H mice (which produce melatonin), melatonin in drinking water prolongs the life of male mice by about 20% (p < 0.01) but not female mice.[4]

In CBA mice, which also produce melatonin, those given melatonin in their night-time drinking water were significantly more likely than controls to get lung cancer and lymphoma, but their lifespan still was extended by 5% relative to controls.[5]

Moreover, a 1979 study by the Russian longevity researcher Vladimir N. Anisimov found that daily doses of epithalamin (0.1 or 0.5 mg) increased the lifespan of female rats by 10% and 25% respectively, and significantly (p < 0.05) reduced aging-related disturbances in the estrus cycle.[6]

Anisimov also found that female  C3H mice (a melatonin-producing strain) given daily epithalamin at 0.5 mg lived on average 40% longer than controls.[7]

However, he found that epithalamin given to old rats did not significantly increase lifespan.[8]

In another Russian experiment, by Vladimir Khavinson, a frequent collaborator of Anisimov’s,[9] 94 women aged 66-94 from the War Veterans Home in St. Petersburg were randomized to control, thymus extract (thymalin), pineal extract (epithalamin), or both.  At baseline, these elderly women had high B lymphocytes, low NK counts, high IgG levels, high cortisol, insulin, and TSH, and low estrogen and LH, compared to the “normal” levels.

Thymalin normalized the NK levels. Epithalamin normalized NK levels as well as ACTH, TSH, cortisol, and insulin. Thymalin significantly reduced the rate of acute respiratory diseases (from 58% to 25%) and epithalamin significantly reduced the rate of ischemic heart disease. In 6 years, 81.8% of the control patients died, compared to 41.7% of the thymalin patients, 45.8% of the epithalamin patients (both applied for 2 years) and 20.0% of the patients given epithalamin and thymalin for 6 years.  The effect on mortality was significant at p<0.001.
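For a rough sense of effect size, the quoted death rates can be turned into risk ratios. This is a back-of-envelope sketch: only the percentages given in the text are used, since per-arm sample sizes aren’t restated here.

```python
# Convert each arm's quoted 6-year mortality into a risk ratio vs. control.
mortality = {
    "control": 0.818,
    "thymalin": 0.417,
    "epithalamin": 0.458,
    "thymalin + epithalamin": 0.200,
}

control_rate = mortality["control"]
for arm, rate in mortality.items():
    risk_ratio = rate / control_rate
    print(f"{arm}: {rate:.1%} died, risk ratio vs. control = {risk_ratio:.2f}")
```

By this arithmetic, the combined thymalin + epithalamin arm has a risk ratio of about 0.24, i.e. roughly a 76% relative reduction in 6-year mortality, which is why the result stands out despite the small sample.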

This is a pretty small study for a measurement of all-cause mortality, but something might be going on here.

It’s also possible that pineal extract reverses aging-related insulin resistance (which is associated with many aging-related diseases, like heart disease, cancer, and diabetes.)

Old rhesus monkeys have lower melatonin levels than young monkeys. Pineal gland extract, but not placebo, raises old rhesus monkeys’ melatonin levels to that of young monkeys.  Old monkeys also have higher blood glucose levels (at baseline and in response to glucose challenge) than young monkeys; pineal gland extract significantly reduces glucose in old monkeys (but not young monkeys.) Old monkeys have a delayed and flatter curve of insulin response to glucose challenge than young monkeys; pineal gland extract reverses this.  This suggests that pineal gland extract improves insulin sensitivity in monkeys.[10]

There is a lot of uncontroversial evidence that melatonin has something to do with aging. Mammals, including humans, secrete less melatonin as they age.

The pineal gland’s melatonin secretion rhythm becomes less regular with age (smaller amplitude) in both rats and hamsters. In hamsters and humans, the pineal gland develops “concretions” of calcium with age.  Older animals lose beta-adrenergic receptors on the pineal gland with age.  Food-restricted rats, on the other hand, continue to produce more melatonin in old age — and that’s suggestive, because food-restricted animals also live longer.[11]

In old rats, neurons in the pineal gland fired at lower frequencies than in young rats, and produced less melatonin.[12]

Food-restricted rats at 28 months (old age) had twice the melatonin levels of ad-lib fed rats.  Food-restricted rats were smaller, and had no tumors or cataracts, unlike ad-lib fed rats.[13]

The pineal gland also seems to be linked to insulin sensitivity and other markers of aging.

Pinealectomy (the removal of the pineal gland) increases blood pressure in rats, suspected to be caused by an increase in adrenal steroid levels.  Melatonin in the drinking water reduces blood pressure to normal levels.[14]

Pinealectomy causes glucose intolerance and insulin insensitivity in rats.  At 8 AM, glucose and insulin responses to glucose challenge are normal; but at 4 PM, pinealectomized rats have a much higher glucose spike and much less insulin production.  The pancreas of pinealectomized rats is less responsive, both morning and afternoon. Pinealectomized rats also have significantly less GLUT-4 (glucose transporter) in their adipose tissue.[15]

Pinealectomized rats have higher glucose levels, lower insulin levels, and higher glucagon levels than control rats; treatment of pinealectomized rats with melatonin increases insulin and reduces glucagon. Pinealectomized rats have glucose intolerance. Melatonin supplementation partially recovers glucose tolerance.[16]

Corticosterone (the primary corticosteroid in rodents, serving similar functions to cortisol in humans) rises in rats as they age; two-month-old pinealectomized rats had the same corticosterone levels as 24-month-old aged rats.[17]

So, if you remove the pineal gland, you get higher levels of corticosteroids, higher blood pressure, and more metabolic-syndrome-like changes, just like people and animals do as they age. Pinealectomy also causes “pro-gonadal” effects — higher sex hormones and larger sex organs.

Pinealectomizing rats causes ovarian, pituitary, and adrenal hypertrophy (p < 0.001); administering bovine pineal extract to the rats reverses this.[18]

Melatonin reduces prostate weight in rats as a fraction of total body weight (p < 0.02) and prostate fructose (p < 0.05); being kept in darkness drops testosterone levels to 1/8th their usual levels; pinealectomy increases prostate weight (p < 0.05) and triples testosterone levels (p < 0.01).[19]

The testes of rats produce more testosterone after pinealectomy, and administering melatonin reverses the effect.[20]

Blinding female rats retards the development of their ovaries and uterus; pinealectomy restores the ovaries and uterus to normal size.[21] (Note that blinding animals or keeping them in darkness is kind of like making it perpetually night for them, i.e. the conditions under which melatonin is usually secreted. So that’s also consistent with the “melatonin = less sex hormones” pattern.)

Nighttime, but not continuous, administration of melatonin delays puberty and reproductive senescence in rats.  That is, rats given melatonin are slower to reach puberty and slower to become infertile with age.[22]

So this pattern is starting to make sense. Remember how most mutations that increase lifespan have something to do with the GH/IGF pathway that promotes growth and insulin release and sex hormones?  And how caloric restriction increases lifespan, improves insulin sensitivity, but impairs fertility?  And how higher levels of IGF, sex hormones, and obesity are risk factors for cancer, especially reproductive-organ cancers like breast and prostate?  Almost as though there’s an evolutionary toggle between “growth and reproduction” and “surviving through famine”?  Well, melatonin and the pineal gland seem to tie into this story; if you take away the pineal gland you get high sex hormone levels and metabolic syndrome, while if you add melatonin or pineal extract you can reverse those phenomena.

But does it really connect to longevity? It’s not clear. The only study I found that measured the lifespan of pinealectomized animals was by Walter Pierpaoli, and found that mice pinealectomized at 3 to 5 months have 20% shorter lifespans (p = 0.014), but that pinealectomy does not alter lifespan in mice pinealectomized at 7 to 9 months.  Pinealectomy at 14 months actually increases lifespan (by 12.5%), but at 18 months it has no effect.  Pierpaoli speculates that there’s a precise age at which the pineal gland promotes aging, but I think this study is nowhere near enough evidence to conclude that.[23]

At any rate, there’s a simple evolutionary story for why we’d have a toggle between “eat, grow, reproduce” and “survive”, and why that would be connected to the circadian rhythm: SEASONS.

Summer has longer days and more food. Winter has shorter days (more melatonin at night!) and less food. You want to grow fat in the summertime and have babies; you want to survive the winter. A lot of species (though not humans) have a seasonal mating pattern.

And, accordingly, you see the effects of the pineal gland on seasonal mating.

Short-day Siberian hamsters (kept under conditions that are dark longer than they are bright), compared to long-day hamsters, have later puberty, more ovarian follicles, and longer fertility.  Pinealectomize the hamsters and even the short-day ones lose fertility quickly.  (Siberian hamsters in the wild reproduce in spring and summer.)  Long-days have higher body mass than short-days, and pinealectomized short-days are the largest of all.  This is because short-day hamsters eat 16% fewer calories.  The basic bottom line is consistent: darkness = melatonin = less gonadal/growth processes going on = slower reproductive aging.[24]

Siberian hamsters are very cute, but they have an extremely seasonal pattern of gonadal growth; Google Image Search “siberian hamster testicles” if you dare. Those are summertime testicles. In winter the male Siberian hamster loses thirty percent of his body mass.

Pinealectomy also raises baseline testosterone in white-tailed deer, but it prevents the annual “autumn rut” testosterone spike that usually peaks in November.  Testicular size also rises in the fall in normal white-tailed deer, but in pinealectomized deer it rises steadily throughout the year.  In other words, pinealectomy flattens out the seasonal breeding rhythm.[25]

So, the pineal gland maintains a regular seasonal and daily cycle of “growth/sex” vs “rest/survival”, with nighttime and winter being the more “rest/survival” oriented periods. If you destroy the pineal gland, you can keep animals shifted towards “growth/sex” all the time (which doesn’t actually make them more fertile overall, it makes them use up their fertility faster).  It’s annoyingly unclear whether pinealectomy makes animal lifespans shorter, and somebody should check that.

It’s also not that clear whether you can get normal animals into more of a “rest/survival” oriented mode by administering extra melatonin or pineal extract. You definitely can’t get them into permanent “rest/survival” mode by administering melatonin 24/7: continuous melatonin (as opposed to nighttime melatonin) has no effect on longevity, puberty timing, or aging.  But there are some apparently okay mouse and rat studies showing that melatonin or pineal extract has a longevity-promoting effect. If it pans out, it would be the biggest-effect-size longevity intervention I’ve seen that isn’t a highly restrictive diet (caloric restriction or low-methionine) or obviously very dangerous (high-dose rapamycin).

I might also speculate from all this that getting a good night’s rest is good for you (I know, shocking), and that having artificial light that doesn’t get any shorter in the winter than the summer may be messing with modern people’s metabolisms in some way.


[1]Pierpaoli, Walter, and William Regelson. “Pineal control of aging: effect of melatonin and pineal grafting on aging mice.” Proceedings of the National Academy of Sciences 91.2 (1994): 787-791.

[2]Kasahara, Takaoki, et al. “Genetic variation of melatonin productivity in laboratory mice under domestication.” Proceedings of the National Academy of Sciences 107.14 (2010): 6412-6417.

[3]Reppert, Steven M., and David R. Weaver. “Melatonin madness.” Cell 83.7 (1995): 1059-1062.

[4]Oxenkrug, G., P. Requintina, and S. Bachurin. “Antioxidant and Antiaging Activity of N‐Acetylserotonin and Melatonin in the in Vivo Models.” Annals of the New York Academy of Sciences 939.1 (2001): 190-199.

[5]Anisimov, Vladimir N., et al. “Melatonin increases both life span and tumor incidence in female CBA mice.” The Journals of Gerontology Series A: Biological Sciences and Medical Sciences 56.7 (2001): B311-B323.

[6]Dilman, V. M., et al. “Increase in lifespan of rats following polypeptide pineal extract treatment.” Experimentelle Pathologie 17.9 (1979): 539-545.

[7]Anisimov, V. N., V. Khavinson, and V. G. Morozov. “Twenty years of study on effects of pineal peptide preparation: Epithalamin in experimental gerontology and oncology.” Annals of the New York Academy of Sciences 719.1 (1994): 483-493.

[8]Anisimov, V. N., L. A. Bondarenko, and V. Kh Khavinson. “Effect of pineal peptide preparation (epithalamin) on life span and pineal and serum melatonin level in old rats.” Annals of the New York Academy of Sciences 673.1 (1992): 53-57.

[9]Khavinson, Vladimir Kh, and Vyacheslav G. Morozov. “Peptides of pineal gland and thymus prolong human life.” Neuroendocrinology Letters 24.3-4 (2003): 233-240.

[10]Goncharova, N. D., et al. “Pineal peptides restore the age-related disturbances in hormonal functions of the pineal gland and the pancreas.” Experimental gerontology 40.1 (2005): 51-57.

[11]Reiter, Russel J. “The ageing pineal gland and its physiological consequences.” Bioessays 14.3 (1992): 169-175.

[12]Stokkan, Karl-Arne, et al. “Food restriction retards aging of the pineal gland.” Brain research 545.1 (1991): 66-72.

[13]Stokkan, Karl-Arne, et al. “Food restriction retards aging of the pineal gland.” Brain research 545.1 (1991): 66-72.

[14]Holmes, S. W., and D. Sugden. “Proceedings: The effect of melatonin on pinealectomy-induced hypertension in the rat.” British journal of pharmacology 56.3 (1976): 360P.

[15]Lima, Fabio B., et al. “Pinealectomy causes glucose intolerance and decreases adipose cell responsiveness to insulin in rats.” American Journal of Physiology-Endocrinology and Metabolism 275.6 (1998): E934-E941.

[16]Diaz, Beatriz, and E. Blazquez. “Effect of pinealectomy on plasma glucose, insulin and glucagon levels in the rat.” Hormone and metabolic research 18.04 (1986): 225-229.

[17]Oxenkrug, Gregory F., Iain M. McIntyre, and Samuel Gershon. “Effects of pinealectomy and aging on the serum corticosterone circadian rhythm in rats.” Journal of pineal research 1.2 (1984): 181-185.

[18]Wurtman, Richard Jay, Mark D. Altschule, and Uno Holmgren. “Effects of pinealectomy and of a bovine pineal extract in rats.” American Journal of Physiology–Legacy Content 197.1 (1959): 108-110.

[19]Kinson, G. A., and Frances Peat. “The influences of illumination, melatonin and pinealectomy on testicular function in the rat.” Life Sciences 10.5 (1971): 259-269.

[20] “Effects of melatonin on Leydig cells in pinealectomized rat: an immunohistochemical study.” Acta histochemica 104.1 (2002): 93-97.

[21]Reiter, Russel J., Peter H. Rubin, and John R. Richert. “Pineal-induced ovarian atrophy in rats treated neonatally with testosterone.” Life sciences 7.5 (1968): 299-305.

[22]Meredith, S., et al. “Long-term supplementation with melatonin delays reproductive senescence in rats, without an effect on number of primordial follicles☆.” Experimental gerontology 35.3 (2000): 343-352.

[23]Pierpaoli, Walter, and Daniele Bulian. “The Pineal Aging and Death Program: Life Prolongation in Pre‐aging Pinealectomized Mice.” Annals of the New York Academy of Sciences 1057.1 (2005): 133-144.

[24]Place, Ned J., et al. “Short Day Lengths Delay Reproductive Aging.” Biology of Reproduction 71.3 (2004): 987-992.

[25]Plotka, E. D., et al. “Early effects of pinealectomy on LH and testosterone secretion in white-tailed deer.” Journal of endocrinology 103.1 (1984): 1-7.

Update on Sepsis

Dr. Marik’s home university, EVMS, is currently raising money for an RCT of his sepsis treatment.

You can donate as a private individual here.

Remember, the whole thing only costs $250,000 — a few people can make a significant dent in that.  The response from my preliminary survey was great, and we clearly have a lot of generous people interested in the sepsis issue.

For my own record-keeping, I’d appreciate it if people who saw this blog post and donated would fill out this form; I’d like to be able to congratulate my blog readers on our total amount raised.


How Much Work is Real?

Epistemic Status: Casual, just framing things

Some work is clearly “productive.” If you plant things in a garden, you put in work, and you get out plants.  If you cook a meal, your family gets fed. If you build a building where people want to live or work, they get shelter. If you treat a patient, the patient gets better. If you carry goods to the place that they’re sold, people get their stuff. If you invent a labor-saving machine, people get to free up their time for other things.

Productive work creates value, in the sense of “doingstuffness”, mana, usefulness-to-humans, etc. It’s not just effort expended, or an accounting formalism like dollars, it’s an increase in the “real wealth” of humanity. That’s not a well-defined concept, but it’s worth pointing at, so we know what we’re complaining about when we see deviations from this.

In the standard capitalist story, you get paid for work because you created value for somebody; they wanted your stuff so much that they were willing to give you something in exchange for it.

In this world, all productive work is honorable.  Work is fair — on average, you get what it’s worth — and it’s a contribution, however small, to the wellbeing of humankind, the fire that beats back against the blackness of hard vacuum.

But there are ways that things called “work” can fail to be productive work.

Fraud or crime obviously are not productive. If you get money from people by tricking or terrifying them, you’re not getting it by providing them with value. You’re not a maker.

Enforced monopoly power is also not entirely productive. If people are required by law — on pain of punishment — to buy your product, then at least some of your revenue is driven by fear, not desire.

Regulations can be a form of enforced monopoly power. If only people who meet certain criteria are allowed to sell, then people are buying from you and not your competitors not because they like you better, but because your competitors are driven out by fear of punishment. Once again, you’re profiting partly off fear, not just desire.

A job that is funded by taxpayer money, or by the fact that the product sold is mandatory to buy, or by the fact that nobody knows whether it would be illegal to get rid of that job and they want to play it safe, doesn’t need to be useful at all.

And there are still more indirect ways that a job can fail to be useful.  If you sell to people who don’t do anything useful, then your job would not have been necessary in a sensibly organized world, even if you do nothing dishonest yourself and genuinely add value to your customers.

This is what it means to live in a “mixed economy.”  Not everything that everyone does for a living is genuinely useful.

If there are bullshit jobs, as anthropologist David Graeber claims, then that’s a shame, from the perspective of human well-being. If we have enough real wealth, enough mana, to support even people who aren’t making mana, then why not just allow leisure, instead of forcing people to go through the motions of dull and unnecessary work?

This is largely the position of left-libertarians like the people at Center for a Stateless Society.  They make the empirical claim that most of the present economy in developed countries is coercive and unproductive, the result of crony capitalism and regulatory capture rather than honest, useful work.  As such, a “freed market” without such corruption would actually be more egalitarian than our current economy.   Since government promotes monopoly, Big Business wouldn’t be sustainable without coercion.  Since highly regulated and positional goods like housing and education are essentially mandatory for participating in much of modern life, if those mandates were abolished, socioeconomic inequality would drop.

On the other hand, this empirical claim could be false. Nobody denies that some corruption exists, but it might be the exception rather than the rule. We might not, in fact, be in post-scarcity conditions.  So-called “bullshit” jobs may actually be valuable, just easy to dismiss by outsiders like Graeber.  Growing wealth inequality may be largely the result of winner-take-all phenomena, as Tyler Cowen thinks — in his model, the working rich really are more productive than ever, thanks to the amplifying effects of technology.  Love it or hate it, says Cowen, capitalism works the way it says on the tin.

There is some evidence that economic rents are on the rise in the US. Wealth inequality has risen over recent decades, while labor productivity growth has slowed. Economic dynamism (the number of people changing jobs and starting new businesses) has also declined.

One important piece of evidence that rents are on the rise in the United States is the divergence of rising returns to capital and declining real interest rates. In the absence of economic rents, the return on corporate capital should generally follow the path of interest rates, which reflect the prevailing return to capital in the economy. But over the past three decades, the return to productive capital generally has risen, despite the large decline in yields on government bonds.

Other firm-side evidence points to an increased prevalence of supranormal returns over time. Between 1997 and 2012, market concentration increased in 12 out of 13 major industries for which data are available, and a range of micro-level studies of sectors including air travel, telecommunications, banking, and food-processing have all produced evidence of greater concentration.

The fact that variations in the rate of return to capital have increased enormously across firms may also at least partially reflect increased concentration and the role of economic rents. Finally, there is evidence that land-use regulation may also play a role in the presence of increased economic rents, decreasing housing affordability, and reducing nationwide productivity and growth by restricting supply.

There’s also Matt Rognlie’s paper showing that the long-term rise of capital’s share of wealth (compared to labor’s) is almost entirely a result of increased housing prices — literal rents, kept high by land-use restrictions.

And there’s the phenomenon that S&P 500 firms are now 5/6ths “dark matter” — that is, things that a new entrant to the field can’t copy.

Imagine that you wanted to create a new firm to compete with one of these big established firms. So you wanted to duplicate that firm’s products, employees, buildings, machines, land, trucks, etc. You’d hire away some key employees and copy their business process, at least as much as you could see and were legally allowed to copy.

Forty years ago the cost to copy such a firm was about 5/6 of the total stock price of that firm. So 1/6 of that stock price represented the value of things you couldn’t easily copy, like patents, customer goodwill, employee goodwill, regulator favoritism, and hard to see features of company methods and culture. Today it costs only 1/6 of the stock price to copy all a firm’s visible items and features that you can legally copy. So today the other 5/6 of the stock price represents the value of all those things you can’t copy.
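As a quick sanity check on the fractions in the quoted passage (trivial arithmetic; the 5/6 and 1/6 copy-cost figures are taken from the quote, not independently verified):

```python
# Intangible ("dark matter") share of firm value = 1 - (cost to copy / stock price).
copy_cost_share_1970s = 5 / 6   # forty years ago, per the quote
copy_cost_share_today = 1 / 6   # today, per the quote

intangible_1970s = 1 - copy_cost_share_1970s
intangible_today = 1 - copy_cost_share_today
print(f"Intangible share, ~1970s: {intangible_1970s:.0%}")  # ~17%
print(f"Intangible share, today:  {intangible_today:.0%}")  # ~83%
```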

“Intangibles” would certainly include rent-seeking forms of favoritism.

It also includes patents, which arguably do not increase innovation on the margin (according to natural experiments between different countries with different patent regimes or changes in patent laws.) Copyright and patent lengths have gotten longer in the US, and patent applications have grown at an accelerating rate; the growth of intellectual property is another example of our economy becoming more monopolistic.

How much of our economy consists of rent-seeking is hard to estimate, and I’m not sure anybody has attempted it. “Spikiness” in wealth between firms or individuals could be due either to monopolistic privileges or to variance in productivity.  Concentration of growth in highly regulated industries points towards rent-seeking being prominent, especially if the measured outcomes of those industries don’t improve (e.g. healthcare), but it doesn’t tell us what percent of the value of healthcare is due to rents.

And, of course, even such estimates don’t tell us what to do, because of path-dependency effects. Even if we discovered with high confidence that an industry was mostly corrupt, that doesn’t guarantee that “anti-corruption” efforts will actually make it less so.  (Sometimes the increased administrative demands of making sure nobody gives bribes cost more than the bribes themselves did.)

But the question is relevant, to those of us who want to know “where can I find productive work?” and “how much misdirection is going on under the surface of today’s world?”

I work at a biotech company. I made a special effort to find a job that was as honest as possible, while still being in my field.  And I think we are honest here; the official purpose of the company (to find new promising drugs) is also the implicitly endorsed goal that people actually work towards.  We’re a bunch of scientists, with scientific sensibilities. But we’re still in an industry defined by grants, patents, regulations, and other monopolistic practices.  There are still, I think, pockets of inefficiency that result from being in that industry.  Bigger, older businesses often get flummoxed when their startup partners move too fast.  And those of us who don’t work in the lab don’t really need to work 8 hours a day every day in order to meet our planned goals.  My job isn’t bullshit, by any means, but I sometimes suspect that it isn’t maximally productive.

I don’t believe in being so obsessed with personal purity that you never get anything done — that’s not useful and it’s not the point. It’s more about trying to figure out what kind of world you live in.

Chronic Fatigue Syndrome

Epistemic Status: moderately confident. Spent several weeks on this in an effort to be more complete and careful than most of my lit reviews.

Chronic fatigue syndrome is something of a medical mystery. Some doctors question whether it’s a real disease at all. There are no well established treatments. We don’t know what causes it.

There’s a lot of evidence that CFS has something to do with immune and hormonal dysfunction, and is frequently associated with infectious diseases, particularly the Epstein-Barr virus and other herpes viruses (not all of which are sexually transmitted.)  There are also some immunotherapy options that seem to be effective in a subset of CFS patients, in particular corticosteroids.

Bottom Lines

Corticosteroids seem to help for a sub-population of CFS patients.  Rituximab, bacterial therapy, and intravenous immunoglobulin may also help for some CFS patients, but the evidence base is smaller or less consistent for those.

Chronic Fatigue and the HPA Axis

The hypothalamus/pituitary/adrenal (HPA) axis is a system of interconnected hormone signaling processes involved in the body’s response to stress. Cortisol, the “stress hormone”, is produced by the adrenal glands in response to signals from the pituitary gland and hypothalamus.

Cortisol suppresses inflammation, which is why it’s often used as a treatment in autoimmune diseases. It also promotes alertness and increases blood sugar, to ready the body for action.

There’s evidence that patients with chronic fatigue syndrome have lower cortisol levels, or are less able to produce cortisol in response to the appropriate stimuli.

There are a number of small studies showing that CFS patients have lower cortisol than healthy people.  14 CFS patients had significantly lower salivary cortisol levels compared to 26 cases of depression and 131 controls.[2]  In 15 CFS patients and 20 controls, mean salivary cortisol levels were significantly lower for CFS patients.[3] Urinary free cortisol was significantly lower in 121 CFS patients compared to 64 control patients.[4]  10 melancholic depressives had higher urinary free cortisol than 15 controls, while 21 CFS patients had lower urinary free cortisol.[5]  In 10 patients with CFS, 15 patients with major depression, and 25 healthy controls, baseline serum cortisol levels were highest in the depressives, lowest in the CFS patients, moderate in the controls.[6]

However, not all studies replicate the finding. In 22 CFS patients and 22 healthy controls, one study found no difference in urinary or salivary cortisol.[7] In another study of 10 CFS patients vs 10 controls, patients were slightly but significantly higher in salivary cortisol.[8]

One possible explanation for the discrepancy is that cortisol levels fluctuate greatly throughout the day, and in response to conditions that vary from day to day (food intake, stress, etc).  All these studies sampled patients over the course of a day or less. It’s not surprising that small studies should find discordant results, especially given the possibility that not all CFS patients are alike.

One finding that does seem consistent is that chronic fatigue patients have a blunted cortisol response to ACTH, the hormone produced by the pituitary that normally stimulates cortisol release, and an exaggerated drop in cortisol levels in response to challenge with corticosteroids (cortisol and its molecular analogues reduce ACTH levels in a negative feedback loop).  So, even if cortisol is not always lower in CFS patients, it may be more sluggish to rise and quicker to decline.  In 21 CFS patients vs. 21 healthy controls, patients with CFS had normal baseline salivary cortisol but showed enhanced and prolonged suppression of salivary cortisol in response to dexamethasone challenge.[9]  Prednisolone challenge suppressed both salivary and urinary cortisol more in CFS patients (n=15) than in controls (n=20).[11]

Upon challenge with ACTH, the increase in plasma cortisol was significantly less for 20 CFS subjects vs. 20 controls.[10]  In 22 CFS patients vs 14 controls, CFS patients also had a blunted DHEA response to ACTH; DHEA is a steroid hormone and androgen precursor, which is generally produced in response to exercise. In other words, the steroid-promoting effects of ACTH are weaker in CFS patients, while the cortisol-reducing negative feedback effects of corticosteroids are stronger.

Perhaps relatedly, CFS patients have a blunted salivary cortisol response to awakening compared to healthy volunteers.[1]  Another study found that female CFS patients have lower morning salivary cortisol than controls.[8] This may be related to the unrefreshing sleep and constant fatigue that CFS patients experience.

Finally, hormonal differences in CFS patients are sometimes accompanied by clear anatomical differences. 8 CFS patients who had a subnormal cortisol response to ACTH challenge were found to have adrenal glands less than 50% the size of normal subjects’ adrenal glands.  In each case, the symptoms of fatigue were preceded by a viral infection.[12]  So in at least some cases, CFS patients have smaller glands than healthy people, which indicates that at least some of the time, CFS is associated with a damaged endocrine system.

Chronic Fatigue and Impaired NK Function

Studies find a variety of abnormalities in white blood cells in CFS patients, but the only really consistent results are impairment of NK (natural killer) cells’ function.  NK cells are cytotoxic (cell-killing) white blood cells involved in the innate immune response; they attack tumor cells and infected cells.

A study of 30 CFS patients and 69 controls found that NK cell cytotoxicity was 64% lower against tumor cell lines.[13]  In a family with 8 relatives with chronic fatigue syndrome, affected individuals had 62% lower NK activity levels (p = 0.008) against a tumor cell line than normal controls; unaffected relatives had intermediate NK activity levels.[14]  In a study of 41 CFS patients and 23 matched controls, the patients had significantly lower cytotoxic activity against EBV-infected cell lines and tumor cell lines, and patients also had significantly lower levels of NKH1+ NK cells, a subtype which comprises most of the NK cells in healthy people.[15]  A review article [16] explained that there are conflicting results in most immunological abnormalities in CFS; most studies, however, found reduced NK activity and reduced lymphoproliferative activities in response to antigens.  Of 17 studies that evaluated NK activity in CFS patients, 15 found reduced NK cytotoxicity in the CFS patients compared to controls, and a greater decrease in activity was associated with greater symptom severity.[17]

This suggests that CFS is characterized by a weakened immune system.

In general, NK deficiencies or reduced NK activity are associated with greater susceptibility to herpesviruses.[18] Reduced NK activity has also been found in major depression [19],  stress[20][21][22], bereavement [23], and sleep deprivation[24][25].

Chronic Fatigue and Herpesviruses

A number of studies have found that CFS is associated with elevated antibodies to viruses, particularly herpesviruses. It’s also often been found that CFS occurs with rapid onset, after a viral illness, and that there are outbreaks of CFS in locations where there have been disease outbreaks.  However, results are not entirely consistent between studies.

Negative Results

In a study of 548 CFS patients vs 30 healthy controls, CFS patients did not have significantly higher rates of positive titers on antibodies to HSV1, HSV2, Rubella, CMV, EBV, HHV-6, or Coxsackie.  This was a study of consecutive patients at a chronic fatigue clinic in Washington State.[26]

In another study, of 100 CFS patients referred to the lab by doctors and 92 healthy controls, patients did have a significantly higher prevalence of antibodies to EBV viral capsid antigen and to EBV early antigen (though not to EBV nuclear antigen); however, there was no significant difference between patients and controls in who had high antibody titers.[27]

In a study of 26 patients from Atlanta with CFS and 50 healthy controls, there were no significant differences in the prevalence of antibodies to any viruses, including HHV-6; the prevalence of antibodies in the controls was nearly 100% for all viruses tested.  There was also no significant difference in antibody titers for any EBV antibodies (early antigen, nuclear antigen, or viral capsid).[28]

In a study of four clusters of outbreaks of CFS in the Nevada/California area in the 1980’s, with 31 patients and 105 controls in total, there was no significant difference in the mean antibody titer to HHV-6, EBV-VCA, or EBV-EA.  Mean VCA GMT level for cases was 239.7 vs. 254.0 for controls, a non-significant difference.[29]

In 88 patients with CFS compared to 76 healthy blood donors in the Netherlands, there was no significant difference in geometric mean titer for EBV EA antibodies or EBV VCA antibodies. Mean VCA GMT level for patients was 39.5 vs. 38.0 for controls.[33]
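Several of these studies summarize antibody levels as a (reciprocal) geometric mean titer (GMT) rather than an arithmetic mean, since titers come in serial doubling dilutions. A minimal sketch of the computation, with made-up titer values:

```python
import math

# Hypothetical reciprocal titers from serial twofold dilutions (made-up data).
titers = [40, 80, 160, 320]

# Geometric mean: exponentiate the mean of the logs. For doubling dilutions
# this averages the dilution *steps*, so a single very high titer cannot
# dominate the summary the way it would in an arithmetic mean.
gmt = math.exp(sum(math.log(t) for t in titers) / len(titers))

print(round(gmt, 1))              # 113.1 (geometric mean titer)
print(sum(titers) / len(titers))  # 150.0 (arithmetic mean, pulled up by the top titer)
```

This is why a comparison like "GMT 239.7 vs. 254.0" is the right way to compare groups whose raw titers span several doublings.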

For identical twin pairs discordant for CFS, the twin with CFS was no more likely to have serological evidence of virus than the twin without (including EBV and HHV-6).[39]

In 14 patients with CFS compared to 14 controls, there was no significant difference in EBV antibody titer.[40]

Positive Results

A study of 259 patients associated with the Lake Tahoe outbreak in 1984, compared to 40 healthy controls, found active replication of HHV-6 in blood cell cultures in 70% of patients and 20% of controls.  The reciprocal geometric mean titers for EBV VCA were significantly higher in the patient group than the control group (138.0 +/- 2.6 vs. 67.6 +/- 4.4), but titers for early antigen and nuclear antigen were not.  There was no significant difference in antibody titers for HHV-6, though: mean HHV-6 ELISA densities were 1905 for cases and 1288 for controls, a nonsignificant difference.[30]

A study comparing 15 patients from the Lake Tahoe outbreak who had been sick for more than 2 months to 119 patients with less severe symptoms and 30 matched controls found that a significantly higher fraction of cases than non-case patients had EBV VCA antibody titers of 160 or greater, and of 320 or greater.  Reciprocal geometric mean titers for VCA were higher in case-patients than controls (254 vs. 115).  After retesting across 3 laboratories, the only significant difference between case-patients and control-patients was the EBV EA titer, with reciprocal geometric mean titers of 22 in cases vs. 9 in controls; VCA levels were not significantly different.[31]

In 58 CFS patients and 68 matched controls, 33 CFS patients (57%) had positive EBV VCA IgM titers, compared to 7% of controls.[32]  IgM antibodies to EBV are more likely to indicate active infection; they are rare, and most studies find none at all.

In a study of 154 CFS patients and 165 controls from Flint and Boston, patients were significantly more likely than controls (p < 0.001) to have IgG and IgM antibodies to HHV-6, but not EBV-EA antibodies.[34]

A study of 10 CFS patients with acute mononucleosis onset, 10 CFS patients without, and 42 healthy controls found significantly higher EBV IgG VCA antibody titers in all CFS patients relative to controls, as well as higher HHV-6 antibody titers.[35]

In 21 MS patients, 35 CFS patients, and 28 healthy controls, 75% of MS patients had elevated IgM titers to HHV-6 antibodies, compared to 6.7% of healthy controls, and 71.4% had elevated IgM titers to HHV-6 virus, compared to 15% of controls.  However, 60-80% of all groups carried HHV-6 by PCR. CFS patients were more likely than controls to have IgG responses to early HHV-6 antigens (65.2% vs. 20%) and IgM responses to early HHV-6 antigens (54.3% vs. 8.0%).  This suggests a high level of HHV-6 reactivation in CFS and MS patients.[36]

In 13 patients with CFS and 13 healthy controls, serum antibodies for HHV-6 were significantly higher in the patients; 7 of the patients and none of the controls had HHV-6 DNA, as measured by PCR.[37]

A study of 36 CFS patients and 24 controls found that HHV-6A DNA was significantly more prevalent in CFS patients, while HHV-6B DNA was the same.[38]


HHV-6 High IgG levels

Study Cases (number) Controls (number)
Buchwald 1996 13% (295) 7% (30)
Patnaik 1995 40% (154) 8% (165)
Sairenji 1995 100% (20) 88% (26)
Ablashi 2000 71% (35) 0% (25)

HHV-6 High IgM levels

Study Cases (number) Controls (number)
Patnaik 1995 60% (154) 4% (165)
Ablashi 2000 54% (35) 8% (25)


HHV-6 DNA positive (PCR)

Study Cases (number) Controls (number)
Yalcin 1994 53% (13) 0% (13)
Di Luca 1995 22% (36) 4% (24)
Koelle 2002 36% (22) 27% (22)

EBV-VCA High IgG levels

Study Cases (number) Controls (number)
Buchwald 1996 8% (308) 3% (30)
Sumaya 1991 11.9% (42) 18% (100)
Swanink 1995 32% (88) 32% (76)
Sairenji 1995 20% (20) 0% (26)

EBV-VCA High IgM Levels

Study Cases (number) Controls (number)
Lerner 2004 100% (33) 8% (50)

EBV-EA High IgG levels

Study Cases (number) Controls (number)
Buchwald 1996 18% (308) 23% (30)
Sumaya 1991 47.6% (42) 69% (100)
Lerner 2004 79% (33) 30% (50)
Swanink 1995 8% (88) 8% (76)
Patnaik 1995 25% (154) 15% (165)
Sairenji 1995 45% (20) 0% (26)


EBV-NA positive

Study Cases (number) Controls (number)
Buchwald 1996 95% (308) 93% (30)
Sumaya 1991 97.6% (42) 88% (100)
Swanink 1995 16% (88) 32% (76)


EBV-IgM elevated

Study Cases (number) Controls (number)
Buchwald 1996 0.6% (310) 3% (30)


EBV-VCA geometric mean titers

Study Cases (number) Controls (number)
Sumaya 1991 182.5 (42) 181.2 (100)
Mawle 1995 89.0 (26) 83.6 (50)
Buchwald 1992 138 (134) 67.6 (27)
Holmes 1987 169 (15) 113 (30)
Levine 1992 239.7 (24) 254.0 (49)


EBV-EA geometric mean titers

Study Cases (number) Controls (number)
Sumaya 1991 9.5 (42) 21.1 (100)
Mawle 1995 57.5 (26) 35.2 (50)
Buchwald 1992 40.7 (134) 12.6 (27)
Holmes 1987 22 (15) 9 (30)
Levine 1992 6.0 (24) 2.1 (49)


EBV-NA geometric mean titers

Study Cases (number) Controls (number)
Sumaya 1991 21.7 (42) 13.8 (100)
Mawle 1995 26.4 (26) 21.1 (50)
Holmes 1987 53 (15) 36 (30)


HHV-6 mean antibody levels (ELISA)

Study Cases (number) Controls (number)
Mawle 1995 1460 (26) 1715 (50)
Buchwald 1992 1905 (134) 1288 (27)
Levine 1992 132.6 (27) 87.9 (89)

The only serological differences that are consistently significantly different between CFS and normal patients are HHV-6 DNA (except for the twin study), HHV-6 IgM, and EBV-VCA IgM levels.  IgM levels, as opposed to IgG levels, indicate active infection, as do viral DNA levels. This suggests that chronic fatigue patients are more likely than controls to have reactivated herpesviruses, but may not be more likely than controls to have had past exposure to herpesviruses.
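As a sanity check on one of the starker rows above: Yalcin 1994 found HHV-6 DNA in 7 of 13 CFS patients and 0 of 13 controls. A one-sided Fisher's exact test, sketched from scratch via the standard hypergeometric formula (no claim that this is the exact test the original authors used), confirms that such a split is very unlikely by chance:

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """P(cases have >= a positives) under the null hypothesis that the
    positives are distributed at random across the two groups.
    Table: [[a, b], [c, d]] = [[case+, case-], [control+, control-]]."""
    n1, n2 = a + b, c + d          # group sizes
    k = a + c                      # total positives
    total = comb(n1 + n2, k)
    # Sum hypergeometric probabilities over all tables at least as extreme.
    return sum(comb(n1, x) * comb(n2, k - x)
               for x in range(a, min(n1, k) + 1)) / total

# Yalcin 1994: HHV-6 DNA in 7/13 patients vs. 0/13 controls
p = fisher_one_sided(7, 6, 0, 13)
print(f"one-sided p = {p:.4f}")   # one-sided p = 0.0026
```

The same function can be pointed at any of the percentage rows above once the percentages are converted back to counts.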

Chronic Fatigue and Other Infections

There are some studies that have found associations between chronic fatigue syndrome and other types of bacterial and viral infection.

Mycoplasma


Mycoplasma bacterial species can survive for a long time inside cells, evade immune response, and resist treatment with antibiotics. They can cause a form of pneumonia and a sexually transmitted disease, and have been associated with various types of cancer.

In a study of 200 CFS patients and 100 controls, 52% of CFS patients had Mycoplasma infections compared to 7% of controls, and 30.5% of CFS patients had HHV-6 infections compared to 9% of controls, as measured by forensic PCR.[41]

In 100 CFS patients and 50 controls, 52% of CFS patients had PCR results positive for Mycoplasma genus, compared to 14% of controls (p < 0.0001).[42]

Other viruses

In 258 patients from Dubbo in rural Australia, exposed to Epstein-Barr virus, Ross River virus, or Q fever, 35% had a post-infective fatigue syndrome at 6 weeks and 12% at 6 months, at which point 11% (28 patients) met criteria for chronic fatigue syndrome. [43]

Out of 51 patients infected with acute Parvovirus B19, 5 went on to meet criteria for CFS.  Those with prolonged fatigue and CFS had significantly higher rates of serum B19 DNA.[44]

Of 50 patients with postviral fatigue, 6 cases were associated with a local epidemic of Coxsackie virus, and 9 with a different viral epidemic of unknown cause; 30 had high antibody titers to Coxsackie virus, but none to other viruses.[45]

Chronic fatigue syndrome seems to frequently follow acute infections, and it is associated with high DNA levels of pathogens, often ones (like viruses or Mycoplasma bacteria) that can persist in the body indefinitely.

Corticosteroids Relieve CFS In A Sub-Population of Patients

In a study of 37 patients with chronic fatigue syndrome and 28 healthy controls, the CFS group had higher baseline cortisol levels but weaker cortisol responses to CRH and fenfluramine, and lower urinary cortisol levels.  In a subset of responders (8 out of 23 patients) treated with low-dose hydrocortisone for 28 days, the blunted cortisol response recovered, and CRH again caused a strong cortisol spike.  In these patients, fatigue dropped to the same level as in the normal population.[46]

In a randomized trial of 32 CFS patients with no comorbid disorders, self-reported fatigue scores fell by 7.2 points in treatment group vs. 3.3 points in placebo group (p = 0.009), and 28% of treated patients reached normal levels of fatigue, compared to 9% of the placebo patients. This was a crossover study: patients received either hydrocortisone or placebo for one month, and then the reverse.[50]

Patients with CFS have higher DHEA levels than controls; there is a correlation between higher DHEA and more disability; untreated CFS patients have a blunted DHEA response to CRH challenge compared to controls and hydrocortisone-treated CFS patients; and basal DHEA levels also went down after treatment with hydrocortisone.[47]

In a much older study from 1948, 53 patients with chronic mononucleosis, with “infectious mononucleosis cells” in the blood, presenting with weakness or ease of fatigue, responded only to a preparation of adrenal cortical extract (“cortalex”). “There was but little subjective improvement during the first week, but a definite feeling of well being developed during the second week and was quite definite during the third week. After this the medication was discontinued and the improvement usually continued. In a few patients it was necessary to increase the dose, or resume it after its discontinuance. Associated with the subjective improvement, there was a decrease in the size of the spleen.”[48]

However, when patients are not selected for having a blunted cortisol response, sometimes trials of corticosteroids on CFS don’t show positive results.

A crossover study of 80 patients given hydrocortisone and fludrocortisone found no significant difference from placebo in reported fatigue.  Note that the treatment group here did not see a larger response than placebo to an ACTH injection. So this negative result would still be consistent with the hypothesis that steroids work only when they recover the cortisol response to CRH or ACTH.[49]

A controlled study of 63 patients given low-dose hydrocortisone vs. placebo found no significant difference in wellness score over a period of 3 months, but significantly more treated patients (53% vs. 29%, p = 0.04) experienced an improvement of >5 points on the wellness score, which could be consistent with the drug being effective in a sub-population.[51]

Corticosteroids in Autoimmune Neurological Disorders

Chronic fatigue syndrome has similar symptoms and may have similar causes to other autoimmune neurological disorders such as multiple sclerosis and inflammatory neuropathies. Fatigue, muscle weakness, and brain fog, as well as high antibody titers for viruses, are found in these diseases. Corticosteroids are often standard treatments. This suggests that analogous treatment may be useful in CFS.

Corticosteroids (particularly methylprednisolone) decreased by 63% the probability that a patient fails to recover from an exacerbation of multiple sclerosis, according to a Cochrane review.[52]

IVIG and/or corticosteroids are standard treatment for chronic inflammatory demyelinating polyradiculoneuropathy.  Both significantly reduce disability scores.[53][54]

Demyelinating peripheral neuropathy responded to corticosteroids in six children, who regained strength and ability to walk.[55]

Corticosteroids (prednisolone) have significant positive effects on muscle strength and ability to function in daily life for patients with myasthenia gravis, an autoimmune neurological disorder.[56]

However, corticosteroids are ineffective in Guillain-Barre syndrome, another autoimmune demyelinating disease causing weakness and numbness. Standard treatment for Guillain-Barre is plasmapheresis and/or IVIG.[57]

Corticosteroids suppress inflammation, so they are often effective on autoimmune disorders which damage the nervous system through inflammatory damage. While it is not known what causes CFS, if it is an autoimmune disorder, it may respond to similar treatment.

IVIG Is Not Consistently Effective in CFS

Intravenous immunoglobulin (IVIG) is the practice of treating immune disorders with pooled antibodies, delivered by infusion.

A 30-person randomized trial of IVIG in CFS, with a dose of 1 gm/kg, found no significant differences in symptoms between treatment and control by the 5-month follow-up point.[58]

A 99-patient controlled trial of IVIG vs. placebo infusion on CFS patients found no significant treatment effect on any self-reported symptom scores.[59]

A 71-patient randomized controlled trial of IVIG vs. placebo infusion found a barely-significant (p = 0.04) difference between placebo and IVIG on symptom scores.[64]

However, a 49-person study of CFS patients treated with IVIG at a dose of 2 gm/kg, 40 of whom had reduced T-cell counts or reduced responses to skin-test antigens, found that 43% of the treated group, compared to 12% of controls, noticed major reductions in their symptoms at the 3-month follow-up after treatment.  The responders also saw their cell-mediated immunity findings recover.[60]

It’s possible that for a sub-population of CFS patients with abnormally low T-cell counts or T-cell subtype counts, IVIG can be helpful; but it doesn’t seem to be helpful for CFS patients across the board.

Staphylococcus Toxoid May Help CFS

A randomized trial treating 100 fibromyalgia or CFS patients with staphylococcus toxoid or placebo found that the treatment group had 65% responders (reduction of >50% of symptoms on a comprehensive rating scale), compared to 18% for placebo (p < 0.001). There were improvements at the p < 0.01 level in fatiguability, reduced sleep, failing memory, concentration difficulties, and sadness.[61]

Rituximab May Help CFS

Rituximab, an immunosuppressant drug that targets B cells, was found to improve fatigue scores in 67% of 30 patients in a randomized trial, compared to 13% of placebo patients (p = 0.003). There were no adverse effects except a worsening of psoriasis in two patients.[62]

In an open-label follow-up from the same lab, 18 out of 29 patients on maintenance rituximab therapy for 15 months had clinically significant responses.[63]


Reduced NK activity and viral reactivations naturally go together, and stress can cause both.  Cortisol usually inhibits NK activity, so long-term hypocortisolism might result in NK cells that become more sensitive to cortisol[65], a possible mechanism for how an impaired HPA axis could result in NK dysfunction and thence viral reactivation.  The picture that seems to be emerging is that prolonged stress and/or an acute viral infection can result in fatigue and immunocompromise. This would explain why there are often psychological comorbid factors.

If this is what’s going on, then the obvious intervention points would be to increase cortisol (particularly the phasic cortisol response to stress) and to increase NK activity.  Administering low dose corticosteroids seems to do reasonably well at the former. It’s not clear how to do the latter, but cytokines like IL-15 might work[66] and so might bacterial therapies like the staphylococcus toxin mentioned above.


[1]Roberts, Amanda DL, et al. “Salivary cortisol response to awakening in chronic fatigue syndrome.” The British Journal of Psychiatry 184.2 (2004): 136-141.

[2]Strickland, Paul, et al. “A comparison of salivary cortisol in chronic fatigue syndrome, community depression and healthy controls.” Journal of Affective Disorders 47.1 (1998): 191-194.

[3]Jerjes, W. K., et al. “Diurnal patterns of salivary cortisol and cortisone output in chronic fatigue syndrome.” Journal of affective disorders 87.2 (2005): 299-304.

[4]Cleare, Anthony J., et al. “Urinary free cortisol in chronic fatigue syndrome.” American Journal of Psychiatry 158.4 (2001): 641-643.

[5]Scott, Lucinda V., and Timothy G. Dinan. “Urinary free cortisol excretion in chronic fatigue syndrome, major depression and in healthy volunteers.” Journal of Affective Disorders 47.1 (1998): 49-54.

[6]Cleare, Anthony J., et al. “Contrasting neuroendocrine responses in depression and chronic fatigue syndrome.” Journal of affective disorders 34.4 (1995): 283-289.

[7]Wood, Barbara, et al. “Salivary cortisol profiles in chronic fatigue syndrome.” Neuropsychobiology 37.1 (1998): 1-4.

[8]Nater, Urs M., et al. “Attenuated morning salivary cortisol concentrations in a population-based study of persons with chronic fatigue syndrome and well controls.” The Journal of Clinical Endocrinology & Metabolism 93.3 (2008): 703-709.

[9]Gaab, Jens, et al. “Low-dose dexamethasone suppression test in chronic fatigue syndrome and health.” Psychosomatic Medicine 64.2 (2002): 311-318.

[10]Scott, Lucinda V., Sami Medbak, and Timothy G. Dinan. “The low dose ACTH test in chronic fatigue syndrome and in health.” Clinical endocrinology 48.6 (1998): 733-737.

[11]Jerjes, Walid K., et al. “Enhanced feedback sensitivity to prednisolone in chronic fatigue syndrome.” Psychoneuroendocrinology 32.2 (2007): 192-198.

[12]Scott, Lucinda V., et al. “Small adrenal glands in chronic fatigue syndrome: a preliminary computer tomography study.” Psychoneuroendocrinology 24.7 (1999): 759-768.

[13]Klimas, Nancy G., et al. “Immunologic abnormalities in chronic fatigue syndrome.” Journal of clinical microbiology 28.6 (1990): 1403-1410.

[14]Levine, Paul H., et al. “Dysfunction of natural killer activity in a family with chronic fatigue syndrome.” Clinical immunology and immunopathology 88.1 (1998): 96-104.

[15]Caligiuri, Michael, et al. “Phenotypic and functional deficiency of natural killer cells in patients with chronic fatigue syndrome.” The Journal of Immunology 139.10 (1987): 3306-3313.

[16]Patarca-Montero, Roberto, et al. “Immunology of chronic fatigue syndrome.” Journal of Chronic Fatigue Syndrome 6.3-4 (2000): 69-107.

[17]Strayer, D., V. Scott, and W. Carter. “Low NK cell activity in Chronic Fatigue Syndrome (CFS) and relationship to symptom severity.” J Clin Cell Immunol 6 (2015): 348.

[18]Orange, Jordan S. “Natural killer cell deficiency.” Journal of Allergy and Clinical Immunology 132.3 (2013): 515-525.

[19]Nerozzi, Dina, et al. “Reduced natural killer cell activity in major depression: neuroendocrine implications.” Psychoneuroendocrinology 14.4 (1989): 295-301.

[20]Sieber, William J., et al. “Modulation of human natural killer cell activity by exposure to uncontrollable stress.” Brain, behavior, and immunity 6.2 (1992): 141-156.

[21]Irwin, Michael, et al. “Reduction of immune function in life stress and depression.” Biological psychiatry 27.1 (1990): 22-30.

[22]Glaser, Ronald, et al. “Stress depresses interferon production by leukocytes concomitant with a decrease in natural killer cell activity.” Behavioral neuroscience 100.5 (1986): 675.

[23]Irwin, Michael, et al. “Plasma cortisol and natural killer cell activity during bereavement.” Biological psychiatry 24.2 (1988): 173-178.

[24]Irwin, Michael, et al. “Partial night sleep deprivation reduces natural killer and cellular immune responses in humans.” The FASEB journal 10.5 (1996): 643-653.

[25]Moldofsky, Harvey, et al. “Effects of sleep deprivation on human immune functions.” The FASEB Journal 3.8 (1989): 1972-1977.

[26]Buchwald, Dedra, et al. “Viral serologies in patients with chronic fatigue and chronic fatigue syndrome.” Journal of medical virology 50.1 (1996): 25-30.

[27]Sumaya, Ciro V. “Serologic and virologic epidemiology of Epstein-Barr virus: relevance to chronic fatigue syndrome.” Review of Infectious Diseases 13.Supplement 1 (1991): S19-S25.

[28]Mawle, Alison C., et al. “Seroepidemiology of chronic fatigue syndrome: a case-control study.” Clinical Infectious Diseases 21.6 (1995): 1386-1389.

[29]Levine, Paul H., et al. “Clinical, epidemiologic, and virologic studies in four clusters of the chronic fatigue syndrome.” Archives of internal medicine 152.8 (1992): 1611-1616.

[30]Buchwald, Dedra, et al. “A chronic illness characterized by fatigue, neurologic and immunologic disorders, and active human herpesvirus type 6 infection.” Annals of internal medicine 116.2 (1992): 103-113.

[31]Holmes, Gary P., et al. “A cluster of patients with a chronic mononucleosis-like syndrome: is Epstein-Barr virus the cause?.” JAMA 257.17 (1987): 2297-2302.

[32]Lerner, A. Martin, et al. “IgM serum antibodies to Epstein-Barr virus are uniquely present in a subset of patients with the chronic fatigue syndrome.” In Vivo 18.2 (2004): 101-106.

[33]Swanink, Caroline MA, et al. “Epstein-Barr virus (EBV) and the chronic fatigue syndrome: normal virus load in blood and normal immunologic reactivity in the EBV regression assay.” Clinical infectious diseases 20.5 (1995): 1390-1392.

[34]Patnaik, Madhumita, et al. “Prevalence of IgM antibodies to human herpesvirus 6 early antigen (p41/38) in patients with chronic fatigue syndrome.” Journal of Infectious Diseases 172.5 (1995): 1364-1367.

[35]Sairenji, Takeshi, et al. “Antibody responses to Epstein-Barr virus, human herpesvirus 6 and human herpesvirus 7 in patients with chronic fatigue syndrome.” Intervirology 38.5 (1995): 269-273.

[36]Ablashi, D. V., et al. “Frequent HHV-6 reactivation in multiple sclerosis (MS) and chronic fatigue syndrome (CFS) patients.” Journal of Clinical Virology 16.3 (2000): 179-191.

[37]Yalcin, Safak, et al. “Prevalence of human herpesvirus 6 variants A and B in patients with chronic fatigue syndrome.” Microbiology and immunology 38.7 (1994): 587-590.

[38]Di Luca, Dario, et al. “Human herpesvirus 6 and human herpesvirus 7 in chronic fatigue syndrome.” Journal of clinical microbiology 33.6 (1995): 1660-1661.

[39]Koelle, David M., et al. “Markers of viral infection in monozygotic twins discordant for chronic fatigue syndrome.” Clinical Infectious Diseases 35.5 (2002): 518-525.

[40]Whelton, C. L., I. Salit, and H. Moldofsky. “Sleep, Epstein-Barr virus infection, musculoskeletal pain, and depressive symptoms in chronic fatigue syndrome.” The Journal of rheumatology 19.6 (1992): 939-943.

[41]Nicolson, G. L., R. Gan, and J. Haier. “Multiple co‐infections (Mycoplasma, Chlamydia, human herpes virus‐6) in blood of chronic fatigue syndrome patients: association with signs and symptoms.” Apmis 111.5 (2003): 557-566.

[42]Vojdani, A., et al. “Detection of Mycoplasma genus and Mycoplasma fermentans by PCR in patients with Chronic Fatigue Syndrome.” FEMS Immunology & Medical Microbiology 22.4 (1998): 355-365.

[43]Hickie, Ian, et al. “Post-infective and chronic fatigue syndromes precipitated by viral and non-viral pathogens: prospective cohort study.” Bmj 333.7568 (2006): 575.

[44]Kerr, Jonathan R., et al. “Chronic fatigue syndrome and arthralgia following parvovirus B19 infection.” The Journal of Rheumatology 29.3 (2002): 595-602.

[45]Hickie, Ian, et al. “Post-infective and chronic fatigue syndromes precipitated by viral and non-viral pathogens: prospective cohort study.” Bmj 333.7568 (2006): 575.

[46]Cleare, A. J., et al. “Hypothalamo-pituitary-adrenal axis dysfunction in chronic fatigue syndrome, and the effects of low-dose hydrocortisone therapy.” The Journal of Clinical Endocrinology & Metabolism 86.8 (2001): 3545-3554.

[47]Cleare, A. J., V. O’Keane, and J. P. Miell. “Levels of DHEA and DHEAS and responses to CRH stimulation and hydrocortisone treatment in chronic fatigue syndrome.” Psychoneuroendocrinology 29.6 (2004): 724-732.

[48]Isaacs, Raphael. “Chronic infectious mononucleosis.” Blood 3.8 (1948): 858-861.

[49]Blockmans, Daniel, et al. “Combination therapy with hydrocortisone and fludrocortisone does not improve symptoms in chronic fatigue syndrome: a randomized, placebo-controlled, double-blind, crossover study.” The American journal of medicine 114.9 (2003): 736-741.

[50]Cleare, Anthony J., et al. “Low-dose hydrocortisone in chronic fatigue syndrome: a randomised crossover trial.” The Lancet 353.9151 (1999): 455-458.

[51]McKenzie, Robin, et al. “Low-dose hydrocortisone for treatment of chronic fatigue syndrome: a randomized controlled trial.” Jama 280.12 (1998): 1061-1066.

[52]Citterio, Antonietta, et al. “Corticosteroids or ACTH for acute exacerbations in multiple sclerosis.” The Cochrane Library (2000).

[53]Hughes, R. A. C., et al. “European Federation of Neurological Societies/Peripheral Nerve Society guideline on management of chronic inflammatory demyelinating polyradiculoneuropathy: report of a joint task force of the European Federation of Neurological Societies and the Peripheral Nerve Society.” European journal of neurology 13.4 (2006): 326-332.


On Drama

Epistemic Status: Loose but mostly serious

One of the things that’s on my mind a lot is the psychology of Nazis.  Not neo-Nazis, but the literal Nazi party in Germany in the 1930’s and 40’s. In particular, Adolf Hitler.  What was it like inside his head? What could make a person into Hitler?

When I read Mein Kampf, I was warned by my more historically-minded friends that it wasn’t a great way to learn about Nazism. Hitler, after all, was a master manipulator. His famous work of propaganda would obviously paint him in an unrealistically favorable light.

The actual impressions I got from Mein Kampf, though, were very similar to the psychological profile of Hitler compiled by the OSS (h/t Alice Monday), the US’s intelligence service during WWII and the predecessor of the CIA.

Here’s what Hitler was like, as presented by the OSS:

  • Lazy by default, only able to be active when agitated
  • Totally uninterested in details, facts, sitting down to work, “dull” things
  • Dislikes and fears logic, prefers intuition
  • Keen understanding of human psychology, especially “baser” urges
  • Very sensitive to the “vibe” of the room, the emotional arc of the crowd
  • Strong aesthetic sense and interest in the visual and theatrical
  • Highly sentimental, kind to dogs and children, accepting of personal foibles
  • Views human interaction through the lens of seduction and sadomasochism
  • Eager to submit as well as to dominate, but puzzled or disgusted by anything which is neither submission nor domination
  • Sensitive to slights, delighted by praise, obsessed with superficial marks of rank & respect
  • Fixated on personal loyalty
  • Suicidal (and frequently threatened suicide long before he actually did it)

This is all very Cluster B, though the terminology for personality disorders didn’t exist at the time and I’m obviously not in a position to make a diagnosis.  Hitler’s tantrums, impulsiveness, inability to have lasting relationships, constant seeking of approval and need to be at the center of attention, grandiosity, envy, and lack of concern for moral boundaries are all standard DSM symptoms of personality disorders.

In his own words, Hitler was very opposed to rule of law and intellectual principles: “The spectacled theorist would have given his life for his doctrine rather than for his people.”  He disapproved of intellectuals and of logical thinking, had contempt for “Manchester liberalism” (classical liberalism) and commerce, and instead praised the spiritual transfiguration that masses of people could attain through patriotism and self-sacrifice.

He said, “A new age of magic interpretation of the world is coming, of interpretation in terms of the will and not of the intelligence. There is no such thing as truth either in the moral or the scientific sense.”

He believed strongly in the need for propaganda, and repeatedly explained the principles for designing it:

  • it must be simple and easy to understand by the uneducated
  • it must be one-sided and present us as absolutely good and the enemy as absolutely bad
  • it must have constant repetition
  • it should NOT be designed to appeal to intellectuals or aesthetes
  • it should focus on feelings not objectivity

He believed in the need of the people for “faith”, not because he was a believing Christian, but because he thought it was psychologically necessary:

“And yet this human world of ours would be inconceivable without the practical existence of a religious belief. The great masses of a nation are not composed of philosophers. For the masses of the people, especially faith is absolutely the only basis of a moral outlook on life. The various substitutes that have been offered have not shown any results that might warrant us in thinking that they might usefully replace the existing denominations. But if religious teaching and religious faith were once accepted by the broad masses as active forces in their lives, then the absolute authority of the doctrines of faith would be the foundation of all practical effort. There may be a few hundreds of thousands of superior men who can live wisely and intelligently without depending on the general standards that prevail in everyday life, but the millions of others cannot do so. Now the place which general custom fills in everyday life corresponds to that of general laws in the State and dogma in religion. The purely spiritual idea is of itself a changeable thing that may be subjected to endless interpretations. It is only through dogma that it is given a precise and concrete form without which it could not become a living faith.”

In other words, the picture that is emerging is that Hitler himself craved, and understood other people’s craving, for a certain kind of emotionally resonant experience. Religious or mystical faith; absorption in the crowd; mass enthusiasm; sacrifice of self; and sacrifice of the outsider or scapegoat.  Importantly, truth doesn’t matter for this experience, and critical thinking must be absolutely suppressed in order to fully enact the ritual.

I’m pretty confident, despite not having much knowledge of history, that this was a real and central part of Hitler’s ideology and practice.

If you watch Triumph of the Will, it’s very clearly a mass ritual calculated to produce strong emotional responses from the crowd.

In particular, the emotion it evokes is certainty. The crowd looks to their leader for validation and assurance; and with great confidence, he gives it to them, assuring the German people eternal glory.  One can safely lay down one’s burden of worry and anxious thought.  One can be at peace, knowing that one has Hitler’s love and approval. One can rest in the faith that Hitler will take care of things.

Repetitive call-and-response rituals, endless ranks of soldiers, flags and logos and symbols, huge crowds, rhythmic beats, all give a sense of a simple, steady, loud, bold message. It is cognitively easy. There is no need to strain to hear or understand.  It will be the same, over and over again, forever.

What the OSS report suggests, which Nazi propaganda would never admit, is that Hitler himself craved external validation, and was distraught when it was not supplied.  He understood how badly the people wanted to be led and to be annihilated in the worship of a ruler, because he longed for that submission and release himself.

There is nothing particularly unusual about what I’m saying; the standard accounts of Nazism always make mention of the quasi-religious fanaticism it engendered.  And the connection to ritual is obvious: mass events, loss of individuality in the collective frenzy, the heightening of tension and its release, often through violence.  This is the pattern of all sacrificial festivals.

You can see a modern reconstruction of the primitive sacrificial festival in the Rite of Spring (here, with Nijinsky’s choreography and Roerich’s set design, which captures the atavistic character of the original ballet in a way later productions don’t).

You can also see a version of this in the coronation scene from Boris Godunov, which is a very beautiful expression of quasi-religious mass worship for a state leader.

There’s an important connection between drama in the colloquial sense (the drive for emotional validation and for stirring up interpersonal conflict) and drama in the artistic sense (acting out a play to produce a sense of catharsis in the audience, originally as part of a religious ritual involving both sacrifice and collective frenzy).

Drama in both senses is about evoking emotions and provoking sympathies.  Drama requires an emotional arc, in which tension rises, comes to a head, and is released (catharsis).

Why is this satisfying?  Why do we like to lose our minds, to go up into an irrational frenzy, and then to come down again, often through sorrow and sympathetic suffering?

Current psychological opinion holds that catharsis doesn’t work; venting anger makes people angrier and more violent, not less so.  This isn’t a new idea; Plato thought that encouraging violent passions through theater would only make them worse.

It’s possible that the purpose of drama isn’t to help people cool down, but quite the opposite: to provide plausibly-deniable occasions for mob violence, and to bind the group closer together by sharing strong emotional connections.  Emotional mirroring helps groups coordinate better, including for war or hunting. Highly rhythmic activities (like music, dance, and chanting) both promote emotional mirroring and make it easy to detect those individuals who are out of step or disharmonious.

(In the original Nijinsky choreography of the Rite of Spring, the girl who is chosen to be a human sacrifice is chosen by lot, through a “musical-chairs”-style game in which the one caught out of the circle is singled out. In both Greek and Biblical tradition, sacrifices were chosen by lot. “Random” choice of a victim is often an excellent, plausibly-deniable way to promote subconscious choice.)

Ben Hoffman’s concept of empathy as herd cognition is similar, though humans are more like pack predators than true herd animals.  Emotions are shared directly, through empathy, through song and dance and nonverbal vibrations.  This is a low-bandwidth channel and can’t convey complex chained plans ahead of time.  You can’t communicate “if-then” statements directly through emotional mirroring.  But you can communicate a lot about friend and foe, and guide quite complex behaviors through “warmer, colder, warmer”-style reinforcement learning.

It’s a channel of communication that’s optimized to be intelligible only to the people who are in harmony at the moment — that is, those who are feeling the same thing, are part of the group, are acting in roughly the same way.  This has some disadvantages. For one thing, it’s hard to use it to coordinate division of labor. You need more explicit reasoning to, for instance, organize your army into a pincer movement, as Shaka Zulu did.  Emotion-mirroring motivates people to “act as one”, not to separate into parts.  For another thing, emotion-mirroring doesn’t allow for fruitful disagreement or idea-generation, because that’s inherently disharmonious, no matter how friendly in intent or effect; suggesting a different idea is differing from the group.

The advantage of emotional-mirroring as a form of communication is precisely that it is only intelligible to people who are engaging in the mirroring. If you are coordinating against the people who are out of sync or out of harmony, you can be secretive in plain view, simply by communicating through a rhythm that they can’t quite detect.

It makes sense, in a sort of selfish-gene way.  A gene which caused individuals to become very good at coordinating with others who had the gene, to kill those who didn’t have the gene, would promote natural selection for itself.  It would make it feel good to harmonize and “become one with” the crowd, and elevate rage to a fever pitch against those who would interrupt the harmony.  Those who didn’t have the gene would be worse at seeing the mob coming, and would not be able to secretly coordinate with each other.

(This idea is not due to me, but to a friend who might prefer to remain anonymous.)

Only a small portion of the population can be antisocial in the long run, where “antisocial” means impulsive aggression, in the sense of “people who are more likely to drive at the oncoming car in the game of Chicken”; evolutionary game theory simulations bear that out.  Aggressive or risk-seeking behavior can only be a minority trait: while it does bring more sexual success and more short-term wins in adversarial games, its carriers run too high a risk of dying.  But the more sensitive, harmony-coordination-mob trait might be better at surviving, because it’s usually quiescent and only initiates violence when there’s a critical mass of people moving in unison.
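The claim that impulsive aggression persists only as a minority trait is what a standard hawk-dove (Chicken) model predicts when fighting costs more than the prize is worth. Here’s a toy replicator-dynamics sketch; the payoff numbers (V, C) and learning rate are illustrative assumptions, not taken from any cited simulation:

```python
# Hawk-Dove (Chicken) replicator dynamics: does aggression stay a minority trait?
# V = value of the contested resource, C = cost of an escalated fight (C > V).
V, C = 1.0, 3.0

def payoffs(p):
    """Expected payoff to a hawk and to a dove when a fraction p of the population are hawks."""
    hawk = p * (V - C) / 2 + (1 - p) * V   # vs. hawk: split value, pay cost; vs. dove: take everything
    dove = (1 - p) * V / 2                 # vs. hawk: retreat (0); vs. dove: share
    return hawk, dove

p = 0.9  # start with mostly aggressive individuals
for _ in range(10000):
    hawk, dove = payoffs(p)
    mean = p * hawk + (1 - p) * dove
    p += 0.01 * p * (hawk - mean)          # discrete replicator update

print(round(p, 2))  # converges to V/C = 1/3
```

With these numbers the population settles at one-third hawks: aggressive types earn enough from exploiting doves to persist, but fights among themselves keep them from taking over.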

There also may be the “charismatic” or “Dionysian” or “actor/performer/poet/bard” trait: the ability of an individual to activate people’s harmony-sensing, emotional-mirroring moods, the ability to make people get up and dance or cheer or fight.  People with borderline personality disorder sometimes are better than neurotypicals at reading emotions and inferring people’s feelings and intentions in social situations.  Hyper-sensitive, hyper-expressive people may also be a stable minority strategy; minority, because getting people worked up increases risk, though not as much as unilaterally seeking conflict oneself.

High drama is, obviously, dangerous. It is also powerful and at times beautiful. Even those of us who would never be Nazis can be moved by art and music and theater and religious ritual.  It’s a profound part of the human psyche.  It’s just important to be aware of how it works.

Drama is inherently transient and immediate. It’s like a spell; it affects those within range, while the spell is being sustained, and dissipates when the spell is broken. If you want to enhance drama, you create an altered environment, separate from everyday life, and aim for repetition, unanimity, and cohesiveness.  You rev people up with enthusiasm.  You say “Yes, and…”, as in improv. If you want to dispel drama, you break up the scene with interruptions, disagreements, references to mundane details, collages of discordant elements.  You deescalate emotions by becoming calm and boring.  You impede the momentum. 

If you have a plan that you’re afraid will fail unless everyone stays revved up 24/7 and unanimously enthusiastic, you have a plan that’s being communicated through drama, and you need to beware that drama is by nature typically transient, irrational, and violent.

Denotative language, as opposed to enactive language, is literally opposed to role-playing. When you say out loud what is going on — not to cause anyone to do anything, but literally just to inform them what is going on — you are “breaking character.”

If I am playing the role of a sad person, it’s breaking character to say “I’d probably feel better if I took a nap.”  That’s not expressing sadness! That’s not what a Sad Person would say!  It’s not acting out the arc of “inconsolableness” to its inevitable conclusion. It’s cutting corners.  Cheating, almost.  Breaking momentum.

By alluding to the reality beyond the current improv scene, the scaffolding of facts and interests that lasts even after passions have cooled, I am ruining the scene and ceding my power to shape it, but potentially gaining a qualitatively different kind of power.

Breaking flow is inherently frustrating, because we humans probably have a desire for flow for its own sake.  Drama wants drama. Flow wants flow.

But ultimately, there’s a survival imperative that limits all of these complex adaptations. You have to be alive in order to act out a drama. The “scaffolding” facts of practical reality remain, even if they’re mostly far away when you’re well-insulated from danger.  Drama provides a relative, but not an absolute, survival advantage, which means it’s more-or-less a parasitic phenomenon, and has natural limitations on how much behavior it can co-opt before negative consequences start showing up.


Parenting and Heritability Overview

Epistemic status: pretty preliminary, not conclusive

Can parenting affect children’s outcomes? Can you raise your child to be better, healthier, smarter, more successful?

There’s a lot of evidence, from twin and adoption studies,  that behavioral traits are highly heritable and not much affected by adoptive parents or by the environment shared between siblings.

High heritability does not strictly imply that parenting doesn’t matter, for a few reasons.

  1. Changes across the entire population don’t affect heritability. For example, heights have risen as nutrition has improved, but height remains just as heritable.  So if parenting practices have changed over time, heritability won’t show whether those changes helped or hurt children.
  2. Family environment and genes may be positively correlated. For instance, if a gene for anxiety causes both anxiety in children and harshness in parents, then it may be that the parenting still contributes to the children’s anxiety.  If parents who overcome their genetic predispositions are sufficiently rare, it may still be possible that choosing to parent differently can help.
  3. Rare behaviors won’t necessarily show up at the population level.  Extremely unusual parenting practices can still be helpful (or harmful), if they’re rare enough to not be caught in studies.  Extremely unusual outcomes in children (like genius-level achievement) might also not be caught in studies.
  4. Subtle effects don’t show up in studies that easily. A person who has to spend a lot of time in therapy unlearning subtle emotional harms from her home environment won’t necessarily show up as having a negative outcome on a big correlational study.

With those caveats in mind, let’s see what the twin and adoption studies show.


Personality

In a study of 331 pairs of twins reared together and apart, a negligible proportion of the variance in personality was due to shared family environment.  About 50% of the variance in personality scores was due to genetics; average heritability was 0.48.[1]
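For context on where numbers like “heritability 0.48” come from: classic twin designs split trait variance into additive genetics (A), shared environment (C), and non-shared environment (E), recoverable from the identical-twin and fraternal-twin correlations via Falconer’s formulas. A minimal sketch with illustrative correlations (chosen to echo the qualitative pattern above, not this study’s actual figures):

```python
# ACE decomposition via Falconer's formulas (illustrative numbers only).
# MZ twins share ~100% of segregating genes, DZ twins ~50%, so:
#   r_mz = A + C
#   r_dz = A/2 + C
r_mz = 0.50   # hypothetical MZ-twin correlation on a personality score
r_dz = 0.26   # hypothetical DZ-twin correlation

A = 2 * (r_mz - r_dz)   # heritability (additive genetic share of variance)
C = 2 * r_dz - r_mz     # shared family environment
E = 1 - r_mz            # non-shared environment + measurement error

print(round(A, 2), round(C, 2), round(E, 2))  # 0.48 0.02 0.5
```

These made-up correlations reproduce the shape of the result above: substantial heritability, near-zero shared-environment contribution, with the rest falling to non-shared environment.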

Attachment Style

In a study of 125 early-adopted adolescents, secure-attached infants were more likely to grow into secure-attached teenagers (correlation 0.30, p < 0.01), and mothers of secure adolescents were more likely to show “sensitive support” (high relatedness and autonomy in resolving disagreements with children) at age 14 (p < 0.03).[2]

Antisocial/Criminal Behavior

An adoption study found that adolescents whose adoptive parents had high levels of conflict with them (arguments, hitting, criticizing and hurting feelings, etc.) were more likely to have conduct problems. Correlations were between 0.574 and 0.696. Effects persisted longitudinally (i.e. past conflict predicted future delinquency).[3]

A meta-analysis of 51 twin and adoption studies found that 32% of the variance in antisocial behavior was due to genetic influences, while 16% was due to shared environment influences.[4]


Drug Abuse

In a Swedish adoption study of 18,115 children, adopted children whose biological parents abused drugs were twice as likely to abuse drugs themselves, while having an adoptive parent who abused drugs carried no elevated risk.  However, adoptive siblings of adopted children who abused drugs were twice as likely to abuse drugs as adoptive siblings of adopted children who didn’t. This implies that there is both environmental and genetic influence, but suggests that the environmental influence may be more about peers than parents.[5]

Psychiatric Disorders

Having a mother (but not a father) with major depression was associated with major depression in adopted children, in a study of 1108 adopted and nonadopted adolescents.  The odds ratio of major depression, given a mother with major depression, was 3.61 for nonadopted children and 1.97 for adopted children.  The odds ratio of externalizing disorders, given a mother with depression, was 2.23 for nonadopted children and 1.69 for adopted children.[6]


IQ

The Minnesota Study of Twins Reared Apart, which includes more than 100 pairs of twins, found that 70% of the variance in IQ of monozygotic twins raised apart was genetic. No environmental factor (father’s education, mother’s education, socioeconomic status, physical facilities) contributed more than 3% of the variance between twins. Identical twins correlate about 70% in IQ, 53% on traditionalism, 49% on religiosity, 34% on social attitudes, etc.  Identical twins reared apart are roughly as similar as identical twins reared together.[7]

According to a twin study, heritability on PSAT scores was 50-75%, depending on subscore.[8]


Years of Schooling

The Wisconsin Longitudinal Survey, of 16,481 children of whom 610 were adopted, finds that adoptive parents’ income has a significant positive effect on years of schooling.  Adoptive father’s years of schooling had a significant effect, but adoptive mother’s years of schooling did not. In nonadoptive families, parental IQ and years of schooling (both mother’s and father’s) have statistically significant effects.[9]


Reading Achievement

The Colorado Adoption Study finds that heredity usually explains about 40% of the variance in reading achievement, while adoptive-sibling correlations (a measure of shared environment) explain less than 10% of the variance. The rest is non-shared environment.  Unrelated-sibling correlations are 0.05, while related-sibling correlations are 0.26. Genetic correlations rise with age (from 0.34 at age 7 to 0.67 at age 16).[10]

In the Western Reserve Twin Study of 278 twin pairs, ages 6-12, IQ score variance was mostly due to heritability (37%-78%, depending on subscore) and not on shared environment (<8%).  However, school achievement was more dependent on shared environment (65-73%) than heritability (19-27%).[11]

In a twin study, spelling ability has a heritability of 0.53.[12]

Language ability in toddlers, in a twin study, was found to be more dependent on shared environment than genetics: 71% of variance explained by shared environment, 28% explained by genetics. This was reversed in the case of reading ability in 7-10-year-olds, where 72% of variance was explained by genetics, while 20% was explained by shared environment. Maybe the effects of home environment fade out with age.[13]

Academic Achievement

A twin study of 2602 twin pairs found that 62% of variance in science test scores at age 9 was explained by heredity, compared to 14% shared environment. There was no difference between boys and girls in heritability.[14]

In the Minnesota Twin Study, 51% (girls) and 54% (boys) of the variance in grades is due to heredity.  Genetic contributions were similar for IQ (52%, 37%), externalizing behavior (45%, 47%), and engagement (54%, 49%).  Shared environment mattered less (26%).  The majority (55%) of the change in grades after age 11 is due to “nonshared environment.”[15]



Income

The National Longitudinal Study of Youth, which included full and half-siblings, found that IQ was 64% heritable, education was 68% heritable, and income was 42% heritable.  Almost all the rest of the income variation was non-shared environment (49%), leaving only 9% explained by shared environment.[16]

In a study of Finnish twins, 24% of the variance of women’s lifetime income and 54% of the variance of men’s lifetime income was due to genetic factors, and the contribution of shared environment was negligible.[17]


Corporal Punishment

In laboratory settings, corporal punishment is indeed effective at getting immediate compliance.  In a meta-analysis of mostly correlational and longitudinal studies, the weighted mean effect size of corporal punishment was -0.58 on the parent-child relationship, -0.49 on childhood mental health, 0.42 on childhood delinquent and antisocial behavior, 0.36 on childhood aggression, and 1.13 on immediate compliance. There were no large adult effects significant at the p < 0.01 level, but there was an effect size of 0.57 on aggression significant at the p < 0.05 level.[18]

Bottom line is that corporal punishment is fairly bad for childhood outcomes, but doesn’t usually cause lasting trauma or adult criminal/abusive behavior; still, there are good evidence-based reasons not to do it.


What Parenting Can’t Affect

Personality, IQ, reading ability in teenagers, and income are affected negligibly by the “shared environment” contribution. Drug abuse is also very heritable and not much affected by parenting.


What Parenting Might Affect

Reading ability in children and grades in teenagers have a sizable (but minority) shared environment component; reading ability in toddlers is mostly affected by shared environment. Grades are generally less IQ-correlated than test scores, and are highly affected by school engagement and levels of “externalizing” behavior (disruptive behavior, inattention, criminal/delinquent activity.)  Antisocial and criminal behavior has a sizable (but minority) shared environment component. You may be able to influence your kids to behave better and study harder, and you can definitely teach your kids to read younger, though a lot of this may turn out to be a wash by the time your kids reach adulthood.


What Parenting Can Affect

Having a mother — even an adoptive mother — with major depression puts children at risk for major depression, drug abuse, and externalizing behavior. Conflict at home also predicts externalizing behavior in teenagers. Mothers of teenagers who treat them well are more likely to have teenagers who have loving and secure relationships with them. Basically, if I were to draw a conclusion from this, it would be that it’s good to have a peaceful and loving home and a mentally healthy mom.

Father’s income and family income, but not mother’s income, predict years of schooling; I’m guessing that this is because richer families can afford to send their kids to school for longer. You can, obviously, help your kids go to college by paying for it.


[1]Tellegen, Auke, et al. “Personality similarity in twins reared apart and together.” Journal of personality and social psychology 54.6 (1988): 1031.

[2]Klahr, Ashlea M., et al. “The association between parent–child conflict and adolescent conduct problems over time: Results from a longitudinal adoption study.” Journal of Abnormal Psychology 120.1 (2011): 46.

[3]Klahr, Ashlea M., et al. “The association between parent–child conflict and adolescent conduct problems over time: Results from a longitudinal adoption study.” Journal of Abnormal Psychology 120.1 (2011): 46.

[4]Rhee, Soo Hyun, and Irwin D. Waldman. “Genetic and environmental influences on antisocial behavior: a meta-analysis of twin and adoption studies.” Psychological bulletin 128.3 (2002): 490.

[5]Kendler, Kenneth S., et al. “Genetic and familial environmental influences on the risk for drug abuse: a national Swedish adoption study.” Archives of general psychiatry 69.7 (2012): 690-697.

[6]Tully, Erin C., William G. Iacono, and Matt McGue. “An adoption study of parental depression as an environmental liability for adolescent depression and childhood disruptive disorders.” American Journal of Psychiatry 165.9 (2008): 1148-1154.

[7]Bouchard, T., et al. “Sources of human psychological differences: The Minnesota study of twins reared apart.” (1990).

[8]Nichols, Robert C. “The national merit twin study.” Methods and goals in human behavior genetic (1965): 231-244.

[9]Plug, Erik, and Wim Vijverberg. “Does family income matter for schooling outcomes? Using adoptees as a natural experiment.” The Economic Journal 115.506 (2005): 879-906.

[10]Wadsworth, Sally J., et al. “Genetic and environmental influences on continuity and change in reading achievement in the Colorado Adoption Project.” Developmental contexts of middle childhood: Bridges to adolescence and adulthood (2006): 87-106.

[11]Thompson, Lee Anne, Douglas K. Detterman, and Robert Plomin. “Associations between cognitive abilities and scholastic achievement: Genetic overlap but environmental differences.” Psychological Science 2.3 (1991): 158-165.

[12]Stevenson, Jim, et al. “A twin study of genetic influences on reading and spelling ability and disability.” Journal of child psychology and psychiatry 28.2 (1987): 229-247.

[13]Harlaar, Nicole, et al. “Why do preschool language abilities correlate with later reading? A twin study.” Journal of Speech, Language, and Hearing Research 51.3 (2008): 688-705.

[14]Haworth, Claire MA, Philip Dale, and Robert Plomin. “A twin study into the genetic and environmental influences on academic performance in science in nine‐year‐old boys and girls.” International Journal of Science Education 30.8 (2008): 1003-1025.

[15]Johnson, Wendy, Matt McGue, and William G. Iacono. “Genetic and environmental influences on academic achievement trajectories during adolescence.” Developmental psychology 42.3 (2006): 514.

[16]Rowe, David C., Wendy J. Vesterdal, and Joseph L. Rodgers. “Herrnstein’s syllogism: Genetic and shared environmental influences on IQ, education, and income.” Intelligence 26.4 (1998): 405-423.

[17]Hyytinen, Ari, et al. “Heritability of lifetime income.” (2013).

[18]Gershoff, Elizabeth Thompson. “Corporal punishment by parents and associated child behaviors and experiences: a meta-analytic and theoretical review.” Psychological bulletin 128.4 (2002): 539.

Don’t Shoot the Messenger

Epistemic status: confident but informal

A while back, I read someone complaining that the Lord of the Rings movie depicted Aragorn killing a messenger from Mordor. In the book, Aragorn sent the messenger away.  The moviemakers probably only intended to add action to the scene, and had no idea that they had made Aragorn into a shockingly dishonorable character.

Why don’t you shoot messengers?  What does that tradition actually mean?

Well, in a war, you want to preserve the ability to negotiate for peace.  If you kill a member of the enemy’s army, that puts you closer to winning the war, and that’s fine.  If you kill a messenger, that sends a message that the enemy can’t safely make treaties with you, and that means you destroy the means of making peace — both for this war and the wars to come.  It’s much, much more devastating than just killing one man.

This is also probably why guest law exists in so many cultures.  In a world ruled by clans, where a “stranger” is a potential enemy, it’s vitally important to have a ritual that guarantees nonviolence, such as breaking bread under the same roof. Otherwise there would be no way to broker peace between your family and the stranger over the next hill.

This is why the Latin hostis (enemy) and hospes (guest or host) are etymologically cognate. This is why the Greeks had a concept of xenia so entrenched that they told stories about a man being tied to a fiery wheel for eternity for harming a guest.  This is why the sin of Sodom was inhospitality.

It’s actually not about charity or compassion, exactly. It’s about coordinating a way to not kill each other.

Guest law and not shooting messengers are natural law: they are practical necessities due to game theory, that ancient peoples traditionally concretized into virtues like “honor” or “hospitality.”  But it’s no longer common knowledge what they’re for.

A friend of mine speculated that, in the decades that humanity has lived under the threat of nuclear war, we’ve developed the assumption that we’re living in a world of one-shot Prisoner’s Dilemmas rather than repeated games, and lost some of the social technology associated with repeated games. Game theorists do, of course, know about iterated games and there’s some fascinating research in evolutionary game theory, but the original formalization of game theory was for the application of nuclear war, and the 101-level framing that most educated laymen hear is often that one-shot is the prototypical case and repeated games are hard to reason about without computer simulations.
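To make the one-shot vs. repeated distinction concrete, here’s a toy Prisoner’s Dilemma simulation with standard textbook payoffs; the strategies and numbers are illustrative, not drawn from the post:

```python
# One-shot vs. repeated Prisoner's Dilemma.
# Payoffs (row player, column player): C = cooperate, D = defect.
PAYOFF = {('C','C'): (3,3), ('C','D'): (0,5), ('D','C'): (5,0), ('D','D'): (1,1)}

def play(strat_a, strat_b, rounds):
    """Play two strategies against each other; each sees only the opponent's history."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strat_a(hist_b), strat_b(hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa
        score_b += pb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

defect = lambda opp_hist: 'D'                                      # dominant in the one-shot game
tit_for_tat = lambda opp_hist: opp_hist[-1] if opp_hist else 'C'   # cooperate first, then mirror

print(play(defect, tit_for_tat, 1))         # (5, 0): in one round, defection exploits cooperation
print(play(tit_for_tat, tit_for_tat, 100))  # (300, 300): sustained mutual cooperation
print(play(defect, defect, 100))            # (100, 100): mutual defection does far worse
```

Defection wins any single encounter, but once the game repeats, strategies that sustain cooperation far outscore mutual defection; killing the messenger is profitable only if you assume there is no next round.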

One of the things about living in what feels like the shadow of the end of the world — there’s been apocalypse in the zeitgeist since at least the 1980’s and maybe longer — is that it’s very counterintuitive to think about a future that might last a long time.

What if we’re not wiped out by an apocalypse?  What if humans still have an advanced civilization in 50 years — albeit one that looks very different from today’s?  What if the people who are young today will live to grow old? What would it be like to take responsibility for consequences and second-order effects at the scale of decades?  What would it be like to have models of the next twenty years or so — not for the purpose of sounding cool at parties, but for the sake of having practical plans that actually extend that far?

I haven’t thought much about how to go about doing that, but I think we may have lost certain social technologies that have to do with expecting there to be a future, and it might be important to regain them.

Sepsis Cure Needs An RCT

Epistemic Status: Confident

Every now and then the news comes out with a totally clear-cut, dramatic example of an opportunity to do a lot of good. This is one of those times.

The story began in January, 2016, when Dr. Paul Marik was running the intensive care unit at Sentara Norfolk General Hospital. A 48-year-old woman came in with a severe case of sepsis — inflammation frequently triggered by an overwhelming infection.

“Her kidneys weren’t working. Her lungs weren’t working. She was going to die,” Marik said. “In a situation like this, you start thinking out of the box.”

Marik had recently read a study by researchers at Virginia Commonwealth University in Richmond. Dr. Berry Fowler and his colleagues had shown some moderate success in treating people who had sepsis with intravenous vitamin C.

Marik decided to give it a try. He added in a low dose of corticosteroids, which are sometimes used to treat sepsis, along with a bit of another vitamin, thiamine. His desperately ill patient got an infusion of this mixture.

“I was expecting the next morning when I came to work she would be dead,” Marik said. “But when I walked in the next morning, I got the shock of my life.”

The patient was well on the road to recovery.

Marik tried this treatment with the next two sepsis patients he encountered, and was similarly surprised. So he started treating his sepsis patients regularly with the vitamin and steroid infusion.

After he’d treated about 50 patients, he decided to write up his results. As he described it in Chest, only four of the 47 patients in his case series died in the hospital — and all the deaths were from their underlying diseases, not from sepsis. For comparison, he looked back at 47 patients the hospital had treated before he tried the vitamin C infusion and found that 19 had died in the hospital.
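For a rough sense of how unlikely a 4-versus-19 split is under chance alone, one can run a one-sided Fisher’s exact test on the two groups. This is my own back-of-the-envelope check, not a calculation from Marik’s paper, and it can’t fix the confounds of a retrospective comparison:

```python
from math import comb

def fisher_one_sided(deaths_treated, n_treated, deaths_control, n_control):
    """One-sided Fisher's exact test for a 2x2 table: with the margins fixed,
    the probability of seeing this few deaths or fewer in the treated group
    (a hypergeometric tail sum)."""
    total = n_treated + n_control
    total_deaths = deaths_treated + deaths_control
    denom = comb(total, n_treated)
    return sum(
        comb(total_deaths, k) * comb(total - total_deaths, n_treated - k)
        for k in range(deaths_treated + 1)
    ) / denom

# Marik's case series vs. the historical controls: 4/47 vs 19/47 deaths.
p = fisher_one_sided(4, 47, 19, 47)
```

The p-value comes out well under one percent: the difference is very unlikely to be chance alone, though a retrospective, unblinded comparison can still be biased in ways no p-value addresses — which is exactly why the RCT is needed.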

This is not the standard way to evaluate a potential new treatment. Ordinarily, the potential treatment would be tested head to head with a placebo or standard treatment, and neither the doctors nor the patients would know who in the study was getting the new therapy.

But the results were so stunning, Marik decided that from that point on he would treat all his sepsis patients with the vitamin C infusion. So far, he’s treated about 150 patients, and only one has died of sepsis, he said.

That’s a phenomenal claim, considering that of the million Americans a year who get sepsis, about 300,000 die.

Sepsis is a really big deal. More people die from sepsis every year than from diabetes and COPD combined. Ten thousand people die of sepsis every day.  A lot of these cases are from pneumonia in elderly people, or hospital-acquired infections.  Curing sepsis would put a meaningful dent in the kind of hell that hospital-bound old people experience, that Scott described in Who By Very Slow Decay.

Sepsis is the destructive form of an immune response to infection. Normally the infection is managed with antibiotics, but the immune response still kills 30% of patients.  Corticosteroids, which reduce the immune response, and vitamin C, which reduces blood vessel permeability so that organs are less susceptible to pro-inflammatory signals, can treat the immune response itself.

Low-dose corticosteroids have been found to significantly reduce mortality in sepsis elsewhere in controlled studies (see e.g. here, here, here) and there’s some animal evidence that vitamin C can reduce mortality in sepsis (see here).

This treatment seems to work extraordinarily well in Marik’s retrospective study; it is made of simple, cheap, well-studied drugs with a fairly straightforward mechanism of action; the individual components seem to work somewhat on sepsis too.  In other words, it’s about as good evidence as you can get, before doing a randomized controlled trial.

But, of course, before you can start treating patients with it, you need an RCT.

I wrote Dr. Marik and asked him what the current status of the trials is; he’s got leads at several hospitals: “two in CA, one at Harvard, and one in RI. In addition the Veterinary University of Georgia is proposing a neat study in horses — horses are at increased risk of sepsis.”

But he needs funding.

Medical research does not progress by default. The world is full of treatments that one doctor has tried to great success, which never went through clinical trials, and so we’ll never know how many lives could have been saved.  Some of the best scientists in the world are chronically underfunded. The world has not solved this coordination problem.

By default, things fall apart and never get fixed. They only get better if we act.

You can click on this Google Form to give me estimates of how much you’d be willing to donate, along with your contact information; once I get a sense of what’s possible, my next step will be coordinating with Dr. Marik and finding a good vehicle for donations.

(I don’t have any personal connection to Dr. Marik or to the treatment; I literally just think it’s a good thing to do.)


Are Adult Developmental Stages Real?

Epistemic status: moderately confident

Robert Kegan’s developmental stages have become popular in my corner of the social graph, and I was asked by Abram Demski and Jacob Liechty to write a literature review (which they kindly funded, before I started my new job) of whether Kegan’s theory is justified. Since Kegan’s model is a composite that builds on many previous psychologists’ work, I had to do an overview of several theories of developmental stages.  I cover the theories of Piaget, Kohlberg, Erikson, Maslow, and Kegan.  All of these developmental stage theories posit that there are various levels of cognitive, moral, or psychological maturity and sophistication; children start at the low levels and progress to the higher ones; only a few of the very “wisest” adults reach the very top stages.

This makes intuitive sense and is a powerful story to tell. You can explain conflicts and seemingly strange behavior by understanding that some people are simply on more primitive levels and cannot comprehend more sophisticated ones. You can be motivated to reach towards self-improvement by a model of a ladder of development.

But for the moment I want to ask more from developmental theories than being interesting or good stories; I want to ask if they’re actually correct.

In order for a developmental theory to be correct, I think a few criteria must be met:

  • The developmental stages must be reliably detectable, e.g. by some questionnaire or observational test that has high internal consistency and/or inter-rater reliability
  • The developmental stages must improve with age, at least within a given cohort (most people progress to later stages as they grow older)
  • The developmental stages must be sequential and cumulative (people must learn earlier stages before later ones, and not skip stages)
  • In cases where the developmental stages are supposed to occur at particular ages, they must actually be observed being attained at those ages.

Most of the theories do not appear to meet these criteria.


Jean Piaget was one of the pioneers of child development psychology. Beginning in the 1930’s, his observations of children led him to a sequential theory of how children gain cognitive abilities over time.

Piaget’s stages of cognitive development are:

  • Sensorimotor, ages 0-2, hand-eye coordination and goal-directed motion
  • Pre-operational, ages 2-7, speech, pretend play and use of symbols
  • Concrete operational, age 7-11, inductive logic, perspective-taking
  • Formal operational, ages 11-adult, deductive logic, abstraction, metacognition, problem-solving

Piaget’s first study, The Origins of Intelligence in Children, published in 1952, was conducted on his own three children, from birth to age 2. He and his wife made daily observations of the children.

Reflexes, Piaget noticed, are present from birth: the sucking reflex, upon contact with the nipple, happens automatically. In the first month of life, he notes that the babies become more effective at finding the nipple.  From one to two months, babies self-stimulate even when there is no breast — they suck their thumbs or make sucking motions on their own. This, Piaget calls the “primary circular reaction”. A reflex has been transformed into a self-generated behavior.  At first, the baby can’t reliably find his thumb; he flails his arms until they happen to brush his face, and then engages the sucking reflex.  There are “circular reactions” to grasping, looking, and listening as well.  Later, babies learn to coordinate these circular reactions across senses, and to move their bodies in order to attain an objective (e.g. reaching for an object to take it).

Some of Piaget’s conclusions have been disputed by modern experiments.

In his studies of infants, he tested their ability to reason about objects by occluding the object from view, subjecting it to some further, hidden motion, and then having the child search for the object.

However, younger infants have less physical ability to search, so this task is less appropriate for assessing what young infants know.  In the 1980’s, Leslie and Baillargeon used looking time as a metric for how surprised infants were by observations; since this doesn’t require physical coordination, it allows for accurate assessment of the cognitive abilities of younger infants.  Leslie’s experiments confirmed that babies understand causality, and Baillargeon’s confirmed that babies have object permanence — in both cases, looking times were longer for “impossible” transformations of objects that violated the laws of causality or caused objects to transform when behind a screen.  4-month-old infants are, contra Piaget, capable of object permanence; they understand that objects must move along continuous paths, and that solid objects cannot pass through each other.[2]

Kittens go through Piaget’s sensorimotor stages: first reflexes, then habits (pawing, oscillating head), then secondary circular reactions (wrestling, biting, dribbling with objects), and finally means-end coordination (playing hide-and-seek).[3]  This supports the ordering of sensorimotor skills in Piaget’s classification.

A 1976 study that gave 9-14-year-olds a test and subjected the results to factor analysis found three factors: formal operational systematic permutations; concrete operational addition of asymmetric relations; and formal operational logic of implications.[4]  This supports something like Piaget’s classification of cognitive tasks.

Different studies conflict on which operational stages come before others: is class inclusion always required before multiplication of classes? Ordinal before cardinal? Logical and number abilities before number conservation? There’s no consistent picture.  “By 1970, it was evident in the important book, Measurement and Piaget, that the empirical literature functioned poorly as a data base on which the objective evaluation of Piagetian theory could be effectively attempted (Green, Ford, & Flamer, 1971)…For example, Beard (1963) found that 50% of her 5- to 6-year-old samples conserved quantity (solid). In contrast, Lovell and Ogilvie (1960) and Uzgiris (1964) reported that it is in the 8- to 9-year-old range that children conserve quantity. Elkind (1961) reported that 52% of 6-year-old children conserved weight, but Lovell and Ogilvie (1961) reported this percentage for 10-year-old children.”[5]

According to Piaget’s “structured whole” theory, when children enter a new stage, they should gain all the skills of that stage at once. For instance, they should learn conservation of volume of water at the same time as they learn that the length of a string is conserved. “However, point synchrony across domains has never been found. To the contrary, children manifest high unevenness or decalage (Feldman 1980, Biggs & Collis 1982, Flavell 1982). Piaget acknowledged this unevenness but never explained it; late in his life he asserted that it could not be explained (Piaget 1971).”  However, it’s overwhelmingly true that success at cognitive tasks is age-dependent. On a host of tasks, age is the most potent predictor of performance.[6]

Piaget claimed that children develop cognitive skills in discrete stages, at particular ages, and in a fixed order. None of these claims appear to be replicated across the literature. The weaker claims that children learn more cognitive skills as they grow older, that some skills tend to be learned earlier than others, and that there is some clustering in which children who can perform one skill can also perform similar skills, have some evidentiary support.


Lawrence Kohlberg, working in the 1960’s and 70’s, sought to extend Piaget’s developmental-stage theories to moral as well as cognitive development.

Kohlberg’s stages of moral development are:

  • Obedience and punishment (“how can I avoid punishment?”)
  • Instrumentalist Relativist (“what’s in it for me?”)
  • Interpersonal Concordance (“be a good boy/girl”, conformity, harmony, being liked)
  • “Law and Order” (maintenance of the social order)
  • Social contract (democratic government, greatest good for the greatest number)
  • Universal ethical principles (e.g. Kant)

The evidence for Kohlberg’s theory comes from studies of how people respond to questions about hypothetical moral dilemmas, such as “Heinz steals the drug”, a story about a man who steals an expensive drug to save his dying wife.

Kohlberg did longitudinal studies of adolescents and adults over a period of six years, in the US, Taiwan, Mexico, and isolated villages in Turkey and the Yucatan.  In all three developed-country examples, the prevalence of stages 1 and 2 declined with age, while the prevalence of 5 and 6 increased with age.  In the isolated villages, stage 1 declined with age, while stage 3 and 4 increased with age, and stages 5 and 6 were always rare. Among 16-year-olds, Stage 5 was the most common in the US, while stages 3 and 4 were the most common in Taiwan and Mexico; in the isolated villages, Stage 1 was still the most common by age 16.

In adults there was likewise some change in moral development over time — Stage 4 (law and order) increased with age from 16 to 24, in both lower- and middle-class men, and the highest rates of stage 4 were found in the men’s fathers.  Most men stabilize at Stage 4, while most women stabilize at stage 3.

Kohlberg’s experiments show that there is change with age in how people explain moral reasoning, which is similar in direction but different in magnitude across cultures.[7]

In subsequent studies from around the world, 85% (out of 20 cross-sectional studies) showed an increase in moral stage with age, and none of them found “stage skipping” (all stages between the lowest and the highest were present.)  Contra Kohlberg, most subsequent studies do not show significant sex differences in moral reasoning. There are some cultural differences: stage 1 does not show up in children in Iran or Hutterite children; most folk tribal societies do not have stages 4, 5, or 6 at all.[8]

Subsequent studies have shown that children do in fact go through Kohlberg’s stages sequentially, usually without stage skipping or regression.[9]

Juvenile delinquents have lower scores on Kohlberg’s moral development test than nondelinquents; moreover, the most psychopathic delinquents had the lowest scores.[11]

Jonathan Haidt has critiqued Kohlberg’s theory, on the grounds that people’s verbal reasoning process for justifying moral hypotheticals does not drive their conclusions.  In hypothetical scenarios about taboos —  like a pair of siblings who have sex, using birth control and feeling no subsequent ill effects — people quickly assert that incest is wrong, but can’t find rational explanations to justify it. People’s affective associations with taboo scenarios (such as claiming that it would upset them to watch) were better predictors of their judgments than their assessments of the harm of the scenarios.[10]

If the social intuitionists like Haidt are correct, then research in Kohlberg’s paradigm may tell us something about people’s verbal narratives about morality, but not about their decision-making process.

There is also the possibility that interviews about hypotheticals are not good proxies for moral decision-making in practice; people may give the explanations that are socially desirable rather than the real reasons for their judgments, and their judgments about hypotheticals may not correspond to their actions in practice.

Still, Kohlberg’s stages are an empirical phenomenon: there is high inter-rater reliability, people  advance steadily in stage with age (before stabilizing), and industrialized societies have higher rates of the higher stages.


Erik Erikson was a psychoanalyst who came to his own theory of stages of psychosocial development in the 1950’s, in which different stages of life force the individual to confront different challenges and develop different “virtues.”

Erikson’s developmental stages are:

  1. Trust vs. Mistrust (infancy, relationship with mother, feeding and abandonment)
  2. Autonomy vs. Shame (toddlerhood, toilet training)
  3. Initiative vs. Guilt (kindergarten, exploring and making things)
  4. Industry vs. Inferiority (grade school, sports)
  5. Identity vs. Role Confusion (adolescence, social relationships)
  6. Intimacy vs. Isolation (romantic love)
  7. Generativity vs. Stagnation (middle age, career and parenthood)
  8. Ego integrity vs. Despair (aging, death)

This theory had its origins in subjective clinical impressions. There have been some attempts to correlate a measure of identity achievement with other positive attributes, but, for instance, it has no association with self-esteem or locus of control, which is counterintuitive if the “identity achievement” score really corresponds to the development of an independent self.

A self-report questionnaire, in which people rated themselves on Trust, Autonomy, Initiative, Industry, Identity, and Intimacy, was found to have moderately high Cronbach alpha scores (0.57-0.75).  Males scored higher on autonomy and initiative, while females scored higher on intimacy, as you’d expect from sex stereotypes.[12]
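For reference, Cronbach’s alpha measures internal consistency, i.e. how strongly the items on a scale vary together: alpha = k/(k−1) × (1 − (sum of item variances)/(variance of total scores)). A minimal sketch, with invented ratings (the numbers below are illustrative only, not data from the study):

```python
def cronbach_alpha(items):
    """items: a list of equal-length lists, one per questionnaire item,
    each holding every respondent's rating on that item."""
    k = len(items)
    n = len(items[0])

    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    # Each respondent's total score across all items.
    totals = [sum(item[j] for item in items) for j in range(n)]
    return k / (k - 1) * (1 - sum(var(i) for i in items) / var(totals))

# Invented example: three items, five respondents, roughly consistent answers.
items = [[4, 3, 5, 2, 4],
         [5, 3, 4, 2, 5],
         [4, 2, 5, 3, 4]]
alpha = cronbach_alpha(items)
```

Values near 0.9 mean the items track each other closely; by that standard the 0.57-0.75 range reported for the Erikson questionnaire is only moderate.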

Two studies, one of 394 inner-city men, and one of 94 college sophomore men, classified them as stage 4 if they never managed to live independently or made lasting friendships, stage 5 if they managed to live apart from their family of origin and become financially independent, stage 6 if they lived with a wife or partner, and stage 7 if they had children, managed others at work, or otherwise “cared for others”. They added a stage 6.5 for career consolidation.   Adult life stages in this sense were independent of chronological age, and men who didn’t master earlier stages usually never mastered later ones.[13]

Erikson’s Stage 5, identity development, has some observational evidence behind it; children’s spontaneous story-telling exhibits less concern with identity than adolescents’.  One researcher “found the white adolescents to show a pattern of “progressive identity formation” characterized by frequent changes in self-concept during the early high school years followed by increasing consistency and stability as the person approached high school graduation. In contrast, the black adolescents showed a general stability in their identity elements over the entire study period, a pattern Hauser termed “identity foreclosure.” He interpreted this lack of change as reflecting a problem in development in that important developmental issues had been dodged rather than resolved.”  Of course, it may also mean that “identity development” is culturally contingent rather than universal.[14]

A study that gave 244 undergraduates a questionnaire measuring the Eriksonian ego strengths found that “purpose in life, internal locus of control, and self-esteem bore strong positive relations with all of the ego strengths, with the exception of care.”  But there were no significant correlations between the ego strengths and age, nor any indication that they are achieved in a succession.[15]

A study giving 1073 college students an Erikson developmental stage questionnaire found that it did not fit the “simplex” hypothesis (where people’s achievement of stage n would depend directly on how well they’d achieved step n-1, and less on other stages.)[16]
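The “simplex” hypothesis can be made concrete: in a simplex correlation matrix, stages adjacent in the sequence correlate more strongly than stages farther apart, for example r_ij = rho^|i−j|. A toy illustration (rho and the matrix below are invented, not data from the study):

```python
# Build an idealized simplex correlation matrix: correlation falls off
# geometrically with distance between stages.
rho = 0.6
stages = 8
R = [[rho ** abs(i - j) for j in range(stages)] for i in range(stages)]

def is_simplex(R):
    """Check the simplex pattern: within each row, correlations must fall off
    monotonically as you move away from the diagonal."""
    n = len(R)
    for i in range(n):
        for j in range(n - 1):
            closer, farther = abs(i - j), abs(i - (j + 1))
            if closer < farther and R[i][j] < R[i][j + 1]:
                return False
            if closer > farther and R[i][j] > R[i][j + 1]:
                return False
    return True
```

A stage questionnaire whose correlation matrix fails this falling-off-with-distance pattern, as the Erikson inventory did, gives no support to a strict stage ordering.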

A 22-year longitudinal study showed that people continued to develop higher scores on Erikson developmental-stage questionnaires between the ages of 20 and 42, even “younger” stages; there was a significant increase over time in stages 1, 5, and 6, for several cohorts.[17]

While people do in some cases seem to gain more of Erikson’s ego strengths over time, this finding is not consistent across studies. People do not climb Erikson’s stages in sequence, or at fixed ages.


Psychologist Abraham Maslow, inspired by the horrors of war to study what propels people to become “self-actualized”, developed the concept of a “hierarchy of needs”, in which the lower needs must be fulfilled before people can pursue the higher ones.

Maslow’s hierarchy of needs, from lowest to highest, is:

  • Physiological (food, air, water)
  • Safety (security from violence, disease, or poverty)
  • Love and belonging
  • Esteem (self-respect, respect from others)
  • Self-actualization (realizing one’s potential)

The theory is that lower needs, when unsatisfied, “dominate” higher needs — that one cannot focus on esteem without first satisfying the need for safety, for instance.  Once one need is satisfied, the next higher need will “activate” and start driving the person’s actions.

Three researchers (Alderfer, Huizinga, and Beer) developed questionnaires designed to measure Maslow’s needs, but all had weaknesses, “particularly a low convergence among items designed to measure the same constructs.”  None of the studies showed Maslow’s five needs as independent factors.  Both adjacent and nonadjacent needs overlap, contradicting Maslow’s theory that needs are cumulative.  People also do not rank the importance of those needs according to Maslow’s order.  The “deprivation/domination” paradigm (that the more deprived you are of a need, the more important it is to you) is contradicted by studies of safety, belonging, and esteem needs.  The “gratification/activation” theory (that when need n is satisfied, it becomes less important and need n + 1 becomes more important) was also not borne out by studies.

The author of the review concludes, “Maslow’s Need Hierarchy Theory is almost a nontestable theory…Maslow (1970) criticized what he called the newer methods of research in psychology. He called for a “humane” science.  Accordingly, he did not attempt to provide rigor in his writing or standard definitions of concepts. Further, he did not discuss any guides for empirical verification of his theory. In fact, his defense of his theory consisted of logical as well as clinical insight rather than well-developed research findings.”[18]

However, a more recent study of 386 Chinese subjects found Cronbach alpha scores in the 80-90% range, positive correlations between the satisfaction of all needs, and higher correlations between the satisfaction of adjacent needs than nonadjacent needs.  This seems to suggest a stagelike progression, although the satisfaction of all the needs still overlaps.  Also, satisfaction of the physiological needs was a predictor of the satisfaction of every one of the four higher-level needs.[19]

A global study across 123 countries found that subjective wellbeing, positive feelings, and negative feelings were all correlated in the expected ways with certain universal needs: basic needs for food and shelter, safety and security, social support and love, feeling respected and pride in activities, mastery, and self-direction and autonomy.  The largest proportion of variance explained globally in life evaluation was from basic needs, followed by social, mastery, autonomy, respect, and safety.  The largest proportion of variance explained in positive emotions was from social and respect. The largest proportion of variance explained in negative emotions was from basic needs, respect, and autonomy.  There are “crossovers”, people who have fulfillment of higher needs but not lower ones: “For example, respect is frequently fulfilled even when safety needs are not met.”[20]

It is unclear whether Maslow’s needs are distinct natural categories, and it is clear that they do not have to be satisfied in sequence, that the most important needs to people are not necessarily the lowest ones or the ones they lack most, and that people do not develop stronger drives towards higher needs when their lower needs are fulfilled. The only part of Maslow’s theory that is borne out by evidence is that people around the world do, indeed, value and receive happiness from all Maslow’s basic categories of needs.


Robert Kegan is not an experimental psychologist, but a practicing therapist, and his books are works of interpretation rather than experiment. He integrates several developmental-psychology frameworks in The Evolving Self, such as Piaget, Kohlberg, and Maslow.

Kegan’s stage 0 is “Incorporative” — babies, corresponding to Piaget’s sensorimotor stage, no real social orientation.

Stage 1 is “impulsive”, corresponding to Piaget’s preoperational stage, Kohlberg’s punishment/obedience orientation, and Maslow’s physiological satisfaction orientation: the subject is impulses, the objects are reflexes, sensing and moving. This is roughly toddlers.

Stage 2 is “imperial”, corresponding to Piaget’s concrete operational stage, Kohlberg’s “instrumental” orientation, and Maslow’s safety orientation; this is roughly grade-school-aged children. The subject is needs and wishes, the objects are impulses.

Stage 3 is “interpersonal”, corresponding to Piaget’s early formal operational, Kohlberg’s interpersonal concordance orientation, and Maslow’s belongingness orientation. The subject is mutuality and interpersonal relations, the objects are needs and wishes.  These are young teenagers.

Stage 4 is “institutional”, corresponding to Piaget’s formal operational, Kohlberg’s social contract orientation, and Maslow’s self-esteem orientation.  The subject is personal autonomy, the objects are mutuality and interpersonal relations.  This is usually young adulthood and career socialization.

Stage 5 is “interindividual”, corresponding to Maslow’s self-actualization orientation and Kohlberg’s principled orientation.  This is usually mature romantic relationship.

The Subject-Object Interview is Kegan’s instrument for measuring progression along the stages.

In a study of West Point students, average inter-rater agreement on the Subject-Object Interview was 63%, and students developed from stage 2 to stage 3 and from stage 3 to stage 4 over their years in school. Kegan stage in senior year had a correlation of 0.41 with MD (military development) grade.[21]

A study of 67 executives found that Kegan stage was correlated with leader performance at a p < 0.05 level; Kegan stage was also positively correlated with age.[22]

I was not able to find any studies that indicated whether people skip Kegan stages, regress in stage, or exhibit characteristics of other stages, or other psychometric instruments that decompose into Kegan stages with factor analysis.  Kegan’s stages do appear to be relatively observable and higher stage seems to correspond fairly well with external evaluations of leadership skill.


Piaget’s stages are not distinct (they overlap) or sequential (they can be skipped or attained in different orders).  Later stages do correlate with greater age, but the stages do not arise at consistent ages.

Kohlberg’s stages are sequential; they are defined as distinct by the measurement instrument; and they increase with age (as well as with social class and socioeconomic development of the community.)  Stages don’t arise at fixed ages.

Erikson’s stages do not appear to be distinct, sequential, or even consistently increasing with age.

Maslow’s needs do not appear to be sequentially satisfied.

Kegan’s stages are defined to be distinct by the measurement instrument, and they increased with age in two studies.  I could not find evidence that they are attained sequentially.

Overall, the experimental evidence that distinct, cumulative stages of human development exist is rather weak. The strongest evidence is for Kohlberg’s stages, and these (like all the other stages considered) are limited by the fact that they are measures of how people talk about moral decision-making, rather than what they decide in practice.

Higher stages correlate with positive results in many cases: people at higher Kohlberg stages are less likely to be criminals or delinquents, positive psychological strengths like self-esteem correlate with the Eriksonian ego strengths, and leadership development measures correlate with Kegan stage.  This is evidence that developmental stages do often correspond to real psychological strengths or skills with external validity.  We just don’t generally have strong reason to believe that they progress in a developmental fashion.


[1] Piaget, Jean. The origins of intelligence in children. Vol. 8. No. 5. New York: International Universities Press, 1952.

[2]Spelke, Elizabeth S. “Physical knowledge in infancy: Reflections on Piaget’s theory.” The epigenesis of mind: Essays on biology and cognition (1991): 133-169.

[3]Dumas, Claude, and François Y. Doré. “Cognitive development in kittens (Felis catus): An observational study of object permanence and sensorimotor intelligence.” Journal of Comparative Psychology 105.4 (1991): 357.

[4]Gray, William M. “The Factor Structure of Concrete and Formal Operations: A Confirmation of Piaget.” (1976).

[5]Shayer, Michael, Andreas Demetriou, and Muhammad Pervez. “The structure and scaling of concrete operational thought: Three studies in four countries.” Genetic, Social, and General Psychology Monographs 114.3 (1988): 307-375.

[6]Fischer, Kurt W., and Louise Silvern. “Stages and individual differences in cognitive development.” Annual Review of Psychology 36.1 (1985): 613-648.

[7]Kohlberg, Lawrence. “Stages of moral development.” Moral education 29 (1971).

[8]Snarey, John R. “Cross-cultural universality of social-moral development: a critical review of Kohlbergian research.” Psychological bulletin 97.2 (1985): 202.

[9]Walker, Lawrence J. “The sequentiality of Kohlberg’s stages of moral development.” Child Development (1982): 1330-1336.

[10]Haidt, Jonathan. “The emotional dog and its rational tail: a social intuitionist approach to moral judgment.” Psychological review 108.4 (2001): 814.

[11]Chandler, Michael, and Thomas Moran. “Psychopathy and moral development: A comparative study of delinquent and nondelinquent youth.” Development and Psychopathology 2.03 (1990): 227-246.

[12]Rosenthal, Doreen A., Ross M. Gurney, and Susan M. Moore. “From trust to intimacy: A new inventory for examining Erikson’s stages of psychosocial development.” Journal of Youth and Adolescence 10.6 (1981): 525-537.

[13]Vaillant, George E., and Eva Milofsky. “Natural history of male psychological health: IX. Empirical evidence for Erikson’s model of the life cycle.” The American Journal of Psychiatry (1980).

[14]Waterman, Alan S. “Identity development from adolescence to adulthood: An extension of theory and a review of research.” Developmental psychology 18.3 (1982): 341.

[15]Markstrom, Carol A., et al. “The psychosocial inventory of ego strengths: Development and validation of a new Eriksonian measure.” Journal of youth and adolescence 26.6 (1997): 705-732.

[16]Thornburg, Kathy R., et al. “Testing the simplex assumption underlying the Erikson Psychosocial Stage Inventory.” Educational and psychological measurement 52.2 (1992): 431-436.

[17]Whitbourne, Susan K., et al. “Psychosocial development in adulthood: A 22-year sequential study.” Journal of Personality and Social Psychology 63.2 (1992): 260.

[18]Wahba, Mahmoud A., and Lawrence G. Bridwell. “Maslow reconsidered: A review of research on the need hierarchy theory.” Organizational behavior and human performance 15.2 (1976): 212-240.

[19]Taormina, Robert J., and Jennifer H. Gao. “Maslow and the motivation hierarchy: Measuring satisfaction of the needs.” The American journal of psychology 126.2 (2013): 155-177.

[20]Tay, Louis, and Ed Diener. “Needs and subjective well-being around the world.” Journal of personality and social psychology 101.2 (2011): 354.

[21]Lewis, Philip, et al. “Identity development during the college years: Findings from the West Point longitudinal study.” Journal of College Student Development 46.4 (2005): 357-373.

[22]Strang, Sarah E., and Karl W. Kuhnert. “Personality and leadership developmental levels as predictors of leader performance.” The Leadership Quarterly 20.3 (2009): 421-433.

Resolve Community Disputes With Public Reports?

Epistemic Status: speculative, looking for feedback

TW: Rape

You’re in a tight-knit friend group and you hear some accusations about someone. Often, but not always, these are rape or sexual harassment accusations. (I’ve also seen it happen with claims of theft or fraud.)  You don’t know enough to take it to court, nor do you necessarily want to ruin the accused’s life, but you’ve also lost trust in them, and you might want to warn other people that the accused might be dangerous.

What usually happens at this point is a rumor mill.  And there are a lot of problems with a rumor mill.

First of all, you can get the missing stair problem.  Let’s say Joe raped someone, and the rumor got out. People whisper to each other that Joe is a rapist, they warn each other to stay away from him at parties — but the new girl, who isn’t in on the gossip, is not so lucky, and Joe rapes her too.  And meanwhile, Joe suffers no consequences for his actions, no social disincentive, and maybe community elders actively try to hush up the scandal.  This is not okay.

Sometimes you don’t get a missing-stair situation; instead, you get a witch hunt or a purge.  Some communities are really trigger-happy about “expelling” people or “calling them out”, even for trivial infractions. A girl attempted suicide because her internet “friends” thought her artwork was offensive and tormented her for it.  This is also not what we want.

Sometimes you get a feud, where Alice the Accuser’s friends all rally round her, and Bob the Accused’s friends all rally round him, and there’s a long-lasting, painful rift in the community where everyone is pressured to pick a side because Alice and Bob aren’t speaking.

And a lot of the time you get misinformation spreading around, where the accusation gets magnified in a game of telephone, and you hear vague intimations that Bob is terrible but you are getting conflicting stories about what Bob actually did, and you don’t know the right way to behave.

I have never gotten to know a tight-knit social circle of youngish people that didn’t have “drama” of this kind.  It’s embarrassing that it happens, so it isn’t talked about that much in public, but I’m starting to believe that it’s near-universal.

In a way, this is a question of law.

The American legal system is a really poor fit even for dealing with some legitimate crimes, like sexual assault and small-scale theft, because the odds of a conviction are so low.  Less than 1% of rape, robbery, and assault cases lead to convictions. It can be extremely difficult and stressful to deal with the criminal-justice system, particularly if you’re traumatized, and most of the time it won’t even work.  Moreover, the costs of a criminal penalty are extremely high — prison really does destroy lives — and so people are understandably reluctant to put people they know through that.  And, given problems with police violence, involving the police can be dangerous.

And, of course, for social disputes that aren’t criminal, the law is no use at all.  “Really terrible boyfriend/girlfriend” is not a crime.  “Sockpuppeting and trolling” is not a crime.

Do we have to descend to the level of gossip, feud, witch-hunt, or cover-up, just because we can’t (in principle or in practice) resolve disputes with the legal system?

I think there are alternatives.

My proposed solution (by no means final) is that a panel of trusted “judges” accepted by all parties in the dispute compile a summary of the facts of the case from the accuser(s) and accused, circulate it within the community — and then stop there.

It’s not an enforcement mechanism or a dispute-resolution mechanism, it’s an information-transmission mechanism.

For example, it means that now people will know Joe is an accused rapist, and also know if Joe has explained that he’s innocent. This prevents a few problems:

  • the “missing stair” problem where new people never get warned about Joe
  • the problem that Joe faces no consequences (now his reception will likely be chillier among people who read the summary and think he’s a threat)
  • if Joe is innocent, he’ll face less unfair shunning if people get to hear his side of the story
  • it prevents inflated rumors about Joe from spreading — only the actual accusations get printed, not the telephone-garbled ones

There are a few details about the mechanism that seem important to make the process fair:

  • Accused and accusers must all consent to participating in the process and having their statements made public, otherwise it doesn’t happen
  • Accusers should be allowed to stay anonymous
  • Everybody can meet with the judges together or one-on-one, as they choose; accused and accusers never have to be in the same room together
  • Judges should not have any personal stake in the dispute, and should be accepted by both accused and accusers
  • The format for the report should be something like a password-locked webpage, an email to a mailing list, or a Google doc, not a page on the public internet
  • The report should be limited to what accused and accusers say, and some fact-checking by the judges — maybe a timeline of claimed events, maybe some links to references. But not a verdict.

I’ve heard some counterarguments to this proposal so far.

First, I’ve heard concerns that this is too hard on the accused. Being known to have been accused of anything will make people trust you less, even if you also have the opportunity to defend yourself.  And maybe people, not wanting to make trouble, will still use gossip rather than the formal system, because it seems like too harsh a penalty.

I think it’s fine if not every dispute gets publicly adjudicated; if people don’t want to take it that far, then we’re no worse off than before the option of public fact-finding was made available.

It’s also not obvious to me that this is harsher to the accused than the social enforcement technology we already have.  People are already able to cause scandals by unilaterally making public accusations. My proposal isn’t unilateral — it doesn’t go through unless the accused and accusers both think that transparency can clear their names.

Another criticism I’ve heard is that it gives a false sense of objectivity. People know that the rumor mill is unreliable and weight it appropriately; but if people hear “there’s been a report from a panel of judges about this”, they might assume that everything in the report is definitely true, or worse, that the accused is just guilty by virtue of having been investigated.

This is a real problem, I think, but one that’s difficult to avoid completely. If you attempt to be objective in any setting, you always run the risk that people will mistake you for an oracle. It’s just as true that objective news coverage can give people a false sense of trust in newspapers, but journalistic ideals still promote objectivity. I do think giving an impression of a Weight of Authority can be harmful, and is only somewhat mitigated by practices like not handing down any verdict.

But I think information-sharing is the mildest form of restorative justice.  Restorative justice is dispute resolution within a community, or between offender and victim, rather than being mediated by the state.  It usually involves some kind of penalty and/or restitution from the offender to the victim, or some kind of community penalty (like shunning in various religious congregations.)  Given the failures of the criminal-justice system, restorative justice seems like an appealing goal to me; but it’s hard to implement, especially in modern, non-religious communities of young people without firm shared norms.

If you’re uncomfortable merely publishing accusations and defenses, there’s no way you’re ready to impose restitution within your community.  Maybe that’s appropriate in a given situation — maybe loose friend groups aren’t ready to be self-governing communities. But if you have aspirations towards self-governance, from small-scale (communes) to large-scale (seasteading and the like), figuring out dispute resolution is a necessary step, and it’s worth thinking about what would be required before you’d be okay with any community promotion or enforcement of norms.

I’d actively welcome people’s thoughts and comments on this — how would it fail?  How could the mechanism be improved?