Performance Trends in AI

Epistemic Status: moderately confident

Edit To Add: It’s been brought to my attention that I was wrong to claim that progress in image recognition is “slowing down”. As classification accuracy approaches 100%, improvements in raw scores necessarily shrink, since accuracy can’t exceed 100%. If you look at negative log error rates rather than raw accuracy scores, image recognition (as measured by performance on the ImageNet competition) improved roughly linearly over 2010-2016, with a discontinuity in 2012 at the introduction of deep learning algorithms.
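To illustrate, here’s a minimal Python sketch using approximate published top-5 error rates for ImageNet competition winners (rounded; exact figures vary slightly by source):

```python
import math

# Approximate top-5 error rates of ILSVRC (ImageNet competition) winners.
# These are well-known published figures, rounded for illustration.
top5_error = {2010: 0.282, 2011: 0.258, 2012: 0.164,
              2013: 0.117, 2014: 0.067, 2015: 0.036, 2016: 0.030}

for year, err in sorted(top5_error.items()):
    # Raw accuracy gains shrink as accuracy nears 100%, but the negative
    # log of the error rate keeps climbing at a roughly constant rate.
    print(year, f"accuracy = {1 - err:.3f}", f"-log(error) = {-math.log(err):.2f}")
```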

Deep learning has revolutionized the world of artificial intelligence. But how much does it improve performance?  How have computers gotten better at different tasks over time, since the rise of deep learning?

In games, what the data seems to show is that exponential growth in data and computation power yields exponential improvements in raw performance. In other words, you get out what you put in. Deep learning matters, but only because it provides a way to turn Moore’s Law into corresponding performance improvements, for a wide class of problems.  It’s not even clear it’s a discontinuous advance in performance over non-deep-learning systems.

In image recognition, deep learning clearly is a discontinuous advance over other algorithms.  But the returns to scale and the improvements over time seem to be flattening out as we approach or surpass human accuracy.

In speech recognition, deep learning is again a discontinuous advance. We are still far away from human accuracy, and in this regime, accuracy seems to be improving linearly over time.

In machine translation, neural nets seem to have made progress over conventional techniques, but it’s not yet clear if that’s a real phenomenon, or what the trends are.

In natural language processing, trends are positive, but deep learning doesn’t generally seem to do better than trendline.

Chess

[Figure chesselo1: Elo ratings of computer chess engines over time]

These are Elo ratings of the best computer chess engines over time.

There was a discontinuity in 2008, corresponding to a jump in hardware; this was Rybka 2.3.1, a tree-search-based engine with no deep learning or, indeed, probabilistic elements. Apart from that, progress looks roughly linear.

Here again is the Swedish Chess Computer Association data on Elo scores over time:

[Figure chesselo2: Swedish Chess Computer Association Elo ratings over time]

Deep learning chess engines have only recently been introduced; Giraffe, created by Matthew Lai at Imperial College London in 2015, has an Elo rating of only 2412, about equivalent to late-90s-era computer chess engines. (Of course, learning to predict good moves probabilistically from data is a more impressive achievement than brute-force computation, and it’s quite possible that deep-learning-based chess engines, once tuned over time, will improve.)

Go

(Figures from the Nature paper on AlphaGo.)

[Figure alphago.png: Elo ratings of AlphaGo and predecessor Go programs, from the Nature paper]

Fan Hui is a human professional player.  AlphaGo performed notably better than its predecessors Crazy Stone (2008; beat human players on small boards), Pachi (2011), Fuego (2010), and GnuGo, all MCTS programs without deep learning or GPUs. AlphaGo uses much more hardware and more data.

Miles Brundage has argued that AlphaGo doesn’t represent that much of a surprise given the improvements in hardware and data (and effort).  He also graphed the returns in Elo rating to hardware by the AlphaGo team:

[Figure alphagovhardware: AlphaGo Elo rating vs. hardware configuration]

In other words, exponential growth in hardware produces only roughly linear (or even sublinear) growth in performance as measured by Elo score. To do better would require algorithmic innovation as well.

Arcade Games

AI agents playing Atari games are scored relative to a human professional playtester: normalized score = (computer score − random-play score) / (human score − random-play score).

Compare to Elo ratings: the ratio of expected scores for player A vs. player B is Q_A / Q_B, where Q_A = 10^(E_A/400) and E_A is A’s Elo rating.

Linear growth in Elo ratings is therefore equivalent to exponential growth in the underlying strength parameter Q, i.e. in raw score ratios.
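Here’s a minimal Python sketch of both conversions (the function names are mine; the Elo constants are the standard ones):

```python
def normalized_score(agent, human, random_play):
    """Human-normalized Atari score: 0 = random play, 1 = human-level."""
    return (agent - random_play) / (human - random_play)

def elo_q(rating):
    """Underlying Elo strength parameter: Q = 10^(rating/400)."""
    return 10 ** (rating / 400)

def expected_score(rating_a, rating_b):
    """Expected score of player A against player B under the Elo model."""
    q_a, q_b = elo_q(rating_a), elo_q(rating_b)
    return q_a / (q_a + q_b)

# Linear growth in Elo rating means exponential growth in Q:
# every additional 400 Elo points multiplies Q by 10.
for rating in (1600, 2000, 2400, 2800, 3200):
    print(rating, f"Q = {elo_q(rating):,.0f}")
```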

Miles Brundage’s blog also offers a trend in Atari performance that looks exponential:

[Figure atari: trend in Atari game performance over time]

This would, of course, still be plausibly linear in Elo score.

Superhuman performance at arcade games is already here:

[Figure ataribygame: DQN performance relative to human playtester, by game]

This was a single reinforcement learner trained with a convolutional neural net over images of the game screen, outputting actions (joystick directions).  Basically it’s dynamic programming with a nonlinear approximation of the Q-function, which estimates the quality of a move; in DeepMind’s case, that Q-function approximator is a convolutional neural net.  Apart from the convnet, Q-learning with function approximation has been around since the 90s, and Q-learning itself since 1989.
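For concreteness, here’s a toy sketch of Q-learning with linear function approximation, the 90s-era version of the idea; DQN replaces the linear model below with a convolutional net over screen pixels, plus stabilizers like experience replay. The “environment” here is random noise, purely to exhibit the update rule; nothing about it is DeepMind’s actual setup.

```python
import numpy as np

n_features, n_actions = 8, 4
rng = np.random.default_rng(0)
w = np.zeros((n_actions, n_features))  # one linear Q-estimate per action

def q_values(phi):
    """Approximate Q(s, a) for every action from state features phi."""
    return w @ phi

def td_update(phi, a, r, phi_next, alpha=0.1, gamma=0.99):
    """One Q-learning step: nudge Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    target = r + gamma * np.max(q_values(phi_next))
    td_error = target - q_values(phi)[a]
    w[a] += alpha * td_error * phi  # gradient step on the squared TD error

# Epsilon-greedy interaction with a stand-in "environment".
for step in range(1000):
    phi = rng.random(n_features)           # stand-in for a game screen
    if rng.random() < 0.1:
        a = int(rng.integers(n_actions))   # explore
    else:
        a = int(np.argmax(q_values(phi)))  # exploit
    r = float(phi[a])                      # stand-in reward signal
    phi_next = rng.random(n_features)
    td_update(phi, a, r, phi_next)
```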

Interestingly enough, here’s a video of a computer playing Breakout:

https://www.youtube.com/watch?v=UXgU37PrIFM

It obviously doesn’t “know” the law of reflection as a principle, or it would move the paddle to where the ball will eventually land, and it doesn’t.  There are erratic, jerky movements that could not plausibly be optimal.  It does, however, find the optimal strategy of tunnelling through the bricks and hitting the ball behind the wall.  This is creative learning but not conceptual learning.

You can see the same phenomenon in a game of Pong:

https://www.youtube.com/watch?v=YOW8m2YGtRg

The learned agent performs much better than the hard-coded agent, but moves more jerkily and “randomly” and doesn’t know the law of reflection.  Similarly, the reports of AlphaGo producing “unusual” Go moves are consistent with an agent that can do pattern-recognition over a broader space than humans can, but which doesn’t find the “laws” or “regularities” that humans do.

Perhaps, contrary to the stereotype that contrasts “mechanical” with “outside-the-box” thinking, reinforcement learners can “think outside the box” but can’t find the box?

ImageNet

Image recognition as measured by ImageNet classification performance has improved dramatically with the rise of deep learning.

[Figure imagenet: ImageNet classification error over time]

There’s a dramatic performance improvement starting in 2012, corresponding to the winning entry from Geoffrey Hinton’s group (the AlexNet convolutional network), followed by a leveling-off.  Plausibly accuracy follows an S-shaped curve.

How does accuracy scale with processing power?

This paper from Baidu illustrates:

[Figure baiduscurve: accuracy vs. training time for different numbers of GPUs]

The performance of a deep neural net follows an S-shaped curve over time spent training, but training goes faster with more GPUs.  How much faster?

[Figure baiduscaling: training speedup vs. number of GPUs]

Speed scales at best linearly with the number of GPUs.  Because accuracy is S-shaped in training time, at a fixed training-time budget (as one would have in a timed competition), doubling the number of GPUs yields a sublinear boost in accuracy.
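To make the arithmetic concrete, here’s a toy model (my assumption, not the Baidu paper’s actual fit): if accuracy is logistic in log training compute, then doubling GPUs at a fixed wall-clock budget adds a constant to log compute, and each doubling buys a smaller accuracy gain as the curve saturates.

```python
import math

def accuracy(compute, midpoint=8.0, slope=0.8):
    """Toy assumption: accuracy is logistic in log2(training compute)."""
    return 1 / (1 + math.exp(-slope * (math.log2(compute) - midpoint)))

prev = None
for gpus in (1, 2, 4, 8, 16, 32):
    acc = accuracy(gpus * 1024)  # fixed wall-clock budget, arbitrary units
    gain = "" if prev is None else f"  (+{acc - prev:.3f})"
    print(f"{gpus:2d} GPUs: accuracy = {acc:.3f}{gain}")
    prev = acc
```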

MNIST

[Figure mnist: MNIST error rates over time]

Using the performance data from Yann LeCun’s website, we can see that deep neural nets hugely improved MNIST digit recognition accuracy. The best algorithms of 1998, which were convolutional nets and boosted convolutional nets due to LeCun, had error rates of 0.7-0.8%. Within 5 years, that had dropped to 0.4%; within 10 years, to 0.39% (also a convolutional net); within 15 years, to 0.23%; and within 20 years, to 0.21%.  Clearly, performance on MNIST is leveling off: error took five years to halve, and then roughly another fifteen years to halve again.

As with ImageNet, we may be getting close to the limits of deep-learning performance (which may easily be human-level.)

Speech Recognition

Before the rise of deep learning, speech recognition was already progressing rapidly, though progress was leveling off in conversational speech, with word error rates still well above 10%.

[Figure speech: speech recognition word error rates over time, before deep learning]

Then, in 2011, the advent of context-dependent deep neural network hidden Markov models produced a discontinuity in performance:

[Figure speechdeep.png: word error rates with deep neural network acoustic models]

More recently, accuracy has continued to progress:

Nuance, a dictation software company, shows steadily improving performance on word recognition through to the present day, with a plausibly exponential trend.

[Figure nuance: Nuance word recognition accuracy over time]

Baidu has progressed even faster, as of 2015, in speech recognition on Mandarin.

[Figure baiduspeech.png: Baidu Mandarin speech recognition error over time]

As of 2016, the best performance on the NIST 2000 Switchboard set (of phone conversations) is due to Microsoft, with a word-error rate of 6.3%.

Translation

Machine translation is evaluated by BLEU score, which compares the machine translation to a reference translation via overlap in words or n-grams.  BLEU scores range from 0 to 1, with 1 being a perfect match to the reference.  As of 2012, Tilde’s systems had BLEU scores in the 0.25-0.45 range, with Google and Microsoft performing similarly but worse.
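For reference, here’s a minimal single-reference BLEU sketch of my own; production implementations (e.g. sacreBLEU) add smoothing and support multiple references. Without smoothing, short sentences with no 4-gram matches score near zero, which is why the example below uses max_n=2.

```python
from collections import Counter
import math

def bleu(candidate, reference, max_n=4):
    """Geometric mean of modified n-gram precisions times a brevity penalty."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    log_precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = max(sum(cand.values()), 1)
        log_precisions.append(math.log(max(overlap, 1e-9) / total))
    # Brevity penalty: punish candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(log_precisions) / max_n)

print(bleu("the cat sat on the mat".split(),
           "the cat is on the mat".split(), max_n=2))  # ~0.707
```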

In 2016, Google came out with a new neural-network-based version of its translation tool.  BLEU scores on English -> French and English -> German were 0.404 and 0.263 respectively. Human evaluations, however, rated the neural machine translations 60-87% better.

OpenMT, the machine translation contest, had top BLEU scores in 2012 of about 0.4 for Arabic-to-English, 0.32 for Chinese-to-English, 0.24 for Dari-to-English, 0.27 for Farsi-to-English, and 0.11 for Korean-to-English.

In 2008, Urdu-to-English had top BLEU scores of 0.32, Arabic-to-English scores of 0.48, and Chinese-to-English scores of 0.30.

Comparing 2008 to 2012, this doesn’t correspond to an improvement in machine translation at all. Apart from Google’s improvement in human ratings, celebrated in this New York Times Magazine article, it’s unclear whether neural networks actually improve BLEU scores. On the other hand, scoring metrics may be an imperfect match to translation quality.

Natural Language Processing

The Association for Computational Linguistics Wiki has some numbers on state of the art performance for various natural language processing tasks.

Performance on SAT analogies has improved roughly linearly over time, and the best systems are now roughly as accurate as the average US college applicant.  None of the leading techniques involve deep learning.

[Figure satanalogies.png: accuracy on SAT analogies over time]

Question answering (multiple choice of sentences that answer the question) has improved roughly steadily over time, with a discontinuity around 2006.  Neural nets did not start being used until 2014, and were not a discontinuous advance over the best models of 2013.

[Figure questions.png: question answering accuracy over time]

Accuracy at paraphrase identification (recognizing whether one paragraph is a paraphrase of another) seems to have risen steadily over the past decade, with no special boost from deep learning techniques; the top performance comes not from deep learning but from matrix factorization.

[Figure paraphrase: paraphrase identification accuracy over time]

On NLP tasks that have a long enough history to graph, there seems to be no clear indication that deep learning performs above trendline.

Trends relative to processing power and time

Performance/accuracy returns to processing power seem to differ based on problem domain.

In image recognition, we see sublinear returns to linear improvements in processing power, and gains leveling off over time as computers reach and surpass human-level performance. This may mean simply that image recognition is a nearly-solved problem.

In NLP, we see roughly linear improvements over time, and in machine translation it’s unclear whether there’s any improvement trend at all; both suggest sublinear returns to processing power, but I’m not very confident in this.

In games, we see roughly linear returns to linear improvements in processing power, which means exponential improvements in performance over time (because of Moore’s law and increasing investment in AI).

This would suggest that far-superhuman abilities are more likely to be possible in game-like problem domains.

What does this imply about deep learning?

What we’re seeing here is that deep learning algorithms can provide improvements in narrow AI across many types of problem domains.

Deep learning provides discontinuous jumps relative to previous machine learning or AI performance trendlines in image recognition and speech recognition; it doesn’t in strategy games or natural language processing, and machine translation and arcade games are ambiguous (machine translation because metrics differ; arcade games because there is no pre-deep-learning comparison.)

A speculative thought: perhaps deep learning is best for problem domains oriented around sensory data? Images or sound, rather than symbols. If current neural net architectures, like convolutional nets, mimic the structure of the sensory cortex of the brain, which I think they do, one would expect this result.

Arcade games would be more analogous to the motor cortex, and perceptual control theory suggests that something similar to Q-learning may be going on in motor learning, though I’d have to learn more to be confident in that.  If mammalian motor learning turns out to look like Q-learning, I’d expect deep reinforcement learning to be especially good in arcade games and robotics, just as deep neural networks are especially good in visual and audio classification.

Deep learning hasn’t really proven itself better than trendline in strategy games (Go and chess) or in natural language tasks.

I might wonder if there are things humans can do with concepts and symbols and principles, the traditional tools of the “higher intellect”, the skills that show up on highly g-loaded tasks, that deep learning cannot do with current algorithms. Obviously hard-coding rules into an AI has grave limitations (the failure of such hard-coded systems was what caused several of the AI winters), but there may also be limitations to non-conceptual pattern recognition.  The continued difficulty of automating language-based tasks may be related to this issue.

Miles Brundage points out,

Progress so far has largely been toward demonstrating general approaches for building narrow systems rather than general approaches for building general systems. Progress toward the former does not entail substantial progress toward the latter. The latter, which requires transfer learning among other elements, has yet to have its Atari/AlphaGo moment, but is an important area to keep an eye on going forward, and may be especially relevant for economic/safety purposes.

I agree.  General AI systems, as far as I know, do not exist today, and the million-dollar question is whether they can be built with algorithms similar to those used today, or if there are further fundamental algorithmic advances that have yet to be discovered. So far, I think there is no empirical evidence from the world of deep learning to indicate that today’s deep learning algorithms are headed for general AI in the near future.  Discontinuous performance jumps in image recognition and speech recognition with the advent of deep learning are the most suggestive evidence, but it’s not clear whether those are above and beyond returns to processing power. And so far I couldn’t find any estimates of trends in cross-domain generalization ability.  Whether deep learning algorithms can be general-purpose is perhaps a more theoretical question; what we can say is that recent AI progress doesn’t offer any reason to suspect that they already are.

Life Extension Possibilities

Epistemic Status: Pretty confident

This is my first pass of a lit review of life-extension interventions apart from caloric restriction, with a focus on things that work in mammals (rather than fruit flies or other invertebrates.)

Interventions and reported longevity increases:

  • Ames dwarf mice: 50%
  • PAPP-A knockout mice: 38%
  • Irs1 knockout mice: 32% (females only)
  • AC5 knockout mice: 32%
  • Low-methionine diet: 30%
  • High-dose rapamycin: 25%
  • High-dose vitamin E: 15% (females), 40% (males)
  • Lower core body temperature: 12% (males), 20% (females)
  • Low-dose rapamycin: 10-18%
  • NDGA: 10% (males only)
  • Statins + ACE inhibitors: 9%
  • Selegiline: 7%
  • Metformin: 4-5%

Bottom Lines

  • Low methionine diets (roughly, vegan diets) work really well at extending life in mice, and there’s a plausible mechanism (avoiding homocysteine buildup) that they might work in humans as well.  If it worked as well on humans as it does on mice, the average person would live to over 100.
  • Rapamycin extends life in mice by quite a lot. Unfortunately it’s a strong immunosuppressant, so isn’t very safe to use as a drug.
  • There’s a lot of evidence that the IGF/insulin signaling/growth hormone metabolic pathway is associated with aging and short lifespan, and that inhibiting genes on that pathway results in longer lifespan.  IGF-receptor-inhibiting or growth-hormone-inhibiting drugs could be studied for longevity, but haven’t yet.
  • The MAO inhibitor selegiline extends life in both mice and dogs.
  • Metformin seems to work, and is currently being studied in a human trial.
  • NDGA, an antioxidant derived from the creosote bush, might work, but it’s also toxic.
  • Sirtuin drugs and resveratrol don’t work.

Low methionine

60 Fischer rats fed a low-methionine diet lived 30% longer than control rats. The low-methionine rats grew significantly less as well.[1]

80 female mice fed a low-methionine diet lived longer than control mice, at p < 0.02; they also weighed less; had lower IGF-1, insulin, glucose, and thyroxine; had fewer cataracts; and lost less liver function in response to injected acetaminophen.[2]

Some tumors are dependent on methionine to grow and will not kill methionine-starved mice as fast.[28]

Homocysteine is biosynthesized from methionine.  Homocysteine levels rise as we age and are associated with many diseases of aging, such as heart disease, cancer, stroke, Alzheimer’s, and presbyopia. Genetic conditions that cause homocystinuria in younger people cause similar problems: vascular thrombosis, intellectual disability, lens dislocation.  Homocysteine levels are also associated with depression[32] and schizophrenia.[33]  Homocysteine is toxic and reacts to “homocysteinylate” many different kinds of proteins, rendering them ineffective.[29]  It might also cause damage through oxidation, impaired methylation, or other chemical mechanisms.[30]  If you give a rabbit homocysteine injections, it’ll develop atherosclerosis.[31]

Children with homocystinuria have been successfully treated with low-methionine diets.[34][35][36] This is now the standard treatment for patients with genetic homocystinuria who don’t respond to vitamin B supplementation. A low-methionine diet in humans consists of abstaining from meat, fish, and dairy, getting protein instead from soy and vegetables, and making up the caloric deficit with fat.

Growth Hormone and IGF Inhibition

Rats which were heterozygous for an antisense growth-hormone transgene lived 7-10% longer than control rats. They were also smaller and had lower levels of IGF. [3]

Ames dwarf mice lack growth hormone, prolactin, and TSH, and live about 50% longer than normal mice due to a Prop1 mutation.[22]

Humans with Prop1 mutations lack growth hormone and so have short stature, hypothyroidism, cortisol deficiency, and failure to go through puberty.[37]  Humans with growth hormone receptor deficiency in Ecuador had short stature and were obese, but had a much lower incidence of cancer and diabetes, and greater insulin sensitivity, than their normal relatives.  They did not have higher longevity, because they had higher rates of alcoholism and accidents.[38]

Female mice missing insulin receptor substrate 1 (Irs1 -/-) live 32% longer on average; male Irs1 -/- mice show no change in longevity.  These mice are insulin resistant but have reduced fat mass despite eating more.[23]  A cohort of Ashkenazi Jewish centenarians had female offspring with 35% higher IGF-1 who were 2.5 centimeters shorter than age- and sex-matched controls.  The centenarians had many mutations in the IGF-1 receptor gene, and the centenarians with mutations had higher IGF-1 and a trend towards shorter height than those without.[39]

Pegvisomant is a growth hormone receptor antagonist used to treat acromegaly; it could be investigated as an anti-aging therapy.  Somatostatin analogs such as octreotide and pasireotide could also be investigated; somatostatin inhibits the release of growth hormone.  There are also IGF receptor kinase inhibitors being investigated for antitumor properties, such as NVP-AEW-541.

Metformin

If started at 3 months of age (but not later), metformin increased mean lifespan of female SHR mice by 14%. It also delayed the onset of the first tumor by 22%.[4]

Metformin increases the mean lifespan of mice by 4-5%. Treated mice had lower cholesterol, lower LDL, and lower insulin.[7]

Rapamycin

If fed to mice near the end of their lifespan (600 days), rapamycin extends mean lifespan by 14% for females and 9% for males.[5]  Rapamycin fed to mice starting at 9 months extends median survival by 10% in males and 18% in females.[6]  Rapamycin fed to HER-2/neu transgenic (cancer-prone) mice caused a 4% extension in mean lifespan and a 12.4% increase in maximum lifespan; rapamycin-treated mice were 25% less likely to develop tumors.[8]

High-dose rapamycin given to mice at 9 months extends life by 23% in males and 26% in females.[9]

Rapamycin increases the lifespan of Rb1+/- mice (a model of neuroendocrine tumors) by reducing the incidence of neuroendocrine tumors.  Mean lifespan increased by 9% in females and 14% in males. Treated mice were significantly less likely to have thyroid tumors, and had smaller tumors of all kinds.[15]

NDGA

Nordihydroguaiaretic acid (NDGA), an antioxidant derived from the creosote bush, increased mean lifespan by 12% in male but not female mice; it did not increase the proportion of extremely long-lived mice.[11]

NDGA increased median lifespan in male mice, but not female mice, by 8-10%.[12]

On the other hand, there have been reports of hepatitis and kidney damage from human consumption of NDGA or creosote.[68]

High-dose Vitamin E

Male mice given tocopherol (an antioxidant) at a dose of 5g/kg of food from 28 weeks of age had 40% longer median lifespan than control, and 17% increased maximal lifespan; female mice given tocopherol had 15% increased median lifespan.[10]  Mice given tocopherol from 28 weeks and maintained in the cold (45 degrees Fahrenheit) lived 15% longer.[56]  On the other hand, high-dose vitamin E in humans, according to a meta-analysis, did not reduce all-cause mortality.[57]

Lower Core Body Temperature

Mice genetically engineered to overexpress the Hcrt-UCP2 transgene, which causes a 0.3-0.5°C drop in core body temperature, had median lifespans increased by 12% in males and 20% in females.[13]  Lower core body temperature is one of the results of caloric restriction, and cooler humans tend to live longer and be less obese.[55]

Young Ovaries

Old mice transplanted with young mouse ovaries lived an average of 6% longer.[14]  In particular, mice ovariectomized before puberty and transplanted with ovaries at 11 months lived longer than intact mice, by 17%. Transplantation with ovaries at 11 months seems to shift the survival curve to the right, postponing aging.[54]

Selegiline

Male rats treated with deprenyl (aka selegiline, a Parkinson’s drug and MAO-B inhibitor) lived on average 35% longer than controls, according to a 1988 study.[16] However, later studies could never find an equally dramatic effect. Mice treated with selegiline starting at 18 months had no increase in survival.[17] Selegiline extends life in female but not male Syrian hamsters.[18] Fischer rats treated starting at 18 months with selegiline lived 7% longer.[19] Male Fischer rats treated starting at 12 months with selegiline lived 7% longer.[20] Female hamsters, but not male, treated with selegiline, lived significantly longer than controls.[24]

ACE Inhibitors

High dose ACE inhibition with ramipril doubled the lifespan of hypertensive rats, bringing it up to that of normal rats.[21] Statins + ramipril increased lifespan of long-lived mice by 9%.[53]

Ramipril is a standard drug for high blood pressure.

AC5 Knockout

Adenylyl cyclase 5 is primarily expressed in the heart and brain, and catalyzes the synthesis of cyclic AMP, an important second messenger that relays signals from hormones that cannot cross the plasma membrane and activates protein kinases, in particular to regulate glucose and fat metabolism.

AC5 knockout mice have a median lifespan 32% longer than wild-type mice. Bones were less brittle, body weights were smaller, and GH levels were lower.[25]  AC5 knockout mice also have markedly attenuated responses to pain (heat, cold, mechanical, inflammation, and neuropathic.)[50]  The effects of morphine and mu or delta opioid receptor agonists are attenuated in AC5 knockout mice.[52] However, AC5 knockout mice had Parkinson’s-like motor symptoms.[51]

SIRT1 Activators

Sirtuin 1, encoded by the SIRT1 gene, is downregulated in cells with high insulin resistance, and increased in mice undergoing caloric restriction; mice with low levels of SIRT1 don’t live longer in response to caloric restriction, while mice with high levels mimic the caloric restriction phenotype.[49]

SRT1720, a SIRT1 activator, extends life by 8% in mice on a standard diet, and by 21.7% in mice fed a high-fat diet (which are generally shorter-lived).  SRT1720 also reduces the incidence of cataracts, improves glucose tolerance, and lowers LDL and cholesterol.[26]  SRT1720 reduces liver lipid accumulation in strains of mice bred for obesity and insulin resistance, and preserves liver function.[45]

A phase I trial of the SIRT1 activator SRT2104 in elderly human volunteers found that it was safe and well-tolerated, and that it reduced cholesterol, LDL, and triglycerides over the course of a month of treatment.[46]

However, a biochemistry study found that SRT1720 does not in fact activate SIRT1 except when SIRT1 is attached to a fluorophore (used for imaging), so the apparent activation may be an artifact. This study also found that SRT1720 had no effect on glucose tolerance in mouse models of diabetes.[47]

The putative SIRT1 activator SRT2104 did not affect insulin or glucose in a randomized trial of type II diabetes.[48]

Investigation of the sirtuin drugs has shut down, due to these failures to replicate.

PAPP-A Knockout

Mice missing pregnancy-associated plasma protein A live 38% longer than control mice, with no associated changes in serum glucose, cholesterol, or dietary intake. Wild-type mice had many more tumors than knockout mice (70% of wild-type vs. 15% of knockout had tumors).[27]  Knockout mice are smaller than wild-type and consume less food, though a similar amount as a proportion of bodyweight; they also show more spontaneous physical activity. Knockout mice are not significantly different from wild-type in insulin sensitivity, fasting glucose, or insulin levels.[42]

PAPP-A knockout mice do not show as much thymic atrophy in old age as wild-type mice: more immature thymus cells, more new T cells, less IGF-1 expression, and more easily activated T cells.  IGF-1 promotes differentiation of T cells, so releasing it more slowly could keep the thymus young longer.[43]  PAPP-A knockout and wild-type mice gain similar amounts of subcutaneous fat on high-fat diets, but the knockout mice gain significantly less visceral fat; PAPP-A is most highly expressed in mesenteric fat.[44]  PAPP-A may have some tissue-specific effects promoting IGF-axis activity, without altering metabolism much across the board.

PAPP-A encodes a metalloproteinase that cleaves insulin-like growth factor binding proteins.  These IGFBPs are inhibitors of IGF activity, and cleaving them diminishes that inhibition; knocking out PAPP-A therefore leaves the IGFBPs intact and makes IGF less bioavailable.[40]  PAPP-A is expressed in unstable atherosclerotic plaques but not in stable ones; serum PAPP-A levels are higher in patients with unstable angina or acute myocardial infarction than in patients with stable angina or controls, by about a factor of two.[41]

Dogs

Selegiline

80% of elderly (age 10-15) dogs receiving selegiline, compared to 39% receiving placebo, survived to the end of the two-year study.[65]

Ovaries

Female dogs who had their ovaries removed lived no longer than male dogs, while dogs with ovaries were twice as likely as male dogs to achieve “exceptional” longevity (>13 years).[66]

IGF and Weight

IGF is positively correlated with weight, and negatively correlated with age, in dogs across various breeds.  Larger dogs have shorter lifespans.[67]

Humans

FOXO3A Mutation

Men homozygous for minor alleles of the FOXO3A gene had 2.75 times the odds of being in a cohort of long-lived men, compared to controls.  They were 29% more likely to be “healthy” at baseline (free of cardiovascular disease, cancer, stroke, Parkinson’s, and diabetes, and able to pass a walking test and a cognitive test). The minor alleles were 85% more common in people who lived to more than 100 than in people who died at 72-74.[58]  A German sample of long-lived people found that minor alleles were 1.53x as common in centenarians as in controls.[59]

Insulin-like growth factor signaling inhibits FOXO3 activity, while oxidative stress activates FOXO3.  FOXO3 represses the mTOR pathway and promotes DNA repair.  It is also anti-inflammatory: suppresses IL-2 and IL-6, reduces proliferation of T cells and lymphocytes, reduces inflammation.[60]

FOXO3 is activated by AMPK.[61] Metformin does this in vitro, in the process converting glioma-initiating cells into non-tumor cells.[62]  You can also do it with AICAR, an AMP analogue that stimulates AMPK.[63]  Note that AICAR reduces triglycerides, increases HDL, lowers blood pressure, and reverses insulin resistance in mice.[64]

Unsupported Musings

I don’t think antioxidants generally have come out looking too good for anti-aging, and there are a lot of counterexamples to the “aging is oxidative damage” hypothesis.

I think the growth-hormone-and-insulin-signaling cluster of life extension techniques and mutations is probably a real thing, and matches well to an explanation for why caloric restriction works. It also makes sense evolutionarily; in times of food abundance you want to reproduce, while in times of food scarcity you just want to survive the season, so it would make sense if you had two hormonal modes, “reproductive mode” and “survival mode.”

I also think there’s probably an mTOR mechanism, possibly just due to cancer, that explains the effectiveness of both rapamycin and the significance of the FOXO3 genes.  AMPK, which is produced by exercise, is upstream of both the mTOR stuff and the insulin-signaling stuff; this would explain why both exercise and metformin seem to be helpful for longevity.

References

[1]Orentreich, Norman, et al. “Low methionine ingestion by rats extends life span.” The Journal of Nutrition 123.2 (1993): 269-274.

[2]Miller, Richard A., et al. “Methionine‐deficient diet extends mouse lifespan, slows immune and lens aging, alters glucose, T4, IGF‐I and insulin levels, and increases hepatocyte MIF levels and stress resistance.” Aging cell 4.3 (2005): 119-125.

[3]Shimokawa, Isao, et al. “Life span extension by reduction in growth hormone-insulin-like growth factor-1 axis in a transgenic rat model.” The American journal of pathology 160.6 (2002): 2259-2265.

[4]Anisimov, Vladimir N., et al. “If started early in life, metformin treatment increases life span and postpones tumors in female SHR mice.” Aging (Albany NY) 3.2 (2011): 148-157.

[5]Harrison, David E., et al. “Rapamycin fed late in life extends lifespan in genetically heterogeneous mice.” nature 460.7253 (2009): 392-395.

[6]Miller, Richard A., et al. “Rapamycin, but not resveratrol or simvastatin, extends life span of genetically heterogeneous mice.” The Journals of Gerontology Series A: Biological Sciences and Medical Sciences (2010): glq178.

[7]Martin-Montalvo, Alejandro, et al. “Metformin improves healthspan and lifespan in mice.” Nature communications 4 (2013).

[8]Anisimov, Vladimir N., et al. “Rapamycin extends maximal lifespan in cancer-prone mice.” The American journal of pathology 176.5 (2010): 2092-2097.

[9]Miller, Richard A., et al. “Rapamycin‐mediated lifespan increase in mice is dose and sex dependent and metabolically distinct from dietary restriction.” Aging cell 13.3 (2014): 468-477.

[10]Navarro, Ana, et al. “Vitamin E at high doses improves survival, neurological performance, and brain mitochondrial function in aging male mice.” American Journal of Physiology-Regulatory, Integrative and Comparative Physiology 289.5 (2005): R1392-R1399.

[11]Strong, Randy, et al. “Nordihydroguaiaretic acid and aspirin increase lifespan of genetically heterogeneous male mice.” Aging cell 7.5 (2008): 641-650.

[12]Harrison, David E., et al. “Acarbose, 17‐α‐estradiol, and nordihydroguaiaretic acid extend mouse lifespan preferentially in males.” Aging cell 13.2 (2014): 273-282.

[13]Conti, Bruno, et al. “Transgenic mice with a reduced core body temperature have an increased life span.” Science 314.5800 (2006): 825-828.

[14]Mason, Jeffrey B., et al. “Transplantation of young ovaries to old mice increased life span in transplant recipients.” The Journals of Gerontology Series A: Biological Sciences and Medical Sciences 64.12 (2009): 1207-1211.

[15]Livi, Carolina B., et al. “Rapamycin extends life span of Rb1+/-mice by inhibiting neuroendocrine tumors.” Aging (Albany NY) 5.2 (2013): 100-110.

[16]Knoll, Joseph. “The striatal dopamine dependency of life span in male rats. Longevity study with (−) deprenyl.” Mechanisms of ageing and development 46.1 (1988): 237-262.

[17]Ingram, Donald K., et al. “Chronic treatment of aged mice with L-deprenyl produces marked striatal MAO-B inhibition but no beneficial effects on survival, motor performance, or nigral lipofuscin accumulation.” Neurobiology of aging 14.5 (1993): 431-440.

[18]Stoll, S., et al. “Chronic treatment of Syrian hamsters with low-dose selegiline increases life span in females but not males.” Neurobiology of aging 18.2 (1997): 205-211.

[19]Kitani, K., et al. “Chronic treatment of (-) deprenyl prolongs the life span of male Fischer 344 rats. Further evidence.” Life sciences 52.3 (1993): 281-288.

[20]Bickford, P. C., et al. “Long-term treatment of male F344 rats with deprenyl: assessment of effects on longevity, behavior, and brain function.” Neurobiology of aging 18.3 (1997): 309-318.

[21]Linz, Wolfgang, et al. “Long-term ACE inhibition doubles lifespan of hypertensive rats.” Circulation 96.9 (1997): 3164-3172.

[22]Bartke, Andrzej, et al. “Longevity: extending the lifespan of long-lived mice.” Nature 414.6862 (2001): 412-412.

[23]Selman, Colin, et al. “Evidence for lifespan extension and delayed age-related biomarkers in insulin receptor substrate 1 null mice.” The FASEB Journal 22.3 (2008): 807-818.

[24]Stoll, S., et al. “Chronic treatment of Syrian hamsters with low-dose selegiline increases life span in females but not males.” Neurobiology of aging 18.2 (1997): 205-211.

[25]Yan, Lin, et al. “Type 5 adenylyl cyclase disruption increases longevity and protects against stress.” Cell 130.2 (2007): 247-258.

[26]Mitchell, Sarah J., et al. “The SIRT1 activator SRT1720 extends lifespan and improves health of mice fed a standard diet.” Cell reports 6.5 (2014): 836-843.

[27]Conover, Cheryl A., and Laurie K. Bale. “Loss of pregnancy‐associated plasma protein A extends lifespan in mice.” Aging cell 6.5 (2007): 727-729.

[28]Hoffman, Robert M. “Methioninase: a therapeutic for diseases related to altered methionine metabolism and transmethylation: cancer, heart disease, obesity, aging, and Parkinson’s disease.” Human cell 10 (1997): 69-80.

[29]Krumdieck, Carlos L., and Charles W. Prince. “Mechanisms of homocysteine toxicity on connective tissues: implications for the morbidity of aging.” The Journal of nutrition 130.2 (2000): 365S-368S.

[30]Perna, Alessandra F., et al. “Possible mechanisms of homocysteine toxicity.” Kidney International 63 (2003): S137-S140.

[31]McCully, Kilmer S., and Bruce D. Ragsdale. “Production of arteriosclerosis by homocysteinemia.” The American journal of pathology 61.1 (1970): 1.

[32]Tolmunen, Tommi, et al. “Association between depressive symptoms and serum concentrations of homocysteine in men: a population study.” The American journal of clinical nutrition 80.6 (2004): 1574-1578.

[33]Applebaum, Julia, et al. “Homocysteine levels in newly admitted schizophrenic patients.” Journal of psychiatric research 38.4 (2004): 413-416.

[34]Perry, Thomas L., et al. “Treatment of homocystinuria with a low-methionine diet, supplemental cystine, and a methyl donor.” The Lancet 292.7566 (1968): 474-478.

[35]Kolb, Felix O., Jerry M. Earll, and Harold A. Harper. ““Disappearance” of cystinuria in a patient treated with prolonged low methionine diet.” Metabolism 16.4 (1967): 378-381.

[36]Sardharwalla, I. B., et al. “Homocystinuria: a study with low-methionine diet in three patients.” Canadian Medical Association Journal 99.15 (1968): 731.

[37]Reynaud, Rachel, et al. “A familial form of congenital hypopituitarism due to a PROP1 mutation in a large kindred: phenotypic and in vitro functional studies.” The Journal of Clinical Endocrinology & Metabolism 89.11 (2004): 5779-5786.

[38]Guevara-Aguirre, Jaime, et al. “Growth hormone receptor deficiency is associated with a major reduction in pro-aging signaling, cancer, and diabetes in humans.” Science translational medicine 3.70 (2011): 70ra13-70ra13.

[39]Suh, Yousin, et al. “Functionally significant insulin-like growth factor I receptor mutations in centenarians.” Proceedings of the National Academy of Sciences 105.9 (2008): 3438-3442.

[40]Lawrence, James B., et al. “The insulin-like growth factor (IGF)-dependent IGF binding protein-4 protease secreted by human fibroblasts is pregnancy-associated plasma protein-A.” Proceedings of the National Academy of Sciences 96.6 (1999): 3149-3153.

[41]Bayes-Genis, Antoni, et al. “Pregnancy-associated plasma protein A as a marker of acute coronary syndromes.” New England Journal of Medicine 345.14 (2001): 1022-1029.

[42]Conover, Cheryl A., et al. “Metabolic consequences of pregnancy-associated plasma protein-A deficiency in mice: exploring possible relationship to the longevity phenotype.” Journal of Endocrinology 198.3 (2008): 599-605.

[43]Vallejo, Abbe N., et al. “Resistance to age-dependent thymic atrophy in long-lived mice that are deficient in pregnancy-associated plasma protein A.” Proceedings of the National Academy of Sciences 106.27 (2009): 11252-11257.

[44]Conover, Cheryl A., et al. “Preferential impact of pregnancy-associated plasma protein-A deficiency on visceral fat in mice on high-fat diet.” American Journal of Physiology-Endocrinology and Metabolism 305.9 (2013): E1145-E1153.

[45]Yamazaki, Yu, et al. “Treatment with SRT1720, a SIRT1 activator, ameliorates fatty liver with reduced expression of lipogenic enzymes in MSG mice.” American Journal of Physiology-Endocrinology and Metabolism 297.5 (2009): E1179-E1186.

[46]Libri, Vincenzo, et al. “A pilot randomized, placebo controlled, double blind phase I trial of the novel SIRT1 activator SRT2104 in elderly volunteers.” PLoS One 7.12 (2012): e51395.

[47]Pacholec, Michelle, et al. “SRT1720, SRT2183, SRT1460, and resveratrol are not direct activators of SIRT1.” Journal of Biological Chemistry 285.11 (2010): 8340-8351.

[48]Baksi, Arun, et al. “A phase II, randomized, placebo‐controlled, double‐blind, multi‐dose study of SRT2104, a SIRT1 activator, in subjects with type 2 diabetes.” British journal of clinical pharmacology 78.1 (2014): 69-77.

[49]Cantó, Carles, and Johan Auwerx. “Caloric restriction, SIRT1 and longevity.” Trends in Endocrinology & Metabolism 20.7 (2009): 325-331.

[50]Kim, K‐S., et al. “Markedly attenuated acute and chronic pain responses in mice lacking adenylyl cyclase‐5.” Genes, Brain and Behavior 6.2 (2007): 120-127.

[51]Iwamoto, Tamio, et al. “Motor dysfunction in type 5 adenylyl cyclase-null mice.” Journal of Biological Chemistry 278.19 (2003): 16936-16940.

[52]Kim, Kyoung-Shim, et al. “Adenylyl cyclase type 5 (AC5) is an essential mediator of morphine action.” Proceedings of the National Academy of Sciences of the United States of America 103.10 (2006): 3908-3913.

[53]Spindler, Stephen R., Patricia L. Mote, and James M. Flegal. “Combined statin and angiotensin-converting enzyme (ACE) inhibitor treatment increases the lifespan of long-lived F1 male mice.” AGE 38.5-6 (2016): 379-391.

[54]Cargill, Shelley L., et al. “Age of ovary determines remaining life expectancy in old ovariectomized mice.” Aging cell 2.3 (2003): 185-190.

[55]Waalen, Jill, and Joel N. Buxbaum. “Is older colder or colder older? The association of age with body temperature in 18,630 individuals.” The Journals of Gerontology Series A: Biological Sciences and Medical Sciences 66.5 (2011): 487-492.

[56]Banks, Ruth, John R. Speakman, and Colin Selman. “Vitamin E supplementation and mammalian lifespan.” Molecular nutrition & food research 54.5 (2010): 719-725.

[57]Miller, Edgar R., et al. “Meta-analysis: high-dosage vitamin E supplementation may increase all-cause mortality.” Annals of internal medicine 142.1 (2005): 37-46.

[58]Willcox, Bradley J., et al. “FOXO3A genotype is strongly associated with human longevity.” Proceedings of the National Academy of Sciences 105.37 (2008): 13987-13992.

[59]Flachsbart, Friederike, et al. “Association of FOXO3A variation with human longevity confirmed in German centenarians.” Proceedings of the National Academy of Sciences 106.8 (2009): 2700-2705.

[60]Morris, Brian J., et al. “FOXO3: a major gene for human longevity-a mini-review.” Gerontology 61.6 (2015): 515-525.

[61]Greer, Eric L., et al. “The energy sensor AMP-activated protein kinase directly regulates the mammalian FOXO3 transcription factor.” Journal of Biological Chemistry 282.41 (2007): 30107-30119.

[62]Sato, Atsushi, et al. “Glioma‐Initiating Cell Elimination by Metformin Activation of FOXO3 via AMPK.” Stem cells translational medicine 1.11 (2012): 811-824.

[63]Li, Xiao-Nan, et al. “Activation of the AMPK-FOXO3 pathway reduces fatty acid–induced increase in intracellular reactive oxygen species by upregulating thioredoxin.” Diabetes 58.10 (2009): 2246-2257.

[64]Buhl, Esben S., et al. “Long-term AICAR administration reduces metabolic disturbances and lowers blood pressure in rats displaying features of the insulin resistance syndrome.” Diabetes 51.7 (2002): 2199-2206.

[65]Ruehl, W. W., et al. “Treatment with L-deprenyl prolongs life in elderly dogs.” Life sciences 61.11 (1997): 1037-1044.

[66]Waters, David J., et al. “Exploring mechanisms of sex differences in longevity: lifetime ovary exposure and exceptional longevity in dogs.” Aging Cell 8.6 (2009): 752-755.

[67]Greer, Kimberly A., Larry M. Hughes, and Michal M. Masternak. “Connecting serum IGF-1, body size, and age in the domestic dog.” Age 33.3 (2011): 475-483.

[68]Arteaga, Silvia, Adolfo Andrade-Cetto, and René Cárdenas. “Larrea tridentata (Creosote bush), an abundant plant of Mexican and US-American deserts and its metabolite nordihydroguaiaretic acid.” Journal of ethnopharmacology 98.3 (2005): 231-239.

Reply to Criticism on my EA Post

My previous post, “EA Has A Lying Problem”, received a lot of criticism, and I’d like to address some of it here.

I was very impressed by what I learned about EA discourse norms from preparing this post and responding to comments on it. I’m appreciating anew that this is a community where people really do the thing of responding directly to arguments, updating on evidence, and continuing to practice discourse instead of collapsing into verbal fights.  I’m going to try to follow that norm in this post.

Structurally, what I did in my previous post was

  • quote some EAs making comments on forums and Facebook
  • interpret what I think is the attitude behind those quotes
  • claim that the quotes show a pervasive problem in which the EA community doesn’t value honesty enough.

There are three possible points of failure to this argument:

  • The quotes don’t mean what I took them to mean
  • The views I claimed EAs hold are not actually bad
  • The quotes aren’t evidence of a broader problem in EA.

There’s also a possible prudential issue: that I may have, indeed, called attention to a real problem, but that my tone was too extreme or my language too sloppy, and that this is harmful.

I’m going to address each of these possibilities separately.

Possibility 1: The quotes don’t mean what I took them to mean

Case 1: Ben Todd’s Quotes on Criticism

I described Ben Todd as asking people to consult with EA orgs before criticizing them, and as heavily implying that it’s more useful for orgs to prioritize growth over engaging with the kind of highly critical people who are frequent commenters on EA debates.

I went on to claim that this underlying attitude is going to favor growth over course correction, and prioritize “movement-building” by gaining appeal among uncritical EA fans, while ignoring real concerns.

I said,

Essentially, this maps to a policy of “let’s not worry over-much about internally critiquing whether we’re going in the right direction; let’s just try to scale up, get a bunch of people to sign on with us, move more money, grow our influence.”  An uncharitable way of reading this is “c’mon, guys, our marketing doesn’t have to satisfy you, it’s for the marks!”  

This is a pretty large extrapolation from Todd’s actual comments, and I think I was putting words in his mouth that are much more extreme than anything he’d agree with. The quotes I pulled didn’t come close to proving that Todd actually wants to ignore criticism and pander to an uncritical audience.  It was wrong of me to give the impression that he’s deliberately pursuing a nefarious strategy.

And in the comments, he makes it clear that this wasn’t his intent and that he’s actually made a point of engaging with criticism:

Hi Sarah,

The 80,000 Hours career guide says what we think. That’s true even when it comes to issues that could make us look bad, such as our belief in the importance of the risks from artificial intelligence, or when our issues could be offputtingly complex, such as giving now vs. giving later and the pros and cons of earning to give. This is the main way we engage with users, and it’s honest.

As an organisation, we welcome criticism, and we post detailed updates on our progress, including our mistakes:

https://80000hours.org/about/credibility/evaluations/

https://80000hours.org/about/credibility/evaluations/mistakes/

I regret that my comments might have had the effect of discouraging important criticism.

My point was that public criticism has costs, which need to be weighed against the benefits of the criticism (whether or not you’re an act utilitarian). In extreme cases, organisations have been destroyed by memorable criticism that turned out to be false or unfounded. These costs, however, can often be mitigated with things like talking to the organisation first – this isn’t to seek permission, but to do things like check whether they’ve already written something on the topic, and whether your understanding of the issues is correct. For instance, GiveWell runs their charity reviews past the charity before posting, but that doesn’t mean their reports are always to the charity’s liking. I’d prefer a movement where people bear these considerations in mind as well, but it seems like they’re often ignored.

None of this is to deny that criticism is often extremely valuable.

I think this is plainly enough to show that Ben Todd is not anti-criticism. I’m also impressed that 80,000 Hours has a “mistakes page” in which they describe past failures (which is an unusual and especially praiseworthy sign of transparency in an organization.)

Todd did, in his reply to my post, reiterate that he thinks criticism should face a high burden of proof because “organisations have been destroyed by memorable criticism that turned out to be false or unfounded.” I’m not sure this is a good policy; Ben Hoffman articulates some problems with it here.

But I was wrong to conflate this with an across-the-board opposition to criticism.  It’s probably fairer to say that Todd opposes adversarial criticism and prefers cooperative or friendly criticism (for example, he thinks critics should privately ask organizations to change their policies rather than publicly lambasting them for having bad policies.)

I still think this is a mistake on his part, but when I framed it as “EA Leader says criticizing EA orgs is harmful to the movement”, I was exaggerating for effect, and I probably shouldn’t have done that.

Case 2: Robert Wiblin on Promises

I quoted Robert Wiblin on his interpretation of the Giving What We Can pledge, and interpreted Wiblin’s words to mean that he doesn’t think the pledge is morally binding.

I think this is pretty clear-cut and I interpreted Wiblin correctly.

The context there was that Alyssa Vance, in the original post, had said that many people might rationally choose not to take the pledge because unforeseen financial circumstances might make it inadvisable in future. She said that Wiblin had previously claimed that this was not a problem, because he didn’t view the pledge as binding on his future self:

“pledge taker Rob Wiblin said that, if he changed his mind about donating 10% every year being the best choice, he would simply un-take the pledge.”

Wiblin doesn’t think that “maybe I won’t be able to afford to give 10% of my income in future” is a good enough reason for people to choose not to pledge 10% of their lifetime income, because if they ever did become poor, they could just stop giving.

Some commenters claimed that Wiblin doesn’t have a cavalier attitude towards promises, he just thinks that in extreme cases it’s okay to break them.  In the Jewish ritual law, it’s permissible to break a commandment if it’s necessary to save a human life, but that doesn’t mean that the overall attitude to the commandments is casual.

However, I think it does imply a cavalier attitude towards promises to say that you shouldn’t hesitate to make them on the grounds that you might not want to keep them.  If you don’t think, before making a lifelong pledge, that people should think “hmm, am I prepared to make this big a commitment?” and in some cases answer “no”, then you clearly don’t think that the pledge is a particularly strong commitment.

Case 3: Robert Wiblin on Autism

Does Robert Wiblin actually mean it as a pejorative when he speculates that maybe the reason some people are especially hesitant to commit to the GWWC pledge is that they’re on the autism spectrum?

Some people (including the person he said it to, who is autistic), didn’t take it as a negative.  And, in principle, if we aren’t biased against disabled people, “autistic” should be a neutral descriptive word, not a pejorative.

But we do live in a society where people throw around “autistic” as an insult to refer to anybody who supposedly has poor social skills, so in context, Wiblin’s comment does have a pejorative connotation.

Moreover, Wiblin was using the accusation of autism as a reason to dismiss the concerns of people who are especially serious about keeping promises. It’s equivalent to saying “your beliefs are the result of a medical condition, so I don’t have to take them seriously.”  He’s medicalizing the beliefs of those who disagree with him.  Even if his opponents are autistic, if he respected them, he’d take their disagreement seriously.

Case 4: Jacy Reese on Evidence from Intuition

I quoted Jacy Reese responding to criticism about his cost-effectiveness estimates by saying that the evidence base in favor of leafleting includes his intuition and studies that are better described as “evidence against the effectiveness of leafleting.”

His, and ACE’s, defense of the leafleting studies as merely “weak evidence” for leafleting, is a matter of public record in many places. He definitely believes this.

Does he really think that his intuition is evidence, or did he just use ambiguous wording? I don’t know, and I’d be willing to concede that this isn’t a big deal.

Possibility 2: The views I accused EAs of holding are not actually bad.

Case 1: Dishonesty for the greater good might sometimes be worthwhile.

A number of people in the comments to my previous post are making the argument that I need to weigh the harms of dishonest or misleading information against its benefits.

First of all, the fact that people are making these arguments at least partly belies the notion that all EAs oppose lying across the board; I’ll say more about the prevalence of these views in the community in the next section.

Holly Elmore:

What if, for the sake of argument, it *was* better to persuade easy marks to take the pledge and give life-saving donations than to persuade fewer people more gently and (as she perceives it) respectfully? How many lives is extra respect worth? She’s acting like this isn’t even an argument.

This is a more general problem I’ve had with Sarah’s writing and medical ethics in general– the fixation on meticulously informed consent as if it’s the paramount moral issue.

Gleb Tsipursky:

If you do not lie, that’s fine, but don’t pretend that you care about doing the most good, please. Just don’t. You care about being as transparent and honest as possible over doing the most good.

I’m including Gleb here, even though he’s been kicked out of the EA community, because he is saying the same things as Holly Elmore, who is a respected member of the community.  There may be more EAs out there sharing the same views.

So, cards on the table: I am not an act-utilitarian. I am a eudaimonistic virtue ethicist. What that means is that I believe:

  • The point of behaving ethically is to have a better life for yourself.
  • Dishonesty will predictably damage your life.
  • If you find yourself tempted to be dishonest because it seems like a good idea, you should instead trust that the general principle of “be honest” is more reliable than your guess that lying is a good idea in this particular instance.

(Does this apply to me and my lapses in honesty?  YOU BET.  Whenever it seems like a good idea at the time for me to deceive, I wind up suffering for it later.)

I also believe consent is really important.

I believe that giving money to charitable causes is a non-obligatory personal decision, while respecting consent to a high standard is not.

Are these significant values differences with many EAs? Yes, they are.

I wasn’t honest enough in my previous post about this, and I apologize for that. I should have owned my beliefs more candidly.

I also exaggerated for effect in my previous post, and that was misleading, and I apologize for that. Furthermore, in the comments, I claimed that I intended to exaggerate for effect; that was an emotional outburst and isn’t what I really believe. I don’t endorse dishonesty “for a good cause”, and on occasions when I’ve gotten upset and yielded to temptation, it has always turned out to be a bad idea that came back to bite me.

I do think that even if you are a more conventional utilitarian, there are arguments in favor of being honest always and not just when the local benefits outweigh the costs.

Eliezer Yudkowsky and Paul Christiano have written about why utilitarians should still have integrity.

One way of looking at this is rule-utilitarianism: there are gains from being known to be reliably trustworthy.

Another way of looking at this is the comment about “financial bubbles” I made in my previous post.  If utilitarians take their best guess about what action is the Most Good, and inflate public perceptions of its goodness so that more people will take that action, and encourage the public to further inflate perceptions of its goodness, then errors in people’s judgments of the good will expand without bound.  A highly popular bad idea will end up dominating most mindshare and charity dollars. However, if utilitarians critique each other’s best guesses about which actions do the most good, then bad ideas will get shot down quickly, to make room for good ones.

Case 2: It’s better to check in with EA orgs before criticizing them

Ben Todd, and some people in the comments, have argued that it’s better to run critical blog posts by EA orgs before making those criticisms public.  This rhymes a little with the traditional advice to admonish friends in private but praise them in public, in order to avoid causing them to lose face.  The idea seems to be that public criticism will be less accurate and also that it will draw negative attention to the movement.

Now, some of the issues I criticized in my blog post have also been brought up by others, both publicly and privately, which is where I first heard about them. But I don’t agree with the basic premise in the first place.

Journalists check quotes with sources, that’s true, and usually get quotes from the organizations they’re reporting on. But bloggers are not journalists, first of all.  A blog post is more like engaging in an extended conversation than reporting the news. Some of that conversation is with EA orgs and their leaders — this post, and the discussions it pulls from, are drawn from discussions about writings of various degrees of “officialness” coming from EA orgs.  I think the public record of discussion is enough of a “source” for this purpose; we know what was said, by whom, and when, and there’s no ambiguity about whether the comments were really made.

What we don’t necessarily know without further discussion is what leaders of EA orgs mean, and what they say behind closed doors. It may be that their quotes don’t represent their intent.  I think this is the gist of what people saying “talk to the orgs in private” mean — if we talked to them, we’d understand that they’re already working on the problem, or that they don’t really have the problematic views they seem to have, etc.

However, I think this is an unfair standard.  “Talk to us first to be sure you’re getting the real story” imposes extra work on both the blogger and the EA org (do you really have to check in with GWWC every time you discuss the pledge?).

And it’s trying to massage the discussion away from sharp, adversarial critics. A journalist who got his stories about politics almost entirely from White House sources, and relied very heavily on his good relationship with the White House, would have a conflict of interest and would probably produce biased reporting. You don’t want all the discussion of EA to be coming from people who are cozy with EA orgs.  You don’t necessarily want all discussion to be influenced by “well, I talked to this EA leader, and I’m confident his heart’s in the right place.”

There’s something valuable about having a conversation going on in public. It’s useful for transparency and it’s useful for common knowledge. EA orgs like GiveWell and 80K are unusually transparent already; they’re engaging in open dialogue with their readers and donors, rather than funneling all communication through a narrow, PR-focused information stream.  That’s a remarkable and commendable choice.

But it’s also a risky one; because they’re talking a lot, they can incur reputational damage if they’re quoted unfavorably (as I did in my previous post).  So they’re asking us, the EA and EA-adjacent community, to do some work in guarding their reputation.

I think this is not necessarily a fair thing to expect from everyone discussing an EA topic. Some people are skeptical of EA as a whole, and thus don’t have a reason to protect its reputation. Some people, like Alyssa in her post on the GWWC pledge, aren’t even accusing an org of doing anything wrong, just discussing a topic of interest to EAs like “who should and shouldn’t take the pledge?” She couldn’t reasonably have foreseen that this would be perceived as an attack on GWWC’s reputation.

I think, if an EA org says or does something in public that people find problematic, they should expect to be criticized in public, and not necessarily get a chance to check over the criticism first.

Possibility 3: The quotes I pulled are not strong evidence of a big problem in EA

I picked quotes that I and a few friends had noticed offhand as unusually bad.  So, obviously, it’s not the same thing as a survey of EA-wide attitudes.

On the other hand, “picking egregious examples” is a perfectly fine way to suggest that there may be a broader trend. If you know that a few people in a Presidential administration have made racist remarks, for instance, it’s not out of line to suggest that the administration has a racism problem.

So, I stand behind cherry-picked examples as a way to highlight trends, in the context of “something suspicious is going on here, maybe we should pay attention to it.”

The fact that people are, in response to my post, defending the practice of lying for the greater good is also evidence that these aren’t entirely isolated cases.

Of course, it’s possible that the quotes I picked aren’t egregiously bad, but I think I’ve covered my views on that in the previous two sections.

I think that, given the Intentional Insights scandal, it’s reasonable to ask the question “was this just one guy, or is the EA community producing a climate that shelters bullshit artists?”  And I think there’s enough evidence to be suspicious that the latter is true.

Possibility 4: My point stands, but my tactics were bad

I did not handle this post like a pro.

I used the title “EA Has a Lying Problem”, which is inflammatory, and also (worse, in my view) not quite the right word. None of the things I quoted were lies. They were defenses of dishonest behavior, or, in Ben Todd’s case, what I thought was a bias against transparency and open debate. I probably should have called it “dishonesty” rather than “lying.”

In general, I think I was inflammatory in a careless rather than a pointed way. I do think it’s important to make bad things look bad, but I didn’t take care to avoid discrediting a vast swath of innocents, and that was wrong of me.

Then, I got emotional in the comments section, and expressed an attitude of “I’m a bad person who does bad things on purpose”, which is rude, untrue, and not a good look on me.

I definitely think these were mistakes on my part.

It’s also been pointed out to me that I could have raised my criticisms privately, within EA orgs, rather than going public with a potentially reputation-damaging post (damaging to my own reputation or to the EA movement.)

I don’t think that would have been a good idea in my case.

When it comes to my own reputation, for better or for worse I’m a little reckless.  I don’t have a great deal of ability to consciously control how I’m perceived — things tend to slip out impulsively — so I try not to worry about it too much.  I’ll live with how I’m judged.

When it comes to EA’s reputation, I think it’s possible I should have been more careful. Some of the organizations I’ve criticized have done really good work promoting causes I care about.  I should have thought of that, and perhaps worded my post in a way that produced less risk of scandal.

On the other hand, I never had a close relationship with any EA orgs, and I don’t think internal critique would have been a useful avenue for me.

In general, I think I want to sanity-check my accusatory posts with more beta readers in the future.  My blog is supposed to represent a pretty close match to my true beliefs, not just my more emotional impulses, and I should be more circumspect before posting stuff.

EA Has A Lying Problem

I am currently writing up a response to criticism of this post and will have it up shortly.

Why hold EA to a high standard?

“Movement drama” seems to be depressingly common — whenever people set out to change the world, they inevitably pick fights with each other, usually over trivialities.  What’s the point, beyond mere disagreeableness, of pointing out problems in the Effective Altruism movement? I’m about to start some movement drama, and so I think it behooves me to explain why it’s worth paying attention to this time.

Effective Altruism is a movement that claims that we can improve the world more effectively with empirical research and explicit reasoning. The slogan of the Center for Effective Altruism is “Doing Good Better.”

This is a moral claim (they say they are doing good) and a claim of excellence (they say that they offer ways to do good better.)

EA is also a proselytizing movement. It tries to raise money, for EA organizations as well as for charities; it also tries to “build the movement”, increase attendance at events like the EA Global conference, get positive press, and otherwise get attention for its ideas.

The Atlantic called EA “generosity for nerds”, and I think that’s a fair assessment. The “target market” for EA is people like me and my friends: young, educated, idealistic, Silicon Valley-ish.

The origins of EA are in academic philosophy. Peter Singer and Toby Ord were among the first to promote the idea that people have an obligation, on utilitarian grounds, to help the developing world and reduce animal suffering.  The leaders of the Center for Effective Altruism, Giving What We Can, 80,000 Hours, The Life You Can Save, and related EA orgs are drawn heavily from philosophy professors and philosophy majors.

What this means, first of all, is that we can judge EA activism by its own standards. These people are philosophers who claim to be using objective methods to assess how to do good; so it’s fair to ask “Are they being objective? Are they doing good? Is their philosophy sound?”  It’s admittedly hard for young organizations to prove they have good track records, and that shouldn’t count against them; but honesty, transparency, and sound arguments are reasonable to expect.

Second of all, it means that EA matters.  I believe that individuals and small groups who produce original thinking about big-picture issues have always had outsize historical importance. Philosophers and theorists who capture mindshare have long-term influence.  Young people with unusual access to power and interest in “changing the world” stand a good chance of affecting what happens in the coming decades.

So it matters if there are problems in EA. If kids at Stanford or Harvard or Oxford are being misled or influenced for the worse, that’s a real problem. They actually are, as the cliche goes, “tomorrow’s leaders.” And EA really seems to be prominent among the ideologies competing for the minds of the most elite and idealistic young people.  If it’s fundamentally misguided or vulnerable to malfeasance, I think that’s worth talking about.

Lying for the greater good

Imagine that you are a perfect act-utilitarian. You want to produce the greatest good for the greatest number, and, magically, you know exactly how to do it.

Wouldn’t a pretty plausible course of action be “accumulate as much power and resources as possible, so you can do even more good”?

Taken to an extreme, this would look indistinguishable from the actions of someone who just wants to acquire as much power as possible for its own sake.  Actually building Utopia is always something to get around to later; for now you have to build up your strength, so that the future utopia will be even better.

Lying and hurting people in order to gain power can never be bad, because you are always aiming at the greater good down the road, so anything that makes you more powerful should promote the Good, right?

Obviously, this is a terrible failure mode. There’s a reason J.K. Rowling gave her Hitler-like figure Grindelwald the slogan “For the Greater Good.”  Ordinary, children’s-story morality tells us that when somebody is lying or hurting people “for the greater good”, he’s a bad guy.

A number of prominent EA figures have made statements that seem to endorse lying “for the greater good.”  Sometimes these statements are arguably reasonable, taken in isolation. But put together, there starts to be a pattern.  It’s not quite storybook-villain-level, but it has something of the same flavor.

There are people who are comfortable sacrificing honesty in order to promote EA’s brand.  After all, if EA becomes more popular, more people will give to charity, and that charity will do good, and that good may outweigh whatever harm comes from deception.

The problem with this reasoning should be obvious. The argument would work just as well if EA did no good at all, and only claimed to do good.

Arbitrary or unreliable claims of moral superiority function like bubbles in economic markets. If you never check the value of a stock against some kind of ground-truth reality, if everyone only looks at its current price and buys or sells based on that, we’ll see prices being inflated based on no reason at all.  If you don’t insist on honesty in people’s claims of “for the greater good”, you’ll get hijacked into helping people who aren’t serving the greater good at all.
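
To make the bubble mechanism concrete, here is a minimal toy simulation in Python (entirely my own illustrative sketch; the ten agents, the size of the “hype” increment, and the check rate are arbitrary assumptions, not measurements of anything). Agents who set their valuation of a cause purely from each other’s enthusiasm inflate it without limit; agents who occasionally check against ground truth stay bounded.

    import random

    def simulate(n_steps=2000, anchor=False, true_value=1.0, check_rate=0.1, seed=0):
        """Toy model of a valuation bubble: each step, one agent adopts the
        group's average valuation plus a small positive "hype" bump. If
        anchor=True, agents occasionally re-check against ground truth."""
        rng = random.Random(seed)
        beliefs = [true_value] * 10  # everyone starts out accurate
        for _ in range(n_steps):
            i = rng.randrange(len(beliefs))
            social_signal = sum(beliefs) / len(beliefs)
            hype = rng.uniform(0.0, 0.05)  # enthusiasm transmits more easily than doubt
            beliefs[i] = social_signal + hype
            if anchor and rng.random() < check_rate:
                beliefs[i] = true_value  # an honest check pulls the estimate back down
        return sum(beliefs) / len(beliefs)

    print(simulate(anchor=False))  # mean valuation drifts upward without bound
    print(simulate(anchor=True))   # mean valuation stays near the true value

The toy model only demonstrates the direction of the effect: with no ground-truth check, nothing stops the inflation, which is exactly why honesty in claims of “for the greater good” matters.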

I think it’s worth being suspicious of anybody who says “actually, lying is a good idea” and has a bunch of intelligence and power and moral suasion on their side.

It’s a problem if a movement is attracting smart, idealistic, privileged young people who want to “do good better” and teaching them that the way to do the most good is to lie.  It’s arguably even more of a problem than, say, lobbyists taking young Ivy League grads under their wing and teaching them to practice lucrative corruption.  The lobbyists are appealing to the most venal among the youthful elite.  The nominally-idealistic movement is appealing to the most ethical, and corrupting them.

The quotes that follow are going to look almost reasonable. I expect some people to argue that they are in fact reasonable and innocent and I’ve misrepresented them. That’s possible, and I’m going to try to make a case that there’s actually a problem here; but I’d also like to invite my readers to take the paranoid perspective for a moment. If you imagine mistrusting these nice, clean-cut, well-spoken young men, or mistrusting Something that speaks through them, could you see how these quotes would seem less reasonable?

Criticizing EA orgs is harmful to the movement

In response to an essay on the EA forums criticizing the Giving What We Can pledge (a promise to give 10% of one’s income to charity), Ben Todd, the CEO of 80,000 Hours, said:

Topics like this are sensitive and complex, so it can take a long time to write them up well. It’s easy to get misunderstood or make the organisation look bad.

At the same time, the benefits might be slight, because (i) it doesn’t directly contribute to growth (if users have common questions, then add them to the FAQ and other intro materials) or (ii) fundraising (if donors have questions, speak to them directly).

Remember that GWWC is getting almost 100 pledges per month atm, and very few come from places like this forum. More broadly, there’s a huge number of pressing priorities. There’s lots of other issues GWWC could write about but hasn’t had time to as well.

If you’re wondering whether GWWC has thought about these kinds of questions, you can also just ask them. They’ll probably respond, and if they get a lot of requests to answer the same thing, they’ll probably write about it publicly.

With figuring out strategy (e.g. whether to spend more time on communication with the EA community or something else) GWWC writes fairly lengthy public reviews every 6-12 months.

He also said:

None of these criticisms are new to me. I think all of them have been discussed in some depth within CEA.

This makes me wonder if the problem is actually a failure of communication. Unfortunately, issues like this are costly to communicate outside of the organisation, and it often doesn’t seem like the best use of time, but maybe that’s wrong.

Given this, I think it also makes sense to run critical posts past the organisation concerned before posting. They might have already dealt with the issue, or have plans to do so, in which [case] posting the criticism is significantly less valuable (because it incurs similar costs to the org but with fewer benefits). It also helps the community avoid re-treading the same ground.

In other words: the CEO of 80,000 Hours thinks that people should “run critical posts past the organization concerned before posting”, but also thinks that it might not be worth it for GWWC to address such criticisms because they don’t directly contribute to growth or fundraising, and addressing criticisms publicly might “make the organization look bad.”

This cashes out to saying “we don’t want to respond to your criticism, and we also would prefer you didn’t make it in public.”

It’s normal for organizations not to respond to every criticism — the Coca-Cola company doesn’t have to respond to every internet comment that says Coke is unhealthy — but Coca-Cola’s CEO doesn’t go around shushing critics either.

Todd seems to be saying that the target market of GWWC is not readers of the EA forum or similar communities, which is why answering criticism is not a priority. (“Remember that GWWC is getting almost 100 pledges per month atm, and very few come from places like this forum.”) Now, “places like this forum” seems to mean communities where people identify as “effective altruists”, geek out about the details of EA, spend a lot of time researching charities and debating EA strategy, etc.  Places where people might question, in detail, whether pledging 10% of one’s income to charity for life is actually a good idea or not.  Todd seems to be implying that answering the criticisms of these people is not useful — what’s useful is encouraging outsiders to donate more to charity.

Essentially, this maps to a policy of “let’s not worry over-much about internally critiquing whether we’re going in the right direction; let’s just try to scale up, get a bunch of people to sign on with us, move more money, grow our influence.”  An uncharitable way of reading this is “c’mon, guys, our marketing doesn’t have to satisfy you, it’s for the marks!”  Jane Q. Public doesn’t think about details, she doesn’t nitpick, she’s not a nerd; we tell her about the plight of the poor, she feels moved, and she gives.  That’s who we want to appeal to, right?

The problem is that it’s not quite fair to Jane Q. Public to treat her as a patsy rather than as a peer.

You’ll see echoes of this attitude come up frequently in EA contexts — the insinuation that criticism is an inconvenience that gets in the way of movement-building, and movement-building means obtaining the participation of the uncritical.

In responding to a criticism of a post on CEA fundraising, Ben Todd said:

I think we should hold criticism to a higher standard, because criticism has more costs. Negative things are much more memorable than positive things. People often remember criticism, perhaps just on a gut level, even if it’s shown to be wrong later in the thread.

This misses the obvious point that criticism of CEA has costs to CEA, but possibly has benefits to other people if CEA really has flaws.  It’s a sort of “EA, c’est moi” narcissism: what’s good for CEA is what’s good for the Movement, which is what’s good for the world.

Keeping promises is a symptom of autism

In the same thread criticizing the Giving What We Can pledge, Robert Wiblin, the director of research at 80,000 Hours, said:

Firstly: I think we should use the interpretation of the pledge that produces the best outcome. The use GWWC and I apply is completely mainstream use of the term pledge (e.g. you ‘pledge’ to stay with the person you marry, but people nonetheless get divorced if they think the marriage is too harmful to continue).

A looser interpretation is better because more people will be willing to participate, and each person gain[s] from a smaller and more reasonable push towards moral behaviour. We certainly don’t want people to be compelled to do things they think are morally wrong – that doesn’t achieve an EA goal. That would be bad. Indeed it’s the original complaint here.

Secondly: An “evil future you” who didn’t care about the good you can do through donations probably wouldn’t care much about keeping promises made by a different kind of person in the past either, I wouldn’t think.

Thirdly: The coordination thing doesn’t really matter here because you are only ‘cooperating’ with your future self, who can’t really reject you because they don’t exist yet (unlike another person who is deciding whether to help you).

One thing I suspect is going on here is that people on the autism spectrum interpret all kinds of promises to be more binding than neurotypical people do (e.g. https://www.reddit.com/r/aspergers/comments/46zo2s/promises/). I don’t know if that applies to any individual here specifically, but I think it explains how some of us have very different intuitions. But I expect we will be able to do more good if we apply the neurotypical intuitions that most people share.

Of course if you want to make it fully binding for yourself, then nobody can really stop you.

In other words: Rob Wiblin thinks that promising to give 10% of income to charity for the rest of your life, which the Giving What We Can website describes as “a promise, or oath, to be made seriously and with every expectation of keeping it”, does not literally mean committing to actually do that. It means that you can quit any time you feel like it.

He thinks that you should interpret words with whatever interpretation will “do the most good”, instead of as, you know, what the words actually mean.

If you respond to a proposed pledge with “hm, I don’t know, that’s a really big commitment”, you must just be a silly autistic who doesn’t understand that you could just break your commitment when it gets tough to follow!  The movement doesn’t depend on weirdos like you, it needs to market to normal people!

I don’t know whether to be more frustrated with the ableism or the pathologization of integrity.

Once again, there is the insinuation that the growth of EA depends on manipulating the public — acquiring the dollars of the “normal” people who don’t think too much and can’t keep promises.

Jane Q. Public is stupid, impulsive, and easily led.  That’s why we want her.

“Because I Said So” is evidence

Jacy Reese, a prominent animal-rights-focused EA, responded to some criticism of Animal Charity Evaluators’ top charities on Facebook as follows:

Just to note, we (or at least I) agree there are serious issues with our leafleting estimate and hope to improve it in the near future. Unfortunately, there are lots of things that fit into this category and we just don’t have enough research staff time for all of them.

I spent a good part of 2016 helping make significant improvements to our outdated online ads quantitative estimate, which now aggregates evidence from intuition, experiments, non-animal-advocacy social science, and veg pledge rates to come up with the “veg-years per click” estimate. I’d love to see us do something similar with the leafleting estimate, and strongly believe we should keep trying, rather than throwing our hands in the air and declaring weak evidence is “no evidence.”

For context here, the “leafleting estimate” refers to the rate at which pro-vegan leaflets cause people to eat less meat (and hence the impact of leafleting advocacy at reducing animal suffering.)  The studies ACE used to justify the effectiveness of leafleting actually showed that leafleting was ineffective: an uncontrolled study of 486 college students shown a pro-vegetarianism leaflet found that only one student (0.2%) went vegetarian, while a controlled study conducted by ACE itself found that consumption of animal products was no lower in the leafleted group than the control group.  The criticisms of ACE’s leafleting estimate were not merely that it was flawed, but that it literally fabricated numbers based on a “hypothetical.”  ACE publishes “top charities” that it claims are effective at saving animal lives; the leafleting effectiveness estimates are used to justify why people should give money to certain veganism-activist charities.  A made-up reason to support a charity isn’t “weak evidence”, it’s lying.

In that context, it’s exceptionally shocking to hear Reese talking about “evidence from intuition,” which is…not evidence.

Reese continues:

Intuition is certainly evidence in this sense. If I have to make quick decisions, like in the middle of a conversation where I’m trying to inspire someone to help animals, would I be more successful on average if I flipped a coin for my responses or went with my intuition?

But that’s not the point.  Obviously, my intuition is valuable to me in making decisions on the fly.  But my intuition is not a reason why anybody else should follow my lead. For that, I’d have to give, y’know, reasons.

This is what the word “objectivity” means. It is the ability to share data between people, so that each can independently judge for themselves.

Reese is committing the same kind of narcissistic fallacy we saw before: he’s forgetting that his readers are not Jacy Reese, and that “Jacy Reese thinks so” is therefore not a compelling reason to them.  Or perhaps he’s hoping that his donors can be “inspired” to give money to organizations run by his friends, simply because he tells them to.

In a Facebook thread on Harrison Nathan’s criticism of leafleting estimates, Jacy Reese said:

I have lots of demands on my time, and like others have said, engaging with you seems particularly unlikely to help us move forward as a movement and do more for animals.

Nobody is obligated to spend time replying to anyone else, and it may be natural to get a little miffed at criticism, but I’d like to point out the weirdness of saying that criticism doesn’t “help us move forward as a movement.”  If a passenger in your car says “hey, you just missed your exit”, you don’t complain that he’s keeping you from moving forward. That’s the whole point. You might be moving in the wrong direction.

In the midst of this debate somebody commented,

“Sheesh, so much grenade throwing over a list of charities!  I think it’s a great list!”

This is a nice, Jane Q. Public, kind of sentiment.  Why, indeed, should we argue so much about charities? Giving to charity is a nice thing to do.  Why can’t we all just get along and promote charitable giving?

The point is, though — it’s giving to a good cause that’s praiseworthy.  Giving to an arbitrary cause is not.

The whole point of the “effective” in “Effective Altruism” is that we, ideally, care about whether our actions actually have good consequences or not. We’d like to help animals or the sick or the poor, in real life. You don’t promote good outcomes if you oppose objectivity.

So what? The issue of exploitative marketing

These are informal comments by EAs, not official pronouncements.  And the majority of discussion of EA topics I’ve seen is respectful, thoughtful, and open to criticism.  So what’s the big deal if some EAs say problematic things?

There are some genuine scandals within the EA movement that pertain to deceptive marketing.  Intentional Insights, a supposed “EA” organization led by history professor Gleb Tsipursky, used astroturfing, paid for likes and positive comments, made false claims about its social media popularity, and falsely claimed affiliation with other EA organizations; Tsipursky may also have required his employees to “volunteer” large amounts of unpaid labor for him.

To their credit, CEA repudiated Intentional Insights; Will MacAskill’s excellent post on the topic argued that EA needs to clarify shared values and guard against people co-opting the EA brand to do unethical things.  One of the issues he brought up was

People engaging in or publicly endorsing ‘ends justify the means’ reasoning (for example involving plagiarism or dishonesty)

which is a perfect description of Tsipursky’s behavior.

I would argue that the problem goes beyond Tsipursky.  ACE’s claims about leafleting, and the way ACE’s advocates respond to criticism about it, are very plausibly examples of dishonesty defended with “ends justify the means” rhetoric.

More subtly, the most central effective altruism organizations and the custodians of the “Effective Altruism” brand are CEA and its offshoots (80,000 Hours and Giving What We Can), which are primarily focused on movement-building. And sometimes the way they do movement-building risks promoting an exploitative rather than cooperative relationship with the public.

What do I mean by that?

When you communicate cooperatively with a peer, you give them “news they can use.”  Cooperative advertising is a benefit to the consumer — if I didn’t know that there are salons in my neighborhood that specialize in cutting curly hair, then you, as the salon, are helping me by informing me about your services. If you argue cooperatively in favor of an action, you are telling your peer “hey, you might succeed better at your goals if you did such-and-such,” which is helpful information. Even making a request can be cooperative; if you care about me, you might want to know how best to make me happy, so when I express my preferences, I’m offering you helpful information.

When you communicate exploitatively with someone, you’re trying to gain from their weaknesses. Some of the sleazier forms of advertising are good examples of exploitation; if you make it very difficult to unsubscribe from your service, or make spammy websites whose addresses are misspellings of common website names, or make the “buy” button large and the “no thanks” button tiny, you’re trying to get money out of people’s forgetfulness or clumsiness rather than their actual interest in your product.  If you back a woman into an enclosed space and try to kiss her, you’re trying to get sexual favors as a result of her physical immobility rather than her actual willingness.

Exploitativeness is treating someone like a mark; cooperativeness is treating them like a friend.

A remarkable amount of EA discourse is framed cooperatively.  It’s about helping each other figure out how best to do good.  That’s one of the things I find most impressive about the EA movement — compared to other ideologies and movements, it’s unusually friendly, exploratory, and open to critical thinking.

However, if there are signs that EA orgs, as they grow and professionalize, are deliberately targeting growth among less-critical, less-intellectually-engaged, lower-integrity donors while dismissing intelligent and serious critics (and I think some of the discussions of the GWWC pledge I’ve quoted suggest exactly that), then I worry that they’re trying to get money out of people’s weaknesses rather than gaining from their strengths.

Intentional Insights used the traditional tactics of scammy, lowest-common-denominator marketing. To a sophisticated reader, their site would seem lame, even if you didn’t know about their ethical lapses. It’s buzzwordy, clickbaity, and unoriginal.  And this isn’t an accident, any more than it’s an accident that spam emails have poor grammar. People who are fussy about quality aren’t the target market for exploitative marketing. The target market for exploitative marketing is and always has been the exceptionally unsophisticated.  Old people who don’t know how to use the internet; people too disorganized to cancel their subscriptions; people too impulsive to resist clicking on big red buttons; sometimes even literal bots.

The opposite approach, if you don’t want to drift towards a pattern of exploitative marketing, is to target people who seek out hard-to-fake signals of quality.  In EA, this would mean paying attention to people who have high standards in ethics and accuracy, and treating them as the core market, rather than succumbing to the temptation to farm metrics of engagement from whomever it’s easiest to recruit in the short-term.

Using “number of people who sign the GWWC pledge” as a metric of engagement in EA is nowhere near as shady as paying for Facebook likes, but I think there’s a similar flavor of exploitability between them.  You don’t want to be measuring how good you are at “doing good” by counting how many people make a symbolic or trivial gesture.  (And the GWWC pledge isn’t trivial or symbolic for most people…but it might become so if people keep insisting it’s not meant as a real promise.)

EAs can fight the forces of sleaze by staying cooperative — engaging with those who make valid criticisms, refusing the temptation to make strong but misleading advertising claims, respecting rather than denigrating people of high integrity, and generally talking to the public like we’re reasonable people.

CORRECTION

A previous version of this post used the name and linked to a Facebook comment by a volunteer member of an EA organization. He isn’t an official employee of any EA organization, and his views are his own, so he viewed this as an invasion of his privacy, and he’s right. I’ve retracted his name and the link.