Asking Permission

Compliance Costs

Governance is about setting policies: rules for what people can do, who has authority to make decisions, what procedures must be used for decisionmaking. Governance encompasses both what governments do and what non-governmental bodies do — businesses, voluntary or charitable organizations, online discussion spaces. Anything that has written policies needs to think about governance and how to do it well — and a lot of groups do governance badly because they don’t know they’re doing it.

In this post I want to talk about rules, permissions, and what kinds of considerations should be kept in mind when requiring people to ask for permission before doing things.

Rules, of course, are often necessary, but always impose some difficulty or inconvenience on those who have to follow them.

The burden of a rule can be separated into (at least) two components.

First, there’s the direct opportunity cost of not being allowed to do the things the rule forbids. (We can include here the penalties for violating the rule.)

Second, there’s the “cost of compliance”, the effort spent on finding out what is permitted vs. forbidden and demonstrating that you are only doing the permitted things.

Separating these is useful. You can, at least in principle, aim to reduce the compliance costs of a rule without making it less stringent.

For instance, you could aim to simplify the documentation requirements for environmental impact assessments, without relaxing standards for pollution or safety.  “Streamlining” or “simplifying” regulations aims to reduce compliance costs, without necessarily lowering standards or softening penalties.

If your goal in making a rule is to avoid or reduce some unwanted behavior — for instance, to reduce the amount of toxic pollution people are exposed to — then shifting up or down your pollution standards is a zero-sum tradeoff between your environmental goals and the convenience of manufacturers who produce pollution.  Looser rules are worse for environmental safety but better for polluters; tighter rules are better for environmental safety but worse for polluters. It’s a direct tug-of-war between opposed interests.

Reducing the costs of compliance, on the other hand, is positive-sum: it saves money for manufacturers, without increasing pollution levels.  Everybody wins. Where possible, you’d intuitively think rulemakers would always want to do this.

There may be fundamental limits on how much you can streamline the process of demonstrating compliance, but there are totally rules in our world that could be made easier to comply with. For instance, why doesn’t the government, which already receives reports of your income, automatically calculate your income taxes for you?  This is technically feasible; several European countries already do it; and it certainly doesn’t need to reduce the amount of tax revenue the government collects. Everyone would save time “doing their taxes” — why isn’t this a win-win?
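To make the two components concrete, here’s a toy model (my own numbers, purely illustrative): the direct cost falls only on the people the rule actually constrains, the compliance cost falls on everyone, and “streamlining” cuts the latter without touching the former.

```python
# Toy model of the burden of a rule (illustrative numbers, not from the post).
# Direct cost: borne by the fraction of people the rule actually constrains.
# Compliance cost: paperwork borne by everyone subject to the rule.
def total_burden(n_people, constrained_share, direct_cost, compliance_cost):
    constrained = n_people * constrained_share
    return constrained * direct_cost + n_people * compliance_cost

# Streamlining paperwork halves the compliance cost while leaving the rule
# exactly as stringent: the total burden falls, and nobody pollutes more.
before = total_burden(1000, 0.05, 200.0, 50.0)  # 10000 + 50000 = 60000.0
after = total_burden(1000, 0.05, 200.0, 25.0)   # 10000 + 25000 = 35000.0
assert after < before
```

The zero-sum lever (how stringent the rule is) and the positive-sum lever (how cheap compliance is) are separate inputs, which is the point of separating them.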

Of course, this assumes an idealized world where the only goal of a rule is to get as many people as possible to comply as fully as possible.

You might want compliance costs to be high if you’re using the rule, not to reduce incidence of the forbidden behavior, but to produce distinctions between people — i.e. to separate the extremely committed from the casual, so you can reward them relative to others.  Costly signals are good if you’re playing a competitive zero-sum game; they induce variance because not everyone is able or willing to pay the cost.

For instance, some theories of sexual selection (such as the handicap principle) argue that we evolved traits which are not beneficial in themselves but are sensitive indicators of whether or not we have other fitness-enhancing traits. E.g. a peacock’s tail is so heavy and showy that only the strongest and healthiest and best-fed birds can afford to maintain it. The tail magnifies variance, making it easier for peahens to distinguish otherwise small variations in the health of potential mates.

Such “magnifying glasses for small flaws” are useful in situations where you need to pick “winners” and can inherently only choose a few. Sexual selection is an example of such a situation, as females have biological limits on how many children they can bear per lifetime; there is a fixed number of males they can reproduce with.  So it’s a zero-sum situation, as males are competing for a fixed number of breeding slots.  Other competitions for fixed prizes are similar in structure, and likewise tend to evolve expensive signals of commitment or quality.  A test that’s so easy anyone can pass it is useless for identifying the top 1%.

On a regulatory-capture or spoils-based account of politics, where politics (including regulation) is seen as a negotiation to divide up a fixed pool of resources, and loyalty/trust is important in repeated negotiations, high compliance costs are easy to explain. They prevent diluting the spoils among too many people, and create variance in people’s ability to comply, which allows you to be selective along whatever dimension you care about.

Competitive (selective, zero-sum) processes work better when there’s wide variance among people. A rule (or boundary, or incentive) that’s meant to optimize collective behavior, is, by contrast, looking at aggregate outcomes, and will tend to want to reduce variance between people.

If you can make it easier for people to do the desired behavior and refrain from the undesired, you’ll get better aggregate behavior, all else being equal.  These aggregate goals are, in a sense, “democratic” or “anti-elitist”; if you care about encouraging good behavior in everyone, then you want good behavior to be broadly accessible.

Requiring Permission Raises Compliance Costs 

A straightforward way of avoiding undesired behavior is to require people to ask an authority’s permission before acting.

This has advantages: sometimes “undesired behavior” is a complex, situational thing that’s hard to codify into a rule, so the discretional judgment of a human can do better than a rigid rule.

One disadvantage that I think people underestimate, however, is the chilling effect it has on desired behavior.

For instance:

  • If you have to ask the boss’s permission individually for each purchase, no matter how cheap, not only will you waste a lot of your employees’ time, but you’ll disincentivize them from asking for even cost-effective purchases, which can be more costly in the long run.
  • If you require a doctor’s appointment for giving pain medication every time, to guard against drug abuse, you’re going to see a lot of people who really do have chronic pain doing without medication because they don’t want the anxiety of going to a doctor and being suspected of “drug-seeking”.
  • If you have to get permission before cleaning or contributing supplies for a shared space, then that space will be chronically under-cleaned and under-supplied.
  • If you have to get permission from a superior in order to stop the production line to fix a problem, then safety risks and defective products will get overlooked. (This is why Toyota mandated that any worker can unilaterally stop the production line.)

The inhibition against asking for permission is going to be strongest for shy people who “don’t want to be a bother” — i.e. those who are most conscious of the effects of their actions on others, and perhaps those who you’d most want to encourage to act.  Those who don’t care about bothering you are going to be undaunted, and will flood you with unreasonable requests.  A system where you have to ask a human’s permission before doing anything is an asshole filter, in Siderea’s terminology; it empowers assholes and disadvantages everyone else.

The direct costs of a rule fall only on those who violate it (or wish they could); the compliance costs fall on everyone.  A system of enforcement that preferentially inhibits desired behavior (while not being that reliable in restricting undesired behavior) is even worse from an efficiency perspective than a high compliance cost on everyone.

Impersonal Boundaries

An alternative is to instantiate your boundaries in an inanimate object — something that can’t intimidate shy people or cave to pressure from entitled jerks.  For instance:

  • a lock on a door is an inanimate boundary on space
  • a set of password-protected permissions on a filesystem is an inanimate boundary on information access
  • a departmental budget and a credit card with a fixed spending limit is an inanimate boundary on spending
  • an electricity source that shuts off automatically when you don’t pay your bill is an inanimate boundary against theft
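A minimal sketch of such a boundary (my own toy code, using the spending-limit example from the list above): it gives the same answer no matter who asks, or how often.

```python
# Toy "inanimate boundary": a spending limit that refuses over-budget requests
# identically every time. No disappointed face, no caving to repeated asks.
class SpendingLimit:
    def __init__(self, limit):
        self.limit = limit
        self.spent = 0.0

    def request(self, amount):
        if self.spent + amount > self.limit:
            return False          # same answer regardless of who asks
        self.spent += amount
        return True

card = SpendingLimit(1000.0)
assert card.request(600.0)        # within budget: granted, no questions
assert not card.request(500.0)    # over budget: refused, patiently
assert not card.request(500.0)    # asking again doesn't help
assert card.request(300.0)        # a request within the limit still works
```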

The key element here isn’t information-theoretic simplicity, as in the debate over simple rules vs. discretion.  Inanimate boundaries can be complex and opaque.  They can be a black box to the user.

The key elements are that, unlike humans, inanimate boundaries do not punish requests that are refused (even socially, by wearing a disappointed facial expression), and they do not give in to repeated or more forceful requests.

An inanimate boundary is, rather, like the ideal version of a human maintaining a boundary in an “assertive” fashion; it enforces the boundary reliably and patiently and without emotion.

This way, it produces less inhibition in shy or empathetic people (who hate to make requests that could make someone unhappy) and is less vulnerable to pushy people (who browbeat others into compromising on boundaries.)

In fact, you can get some of the benefits of an inanimate boundary without actually taking a human out of the loop, but just by reducing the bandwidth for social signals. By using email instead of in-person communication, for instance, or by using formalized scripts and impersonal terminology.  Distancing tactics make it easier to refuse requests and easier to make requests; if these effects are roughly the same in magnitude, you get a system that selects more effectively for enabling desired behavior and preventing undesired behavior. (Of course, when you have one permission-granter and many permission-seekers, the effects are not the same in aggregate magnitude; the permission-granter can get spammed by tons of unreasonable requests.)

Of course, if you’re trying to select for transgressiveness — if you want to reward people who are too savvy to follow the official rules and too stubborn to take no for an answer — you’d want to do the opposite; have an automated, impersonal filter to block or intimidate the dutiful, and an extremely personal, intimate, psychologically grueling test for the exceptional. But in this case, what you’ve set up is a competitive test to differentiate between people, not a rule or boundary which you’d like followed as widely as possible.

Consensus and Do-Ocracy

So far, the systems we’ve talked about are centralized, and described from the perspective of an authority figure. Given that you, the authority, want to achieve some goal, how should you most effectively enforce or incentivize desired activity?

But, of course, that’s not the only perspective one might take. You could instead take the perspective that everybody has goals, with no a priori reason to prefer one person’s goals to anyone else’s (without knowing  what the goals are), and model the situation as a group deliberating on how to make decisions.

Consensus represents the egalitarian-group version of permission-asking. Before an action is taken, the group must discuss it, and must agree (by majority vote, or unanimous consent, or some other aggregation mechanism) that it’s sufficiently widely accepted.

This has all of the typical flaws of asking permission from an authority figure, with the added problem that groups can take longer to come to consensus than a single authority takes to make a go/no-go decision. Consensus decision processes inhibit action.

(Of course, sometimes that’s exactly what you want. We have jury trials to prevent giving criminal penalties lightly or without deliberation.)

An alternative, equally egalitarian structure is what some hackerspaces call do-ocracy.

In a do-ocracy, everyone has authority to act, unilaterally. If you think something should be done, like rearranging the tables in a shared space, you do it. No need to ask for permission.

There might be disputes when someone objects to your actions, which have to be resolved in some way.  But this is basically the only situation where governance enters into a do-ocracy. Consensus decisionmaking is an informal, anarchic version of a legislative or executive body; do-ocracy is an informal, anarchic version of a judicial system.  Instead of needing governance every time someone acts, in a judicial-only system you only need governance every time someone acts (or states an intention to act) AND someone else objects.

The primary advantage of do-ocracy is that it doesn’t slow down actions in the majority of cases where nobody minds.  There’s no friction, no barrier to taking initiative.  You don’t have tasks lying undone because nobody knows “whose job” they are.  The compliance cost for unobjectionable actions is zero.

Additionally, it grants the most power to the most active participants, which intuitively has a kind of fairness to it, especially in voluntary clubs that have a lot of passive members who barely engage at all.

The disadvantages of do-ocracy are exactly the same as its advantages.  First of all, any action which is potentially harmful and hard to reverse (including, of course, dangerous accidents and violence) can be unilaterally initiated, and do-ocracy cannot prevent it, only remediate it after the fact (or penalize the agent.)  Do-ocracies don’t deal well with very severe, irreversible risks. When they have to, they evolve permission-based or rules-based functions; for instance, the safety policies that firms or insurance companies institute to prevent risky activities that could lead to lawsuits.

Secondly, do-ocracies grant the most power to the most active participants, which often means those who have the most time on their hands, or who are closest to the action, at the expense of absent stakeholders. This means, for instance, that do-ocracy favors a firm’s executives (who engage in day-to-day activity) over investors or donors or the general public; in volunteer and political organizations it favors those who have more free time to participate (retirees, students, the unemployed, the independently wealthy) over those who have less (working adults, parents). If you’ve seen dysfunctional homeowners’ associations or PTAs, this is why: they’re run for the personal benefit of those who have nothing better to do than get super involved in the association’s politics, at the expense of the people who are too busy to show up.

The general phenomenon here is principal-agent problems — theft, self-dealing, negligence, all cases where the people who are physically there and acting take unfair advantage of the people who are absent and not in the loop, but depend on things remaining okay.

A judicial system doesn’t help those who don’t know they’ve been wronged.

Consensus systems, in fact, are designed to force governance to include or represent all the stakeholders — even those who would, by default, not take the initiative to participate.

Consumer-product companies mostly have do-ocratic power over their users. It’s possible to quit Facebook with the touch of a button. Facebook changes its algorithms, often in ways users don’t like — but, in most cases, people don’t hate the changes enough to quit.  Facebook makes use of personal data — after putting up a dialog box requesting permission to use it. Yet, some people are dissatisfied and feel like Facebook is too powerful, like it’s hacking into their baser instincts, like this wasn’t what they’d wanted. But Facebook hasn’t harmed them in any way they didn’t, in a sense, consent to. The issue is that Facebook was doing things they didn’t reflectively approve of while they weren’t paying attention. Not secretly — none of this was secret, it just wasn’t on their minds, until suddenly a big media firestorm put it there.

You can get a lot of power to shape human behavior just by showing up, knowing what you want, and enacting it before anyone else has thought about it enough to object. That’s the side of do-ocracy that freaks people out.  Wherever in your life you’re running on autopilot, an adversarial innovator can take a bite out of you and get away with it long before you notice something’s wrong.  

This is another part of the appeal of permission-based systems, whether egalitarian or authoritarian; if you have to make a high-touch, human connection with me and get my permission before acting, I’m more likely to notice changes that are bad in ways I didn’t have any prior model of. If I’m sufficiently cautious or pessimistic, I might even be ok with the costs in terms of causing a chilling effect on harmless actions, so long as I make sure I’m sensitive to new kinds of shenanigans that can’t be captured in pre-existing rules.  If I don’t know what I want exactly, but I expect change is bad, I’m going to be much more drawn to permission-based systems than if I know exactly what I want or if I expect typical actions to be improvements.

The Costs of Reliability

A question that used to puzzle me is “Why can people be so much better at doing a thing for fun, or to help their friends and family, than they are at doing the exact same thing as a job?”

I’ve seen it in myself and I’ve seen it in others. People can be hugely more productive, creative, intelligent, and efficient on just-for-fun stuff than they are at work.

Maybe it’s something around coercion? But it happens to people even when they choose their work and have no direct supervisor, as when a prolific hobbyist writer suddenly gets writer’s block as soon as he goes pro.

I think it has a very mundane explanation; it’s always more expensive to have to meet a specific commitment than merely to do something valuable.

If I feel like writing sometimes and not other times, then if writing is my hobby I’ll write when I feel like it, and my output per hour of writing will be fairly high. Even within “writing”, if my interests vary, and I write about whatever I feel like, I can take full advantage of every writing hour.  By contrast, if I’ve committed to write a specific piece by a specific deadline, I have to do it whether or not I’m in the mood for it, and that means I’ll probably be less efficient, spend more time dithering, and I’ll demand more external compensation in exchange for my inconvenience.

The stuff I write for fun may be valuable! And if you simply divide the value I produce by my hours of labor or the amount I need to be paid, I’m hugely more efficient in my free time than in my paid time! But I can’t just trivially “up my efficiency” in my paid time; reliability itself has a cost.

The costs of reliability are often invisible, but they can be very important. The cost (in time and in office supplies and software tools) of tracking and documenting your work so that you can deliver it on time. The cost (in labor and equipment) of quality assurance testing. The opportunity cost of creating simpler and less ambitious things so that you can deliver them on time and free of defects.

Reliability becomes more important with scale. Large organizations have more rules and procedures than small ones, and this is rational. Accordingly, they pay more costs in reliability.

One reason is that the attack surface for errors grows with the number of individuals involved. For instance, large organizations often have rules against downloading software onto company computers without permission.  The chance that any one person downloads malicious software that seriously harms the company is small, but the chance that at least one person does rises with the number of employees.
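The arithmetic behind this (my own illustrative numbers): if each of n employees independently has a small probability p of causing the harm, the chance that at least one does is 1 − (1 − p)^n, which climbs toward certainty as n grows.

```python
# Probability that at least one of n independent employees triggers the harm,
# given each has small per-person probability p. (Numbers are illustrative.)
def p_at_least_one(p, n):
    return 1 - (1 - p) ** n

# With p = 0.1% per person, a ten-person shop is almost never hit,
# but a ten-thousand-person firm is almost certain to be hit eventually.
print(p_at_least_one(0.001, 10))     # ~0.01
print(p_at_least_one(0.001, 10000))  # ~1.0
```

This is why a blanket no-unauthorized-downloads rule can be rational at scale even though it looks paranoid for a small team.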

Another reason is that coordination becomes more important with more people. If a project depends on many people cooperating, then you as an individual aren’t simply trying to do the best thing, but rather the best thing that is also understandable and predictable and capable of achieving buy-in from others.

Finally, large institutions are more tempting to attackers than small ones, since they have more value to capture. For instance, large companies are more likely to be targeted by lawsuits or public outcry than private individuals, so it’s strategically correct for them to spend more on defensive measures like legal compliance procedures or professional PR.

All of these types of defensive or preventative activity reduce efficiency — you can do less in a given timeframe with a given budget.  Large institutions, even when doing everything right, acquire inefficiencies they didn’t have when small, because they have higher reliability requirements.

Of course, there are also economies of scale that increase efficiency. There are fixed expenses that only large institutions can afford, that make marginal production cheaper. There are ways to aggregate many risky components so that the whole is more robust than any one part, e.g. in distributed computation, compressed sensing, or simply averaging.  Optimal firm size is a balance.
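A small sketch of the averaging point (my own illustration, not from the post): the mean of many independently noisy components is far more reliable than any single component.

```python
# Averaging as aggregation: the error of the mean of n independent noisy
# readings shrinks roughly as 1/sqrt(n), so the whole beats any one part.
import random
random.seed(0)  # deterministic for the example

def noisy_reading(true_value=10.0, noise=2.0):
    return true_value + random.gauss(0, noise)

def averaged(n):
    return sum(noisy_reading() for _ in range(n)) / n

single_error = abs(noisy_reading() - 10.0)   # a single part is often off by a unit or more
average_error = abs(averaged(10000) - 10.0)  # the aggregate is reliably close
assert average_error < 0.5
```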

This framework tells us when we ought to find it possible to get better-than-default efficiency easily, i.e. without any clever tricks, just by accepting different tradeoffs than others do. For example:

1.) People given an open-ended mandate to do what they like can be far more efficient than people working to spec…at the cost of unpredictable output with no guarantees of getting what you need when you need it. (See: academic research.)

2.) Things that come with fewer guarantees of reliable performance can be cheaper in the average use case…at the cost of completely letting you down when they occasionally fail. (See: prototype or beta-version technology.)

3.) Activities within highly cooperative social environments can be more efficient…at the cost of not scaling to more adversarial environments where you have to spend more resources on defending against attacks. (See: Eternal September)

4.) Having an “opportunistic” policy of taking whatever opportunities come along (for instance, hanging out in a public place and chatting with whomever comes along and seems interesting, vs. scheduling appointments) allows you to make use of time that others have to spend doing “nothing” … at the cost of never being able to commit to activities that need time blocked out in advance.

5.) Sharing high-fixed-cost items (like cars) can be more efficient than owning…at the cost of not having a guarantee that they’ll always be available when you need them.

In general, you can get greater efficiency for things you don’t absolutely need than for things you do; if something is merely nice-to-have, you can handle it if it occasionally fails, and your average cost-benefit ratio can be very good indeed. But this doesn’t mean you can easily copy the efficiency of luxuries in the production of necessities.

(This suggests that “toys” are a good place to look for innovation. Frivolous, optional goods are where we should expect it to be most affordable to experiment, all else being equal; and we should expect technologies that first succeed in “toy” domains to expand to “important, life-and-death” domains later.)

 

Book Review: Why Are The Prices So Damn High?

Economist Alex Tabarrok has recently come out with a short book, “Why Are the Prices So Damn High?”, available in full as a PDF here.

Since the 1950’s, the inflation-adjusted cost of physical goods has fallen, and the cost of food has stayed about the same.  But the cost of education, professional services, and healthcare has risen dramatically, despite those sectors not producing much improvement. Why?

The traditional economic explanation for the rising cost of services is the Baumol Effect. Some sectors, like manufacturing, are subject to efficiency improvements over time as technology improves; the more we automate the production of goods, the cheaper they get.  Other sectors are intrinsically harder to automate, so they don’t get cheaper over time. For instance, it takes the same number of musicians the same amount of time to play a symphony as it did in 1950.  So, as a proportion of the average person’s paycheck, the cost of intrinsically un-automatable things like live concerts must rise relative to the cost of automatable things like manufactured goods.
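A minimal numeric sketch of the Baumol logic (my own made-up numbers): let wages track productivity growth in the automatable sector, and watch the relative price of the un-automatable service rise even though nothing about the service changed.

```python
# Baumol effect sketch: manufacturing productivity grows 2%/year; a live
# performance's productivity is flat. Wages track economy-wide productivity.
years = 50
wage_growth = 0.02
wage = (1 + wage_growth) ** years                   # wage index after 50 years

goods_output_per_hour = (1 + wage_growth) ** years  # automation keeps pace with wages
concert_output_per_hour = 1.0                       # still 4 players, 1 hour, 1 symphony

goods_unit_cost = wage / goods_output_per_hour      # stays at 1.0
concert_unit_cost = wage / concert_output_per_hour  # rises with the wage index

assert abs(goods_unit_cost - 1.0) < 1e-9
assert concert_unit_cost > 2.5  # roughly 2.7x more expensive, relatively
```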

Tabarrok doesn’t cover housing in his book, but home prices have also been rising since the 1970’s and I’ve seen the Baumol effect deployed to explain rising housing costs as well. “Land is the one thing they’re not making any more of” — for the most part, technological improvements don’t increase the quantity of livable land, so if technology makes some sectors more efficient and drives costs down, land will become relatively more expensive.

My Beef With Baumol

My preconception coming into the book was that the Baumol effect doesn’t actually answer the question. Why are healthcare, professional services, and education intrinsically hard to make more efficient?  It’s prima facie absurd to say that medicine is just one of those things that technology can’t improve — the biomedical industry is one of the biggest R&D endeavors in the whole economy!  So why is it obvious that none of that innovation can make medicine cheaper?  If it’s not making medicine cheaper, that’s an empirical fact that deserves explanation, and “it’s the Baumol effect” doesn’t actually answer the “why” question.

The same holds true for the other industries, even housing to some degree. While it’s true that the amount of land on Earth is fixed (modulo landfill) and the amount of space in Manhattan is fixed, there are also the options of building taller buildings, expanding cities, and building new cities.  Why is it in principle impossible for the production of housing to become more efficient over time, just as the production of other goods does?

The Baumol Effect doesn’t make sense to me as an explanation, because its answer to “why are these sectors getting more expensive?” is, in effect, “because it’s obvious that they can’t get cheaper.”

It’s Not Administrative Bloat, It’s More Service Providers

A popular explanation for why college and K-12 education have gotten more expensive is “bloat”, the idea that most of the cost is due to increasing numbers of bureaucratic administrators and unnecessary luxury amenities.

Tabarrok points out that this story can’t be true. In reality, the percent of university costs going to administration has stayed relatively constant since 1980, and the percent going to facilities has decreased. In the K-12 world, the number of administrators is tiny compared to the number of teachers, and it’s barely budged; it’s the number of teachers per student that has grown.  Most of the increase in educational costs, says Tabarrok, comes from rising numbers of teachers and college professors, and higher wages for those teachers and professors.

In other words, education is getting more “inefficient”, not necessarily in a pejorative sense but in an economic sense; we are using more people to achieve similar educational results (average test scores are flat.)

This may be fine; maybe people get value out of personal teacher attention that doesn’t show up in test scores, so we’re spending more to get a better outcome, just one that the narrow metric of standardized test performance doesn’t capture.

Likewise, in healthcare, we have an increasing number of doctors and nurses in the US per capita, and (relative to median income) doctors and nurses are making higher salaries over time.  Whatever improvements we’re making in medical technology, we’re not using them to automate away the need for labor.

Again, maybe this is what people want; maybe personal attention is intrinsically valuable to people, so we’re getting more for our money.  (And overall health outcomes like life expectancy have increased modestly since 1950, though I’d argue that they’re underperforming relative to what’s possible.)

But What About Housing?

The argument that the cost of services is rising because we use our increasing prosperity to “buy” more personal attention from teachers and doctors does not apply directly to the rising cost of housing, which is not a service.

However, it may be that the rising cost of housing, especially in cities, is really about buying proximity to increasingly valuable services — good schools, live music, and so on. If the only thing you can’t automate away is human contact, maybe we’re willing to spend more to be around fancier humans.

But What About Immigration?

You might argue “but labor prices don’t come down because immigration restrictions keep foreigners out! Labor-intensive industries are getting more expensive because we allow too little immigration!  The reason why education and medicine are getting expensive is just precisely because those are the sectors where restrictive laws keep the cost of inputs high.”

But, like the Baumol effect, this explanation also begs the question.  Why are healthcare and education, relative to other industries, the sectors where labor costs are the most important?

The immigration explanation is also compatible with the Baumol effect, not a counterargument to it. If we just take as a given that it’s impossible to make healthcare or education more labor-efficient, then both “other things getting cheaper” and “immigration restrictions keeping wages high” can contribute to the high cost of healthcare & education relative to other things.

Cost Increases Aren’t Driven By Supply-Side Gatekeeping

From Tabarrok’s point of view, rising housing costs, education costs, and healthcare costs are not really mysterious facts in need of explanation by gatekeeping tactics like monopolies, regulation, zoning, or restrictive licensing, nor can they be explained by gatekeeping tactics alone.

Gatekeeping on the supply side raises prices and reduces output. For instance, a monopolist’s profit-maximizing output is lower than the equilibrium output in a competitive market, and its price and profit are higher than firms in a competitive market can obtain. Likewise, restrictive licensing laws reduce the supply of doctors and lawyers and raise their wages.
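The textbook monopoly claim can be checked with a toy linear-demand example (my own numbers, not from the book):

```python
# Linear demand P = a - b*Q, constant marginal cost c. Competition drives
# price down to cost; a monopolist sets marginal revenue (a - 2bQ) equal
# to cost instead, producing half as much at a higher price.
a, b, c = 100.0, 1.0, 20.0

q_competitive = (a - c) / b        # P = c      ->  Q = 80
q_monopoly = (a - c) / (2 * b)     # MR = c     ->  Q = 40
p_monopoly = a - b * q_monopoly    # P = 60, well above the cost of 20

assert q_monopoly < q_competitive  # gatekeeping: less output...
assert p_monopoly > c              # ...at a higher price
```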

But we don’t see declines in the number of doctors, lawyers, teachers, and professors over time — we see clear and steady increases.  Therefore, the increased cost of medicine can’t be explained by increased restrictions on licensing.

It’s still possible that licensing is artificially restricting the supply of skilled professionals relative to an even higher counterfactual growth rate, but this doesn’t by itself explain the growth in spending we see. Demand for professional services is rising.

Prescription-only drugs are another good example of regulatory gatekeeping not being enough to explain rising costs. The total cost of getting a prescription drug is higher when there’s a legal requirement of a doctor visit than when you can just buy the drug over the counter; in that sense it’s true that regulation increases costs.  However, prescription-only requirements have been pretty much fixed for decades, not getting more severe, while consumption of prescription drugs per capita is rising; we’re spending more on drugs because there’s growing demand for drugs.

This means that deregulation alone won’t change the fact that a growing portion of each person’s paycheck is getting spent on medicine.  If the law reclassifies a drug as over-the-counter, we’d expect a one-time downward shift in the price of that drug, but the slope of the curve of total spending on that drug over time won’t get flatter unless demand declines.
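A toy series makes the level-vs-slope point concrete (my own made-up numbers): reclassifying a drug shifts the spending curve down once, but the growth rate afterward is still set by demand.

```python
# Toy spending series: demand grows 5%/year; moving the drug over-the-counter
# in year 5 cuts its price 30%, once.
def spending(year, otc_from=None):
    s = 100.0 * (1.05 ** year)
    if otc_from is not None and year >= otc_from:
        s *= 0.7  # one-time level shift from deregulation
    return s

# The level drops at year 5, but the growth rate afterward is unchanged:
# demand, not gatekeeping, sets the trend.
assert spending(5, otc_from=5) < spending(5)
growth_before = spending(4) / spending(3)
growth_after = spending(9, otc_from=5) / spending(8, otc_from=5)
assert abs(growth_before - growth_after) < 1e-9  # both 5%/year
```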

Now, increased demand isn’t only possible to get from consumer preferences; governments can also increase demand for a service by providing it to the public, in effect (through taxes) requiring society to buy more of it.

You can still in principle make a case that government is to blame for increasing healthcare and education prices; you just can’t claim it’s only about gatekeeping, you have to include demand in the picture.

A “Dismal” Conclusion

Ever-increasing healthcare, education, and housing costs are a big problem. It would be “good news” if we could solve the problem by passing or repealing a law.  It would also be “good news” if the high costs were driven by foolish waste — then a competitor could just remove the waste and offer consumers lower prices.

Tabarrok’s analysis suggests this isn’t the case.

The cost increases are coming from lots of skilled professional labor — something that isn’t obviously a thing you can get rid of without making people unhappy!  In order to reduce costs, it wouldn’t be enough to cut gatekeeping regulations; you’d also have to cut subsidies — which does, unfortunately, entail taking a thing away from people (albeit potentially giving them lower taxes in exchange.) This cost-cutting can be the kind of free-market minimalism that Bryan Caplan talks about, or it can be part of a state-run but price-conscious system like the UK’s (where doctors go to school for fewer years than in the US). But either way, it involves fewer man-hours spent on education and healthcare.

One way or another, for costs to come down, people would have to spend less time going to school, and get less personal attention from less-educated experts.

Deeper Issues

Tabarrok’s attitude, and the implicit attitude of the Baumol effect, is that the increasing relative costs of education and healthcare are not a problem. They are just a side effect of a society getting richer. Goods whose production is easy to automate get cheap faster than goods whose production is hard to automate. Fundamentally, we’re spending more on healthcare and education, as a society, because we want to.  (If not as consumers, then as voters.)

This isn’t how most people feel about it. Most people feel like it’s getting harder to get the same level of stuff their parents’ generation got.  That the rising prices actually mean something bad.

If the real driver of cost is that we’re getting more man-hours of time with professionals who, themselves, have spent more man-hours of time getting educated by other professionals, then in one sense we’re “paying more to get more”, and in another sense we’re not. It’s nice to get more one-on-one time with professors; but part of the reason we get higher education is to be considered for jobs that require a diploma, and the rise in education costs means that a diploma costs more.

We’re “paying more for more”, but the “more” we’re getting is primarily social and emotional — more personal time with more prestigious people — while we’re not getting much more of the more concretely observable stuff, like “square feet of housing”, “years of life”, “number of children we can afford to have”, etc.

At this point, I tend to agree with Robin Hanson.  We have more doctors, nurses, lawyers, professors, teachers, and financial managers, without corresponding improvements in the closest available metrics for the results those professionals are supposed to provide (health outcomes, access to efficient dispute resolution, knowledge and skills, and financial returns.)

Ultimately you have to conclude that this is a matter of divided will. (Hanson would call it hypocrisy, but unexamined confusion, or conflict between interest groups, might explain the same phenomenon.)  People are unhappy because they are “spending more for the same stuff”; at the same time, we are spending more for “more” in terms of prestige, and at least some of us, some of the time, must want that.

All You Need Is Love?

It’s directly valuable, as in, emotionally rewarding and probably even physically health-promoting, to get personal care and undivided attention from someone you think highly of.

Hanson may think that getting personal attention from prestigious people is merely “showing off”, but something that brings joy and enhances health is at least as much of a valid human benefit as food or housing space.

The feelings that come from good human connection, the feeling of being loved and cared for, are real.  They are “objective” in a way that I think people don’t always appreciate — in a way that I did not appreciate until very recently. What I mean is, just because you do something in search of a good feeling, does not mean that you will get that good feeling. The feeling is “subjective” in the sense that it occurs inside your mind, but it is “objective” in the sense that you cannot get it arbitrarily by wishing; some things produce it and some do not. For instance, it is a hell of a lot easier to feel loved by getting eye contact and a hug, than it is by typing words into a computer screen.  “Facts vs. feelings” is a false dichotomy that stops us from learning the facts about what creates good feelings.

Prestige addiction may come from spending a lot of resources trying to obtain a (social, emotional) thing by proxy, when in principle it would be possible to get it more directly.  If what you want is to be cared for by a high-integrity, kind, skilled person, but instead you insist on being cared for by someone with an M.D., you may miss out on the fact that nurses or even hospital techs can be just as good, but cheaper, on the dimensions you really care about.  To the extent that credentialism results from this sort of misunderstanding, it may be possible to roll it back through advocacy.  That’s hard, because changing minds always is, but it’s doable in principle.

To the extent that people want fancy things because they are expensive, in a zero-sum sense, there is no “efficiency-improving” solution.  No attempt to make healthcare or education cheaper will help if people only care about having more than their neighbors.

But: to the extent that some people are doing mostly zero-sum things while other people are doing mostly positive-sum things, the positive-sum people can notice that the zero-sum people are ruining things for everyone and act accordingly.


Circle Games

I may be reinventing a known thing in child development or psychology here, but bear with me.

The simplest games I see babies play — games simple enough that cats and dogs can play them too — are what I’d call “circle games.”

Think of the game of “fetch”.  I throw the ball, Rover runs and brings it back, and then we repeat, ad infinitum.  (Or, the baby version: baby throws the item out of the stroller, I pick it up, and then we repeat.)

Or, “peek-a-boo.” I hide, I re-emerge, baby laughs, repeat.

My son is also fond of “open the door, close the door, repeat”, or “open the drawer, close the drawer, repeat”, which are solo circle games, and “together/apart”, where he pushes my hands together and apart and repeats, and of course being picked up and put down repeatedly.

A lot of toys are effectively solo circle games in physical form.  The jack-in-the-box: “push a button, out pops something! close the box, start again.” Fidget toys with buttons and switches to flip: “push the button, get a satisfying click, repeat.”

It’s obvious, observing a small child, that the purpose of these “games” is learning.  And, in particular, learning cause and effect.  What do you learn by opening and closing a door? Why, how to open and close doors; or, phrased a different way, “when I pull the door this way, it opens; when I push it that way, it closes.” Playing fetch or catch teaches you about how objects move when dropped or thrown.  Playing with button-pushing or latch-turning toys teaches you how to handle the buttons, keys, switches, and handles that are ubiquitous in our built environment.

But what about peek-a-boo? What are you “learning” from that? (It’s a myth that babies enjoy it because they don’t have object permanence; babies get object permanence by 3 months, but enjoy peek-a-boo long after that.) My guess is that peek-a-boo trains something like “when I make eye contact I get smiles and positive attention” or “grownups go away and then come back and are happy to see me.” It’s social learning.

It’s important for children to learn, generally, “when I act, the people around me react.” This gives them social efficacy (“I can achieve goals through interaction with other people”), access to social incentives (“people respond positively when I do this, and negatively when I do that”), and a sense of social significance (“people care enough about me to respond to my actions.”) Attachment psychology argues that when babies and toddlers don’t have any adults around who respond to their behavior, their social development goes awry — neglected children can be extremely fearful, aggressive, or checked-out, missing basic abilities in interacting positively with others.

It’s clear just from observation that the social game of interaction — “I make a sound, you make a sound back” — is learned before verbal speech.  Preverbal babies can even execute quite sophisticated interaction patterns, like making the tonal pattern of a question followed by an answering statement.  This too is a circle game.

The baby’s fascination with circle games completely belies the popular notion that drill is an intrinsically unpleasant way to learn. Repetition isn’t boring to babies who are in the process of mastering a skill. They beg for repetition.

My personal speculation is that the “craving for drill”, especially in motor learning, is a basal ganglia thing; witness how abnormalities in the ganglia are associated with disorders like OCD and Tourette’s, which involve compulsive repetition of motor activities; or how some dopaminergic drugs given to Parkinsonian patients cause compulsions to do motor activities like lining up small objects or hand-crafts. Introspectively, a “gear can engage” if I get sufficiently fascinated with something and I’ll crave repetition — e.g. craving to listen to a song on repeat until I’ve memorized it, craving to get the hang of a particular tricky measure on the piano — but there’s no guarantee that the gear will engage just because I observe that it would be a good idea to master a particular skill.

I also think that some kinds of social interaction among adults are effectively circle games.

Argument or fighting, in its simplest form, is a circle game: “I say Yes, you say No, repeat!” Of course, sophisticated arguments go beyond this; each player’s “turn” should contribute new information to a logical structure. But many arguments in practice are not much more sophisticated than “Yes, No, repeat (with variations).”  And even intellectually rigorous and civil arguments usually share the basic turn-taking adversarial structure.

Now, if the purpose of circle games is to learn a cause-and-effect relationship, what are we learning from adversarial games?

Keep in mind that adversarial play — “you try to do a thing, I try to stop you” — kicks in very early and (I think) cross-culturally. It certainly exists across species; puppies do it.

Almost tautologically, adversarial play teaches resistance.  When you push on others, others push back; when others push on you, you push back.

War, in the sense we know it today, may not be a human universal, and certainly isn’t a mammalian universal; but resistance seems to be an inherent feature of social interaction between any beings whose interests are imperfectly aligned.

A lot of social behaviors generally considered maladaptive look like adversarial circle games. Getting sucked into repetitive arguments? That’s a circle game. Falling into romantic patterns like “you want to get closer to me, I pull away, repeat”? Circle game.  Being shocking or reckless to get attention? Circle game.

The frame where circle games are for learning suggests that people do these things because they feel like they need more practice learning the lesson.  Maybe people who are very combative feel, on some level, that they need to “get the hang of” pushing back against social resistance, or conversely, learning how not to do things that people will react badly to.  It’s unsatisfying to feel like a ghost, moving through the world but not getting any feedback one way or another. Maybe when people crave interaction, they’re literally craving training data.

If you always do A, and always get response B, and you keep wanting to repeat that game, for much longer than is “normal”, then a couple things might be happening:

  • Your “learning algorithm” has an unusually low “learning rate”, such that you just don’t update very efficiently on what ought to be ample data (in general or in this specific context).
  • You place a very high importance on the A-B relationship such that you have an unusually high need to be sure of it.  (e.g. your algorithm has a very high threshold for convergence.) So even though you learn as well as anybody else, you want to keep learning for longer.
  • You have a very strong “prior” that A does not cause B, which it takes a lot of data to “disprove.”
  • You have something like “too low a rate of stochasticity.”  What you actually need is variation — you need to see that A’ causes B’ — but you’re stuck in a local rut where you can’t explore the space properly so you just keep swinging back and forth in that rut. But your algorithm keeps returning “not mastered yet”. (You can get these effects in algorithms as simple as Newton’s Method.)
  • You’re not actually trying to learn “A causes B.” You’re trying to learn “C causes D.” But A correlates weakly with C, and B correlates weakly with D, and you don’t know how to specifically do C, so you just do A a lot and get intermittent reinforcement.
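The “stuck in a rut” case really does show up in algorithms as simple as Newton’s method, as the list claims. Here’s a toy numerical illustration (my own example, not from the text): on f(x) = x³ − 2x + 2, the deterministic update started at x = 0 swings between the same two points forever and never “converges”, while a bit of variation — starting somewhere else in the space — finds the actual root almost instantly:

```python
def f(x):  return x**3 - 2*x + 2
def df(x): return 3*x**2 - 2

def newton(x, steps):
    # Plain Newton iteration: x <- x - f(x)/f'(x)
    for _ in range(steps):
        x = x - f(x) / df(x)
    return x

# The deterministic learner is stuck in a two-point rut:
# 0 -> 1 -> 0 -> 1 -> ..., reporting "not converged" forever.
assert newton(0.0, 100) == 0.0
assert newton(0.0, 101) == 1.0

# Inject some variation (a different starting point) and it
# converges quickly to the real root near -1.7693.
root = newton(-2.0, 20)
print(round(root, 4))   # -1.7693
```

Nothing is wrong with the learner’s update rule; it just keeps getting the same “training data” over and over, which is exactly the failure mode the fourth bullet describes.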

These suggest more general ways to break down when repetition will seem “boring” vs. “fascinating” to different people or in different contexts.

Pecking Order and Flight Leadership

It was recently pointed out to me that humans are weird, compared to other social animals, in that we conflate the pecking order with the group decision-making process.

The pecking order, for instance in birds, is literally the ranking of who gets to eat first when food is scarce.

We can also call it a “dominance hierarchy”, but the words “dominance” and “hierarchy” call up associations with human governance systems like aristocracy and monarchy, where the king or chief is both the decisionmaker for the group and the person entitled to the most abundant resources.

In birds, it’s not like that. Being top chicken doesn’t come with the job of “leading” the other chickens anywhere; it just entitles you to eat better (or have better access to other desirable resources).  In fact, group decisionmaking (like deciding when and where to migrate) does occur in birds, but not necessarily according to the “pecking order”.  Leadership (setting the direction of the group) and dominance (being high in the pecking order) are completely independent in pigeons, for instance.  Pigeons have stable, transitive hierarchies of flight leadership, and they have stable pecking order hierarchies, and these hierarchies do not correlate.

Logically, it isn’t necessary for the individual who decides what others shall do to also be the individual who gets the most goodies.  They can be related — one of the things you can do with the power to give instructions is to instruct others to give you more goodies. But you can, at least with nonhuman animals, separate pecking-order hierarchies from decision-making hierarchies.

You can even set this up as a 2×2:

  • High rank in pecking order, high decision-making power: Liege
  • High rank in pecking order, low decision-making power: Eloi
  • Low rank in pecking order, high decision-making power: Morlock
  • Low rank in pecking order, low decision-making power: Vassal

“Eloi” and “Morlocks” are, of course, borrowed from H.G. Wells’ The Time Machine, which depicted a human species divided between the privileged, childlike Eloi, and the monstrous underground Morlocks, who farm them for food.  Eloi enjoy but don’t decide; Morlocks decide but don’t enjoy.

The other archetypal example of someone with low rank in the pecking order but high decision-making power is the prophet. Biblical prophets told people what to do — they could even give instructions to the king — but they did not enjoy positions of privilege, palaces, many wives, hereditary lands, or anything like that.  They did sometimes have the power to threaten or punish, which is a sort of “executive” power, but not the power to personally enjoy more resources than others.

In American common parlance, “leadership” or “dominance” generally means both being at the top of a pecking order and being a decision-maker for the group.  My intuition and experience says that if somebody wants to be the decision-maker for the group but doesn’t seem to be conspicuously seeking & enjoying goodies in zero-sum contexts — in other words, if somebody behaves like a Morlock or prophet — they will read as not behaving like a “leader”, and will fail to get a certain kind of emotional trust and buy-in and active participation from others.

My previous post on hierarchy conflated pecking-order hierarchies with decision-making hierarchies. I said that people-telling-others-what-to-do (decision-making hierarchy) “usually goes along with” special privileges or luxuries for the superiors (pecking-order hierarchy.)  But, in fact, they are different things, and the distinction matters.

Most of the practical advantages of hierarchy in organizations come from decision-making hierarchy.  A tree structure, or chain of command, helps get decisions made more efficiently than many-to-many deliberative assemblies.  Many of the inefficiencies of hierarchy in organizations (expensive displays of deference, poor communication across power distance) are more about pecking-order hierarchy.  “So just have decision-making hierarchy without pecking-order hierarchy!” But that’s rule-by-prophets, and in practice people seem to HATE prophets.

The other model for leadership is the “good king”, of the kind that Siderea writes about in this series of posts on Watership Down.  The good king is not just sitting on top of the pecking order enjoying luxury at the expense of his people. He listens to his people and empowers them to do their best; he shares their privations; he is genuinely committed to the common good. But he’s still a king, not a prophet. (In Watership Down, there actually is a prophet — Fiver — and Hazel, the king, is notable for listening to Fiver, while bad leaders ignore their prophets.)

My guess is that the “good king” does sit on top of a pecking-order hierarchy, but a very mild and public-spirited one.  He’s generous, as opposed to greedy; but generosity implies that he could be greedy if he wanted to. He shares credit with others who do good work, instead of hogging all the credit for himself; but being the one to give credit itself makes him seem central and powerful.

A “good king” seems more emotionally sustainable for humans than just having a “prophet”, but it could be that there’s a way to implement pigeon-like parallel hierarchies for resource-enjoyment and decision-making, or other structures I haven’t thought of yet.

Degrees of Freedom

Something I’ve been thinking about for a while is the dual relationship between optimization and indifference, and the relationship between both of them and the idea of freedom.

Optimization: “Of all the possible actions available to me, which one is best? (by some criterion).  Ok, I’ll choose the best.”

Indifference: “Multiple possible options are equally good, or incommensurate (by the criterion I’m using). My decision algorithm equally allows me to take any of them.”

Total indifference between all options makes optimization impossible or vacuous. An optimization criterion which assigns a total ordering between all possibilities makes indifference vanishingly rare. So these notions are dual in a sense. Every dimension along which you optimize is in the domain of optimization; every dimension you leave “free” is in the domain of indifference.

Being “free” in one sense can mean “free to optimize”.  I choose the outcome that is best according to an internal criterion, which is not blocked by external barriers.  A limit on freedom is a constraint that keeps me away from my favorite choice. Either a natural limit (“I would like to do that but the technology doesn’t exist yet”) or a man-made limit (“I would like to do that but it’s illegal.”)

There’s an ambiguity here, of course, when it comes to whether you count “I would like to do that, but it would have a consequence I don’t like” as a limit on freedom.  Is that a barrier blocking you from the optimal choice, or is it simply another way of saying that it’s not an optimal choice after all?

And, in the latter case, isn’t that basically equivalent to saying there is no such thing as a barrier to free choice? After all, “I would like to do that, but it’s illegal” is effectively the same thing as “I would like to do that, but it has a consequence I don’t like, such as going to jail.” You can get around this ambiguity in a political context by distinguishing natural from social barriers, but that’s not a particularly principled distinction.

Another issue with freedom-as-optimization is that it’s compatible with quite tightly constrained behavior, in a way that’s not consistent with our primitive intuitions about freedom.  If you’re only “free” to do the optimal thing, that can mean you are free to do only one thing, all the time, as rigidly as a machine. If, for instance, you are only free to “act in your own best interests”, you don’t have the option to act against your best interests.  People in real life can feel constrained by following a rigid algorithm even when they agree it’s “best”; “but what if I want to do something that’s not best?”  Or, they can acknowledge they’re free to do what they choose, but are dismayed to learn that their choices are “dictated” as rigidly by habit and conditioning as they might have been by some human dictator.

An alternative notion of freedom might be freedom-as-arbitrariness: freedom in the sense of “degrees of freedom” or “free group”, deriving from the intuition that freedom means breadth of possibility rather than optimization power.  You are only free if you could equally well do any of a number of things, which ultimately means something like indifference.

This is the intuition behind claims like Viktor Frankl’s: “Between stimulus and response there is a space. In that space is our power to choose a response. In our response lies our growth and our freedom.”  If you always respond automatically to a given stimulus, you have only one choice, and that makes you unfree in the sense of “degrees of freedom.”

Venkat Rao’s concept of freedom is pretty much this freedom-as-arbitrariness, with some more specific wrinkles. He mentions degrees of freedom (“dimensionality”) as well as “inscrutability”, the inability to predict one’s motion from the outside.

Buddhists also often speak of freedom more literally in terms of indifference, and there’s a very straightforward logic to this; you can only choose equally between A and B if you have been “liberated” from the attractions and aversions that constrain you to choose A over B.  Those who insist that Buddhism is compatible with a fairly normal life say that after Buddhist practice you still will choose systematically most of the time — your utility function cannot fully flatten if you act like a living organism — but that, like Viktor Frankl’s ideal human, you will be able to reflect with equanimity and consider choosing B over A; you will be more “mentally flexible.”  Of course, some Buddhist texts simply say that you become actually indifferent, and that sufficient vipassana meditation will make you indistinguishable from a corpse.

Freedom-as-indifference, I think, is lurking behind our intuitions about things like “rights” or “ownership.” When we say you have a “right” to free speech — even a right bounded with certain limits, as it of course always is in practice — we mean that within those limits, you may speak however you want.  Your rights define a space, within which you may behave arbitrarily.  Not optimally. A right, if it’s not to be vacuous, must mean the right to behave “badly” in some way or other.  To own a piece of property means that, within whatever limits the concept of ownership sets, you may make use of it in any way you like, even in suboptimal ways.

This is very clearly illustrated by Glen Weyl’s notion of radical markets, which neatly disassociates two concepts usually both considered representative of free-market systems: ownership and economic efficiency.  To own something just is to be able to hang onto it even when it is economically inefficient to do so.  As Weyl says, “property is monopoly.”  The owner of a piece of land can sit on it, making no improvements, while holding out for a high price; the owner of intellectual property can sit on it without using it; in exactly the same way that a monopolist can sit on a factory and depress output while charging higher prices than he could get away with in a competitive market.

For better or for worse, rights and ownership define spaces in which you can destroy value.  If your car was subject to a perpetual auction and ownership tax as Weyl proposes, bashing your car to bits with a hammer would cost you even if you didn’t personally need a car, because it would hurt the rental or resale value and you’d still be paying tax.  On some psychological level, I think this means you couldn’t feel fully secure in your possessions, only probabilistically likely to be able to provide for your needs. You only truly own what you have a right to wreck.
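Weyl’s mechanism (the “common ownership self-assessed tax,” sometimes called a Harberger tax) can be sketched in a few lines. This is my own simplified illustration of the scheme described above, not code from Weyl: you name your own price for the asset, you pay an annual tax on that self-assessed price, and anyone may force a sale at that price. Understating the value invites a buyout; overstating it raises your tax bill; and wrecking the asset still leaves you taxed on whatever price you declare.

```python
from dataclasses import dataclass

# Illustrative tax rate: 1/16 per year, chosen for round numbers.
TAX_RATE = 0.0625

@dataclass
class Asset:
    owner: str
    declared_price: float  # the owner's self-assessed valuation

    def annual_tax(self) -> float:
        return TAX_RATE * self.declared_price

    def force_sale(self, buyer: str, offer: float) -> bool:
        """Anyone may buy at the declared price; the owner cannot refuse."""
        if offer >= self.declared_price:
            self.owner = buyer
            self.declared_price = offer
            return True
        return False

car = Asset(owner="alice", declared_price=10_000)
print(car.annual_tax())   # 625.0 per year

# If Alice wrecks the car, its value to others drops, but she still owes
# tax on her declared price; declaring a lower price invites a buyout.
car.declared_price = 2_000
print(car.annual_tax())   # 125.0
car.force_sale("bob", 2_000)
print(car.owner)          # bob
```

The self-assessment is what dissolves ownership-as-monopoly: there is no price at which you can both hold the asset cheaply and refuse all comers, which is why, under this scheme, you could never feel fully secure in a possession.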

Freedom-as-a-space-of-arbitrary-action is also, I think, an intuition behind the fact that society (all societies, but the US more than other rich countries, I think) is shaped by people’s desire for more discretion in decisionmaking as opposed to transparent rubrics.  College admissions, job applications, organizational codes of conduct, laws and tax codes, all are designed deliberately to allow ample discretion on the part of decisionmakers rather than restricting them to following “optimal” or “rational”, simple and legible, rules.  Some discretion is necessary to ensure good outcomes; a wise human decisionmaker can always make the right decision in some hard cases where a mechanical checklist fails, simply because the human has more cognitive processing power than the checklist.  This phenomenon is as old as Plato’s Laws and as current as the debate over algorithms and automation in medicine.  However, what we observe in the world is more discretion than would be necessary, for the aforementioned reasons of cognitive complexity, to generate socially beneficial outcomes.  We have discretion that enables corruption and special privileges in cases that pretty much nobody would claim to be ideal — rich parents buying their not-so-competent children Ivy League admissions, favored corporations voting themselves government subsidies.  Decisionmakers want the “freedom” to make illegible choices, choices which would look “suboptimal” by naively sensible metrics like “performance” or “efficiency”, choices they would prefer not to reveal or explain to the public.  Decisionmakers feel trapped when there’s too much “accountability” or “transparency”, and prefer a wider sphere of discretion.  Or, to put it more unfavorably, they want to be free to destroy value.

And this is true at an individual psychological level too, of course — we want to be free to “waste time” and resist pressure to account for literally everything we do. Proponents of optimization insist that this is simply a failure mode from picking the wrong optimization target — rest, socializing, and entertainment are also needs, the optimal amount of time to devote to them isn’t zero, and you don’t have to consider personal time to be “stolen” or “wasted” or “bad”, you can, in principle, legibilize your entire life including your pleasures. Anything you wish you could do “in the dark”, off the record, you could also do “in the light,” explicitly and fully accounted for.  If your boss uses “optimization” to mean overworking you, the problem is with your boss, not with optimization per se.

The freedom-as-arbitrariness impulse in us is skeptical.

I see optimization and arbitrariness everywhere now; I see intelligent people who more or less take one or the other as an ideology and see it as obviously correct.

Venkat Rao and Eric Weinstein are partisans of arbitrariness; they speak out in favor of “mediocrity” and against “excellence” respectively.  The rationale being, that being highly optimized at some widely appreciated metric — being very intelligent, or very efficient, or something like that — is often less valuable than being creative, generating something in a part of the world that is “dark” to the rest of us, that is not even on our map as something to value and thus appears as lack of value.  Ordinary people being “mediocre”, or talented people being “undisciplined” or “disreputable”, may be more creative than highly-optimized “top performers”.

Robin Hanson, by contrast, is a partisan of optimization; he speaks out against bias and unprincipled favoritism and in favor of systems like prediction markets which would force the “best ideas to win” in a fair competition.  Proponents of ideas like radical markets, universal basic income, open borders, income-sharing agreements, or smart contracts (I’d here include, for instance, Vitalik Buterin) are also optimization partisans.  These are legibilizing policies that, if optimally implemented, can always be Pareto improvements over the status quo; “whatever degree of wealth redistribution you prefer”, proponents claim, “surely it is better to achieve it in whatever way results in the least deadweight loss.”  This is the very reason that they are not the policies that public choice theory would predict would emerge naturally in governments. Legibilizing policies allow little scope for discretion, so they don’t let policymakers give illegible rewards to allies and punishments to enemies.  They reduce the scope of the “political”, i.e. that which is negotiated at the personal or group level, and replace it with an impersonal set of rules within which individuals are “free to choose” but not very “free to behave arbitrarily” since their actions are transparent and they must bear the costs of being in full view.

Optimization partisans are against weakly enforced rules — they say “if a rule is good, enforce it consistently; if a rule is bad, remove it; but selective enforcement is just another word for favoritism and corruption.”  Illegibility partisans say that weakly enforced rules are the only way to incorporate valuable information — precisely that information which enforcers do not feel they can make explicit, either because it’s controversial or because it’s too complex to verbalize. “If you make everything explicit, you’ll dumb everything in the world down to what the stupidest and most truculent members of the public will accept.  Say goodbye to any creative or challenging innovations!”

I see the value of arguments on both sides. However, I have positive (as opposed to normative) opinions that I don’t think everybody shares.  I think that the world I see around me is moving in the direction of greater arbitrariness and has been since WWII or so (when much of US society, including scientific and technological research, was organized along military lines).  I see arbitrariness as a thing that arises in “mature” or “late” organizations.  Bigger, older companies are more “political” and more monopolistic.  Bigger, older states and empires are more “corrupt” or “decadent.”

Arbitrariness has a tendency to protect those in power rather than out of power, though the correlation isn’t perfect.  Zones that protect your ability to do “whatever” you want without incurring costs (which include zones of privacy or property) are protective, conservative forces — they allow people security.  This often means protection for those who already have a lot; arbitrariness is often “elitist”; but it can also protect “underdogs” on the grounds of tradition, or protect them by shrouding them in secrecy.  (Scott thought “illegibility” was a valuable defense of marginalized peoples like the Roma. Illegibility is not always the province of the powerful and privileged.)  No; the people such zones of arbitrary, illegible freedom systematically harm are those who benefit from increased accountability and revealing of information. Whistleblowers and accusers; those who expect their merit/performance is good enough that displaying it will work to their advantage; those who call for change and want to display information to justify it; those who are newcomers or young and want a chance to demonstrate their value.

If your intuition is “you don’t know me, but you’ll like me if you give me a chance” or “you don’t know him, but you’ll be horrified when you find out what he did”, or “if you gave me a chance to explain, you’d agree”, or “if you just let me compete, I bet I could win”, then you want more optimization.

If your intuition is “I can’t explain, you wouldn’t understand” or “if you knew what I was really like, you’d see what an impostor I am”, or “malicious people will just use this information to take advantage of me and interpret everything in the worst possible light” or “I’m not for public consumption, I am my own sovereign person, I don’t owe everyone an explanation or justification for actions I have a right to do”, then you’ll want less optimization.

Of course, these aren’t so much static “personality traits” of a person as one’s assessment of the situation around oneself.  The latter cluster is an assumption that you’re living in a social environment where there’s very little concordance of interests — people knowing more about you will allow them to more effectively harm you.  The former cluster is an assumption that you’re living in an environment where there’s a great deal of concordance of interests — people knowing more about you will allow them to more effectively help you.

For instance, being “predictable” is, in Venkat’s writing, usually a bad thing, because it means you can be exploited by adversaries. Free people are “inscrutable.”  In other contexts, such as parenting, being predictable is a good thing, because you want your kids to have an easier time learning how to “work” the house rules.  You and your kid are not, most of the time, wily adversaries outwitting each other; conflicts are more likely to come from too much confusion or inconsistently enforced boundaries.  Relationship advice and management advice usually recommends making yourself easier for your partners and employees to understand, never more inscrutable.  (Sales advice, however, and occasionally advice for keeping romance alive in a marriage, sometimes recommends cultivating an aura of mystery, perhaps because it’s more adversarial.)

A related notion: wanting to join discussions is a sign of expecting a more cooperative world, while trying to keep people from joining your (private or illegible) communications is a sign of expecting a more adversarial world.

As social organizations “mature” and become larger, it becomes harder to enforce universal and impartial rules, harder to keep the larger population aligned on similar goals, and harder to comprehend the more complex phenomena in this larger group.  This means that there’s both motivation and opportunity to carve out “hidden” and “special” zones where arbitrary behavior can persist even when it would otherwise come with negative consequences.

New or small organizations, by contrast, must gain/create resources or die, so they have more motivation to “optimize” for resource production; and they’re simple, small, and/or homogeneous enough that legible optimization rules and goals and transparent communication are practical and widely embraced.  “Security” is not available to begin with, so people mostly seek opportunity instead.

This theory explains, for instance, why US public policy is more fragmented, discretionary, and special-case-y, and less efficient and technocratic, than it is in other developed countries: the US is more racially diverse, which means, in a world where racism exists, that US civil institutions have evolved to allow ample opportunities to “play favorites” (giving special legal privileges to those with clout) in full generality, because a large population has historically been highly motivated to “play favorites” on the basis of race.  Homogeneity makes a polity behave more like a “smaller” one, while diversity makes a polity behave more like a “larger” one.

Aesthetically, I think of optimization as corresponding to an “early” style, like Doric columns, or like Masaccio; simple, martial, all form and principle.  Arbitrariness corresponds to a “late” style, like Corinthian columns or like Rubens: elaborate, sensual, full of details and personality.

The basic argument for optimization over arbitrariness is that it creates growth and value while arbitrariness creates stagnation.

Arbitrariness can’t really argue for itself as well, because communication itself is on the other side.  Arbitrariness always looks illogical and inconsistent.  It kind of is illogical and inconsistent. All it can say is “I’m going to defend my right to be wrong, because I don’t trust the world to understand me when I have a counterintuitive or hard-to-express or controversial reason for my choice.  I don’t think I can get what I want by asking for it or explaining my reasons or playing ‘fair’.”  And from the outside, you can’t always tell the difference between someone who thinks (perhaps correctly!) that the game really is rigged against them at a profound level, and somebody who just wants to cheat or who isn’t thinking coherently.  Sufficiently advanced cynicism is indistinguishable from malice and stupidity.

For a fairly sympathetic example, you see something like Darkness at Noon, where the protagonist thinks, “Logic inexorably points to Stalinism; but Stalinism is awful! Therefore, let me insist on some space free from the depredations of logic, some space where justice can be tempered by mercy and reason by emotion.” From the distance of many years, it’s easy to say that’s silly, that of course there are reasons not to support Stalin’s purges, that it’s totally unnecessary to reject logic and justice in order to object to killing innocents.  But from inside the system, if all the arguments you know how to formulate are Stalinist, if all the “shoulds” and “oughts” around you are Stalinist, perhaps all you can articulate at first is “I know all this is right, of course, but I don’t like it.”

Not everything people call reason, logic, justice, or optimization is in fact reasonable, logical, just, or optimal; so a person needs some defenses against those claims of superiority: in particular, defenses that can shelter them even when they don’t know what’s wrong with the claims.  And that’s the closest thing we get to an argument in favor of arbitrariness. It’s actually not a bad point, in many contexts.  The counterargument usually has to boil down to hope — to a sense of “I bet we can do better.”

 

Personalized Medicine For Real

I was part of the founding team at MetaMed, a personalized medicine startup.  We went out of business back in 2015.  We made a lot of mistakes due to inexperience, some of which I deeply regret.

I’m reflecting on that now, because Perlara just went out of business, and they got a lot farther on our original dream than we ever did. Q-State Biosciences, which is still around, is using a similar model.

The phenomenon that inspired MetaMed is that we knew of stories of heroic, scientifically literate patients and families of patients with incurable diseases, who came up with cures for their own conditions.  Physicist Leo Szilard, the “father of the atom bomb”, designed a course of radiation therapy to cure his own bladder cancer.  Computer scientist Matt Might analyzed his son’s genome to find a cure for his rare disorder.  Cognitive scientist Joshua Tenenbaum found a personalized treatment for his father’s cancer.

So, we thought, could we try to scale up this process to help more people?

In Lois McMaster Bujold’s science fiction novels, the hero suffers an accident that leaves him with a seizure disorder. He goes to a medical research center and clinic, the Durona Group, and they design a neural prosthetic for him that prevents the seizures.

This sounds like it ought to be a thing that exists. Patient-led, bench-to-bedside drug discovery or medical device engineering.  You get an incurable disease, you fund scientists/doctors/engineers to discover a cure, and now others with the disease can also be cured.

There’s actually a growing community of organizations trying to do things sort of in this vein.  Recursion Pharmaceuticals, where I used to work, does drug discovery for rare diseases. Sv.ai organizes hackathons for analyzing genetic data to help patients with rare diseases find the root cause.  Perlara and Q-State use animal models and in-vitro models respectively to simulate patients’ disorders, and then look for drugs or gene therapies that reverse those disease phenotypes in the animals or cells.

Back at MetaMed, I think we were groping towards something like this, but never really found our way there.

One reason is that we didn’t narrow our focus enough.  We were trying to solve too many problems at once, all called “personalized medicine.”

Personalized Lifestyle Optimization

Some “personalized medicine” is about health optimization for basically healthy people. A lot of it amounts to superficial personalization on top of generic lifestyle advice. Harmless, but more of a marketing thing than a science thing, and not very interesting from a humanitarian perspective.  Sometimes, we tried to get clients from this market.  I pretty much always thought this was a bad idea.

Personalized Medicine For All

Some “personalized medicine” is about the claim that the best way to treat even common diseases often depends on individual factors, such as genes.

This was part of our pitch, but as I learned more, I came to believe that this kind of “personalization” has very little applicability.  In most cases, we don’t know enough about how genes affect response to treatment to be able to improve outcomes by stratifying treatments based on genes.  In the few cases where we know people with different genes need different treatments, it’s often already standard medical practice to run those tests.  I now think there’s not a clear opportunity for a startup to improve the baseline through this kind of personalized medicine.

Preventing Medical Error

Some of our founding inspirations were the work of Gerd Gigerenzer and Atul Gawande, who showed that medical errors were the cause of many deaths, that doctors tend to be statistically illiterate, and that systematizing tools like checklists and statistical prediction rules save lives.  We wanted to be part of the “evidence-based medicine” movement by helping patients whose doctors had failed them.

I now think that we weren’t really in a position to do that as a company that sold consultations to individual patients. Many of the improvements in systematization that were clearly “good buys” have, in fact, been implemented in hospitals since Gawande and Gigerenzer first wrote about them.  We never saw a clear-cut case of a patient whose doctors had “dropped the ball” by giving them an obviously wrong treatment, except where the patient was facing financial hardship and had to transfer to substandard medical care.  I think doctors don’t make true unforced errors in diagnosis or treatment plan that often; and medical errors like “operating on the wrong leg” that happen in fast-paced decisionmaking environments were necessarily outside our scope.  I think there might be an opportunity to do a lot better than baseline by building a “smart hospital” that runs on checklists, statistical prediction rules, outcomes monitoring, and other evidence-based practices — Intermountain is the closest thing I know about, and they do get great outcomes — but that’s an epically hard problem, it’s political as much as medical and technological, and we weren’t in a position to make any headway on it.

AI Diagnosis

We were also hoping to automate diagnosis and treatment planning in a personalized manner.  “Given your symptoms, demographics, and genetic & lab test data, and given published research on epidemiology and clinical experiments, what are the most likely candidate diagnoses for you, and what are the treatments most likely to be effective for you?”

I used to be a big believer in the potential of this approach, but in the process of actually trying to build the AI, I ran into obstacles which were fundamentally philosophical. (No, it’s not “machines don’t have empathy” or anything like that.  It’s about the irreducible dependence on how you frame the problem, which makes “expert systems” dependent on an impractical, expensive amount of human labor up front.)

Connecting Patients with Experimental Therapies

Yet another “personalized medicine” problem we were trying to solve is the fact that patients with incurable diseases have a hard time learning about and getting access to experimental therapies, and could use a consultant who would guide them through the process and help get them into studies of new treatments.

I still think this is a real and serious problem for patients, and potentially an opportunity for entrepreneurs.  (Either on the consulting model, or more on the software side, via creating tools for matching patients with clinical trials — since clinical trials also struggle to recruit patients.)  In order to focus on this model, though, we’d have had to invest a lot more than we did into high-touch relationships with patients and building a network of clinician-researchers we could connect them with.

When Standard Practice Doesn’t Match Scientific Evidence

One kind of “medical error” we did see on occasion was when the patient’s doctors are dutifully doing the treatment that’s “standard-of-care”, but the medical literature actually shows that the standard-of-care is wrong.

There are cases where large, well-conducted studies clearly show that treatment A and treatment B have the same efficacy but B has worse side effects, and yet, “first-line treatment” is B for some reason.

There are cases where there’s a lot of evidence that “standard” cut-offs are in the wrong place. Patients with “subclinical hypothyroidism” still benefit from supplemental thyroid hormone; higher-than-standard doses of allopurinol control gout better; “standard” light therapy for seasonal affective disorder doesn’t work as well as ultra-bright lights; etc.  More Dakka.

There are also cases where a scientist found an intervention effective, and published a striking result, and maybe it was even publicized widely in places like the New Yorker or Wired, but somehow clinicians never picked it up.  The classic example is Ramachandran’s mirror box experiment — it’s a famous experiment that showed that phantom limb pain can be reversed by creating an illusion with mirrors that allows the patient to fix their “body map.” There have since been quite a few randomized trials confirming that the mirror trick works. But, maybe because it’s not a typical kind of “treatment” like a drug, it’s not standard of care for phantom limb pain.

I think we were pretty successful at finding these kinds of mismatches between medical science and medical practice.  By their nature, though, these kinds of solutions are hard to scale to reach lots of people.

N=1 Translational Medicine for Rare Diseases

This is the use case of “personalized medicine” that I think can really shine.  It harnesses the incredible motivation of patients with rare incurable diseases and their family members; it’s one of the few cases where genetic data really does make a huge difference; and the path to scale is (relatively) obvious if you discover a new drug or treatment.  I think we should have focused much more tightly on this angle, and that a company based on bench-to-bedside discovery for rare diseases could still become the real-world “Durona Group”.

I think doing it right at MetaMed would have meant getting a lot more in-house expertise in biology and medicine than we ever had, more like Perlara and Q-State, which have their own experimental research programs, something we never got off the ground.

Speaking only about myself and not my teammates, while I was at MetaMed I was deeply embarrassed to be a layman in the biomedical field, and I felt like “why would an expert ever want to work with a layman like me?” So I was far too reluctant to reach out to prominent biologists and doctors. I now know that experts work with laymen all the time, especially when that layman brings strategic vision, funding, and logistical/operational manpower, and listens to the expert with genuine curiosity.  Laymen are valuable — just ask Mary Lasker!  I really wish I’d understood this at the time.

People overestimate progress in the short run and underestimate it in the long run.  “Biohackers” and “citizen science” and “N=1 experimentation” have been around for a while, but they haven’t, I think, gotten very far along towards the ultimate impact they’re likely to have in the future.  Naively, that can look a lot like “a few people tried that and it didn’t seem to go anywhere” when the situation is actually “the big break is still ahead of us.”

The Tale of Alice Almost: Strategies for Dealing With Pretty Good People

Suppose you value some virtue V and you want to encourage people to be better at it.  Suppose also you are something of a “thought leader” or “public intellectual” — you have some ability to influence the culture around you through speech or writing.

Suppose Alice Almost is much more V-virtuous than the average person — say, she’s in the top one percent of the population at the practice of V.  But she’s still exhibited some clear-cut failures of V.  She’s almost V-virtuous, but not quite.

How should you engage with Alice in discourse, and how should you talk about Alice, if your goal is to get people to be more V-virtuous?

Well, it depends on what your specific goal is.

Raising the Global Median

If your goal is to raise the general population’s median V level (for instance, if V is “understanding of how vaccines work” and your goal is to increase the proportion of people who vaccinate their children), you want to support Alice straightforwardly.

Alice is way above the median V level. It would be great if people became more like Alice. If Alice is a popular communicator, signal-boosting Alice is more likely to help than harm your cause.

For instance, suppose Alice makes a post telling parents to vaccinate their kids, but she gets a minor fact wrong along the way.  It’s still OK to quote or excerpt the true part of her post approvingly, or to praise her for coming out in favor of vaccines.

Even spreading the post with the incorrect statement included, while it’s definitely suboptimal for the cause of increasing the average person’s understanding of vaccines, is probably net positive, rather than net negative.

Raising the Median Among The Virtuous

What if, instead, you’re trying to promote V among a small sub-community who excel at it?  Say, the top 1% of the population in terms of V-virtue?

You might do this if your goal only requires a small number of people to practice exceptional virtue. For instance, to have an effective volunteer military doesn’t require all Americans to exhibit the virtues of a good soldier, just the ones who sign up for military service.

Now, within the community you’re trying to influence, Alice Almost isn’t way above average any more.  Alice is average. 

That means, you want to push people, including Alice, to be better than Alice is today.  Sure, Alice is already pretty V-virtuous compared to the general population, but by the community’s standards, the general population is pathetic.  

In this scenario, it makes sense to criticize Alice privately if you have a personal relationship with her.  It also makes sense to, at least sometimes, publicly point out how the Alice Almosts of the community are falling short of the ideal of V.  (Probably without naming names, unless Alice is already a famous public figure.)

Additionally, it makes sense to allow Alice to bear the usual negative consequences of her actions, and to publicly argue against anyone trying to shield her from normal consequences. For instance, if people who exhibit Alice-like failures of V are routinely fired from their jobs in your community, then if Alice gets fired, and her supporters get outraged about it, then it makes sense for you to argue that Alice deserved to be fired.

It does not make sense here to express outrage at Alice’s behavior, or to “punish” her as though she had committed a community norm violation.  Alice is normal — that means that behavior like Alice’s happens all the time, and that the community does not currently have effective, reliably enforced norms against behavior like hers.

Now, maybe the community should have stronger norms against her behavior!  But you have to explicitly make the case for that.  If you go around saying “Alice should be jailed because she did X”, and X isn’t illegal under current law, then you are wrong.  You first have to argue that X should be illegal.

If Alice’s failures of V-virtue are typical, then you do want to communicate the message that people should practice V more than Alice does.  But this will be news to your audience, not common knowledge, since many of them are no better than Alice.  To communicate effectively, you’ll have to take a tone of educating or sharing information: “Alice Almost, a well-known member of our community, just did X.  Many of us do X, in fact. But X is not good enough. We shouldn’t consider X okay any more. Here’s why.”

Enforcing Community Norms

What if Alice is inside the community of top-1%-V-virtue you care about, but noticeably worse than average at V or violating community standards for V?

That’s an easy case. Enforce the norms! That’s what they’re there for!

Continuing to enforce the usual penalties against failures of V, making it common knowledge that you do so, and supporting others who enforce penalties, keeps the “floor” of V in your community from falling, whether by deterrence or expulsion or both.

In terms of tone, it now makes sense for you to communicate in a more “judgmental” way, because it’s common knowledge that Alice did wrong.  You can say something like “Alice did X.  As you know, X is unacceptable/forbidden/substandard in our community. Therefore, we will be penalizing her in such-and-such a way, according to our well-known, established traditions/code/policy.”

Splintering off a “Remnant”

The previous three cases treated the boundaries of your community as static. What if we made them dynamic instead?

Suppose you’re not happy with the standard of V-virtue of “the top 1% of the population.”  You want to create a subcommunity with an even higher standard — let’s say, drawing from the top 0.1% of the population.

You might do this, for instance, if V is “degree of alignment/agreement with a policy agenda”, and you’re not making any progress with discourse/collaboration between people who are only mostly aligned with your agenda, so you want to form a smaller task force composed of a core of people who are hyper-aligned.

In that case, Alice Almost is normal for your current community, but she’s notably inferior in V-virtue compared to the standards of the splinter community you want to form.

Here, not only do you want to publicly criticize actions like Alice’s, but you even want to spend most of your time talking about how the Alice Almosts of the world fall short of the ideal V, as you advocate for the existence of your splinter group.  You want to reach out to the people who are better at V than Alice, even if they don’t know it themselves, and explain to them what the difference between top-1% V-virtue and top 0.1% V-virtue looks like, and why that difference matters.  You’re, in effect, empowering and encouraging them to notice that they’re not Alice’s peers any more, they’ve leveled up beyond her, and they don’t have to make excuses for her any more.

Just like in the case where Alice is a typical member of your community and you want to push your community to do better, your criticisms of Alice will be news to much of your audience, so you have to take an “educational/informational” tone. Even the people in the top 0.1% “remnant” may not be aware yet that there’s anything wrong with Alice’s behavior.

However, you’re now speaking primarily to the top 0.1%, not the top 1%, so you can now afford to be somewhat more insulting towards Alice.  You’re trying to create norms for a future community in which Alice’s behavior will be considered unacceptable/substandard, so you can start to introduce the frame where Alice-like behavior is “immoral”, “incompetent”, “outrageous”, or otherwise failing to meet a reasonable person’s minimum expectations.

Expanding Community Membership

Let’s say you’re doing just the opposite. You think your community is too selective.  You want to expand its boundaries to, say, a group drawn from the top 10% of the population in V-virtue.  Your goals may require you to raise the V-levels of a wider audience than you’d been speaking to before.

In this case, you’re more or less in the same position as in the first case where you’re just trying to raise the global median.  You should support Alice Almost (as much as possible without yourself imitating or compounding her failures), laud her as a role model, and not make a big public deal about the fact that she falls short of the ideal; most of the people you’re trying to reach fall short even farther.

What if Alice is Diluting Community Values?

Now, what if Alice Almost is the one trying to expand community membership to include people lower in V-virtue … and you don’t agree with that?

Now, Alice is your opponent.

In all the previous cases, the worst Alice did was drag down the community’s median V level, either directly or by being a role model for others.  But we had no reason to suppose she was optimizing for lowering the median V level of the community.  Once Alice is trying to “popularize” or “expand” the community, that changes. She’s actively trying to lower median V in your community — that is, she’s optimizing for the opposite of what you want.

This means that, not only should you criticize Alice, enforce existing community norms that forbid her behavior, and argue that community standards should become stricter against Alice-like, 1%-level failures of V-virtue, but you should also optimize against Alice gaining more power generally.

(But what if Alice succeeds in expanding the community size 10x and raising the median V level within the larger community by 10x or more, such that the median V level still increases from where it is now? Wouldn’t Alice’s goals be aligned with your goals then?  Yeah, but we can assume we’re in a regime where increasing V levels is very hard — a reasonable assumption if you think about the track record of trying to teach ethics or instill virtue in large numbers of people — so such a huge persuasive/rhetorical win is unlikely.)

Alice, for her part, will see you as optimizing against her goals (she wants to grow the community and you want to prevent that) so she’ll have reason to optimize generally against you gaining more power.

Alice Almost and you are now in a zero-sum game.  You are direct opponents, even though both of you are, compared to the general population, both very high in V-virtue.

Alice Almost in this scenario is a Sociopath, in the Chapman sense — she’s trying to expand and dilute the subculture.   And Sociopaths are not just a little bad for the survival of the subculture, they are an existential threat to it, even though they are only a little weaker in the defining skills/virtues of the subculture than the Geeks who founded it.  In the long run, it’s not about where you are, it’s where you’re aiming, and the Sociopaths are aiming down.

Of course, getting locked into a zero-sum game is bad if you can avoid it.  Misidentifying Alice as a Sociopath when she isn’t, or missing an opportunity to dialogue with her and come to agreement about how big the community really needs to be, is costly.  You don’t want to be hasty or paranoid in reading people as opponents.  But there’s a very, very big difference between how you deal with someone who just happened to do something that blocked your goal, and how you deal with someone who is persistently optimizing against your goal.

Humans Who Are Not Concentrating Are Not General Intelligences

Recently, OpenAI came out with a new language model that automatically synthesizes text, called GPT-2.

It’s disturbingly good.  You can see some examples (cherry-picked, by their own admission) in OpenAI’s post and in the related technical paper.

I’m not going to write about the machine learning here, but about the examples and what we can infer from them.

The scary thing about GPT-2-generated text is that it flows very naturally if you’re just skimming, reading for writing style and key, evocative words.  The “unicorn” sample reads like a real science press release. The “theft of nuclear material” sample reads like a real news story. The “Miley Cyrus shoplifting” sample reads like a real post from a celebrity gossip site.  The “GPT-2” sample reads like a real OpenAI press release. The “Legolas and Gimli” sample reads like a real fantasy novel. The “Civil War homework assignment” reads like a real C-student’s paper.  The “JFK acceptance speech” reads like a real politician’s speech.  The “recycling” sample reads like a real right-wing screed.

If I just skim, without focusing, they all look totally normal. I would not have noticed they were machine-generated. I would not have noticed anything amiss about them at all.

But if I read with focus, I notice that they don’t make a lot of logical sense.

For instance, in the unicorn sample:

The scientist named the population, after their distinctive horn, Ovid’s Unicorn. These four-horned, silver-white unicorns were previously unknown to science.

Wait a second, “Ovid” doesn’t refer to a “distinctive horn”, so why would naming them “Ovid’s Unicorn” be naming them after a distinctive horn?  Also, you just said they had one horn, so why are you saying they have four horns in the next sentence?

While their origins are still unclear, some believe that perhaps the creatures were created when a human and a unicorn met each other in a time before human civilization. According to Pérez, “In South America, such incidents seem to be quite common.”

Wait, unicorns originated from the interbreeding of humans and … unicorns?  That’s circular, isn’t it?

Or, look at the GPT-2 sample:

We believe this project is the first step in the direction of developing large NLP systems without task-specific training data. That is, we are developing a machine language system in the generative style with no explicit rules for producing text.

Except the second sentence isn’t a restatement of the first sentence — “task-specific training data” and “explicit rules for producing text” aren’t synonyms!  So saying “That is” doesn’t make sense.

Or look at the LOTR sample:

Aragorn drew his sword, and the Battle of Fangorn was won. As they marched out through the thicket the morning mist cleared, and the day turned to dusk.

Yeah, day doesn’t turn to dusk in the morning.

Or in the “resurrected JFK” sample:

(1) The brain of JFK was harvested and reconstructed via tissue sampling. There was no way that the tissue could be transported by air. (2) A sample was collected from the area around his upper chest and sent to the University of Maryland for analysis. A human brain at that point would be about one and a half cubic centimeters. The data were then analyzed along with material that was obtained from the original brain to produce a reconstruction; in layman’s terms, a “mesh” of brain tissue.

His brain tissue was harvested…from his chest?!  A human brain is one and a half cubic centimeters?!

So, ok, this isn’t actually human-equivalent writing ability. OpenAI doesn’t claim it is, for what it’s worth — I’m not trying to diminish their accomplishment, that’s not the point of this post.  The point is, if you skim text, you miss obvious absurdities.  The point is OpenAI HAS achieved the ability to pass the Turing test against humans on autopilot.

The point is, I know of a few people, acquaintances of mine, who, even when asked to try to find flaws, could not detect anything weird or mistaken in the GPT-2-generated samples.

There are probably a lot of people who would be completely taken in by literal “fake news”, as in, computer-generated fake articles and blog posts.  This is pretty alarming.  Even more alarming: unless I make a conscious effort to read carefully, I would be one of them.

Robin Hanson’s post Better Babblers is very relevant here.  He claims, and I don’t think he’s exaggerating, that a lot of human speech is simply generated by “low order correlations”, that is, generating sentences or paragraphs that are statistically likely to come after previous sentences or paragraphs:

After eighteen years of being a professor, I’ve graded many student essays. And while I usually try to teach a deep structure of concepts, what the median student actually learns seems to mostly be a set of low order correlations. They know what words to use, which words tend to go together, which combinations tend to have positive associations, and so on. But if you ask an exam question where the deep structure answer differs from the answer you’d guess looking at low order correlations, most students usually give the wrong answer.

Simple correlations also seem sufficient to capture most polite conversation talk, such as the weather is nice, how is your mother’s illness, and damn that other political party. Simple correlations are also most of what I see in inspirational TED talks, and when public intellectuals and talk show guests pontificate on topics they really don’t understand, such as quantum mechanics, consciousness, postmodernism, or the need always for more regulation everywhere. After all, media entertainers don’t need to understand deep structures any better than do their audiences.

Let me call styles of talking (or music, etc.) that rely mostly on low order correlations “babbling”. Babbling isn’t meaningless, but to ignorant audiences it often appears to be based on a deeper understanding than is actually the case. When done well, babbling can be entertaining, comforting, titillating, or exciting. It just isn’t usually a good place to learn deep insight.
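Hanson’s “low order correlations” can be made concrete with a toy sketch. This is emphatically not how GPT-2 works — GPT-2 is a large Transformer trained on next-token prediction over an enormous corpus — but a bigram Markov chain, which picks each next word from the words that followed the current word in its training text, is the lowest-order version of the same idea. The tiny corpus below is made up for illustration:

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Record, for each word, the words that follow it (with multiplicity)."""
    words = text.split()
    following = defaultdict(list)
    for prev, nxt in zip(words, words[1:]):
        following[prev].append(nxt)
    return following

def babble(following, start, length=10, seed=0):
    """Generate text by repeatedly sampling a statistically likely next word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        options = following.get(out[-1])
        if not options:  # dead end: this word never had a successor
            break
        out.append(rng.choice(options))
    return " ".join(out)

corpus = "the weather is nice and the weather is mild and the mother is well"
model = train_bigrams(corpus)
print(babble(model, "the"))  # locally plausible, globally meaningless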

I used to half-joke that the New Age Bullshit Generator was actually useful as a way to get myself to feel more optimistic. The truth is, it isn’t quite good enough to match the “aura” or “associations” of genuine, human-created inspirational text. GPT-2, though, is.

I also suspect that the “lyrical” or “free-associational” function of poetry is adequately matched by GPT-2.  The autocompletions of Howl read a lot like Allen Ginsberg — they just don’t imply the same beliefs about the world.  (Moloch whose heart is crying for justice! sounds rather positive.)

I’ve noticed that I cannot tell, from casual conversation, whether someone is intelligent in the IQ sense.

I’ve interviewed job applicants, and perceived them all as “bright and impressive”, but found that the vast majority of them could not solve a simple math problem.  The ones who could solve the problem didn’t appear any “brighter” in conversation than the ones who couldn’t.

I’ve taught public school teachers, who were incredibly bad at formal mathematical reasoning (I know, because I graded their tests), to the point that I had not realized humans could be that bad at math — but it had no effect on how they came across in friendly conversation after hours. They didn’t seem “dopey” or “slow”, they were witty and engaging and warm.

I’ve read the personal blogs of intellectually disabled people — people who, by definition, score poorly on IQ tests — and they don’t read as any less funny or creative or relatable than anyone else.

Whatever ability IQ tests and math tests measure, I believe that lacking that ability doesn’t have any effect on one’s ability to make a good social impression or even to “seem smart” in conversation.

If “human intelligence” is about reasoning ability, the capacity to detect whether arguments make sense, then you simply do not need human intelligence to create a linguistic style or aesthetic that can fool our pattern-recognition apparatus if we don’t concentrate on parsing content.

I also noticed, upon reading GPT2 samples, just how often my brain slides from focused attention to just skimming. I read the paper’s sample about Spanish history with interest, and the GPT2-generated text was obviously absurd. My eyes glazed over during the sample about video games, since I don’t care about video games, and the machine-generated text looked totally unobjectionable to me. My brain is constantly making evaluations about what’s worth the trouble to focus on, and what’s ok to tune out. GPT2 is actually really useful as a *test* of one’s level of attention.

This is related to my hypothesis in https://srconstantin.wordpress.com/2017/10/10/distinctions-in-types-of-thought/ that effortless pattern-recognition is what machine learning can do today, while effortful attention, and explicit reasoning (which seems to be a subset of effortful attention) is generally beyond ML’s current capabilities.

Beta waves in the brain are usually associated with focused concentration or active or anxious thought, while alpha waves are associated with the relaxed state of being awake but with closed eyes, before falling asleep, or while dreaming. Alpha waves sharply reduce after a subject makes a mistake and begins paying closer attention. I’d be interested to see whether ability to tell GPT2-generated text from human-generated text correlates with alpha waves vs. beta waves.

The first-order effects of highly effective text-generators are scary. It will be incredibly easy and cheap to fool people, to manipulate social movements, etc. There’s a lot of opportunity for bad actors to take advantage of this.

The second-order effects might well be good, though. If only conscious, focused logical thought can detect a bot, maybe some people will become more aware of when they’re thinking actively vs not, and will be able to flag when they’re not really focusing, and distinguish the impressions they absorb in a state of autopilot from “real learning”.

The mental motion of “I didn’t really parse that paragraph, but sure, whatever, I’ll take the author’s word for it” is, in my introspective experience, absolutely identical to “I didn’t really parse that paragraph because it was bot-generated and didn’t make any sense so I couldn’t possibly have parsed it”, except that in the first case, I assume that the error lies with me rather than the text.  This is not a safe assumption in a post-GPT2 world. Instead of “default to humility” (assume that when you don’t understand a passage, the passage is true and you’re just missing something) the ideal mental action in a world full of bots is “default to null” (if you don’t understand a passage, assume you’re in the same epistemic state as if you’d never read it at all.)

Maybe practice and experience with GPT2 will help people get better at doing “default to null”?

The Relationship Between Hierarchy and Wealth

Epistemic Status: Tentative

I’m fairly anti-hierarchical, as things go, but the big challenge to all anti-hierarchical ideologies is “how feasible is this in real life? We don’t see many examples around us of this working well.”

Backing up, for a second, what do we mean by a hierarchy?

I take it to mean a very simple thing: hierarchies are systems of social organization where some people tell others what to do, and the subordinates are forced to obey the superiors.  This usually goes along with special privileges or luxuries that are only available to the superiors.  For instance, patriarchy is a hierarchy in which wives and children must obey fathers, and male heads of families get special privileges.

Hierarchy is a matter of degree, of course. Power can vary in the severity of its enforcement penalties (a government can jail you or execute you, an employer can fire you, a religion can excommunicate you, the popular kids in a high school can bully or ostracize you), in its extent (a totalitarian government claims authority over more aspects of your life than a liberal one), or in its scale (an emperor rules over more people than a clan chieftain.)

Power distance is a concept from the business world that attempts to measure the level of hierarchy within an organization or culture.  Power distance is measured by polling less-powerful individuals on how much they “accept and expect that power is distributed unequally”.  In low power distance cultures, there’s more of an “open door” policy, subordinates can talk freely with managers, and there are few formal symbols of status differentiating managers from subordinates.  In “high power distance” cultures, there’s more formality, and subordinates are expected to be more deferential.  According to Geert Hofstede, the inventor of the power distance index (PDI), Israel and the Nordic countries have the lowest power distance index in the world, while Arab, Southeast Asian, and Latin American countries have the highest.  (The US is in the middle.)

I share with many other people a rough intuition that hierarchy poses problems.

This may not be as obvious as it sounds.  In high power distance cultures, empirically, subordinates accept and approve of hierarchy.  So maybe hierarchy is just fine, even for the “losers” at the bottom?  But there’s a theory that subordinates claim to approve of hierarchy as a covert way of getting what power they can.   In other words, when you see peasants praising the benevolence of landowners, it’s not that they’re misled by the governing ideology, and not that they’re magically immune to the suffering we would feel in their place, but just that they see their situation as the best they can get, and a combination of flattery and (usually religious) guilt-tripping is their best chance for getting resources from the landowners.  So, no, I don’t think you can assume that hierarchy is wholly harmless just because it’s widely accepted in some societies. Being powerless is probably bad, physiologically and psychologically, for all social mammals.

But to what extent is hierarchy necessary?

Structurelessness and Structures

Nominally non-hierarchical organizations often suffer from failure modes that keep them from getting anything done, and actually wind up quite hierarchical in practice. I don’t endorse everything in Jo Freeman’s famous essay on the Tyranny of Structurelessness, but it’s important as an account of actual experiences in the women’s movement of the 1970s.

When organizations have no formal procedures or appointed leaders, everything goes through informal networks; this devolves into popularity contests, privileges people who have more free time to spend on gossip, as well as people who are more privileged in other ways (including economically), and completely fails to correlate decision-making power with competence.

Freeman’s preferred solution is to give up on total structurelessness and accept that there will be positions of power in feminist organizations, but to make those positions of power legible and limited, with methods derived from republican governance (which are also traditional in American voluntary organizations.)  Positions of authority should be limited in scope (there is a finite range of things an executive director is empowered to do), accountable to the rest of the organization (through means like voting and annual reports), and impeachable in cases of serious ethical violation or incompetence. This is basically the governance structure that nonprofits and corporations use, and (in my view) it helps make them, say, less likely to abuse their members than cults and less likely to break up over personal drama than rock bands.

Freeman, being more egalitarian than the republican tradition, also goes further with her recommendations and says that responsibilities should be rotated (so no one person has “ownership” over a job forever), that authority should be distributed widely rather than concentrated, that information should be diffused widely, and that everyone in the organization should have equal access to organizational resources.  Now, this is a good deal less hierarchical than the structure of republican governments, nonprofits, and corporations; it is still pretty utopian from the point of view of someone used to those forms of governance, and I find myself wondering if it can work at scale; but it’s still a concession to hierarchy relative to the “natural” structurelessness that feminist organizations originally envisioned.

Freeman says there is one context in which a structureless organization can work; a very small team (no more than five) of people who come from very similar backgrounds (so they can communicate easily), spend so much time together that they practically live together (so they communicate constantly), and are all capable of doing all “jobs” on the project (no need for formal division of labor.)  In other words, she’s describing an early-stage startup!

I suspect Jo Freeman’s model explains a lot about the common phenomenon of startups having “growing pains” when they get too large to work informally.  I also suspect that this is a part of how startups stop being “mission-driven” and ambitious — if they don’t add structure until they’re forced to by an outside emergency, they have to hurry, and they adopt a standard corporate structure and power dynamics (including the toxic ones, which are automatically imported when they hire a bunch of people from a toxic business culture all at once) instead of having time to evolve something that might achieve the founders’ goals better.

But Can It Scale? Historical Stateless Societies

So, the five-person team of friends is a non-hierarchical organization that can work.  But that’s not very satisfying for anti-authoritarian advocates, because it’s so small.  And, accordingly, an organization that small is usually poor — there are only so many resources that five people can produce.

(Technology can amplify how much value a single person can produce. This is probably why we see more informal cultures among people who work with high-leverage technology.  Software engineers famously wear t-shirts, not suits; Air Force pilots have a reputation as “hotshots” with lax military discipline compared to other servicemembers. Empowered with software or an airplane, a single individual can be unusually valuable, so  less deference is expected of the operators of high technology.)

When we look at historical anarchies or near-anarchies, we usually also see that they’re small, poor, or both.  We also see that within cultures, there is often surprisingly more freedom for women among the poor than among the rich.

Medieval Iceland from the tenth to thirteenth centuries was a stateless society, with private courts of law, and competing legislative assemblies (Icelanders could choose which assembly and legal code to belong to), but no executive branch or police.  (In this, it was an unusually pure form of anarchy but not unique — other medieval European polities had much more private enforcement of law than we do today, and police are a 19th-century invention.)

The medieval Icelandic commonwealth lasted long enough — longer than the United States has so far — that it was clear this was a functioning system, not a brief failed experiment.  And it appears that it was less violent, not more, compared to other medieval societies.  Even when the commonwealth was beginning to break down in the thirteenth century, battles had low casualty rates, because every man killed still had to be paid for!  The death toll during the civil war that ended the commonwealth’s independence was only as high per capita as the current murder rate of the US.  While Christianization in neighboring Norway was a violent struggle, the decision of whether to convert to Christianity in Iceland was decided peacefully through arbitration.  In this case, it seems clear that anarchy brought peace, not war.

However, medieval Iceland was small — only 50,000 people, confined to a harsh Arctic environment, and ethnically homogeneous.

Other historical and traditional stateless societies are and were also relatively poor and low in population density. The Igbo of Nigeria traditionally governed themselves by council and consensus, with no kings or chiefs, but rather a sort of village democracy.   This actually appears to be fairly common in small polities.  The Iroquois Confederacy governed by council and had no executive. (Note that the Iroquois are a hoe culture.)  The Nuer of Sudan, a pastoral society currently with a population of a few million, have traditionally had a stateless society with a system of feud law — they had judges, but no executives. There are many more examples — perhaps most familiar to Westerners, the society depicted in the biblical book of Judges appears to have had no king and no permanent war-leader, but only judges who would decide cases which would be privately enforced. In fact, stateless societies with some form of feud law seem to be a pretty standard and recurrent type of political organization, but mostly in “primitive” communities — horticultural or pastoral, low in population density.  This sounds like bad news for modern-day anarchists who don’t want to live in primitive conditions. None of these historical stateless societies, even the comparatively sophisticated Iceland, are urban cultures!

It’s possible that the Harappan civilization in Bronze Age India had no state, while it had cities that housed tens of thousands of people, were planned on grids, and had indoor plumbing.  The Harappans left no massive tombs, no palaces or temples, houses of highly uniform size (indicating little wealth inequality), no armor and few weapons (despite advanced metalworking), no sign of battle damage on the cities or violent death in human remains, and very minimal city walls.  The Harappan cities were commercial centers, and the Harappans engaged in trade along the coast of India and as far as Afghanistan and the Persian Gulf.  Unlike other similar river-valley civilizations (such as Mesopotamia), the Harappans had so much arable land, and farmsteads so initially spread out, that populations steadily grew and facilitated long-distance trade without having to resort to raiding, so they never developed a warrior class.  If so, this is a counterexample to the traditional story that all civilizations developed states (usually monarchies) as a necessary precondition to developing cities and grain agriculture.

Bali is another counterexample.  Rice farming in Bali requires complex coordination of irrigation. This was traditionally organized not by kings, but by subaks, religious and social organizations that manage the growing of rice, coordinated through a decentralized system of water temples and led by priests who kept a ritual calendar for timing irrigation.  While precolonial Bali was not an anarchy but a patchwork of small principalities, large public works like irrigation were not under state control.

So we have reason to believe that Bronze Age levels of technological development (cities, metalworking, intensive agriculture, literacy, long-distance trade, and high populations) can be developed without states, at scales involving millions of people, for centuries.  We also have much more abundant evidence, historical and contemporary, of informal governance-by-council and feud law existing stably at lower technology levels (for pastoralists and horticulturalists).  And, in special political circumstances (the Icelanders left Norway to settle a barren island, to escape the power of the Norwegian king, Harald Fairhair) an anarchy can arise out of a state society.

But we don’t have successful examples of anarchies at industrial tech levels. We know industrial-technology public works can be built by voluntary organizations (e.g. the railroads in the US) but we have no examples of them successfully resisting state takeover for more than a few decades.

Is there something about modern levels of high technology and material abundance that is incompatible with stateless societies? Or is it just that modern nation-states happened to already be there when the Industrial Revolution came around?

Women’s Status and Material Abundance

A very weird thing is that women’s level of freedom and equality seems almost to anticorrelate with wealth and technological advancement.

Horticultural (or “hoe culture”) societies are non-patriarchal and tend to allow women more freedom and better treatment in various ways than pre-industrial agricultural societies. For instance, severe mistreatment of women and girls like female infanticide, foot-binding, honor killings, or sati, and chastity-oriented restrictions on female freedom like veiling and seclusion, are common in agricultural societies and unknown in horticultural ones. But horticultural societies are poor in material culture and can’t sustain high population densities in most cases.

You also see unusual freedom for women in premodern pastoral cultures, like the Mongols. Women in the Mongol Empire owned and managed ordos, mobile cities of tents and wagons which also comprised livestock and served as trading hubs.  While the men focused on hunting and war, the women managed the economic sphere. Mongol women fought in battle, herded livestock, and occasionally ruled as queens.  They did not wear veils or bind their feet.

We see numerous accounts of ancient and medieval women warriors and military commanders among Germanic and Celtic tribes and steppe peoples of Central Asia.  There are also accounts of medieval European noblewomen who personally led armies. The pattern isn’t obvious, but there seem to be more accounts of women military leaders in pastoral societies or tribal ones than in large, settled empires.

Pastoralism, to a lesser extent than horticulture but still more than plow agriculture, gives women an active role in food production. Most pastoral societies today have a traditional division of labor in which men are responsible for meat animals and women are responsible for milk animals (as well as textiles).  Where women provide food, they tend to have more bargaining power.  Some pastoral societies, like the Tuareg, are even matrilineal; Tuareg women traditionally have more freedom, including sexual freedom, than they do in other Muslim cultures, and women do not wear the veil while men do.

Like horticulture, pastoralism is less efficient per acre at food production than agriculture, and thus does not allow high population densities. Pastoralists are poorer than their settled farming neighbors. This is another example of women being freer when they are also poorer.

Another weird and “paradoxical” but very well-replicated finding is that women are more different from men  in psychological and behavioral traits (like Big 5 personality traits, risk-taking,  altruism, participation in STEM careers) in richer countries than in poorer ones.  This isn’t quite the same as women being less “free” or having fewer rights, but it seems to fly in the face of the conventional notion that as societies grow richer, women become more equal to men.

Finally, within societies, it’s sometimes the case that poor women are treated better than rich ones.  Sarah Blaffer Hrdy writes about observing that female infanticide was much more common among wealthy Indian Rajput families than poor ones. And we know of many examples across societies of aristocratic or upper-class women being more restricted to the domestic sphere, married off younger, less likely to work, more likely to experience restrictive practices like seclusion or footbinding, than their poorer counterparts.

Hrdy explains why: in patrilineal societies, men inherit wealth and women don’t. If you’re a rich family, a son is a “safe” outcome — he’ll inherit your wealth, and your grandchildren through him will be provided for, no matter whom he marries. A daughter, on the other hand, is a risk. You’ll have to pay a dowry when she marries, and if she marries “down” her children will be poorer than you are — and at the very top of the social pyramid, there’s nowhere to marry but down.  This means that you have an incentive to avoid having daughters, and if you do have daughters, you’ll be very anxious to avoid them making a bad match, which means lots of chastity-enforcement practices. You’ll also invest more in your sons than daughters in general, because your grandchildren through your sons will have a better chance in life than your grandchildren through your daughters.

The situation reverses if you’re a poor family. Your sons are pretty much screwed; they can’t marry into money (since women don’t inherit.) Your daughters, on the other hand, have a chance to marry up. So your grandchildren through your daughters have better chances than your grandchildren through your sons, and you should invest more resources in your daughters than your sons. Moreover, you might not be able to afford restrictive practices that cripple your daughters’ ability to work for a living. To some extent, sexism is a luxury good.

A similar analysis might explain why richer countries have larger gender differences in personality, interests, and career choices.  A degree in art history might function as a gentler equivalent of purdah — a practice that makes a woman a more appealing spouse but reduces her earning potential. You expect to find such practices more among the rich than the poor.  (Tyler Cowen’s take is less jaundiced, and more general, but similar — personal choices and “personality” itself are more varied when people are richer, because one of the things people “buy” with wealth is the ability to make fulfilling but not strictly pragmatic self-expressive choices.)

Finally, all these “paradoxical” trends are countered by the big nonparadoxical trend — by most reasonable standards, women are less oppressed in rich liberal countries than in poor illiberal ones.  The very best countries for women’s rights are also the ones with the lowest power distance: Nordic and Germanic countries.

Is Hierarchy the Engine of Growth or a Luxury Good?

If you observe that the “freest” (least hierarchical, lowest power distance, least authoritarian, etc) functioning organizations and societies tend to be small, poor, or primitive, you could come to two different conclusions:

  1. Freedom causes poverty (in other words, non-hierarchical organization is worse than hierarchy at scaling to large organizations or rich, high-population societies)
  2. Hierarchy is expensive (in other words, only the largest organizations or richest societies can afford the greatest degree of authoritarianism.)

The first possibility is bad news for freedom. It means you should worry you can’t scale up to wealth for large populations without implementing hierarchies.  The usual mechanism proposed for this is the hypothesis that hierarchies are needed to coordinate large numbers of people in large projects.  Without governments, how would you build public works? Or guard the seas for global travel and shipping? Without corporate hierarchies, how would you get mass-produced products to billions of people?  Sure, goes the story, idealists have proposed alternatives to hierarchy, and we know of intriguing counterexamples like Harappa and Bali, but these tend to be speculative or small-scale, and the success stories are sporadic.

The second possibility is tentatively good news for freedom.  It says that hierarchy is inefficient.  For instance, secluding women in harems wastes their productive potential. Top-down state control of the economy causes knowledge problems that limit economic productivity. The same problem applies to top-down control of decisionmaking in large firms.  Dominance hierarchies inhibit accurate transmission of information, which worsens knowledge problems and principal-agent problems (“communication is only possible between equals.”)  And elaborate displays of power and deference are costly, as nonproductive displays always are.  Only accumulations of large amounts of resources enable such wasteful activity, which benefits the top of the hierarchy in the short run but prevents the “pie” of total resources from growing.

This means that if you could just figure out a way to keep inefficient hierarchies from forming, you could grow systems to be larger and richer than ever.  Yes, historically, Western economies grew richer as states grew stronger — but perhaps a stateless society could be richer still.  Perhaps without the stagnating effects of rent-seeking, we could be hugely better off.

After all, this is kind of what liberalism did. It’s the big counter-trend to “wealth and despotism go together” — Western liberal-democratic countries are much richer and much less authoritarian (and less oppressive to women) than any pre-modern society, or than developing countries. One of the observations in Wealth of Nations is that countries with strong middle classes had more subsequent economic growth than countries with more wealth inequality — Smith uses England as an example of a fast-growing, equal society and China as an example of a stagnant, unequal one.

But this is only partial good news for freedom, after all. If hierarchies tend to emerge as soon as size, scale, and wealth arise, then that means we don’t have a solution to the problem of preventing them from emerging. On a model where any sufficiently large accumulation of resources begins to look attractive to “robber barons” who want to appropriate it and forcibly keep others out, we might hypothesize that a natural evolution of all human institutions is from an initial period of growth and value production towards inevitable value capture, stagnation, and decline.  We see a lack of freedom in the world around us, not because freedom can’t work well, but because it’s hard to preserve against the incursions of wannabe despots, who eventually ruin the system for everyone including themselves.

That model points the way to new questions, surrounding the kinds of governance that Jo Freeman talks about. By default an organization will succumb to inefficient hierarchy, and structureless organizations will succumb faster and to more toxic hierarchies. When designing governance structures, the question you want to ask is not just “is this a system I’d want to live under today?” but “how effective will this system be in the future at resisting the guys who will come along and try to take over and milk it for short-term personal gain until it collapses?”  And now we’re starting to sound like the rationale and reasoning behind the U.S. Constitution, though I certainly don’t think that’s the last word on the subject.