The World is Simple

In the world of image and signal processing, a paper usually takes the form “We can prove that this algorithm gives such-and-such accuracy on images that have such-and-such regularity property.  We tried it on some examples and it worked pretty well.  We’re going to assume that most of the images we might care about have the regularity property.”

For instance, sparse coding and compressed sensing techniques assume that the representation of the image in some dictionary or basis is “sparse”, i.e. has very few nonzero coefficients.

There’s some biological justification for this: the mammalian brain seems to recognize images with an overcomplete dictionary of Gabor filters, only a few of which are firing at any given time.

There’s a basic underlying assumption that the world is, in some sense, simple. This is related to ideas around the “unreasonable effectiveness of mathematics.”  Observations from nature can be expressed compactly.  That’s what it means to live in an intelligible universe.

But what does this mean specifically?

One observation is that the power spectrum, that is, the squared magnitude of the (discrete) Fourier transform, of natural images obeys a power law

S(k) = \frac{A}{k^{2-\eta}}

where \eta is usually small.  It’s been hypothesized that this is because natural images are composed of statistically independent objects, whose scales follow a power-law distribution.

What does this mean?  One way of thinking about it is that a signal with a power-law spectrum has structure at all scales.  Such signals are loosely referred to as “pink noise” (strictly speaking, pink noise is the 1/f case).  You can generate a signal with a power-law spectrum by defining a “fractional Brownian motion”.  This is like a Brownian motion, except the increment from time t to time s is normally distributed with mean zero and variance |t-s|^{2H} for the Hurst exponent H, which equals 1/2 in the special case of a Brownian motion.  The covariance function of a fractional Brownian motion is homogeneous of degree 2H.  The sample paths of a fractional Brownian motion are Hölder-continuous with any exponent less than H.
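To make the homogeneity claim concrete, here is a short derivation. It is a sketch under the standard normalization B_H(0) = 0, which the text does not state explicitly; the symbols B_H and R are introduced here for illustration.

```latex
% Increment variance, as described in the text:
\mathbb{E}\big[(B_H(t) - B_H(s))^2\big] = |t-s|^{2H}
% Taking s = 0 gives \mathbb{E}[B_H(t)^2] = |t|^{2H}. Expanding the square:
R(t,s) = \mathbb{E}\big[B_H(t)\,B_H(s)\big]
       = \tfrac{1}{2}\left(|t|^{2H} + |s|^{2H} - |t-s|^{2H}\right)
% Homogeneity of degree 2H:
R(ct, cs) = |c|^{2H}\, R(t,s)
% Special case H = 1/2 with t, s \ge 0 (ordinary Brownian motion):
R(t,s) = \tfrac{1}{2}\left(t + s - |t-s|\right) = \min(t,s)
```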

As a matter of fact, any function whose wavelet transform is homogeneous of degree \lambda is a fractional Brownian motion of degree (\lambda-1)/2.

Cosma Shalizi has a good post on this phenomenon.  Systems in thermodynamic equilibrium, i.e. “boring” systems, have correlations that decay exponentially in space and time. Systems going through phase transitions, like turbulent flows, and like most things you’ll observe in nature, have correlations that decay slower, with a power law. There are many simple explanations for why things might wind up being power-law-ish.

Imagine you have some set of piles, each of which grows, multiplicatively, at a constant rate. New piles are started at random times, with a constant probability per unit time. (This is a good model of my office.) Then, at any time, the age of the piles is exponentially distributed, and their size is an exponential function of their age; the two exponentials cancel and give you a power-law size distribution. The basic combination of exponential growth and random observation times turns out to work even if it’s only the mean size of piles which grows exponentially.
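The pile argument can be checked numerically. What follows is a minimal sketch, not the calculation from the quoted post: it assumes pile ages are exponentially distributed with rate `birth_rate` and that every pile grows as exp(`growth_rate` × age) from size 1. The function names and parameter values are made up for illustration. Under those assumptions P(size > x) = x^(-birth_rate/growth_rate), so a maximum-likelihood estimate of the tail exponent should land near that ratio.

```python
import math
import random


def simulate_pile_sizes(n, birth_rate, growth_rate, seed=0):
    """Sizes of n piles observed at a random moment (illustrative model)."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(n):
        age = rng.expovariate(birth_rate)          # exponentially distributed age
        sizes.append(math.exp(growth_rate * age))  # multiplicative growth from size 1
    return sizes


def tail_exponent(sizes):
    """Maximum-likelihood exponent a for P(size > x) = x**(-a), with x >= 1."""
    return len(sizes) / sum(math.log(s) for s in sizes)


sizes = simulate_pile_sizes(n=100_000, birth_rate=1.0, growth_rate=0.5)
estimate = tail_exponent(sizes)
print(f"estimated tail exponent: {estimate:.3f}")
print(f"predicted birth_rate / growth_rate: {1.0 / 0.5:.3f}")
```

The two exponentials cancel exactly as described: the logarithm of the size is just a rescaled age, so the estimator reduces to the ratio of the two rates.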

If we’re in a domain with \eta < 1 or H > 1/2, we’re looking at a function with a square-summable Fourier transform.  This is why L^2 assumptions are not completely insane in the domain of signal processing, and why it makes sense to apply wavelet (and related) transforms and truncate after finitely many coefficients.
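The square-summability condition on \eta can be spelled out with a short calculation. This is a sketch under an assumption the text does not state: k is treated as a one-dimensional positive integer frequency index, and S(k) is read as the squared magnitude of the k-th Fourier coefficient \hat{f}(k), a symbol introduced here for illustration.

```latex
\sum_{k=1}^{\infty} |\hat{f}(k)|^2
  = \sum_{k=1}^{\infty} S(k)
  = A \sum_{k=1}^{\infty} \frac{1}{k^{2-\eta}}
  < \infty
  \quad\Longleftrightarrow\quad 2 - \eta > 1
  \quad\Longleftrightarrow\quad \eta < 1 .
```

This is the usual p-series test. The convergence threshold depends on the dimension of the frequency index, so the calculation only illustrates the one-dimensional case.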

Not all regularity assumptions in the signal-processing and machine-learning world are warranted.  Image processing is full of bounded-variation methods like Mumford-Shah, and at least one paper claims that natural images are not actually BV.  My own research deals with the fact that Riemannian assumptions in manifold learning are not realistic for most real-world datasets, and that manifold-learning methods need to be extended to sub-Riemannian manifolds (or control manifolds).  And I’m genuinely uncertain when sparsity assumptions are applicable.

But decaying power spectra are a pretty robust empirical observation, and not for shocking reasons. It’s a pretty modest criterion of “simplicity” or “intelligibility”, and it’s easy to reproduce with simple random processes, so it’s kind of unsurprising that we see it all over the place. And it allows us to pretend we live in Hilbert spaces, which is always a win, because you can assume that Fourier transforms converge and discrete approximations of projections onto orthogonal bases/dictionaries are permissible.

Power-law spectrum decay is a sort of minimum assumption of simplicity that we can expect to see in all kinds of data sets that were not generated “adversarially.”  It’s not a great mystery that the universe is set up that way; it’s what we would expect.  Stronger assumptions of simplicity are more likely to be domain-specific.  If you know what you’re looking at (it’s a line drawing; it’s a set of photographs of the same thing taken at different angles; it’s a small number of anomalies in an otherwise featureless plane; etc) you can make more stringent claims about sparsity or regularity.  But there’s a certain amount of simplicity that we’re justified in assuming almost by default, which allows us to use the tools of applied harmonic analysis in the first place.

Beyond the One Percent: Categorizing Extreme Elites

A lot of people talk about “1%” as though it were synonymous with “almost nothing.”  When it comes to people, though, that’s extremely misleading.  One percent of the US population is more than three million people!

Confused thinking is especially common when we talk about extreme elites, of achievement or wealth.  If “top 1%” means millions of people, what about even smaller, even more extreme elites?  The top 0.01% is as far removed from the 1% as the 1% is from the general population; and yet that’s still tens of thousands of people!  How do you have any kind of gauge for these numbers?
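The arithmetic behind these group sizes is easy to reproduce. The snippet below is a rough sketch: the population figure of 318 million is an assumed round number, not one given in this post, and `group_size` is a hypothetical helper name.

```python
US_POPULATION = 318_000_000  # assumed round figure, not from the post


def group_size(percent, population=US_POPULATION):
    """Number of people in the top `percent` of the population."""
    return round(population * percent / 100)


# Each step down the list is a group one tenth the size of the previous one.
for percent in (1, 0.1, 0.01, 0.001):
    print(f"top {percent}%: about {group_size(percent):,} people")

# The 0.01% is a hundredth of the 1%, just as the 1% is a hundredth of everyone.
ratio = group_size(1) / group_size(0.01)
print(f"the 1% is about {ratio:.0f} times larger than the 0.01%")
```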

Because human intuition is evolved for much smaller social groups than the United States, our mental models can be very badly wrong. If you’re a mathematician at a top-tier school, it feels like “lots” of people are at that level of mathematical ability.  To you, that’s “normal”, so you don’t have much intuition for exactly how rare it is.  Anecdotally, it seems very common for intellectual elites to implicitly imagine that the community of people “like them” is orders of magnitude bigger than it actually is.

So I’ve done a little “Powers of Ten” exercise, categorizing elite groups by size and giving a few illustrative examples.  All numbers are for the US. Fermi calculations have been used liberally.

Of course, people don’t belong to one-and-only-one group: you could be a One-Percenter in money, an Elite in programming ability, and average in athletic ability.

Historical figures: people who achieve things of a caliber only seen a few times a century. People who show up in encyclopedias and history books.

Superstars: people who win prizes that are only awarded to a handful of people a year or so — there are usually dozens alive/active at that level at any given time. Nobel Prize winners (per field) and Fields medalists. Movie stars and pop music celebrities. Cabinet members. Tennis grand slam winners and Olympic medalists (per event).  People at the superstar level of wealth are household names and have tens of billions of dollars in net worth.  Groups of superstars are usually too small to develop a distinctive community or culture.

Leaders: members of a group of several hundred. International Mathematical Olympiad contestants. National Academy of Sciences or American Academy of Arts and Sciences members, per field. Senators and congressmen. NBA players. Generals (in the US military). Billionaires. Groups of Leaders form roughly Dunbar-sized tribes: a Leader can personally get to know all the people at his level.

Ultra-elites: members of a group of a few thousand. PhDs from top-ten universities, by department. Chess grandmasters. Major league baseball players. TED speakers. Fashion models.

Elites: members of a group of tens of thousands. “Ultra high net worth individuals” owning more than $30 million in assets. Google software engineers. AIME qualifiers. Symphony orchestra musicians. Groups of Elites are about the size of the citizen population of classical Athens, or the number of Burning Man attendees. Too large to get to know everyone personally; small enough to govern by assembly and participate in collective rituals.

Aristocrats: members of a group of hundreds of thousands. Ivy League alumni. Doctors. Lawyers. Officers (in US military). People of IQ over 145. People with household incomes of over $1 million a year (the “0.1%”). Groups of Aristocrats are large enough to be professions, as in law or medicine, or classes, like the career military class or the socioeconomic upper class.

One-percenters: members of a group of a few million. Engineers. Programmers.  People of IQ over 130, or people who scored over 1500 on the SAT (out of 1600). People who pass the Cognitive Reflection Test. People with over $1 million in assets, or household income over $200,000. If you are in a group of One-Percenters, it’s a whole world; you have little conception of what it might be like to be outside that group, and you may have never had a serious conversation with someone outside it.


Fun with BLS statistics

What do people in America do for a living?

What is a “normal” job, statistically?

What are the best-paying jobs?

Most of us don’t know, even though these are incredibly relevant facts for career choice, education, and having some idea of what kind of country you live in.  And even though all the statistics are available free to the public from the Bureau of Labor Statistics!

What Jobs Pay Best?

Doctors. Definitely doctors. The top ten highest mean annual wage occupations are all medical specialties. Anesthesiologists top the list, with an average salary of $235,070.

Obviously doctors are not the richest people in the US. The Forbes 400 consists largely of executives.  But “chief executive” as a profession actually ranks behind “psychiatrist.” The average CEO makes $178,400 a year.

Dentists, nurse anaesthetists, and petroleum engineers make over $150,000 a year. Managers of all sorts, as well as lawyers, range in the $120,000-$140,000s.

Air traffic controllers make about as much as physicists, at $118,000 a year.

Yep, you got that right: the average air traffic controller is slightly richer than the average physicist.

Physicists are the richest pure-science specialty, followed by astronomers and computer scientists ($110,000) and mathematicians ($103,000).  Actuaries, software engineers, computer hardware engineers, and nuclear, aerospace, and chemical engineers, cluster around the $100,000-110,000 range.

Bottom line: if you want a high-EV profession, be a doctor. Or a dentist — the pay is almost as good. The “professions” — medicine, law, engineering — are, in fact, high-paying, and sort by income in that order.  It is, obviously, good to be a manager; but still not as good as being a doctor. Going into the hard sciences is, as far as income goes, basically the same as going into engineering. It’s the bottom of the 6-figure range.  There are a few underappreciated jobs, like air traffic controllers, pilots, anaesthetists, pharmacists, actuaries, and optometrists, which aren’t generally given as much social status as doctors and lawyers, but pay comparably.

What Jobs Pay Worst?

Flipping burgers. It’s not just a punchline: fast food cooks are the lowest-paid occupation, at $18,870 a year.

For comparison purposes, the federal poverty line for a single person is given at $11,670, and for a family of four at $23,850. So a burger-flipper is only technically living in poverty if she supports at least two dependents. 15% of Americans live below the poverty line.  Since a fair number (19%) of people living alone are poor, this suggests that unemployment or underemployment is a bigger factor in poverty than low wages.
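The “at least two dependents” figure can be reconstructed from the two poverty lines quoted above. This is a minimal sketch that assumes the poverty line rises by a fixed amount for each additional household member; that linear rule, and the function names, are assumptions made for illustration rather than something stated in the text.

```python
POVERTY_LINE_1 = 11_670       # single person, from the text
POVERTY_LINE_4 = 23_850       # family of four, from the text
FAST_FOOD_COOK_WAGE = 18_870  # from the text

# Assumption: the line rises by a fixed amount per additional person.
PER_EXTRA_PERSON = (POVERTY_LINE_4 - POVERTY_LINE_1) / 3


def poverty_line(household_size):
    """Interpolated poverty line for a household of the given size."""
    return POVERTY_LINE_1 + PER_EXTRA_PERSON * (household_size - 1)


def dependents_needed(wage):
    """Smallest number of dependents at which `wage` falls below the line."""
    dependents = 0
    while wage >= poverty_line(1 + dependents):
        dependents += 1
    return dependents


print(f"increment per extra person: ${PER_EXTRA_PERSON:,.0f}")
print(f"dependents needed: {dependents_needed(FAST_FOOD_COOK_WAGE)}")
```

Under this assumption, the wage clears the interpolated line for a household of two but falls short of the line for a household of three.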

We have a lot of low-paid fast-food cooks and servers. Three million Americans work in fast-food preparation and service.

The lowest of the low-paid jobs, making under $30,000 a year, are service workers. Cooks, cashiers, desk clerks, maids, bartenders, parking lot attendants, manicurists.  When somebody waits on you in a commercial establishment, you’re looking at one of the poorest people who have jobs at all.

The other kind of ultra-low-paid jobs are laborers. Agricultural workers, graders and sorters, cleaners of vehicles and equipment, meat cutters and trimmers and meatpackers, building cleaning and pest control workers. Groundskeeping workers.  Not, it’s important to note, people who work in manufacturing and repair; most of those jobs are in the $30,000-$40,000 range.

As you get to the top of the <$30,000 range, you begin to see office workers. Office clerks (and there are two million of them!) get paid about $29,000 a year. Data entry. File clerks. Despite living in the age of computers, we still have lots of people whose jobs are low-level paperwork. And they’re very poorly paid.

This is the depressing side of the income scale.  Where are all the poor people? They’re in customer service, unskilled labor, or low-level office work.

Who is the Middle Class?

The median US household income is $51,000.  The average household is 2.55 people.  The median US salary is $48,872.  (This seems to imply that most wage earners support at least one dependent.)  So let’s look at jobs that pay around the median.

Firefighters, at $48,270. Social workers, at $48,370, as well as librarians, at $47,750, counselors, at $47,820, teachers, at $54,740, and clergy, at $47,540. Fine artists, at $50,900, and graphic designers, at $49,610.   Things like “mine cutting and channeling machine operators”, “aircraft cargo handling supervisors”, “tool and die makers”, “civil engineering technicians”, “derrick operators, oil and gas”, “explosive workers, ordnance handling experts, and blasters”, “railroad brake, signal, and switch operators”, and so on, get paid in the $48,000-51,000 range.  Basically, jobs that involve the skilled use of machinery, the actual making and operating of an industrial civilization.

Who is the middle class? “Teachers and firemen” isn’t far off, as stereotypes go.  It’s mostly unionized jobs, either in the “helping professions” or in manufacturing/industry.

How do you get a job like that?  For example, CNC programmers mostly have associate’s degrees (36%) or post-secondary certificates (31%), with a smaller share holding college degrees (15%). You need to pass a licensing exam and spend several years as an apprentice.  Mining machine operators, on the other hand, mostly don’t even need a high school diploma. Tool and die makers need a post-secondary certificate but generally not a college degree. By contrast, you usually need a master’s degree to be a counselor, for comparable pay.

Where do most people work?

Of the broad sectors defined by the BLS, the most common is “office and administrative support occupations.”

Who are these? Things like “data entry keyers”, “human resources assistants”, “shipping clerks”, “payroll and timekeeping clerks”, and so on. They make an average salary of $34,900, and they are mostly employed by government, banks, hospitals and medical practices.  A full 16% of employed Americans work in this sector.

The second most common sector is “sales and related occupations.”

Who are these? Everything from counter clerks to real estate brokers to sales engineers, but not management of sales departments.  The mean annual wage is $38,200 — most people in “sales” are clerks in stores (grocery stores, department stores, clothing stores, etc.) 14 million people work in sales altogether, around 11% of employed Americans.

The next most common sector is “food preparation and services”, at 8% of employed Americans.  The mean wage is $21,580.
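The sector shares above can be turned into rough head counts. This is a back-of-the-envelope sketch: it backs out total employment from the sales figures (14 million workers, around 11%), so every result inherits the rounding in those two inputs. The variable names are made up for illustration.

```python
SALES_WORKERS = 14_000_000  # from the text
SALES_SHARE = 0.11          # "around 11%", from the text

# Implied total employment; rough, because both inputs are rounded.
total_employed = SALES_WORKERS / SALES_SHARE

sector_shares = {
    "office and administrative support": 0.16,
    "sales and related": 0.11,
    "food preparation and services": 0.08,
}
for sector, share in sector_shares.items():
    print(f"{sector}: about {share * total_employed / 1e6:.1f} million")

top_three_share = sum(sector_shares.values())
print(f"top three sectors: {top_three_share:.0%} of employed Americans")
```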

By single occupation, the most common occupations in America are “retail sales workers”, “food and beverage serving workers”, and “information and record clerks.”  We are, more than anything else, a nation of shitty retail jobs.

We have a lot of school teachers (4 million), a lot of people working in construction (3.7 million), a lot of nurses (2.7 million) and health technicians (2.8 million).  But the most common occupations are very heavily weighted towards retail, service, unskilled labor, and low-level office work.

What about the arts and sciences?

Shockingly, there are only 3030 mathematicians. Maybe a lot of them are calling themselves something else, like the 89,740 “post-secondary math and computer teachers”, though that’s hardly how I’d describe my professors.  There are 24,950 statisticians, 24,380 computer scientists, 17,340 physicists, and 87,560 chemists.

By contrast, there are 1.4 million software developers and programmers. In my little bubble, it feels like almost all the smart people wind up as software engineers; by the numbers, it looks like this is more or less true. All non-software engineers combined only make up 1.5 million jobs.  I hear a lot of rhetoric about “Silicon Valley only does software, real atom-pushing engineering technology is lagging” — I don’t have a basis for evaluating the truth of that, but we definitely have a lot of people employed in software compared to the rest of engineering.

There are 87,240 artists, more than half of whom are animators and art directors; there are 420,130 designers;  there are 63,230 actors, 39,260 musicians and singers, 11,540 dancers, and 43,590 writers.  Writers don’t actually do so badly: average wage is $69,250.  For all the hand-wringing about the end of writing as a profession, it’s still a real job.

There are a ton of doctors (623,380) and almost as many therapists (600,650).  Therapists here refers to physical therapists, occupational therapists, speech therapists, and so on, not psychological counselors.  There are far more people lower on the totem pole: 2.8 million medical technicians, 2.7 million registered nurses, and 3.9 million “healthcare support occupations” (nurses’ aides, orderlies, etc.  These fall into the lowest-paid category, average yearly income $28,300.)

There are 592,670 lawyers, and 27,190 judges.

Basically, when it comes to the arts and professions, doctors and lawyers are the most common as well as the best-paid, followed by engineers and programmers, and then scientists and artists.

What does the BLS tell you about what you should do for a living?

Of course, it depends on who you are and what resources are available to you. But here are a few things that jumped out at me.

1.) The most reliable way to make a high salary is to be a doctor.  There is absolutely no ambiguity on that point.

2.) Programming/engineering/hard science and management are the skills involved in most of the top-paid jobs.

3.) The best-paid job that doesn’t require a college degree is airline pilot. If you’re broke or you hate school, consider learning to fly.

4.) Writers and visual artists are not that poor, so long as they’re willing to work on commercial projects.

EDIT: Michael Vassar has questioned the numbers of doctors and lawyers.  It turns out the BLS numbers may be slight underestimates but aren’t too far off from other sources.

The Kaiser Foundation says there are 834,769 “professionally active physicians” in the US, as of 2012.  The Federation of State Medical Boards is giving the number 878,194 for licensed physicians as of 2012. We have roughly one physician for every 400 people, according to the World Bank.

The ABA gives 1,225,452 licensed lawyers.  Harvard Law School says the BLS numbers are lower because there are more people licensed to practice law than currently employed as attorneys.

All in all, I’m fairly confident that the number of “professionals” (doctors, lawyers, and engineers, including software engineers) is around 5 million, and likely not more than 10 million. It’s two or three percent of the population.
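The estimate can be sanity-checked by adding up the counts quoted in this post. This is a rough sketch: the population figure of 318 million is an assumed round number, and the “licensed” scenario simply swaps in the higher physician and lawyer counts from the edit above.

```python
US_POPULATION = 318_000_000  # assumed round figure, not from the post

# BLS counts quoted in the post.
bls = {
    "doctors": 623_380,
    "lawyers": 592_670,
    "software developers and programmers": 1_400_000,
    "other engineers": 1_500_000,
}
# Higher licensing-based counts quoted in the edit for doctors and lawyers.
licensed = dict(bls, doctors=878_194, lawyers=1_225_452)

for label, counts in (("BLS", bls), ("licensed", licensed)):
    total = sum(counts.values())
    share = total / US_POPULATION
    print(f"{label}: {total:,} professionals, {share:.1%} of population")
```

With the licensing-based counts the total comes out at roughly 5 million, the low end of the range given above.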

Taste and Consumerism

Peter Drucker, whose writings form the intellectual foundation behind the modern management corporation, defined “consumerism” as follows:

What consumerism demands of business is that it actually market.  It demands that business start out with the needs, the realities, the values of the customer.  It demands that business base its reward on its contribution to the customer. … It does not ask “What do we want to sell?” but “What does the customer want to buy?”  It does not say, “This is what our product or service does.” It says “These are the satisfactions the customer looks for, values, and needs.”

Peter Drucker, Management

A consumerist business, then, is like an optimization process, and customer feedback (in the form of sales, surveys, complaints, usage statistics, and so on) is its reward function.  A consumerist business is exquisitely sensitive to customer feedback, and adapts continually in order to better satisfy customers. The consumerist philosophy is antithetical to preconceived ideas about what the company “should” make.  Lean Startups, an extreme implementation of consumerist philosophy, don’t even start with a definite idea of what the product is; the company constantly evolves into selling whatever customers want to buy.

Another way of thinking about this: in a market, there are many possible voluntary trades that could happen.  A consumerist company tries to swim towards one of these trade points and slot itself into a convenient niche.  The whole purpose of trade is to produce win-win exchanges; “consumerism” just means being flexible enough to be willing to search through all the possibilities, instead of leaving opportunities unexploited. 

Yet another, more negative, slant on consumerism is that consumerism is the absence of taste.

A manager, according to Drucker, should not ask “What do we want to sell?”  But an artist always asks “What do I want to make?”

Computer scientist Richard Hamming famously said:

And I started asking, “What are the important problems of your field?” And after a week or so, “What important problems are you working on?” And after some more time I came in one day and said, “If what you are doing is not important, and if you don’t think it is going to lead to something important, why are you at Bell Labs working on it?”

A scientist, in other words, has to care what he’s working on.  Problems that are interesting, that have the potential to be world-changing.  Any good scientist is intrinsically motivated by the problem.  If you told Hamming you’d pay him a million dollars to crochet shawls all year, he’d laugh and refuse.  If he were the kind of person who could be induced to quit working on information theory, he wouldn’t be Hamming in the first place.

Ira Glass on creativity and taste:

All of us who do creative work … we get into it because we have good taste. But it’s like there’s a gap, that for the first couple years that you’re making stuff, what you’re making isn’t so good, OK? It’s not that great. It’s really not that great. It’s trying to be good, it has ambition to be good, but it’s not quite that good. But your taste — the thing that got you into the game — your taste is still killer, and your taste is good enough that you can tell that what you’re making is kind of a disappointment to you, you know what I mean?

J.D. Salinger on writing:

You wrote down that you were a writer by profession. It sounded to me like the loveliest euphemism I’ve ever heard. When was writing ever your profession? It’s never been anything but your religion. Never…

If only you’d remember before ever you sit down to write that you’ve been a reader long before you were ever a writer. You simply fix that fact in your mind, then sit very still and ask yourself, as a reader, what piece of writing in all the world Buddy Glass would most want to read if he had his heart’s choice. The next step is terrible, but so simple I can hardly believe it as I write it. You just sit down shamelessly and write the thing yourself. I won’t even underline that. It’s too important to be underlined. 

Eric S. Raymond on software:

Every good work of software starts by scratching a developer’s personal itch.

There’s very clearly a tradition, across the creative disciplines, that a creator must be intrinsically motivated by love of the work and by the ambition to make something great.  Great by what standard?  Well, this is often informed by the standards of the professional community, but it’s heavily driven by the creator’s own taste.  She has some sense of what makes a great photograph, what makes a beautiful proof, what makes an ingenious design.  

Is taste universal? Is there some sense in which Beethoven’s 9th is “really” good?  Is there some algorithmic regularity in it, or some resonance with the human ear, something that makes its value more than a matter of opinion?  Maybe, and maybe not.  I’m inclined to be intrigued by, but skeptical of, simple explanations of what humans find beautiful, like Schmidhuber’s notion of low Kolmogorov complexity.  My own speculation is that hidden symmetry or simplicity is also a fundamental principle of aesthetics: a perfect circle is all right, but an intricate and non-obvious pattern, which takes more attention to notice, is more interesting to the eye, because minds take pleasure in recognition.

Whether there are some universal principles behind aesthetics or not, in practice aesthetics are mediated through individual taste. You cannot write a book by committee, or by optimizing around a dashboard of reader feedback stats.  You can’t write a proof that way either.  

Creative original work isn’t infinitely fungible and modifiable, like other commodities. The mindset of infinitely flexible responsiveness to feedback is extremely different from the mindset of focused creation of a particular thing.  The former involves lots of task switching; the latter involves blocks of uninterrupted time.  You can’t be a maker and a manager at the same time.  Managing, responding to feedback, being a “consumerist,” requires engaging your social brain: modeling people’s responses to what you do, and adapting accordingly.  Making things involves turning that part of your brain off, and engaging directly with physical objects and senses, or abstract concepts.

Creative work is relevant to businesses.  Design, for instance, matters. So does technological innovation.  But, for a consumerist business, the constraints of creative work are unwelcome limitations.  Makers want to make a particular thing, while the company as a whole needs to find any niche where it can be profitable.

Drucker defines “knowledge workers” as skilled experts, whose loyalty is stronger to their profession than to their company.  They’ll introduce themselves with “I’m a natural language processing guy”, not “I work for IBM.”  Drucker’s “knowledge workers” seem somewhat analogous to “makers.” A cynical view of his book Management is that it’s about how to organize and motivate knowledge workers without giving them any real power.  The NLP guy’s goal is to make a tool that does an excellent job at machine translation. The manager’s goal is to promote the growth and survival of the organization.  These goals are, ideally, aligned, but when they conflict, in a Druckerian organization, the manager’s goal has to take priority.

What this means is that makers, people with taste, have a few options.

1. Work for a manager in a successful company. You’ll have serious constraints on the type of work you do, and you won’t be able to capture much of its financial value, but your work will be likely to be implemented at a large scale out in the world, and you’ll have steady income.

2. Have a small lifestyle business that caters only to the few who share your taste.  You’ll never have much money, and you won’t have large-scale impact on the world, but you’ll be able to keep your aesthetic integrity absolutely.

3. Find a patron. (Universities are the classic example, but this applies to some organizations that are nominally companies as well. A hedge fund that has a supercomputer to model protein folding is engaging in patronage.  Family money is an edge case of patronage.)  A patron is a high-status individual or group that seeks to enhance its status by funding exceptional creators and giving them freedom in their work.  You can make a moderate amount of money, you’ll get a lot of creative freedom (but you’ll be uncertain how much or for how long) and you might be able to have quite a lot of impact. The main problem here is uncertainty, because patrons are selective and their gifts often have strings attached.

4. Start a business that bets hard on your taste.  If you’re Steve Jobs, or Larry Page, your personal vision coincides with market success. You can win big on all fronts: money, impact, and creative freedom.  The risk is, of course, that the overwhelming majority of people trying this strategy fail, and you’re likely to wind up with much less freedom than with options 1-3.

Howard Roark, the prototypical absolutist of personal taste, picked option 2: he made the buildings he liked, for the people who shared his taste in architecture, refused to engage in any marketing whatsoever, and was nearly broke most of the time.  In fact, Ayn Rand, who has a reputation as a champion of big business, is if anything a consistent advocate of a sort of Stoic retirement. You’d be happier and more virtuous if you gave up trying to “make it big,” and instead went to a small town to ply your craft.  “Making it”, in the sense of wealth or fame or power, means making yourself beholden to lots of people and losing your individuality. 

I’m not sure I’m that much of a hipster. I don’t think the obvious thing for a creative person to do is “retirement.”  Especially not if you care about scope.  If you’ve designed a self-driving car, you don’t want to make one prototype, you want a fleet of self-driving taxis on the streets of New York.  Even more so, if you’ve discovered a cure for a disease, you want it prescribed in hospitals everywhere, not just a home remedy for your family.

What I actually plan to do is something between 1 and 3 (tech companies increasingly seem to straddle the line between patrons and employers, though I’m not certain what that looks like on the inside) and explore what it would take to do 4.