The Long Journey Out Of The Wilderness

by Patrick Appel

Bruce Bartlett explains why he is "anti-Republican":

I will know that the party is on the path to recovery when someone in a position of influence reaches out to former Republicans like me. We are the most likely group among independents to vote Republican. But I see no effort to do so. All I see is pandering to the party’s crazies like the birthers. In the short run that may be enough to pick up a few congressional seats next year, but I see no way a Republican can retake the White House for the foreseeable future. Both CBO and OMB are predicting better than 4% real growth in 2011 and 2012. If those numbers are even remotely correct Obama will have it in the bag. Also, Republicans have to find a way to win some minority votes because it is not viable as a whites-only party in presidential elections. That’s why I wrote my Wrong on Race book, which no one read.

Does Universal Health Care Reduce Employment?

by Jim Manzi

Three academics claim to have a preliminary answer with the provenance of empirical science. William H. Dow, Arindrajit Dube and Carrie Hoverman Colla recently had an editorial in the New York Times arguing that San Francisco’s “near universal health care program” initiated early last year has not contributed to reduced employment despite the fact that “many businesses there had to raise their health spending substantially to meet the new requirements.”

How do they know the impact of this regulation on employment in San Francisco, when so many factors influence employment? They obviously can’t just look at whether employment went up or down after the law was passed. They need to answer the question: “But for the introduction of this regulation, what would employment have been?” The way they do this is identify a control group of other localities that did not introduce this change, and use this to proxy for what the change in employment in San Francisco would have been but for the introduction of this regulation. In the editorial they say “the early results are in”, and:

As of December 2008, there was no indication that San Francisco’s employment grew more slowly after the enactment of the employer-spending requirement than did employment in surrounding areas in San Mateo and Alameda counties. If anything, employment trends were slightly better in San Francisco.

There are at least two huge problems with concluding from this statement that the results so far in San Francisco tell us anything useful about the impact of such laws on employment. First, a period of just less than 12 months is almost certainly not enough time to observe the effects of the labor force impacts. Second, even if we accept this time period as relevant, the measurement method they describe is not nearly sufficient to identify significant changes in employment, positive or negative, caused by this law.

Inadequate Time Period

Normally when the price of labor to a business goes up, the reaction of the business is some combination of (1) figuring out how to use less labor, and (2) just passing on the cost increase to consumers. If the business thinks all competitors face the same price increase, it tends to be a lot more of the latter. Even when this is the case, the price of the whole category (whatever is sold by the business and all its competitors, whether this is an industry, geography or some other grouping) is now more expensive versus other categories of goods, so it tends to suffer over time as compared to what its sales and profits would have been had there been no labor price increase. This will tend to depress employment for companies in that category over time – often in the form of new jobs that otherwise would have been created, but now are not. Further, this category-level price increase tends to invite the entry of new competitors who can find a way around the labor costs. An obvious example of this set of dynamics is that the ever-increasing economic costs of labor to the Detroit ecosystem created by synchronized union contracts seemed OK for a long time because “everybody” (i.e., GM, Ford and Chrysler) faced the same costs. Eventually, they became obviously unsustainable because of external competition. In sum, the employment effects of a structural increase in labor costs can take a long time to play through.

The authors argue, somewhat unpersuasively, that San Francisco, like Detroit decades ago, can raise labor costs with impunity:

Local service businesses can … raise prices without risking their competitive position, since their competitors will be required to take similar measures.

But of course, this assumes that these service businesses are not in competition, over time, with businesses outside of San Francisco. To some extent, they are.

They also make the argument that the improved health care should create an offsetting benefit. This isn’t like an oil price shock, which is just all bad, but a reallocation of resources that could grow the whole pie of wealth for San Francisco. The authors put this as:

Over the longer term, if more widespread coverage allows people to choose jobs based on their skills and not out of fear of losing health insurance from one specific employer, increased productivity will help pay for some of the costs of the mandate.

Fair enough.

But think about the various dynamics involved. Labor costs rise in early 2008, and as a result prices are increased to some extent, and profit margins go down to some extent. Some restaurants lay people off, and other businesses are more reluctant to hire. As an example, last May the President of a local chain of hardware stores described avoiding hiring in order to remain below the 100-employee threshold for a more onerous tier of the program. Some people in San Francisco and the surrounding suburbs note prices are higher for dinner (and hammers, and groceries, and …) in the city, and start to buy marginally more goods and services in nearby towns, further pressuring margins and employment in San Francisco. Entrepreneurs, on the margin, locate businesses in Hayward or other towns just outside San Francisco, and as these businesses grow, the jobs that would have been created in San Francisco are now created in the suburbs. “Over the longer term” some career switches occur that otherwise would not, potentially raising labor productivity, growing the economy and increasing employment. How likely is all of this to play out in less than 12 months?

In light of such obvious issues, it is exceedingly odd that the authors have published an editorial in August of 2009 that relies on the results of the San Francisco policy “as of December 2008”. They’re throwing away at least six months of data (Q1 and Q2 of 2009). This is about one-third of all the time since the law was implemented, and given the reaction time involved, almost certainly more than one-third of all the information about what has happened as a result. More on this later.

Inadequate Test and Control Matching

But there is a further, and more severe, problem with the reasoning presented by the authors. Even if we look at the effects just within 2008, Alameda and San Mateo counties do not provide a sufficiently good control population for San Francisco to draw the conclusions that they assert.

We can examine the usefulness of this proposed control group by looking at how closely annual changes in employment in Alameda and San Mateo counties (“Control”) track San Francisco (SF). I’ve taken the total Control percentage growth in employment from year X to year X+1, and applied this percentage change to SF’s employment in year X to create “expected” employment in SF in year X+1 (i.e., what employment would have been in San Francisco in year X+1 had Alameda and San Mateo formed a perfect control). I then compare the actual change in the number of jobs in year X+1 in SF to this expectation, and call this the “residual” for that year. If the residual is positive, therefore, it means that SF gained more jobs in that year than would be expected based on the Control; if negative, the reverse.
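For readers who want to see the arithmetic, the residual described above can be sketched in a few lines of Python. This is only an illustration of the calculation as worded in the text: the function name `residual_jobs` and every number in the example call are hypothetical placeholders, not actual employment data for any county.

```python
def residual_jobs(sf_prev, sf_actual, control_prev, control_actual):
    """Return actual SF jobs in year X+1 minus the 'expected' SF jobs.

    'Expected' means SF employment in year X grown at the Control's
    percentage growth rate from year X to year X+1.
    """
    control_growth = control_actual / control_prev - 1.0
    sf_expected = sf_prev * (1.0 + control_growth)
    return sf_actual - sf_expected


# Hypothetical illustration only: Control grows 1%, SF grows from
# 420,000 to 425,000 jobs. Expected SF employment is 424,200, so SF
# gained about 800 more jobs than the Control would predict.
example = residual_jobs(
    sf_prev=420_000,
    sf_actual=425_000,
    control_prev=1_000_000,
    control_actual=1_010_000,
)
print(round(example))
```

A positive result means SF did better than the Control-based expectation; a negative result means it did worse.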

Here is what this calculation looks like for about the last decade:

[Chart: SF residual employment versus Control, by year]

Hopefully, stating the residual in terms of number of jobs helps to make this intuitive. San Francisco has total employment of about 425,000. So, as an example, a swing of about 4,000 jobs represents a 1% change in employment. I think it’s fair to characterize such a causal impact as “significant”, in that on a national basis it would translate to an increase in the U.S. structural unemployment rate of about a percentage point (or the equivalent number of jobs lost through some combination of an increase in the unemployment rate and a reduction in the number of people looking for work). How likely is it that this instrument could find a causal effect of 4,000 jobs?

Asked more rigorously, what are the odds that the ~5,000 job gain in SF vs. Control in 2008 (“If anything, employment trends were slightly better in San Francisco.”) is simply statistical noise? Here’s a simple but useful way to think about it. If the SF health program had the causal effect of significantly reducing employment by, say, 1 percent, this would mean that but for the SF health program, SF would have had a residual of 9,000 jobs in 2008 (the 5,000 actual residual + the extra 4,000 jobs that would have been there but for the health program). SF has shown a residual at least this high in two of the past ten years (2000 and 2007), or 20% of the cases. (And even this understates our real uncertainty, since we don’t know that the distribution of control error that we have seen over the prior decade is representative of differences between SF and Control in 2008). Conventionally, we would not reject the null hypothesis that this could be random variation unless there is less than a 5% chance of this occurring.
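The back-of-the-envelope check above can be written out as a short Python sketch. It counts how often a residual at least as large as the counterfactual one has shown up in prior years. The function name `share_at_least` is made up for illustration, and the list of past residuals is a hypothetical placeholder series, constructed only so that two of ten values reach the threshold as the text describes. It is not the actual data behind the chart.

```python
def share_at_least(past_residuals, threshold):
    """Fraction of past yearly residuals that are >= threshold."""
    hits = sum(1 for r in past_residuals if r >= threshold)
    return hits / len(past_residuals)


observed_residual = 5_000   # approximate 2008 SF-vs-Control gain cited in the text
hypothesized_loss = 4_000   # roughly 1% of about 425,000 SF jobs
counterfactual = observed_residual + hypothesized_loss  # 9,000 jobs

# HYPOTHETICAL placeholder values, not real data: ten yearly residuals,
# two of which are at least 9,000.
past_residuals = [9_500, 1_000, -10_500, -2_000, 3_000,
                  500, -1_500, 9_200, 2_000, -500]

share = share_at_least(past_residuals, counterfactual)
print(share)         # fraction of years with a residual this large
print(share < 0.05)  # would this clear the conventional 5% bar?
```

With two qualifying years out of ten, the share is 20%, well above the conventional 5% threshold, so a residual of this size cannot be distinguished from ordinary year-to-year noise.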

If you think this is quibbling, ask yourself this question: had the authors conducted this analysis in 2002, do you think they would publish a study, and the New York Times would run an editorial, saying that “The early results are in, and universal health care seems to be a job killer.”? And if they had (or, more likely, if a paper more hostile to universal health care had), they would have been wrong, as we know that the 10,000+ downward swing in San Francisco versus Alameda and San Mateo counties in 2002 had nothing to do with a program that would not be implemented for six more years.

Or consider that if we take the residual through the first half of 2009 and simply annualize it, we get a chart that looks like this:

[Chart: SF residual employment versus Control, by year, with the first half of 2009 annualized]

It would be very easy for me to “construct a narrative” that we now see the longer-term negative impacts of this program emerging: “Look, the employment in San Francisco stopped its upward trend versus Alameda and San Mateo counties right when this program was implemented, and has now started a precipitous decline!”. Or whatever. But this would be, like the editorial, a just-so story. The relative employment performance of San Francisco versus two nearby counties over about 18 months is not an instrument with sufficient precision to identify even quite significant potential causal impacts of this program on employment.

As far as I can see, there is no published paper that would allow an external observer to evaluate the work behind claims in the editorial more completely, just an unpublished work in progress not available for download (though obviously there may be some way to get it that I haven’t found). It’s possible, for example, that the authors have carved out a clever subset of geographies within the Alameda and San Mateo counties to use as controls, or have used regression-like techniques to further adjust the residuals. Each of these methods has its own problems. In any event, the stated demonstration in the editorial does not seem to hold water.

Interestingly, one of the authors (Dube) has previously published a methodologically sophisticated academic paper doing exactly this kind of analysis: comparing contiguous counties that straddle a political jurisdiction in order to estimate the employment effects of a policy discontinuity (in the case of the paper mentioned, to evaluate the impacts of increases in the minimum wage on employment). One of his methodological conclusions was that such individual “case study” comparisons as that of SF to Alameda and San Mateo counties are fraught with danger:

As we show in this paper, the odds of obtaining a large positive or negative elasticity from a single case study is non-trivial. This result establishes the importance of pooling across individual case studies to obtain more reliable inference, a point made in earlier papers.

(As an aside, in this paper Dube uses the Quarterly Census of Employment and Wages, which permits industry-level analysis, as his primary data set for employment outcomes. This data is only produced by the government with a six or seven month lag. I assume that he is employing a very similar method for the analysis behind the San Francisco health care editorial, and that this is what accounts for only using data through the end of 2008.)

I am an aggressive proponent of using experiments to obtain valid inferences about the effects of public policies. What this example makes clear, however, is the importance of either very careful experimental design, or in the case of so-called “natural experiments”, extreme caution about methods of interpretation. In this case, the argument made by the authors in one of the most prominent pieces of real estate in American public debate appears to be insufficient to support their conclusion.

Health Insurers Vs. Big Corn?

by Patrick Appel

Unsurprisingly, Michael Pollan is refusing to boycott Whole Foods even though he disagrees with John Mackey’s Op-Ed. His reasoning:

[If] health insurers can no longer pick and choose their clients, and throw sick people out, they will develop a much stronger interest in prevention, which is to say, in changing the way America feeds itself. When health insurers realize they will make thousands more in profits for every case of type II diabetes they can prevent, they will develop a strong interest in things like corn subsidies, local food systems, farmer’s markets, school lunch, public health campaigns about soda, etc.

The Latest Euphemism From The Torture Party

by Andrew

Max Boot deploys a new one (they keep coming up with them):

It would be easy to conclude with a “high degree of confidence” that one of the most effective intelligence-gathering tactics in the war on terrorism — the aggressive interrogation of captured terrorists — has been eliminated and, along the way, the agency charged with being on the front lines of the war has been severely degraded in operational effectiveness. In other words, the Obama administration has taken some of the most effective changes implemented by the Bush administration and reversed them in what could be a Carter-style emasculation of American intelligence capabilities.

"Aggressive interrogation of captured terrorists" needs translation into plain English. It means "the torture of captives suspected of being terrorists." "One of the most effective intelligence-gathering tactics in the war on terrorism" also needs translation, since there is no evidence, as Bush DHS official Frances Townsend and every neutral observer have noted, that the intelligence, if accurate, could not have been achieved by legal, American and ethical means. We also know for a fact that the majority of all those who have been abused and tortured by the US under Bush and Cheney were innocent of any terror offenses. (At Abu Ghraib, one of the test-sites for Cheney's methods, up to 90 percent were completely innocent, according to the Bush administration). We have no idea how many of those captured, abused and tortured at Bagram were and are innocent. And we know that the Red Cross has definitively ruled the Bush-Cheney treatment as torture and, at the very least, illegal "cruel and inhuman treatment" of prisoners.

"Aggressive interrogation" means, in plain English, stripping suspects, hooding them, beating them, putting a collar around their neck and launching their bodies against a plywood wall up to thirty times, subjecting them to sleep deprivation in one case as long as 960 hours over 54 days, shackling them in stress positions used by the Vietnamese against John McCain, denying medical care in some cases, sexually traumatizing them, using Islam as a weapon against them, putting them in upright coffins, threatening to kill their children and spouses, threatening to drill their skulls with power-drills, freezing them in iced water or freezing air-conditioning until near-death, subjecting them to extreme heat, and sensory deprivation in isolation for months until they become mental and physical shells. It means Abu Ghraib, the one place where we have been able to see what neoconservatism has come to stand for: the brutal torture and abuse of Arabs and Muslims. It means murdering over a hundred of such prisoners – merely because they are suspects and Arab Muslims. It means verschaerfte Vernehmung, in which neocons eagerly adopt the precise methods and even terminology of the Gestapo and brandish their cooptation of Nazi standards of prisoner treatment as an American value.

If Max Boot wants to defend these things, he should have the courage to defend them in plain English. He should have the courage to defend what we saw at Abu Ghraib and what we have not been allowed to see at Bagram.

As for "Carter-style emasculation", let us also remember that the return to ethical, legal treatment of prisoners is just as easily described as "Reagan-style emasculation." It was Reagan who signed the UN Convention on Torture which these neocons have torn up and despise. It is his legacy of American support for human rights that they reject. Indeed it is every president before Bush that they describe as emasculating US defense, because no president until Bush authorized and enforced torture and abuse of war prisoners as a national policy.

You want a "Reagan-style emasculation" of American intelligence? Support Obama.

“What if Todd really was innocent?”

by Patrick Appel

Jonah noted this earlier, but David Grann’s article on Cameron Todd Willingham, who was executed by Texas in 2004, is a must read. The overwhelming evidence, some of it unearthed by Grann for the first time, suggests that Willingham was not guilty. I can’t believe that this was admitted:

The prosecution cited such evidence in asserting that Willingham fit the profile of a sociopath, and brought forth two medical experts to confirm the theory. Neither had met Willingham. One of them was Tim Gregory, a psychologist with a master’s degree in marriage and family issues, who had previously gone goose hunting with [prosecutor John] Jackson, and had not published any research in the field of sociopathic behavior. His practice was devoted to family counselling.

At one point, Jackson showed Gregory Exhibit No. 60—a photograph of an Iron Maiden poster that had hung in Willingham’s house—and asked the psychologist to interpret it. “This one is a picture of a skull, with a fist being punched through the skull,” Gregory said; the image displayed “violence” and “death.” Gregory looked at photographs of other music posters owned by Willingham. “There’s a hooded skull, with wings and a hatchet,” Gregory continued. “And all of these are in fire, depicting—it reminds me of something like Hell. And there’s a picture—a Led Zeppelin picture of a falling angel. . . . I see there’s an association many times with cultive-type of activities. A focus on death, dying. Many times individuals that have a lot of this type of art have interest in satanic-type activities.”

I’ve long been against the death penalty because the criminal justice system is fallible and because pursuing the death penalty is a terrible allocation of resources. In his excellent new book, Mark Kleiman describes how our limited criminal justice dollars are squandered by pursuing the harshest of penalties. He writes:

Theory and evidence agree: swift and certain punishment, even if not severe, will control the vast bulk of offending behavior. One problem with the brute-force, high-severity approach is that severity is incompatible with swiftness and certainty. Severity means using a large share of punishment resources on a (relatively) few offenders, and (as the American experience with capital punishment since its reintroduction illustrates) the more severe a sentence is the more reluctantly it will be imposed and the more “due process”— and therefore the more time—it will require.

The resources of the current criminal-justice system, matched against the volume of crime, simply do not allow it to punish, even modestly, all offenses or all offenders. Trying to control everything and everyone—the tough-sounding “zero tolerance” approach—leads to sporadic and delayed punishments as the system overloads. The result is great quantities of punishment, much of it severe, and effective control of nothing and no one except those actually behind bars: a bad bargain.

On a related note, Grann quotes the prosecutor in the case:

Unlike many other prosecutors in the state, Jackson, who had ambitions of becoming a judge, was personally opposed to capital punishment. “I don’t think it’s effective in deterring criminals,” he told me. “I just don’t think it works.” He also considered it wasteful: because of the expense of litigation and the appeals process, it costs, on average, $2.3 million to execute a prisoner in Texas—about three times the cost of incarcerating someone for forty years.

It’s not conceivable that the so-called deterrent effect from capital punishment is three times more effective than forty years in prison. Not only is capital punishment unethical for the reasons made very clear by Grann’s reporting, it results in more crime for the reasons outlined by Kleiman. It’s an illogical feature of the criminal justice system only made possible by political opportunism and the all-too-human desire for revenge. Another incredible bit of reporting from Grann:

In March, 2000…[prison informant Johnny] Webb [who said that Willingham confessed to him] unexpectedly sent [the case’s prosecutor John] Jackson a Motion to Recant Testimony, declaring, “Mr. Willingham is innocent of all charges.” But Willingham’s lawyer was not informed of this development, and soon afterward Webb, without explanation, recanted his recantation. When I recently asked Webb, who was released from prison two years ago, about the turnabout and why Willingham would have confessed to a virtual stranger, he said that he knew only what “the dude told me.” After I pressed him, he said, “It’s very possible I misunderstood what he said.” Since the trial, Webb has been given an additional diagnosis, bipolar disorder. “Being locked up in that little cell makes you kind of crazy,” he said. “My memory is in bits and pieces. I was on a lot of medication at the time. Everyone knew that.” He paused, then said, “The statute of limitations has run out on perjury, hasn’t it?”

Read it all.

The Personal Bubble

by Jonah Lehrer

Most people prefer to interact with others at a distance of about two feet, a polite gap that's known as the personal bubble. It's our zone of privacy, a way of ensuring that our hand gestures, smells and spittle don't interfere with the conversation. (It's also hard to focus on someone else's face when they get much closer than 10-12 inches.) However, there's a new paper in Nature Neuroscience documenting the strange case of patient SM, who is completely missing this personal bubble due to a selective pattern of brain damage in the temporal lobe. The end result is that SM doesn't mind "close talkers" and has to constantly remind herself that everyone else prefers a little social distance. Ed Yong, over at Not Exactly Rocket Science, examines the case report in detail:

She [SM] said time and time again that she was actually comfortable at any distance, and during one trial, she actually walked all the way to her partner until they were actually touching. Even when they were making direct eye contact and touching nose-to-nose, she only rated the experience as 1 on a comfort scale of 1 to 10, where 1 is perfectly comfortable. When a male stranger talked to her up close, she again rated the chat as a 1 (even though he gave it a 7).