Estimating Greenhouse Gas Emissions:
The Washington Post and information quality
29 Apr 2007 in Information Quality
A Page One story by Washington Post staff writer David A. Fahrenthold says carbon dioxide emissions in the Washington, DC, area increased 13.4% from 2001 to 2005.
The article clearly is intended to influence public policy. But there are significant problems with this estimate that are not disclosed in the article. The federal Information Quality Act does not apply to the Washington Post, but it would apply to any federal agency that attempted either to take action based on these figures or even to report them in a manner suggesting that it thought they were valid. (Congress is exempt from the statutory requirement to disseminate only scientific and statistical data that meet applicable information quality standards. Unlike Executive branch agencies, of course, Congress is never regarded as an authoritative body for scientific or statistical information.)
Below we compare the data reported by Fahrenthold with the information quality standards that apply to federal agencies.
Here's what Fahrenthold tells us about how he derived his estimate:
Then, using methods from the U.S. Energy Information Administration, those figures were used to calculate the total amount of carbon dioxide emitted from vehicles and power-plant smokestacks. [See the chart for details.]
The figures from those calculations leave out greenhouse gases from other sources, such as agriculture, planes, boats and oil furnaces. Those missing figures could account for half of all emissions.
The chart referred to in square brackets above is found only in the print edition and is titled "The Rapid Rise of Emissions." It contains the following reported data, but in graphical form:
"The Rapid Rise of Emissions"
("The rate of increase was calculated by The Washington Post using data from governments, environmental groups and electric utilities")

| | Cars and Trucks* (% change) | Electricity Use** (% change) | Both Sources Combined (% change) |
|---|---|---|---|
| Virginia Suburbs | 16.8 | 20.4 | 18.8 |
| Maryland Suburbs | 10.1 | 12.2 | 11.2 |
| The District | -1.6 | 9.1 | 6.7 |
| Total Area | 11.7 | 14.6 | 13.4 |
| US Total | 4.9 | 6.0 | 5.6 |

*Arlington County not included in Virginia Suburbs
** Frederick County not included in Maryland Suburbs. Only partial data available for Stafford, Fauquier, Calvert, Montgomery and Prince George's counties
SOURCE: Staff reporting, Washington Post, April 29, 2007 Print Edition, A16
These data do not adhere to the minimum information quality standards that would apply if they had been disseminated by the federal government.
TRANSPARENCY AND REPRODUCIBILITY
Federal information quality guidelines require government agencies to practice transparency and reproducibility when they disseminate statistical information. Transparency means fully revealing all sources and methods. Reproducibility means providing enough information that a qualified third party would obtain essentially the same answer. The Post's data do not satisfy either of these requirements.
The Post's choice of data is not transparent, and Fahrenthold only hints at his sources. At least one of his acknowledged sources -- "environmental groups" -- has a policy interest in maximizing the reported percentage increase in CO2 emissions. It is possible that they did not bias their data in accordance with these policy interests. However, Fahrenthold does not inform readers of this potential conflict of interest, nor does he reveal whether the Post performed due diligence to verify the validity and reliability of their data. It appears that the Post simply accepted their data without question.
The Post acknowledges that its data are incomplete in two ways -- first, by not counting all emissions from categories that it included, and second, by excluding source categories. When data are incomplete, inferences about them should be made with caution. Instead, the Post mentions these defects but draws inferences as if they were minor.
With regard to its analytic methods, the Post also reveals nothing of importance. Presumably, the Post performed a simple subtraction of 2001 from 2005 values and assumed the resulting difference to be an unbiased estimate. An unbiased estimate is one that is just as likely to overestimate the true but unknown value as to underestimate it. But simple subtraction yields an unbiased estimate of the difference only under certain restrictive conditions, including:
- All definitions must be identical for 2001 and 2005. Any change in definitions means that the data are not comparable across years, and the result of subtraction is uninterpretable. Apples cannot be subtracted from oranges.
- Data that were missing in each year must be missing from both years. Counties partially counted or missing in 2001 must be either missing or excluded in 2005, and vice versa. Where coverage was partial in 2001, it must be identically partial in 2005.
- The methods used to estimate values for 2001 must be the same methods used for estimating values for 2005. Any change in methods implies an explainable discrepancy in the reported difference.
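The second condition can be illustrated with a minimal sketch. Every number below is hypothetical and chosen only for illustration; none comes from the Post's data. The sketch assumes true emissions are identical in both years and that only the share of emissions captured by the data changes, then shows what simple subtraction reports.

```python
# Hypothetical illustration: none of these numbers come from the Post's data.

def pct_change(old: float, new: float) -> float:
    """Percentage change from an earlier value to a later one."""
    return (new - old) / old * 100.0

# Assumption: true regional emissions are the same in both years.
true_2001 = 100.0
true_2005 = 100.0

# Assumption: the data capture a different share of emissions in each year.
coverage_2001 = 0.80
coverage_2005 = 0.90

reported_2001 = true_2001 * coverage_2001
reported_2005 = true_2005 * coverage_2005

true_change = pct_change(true_2001, true_2005)
apparent_change = pct_change(reported_2001, reported_2005)

print(f"True change:     {true_change:.1f}%")
print(f"Apparent change: {apparent_change:.1f}%")
```

Under these assumptions the data show an increase even though nothing changed, which is why identical coverage in both years matters.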
This leads to the Post's second procedural failure. The Post's calculations are not reproducible by a qualified independent third party. Fahrenthold reports that "Jonathan Cogan, a spokesman for the [Department of Energy's] Energy Information Administration reviewed The Post's calculations and said the agency's formulas appeared to have been used correctly." The extent of this external review is unclear -- was it limited to fidelity to EIA formulae, or did it also include a review of the Post's input data? (By responding to the Post's request, Cogan put EIA in the position of violating the spirit of the law by implicitly conveying its endorsement. He did not violate the letter of the law because statements made by agency spokesmen are exempt.)
The depth of Cogan's review notwithstanding, the reproducibility requirement in federal information quality standards can't be satisfied by reliance on a hand-picked third party. Satisfying the reproducibility requirement can be achieved only by disclosure.
OBJECTIVITY
Federal information quality guidelines require federal agencies to ensure that statistical information intended to influence policy be objective.
Presentational objectivity means that information must be "presented in an accurate, clear, complete, and unbiased manner," including "within a proper context" that may include "other information" necessary "to ensure an accurate, clear, complete, and unbiased presentation," including sources and supporting data and models "so that the public can assess for itself whether there may be some reason to question the objectivity of the sources."
We've already documented why the Post's estimates are unlikely to be substantively objective. If a federal agency disseminated statistical information this way, it would be presumptively in violation of the law. So we'll focus on presentational objectivity, which applies even if substantive objectivity is assured.
- Excess precision
- Invalid baseline
- Invalid comparisons
Figures for the Maryland Suburbs are even more problematic. Fahrenthold reports that there are data missing from Montgomery and Prince George's counties, and he excludes Frederick County. These counties represent 33%, 30% and 3%, respectively, of the Maryland Suburbs. Data are incomplete or excluded with respect to 66% of the suburban Maryland population.
Howard County, located midway between Washington and Baltimore, is also excluded by the Post. Had Howard County been included, the population for the Maryland Suburbs would have been about 10% greater.
- Invalid inferences from the data
But Fahrenthold did not point out that DC's population had declined about 4% during this period, whereas the population of Suburban Virginia and Suburban Maryland increased about 11% and 10%, respectively. Adjusting for DC's population decline, Fahrenthold's figures, if true, would mean DC's CO2 emissions rose 11% per capita.
Indeed, the entire picture changes when population changes are taken into account. When Fahrenthold's (unverified) estimates of percentage changes in CO2 emissions from 2001 to 2005 are divided by the Census Bureau's (validated) estimates of population changes from 2000 to 2005, DC's performance is the worst in the region rather than the best:
How Adjusting for Population Changes the Washington Post's Estimates

| Jurisdiction | Percentage Change in CO2 Emissions Reported by the Washington Post | Percentage Change in CO2 Emissions Reported by the Washington Post, Adjusted for Population Changes |
|---|---|---|
| Virginia Suburbs | 18.8% | 7% |
| Maryland Suburbs | 11.2% | 2% |
| The District | 6.7% | 11% |

See the table at the end of this post.
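The adjustment can be sketched as follows. This is one reading of "divided by" that reproduces the rounded figures above: it assumes the growth factor for emissions (1 + percentage change) is divided by the growth factor for population, using the population changes reported in the table at the end of this post (10.8%, 9.5% and -3.8%). The function name is ours, for illustration only.

```python
def per_capita_change(emissions_pct: float, population_pct: float) -> float:
    """Percentage change in emissions per capita.

    Assumption: growth factors are divided, i.e.
    (1 + emissions change) / (1 + population change) - 1.
    """
    emissions_factor = 1 + emissions_pct / 100.0
    population_factor = 1 + population_pct / 100.0
    return (emissions_factor / population_factor - 1) * 100.0

# (reported CO2 change 2001-2005, Census population change 2000-2005), in percent
reported = {
    "Virginia Suburbs": (18.8, 10.8),
    "Maryland Suburbs": (11.2, 9.5),
    "The District": (6.7, -3.8),
}

adjusted = {
    name: round(per_capita_change(co2, pop))
    for name, (co2, pop) in reported.items()
}

for name, value in adjusted.items():
    print(f"{name}: {value}%")
```

Dividing growth factors, rather than subtracting one percentage from the other, keeps the result meaningful when population falls, as it did in the District.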
To be clear, we hesitate to draw any inferences from Fahrenthold's data. We doubt they are useful for any public policy purpose. Most importantly, both his inferences about the absolute change in CO2 emissions in the Washington metropolitan area and his comparisons across jurisdictions are unsupported by his own data.
- Invalid inferences beyond the data
- Information quality defects lead others to draw invalid inferences
A plausible explanation for the invalid inferences made by the anonymous DC government officials cited by Fahrenthold is that Fahrenthold himself premised his request for a reaction on invalid inferences about the data. When pressed for a reaction, public officials may offer answers that are consistent with other data at their disposal. Alternatively, they may give an explanation that is either self-serving or what they think the reporter wants to hear. (Sometimes these are the same thing.) It's possible that DC officials have data supporting their suggestion that DC's allegedly lower rate of increase in CO2 emissions is a "sign of changing behavior." But it's more plausible that they didn't want to attribute the lower rate to a decline in the District's population, with which they would be familiar and which would not be interpreted favorably by a reporter whose narrative is that regional CO2 emissions are "rapidly rising."
Similarly, Frank O'Donnell's claim that "sprawl is causing a big increase in greenhouse gases" is most plausibly related to the public policy positions he and his organization advocate. Because they are opposed to what they call "suburban sprawl," sprawl is a convenient inference from Fahrenthold's data that also fits the reporter's likely narrative.
If sprawl were actually the culprit, then one would expect to find that commuting times are significantly higher for jurisdictions farther away from the District. The available data don't support that inference. Average commute times reported by the Census Bureau are not nearly as different across the region as one would expect if sprawl were the underlying cause of rising CO2 emissions. For Virginia, average commute times vary from 27.3 minutes (Arlington County) to 37.7 minutes (Stafford County). But Arlington is located adjacent to the District and Stafford is about 45 miles southwest. A 10-minute difference in average commuting time seems much less than one would expect if proximity to the District reduced CO2 emissions from commuting. For Maryland the range is 29.2 minutes (St. Mary's County) to 39.8 minutes (Calvert County) -- again, a range of just 10 minutes.
Indeed, the average commuting time for residents of the District was almost 30 minutes in 2000. The higher population density of the District apparently does not translate into a significantly reduced commute. When DC's figure is treated as a baseline and subtracted from the averages for the other jurisdictions, the range in net average commuting times in Virginia becomes -2.4 to 8, and the range in Maryland becomes -0.5 to 10.1. People in the Washington metropolitan area don't all work in the District, and they choose places to live based on many criteria other than the length of their commute. But their average commute is remarkably stable irrespective of where they live.
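The normalization described above is a simple subtraction, sketched here for the four jurisdictions named in the text. The baseline of 29.7 minutes is the District's 2000 average commute as listed in the table at the end of this post; the variable names are ours.

```python
# District's 2000 average commute in minutes, used as the baseline.
DISTRICT_COMMUTE = 29.7

# 2000 average commute times (minutes) for the extremes cited in the text.
commutes = {
    "Arlington County": 27.3,
    "Stafford County": 37.7,
    "St. Mary's County": 29.2,
    "Calvert County": 39.8,
}

# Net commute relative to the District, rounded to one decimal place.
normalized = {
    name: round(minutes - DISTRICT_COMMUTE, 1)
    for name, minutes in commutes.items()
}

for name, net in normalized.items():
    print(f"{name}: {net:+.1f} minutes relative to the District")
```

A negative value means the average commute is shorter than the District's; a positive value means it is longer.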
Of all the errors in Fahrenthold's story, surely the most pernicious is the claim that CO2 emissions are "rising rapidly." As we've already noted, a rate of acceleration cannot be discerned from two static observations. But this narrative is clearly an appealing one for those who are predisposed to believe that "the problem" of anthropogenic global climate change is "getting worse." This narrative is often expressed by Post reporters and the newspaper's editorial board. The Post should make a diligent effort to understand information quality principles and apply them to the newspaper's work products, especially when a story appears to conform to the revealed biases of its reporters and editors.
| Jurisdiction | 2000 Population | 2005 Population [1] | % Change [2] | 2000 Avg Commute (mins) [3] | 2000 Avg Commute Normalized by District Avg Commute (mins) | Pop'n Adjusted % Change CO2 [4] |
|---|---|---|---|---|---|---|
| VIRGINIA SUBURBS | 1,962,782 | 2,220,329 | 10.8% | -- | -- | 7% |
| Arlington County | 189,453 | 195,965 | 3.4% | 27.3 | -2.4 | -- |
| Alexandria City | 128,283 | 128,923 | 0.5% | 29.7 | 0.0 | -- |
| Fairfax City | 21,498 | 21,963 | 2.2% | 30.1 | 0.4 | -- |
| Fairfax County | 969,749 | 1,006,529 | 3.8% | 30.7 | 1.0 | -- |
| Falls Church City | 10,377 | 10,781 | 3.9% | 26.4 | 6.7 | -- |
| Fauquier County | 55,139 | 64,997 | 17.9% | 36.8 | 7.1 | -- |
| Loudoun County | 169,599 | 255,518 | 50.7% | 30.8 | 1.1 | -- |
| Manassas City | 35,135 | 37,569 | 6.9% | 32.4 | 2.7 | -- |
| Manassas Park City | 10,290 | 11,622 | 12.9% | 35.6 | 5.9 | -- |
| Prince William County | 280,813 | 348,588 | 24.1% | 36.9 | 7.2 | -- |
| Stafford County | 92,446 | 117,874 | 27.5% | 37.7 | 8.0 | -- |
| MARYLAND SUBURBS | 2,561,109 | 2,828,550 | 9.5% | -- | -- | 2% |
| Anne Arundel County | 489,656 | 510,878 | 4.3% | 28.9 | 0.1 | -- |
| Calvert County | 74,563 | 87,925 | 17.9% | 39.8 | 10.1 | -- |
| Charles County | 120,546 | 138,822 | 15.2% | 39.3 | 9.4 | -- |
| Frederick County | 195,277 | 220,701 | 13.0% | 31.9 | 2.2 | -- |
| Montgomery County | 873,341 | 927,583 | 6.1% | 32.8 | 3.1 | -- |
| Prince George's County | 801,515 | 846,123 | 5.7% | 35.9 | 6.2 | -- |
| St. Mary's County | 86,211 | 96,518 | 11.9% | 29.2 | -0.5 | -- |
| THE DISTRICT | 572,059 | 550,521 | -3.8% | 29.7 | 0.0 | 11% |

[1] Estimated by Census Bureau; see data quality note.
[2] Estimated by Census Bureau; see data quality note.
[3] Estimated by Census Bureau; see data quality note.
[4] Estimated by the Washington Post; no data quality disclosed.


