Ratios: Abuses and Misuses!
Special note: We are responding to reader requests concerning the development of a solid foundation for understanding, in very practical terms, statistical analysis.
Many people suffer from acute statistical trauma. We are introducing a continuing series of articles that, we hope, provides readers with the basics required for proper usage of statistical tools and methods.
The first four "back-to-basics" articles should be read in the following order: 1) Preface To Performance Measurement: Understanding Ratios, 2) Ratios: Abuses and Misuses, 3) Averaging Ratios and The Perils of Aggregation, and 4) A Painless Look At Using Statistical Techniques To Find The Root Cause of a Problem.
Editor's note: We discussed the construction and meaning of ratios in a previous article. We suggest new readers read the article Preface to Performance Measurement: Understanding Ratios.
Probably no family of statistical measures is more prone to honest misuses and wanton abuses than that of the percent.
In this article we survey a variety of ways that percents misinform, mislead and misdirect. Improper comparisons and faulty conclusions about ratios result from statistical ignorance. H. G. Wells once said: "Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write."
Wells’ prediction was correct. Today, more than ever in our history, we are subject to meaningless averages, improper comparisons, cheating charts and faulty conclusions.
Many schools of journalism, realizing the awesome responsibility of the news media, now require students to take programs that will enable them to evaluate statistical evidence more judiciously.
There are many ways to misuse ratios. Yet practically all ratio errors can be classified into a relatively small number of general categories. With the exception of averaging ratios and averaging time rates (see the future article "Special Averages Used In Ratio Analysis"), the categories are as follows:
The first three are fairly easy to diagnose. They are:
- Errors involving the base of the ratio
- Failure to distinguish between percent change and percent points of change
- Failure to use ratios when needed
The second three categories contain the problem children. They are:
- Failure to compare ratios refined to the same degree
- Failure to use a control group when needed
- Sloppy data collection methods
These six categories are not mutually exclusive. They are also not always sharply differentiated; a given ratio error might, for instance, be classified as "failure to compare ratios refined to the same degree" or as "jumping to wrong conclusion about a partition percentage."
Further, it is possible to make an error that combines elements of two, sometimes even three, categories in one colossal blunder. It is impossible in this brief review to give more than a suggestion of the range of possible ratio errors.
Indeed, we suggest readers purchase, if possible, Stephen K. Campbell's book Flaws And Fallacies In Statistical Thinking. It contains many additional categories of all-too-common ratio errors. (When we last looked, it was available through Amazon.)
Hopefully, this discussion will make you more alert to the possibilities of faulty ratio analysis thinking...and improper ratio comparisons.
Category 1: Errors Involving the Base of the Ratio
There are many types of distinct errors involving a ratio's base. Typically these errors are:
Base Too Small
The classic example of a misleading percent resulting from too small a base concerns the marrying habits of women students at Johns Hopkins University. Stephen Campbell, in his marvelous book Flaws And Fallacies In Statistical Thinking (Englewood Cliffs: Prentice-Hall, 1974), tells us:
"Some decades ago Johns Hopkins broke a precedent and began admitting women students. Shortly thereafter it was reported that 33 1/3 percent of women students who had enrolled had married members of the faculty. But the story fizzled when it was disclosed that only three women had enrolled the first year and one had married a faculty member."
Use of The Wrong Base
Some years ago in a Science article, the authors stated that "the cost of a telephone call has decreased by 12,000 percent since the formation of The Communication Satellite Corporation." A flood of letters to the editor of Science pointed out that it is impossible to have more than a 100 percent decrease.
Why is it impossible to have a decrease of more than 100 percent? Our article on the construction of ratios reviewed the formula for rate of change... A decrease of 100 percent takes a quantity all the way to zero, and a cost cannot fall below zero. If a number goes from 15,000 to 5,000, how much of a percentage decrease is involved? Would the authors of the Science article say 200 percent? We are certain you would say 66 2/3 percent!
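The arithmetic can be checked with a short sketch. It assumes the usual rate-of-change formula, (new - old) / old x 100; the function name `percent_change` is ours and does not come from the earlier article.

```python
def percent_change(old, new):
    """Percent change from old to new, measured against the old value."""
    if old == 0:
        raise ValueError("percent change is undefined when the base is zero")
    return (new - old) / old * 100.0


drop = percent_change(15_000, 5_000)  # about -66.7, a 66 2/3 percent decrease
floor = percent_change(15_000, 0)     # -100.0, the number has fallen to zero

print(f"15,000 -> 5,000: {drop:.1f}%")
print(f"15,000 -> 0:     {floor:.1f}%")
```

Dividing the 10,000 drop by the new value of 5,000, rather than by the original 15,000, is what produces a figure of 200 percent. That is the wrong-base error in miniature.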
It's interesting to note that one of the authors decided to defend his calculation. He said: "Referring to cost as having decreased by 12,000 percent... we took literary license to dramatize the cost-reduction... "
We believe the author confused exaggeration with statistical misuse. The right number is like the right word. As Mark Twain once said: "The difference between the right word and the wrong word is the difference between lightning and the lightning bug... "
Category 2: Failure to Distinguish Between Percent Change and Percent Points of Change
A difficulty that often arises in the interpretation of percent data is failure to distinguish between percent points of change and percent changes.
Suppose that a production line is turning out radios and that two radios per 100, on the average, do not pass final inspection; we would say that the defective rate is 2 percent.
Suppose for some reason the number of defective radios increases to three per 100 on the average; that is, the defective rate increases to 3 percent. The defective rate has increased by one percentage point. This does not mean, of course, that the defective rate increased by one percent. Actually, the defective rate increased by [(3-2)/2] x 100 = 50 percent.
As you might suppose, a figure representing percent points of change is often interpreted, incorrectly, as if it were a percent change.
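The distinction is easy to keep straight when the two calculations sit side by side. The following is a minimal sketch using the radio figures from the example; the function names are our own.

```python
def point_change(old_rate, new_rate):
    """Difference between two rates, in percentage points."""
    return new_rate - old_rate


def percent_change(old_rate, new_rate):
    """Relative change between two rates, as a percent of the old rate."""
    return (new_rate - old_rate) / old_rate * 100.0


old_rate, new_rate = 2.0, 3.0  # defective rates in percent, from the example

print(point_change(old_rate, new_rate))    # 1.0 percentage point
print(percent_change(old_rate, new_rate))  # 50.0 percent
```

A report that says only "defects rose 1 percent" leaves the reader guessing which of the two figures is meant.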
Category 3: Failure to Use Ratios When Needed
Harry V. Roberts, Professor Emeritus at the University of Chicago and former speaker at many IQPC conferences on Total Quality Management, provided this memorable example of the importance of using ratios when making comparisons:
"During World War II, about 375,000 people were killed in the United States by accidents and about 408,000 were killed in the armed forces. From these figures, it has been argued that it was not much more dangerous to be overseas in the armed forces than to be at home...
...A more meaningful comparison, however, would consider rates, not numbers, of deaths, and would also consider the same age groups. This comparison would reflect adversely on the safety of the armed forces during the war—in fact, the armed forces death rate (about 12 per thousand men per year) was 15 to 20 times as high, per person per year, as the over-all civilian death rate from accidents (about 0.7 per thousand per year).
Peacetime versions of the same fallacy are also common: ‘Homes are more dangerous than places of work, since more accidents occur at home.’ ‘There are more illiterates in New York than in California.’ ‘Beds are the most dangerous thing in the world, because more people die in bed than anywhere else.’"
We suggest readers work out for themselves what's wrong with the statements. It's good practice. Especially today, with the increasing tendency to present twisted numbers and faulty analysis.
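For the wartime example, the gap between counts and rates can be made concrete with the figures Roberts quotes. This is only a sketch; the variable names are ours, and the rates are the rounded values given in the quotation.

```python
# Figures quoted by Roberts.
armed_forces_deaths = 408_000
civilian_accident_deaths = 375_000
armed_forces_rate = 12.0  # deaths per thousand men per year
civilian_rate = 0.7       # accident deaths per thousand per year

# Comparing raw counts suggests the two risks were nearly equal.
count_ratio = armed_forces_deaths / civilian_accident_deaths

# Comparing rates puts both groups on a per-person, per-year footing.
rate_ratio = armed_forces_rate / civilian_rate

print(f"ratio of counts: {count_ratio:.2f}")
print(f"ratio of rates:  {rate_ratio:.1f}")
```

The counts differ by less than 10 percent, while the rates differ by a factor of roughly 17, which falls inside the "15 to 20 times" range in the quotation.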
Category 4: Failure to Compare Ratios Refined to the Same Degree
Years ago, the statement was made that the death rate in the United States Navy during the Spanish-American War was smaller than the death rate in New York City for the same time period. The conclusion drawn from this statement was that New York City was not a very healthy place to live.
The probable reason for this, according to journalists of the period, was the horrible water supply system in the City of New York. Without commenting about the fallacious rationale involved in going from differences in death rates to water supply as the cause, and thinking only about the comparability of the two rates, what problems are encountered in this comparison?
The answer is quite straightforward. The U.S. Navy was composed of mostly healthy young males; whereas New York City had young males, young females, old males, old females and the like.
In short, the United States Navy had a very homogeneous grouping... New York City's population structure was composed of many groups.
"A major cause of death in that time period," said Stanley S. Schor, an eminent bio-statistician, "was child-bearing. Deaths due to this cause were not found in the U.S. Navy during the Spanish-American War... A more comparable death rate in the City of New York may have been the total number of deaths per 1,000 males who could pass the Navy health examination and met the Navy's age requirements."
Schor provides another excellent example that hammers home the point about comparing ratios refined to different degrees:
"Another case in which the ratios were not refined to the same degree and therefore were not comparable occurred in the garment trade industry in New York. During the period of sweat shops, tuberculosis was a common cause of death...
Not much was known about the disease and investigators thought that they could study it in the garment industry since everyone in the industry was given a health examination before being hired. It turned out that the incidence of tuberculosis among males in the industry was much higher than the incidence among females...
An inference was drawn that males were more apt to get TB than females and there was some relationship between the incidence of TB and gender. Since most females worked a few years, got married and quit; whereas the males continued in the industry for the rest of their lives, one can see that the average age of the males in the garment trade industry was much higher than that of the females...
TB, of course, is a function of exposure time or age. Thus, the difference in the incidence of TB between males and females could be attributed to comparing ratios not refined to the same degree."
Years ago, Newsweek magazine pointed out in a feature article that "Military prosecutors regularly run up an eye-catching 94 percent conviction rate compared to 81 percent in the federal courts for civilians..." The reader is left with the distinct impression that military trials result in a near-certain conviction.
Yet a little digging reveals that many military trials are concerned with AWOL cases. Civilian courts rarely try people accused of absenteeism—at least not yet. The comparison is not valid because the Newsweek article compared ratios refined to different degrees.
Stephen Campbell noted: "In any given comparison, the assumption of sameness of the characteristics compared and the similarity of other relevant factors may or may not be valid... when it is not valid, the comparison is a lie."
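Schor's tuberculosis example can be sketched with numbers. The counts below are entirely hypothetical, invented so that men and women have identical TB rates within each age band while the workforce skews older for men. The age bands and helper names are likewise assumptions, not part of Schor's account.

```python
# HYPOTHETICAL counts, invented for illustration only (not Schor's data).
# Each entry maps an age band to (workers, TB cases).
data = {
    "male":   {"under 30": (1_000, 10), "30 and over": (4_000, 200)},
    "female": {"under 30": (4_000, 40), "30 and over": (1_000, 50)},
}


def rate_per_thousand(workers, cases):
    """TB cases per thousand workers."""
    return 1000.0 * cases / workers


def crude_rate(groups):
    """Overall rate for one sex, ignoring age."""
    workers = sum(w for w, _ in groups.values())
    cases = sum(c for _, c in groups.values())
    return rate_per_thousand(workers, cases)


for sex, groups in data.items():
    by_age = {age: rate_per_thousand(w, c) for age, (w, c) in groups.items()}
    print(sex, "crude:", crude_rate(groups), "by age:", by_age)
```

With these made-up counts the crude rate is 42 per thousand for men and 18 per thousand for women, yet within each age band the two rates are the same. The apparent gender effect comes entirely from comparing ratios that were not refined by age.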
Category 5: Failure to Use a Control Group When Needed
Of the police officers who died in 1995, 9 percent failed to reach the age of 50. Sixty percent of the articles published in Medical Journal Z have conclusions of doubtful validity. Ninety-five percent of couples seeking divorce have either one or both partners who do not attend church regularly.
Headline news bombards us with statements such as the ones above. For example, the inference some readers would draw from the 9 percent (a number we have not checked) of police officers who died before the age of 50 is that police officers die prematurely.
Appropriate control groups may show that a much higher proportion of deaths in other professions occurred before age 50. Indeed, statisticians call this type of error "a dangling or missing comparison." In short, it's a meaningless statistic without the appropriate comparison.
The general conclusion drawn from the second statement, namely that Journal Z is particularly lax in rejecting invalid studies, may also be wrong. What about the other journals? Is it possible that 70 percent of articles appearing in the top 10 medical journals contain conclusions of doubtful validity?
An inference that might be drawn from the third statement is that church-avoiding habits play a big role in divorce. Is it possible that 99.9 percent of happily married couples have either one or both partners who do not attend church regularly?
Our point? Statements of numbers without appropriate comparisons can be very misleading.
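A small sketch shows how a control figure changes the reading of a headline percent. The 9 percent figure is the unverified one from the police example; the 15 percent control figure is purely hypothetical, and the function name is ours.

```python
def compare_to_control(group_pct, control_pct):
    """Describe a percent relative to a control group's percent."""
    diff = group_pct - control_pct
    if diff > 0:
        return f"{diff:.1f} points higher than the control group"
    if diff < 0:
        return f"{-diff:.1f} points lower than the control group"
    return "the same as the control group"


police_pct = 9.0    # from the headline (a number the article has not checked)
control_pct = 15.0  # HYPOTHETICAL share of deaths before 50 in other professions

print(compare_to_control(police_pct, control_pct))
```

Under that assumed control figure, the headline number would point away from premature death among police officers. Without any control figure, it points nowhere at all.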
Category 6: Sloppy Data Collection Methods
For years it was claimed and backed by statistical evidence that instigators of bar room brawls in London were more likely to be killed in such fights than those forced to defend themselves.
Sociologists, psychologists, neuroscientists and substance abuse agencies studying this phenomenon formulated theories and produced voluminous articles that attempted to explain this rather unexpected and unusual outcome.
The Royal Statistical Society, however, was somewhat suspicious of these reported "outcomes" of bar room brawls and set out to check their accuracy for itself. The statisticians quickly discovered that the data was, indeed, questionable, as the collection methods did not encourage accurate results.
What the statisticians discovered was that every time a bar brawl ended with a fatality or with serious injuries, the investigating police officer dispatched to the incident would invariably ask: "Who started this?" Witnesses would immediately point to the victim lying on the floor and say, "He did!"
This response was duly noted and because of the victim’s inability to dispute the accusation, the witnesses' assertions were usually taken to be true.
Multi-colored charts accompanied by accurate-looking ratios are totally meaningless if the data collection methodology leaves much to be desired. A classic article in Fortune magazine many years ago, titled "We're Drowning In Phony Statistics," described how we're being inundated by meaningless statistics and unknowable statistics disguised as statistical fact.
Nothing much has changed. In subsequent issues, we will discuss problems related to data collection.
Why are we studying ratios? Because they are the foundation for many statistical methods. Most performance measurements involve ratios. In today's measurement-management environment, it is important to understand the potential for erroneous conclusions.
As we proceed in months to come, we will see many examples of "statistical non-sequiturs," a Latin term meaning "it doesn't follow." Just remember the famous statement by Will Rogers: "It ain't so much the things we know that gets us in trouble. It's the things we know that ain't so."