It’s sort of a joke, especially among industrial scientists, that physicists are OK at lots of things but not excellent at anything, and that this explains why there are so few physicists in industry doing physics. While there is some accuracy to the joke, the truth is that the one thing that physicists excel at is measurement. As my graduate advisor used to point out, all of physics is counting. The trick is just to figure out the right things to count and the right way to count them. That’s the essence of measurement, and it’s not always as easy as it seems.
Everybody needs to measure stuff. And whether you’re in a traditional business or own a Web 2.0 startup or are just an average gal or guy, the need to measure things quickly and precisely has gotten a lot more intense in the past decade. You want to understand where your business is in the Long Tail or how “sticky” your website is or how much your coffee habit is costing you annually. And when I say precisely, I mean precisely in a, well, precise sense. To speak precisely, precise and accurate are not the same thing. And this is the first thing to understand about measurement – a measurement is only valid when it is both sufficiently accurate and sufficiently precise.
Accuracy vs. Precision
When you measure something accurately, your measurement gives you a number that is very close to the truth. You may not get the same number each time you make your measurement, but you know that it’s close to the actual value. When you measure something precisely, you’ll get close to the same result each time, but you may not be close to the actual answer. Ideally, we want our measurements to be both accurate and precise. In reality, most folks have a higher tolerance for lack of accuracy than they do for lack of precision. As long as the measurement is reasonably accurate, most people will settle for something that is off from the truth by a good bit so long as they get consistent answers from it. If you check the measuring cups in your kitchen drawers, you will find that they are pretty precise. Fill your 1/4 cup measure up 4 times and dump it in your 1 cup measure and it will fill it up exactly. (Or at least it did on each of the three sets of measures I had in my kitchen.) Yet, I have no idea – nor do I care – if the cups are calibrated properly. Do they deliver exactly 1 cup? If you care about that kind of accuracy, you’ll probably be using a graduated cylinder, not a plastic measuring cup. For most of us, the fact that the cups are precise is more important.
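If you like to see the distinction in numbers, here is a minimal sketch in Python. It treats accuracy as the average error (the bias) and precision as the scatter between repeated readings. The readings are invented for illustration, the function name bias_and_spread is my own, and the reference value is simply one US cup expressed in millilitres.

```python
from statistics import mean, stdev

def bias_and_spread(readings, true_value):
    """Return (bias, spread) for a list of repeated readings.

    bias   = mean reading minus the true value (small bias -> accurate)
    spread = sample standard deviation of the readings (small spread -> precise)
    """
    bias = mean(readings) - true_value
    spread = stdev(readings)
    return bias, spread

# Assumed reference: one US cup is roughly 236.6 mL.
TRUE_VOLUME_ML = 236.6

# Hypothetical readings, made up for this example.
accurate_but_imprecise = [226.0, 247.0, 231.0, 242.0, 237.0]
precise_but_inaccurate = [249.8, 250.1, 250.0, 249.9, 250.2]

bias_a, spread_a = bias_and_spread(accurate_but_imprecise, TRUE_VOLUME_ML)
bias_p, spread_p = bias_and_spread(precise_but_inaccurate, TRUE_VOLUME_ML)

print(f"accurate but imprecise: bias={bias_a:+.1f} mL, spread={spread_a:.1f} mL")
print(f"precise but inaccurate: bias={bias_p:+.1f} mL, spread={spread_p:.1f} mL")
```

The first set averages out to the right answer but wanders by several millilitres from one reading to the next. The second set barely moves between readings but is consistently off by more than a tablespoon.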
Does our tolerance for inaccuracy seem surprising? If you use Google Analytics to track your website stats, it shouldn’t be. Google can’t know, accurately, how many unique visitors actually visited your site. How can they? Even though they set a cookie to track your visitors, a lot of folks using Firefox will only accept cookies for that session, thus preventing Google from counting them over multiple visits. I do essentially the same thing with Omniweb. A lot of folks using IE will occasionally flush all of their cookies as a privacy measure. Each time the Google Analytics cookie for your site gets deleted, that user looks like a new user to Google. This means your unique visitor count is artificially high, as is your percentage of new users visiting your site. But, really, it doesn’t matter. You don’t care how accurate that number is, because whether you have 570 or 450 unique visitors per day isn’t as important as the trend. Is that number going up or down? Is it higher on Saturday mornings or weekday nights? As long as the measurement is precise, then those trends can be analyzed meaningfully.
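To make that concrete, here is a rough sketch under one big assumption: the overcount is a constant multiplier from day to day. I borrowed the 570-versus-450 figures from above as that multiplier, and the week of "true" visitor counts is made up. This is not how Google Analytics computes anything; it only shows why a stable bias leaves the trend alone.

```python
# Assumed constant overcount: 570 reported uniques for every 450 real ones.
OVERCOUNT = 570 / 450

# Hypothetical true daily unique visitors over one week.
true_uniques = [450, 459, 468, 441, 477, 495, 504]

def reported(true_counts, overcount=OVERCOUNT):
    """Inflate each true count by the same factor, as a stable bias would."""
    return [count * overcount for count in true_counts]

def percent_changes(series):
    """Day-over-day change, in percent, between consecutive values."""
    return [100.0 * (later - earlier) / earlier
            for earlier, later in zip(series, series[1:])]

reported_uniques = reported(true_uniques)
true_trend = percent_changes(true_uniques)
reported_trend = percent_changes(reported_uniques)

for day, (t, r) in enumerate(zip(true_trend, reported_trend), start=2):
    print(f"day {day}: true {t:+.2f}%  reported {r:+.2f}%")
```

The reported counts are all too high, yet the day-over-day percentages match the true ones. If the bias drifted over time (say, a browser update changed how many people clear their cookies), the trend would be distorted too, which is exactly a loss of precision.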
Now we know what makes a measurement valid, and we understand that a large fraction of the time, we don’t need as much accuracy as we need precision. While I didn’t explicitly talk about it, it’s important to note that validity is predicated only upon sufficient accuracy and precision. Your car’s fuel gauge is neither terribly accurate nor terribly precise, but it represents a valid measurement because it gives you the data with sufficient accuracy and precision to keep you from running out of gas.
There are four other things to keep in mind about a measurement, which I’ll call the Four ‘R’s: Relevance, Range, Resolution, and Reproducibility.
Relevance in measurement is both subtle and obvious: you want your measurement to give you the information you need. Obvious, right? If the answer is “you have a full tank of gas” or “1 cup of milk,” then relevance seems to be trivial. We don’t always ask such easy questions, though. A lot of the things we measure aren’t things we really care about, but are instead things we believe are highly correlated to the things we care about. If you choose a proxy to measure that isn’t very relevant to the answers you need, then you can accumulate a lot of data that is useless to you. This isn’t always an easy problem to solve.
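One simple way to put a number on how relevant a proxy is: compute its correlation with the thing you actually care about, on whatever cases you can observe both. The sketch below uses the ordinary Pearson correlation coefficient. Every number in it is invented, and the names strong_proxy and weak_proxy are placeholders, not real data sets.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)

# Made-up numbers: the outcome we care about, and two candidate proxies.
outcome = [2, 4, 5, 7, 9]
strong_proxy = [1, 2, 3, 4, 5]
weak_proxy = [3, 1, 4, 1, 5]

r_strong = pearson_r(strong_proxy, outcome)
r_weak = pearson_r(weak_proxy, outcome)

print(f"strong proxy: r = {r_strong:.2f}")
print(f"weak proxy:   r = {r_weak:.2f}")
```

A proxy with a correlation near 1 tracks the outcome closely; one near 0 tells you very little, no matter how much of it you collect. Correlation is only one lens on relevance, of course, and it can only be checked after you have the outcomes in hand.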
One example of the difficulty of making a relevant measurement is SAT scores. The SAT is a measurement that we believe tells us something about how a student will perform in college. We can’t measure “successful in college” – at least not a priori, so ETS has made a test that some people think is an excellent proxy for measuring future success in college. Whether or not it is actually relevant is a matter of some debate currently.
Most college grads can name a buddy from their college days with excellent SAT scores who flunked out early on, due to boredom or irresponsibility. We might infer from this observation that discipline and work ethic are more highly correlated with college success than SAT scores are. If this were true, they would be a much more relevant measurement. To test this hypothesis, though, we’d have to actually measure a student’s discipline. That’s a hard problem, one that doesn’t lend itself to easy measurement. Not to say that colleges don’t try – I recall having to include letters of recommendation for my college applications, which presumably were complimentary of my excellent academic discipline, stellar study habits, and good dental hygiene. Since, when I entered college, I only had the good dental hygiene part going for me, I expect that most colleges recognize that this measurement is neither precise nor accurate, despite its relevance to the answer we seek.
Range and Resolution
Range and resolution are the properties of measurement that are the easiest to explain and understand. Range is the distance between the highest and lowest value your measurement technique will register. Resolution is the smallest difference your technique can distinguish within that range. Your desk ruler has a range of 12″, or about 30.5 cm, and a resolution of 1/16″, or about 1.6 mm. If you want to measure out 6 yards of a geotextile sheet for your garden, you could do that with the ruler, but you’d certainly agree that a tape measure would be a better tool. With some measurements, though, making sure you have a method with sufficient range is a lot more critical. This is why your meat thermometer and your household medical thermometer are different instruments, with different means of measuring temperature. By the same token, the resolution of your meat thermometer is a lot coarser. You don’t really need 0.1 °F resolution on your meat thermometer, but when taking your temperature to determine if you have a fever, that resolution is important.
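Here is a small sketch of how range and resolution shape what an instrument can tell you. The function read_instrument is something I made up for illustration: it refuses to answer outside the range and otherwise snaps the true value to the nearest tick. The ruler figures come from above; the meat thermometer's 0 to 220 °F range and 2 °F steps are assumptions, not the specs of any real product.

```python
from typing import Optional

def read_instrument(true_value: float, low: float, high: float,
                    resolution: float) -> Optional[float]:
    """Return the reading an idealized instrument would show.

    Values outside [low, high] cannot be registered, so the result is None.
    Values inside the range are snapped to the nearest multiple of
    `resolution`, counted from `low`.
    """
    if not (low <= true_value <= high):
        return None
    steps = round((true_value - low) / resolution)
    return low + steps * resolution

# A 12-inch desk ruler marked every 1/16 inch.
ruler_reading = read_instrument(5.30, 0.0, 12.0, 1 / 16)
print(ruler_reading)   # nearest 1/16" mark to 5.30"

# Six yards (216 inches) is outside the ruler's range in a single reading.
too_long = read_instrument(216.0, 0.0, 12.0, 1 / 16)
print(too_long)

# Hypothetical meat thermometer: 0 to 220 F in 2-degree steps.
body_temp_on_meat_thermometer = read_instrument(98.6, 0.0, 220.0, 2.0)
print(body_temp_on_meat_thermometer)
```

With 2-degree steps, a normal temperature and a mild fever can land on the same tick, which is why the coarse instrument is the wrong tool for that job even though the value sits comfortably inside its range.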
A measurement’s reproducibility is different from its precision, even though the two are easily confused. One way to think of it is that if you measured something ten times, the variance in your numbers would represent the precision of the measurement. If ten people measured it once, the variance in the measurement would represent its reproducibility. What this means is that if the measurement technique is easy to “do correctly,” then the measurement is reproducible. Conversely, it is possible to have a highly precise measurement that is not very reproducible, because of the difficulty in making the measurement. This is frequently the case with a lot of microscopical measurements, as variations in sample preparation and operator technique can affect the images before they’re measured. If you’re trying to determine whether or not a tissue biopsy contains cancerous cells by looking at a stained specimen, getting this right is critical.
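The ten-times-versus-ten-people idea translates directly into two variances. In this sketch the numbers are invented: one list stands for a single operator repeating a measurement ten times, the other for ten different operators measuring the same sample once each.

```python
from statistics import variance

# Hypothetical data: the same sample, measured in arbitrary units.
# One operator, ten repeat measurements.
same_operator = [10.1, 10.0, 10.1, 9.9, 10.0, 10.1, 10.0, 9.9, 10.0, 10.1]
# Ten operators, one measurement each.
ten_operators = [10.0, 10.6, 9.5, 10.9, 9.2, 10.3, 9.8, 10.7, 9.4, 10.1]

# Scatter within one operator's repeats: the precision described above.
precision_var = variance(same_operator)
# Scatter between operators: the reproducibility described above.
reproducibility_var = variance(ten_operators)

print(f"variance, one operator x 10:  {precision_var:.4f}")
print(f"variance, ten operators x 1:  {reproducibility_var:.4f}")
```

In this made-up case each individual is very consistent, but the operators disagree with one another by far more than any one of them disagrees with themselves. That is the signature of a precise measurement that is not very reproducible.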
Reproducibility is an issue with a lot of qualitative or semi-quantitative measurements. When I see in a recipe “fry onions until golden-brown,” I have to make a measurement with my eyes of the color of the onions in my pan as they are frying. I know from experience, though, that my idea of “golden-brown” means a lot more cooking than it does for many other people. The human visual system, it is safe to say, does not represent a very reproducible measurement. It works well enough, though, so I don’t expect that many recipes will include spectrophotometer data for accurate, precise and reproducible measurements of onion color anytime in the near future.
The key to making a measurement reproducible is to have a clear process for making the measurement. In my field, there are standards bodies that publish volumes and volumes of proper procedures for making measurements such as the stiffness of a plastic (ASTM D6272) or how well something resists burning (ISO 11925). For the example I gave above about biopsied tissue, there are standard staining protocols that ensure that the proper stains are used and at the appropriate levels. But you probably understand this concept instinctively – if you always level off a measuring spoon or hold a measuring cup up so that the liquid is at eye level, you’re following a standard procedure that will make your measurements reproducible.
So, what now? If you’ve gotten this far, you’re at least somewhat interested in the topic, since I am no J. K. Rowling. Thinking about measurement can be terribly addictive – at least, it is to me. Have you ever wondered how they measure inflation? While you’ve probably heard of the Consumer Price Index, the details of the measurement might surprise you. More practically, you might also be interested in combing your customer data for indications of whether you’re doing the right things in your business. You might want to measure your website traffic to determine whether your new web ad campaign is giving you a good return. If you own a restaurant, you may want to measure how effective your menu is. In any case, good measurement is the key to getting good answers in a lot of fields and understanding the mechanics of making a measurement is part of good measurement.