Last month John Cooley released the results of his 2007 Verification Census. He concluded, among other things, that SystemVerilog use is up, 'e' use is down, and that most engineers think specialty languages such as 'e' and Vera will be dead in 5 years. Mike Fister, head honcho at Cadence, shot back at Cooley, saying that he felt the survey wasn't "statistically relevant". Cooley claims his 818 responses must be significant, and that Fister is simply "protecting his $4 M paycheck":
My second question was "what is this 3% that Fister is talking about?" Then
I figured 818 responses / 25,000 ESNUG subscribers = 3.2%. That must be it.
Hmmm... I'm not a statistician. So I phoned Gary Smith about this 3%.
"Heck, 818 responses is plenty. We do directed surveys all the
time and easily as few as 35 responses in a selected category can
be statistically significant. Fister needs to track these
subcategories very closely to know. So far, Cadence has not been
open at all about outside information coming into the company."
- Gary Smith of Gary Smith EDA
OK, so I'm not drinking my own Kool-Aid in this survey. Crap! And I'm just
now remembering all those CNN polls where they only asked *500* people about
some Big Issue -- and *that* poll data is considered statistically kosher
to represent the attitudes of 300 million Americans! Crap.
All this barking was just Mike Fister protecting his $4 M paycheck. Funny.
After reading this exchange, I felt as though both Cooley and Fister had made mistakes in their claims about the survey data. I am definitely not a survey expert, so I decided to do some checking on the web to find out whether any of the claims about the validity of Cooley's data could hold up.
One of the first things I found was this sample size calculator from Raosoft. According to the calculator, a sample size of 818 out of a population ranging from 20,000 to 50,000 gives a margin of error just a bit over +/-3%. Once the population grows much beyond 20,000, the margin of error stays roughly the same. So, Cooley is right that a sample size of 818 could be statistically relevant. However, I still had a sneaking suspicion that something fishy was going on. Why? Take a look at those quick polls they sometimes have on CNN.com or ESPN.com, among others. People visiting the site are asked a question such as "Who looks better in argyle shorts - Democrats or Republicans?" Hundreds or thousands of people answer the question. But does that make the survey statistically relevant? No, not really. Why? Because the respondents weren't selected at random. According to Chapter 1 of WhatIsASurvey.info, a publication written by the American Statistical Association:
In a bona fide survey, the sample is not selected haphazardly or only from persons who volunteer to participate. It is scientifically chosen so that each person in the population will have a measurable chance of selection. This way, the results can be reliably projected from the sample to the larger population.
Additionally, under the heading "What Are Other Potential Concerns":
The quality of a survey is largely determined by its purpose and the way it is conducted.
Most call-in TV inquiries (e.g., 900 "polls") or magazine write-in "polls," for example, are highly suspect. These and other "self-selected opinion polls (SLOPS)" may be misleading since participants have not been scientifically selected. Typically, in SLOPS, persons with strong opinions (often negative) are more likely to respond.
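As an aside, the margin-of-error numbers quoted above are easy to sanity-check. The Raosoft calculator presumably uses the standard formula for a proportion at 95% confidence; the sketch below assumes the worst case p = 0.5 and a simple random sample (which, as the ASA notes, is exactly what a self-selected poll is not). The finite-population correction term is also why the population size barely matters once it gets much past 20,000:

```python
import math

def margin_of_error(n, N, p=0.5, z=1.96):
    """Margin of error for a proportion at 95% confidence (z = 1.96),
    with the finite-population correction applied."""
    se = math.sqrt(p * (1 - p) / n)      # standard error of the proportion
    fpc = math.sqrt((N - n) / (N - 1))   # finite-population correction
    return z * se * fpc

# 818 responses out of ~25,000 ESNUG subscribers
print(round(margin_of_error(818, 25_000) * 100, 1))        # -> 3.4

# A 500-person poll of 300 million Americans (FPC is essentially 1)
print(round(margin_of_error(500, 300_000_000) * 100, 1))   # -> 4.4
```

So 818 responses really do work out to roughly +/-3%, and a 500-person CNN poll to roughly +/-4% — but only under the random-sampling assumption that both calculations quietly bake in.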
Herein lies the main issue with Cooley's survey. The participants selected themselves, drawn largely from subscribers to the Deepchip mailing list. A better survey would randomly select participants from a wider cross section of the verification community at large. Could Cooley's survey be valid? Possibly. Can he prove it? Not simply by throwing around the size of his sample.