Statistics, Darn Statistics, and Xamplify
Xamplify made some wild claims about the statistics behind its methodologies, but knowing that Dave Donoho of Stanford was the person behind those statistics, the claims were hard to ignore. However, since I never saw Donoho consulting with the company about its statistical claims, I should be forgiven for stating that, in my humble view, Xamplify's statistics were just hot air.
Sample Size
The following is from Xamplify's web site:
What's the minimum sample size required to get an accurate understanding of my installed base?
- Approximately 900 to 1000 users will provide a statistically valid sample of the core psychometric information through the dialogue collection mechanism. This sample is used by Xamplify's extrapolation algorithms to build models for the entire customer-base of a client. For a company like Charles Schwab, this sampling size would be approximately 0.02% of their customers.
- ...[snip]...
- The extrapolation algorithms rely on a 3-stage association/correlation technology that is well established and used in other disciplines in the US (e.g., presidential election predictions-notwithstanding the most recent one). These techniques were architected and audited by our Chief Statistician (and Stanford University Professor), Dr. Dave Donaho.
I worked on the models, and the correlations in the sample were quite low to begin with (an r-square of about 0.07 was the norm). How could we extrapolate the results to a much larger population? In my view, the models were so poor (partly because the company was trying to fit a straight line, which didn't seem correct) that, forget about extrapolation, they explained the relationships very poorly even for the sample surveyed.
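To put that r-square of 0.07 in perspective, here is a small back-of-the-envelope sketch. It assumes an ordinary least-squares straight-line fit with an intercept, where the residual variance in the sample equals (1 - r-square) times the total variance. The function names are mine, purely for illustration, and are not anything from Xamplify's software.

```python
import math


def residual_std_fraction(r_squared: float) -> float:
    """Fraction of the outcome's standard deviation left unexplained.

    Assumes an in-sample OLS fit with an intercept, where
    residual variance = (1 - r_squared) * total variance.
    """
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must be between 0 and 1")
    return math.sqrt(1.0 - r_squared)


def implied_correlation(r_squared: float) -> float:
    """Magnitude of the correlation implied by r_squared (single predictor)."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must be between 0 and 1")
    return math.sqrt(r_squared)


r2 = 0.07  # the typical value mentioned above
print(f"implied |r|:           {implied_correlation(r2):.3f}")
print(f"unexplained std share: {residual_std_fraction(r2):.3f}")
```

Under those assumptions, an r-square of 0.07 corresponds to a correlation of roughly 0.26 in magnitude, and the typical prediction error is still about 96% of what you would get by simply guessing the sample average for everyone.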
Extrapolation
The following is from the same web page:
How can Xamplify create psychometric profiles of a client's entire customer base?
Xamplify segments the entire installed base through advanced extrapolation techniques. The software extrapolates models of the non-profiled portion of the customer base from those of the profiled portion. Our software creates highly focused associations between users via their demographic-plus-transactional history and their psychometric profiles. The extrapolation algorithms and transfer equations we use have been field tested repeatedly and become more accurate with additional data. We are currently operating at an accuracy rate of over 90%. In addition, the system's "hypothesis testing" schema ensures that the customer models self-correct over time. Even prior to self-correction, however, our sampling and extrapolation technique gives a far superior performance, since the hypothesis relies on a large, independent set of intelligence (demographics, psychometrics and transactional information).
My comments: Oh, my.
Link (Saved copy from archive.org. Look towards the bottom of the page - last 6 headings.)