### Comparing treatment groups at baseline

Is anything more aggravating than a statement that treatment group differences were 'not statistically significant' at baseline? This can mask either of two fundamental misunderstandings: what a statistically significant difference means, or why the groups are being compared in the first place.

What is desired is an indication that the treatment groups are exchangeable on all relevant characteristics. This means identical centers (e.g. means) or proportions, and/or p-values approaching 1.

| Trait | Intervention | Control | n | p-value | Interpretation |
|---|---|---|---|---|---|
| Mean age | 49.1 | 49.2 | 10,000 | <.01 | NO difference |
| Mean age | 23.5 | 46.2 | 10 | 0.46 | Different! |
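To see why the p-value says more about sample size than about similarity, here is a minimal sketch using a normal approximation for a difference in means. The table gives no standard deviations, so `ASSUMED_SD` is a made-up value, `two_sided_p` is a hypothetical helper, and n is treated as the size of each group. It does not try to reproduce the table's p-values; it only shows the same 0.1-year difference tested at two sample sizes.

```python
import math

def two_sided_p(mean_a, mean_b, sd, n_per_group):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = sd * math.sqrt(2 / n_per_group)  # standard error of the difference
    z = (mean_a - mean_b) / se
    return math.erfc(abs(z) / math.sqrt(2))

ASSUMED_SD = 2.5  # hypothetical common SD in years; not given in the table

# Same trivial difference in mean age, two very different sample sizes.
p_large = two_sided_p(49.1, 49.2, ASSUMED_SD, 10_000)
p_small = two_sided_p(49.1, 49.2, ASSUMED_SD, 10)

print(f"n = 10,000 per group: p = {p_large:.4f}")
print(f"n = 10 per group:     p = {p_small:.4f}")
```

Under these assumptions the large-sample p-value falls below .01 and the small-sample one sits above 0.9, although the groups differ by exactly the same tenth of a year in both cases.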

Of course, you could have identical means and very different distributions, e.g. young and old in one group, middle-aged in the other. Some day I'll ask a statistician about another idea that someone must have proposed already: an overlap statistic for describing similarity. To lay out the idea...

Histograms of **continuous variables** would be smoothed according to sample size. The overlap statistic would be the proportion of the area under both curves to the total area.
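One way to read the continuous version is sketched below. It rests on several assumptions that are not in the text above: the smoothing is a Gaussian kernel density estimate with Silverman's rule-of-thumb bandwidth (so smaller samples get smoothed more), each curve is scaled to a total area of 1, and the overlap is the area lying under both curves. The function names and the two age samples are invented for illustration.

```python
import math
import statistics

def silverman_bandwidth(sample):
    # Rule-of-thumb bandwidth: shrinks as n grows, so small samples are smoothed more.
    # Assumes the sample has at least two distinct values.
    return 1.06 * statistics.stdev(sample) * len(sample) ** (-1 / 5)

def kde(sample, bandwidth):
    """Gaussian kernel density estimate; returns a function of x with total area 1."""
    norm = len(sample) * bandwidth * math.sqrt(2 * math.pi)
    def density(x):
        return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample) / norm
    return density

def overlap_continuous(a, b, steps=2000):
    """Area lying under both smoothed curves, via a midpoint Riemann sum."""
    h_a, h_b = silverman_bandwidth(a), silverman_bandwidth(b)
    f, g = kde(a, h_a), kde(b, h_b)
    pad = 5 * max(h_a, h_b)  # extend the grid far enough to cover both tails
    lo = min(min(a), min(b)) - pad
    hi = max(max(a), max(b)) + pad
    dx = (hi - lo) / steps
    midpoints = (lo + (i + 0.5) * dx for i in range(steps))
    return sum(min(f(x), g(x)) for x in midpoints) * dx

# Hypothetical ages: both groups have a mean of 45 but very different shapes.
young_and_old = [22, 24, 26, 28, 62, 64, 66, 68]
middle_aged = [41, 43, 44, 45, 45, 46, 47, 49]

print(statistics.mean(young_and_old), statistics.mean(middle_aged))
print(round(overlap_continuous(young_and_old, middle_aged), 2))
```

Two identical samples give an overlap of essentially 1, and two samples with no values in common give an overlap near 0. The choice of bandwidth matters a good deal here, so the number should be read as a rough description of similarity and not as a test.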

For **categorical variables** the overlap statistic would be the sum, across categories, of the smaller of the two groups' percentages. It's easy to illustrate in a table:

| Category | %F | %M | min | running sum |
|---|---|---|---|---|
| Female | 45 | 50 | 45 | 45 |
| Male | 52 | 45 | +45 | 90 |
| Other | 3 | 5 | +3 | 93 |

Conclusion: The overlap statistic would be 0.93 (i.e. 93%), because for every 100 people in each group, 7 would not have a matching 'partner' in the other group.
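The categorical calculation is short enough to sketch directly. The percentages below are the ones from the table; the names `pct_f`, `pct_m` and `overlap_categorical` are only illustrative labels for its two columns and for the helper.

```python
def overlap_categorical(pct_a, pct_b):
    """Sum of the smaller percentage in each category, returned as a proportion."""
    categories = set(pct_a) | set(pct_b)
    # A category missing from one group counts as 0% there.
    shared = sum(min(pct_a.get(c, 0), pct_b.get(c, 0)) for c in categories)
    return shared / 100

# Percentages copied from the table above (each column sums to 100).
pct_f = {"Female": 45, "Male": 52, "Other": 3}
pct_m = {"Female": 50, "Male": 45, "Other": 5}

overlap = overlap_categorical(pct_f, pct_m)
print(overlap)  # 0.93
```

Taking the union of the category names means a category that appears in only one group contributes nothing to the overlap, which matches the 'no matching partner' reading.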