Statistics Resources

This is a page of resources on statistics and computing.

Stata Add-Ons:
I’ve developed and continue to develop a set of Add-Ons for the statistical package Stata. These Add-Ons are meant to be downloaded and used together. They can be installed in any personal ado directory for Stata, usually c:\ado\personal on Windows and ~/Library/Application Support/Stata/ado/personal on Macs. To install, simply place the files directly into your personal ado directory (or modify your file to point to your desired personal ado directory). After opening Stata, type “discard” at the command prompt (this must be done whenever new files are placed in these folders). Each program has a help file that can be accessed by typing “help program_name” at the Stata command prompt. There is also an overall description of the programs available in the package, which can be accessed by typing “help programlist”.

The Stata Add-On package can be downloaded here:
Stata AddOns (Stata Versions 9-13)
Stata AddOns (Stata Versions 14 & 15)

These programs are continuously updated, so please check back often for updates.
If you discover a problem please contact me at

Presentations/White Papers:
Data Organization for Young Investigators is a primer for biomedical researchers on how to set up, enter, and store data using Microsoft Excel. This paper was developed over the years in response to receiving disastrous data sets from researchers using Excel. Now when someone asks “how should I give you the data?” I simply hand them this document.

Data Organization and Creating Reproducible Analyses: A presentation concerning how to organize data and the workflow of analysis to ensure reproducible results.

Analysis of Categorical Data: A presentation with examples and syntax in Stata, R, and SPSS on binary, ordinal, and multinomial logistic regression models.

Graphical Exploration of Statistical Interactions: A presentation showing how to understand two-way and three-way interactions, including examples for graphing continuous-by-continuous interactions.
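As a rough illustration of the continuous-by-continuous case (not taken from the presentation itself), one common approach is to plot the predicted outcome against the predictor at a few fixed values of the moderator, such as one standard deviation below the mean, the mean, and one standard deviation above it. The sketch below assumes a model of the form y = b0 + b_x*x + b_m*m + b_xm*x*m. The coefficient values, the moderator mean and SD, and all function names are hypothetical and chosen only for illustration.

```python
def simple_slope(b_x, b_xm, m):
    """Slope of y on x when the moderator is held at the value m."""
    return b_x + b_xm * m


def predicted(b0, b_x, b_m, b_xm, x, m):
    """Predicted y from a model with an x-by-m interaction term."""
    return b0 + b_x * x + b_m * m + b_xm * x * m


# Hypothetical coefficients; in practice these come from a fitted model.
coef = dict(b0=2.0, b_x=0.5, b_m=0.3, b_xm=0.25)

# Assumed moderator mean and standard deviation (e.g. a standardized moderator).
m_mean, m_sd = 0.0, 1.0
x_grid = [-2, -1, 0, 1, 2]

# One predicted line per moderator value; these are the lines one would plot.
lines = {}
for label, m in [("-1 SD", m_mean - m_sd), ("mean", m_mean), ("+1 SD", m_mean + m_sd)]:
    lines[label] = [predicted(**coef, x=x, m=m) for x in x_grid]
    slope = simple_slope(coef["b_x"], coef["b_xm"], m)
    print(f"moderator at {label}: slope of x = {slope:.2f}, predictions = {lines[label]}")
```

Each entry in lines is one line of the interaction plot. If the lines are not parallel, the effect of x depends on the level of the moderator.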

Latent Growth Curves as Mixed Models: Latent growth curves are a common means of modeling longitudinal data in psychology. In other fields, these models go by many other names, such as the hierarchical linear model, multilevel model, and mixed effects model. While the latent growth curve is implemented in a Structural Equation Modeling (SEM) framework, the mixed effects model can be fit in most statistical packages and does not require the use of SEM.

Analysis of Confounding: There is much focus in the psychological literature on the influence of mediating variables, so much so that Baron and Kenny’s 1986 paper on mediation is one of the most widely cited papers in psychology. However, mediation is only one small part of the story as it pertains to third variables in statistical models. While mediation addresses attenuation of direct effects, the direct effects could also be amplified or have their direction reversed. In the health sciences, the analysis of these alternative possibilities is termed “Analysis of Confounding”.
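A small numeric sketch may help make the three possibilities concrete. It is not drawn from the presentation; the data are made up, and the function names are hypothetical. The outcome y is built exactly as b*x + g*z, so the slope of x adjusted for the third variable z is always b, while the crude (unadjusted) slope changes with the sign and size of g. Comparing the two shows attenuation, amplification, or reversal once z is taken into account.

```python
def cross_products(a, b):
    """Sum of cross-products of a and b around their means."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


def crude_slope(x, y):
    """Slope from regressing y on x alone."""
    return cross_products(x, y) / cross_products(x, x)


def adjusted_slope(x, z, y):
    """Slope of x from regressing y on both x and z (two-predictor OLS)."""
    sxx, szz, sxz = cross_products(x, x), cross_products(z, z), cross_products(x, z)
    sxy, szy = cross_products(x, y), cross_products(z, y)
    return (sxy * szz - sxz * szy) / (sxx * szz - sxz ** 2)


# Made-up data: z is a binary third variable that is correlated with x.
z = [0, 0, 0, 0, 1, 1, 1, 1]
x = [1, 2, 3, 4, 3, 4, 5, 6]


def compare(b, g):
    """Return (crude, adjusted) slopes of x when y = b*x + g*z exactly."""
    y = [b * xi + g * zi for xi, zi in zip(x, z)]
    return crude_slope(x, y), adjusted_slope(x, z, y)


# Adjusted slope is b = 1 in every scenario; only the crude slope moves.
scenarios = [
    ("crude estimate shrinks after adjustment", 3.0),
    ("crude estimate grows after adjustment", -3.0),
    ("crude estimate changes sign after adjustment", -9.0),
]
for label, g in scenarios:
    crude, adjusted = compare(1.0, g)
    print(f"{label}: crude = {crude:.3f}, adjusted = {adjusted:.3f}")
```

Only the first scenario resembles the attenuation pattern that mediation analyses typically look for. The other two would be missed by an analysis that only asks whether an effect shrinks.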

Analysis of Repeated Measures using Mixed Effects Models: Mixed effects models are some of the most commonly used models to analyze data with repeated measures. Though these models may go by many names (HLM, Multilevel, Random Effects, Latent Growth), they can all be expressed as special cases of the mixed model, where repeated measures violate the usual assumption of statistical independence. This talk will attempt to demystify mixed models by discussing the why and how of fitting mixed models, with examples in R using the “nlme” package. [Example Syntax]