## Integrating StatTransfer into Stata for Linux and Unix (Mac OS)

I often use the *stcmd* package, written by Roger Newson, which lets you read SPSS, SAS, and other file types directly into Stata using the *inputst* command. While this has worked well on the PCs I use, I’d never been able to get the program to work on my Linux (Ubuntu) or Unix (Mac OS) machines… until now.

## Reading Shared Dropbox Files into Stata

I often share code and datafiles with collaborators and have found Dropbox to be a very useful tool in this regard (keeping in mind that Dropbox is not HIPAA or FERPA compliant, so be certain the data are de-identified).

I recently discovered an even easier way to use Dropbox to demonstrate code to a collaborator.

When sharing a Dropbox link to a datafile, you get a path like the following: https://www.dropbox.com/s/h3kknr9jq98c41h/census.txt?dl=0

By changing the zero (dl=0) at the end of the path to a one (dl=1), the file can be read in directly with the standard **use**, **import excel**, or **insheet** commands. For example:

insheet using "https://www.dropbox.com/s/h3kknr9jq98c41h/census.txt?dl=1", clear

Now the Stata syntax I share can work on any computer regardless of the file/directory structure of the collaborator. Similarly, files they share don’t have to be locally on my machine. Awesome!
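If you paste share links often, the dl=0 to dl=1 swap can be done in the do-file itself. This is only a sketch: the macro names `shared` and `direct` are my own choices for illustration, and it assumes the link ends in `dl=0` as in the example above.

```stata
* Share link exactly as copied from Dropbox (ends in dl=0)
local shared "https://www.dropbox.com/s/h3kknr9jq98c41h/census.txt?dl=0"

* Replace the first occurrence of "dl=0" with "dl=1" to get a direct link
local direct = subinstr("`shared'", "dl=0", "dl=1", 1)

* Read the file straight from the rewritten URL
insheet using "`direct'", clear
```

Keeping the original link in its own macro means collaborators can paste a new link in one place without editing the import line.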

## Sample Size: Is 10 a magic number?

I’m often asked “What is the minimal sample size I need to do…” This is perhaps one of my favorite questions, as I certainly don’t perform sample size calculations off the cuff, yet there should be some notion of what an appropriate minimal N is. Often I’m answering this question in the context of reproducibility/reliability analyses, and I typically find myself saying “10, no less than 10”. As students we’re taught that there should be 10 subjects per independent variable (IV) in a regression model to prevent overfitting, yet why does this notion of an appropriate number of subjects per IV propagate itself into a minimal sample size? As it turns out, there are some valid reasons to think that 10 (or thereabouts) is a magical number in sample size estimation.

## Integer Sequences in SAS

In Stata, I regularly find myself using the integer sequence command (*seq()*) available as part of the **egen** functions. When I use SAS, I’m dismayed at how difficult creating an integer sequence can be. There’s a website by Richard DeVenezia which goes over a variety of ways to create integer sequences in SAS (index-within-group.sas).

I’m going to demonstrate how this code relates to the Stata egen seq() command.
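For reference, here is a minimal sketch of what egen seq() produces on the Stata side. The six-observation toy dataset and the variable names `a`, `b`, and `c` are assumptions made up for illustration, not part of the SAS code referenced above.

```stata
* Toy dataset: six empty observations (illustrative only)
clear
set obs 6

* Default sequence: 1 2 3 4 5 6
egen a = seq()

* Count from 1 to 3, restarting once to() is reached: 1 2 3 1 2 3
egen b = seq(), from(1) to(3)

* Repeat each value in blocks of 2: 1 1 2 2 3 3
egen c = seq(), to(3) block(2)

list a b c, noobs
```

The from(), to(), and block() options cover most repeating or blocked index patterns in a single line.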