teamramonycajal  ·  3894 days ago  ·  post: Soapbox Sundays: Expound on an Idea/Philosophy of Yours

What made me learn R in my second semester of statistics was that my professor actually gave us references with the exact command and all the variables and what they meant. He EXPLAINED it. He didn't make us scour the internets for what the hell we were supposed to do for a two-sample t-test. He gave us the commands. He expected us to know how to use them.

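Did you explain to your boss that Student's t-test (the equal-variance kind) is basically an ANOVA for only 2 samples, and that once you have more than 2 groups you run the ANOVA and then follow it up with Tukey's HSD to see which groups differ (which is really not that hard in R)?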
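For anyone curious what that looks like, here is a minimal sketch using base R's aov() and TukeyHSD(). The three groups and their numbers are made up purely for illustration; they are not from any real dataset.

```r
# Hypothetical data: nine made-up measurements in three groups.
scores <- c(5.1, 4.9, 5.4, 6.2, 6.0, 6.5, 7.1, 7.4, 6.9)
group  <- factor(rep(c("A", "B", "C"), each = 3))

fit <- aov(scores ~ group)   # one-way ANOVA across the three groups
summary(fit)                 # overall F-test: is there any difference at all?

tukey <- TukeyHSD(fit)       # post hoc: which pairs of groups differ?
tukey                        # one row per pair: B-A, C-A, C-B
```

The ANOVA only says whether some difference exists; the Tukey table is what tells you which pairs are responsible.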

This is the thingamabobber I typed up for the sophomore, published here just in case anyone else on Hubski wants a mini-statistics and mini-R lesson:

IMPORTING YOUR DATA INTO R

Say you have a set of data. Before you do anything with this data at all, make sure each variable is in the columns, not the rows (i.e. the headers are at the top of each column and the data goes down the column). This makes working with the data a lot easier.

Let's assume you use Microsoft Excel 2007 or later, because that's common, and that you have this little sheet of data, called dogbones.csv (save it as a .csv, because R hates Excel. Also, save it in your My Documents folder, because on Windows that is usually R's default working directory, which is where R looks for files when you give it just a file name). Even though the data's going to be typed out here in rows because this is an email, imagine that it is in columns.

     Dogs: 1 2 3 4 5
     FemurLengths: 14 15 16 17 18
So dog 1 has femur length 14, dog 5 has femur length 18, and so on.

To import this data into R, there are a few ways to do it.

If you have a small dataset with one IV and one DV, you can type:

   dogs <- c(1, 2, 3, 4, 5)
   fmrlengths <- c(14, 15, 16, 17, 18)
and ignore the more complicated dataset commands.

If you have a huge dataset saved as a .csv file, you type:

     dogbones <- read.csv("dogbones.csv", header=TRUE)

To get the same sets as 'dogs' and 'fmrlengths', you type:

     dogs <- dogbones$Dogs
     fmrlengths <- dogbones$FemurLengths
(As an aside here, before you import a data set into R, it's a lot easier if the variable names are kinda condensed: short, with no spaces, since read.csv turns spaces in column headers into dots.)
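If you want to try the import without making the spreadsheet first, here is a rough sketch that writes the dogbones data to a temporary file and reads it back. The temporary path is only there to keep the example self-contained; with a real file in your working directory you would pass just "dogbones.csv".

```r
# Write the example data to a temporary CSV so the sketch runs on its own.
csv_path <- file.path(tempdir(), "dogbones.csv")
writeLines(c("Dogs,FemurLengths",
             "1,14", "2,15", "3,16", "4,17", "5,18"),
           csv_path)

getwd()                                    # the folder R searches for bare file names

dogbones <- read.csv(csv_path, header = TRUE)
str(dogbones)                              # check the column names and types came through

dogs       <- dogbones$Dogs                # pull one column out by name
fmrlengths <- dogbones$FemurLengths
```

Running str() right after an import is a cheap way to catch a header that ended up as data or a number column that got read as text.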

TESTS

One-sample T-test: Let's say we have a sample called 'population' and we want to see whether its mean is significantly different from 3.

   t.test(population, mu=3)
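
Here is a small sketch of what you get back, assuming some made-up numbers for 'population'. t.test() returns an object you can pull pieces out of instead of just reading the printout.

```r
# Hypothetical sample; the values are invented for illustration.
population <- c(2.8, 3.4, 3.1, 3.9, 2.5, 3.6, 3.3, 3.0)

result <- t.test(population, mu = 3)   # H0: the true mean is 3

result$statistic   # the t value
result$parameter   # degrees of freedom (n - 1)
result$p.value     # two-sided p-value by default
result$conf.int    # 95% confidence interval for the mean
```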

Two-sample T-test: Let's say we're comparing heights of men and women, and that these variables are coded as 'hmales' and 'hfemales', and that we're assuming equal variance. All you type is:

   t.test(hmales, hfemales, var.equal=TRUE)
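
A quick sketch with invented heights, to show two things: what var.equal changes, and that the equal-variance t-test is the two-group case of a one-way ANOVA (the squared t statistic matches the ANOVA F statistic). The numbers and the 'heights' and 'sex' names are assumptions for the example only.

```r
# Hypothetical heights in cm; made up for illustration.
hmales   <- c(178, 182, 175, 169, 185, 180)
hfemales <- c(165, 170, 160, 172, 158, 167)

pooled <- t.test(hmales, hfemales, var.equal = TRUE)  # classic Student's t-test
welch  <- t.test(hmales, hfemales)                    # R's default: Welch, no equal-variance assumption

# The same comparison set up as a one-way ANOVA with two groups.
heights <- c(hmales, hfemales)
sex     <- factor(rep(c("male", "female"),
                      times = c(length(hmales), length(hfemales))))
anova_f <- oneway.test(heights ~ sex, var.equal = TRUE)$statistic

# Compare t squared with F.
all.equal(unname(pooled$statistic)^2, unname(anova_f))
```

If you leave out var.equal=TRUE, R quietly runs the Welch version, so the degrees of freedom in the output will usually not be a whole number.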

Pearson's chi-square: For a 2-way table called 'townsmog' where the rows are towns A and B and the columns are 'bothered by pollution' and 'not bothered by pollution':

   chisq.test(townsmog)
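
In case building the table is the confusing part, here is a sketch that makes 'townsmog' by hand with matrix() and runs the test with chisq.test(). The counts are invented for the example.

```r
# Hypothetical counts: rows are towns, columns are survey answers.
townsmog <- matrix(c(30, 20,
                     15, 35),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Town    = c("A", "B"),
                                   Opinion = c("bothered", "not bothered")))

result <- chisq.test(townsmog)   # for a 2x2 table, Yates' continuity correction is on by default
result$expected                  # expected counts if town and opinion were independent
result$p.value

uncorrected <- chisq.test(townsmog, correct = FALSE)   # plain Pearson chi-square
uncorrected$statistic
```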

Correlation test: Is 'money' correlated with 'intelligence'?

   cor.test(money, intelligence)
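
A minimal sketch with made-up values for 'money' and 'intelligence'. cor.test() does Pearson's correlation unless you tell it otherwise; the method argument switches it to a rank-based test.

```r
# Hypothetical paired observations; invented for illustration.
money        <- c(32, 45, 51, 38, 60, 47, 55, 41)
intelligence <- c(98, 105, 112, 101, 120, 108, 115, 99)

ct <- cor.test(money, intelligence)   # Pearson's r by default
ct$estimate                           # the correlation coefficient
ct$p.value                            # H0: the true correlation is 0
ct$conf.int                           # 95% confidence interval for r

# Rank-based alternative if the relationship may not be linear.
sp <- cor.test(money, intelligence, method = "spearman")
sp$estimate
```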

Linear regression test: Assuming 'money' is correlated with 'intelligence', how much does 'intelligence' predict 'money', and is this significant?

   moneyIQreg <- lm(money ~ intelligence)
   summary(moneyIQreg)
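
And a sketch of pulling the useful numbers out of the regression, again with invented data. The slope says how much 'money' changes per unit of 'intelligence', and R-squared says how much of the variation the line accounts for.

```r
# Hypothetical paired observations; invented for illustration.
money        <- c(32, 45, 51, 38, 60, 47, 55, 41)
intelligence <- c(98, 105, 112, 101, 120, 108, 115, 99)

moneyIQreg <- lm(money ~ intelligence)   # money is the DV, intelligence the IV
reg_summary <- summary(moneyIQreg)

coef(moneyIQreg)            # intercept and slope
reg_summary$coefficients    # estimates, standard errors, t values, p-values
reg_summary$r.squared       # proportion of variance in money explained
```

With a single predictor, R-squared is just the square of the Pearson correlation between the two variables.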