
Stats 250 Full RLabs for Fall 2015 PDF

101 Pages·2015·8.92 MB·English


Author: Brenda Gunderson, Ph.D., 2015

License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution-NonCommercial-Share Alike 3.0 Unported License: http://creativecommons.org/licenses/by-nc-sa/3.0/

The University of Michigan Open.Michigan initiative has reviewed this material in accordance with U.S. Copyright Law and has tried to maximize your ability to use, share, and adapt it. The attribution key provides information about how you may share and adapt this material.

Copyright holders of content included in this material should contact [email protected] with any questions, corrections, or clarification regarding the use of content.

For more information about how to attribute these materials visit: http://open.umich.edu/education/about/terms-of-use.

Some materials are used with permission from the copyright holders. You may need to obtain new permission to use those materials for other uses. This includes all content from:

Attribution Key

For more information see: http://open.umich.edu/wiki/AttributionPolicy

Content the copyright holder, author, or law permits you to use, share and adapt:
• Creative Commons Attribution-NonCommercial-Share Alike License
• Public Domain – Self Dedicated: Works that a copyright holder has dedicated to the public domain.

Make Your Own Assessment

Content Open.Michigan believes can be used, shared, and adapted because it is ineligible for copyright:
• Public Domain – Ineligible: Works that are ineligible for copyright protection in the U.S. (17 USC §102(b)) *laws in your jurisdiction may differ.

Content Open.Michigan has used under a Fair Use determination:
• Fair Use: Use of works that is determined to be Fair consistent with the U.S. Copyright Act (17 USC § 107) *laws in your jurisdiction may differ.

Our determination DOES NOT mean that all uses of this third-party content are Fair Uses and we DO NOT guarantee that your use of the content is Fair.
To use this content you should conduct your own independent analysis to determine whether or not your use will be Fair.

Statistics 250 Lab Workbook
Fall 2015

Weekly Labs and Supplements
Used in all lab sections of Stat 250

Dr. Brenda Gunderson
Department of Statistics
University of Michigan

Table of Contents

Note to Students and Supplements
Supplement 1: R Commands Summary
Supplement 2: Notation Sheet
Supplement 3: Name That Scenario
Supplement 4: Interpretation Examples
Supplement 5: Summary of the Main t-Tests
Supplement 6: Regression Output in R
Lab 1: Describing Data with Graphs and Numbers
Lab 2: Probability and Random Variables
Lab 3: Confidence Intervals for a Population Proportion
Lab 4: Hypothesis Testing for a Population Proportion
Lab 5: Understanding Normal and Random Data
Lab 6: Learning about a Population Mean
Lab 7: Paired Data Analysis
Lab 8: Comparing Two Means
Lab 9: One-Way Analysis of Variance (ANOVA)
Lab 10: Exploring Linear Regression
Lab 11: Regression Inference
Lab 12: Chi-Square Tests

Note to Students

Welcome to Statistics 250 at the University of Michigan!

This is the first summer term in which R and R Commander will be used as the software package for Stats 250. Some of the reasons why we made this switch are:

• The ability to use R is a valuable skill recognized by employers.
• Other Statistics courses use R and this will make for an easier transition into these next courses.
• R is free, open-source software that can be downloaded onto student machines, so students can have access to it any time on their personal devices and won't have to use Virtual Sites.

This lab workbook is designed for you to use in lab and as extra preparation for exams. In the workbook, you will find the following materials:

Supplemental Material – great summaries for reference throughout the term:
1. R Commands Reference
2. Notation Sheet
3. Name That Scenario
4. Interpretation Examples
5. Summary of T-tests
6. Regression Output in R

Weekly Labs (numbered 1 to 12) – each lab contains the following parts:
o Lab Background – objective and brief overview material, which is good to take a couple of minutes to read before you come to lab each week.
o Warm-Up Activity – quick questions for you to do before the In-Lab Project, usually a quick review of concepts you have seen in lecture.
o ILP (In-Lab Project) – one or more activities you will work on in lab, in groups.
o Cool-Down Activity – questions for you to do after the ILP for further reflection and application of the concepts covered in the ILP.

The Labs are designed to be interactive and to provide you with a complete example for each concept. Completing the corresponding PreLab assignment (a link to video instructions for PreLabs will be on Canvas and the Stat 250 YouTube channel) and reading the upcoming lab background overview before lab each week is a good way to prepare for the various lab activities.

Good luck in Statistics 250!
-- The Stat 250 Instructors and GSIs

Special Thanks to the Statistics Graduate Students
Kit Clement
Sean Pikosz
Daniel Walter
For their substantial contributions to transition and modernize the Lab Materials to the Awesome R computing package

Supplement 1: R Commands Summary

By Lab – For Quick Reference

Lab 1 – Bar Charts, Histograms, Numerical Summaries, Boxplots

Open a data file after loading R Commander: Data > Load data set
To produce a Histogram: Graphs > Histogram
To generate Descriptive Statistics: Statistics > Summaries > Numerical summaries
To produce a Bar Chart: Graphs > Bar Graph
To produce a Boxplot: Graphs > Boxplot

Lab 5 – Time Plots, QQ Plots

To produce a Sequence or Time Plot for the variable named "VARIABLE" in the data set "DATA", type this line of code into the R Script box:

    plot(DATA$VARIABLE, type="l", main="Time Plot of variable by name")

Note that you can find the dataset name in blue text at the top. To find variable names, click View data set and look at the top row. To create the plot, highlight the above code and click the Submit button.

To produce a QQ Plot, you can use the built-in option under Graphs > Quantile-comparison plot. Or you can make a QQ plot for the variable "VARIABLE" in the data set "DATA" by typing these two lines of code into the R Script box:

    qqnorm(DATA$VARIABLE, main="Normal QQ Plot of variable by name")
    qqline(DATA$VARIABLE)

Then highlight this code and click the Submit button.
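The Lab 5 commands above can be tried outside R Commander with a small script. This is only a sketch under stated assumptions: the data frame `DATA` and its column `VARIABLE` are simulated stand-ins for a real course data set, and the plot titles and axis labels are illustrative choices.

```r
# Hypothetical stand-in for a data set loaded through R Commander.
# The names DATA and VARIABLE mirror the placeholders used in the text.
set.seed(250)
DATA <- data.frame(VARIABLE = rnorm(50, mean = 100, sd = 15))

# Time (sequence) plot: values drawn in the order they appear in the data set
plot(DATA$VARIABLE, type = "l",
     main = "Time Plot of VARIABLE",
     xlab = "Observation order", ylab = "VARIABLE")

# Normal QQ plot, with a reference line through the first and third quartiles
qqnorm(DATA$VARIABLE, main = "Normal QQ Plot of VARIABLE")
qqline(DATA$VARIABLE)

# Numerical summaries, roughly what the Lab 1 menu option reports
summary(DATA$VARIABLE)
sd(DATA$VARIABLE)
```

With a real data set, replace the simulated `DATA` with the active data set name shown in R Commander and `VARIABLE` with a column name from View data set.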
Lab 6 – One-Sample t Procedures for a Population Mean

To perform a One-Sample T Test for a population mean and obtain a confidence interval: Statistics > Means > Single-sample t-test

Lab 7 – Paired t Procedures

To perform a Paired T Test and obtain a confidence interval: Statistics > Means > Paired t-test
To compute Differences: Data > Manage variables in active data set > Compute new variable

Lab 8 – Independent Samples t Procedures

To perform Levene's Test: Statistics > Variances > Levene's Test
To perform a Two-Samples T Test and obtain a confidence interval: Statistics > Means > Independent samples t-test

Lab 9 – One-way Analysis of Variance (ANOVA)

To perform an ANOVA: Statistics > Means > One-Way ANOVA

Lab 10 and 11 – Linear Regression

To produce the correlation (R) for all pairs of variables: Statistics > Summaries > Correlation matrix
To produce a Scatterplot: Graphs > Scatterplot
To perform a Linear Regression: Statistics > Fit models > Linear regression
To produce a Residual plot and QQ Plot of residuals, first make sure you have the correct model selected, then follow: Models > Graphs > Basic diagnostic plots

Lab 12 – Chi-Square Tests

To perform a Goodness of Fit Test: Statistics > Summaries > Frequency distributions. Make sure to check the box to run a goodness of fit test, and then you can specify the null probabilities.
To perform a Test of Independence: Statistics > Contingency tables > Two-way table
To perform a Test of Homogeneity: Statistics > Contingency tables > Two-way table

Supplement 2: Notation Sheet

The table below defines important notations, including that used by R, which you will come across in the course. This is not an exhaustive list, but it is a fairly comprehensive overview of the "strange letters" used in the course.

Note: Blank cells mean there is no corresponding notation.

Name                                Population Notation         Sample Notation           Notation used in R Commander

Summary Measures
Mean                                μ (read as "mu")            x̄ (x-bar)                 Mean
Proportion                          p                           p̂ (p-hat)
Standard deviation                  σ (sigma)                   s                         Varies, often "sd"
Variance                            σ²                          s²                        Variance
Sample size                                                     n                         n (sometimes N)

Confidence Intervals
Multipliers                                                     z* (z-star)
                                                                t* (t-star)
Margin of error                                                 m, m.e.

Hypothesis Testing
Test statistics                                                 z
                                                                t                         t
                                                                F                         F
                                                                χ² (chi-square)           Chi-square
Significance level                                              α (alpha)
p-value                                                         p-value                   Pr(*) (the star will depend on what test is being used)

Note: t, F, and χ² statistics have degrees of freedom (abbreviated df) associated with them. Look for these on your Formula Card.

Analysis of Variance (abbreviated ANOVA)
Sum of squares for groups                                       SSG                       Row labeled with the grouping variable, column labeled Sum Sq
Sum of squares for error                                        SSE                       Row labeled Residuals, column labeled Sum Sq
Mean square for groups                                          MSG                       Row labeled with the grouping variable, column labeled Mean Sq
Mean square error                                               MSE                       Row labeled Residuals, column labeled Mean Sq

Regression
Response (dependent) variable       y                           y                         (given by name of y-variable)
Predicted (estimated) response      E(y) (expected value of y)  ŷ (y-hat)
Explanatory (independent) variable  x                           x                         (given by name of x-variable)
y-intercept                         β₀ (beta-naught)            b₀                        B (look in the row labeled (Intercept))
Slope                               β₁ (beta-one)               b₁                        B (look in the row labeled with the name of the x-variable)
Coefficient of correlation                                      r                         Values in Correlation Matrix
Coefficient of determination                                    r²                        Multiple R-squared
Error terms vs Residuals            ε (error terms)             e (residuals)             Unstandardized residuals
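To connect the notation sheet to script output, the sketch below computes several of the sample quantities on simulated data. It is an illustration under stated assumptions, not part of the course materials: the variables `x`, `y`, and `group` are hypothetical, and the margin of error uses the standard one-sample t formula m = t* × s / √n, which the notation sheet itself does not spell out. In the printed output of `summary(fit)`, the coefficients appear in a column labeled Estimate.

```r
# Simulated data; every variable name here is hypothetical.
set.seed(250)
x <- runif(40, min = 0, max = 10)
y <- 3 + 2 * x + rnorm(40, sd = 1.5)
group <- factor(rep(c("A", "B", "C", "D"), each = 10))

# Summary measures and a 95% confidence interval for a mean
n      <- length(y)
xbar   <- mean(y)                      # x-bar
s      <- sd(y)                        # s
t_star <- qt(0.975, df = n - 1)        # t* multiplier for 95% confidence
m      <- t_star * s / sqrt(n)         # margin of error
ci     <- c(xbar - m, xbar + m)

# Regression: b0 and b1 come from the coefficient table
fit   <- lm(y ~ x)
b0    <- coef(fit)[["(Intercept)"]]    # row labeled (Intercept)
b1    <- coef(fit)[["x"]]              # row labeled with the x-variable name
r     <- cor(x, y)                     # coefficient of correlation
r2    <- summary(fit)$r.squared        # printed as "Multiple R-squared"
y_hat <- fitted(fit)                   # y-hat
e     <- resid(fit)                    # residuals

# One-way ANOVA table: rows for the grouping variable and Residuals
aov_table <- anova(lm(y ~ group))
SSG <- aov_table["group", "Sum Sq"]
SSE <- aov_table["Residuals", "Sum Sq"]
MSG <- aov_table["group", "Mean Sq"]
MSE <- aov_table["Residuals", "Mean Sq"]
F_stat <- MSG / MSE                    # matches the "F value" column
```

Printing `summary(fit)` and `aov_table` shows where each of these values sits in the output that R Commander displays.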
