Course description
With the increasing use of programming languages in data analytics, now is the time to learn their ins and outs. This course focuses on understanding statistical models and analyzing their results while learning to work with R. As well as introducing the software to newcomers, it presents basic and more advanced statistics within the overarching framework of the generalized linear model.
The first week is devoted to learning R and regression analysis. We start with reading data into R, descriptive statistics, and visual representation of data, which together form the first step of any statistical analysis. We then introduce the linear regression model, a widely used model with two main purposes: modeling relationships among variables and predicting future observations.
In the second week, we extend the linear model to the generalized linear framework in order to analyze discrete dependent variables. The logit regression that you will work with proves useful for understanding the topic of the remainder of the course: classification. You will learn how to reduce data dimensions using principal component analysis and cluster analysis, and how to use the methods you have learned for prediction.
Every day consists of short lectures with examples, followed by exercises in which you apply what you have learned right away. The exercises and the assignment focus on coding in R and on applying and interpreting generalized linear regression models. After class, you are expected to work on an assignment that integrates what you have learned in the exercises during class. This assignment will be graded.
Download the detailed preliminary course syllabus here.