COURSE AIMS AND OBJECTIVES: This course aims to demonstrate how computer-intensive methods can extend inference to situations where classical approaches fall short. Students will also be exposed to Monte Carlo, re-sampling, and selected statistical learning methods. The emphasis is on algorithms, software tools, and practical applications drawn from the pharmaceutical, finance, and engineering fields.
COURSE DESCRIPTION AND SYLLABUS:
1. Introduction via application: a Monte Carlo experiment for examining robustness; bootstrapping for measuring the accuracy of estimators. Random number generators. The inverse probability method. The Box-Muller method. Marsaglia's table method. Rejection methods.
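To fix ideas, the inverse probability and Box-Muller methods above fit in a few lines each. The sketch below is illustrative only and in Python (the course itself works in SAS and R); the function names are mine:

```python
import math
import random

def exponential_inverse(lam, rng=random.random):
    # Inverse probability (inverse CDF) method:
    # if U ~ Uniform(0,1), then -ln(1 - U) / lam ~ Exponential(lam).
    u = rng()
    return -math.log(1.0 - u) / lam

def box_muller(rng=random.random):
    # Box-Muller: transform two independent uniforms into two
    # independent standard normal variates.
    u1 = 1.0 - rng()          # shift to (0, 1] so log(u1) is defined
    u2 = rng()
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)
```

Both generators need only a uniform source, which is why the course treats uniform random number generators first.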
2. Generators in SAS and R. Univariate generators: uniform, normal, binomial, gamma. Fleishman algorithm. Metropolis-Hastings algorithm. Gibbs algorithm.
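As orientation for the Metropolis-Hastings algorithm, here is a minimal random-walk sampler in Python (a sketch for illustration only; the course implements these methods in SAS and R, and the function name is mine). It targets any density supplied through its log:

```python
import math
import random

def metropolis_hastings(log_target, x0, n, step=1.0, rng=random):
    # Random-walk Metropolis-Hastings: propose x' = x + N(0, step^2)
    # and accept with probability min(1, target(x') / target(x)).
    chain = [x0]
    x, lp = x0, log_target(x0)
    for _ in range(n):
        prop = x + rng.gauss(0.0, step)
        lp_prop = log_target(prop)
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = prop, lp_prop
        chain.append(x)
    return chain
```

For a symmetric proposal the acceptance ratio reduces to the target ratio, which is the case sketched here; early iterations are usually discarded as burn-in.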
3. Multivariate generators: transformations and Gibbs method. Generating random matrices.
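A small Python sketch of the Gibbs method for a multivariate target (illustrative only; the course uses SAS and R, and the function name is mine). For a bivariate normal with unit variances and correlation rho, each full conditional is itself normal, so the sampler alternates two univariate draws:

```python
import math
import random

def gibbs_bivariate_normal(rho, n, rng=random):
    # Gibbs sampling: alternately draw each coordinate from its full
    # conditional, here x | y ~ N(rho * y, 1 - rho^2) and symmetrically.
    s = math.sqrt(1.0 - rho * rho)
    x = y = 0.0
    draws = []
    for _ in range(n):
        x = rng.gauss(rho * y, s)
        y = rng.gauss(rho * x, s)
        draws.append((x, y))
    return draws
```
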
4. Monte Carlo estimation. Estimation of a definite integral. Estimation of variance in Monte Carlo estimation. Examples. Planning student projects.
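The two estimation tasks above (the integral and the variance of its Monte Carlo estimator) can be sketched together in Python (illustration only; the course works in SAS and R, and the function name is mine):

```python
import random

def mc_integral(f, a, b, n, rng=random.random):
    # Monte Carlo estimate of the integral of f over [a, b]:
    # (b - a) * mean(f(U_i)) with U_i ~ Uniform(a, b), together with
    # the estimated variance of the estimator itself.
    ys = [f(a + (b - a) * rng()) for _ in range(n)]
    mean = sum(ys) / n
    var_y = sum((y - mean) ** 2 for y in ys) / (n - 1)
    est = (b - a) * mean
    est_var = (b - a) ** 2 * var_y / n   # variance of the MC estimator
    return est, est_var
```

The returned `est_var` shrinks at rate 1/n, which motivates the variance reduction techniques taken up later in the course.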
5. Monte Carlo experiments. Design, control, reproducibility, efficiency, documentation, reporting issues.
6. Detailed explanation, implementation, and visualization of a Monte Carlo experiment. Implementation of student projects.
7. Monte Carlo tests: generating data from a hypothesized model. Variance reduction techniques in Monte Carlo estimation: antithetic variates, control variates, control variates with regression, systematic sampling.
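The antithetic technique pairs each uniform U with 1 - U; when the integrand is monotone the pair is negatively correlated and the averaged estimator has lower variance than plain sampling. A Python sketch (illustrative only; function names are mine):

```python
import random

def mc_plain(f, n, rng=random.random):
    # Plain Monte Carlo estimate of E[f(U)], U ~ Uniform(0, 1).
    return sum(f(rng()) for _ in range(n)) / n

def mc_antithetic(f, n, rng=random.random):
    # Antithetic variates: evaluate f at both U and 1 - U, so each
    # uniform draw contributes a negatively correlated pair.
    total = 0.0
    pairs = n // 2
    for _ in range(pairs):
        u = rng()
        total += f(u) + f(1.0 - u)
    return total / (2 * pairs)
```

Both estimators are unbiased; the antithetic one uses half as many uniform draws for the same number of function evaluations.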
8. Variance reduction techniques in Monte Carlo estimation (continued): conditional sampling, importance sampling, stratification. Example: estimation of the characteristic roots (eigenvalues) of a random covariance matrix.
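Importance sampling rewrites E_p[h(X)] as E_q[h(X) p(X)/q(X)] and samples from a proposal q concentrated where h·p is large, e.g. estimating a normal tail probability from a shifted-exponential proposal. A generic Python sketch (illustration only; the function name and argument layout are mine):

```python
import math
import random

def importance_sample(h, log_p, log_q, sample_q, n, rng=random):
    # Importance sampling: draw x ~ q and average
    # h(x) * w(x), where w(x) = p(x) / q(x) is the importance weight,
    # computed on the log scale for numerical stability.
    total = 0.0
    for _ in range(n):
        x = sample_q(rng)
        total += h(x) * math.exp(log_p(x) - log_q(x))
    return total / n
```

The estimator is unbiased whenever q is positive wherever h·p is nonzero; the gain over plain sampling comes from weights with small variance.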
9. Re-sampling methods: the bootstrap (nonparametric and parametric), the jackknife, cross-validation, data partitioning, randomization, out-of-bag estimation.
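The nonparametric bootstrap in one sketch: resample the data with replacement, recompute the statistic, and use the spread of the replicates as its standard error. Python is used for illustration only (the course works in SAS and R) and the function name is mine:

```python
import random

def bootstrap_se(data, stat, n_boot=1000, rng=random):
    # Nonparametric bootstrap estimate of the standard error of stat:
    # the standard deviation of the statistic over bootstrap resamples.
    reps = []
    for _ in range(n_boot):
        resample = [rng.choice(data) for _ in data]
        reps.append(stat(resample))
    m = sum(reps) / n_boot
    return (sum((r - m) ** 2 for r in reps) / (n_boot - 1)) ** 0.5
```

The same loop, with model-based simulation in place of `rng.choice`, gives the parametric bootstrap.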
10. Graphical methods in computational statistics. Graphical displays for one and two variables. Displaying the third variable.
11. Visualizing multivariate data. Dynamic graphics. Exploring multivariate data.
12. Monte Carlo methods and statistical learning. Multiple linear regression model. Variable selection methods. Bagging and its application in multiple linear regression.
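Bagging applied to regression, in miniature: fit the model on bootstrap resamples of the (x, y) pairs and average the predictions. The Python sketch below uses a simple one-predictor least-squares fit for illustration (function names are mine; the course material covers the multiple-regression case):

```python
import random

def fit_line(xs, ys):
    # Ordinary least squares fit of y = a + b * x.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def bagged_predict(xs, ys, x_new, n_bags=50, rng=random):
    # Bagging: refit on bootstrap resamples of the data pairs and
    # average the resulting predictions at x_new.
    n = len(xs)
    preds = []
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]
        a, b = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(a + b * x_new)
    return sum(preds) / n_bags
```

Averaging over resamples mainly stabilizes high-variance fits; for an already-stable model such as this noiseless line it simply reproduces the prediction.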
13. Classification trees: CART, CHAID. SAS Enterprise Miner examples.
14. Bagging classification trees. Boosting. Random Forests. Graphical displays.
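Bagged classification trees in the smallest possible form: depth-1 trees ("stumps") fit on bootstrap resamples, combined by majority vote. This Python sketch is illustrative only (function names are mine); a true Random Forest additionally samples a random subset of features at each split, which a one-feature example cannot show:

```python
import random

def fit_stump(points):
    # points: list of (x, label) pairs with labels 0/1. A depth-1 tree:
    # choose the threshold and side minimizing training errors.
    best = None
    for t, _ in points:
        for left in (0, 1):
            err = sum(lab != (left if x <= t else 1 - left)
                      for x, lab in points)
            if best is None or err < best[0]:
                best = (err, t, left)
    _, t, left = best
    return lambda x: left if x <= t else 1 - left

def bagged_vote(points, x_new, n_trees=25, rng=random):
    # Bagging classifiers: one stump per bootstrap resample,
    # prediction by majority vote.
    n = len(points)
    votes = 0
    for _ in range(n_trees):
        boot = [points[rng.randrange(n)] for _ in range(n)]
        votes += fit_stump(boot)(x_new)
    return 1 if 2 * votes > n_trees else 0
```

Individual stumps trained on resamples can err near the class boundary; the vote smooths these errors out, which is the point of the ensemble.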