Model selection is essential to both the CAS FCAS and ICAS CSPA examinations. This talk will cover the basic concepts of over-fitting and under-fitting. I will introduce commonly used parametric, semi-parametric, and non-parametric models. Penalties and validation will be demonstrated through the fitting of generalized linear models, classification and tree-based regression models, and random forest models. The pros and cons of one- and two-stage modeling will be discussed. Drawing on the author's past five years of experience as an examiner, the talk will also highlight common model selection problems seen in various examination papers and suggest improvements. It will also help candidates taking the new CAS PCPA examination in 2025.
Learning Objectives:
4. Summarize the pros and cons of one- and two-stage modeling.
5. Explore the different steps in identifying sub-optimal GLMs.
6. Benchmark the baseline result in classification modeling against other modeling results in both the training and testing phases.
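
As a rough illustration of the benchmarking objective, the sketch below compares a majority-class baseline with a one-split threshold rule (a stand-in for a very small tree) on both training and testing data. The toy data, the variable names, and the helpers `majority_baseline`, `fit_threshold`, and `accuracy` are all hypothetical and are not taken from the talk; any real exercise would use the models and metrics named in the examination material.

```python
from collections import Counter


def majority_baseline(y_train):
    """Return the most frequent class in the training labels."""
    return Counter(y_train).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Share of observations whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_threshold(x_train, y_train):
    """Pick the cut-off on x that maximizes training accuracy (a one-split stump)."""
    best_cut, best_acc = None, -1.0
    for cut in sorted(set(x_train)):
        acc = accuracy(y_train, [int(x >= cut) for x in x_train])
        if acc > best_acc:  # ties keep the smaller cut-off
            best_cut, best_acc = cut, acc
    return best_cut


# Hypothetical toy data: x is a single rating variable, y is a 0/1 indicator.
x_train = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_train = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
x_test = [2, 5, 6, 8, 9]
y_test = [0, 0, 1, 1, 1]

base_class = majority_baseline(y_train)  # fitted on training data only
cut = fit_threshold(x_train, y_train)    # fitted on training data only

# Score the baseline and the stump in both phases.
results = {}
for phase, xs, ys in [("train", x_train, y_train), ("test", x_test, y_test)]:
    results[phase] = {
        "baseline": accuracy(ys, [base_class] * len(ys)),
        "stump": accuracy(ys, [int(x >= cut) for x in xs]),
    }
    print(phase, results[phase])
```

A model is only worth keeping if it beats the baseline in the testing phase as well as the training phase; a large gap between the two phases is one simple sign of over-fitting.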