What is Predictive Modeling?
Casualty Actuaries of the Northeast
Spring 2005
Sturbridge, MA
March 23, 2005
Presented by Christopher Monsour, FCAS, MAAA
©2005 Towers Perrin

What is it?
• Estimation of likely outcomes based on historical data
• The emphasis is on estimating the parameters as a means to estimating the outcomes
  • As opposed to financial modeling, where the emphasis is on modeling the probabilities of various outcomes, given the parameters
• The emphasis is on different estimates for different combinations of characteristics or for different entities
  • In financial modeling, the emphasis is on the range of possible outcomes for a single entity
• Thus, predictive modeling belongs to statistics and data mining
  • Whereas financial modeling largely belongs to probability theory
• Finally, emphasis on predictions, NOT on interpreting model parameters
  • May “interpret” parameters when building model, but only as a means to developing the best model

What sort of outcomes?
Quantitative (regression models)
• The expected length of time to repair an automobile, given
  • Its make, model, and model year
  • The nature of the repair
  • The technician assigned
  • The day of the week service began
• The expected losses for an insured based on that insured’s
  • Driving record
  • Age, sex, marital status
  • Location
  • Credit rating
  • Occupation

What sort of outcomes?
Categorical (classification models)
Soft assignment
• The probability of your home being broken into, depending on
  • Your location
  • The life-stage of your household
  • Whether you have a burglar alarm
  • Whether you have a garage
• The probability that an insured will buy pet health insurance if asked, based on
  • Age, sex, marital status
  • Location
  • Occupation
  • Household type
  • Home value

What sort of outcomes?
Qualitative (classification models)
Hard assignment
• Often soft assignment model plus a threshold, but not always
• Classic example: to which subspecies does a particular botanical specimen belong, based on
  • Dimensions
  • Coloring
• Is a claim fraudulent?
  • Characteristics of claim
  • Of doctors and lawyers involved
  • Of claimant
  • Of agent or broker

What tools come from predictive models?
Rating factors
• In a linear regression model or a GLM, the model parameters may be interpreted more-or-less directly as indicated rating factors in an additive or multiplicative rating scheme (depending on the type of model)
• The model parameters in a loss ratio model may be interpreted as the amount by which the rating factors need to change

What tools come out of it?
Scoring
• In a GLM or linear regression, the model scores are added up and a different treatment is applied to various ranges of scores, such as
  • Tier assignment for rating / underwriting
  • Adjuster assignment for claims

Farm: Illustration of Scoring

  Animals
    Cattle (ranch)   -30
    Cattle (dairy)    20
    Horses            20
    Sheep            -10
  Size of Farm
    < 50 acres       -20
    50-100 acres       0
    100-320 acres     10
    320-640            0
    640+             -30
  Crops
    Wheat, Barley    [scores not legible in source]

What tools come out of it?
Rules
• Other models produce branching rules

Are there horses?
  No:  Are there dairy cattle?
    No:  Score is -20
    Yes: Score is 10
  Yes: More than 320 acres?
    No:  Score is -10
    Yes: Score is +20

Related types of modeling
“Unsupervised” learning
• Categorical modeling where the categories are not determined in advance
• Effectively amounts to looking for dense patches, or “clusters”, in an appropriate feature space
• Classic example is subspecies classification when the names and number of subspecies are unknown in advance
• Geographic use in insurance
  • Feature space can be one dimensional, e.g., pure premium
  • Or can be multi-dimensional, e.g., crime rate, percentage of housing units occupied by owners, etc.

Related types of modeling
Cause-and-effect
• About interpretation of parameters
• Is a certain model of automobile more dangerous than another?
  • Suppose you attempted to answer this just from accident data or insurance data
  • Think about what you might miss
• Well-known that the sign of a coefficient for a predictor can change in a regression model as you add more predictors
  • Is the model correctly specified?
  • Have you added all the predictors you should have out of a possibly infinite number?
• Much more difficult to validate than predictions
• There are specialized methods, used especially in psychology
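
The point about a coefficient changing sign as predictors are added can be illustrated with a small sketch. The data below are synthetic and hand-made for this illustration (they are not from the presentation), and the names x1, x2, and y are placeholders. The outcome is constructed as y = 4 - x1 + 10*x2, so the effect of x1 with x2 held fixed is -1, yet a regression of y on x1 alone gives a positive slope because x1 and x2 move together.

```python
# Synthetic, hand-made data (an assumption for illustration only).
# By construction y = 4 - x1 + 10*x2, so the x1 effect given x2 is -1.
x1 = [1, 2, 3, 6, 7, 8]
x2 = [0, 0, 0, 1, 1, 1]   # x2 is strongly correlated with x1
y = [3, 2, 1, 8, 7, 6]


def mean(v):
    return sum(v) / len(v)


def cross(u, v):
    """Sum of products of deviations from the respective means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def slope_one_predictor(x, y):
    """Least-squares slope of y on a single predictor x (with intercept)."""
    return cross(x, y) / cross(x, x)


def slopes_two_predictors(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 (with intercept),
    from the 2x2 normal equations on centered data."""
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y = cross(x1, y), cross(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


b1_alone = slope_one_predictor(x1, y)
b1_adj, b2_adj = slopes_two_predictors(x1, x2, y)

print("x1 coefficient, x1 only:      ", round(b1_alone, 3))
print("x1 coefficient, x1 and x2:    ", round(b1_adj, 3))
print("x2 coefficient, x1 and x2:    ", round(b2_adj, 3))
```

In this toy setup the x1 coefficient is positive when x1 is the only predictor and negative once x2 is included. Neither fit says which reading is causal; that depends on whether the model is correctly specified, which is the concern raised above.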