Generative and Discriminative Approaches to Graphical Models
CMSC 35900 Topics in AI
Lecture 1
Yasemin Altun
January 3, 2007

ADMINISTRATIVE STUFF
- Lectures: Wednesday 3:30-6:00, TTI Room 201
- Office hours: Wednesday 10am-Noon
- No textbook. Reference reading will be handed out.
- No homework, no exam
- Presentation/Discussions: 40% of grade
- Final Project: 60% of grade
  - Apply one of the methods discussed to your research area (e.g. NLP, vision, CompBio)
  - Add new features to SVM-struct or other available packages
  - Theoretical work
- URL http://ttic.uchicago.edu/ altun/Teaching/CS359/index.html
- Class mail list

Prerequisites
- Familiarity with
  - Probability: random variables, densities, expectations, joint, marginal, and conditional probabilities, Bayes rule, independence
  - Linear algebra
  - Optimization: Lagrangian methods

Traditional Prediction Problems
- Supervised learning: given input-output pairs, find a function that predicts the outputs of new inputs
  - Binary classification, label class {0,1}
  - Multiclass classification, label class {0,...,m}
  - Regression, label class ℝ (the real numbers)
- Unsupervised learning: given only inputs, discover some structure, e.g. clusters, outliers
- Semi-supervised learning: given a few input-output pairs and many inputs, find a function to predict the outputs of new inputs
- Transduction: given a few input-output pairs and many inputs, find a function that predicts well on the unlabeled inputs

Key Components
- 4 aspects of learning
  - Representation
  - Parameterization and the hypothesis space
  - Learning objective
  - Optimization method
- Different settings lead to different learning methods
- For prediction tasks, the state-of-the-art methods are Support Vector Machines, Boosting, and Gaussian Processes

Discriminative Learning
- All these methods are from the discriminative learning paradigm
- Notation: treat inputs and outputs as random variables. X is the input with instantiation x, Y is the output with instantiation y, and p(x) denotes the probability P(X = x).
- Given an input x, they discriminate the target label y, e.g. by modeling p(y|x)
- Since they condition on x, they can treat arbitrarily complex objects as input
- Versus a generative approach,
  - where the goal is to estimate the joint distribution p(x,y).
  - p(x,y) = p(y) p(x|y)
  - p(x|y): given the target label, generate the input.
  - e.g. Naive Bayes classifier

Structured (Output) Prediction
- Traditionally, discriminative methods predict one simple variable.
- In real life, this is rarely the case.
- Not taking dependencies between output variables into account is an important shortcoming.
- Domains: Natural Language Processing, Speech, Information Retrieval, Computer Vision, Bioinformatics, Computational Economy

Examples
- Domain: Natural Language Processing
- Application: Part-of-speech tagging
- Input: a sequence of words
- Output: a label for each word (noun, verb, adjective, etc.)

    John  hit  the  ball
    Noun  Vb   Det  Noun

Examples
- Domain: Computational Biology
- Application: Protein Secondary Structure Prediction
- Input: amino-acid sequence
    AAY KSHGSGDYGDHDVGHPTPGDPWVEPDYGINVYH
- Output: H/E/- regions
    HHHH-------EEEEEEEE---------HHHHH----

Examples
- Domain: Computer Vision
- Application: Identifying joint angles of the human body
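
To make the generative factorization p(x,y) = p(y) p(x|y) from the Discriminative Learning slide concrete, here is a minimal Naive Bayes sketch in Python. It is an illustration under stated assumptions, not code from the lecture: the toy data, the restriction to binary features, the add-one smoothing, and all function names are made up for this example. The model estimates p(y) and p(x|y), forms the joint, and only then recovers p(y|x) through Bayes rule, whereas a discriminative method would model p(y|x) directly.

```python
from collections import Counter
from itertools import product


def fit_naive_bayes(X, y, alpha=1.0):
    """Estimate p(y) and p(x_j = 1 | y) for binary features.

    alpha is an add-alpha (Laplace) smoothing constant; this choice is an
    assumption of the sketch, not something the slides prescribe.
    """
    n = len(y)
    class_counts = Counter(y)
    prior = {c: class_counts[c] / n for c in class_counts}
    n_features = len(X[0])
    cond = {}
    for c in class_counts:
        rows = [x for x, label in zip(X, y) if label == c]
        cond[c] = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
    return prior, cond


def joint(x, c, prior, cond):
    """p(x, y=c) = p(y=c) * prod_j p(x_j | y=c), the Naive Bayes factorization."""
    p = prior[c]
    for xj, pj in zip(x, cond[c]):
        p *= pj if xj == 1 else 1.0 - pj
    return p


def posterior(x, prior, cond):
    """p(y | x) obtained from the joint via Bayes rule."""
    joints = {c: joint(x, c, prior, cond) for c in prior}
    evidence = sum(joints.values())  # p(x) = sum_y p(x, y)
    return {c: v / evidence for c, v in joints.items()}


def predict(x, prior, cond):
    """Return the label with the highest posterior probability."""
    post = posterior(x, prior, cond)
    return max(post, key=post.get)


# Hypothetical toy data: two binary features, labels in {0, 1}.
X_train = [[1, 1], [1, 0], [1, 1], [0, 0], [0, 1], [0, 0]]
y_train = [1, 1, 1, 0, 0, 0]

prior, cond = fit_naive_bayes(X_train, y_train)
print(posterior([1, 1], prior, cond))
print(predict([0, 0], prior, cond))
```

Because the model is generative, summing `joint` over every possible input and label gives 1, which is a quick sanity check that a full distribution p(x,y) has been estimated rather than just a decision rule.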