
Machine Learning for Information Retrieval PDF

108 Pages·2012·2.38 MB·English
by Yi Zhang

Preview Machine Learning for Information Retrieval

Machine Learning for Information Retrieval
Yi Zhang, University of California Santa Cruz

What's Information Retrieval?
[Diagram: information retrieval spans accessing information through generating knowledge, and connects to clustering, categorization, information filtering/recommendation, search, question answering, and mining tasks such as sentiment analysis and information extraction.]

Related Areas
[Diagram: information retrieval sits between information domains (economics, marketing, psychology, health, legal, financial, ...) and computer science areas: modeling, statistics and optimization, artificial intelligence, machine learning and data mining, natural language processing, human-computer interaction, distributed computing, computer networks, storage, security, and systems.]

Outline
PART I
- Supervised learning and its IR applications: text classification, adaptive filtering, collaborative filtering; search engine ranking
- Semi-supervised learning and its application to text classification
PART II
- Frontiers of information retrieval
- Example: proactive information retrieval (recommendation systems)
The first part was prepared with Rong Jin.

Supervised Learning: Basic Setting
- Given: pairs of input and output variables (training data): {(x_1, y_1), (x_2, y_2), ..., (x_N, y_N)}
- Learning: infer a function f(X) from the training data
- Inference: predict future outcomes y = f(x) for a given x
- Classification: f: R^|V| → {0, 1}; the classifier maps an input X to a category c
- Regression: f: R^|V| → R; the regression maps an input X to a real-valued output

How to Learn
Given evidence (data) E, find a hypothesis (model) h.
- Maximum likelihood (ML) estimation: h_ML = argmax_h P(E | h), where P(E | h) is the data likelihood.
- Maximum a posteriori (MAP) estimation: h_MAP = argmax_h P(h | E), where P(h | E) is the posterior probability of the hypothesis. By Bayes' rule, P(h | E) = P(E | h) P(h) / P(E), and since P(E) does not depend on h, h_MAP = argmax_h P(E | h) P(h), where P(h) is the prior probability of the hypothesis.

Text Categorization
Given:
- predefined categories
- documents with class labels (training data)
- a set of unlabelled new documents (testing data)
Task: classify the new documents into the existing categories.
Applications:
- news article classification
- email classification, including spam filtering
- web page classification
- scientific journal articles
- word sense disambiguation
- language identification, author identification, tagging
- bug vs. non-bug reports

Major Steps for Text Classification (a code sketch follows the approaches list below)
1. Gather a training set (input objects and corresponding outputs), from humans or from other measurements.
2. Determine the input features of the learned function (what is X): controlled vocabulary? Bag of words? Title only? Stemming? Stopping? Phrases? Linguistic features? Metadata? Other contextual information? Typically the input object is transformed into a feature vector. This step greatly influences the final performance of the system.
3. Determine the functional form of the learned function: logistic regression? Neural network? SVM? Gradient boosted trees?
4. Determine the corresponding learning algorithm (ML, MAP, or another loss function).
5. Learn: run the learning algorithm on the gathered training set. Optionally, adjust the parameters via cross-validation.
6. Test the performance on a test set.

Text Classification Approaches
Manual:
1. Manually assign labels to each document.
2. Manually create a rule-based expert system.
Automatic:
1. Naive Bayes and language models
2. Decision trees
3. Neural networks
4. Support vector machines
5. Boosting or bagging
6. Regression (linear, polynomial, ...)
7. k-nearest neighbors
8. Rocchio
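To make the major steps concrete, here is a minimal sketch assuming scikit-learn, with bag-of-words features and logistic regression as the functional form; the toy spam/non-spam documents and labels are hypothetical, chosen only to illustrate the workflow.

```python
# Minimal text-classification pipeline, assuming scikit-learn.
# The toy documents and labels are hypothetical illustrations.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Step 1: a training set of input objects and corresponding outputs.
train_docs = ["cheap meds, buy now", "meeting agenda attached",
              "win a free prize today", "quarterly report draft"]
train_labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

# Steps 2-4: bag-of-words features (what X is), logistic regression as the
# functional form, fit by maximizing a regularized likelihood
# (an ML/MAP-style objective).
model = make_pipeline(CountVectorizer(), LogisticRegression())

# Step 5 (optional): estimate generalization via cross-validation.
print(cross_val_score(model, train_docs, train_labels, cv=2))

# Steps 5-6: learn on the full training set, then classify new documents.
model.fit(train_docs, train_labels)
print(model.predict(["free prize meds", "agenda for the quarterly meeting"]))
```

In practice the feature step (stemming, stopping, phrases, metadata) is where most of the system-specific work happens, as the slides emphasize.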
Two Major ML Approaches
Generative models:
- Model the joint probability distribution p(c, X) and derive the conditional probability via Bayes' rule:
  P(c_j | X) = P(X | c_j) P(c_j) / Σ_{j'} P(X | c_{j'}) P(c_{j'})
- Examples: Naive Bayes, language models, HMMs
Discriminative models:
- Model the conditional probability P(c | X) directly.
- Examples: decision trees, neural networks, support vector machines, boosting or bagging, regression (linear, polynomial, ...), k-nearest neighbors
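As a complement, here is a minimal numeric sketch of the generative route; the priors and likelihoods are hypothetical numbers, standing in for quantities that a model such as Naive Bayes would estimate from data.

```python
# Generative classification: model P(X|c) and P(c), then derive P(c|X)
# by Bayes' rule. All probabilities below are hypothetical.
priors = {"spam": 0.4, "ham": 0.6}          # P(c_j), estimated from data
likelihoods = {"spam": 0.02, "ham": 0.001}  # P(X|c_j) for one document X

# P(c_j|X) = P(X|c_j) P(c_j) / sum over j' of P(X|c_j') P(c_j')
evidence = sum(likelihoods[c] * priors[c] for c in priors)
posterior = {c: likelihoods[c] * priors[c] / evidence for c in priors}
print(posterior)  # {'spam': 0.930..., 'ham': 0.069...}
```

A discriminative model skips the joint distribution and fits P(c | X) directly, as the logistic regression in the earlier sketch does.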

