Chapters
1: Introduction
2: Recommendation systems
3: Item-based filtering
4: Classification
5: More on classification
6: Naïve Bayes
7: Unstructured text
8: Clustering

Classification

In the previous chapters we used people’s ratings of products to make recommendations. In this chapter we use attributes of the products themselves to make recommendations. This approach is used by Pandora among others.

Contents

  • Introduction to Pandora-like systems
  • The importance of selecting appropriate attributes and values
  • An example: music attributes and a nearest neighbor approach
  • Data normalization
  • Modified Standard Score
  • Python code: music, attributes, and a simple nearest neighbor approach
  • A sports example.
  • Ways of acquiring attribute data

The PDF of the Chapter

Python code

The code for the example on page 4-13: ch4-filteringdata.py.

page 46: the template to use for you to write the getMedian and getAbsoluteDeviation methods: testMedianAndASD.py. You will also need athletesTrainingSet.txt.

p48: the template to use for you to write the normalizeColumn method: normalizeColumnTemplate.py. You will also need athletesTrainingSet.txt.

p52: the template for the nearestNeighbor method: classifyTemplate.py. You will also need athletesTrainingSet.txt.

p53: code for the complete nearest neighbor classifier – nearestNeighborClassifier.py

Data