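A book on web data mining and collaborative filtering in the Web 2.0 era. It covers the algorithms and mathematical models behind Web 2.0 site features such as search rankings, product recommendations, social bookmarking, and online matchmaking. As one review puts it: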
Toby Segaran’s new book, Programming Collective Intelligence, teaches algorithms and techniques for extracting meaning from data, including user data. This is the programmer’s toolbox for Web 2.0. It’s no longer enough to know how to build a database-backed web site. If you want to succeed, you need to know how to mine the data that users are adding, both explicitly and as a side-effect of their activity on your site.
There’s been a lot written about Web 2.0 since we first coined the term in 2004, but in many ways, Toby’s book is the first practical guide to programming Web 2.0 applications.
This book explains:
- Collaborative filtering techniques that enable online retailers to recommend products or media
- Methods of clustering to detect groups of similar items in a large dataset
- Search engine features — crawlers, indexers, query engines, and the PageRank algorithm
- Optimization algorithms that search millions of possible solutions to a problem and choose the best one
- Bayesian filtering, used in spam filters for classifying documents based on word types and other features
- Using decision trees not only to make predictions, but to model the way decisions are made
- Predicting numerical values rather than classifications to build price models
- Support vector machines to match people in online dating sites
- Non-negative matrix factorization to find the independent features in a dataset
- Evolving intelligence for problem solving — how a computer develops its skill by improving its own code the more it plays a game
High Performance Web Sites: Essential Knowledge for Front-End Engineers
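The author, Steve Souders, is an engineer at Yahoo. The book's content comes from the work of Yahoo!’s Exceptional Performance team, namely its rules for high performance web sites, and it can be considered required reading for building high-performance web sites in the Web 2.0 era.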