We give a short overview of data mining techniques and of algorithms for one sub-problem, classification. Parallelization approaches are also described, and some performance results from the ScalParC project are presented. This lecture borrows heavily from a lecture by Vipin Kumar and Mahesh Joshi.
This lecture also describes some common pitfalls of parallel performance analysis that we saw in the N-Body assignment, which should help students prepare their final projects.
PowerPoint, PostScript, PDF
Tutorial on parallel data mining by Mahesh Joshi and Vipin Kumar.