Figure 2 summarizes the results. The plots depict a 2-D cross-section of the 3-D input space. Specifically, two of the matrix dimensions were set equal (M = N). The colored points show the ``truth'' values: a1 (blue), a2 (green), and a3 (red). These were tuned approximately for small, medium, and large matrices, respectively. Note that the algorithm tuned for medium-sized matrices actually performed best on the smallest sizes (the cluster of green in the lower-left corner of the plots).
The lines on the plots indicate the decision boundaries.3 Figure 2 (left) shows the results of the optimization using 3 decision functions, one for each algorithm. The boundary lines (and arrows) indicate the regions assigned to the algorithm. As shown, the green region in the lower left was misclassified completely as blue. The misclassification rate, measured on the sample inputs, was roughly 19 percent. To be fair, however, our binary assignment of ``time'' makes our results pessimistic, since regions where algorithms appear to overlap most likely indicate that the execution times were very similar.
Figure 2 (right) shows the results when we use 6 decision
functions, instead of just three. (Note also that the input samples are
slightly different.) For the purpose of the optimization, we accomplish
this by assigning
,
where m is the true
number of algorithms, and k is between 1 and 6 inclusive. As shown,
this resulted in a much better classification. In fact, this reduced
the misclassification rate to 12 percent.
In general, we will be able to capture more complicated or isolated regions by using more decision functions. However, the trade-off is that we will have to spend more time making those decisions at run-time, potentially sacrificing performance on small matrices.
|