Next: Bibliography
Up: Automatic assembly of highly
Previous: Results: matrix multiply
The main result is that the input sampling and optimization approach is
a promising way to automate the process of assembling tuned code
fragments into a single routine. We are able to obtain simple criteria
for evaluating an input and quickly selecting an algorithm. Still,
our simple (i.e., linear) boundaries result in relatively high (10-15
percent) misclassification rates. However, these rates are taken
against a strict criteria in which we have assigned ``execution times''
in a binary fashion.
There are still a number of issues that need to be explored
before this procedure can be incorporated into an actual system such
as PHiPAC:
- Can we construct a more realistic ``time'' criteria (i.e.,
a more accurate function for T(a,x))?
- Are there other simple, parametric boundaries besides lines
that would yield better fits? E.g., the natural boundaries in
the matrix multiply data appear more hyperbolic than linear.
- How do convergence and classification behave when we vary
the input space sampling density? Can we obtain better
performance for particular applications by sampling the input
sizes from a realistic workload?
- Is there a parametric model for execution time as a function
of the inputs? If so, we could convert this problem into a
piecewise regression problem. It would be interesting to compare
the current non-probablistic approach to a fully probablistic
model.
The author wishes to thank Andrew Ng for numerous useful discussions.
Next: Bibliography
Up: Automatic assembly of highly
Previous: Results: matrix multiply
Richard Vuduc
1998-12-15