Snapshot of the research iterative refinement code (test drivers, non-GE versions, etc.) is at http://www.cs.berkeley.edu/~ejr/dev/lapack/itref-work/.
Note that the code still needs some changes. See the section Code Work below.
Included in main branch of the LAPACK repo.
In XBLAS distribution and LAPACK trunk.
Included in main branch of the XBLAS repo.
So users can hook XERBLA rather than having to hook XERBLA, BLAS_error, and BLAS_ERROR.
Mostly done. Still needs build documentation and wider build testing.
Having the CBLAS interface (BLAS_dgemm and crew) in the XBLAS is potentially confusing. Not generating the routines themselves is easy, but disentangling them from the tester is a bit more complicated.
Currently, these routines are functions of N and NRHS only. We could use additional workspace to work on multiple right-hand sides at once.
Requires:
| S | D | C | Z | |
|---|---|---|---|---|
| GE | code, test | code, test(ip) | code, test | code, test(ip) |
| SY | code, test(ip) | code(ip), test(ip) | code(ip), test(ip) | code(ip), test(ip) |
| HE | n/a | n/a | ||
| PO | code | |||
| GB | ||||
| GT | ||||
| PB | ||||
| PT | ||||
| PP | ||||
| HP | n/a | n/a | ||
| PS |
Legend:
Code for xGESV and xGESVX is included in timing-integration.
Nope.
Not now.
In itref:master:
The tests were disabled "temporarily" by commit e061efc111b, but they may stay disabled. We can only re-enable them if we can produce more accurate true solutions for testing.
Deferred, so long as all the tests pass on various machines:
Testing code doesn't require that the equilibration generated in the test routine be the same as the equilibration used internally.
We don't have torture-testing code to check.
We don't have torture-testing code to check.
Requires:
We promise nothing. Explained in leading comments of xyySVXX, also needs explained in the LUG.
This list covers only xGESVXX.
A snapshot of the LUG is available at http://www.cs.berkeley.edu/~ejr/lapack-snapshots/trunk/lug-latest.tar.gz or unpacked under http://www.cs.berkeley.edu/~ejr/lapack-snapshots/trunk/lug . It's also on the EECS file system under file:/project/cs/demmel/b/LAPACK/snapshots/trunk/ .
Just a table entry.
xyyEQUB, xyyRFSX, and section's last sentence
Add "Cost of Extra Precision"?
Add a 4.4.2 for extra precise refinement, possibly absorbing the last bit of 4.4.1. Caveat for user-provided factorization and equilibration.
Symmetric routines will need a description of the equilibration somewhere.
Array for optional arguments in xGESVXX: need feedback from users if this is acceptable. See the description of PARAMS in http://www.cs.berkeley.edu/~ejr/dev/lapack/itref-work/sgesvxx/sgesvxx.f and http://www.cs.berkeley.edu/~ejr/dev/lapack/itref-work/sgesvxx/sgerfsx.f .
Add failure modes from tech report. Actually, this section should provide a bit more guidance for those cases where the algorithms honestly fail. Right now it just blames the user.
The included LAWN 81 needs a lot of basic editing and updating. It certainly needs edited in the following ways:
Currently, the code is structured as follows:
Each level operates on all NRHS right-hand sides. There is a loop over the rhs in SLA_GERFSX_* that processes each rhs on its own. The routine currently is provided only enough workspace to process one rhs at a time.
The "trust me" policies are separated from the mechanics of refinement. So a user can apply the refinement loop in SLA_GERFSX_EXTENDED without applying the LAPACK policies in SGERFSX.
To keep this separation but apply the "only trust the answer when refinement has converged" policy, we need to return whether each rhs has converged. And we don't have space for an NRHS-long array anywhere.
Possible options:
This completely disallows working on blocks of the rhs at once using SLA_GERFSX_*. We currently don't have the workspace for blocks (or working on each rhs in a separate thread), so it probably doesn't matter.
If we start mucking with the workspace, we should add a full LWORK/LRWORK/LIWORK/etc. query interface. The query routine would take N and NRHS, possibly asking for enough workspace to process blocks at a time.
Block refinement would allow a single pass through A for each step, which may prove useful for the first few steps.
Componentwise convergence implies normwise convergence, so we should always accept the normwise result when we accept the componentwise result.
The normwise condition number for xyySVXX is
norm(inv(R*A),inf)*norm(R*A,inf)for a suitable R. This ignores the solution and right-hand side, so we do not detect correct solutions in such cases as A*x=0. But the componentwise condition number will catch these cases and accept the solution.
There are more considerations when A*x=0 occurs as a block in (possibly permuted) block-diagonal systems. We may want to include the solution / right-hand side in the normwise condition number at some point.
Won't do for this release. Re-evaluate how it might affect existing users.
xyySVXX (extended) calculates the pivot growth for each column and reports the maximum. The older xyySVX (not extended) calculates the growth across the entire equilibrated matrix. If the user disables equilibration or provides a poor column scaling factor, the xyySVX code may underestimate the pivot growth.
A SuperLU user reported an "interesting" bug which also can occur in LAPACK: Solving Ax = 0 is accurate but reports BERR = 1. Other interesting problems occur with block-structured matrices where some block has a zero solution. To protect against underflow and mis-rounding the residual, the routines compute:
BERR = max(0, SAFEMIN) / max(0, SAFEMIN) = 1
To fix this, xyySVXX uses a modified abs(A)*abs(x)+abs(b) calculation that still protects against underflow but does not perturb exact zero components (see xLA_yyAMV). We also encapsulated slightly more complicated 0/0 logic in xLA_LIN_BERR.
This should be evaluated for eventual inclusion in the plain xyySVX. But this shouldn't block releasing LAPACK now.
Might be worth considering an interval xyyMV for the computation, if any actually exist.
The xyySVXX routines equilibrate by powers of the radix to avoid the extra rounding error. This would be trivial to include in xyySVX. However, users may expect entries in the equilibrated A to be less than one in magnitude. Scaling by xyyEQUB's results only guarantee the entries to be less than the radix in magnitude.
However, users cannot require the matrix be equilibrated. If the equilibration factors are all less than some threshold (see the refinement drivers), they are not applied. So really the user cannot rely on the entries' magnitudes...
This section is not up-to-date.
itref:master-f90
See also http://icl.cs.utk.edu/trac/lapack-dev/wiki/RefinementComparison .
Variables:
Results:
Need:
For what would be saved if we had fast intervals:
Remaining in total:
Interesting problems?
If we can solve these accurately enough, combinatorial optimization algorithms may recognize when the solution is at a vertex rather than in a face.
We mostly use git for source control.
locations:
snapshots:
dependencies:
interesting branches:
locations:
snapshots:
interesting branches:
locations:
snapshots:
interesting branches:
merged, no longer interesting:
dead branches:
locations:
snapshots:
interesting branches in local git:
mkdir dev-tools cd dev-tools git init-db git fetch `pwd`/.. dev-tools:master git checkout master git repo-config remote.origin.url `pwd`/.. git repo-config remote.origin.fetch refs/heads/dev-tools:refs/heads/master
Then you can ignore generated files by:
cp dev-tools/git-info-exclude .git/info/exclude
Date: 2007/11/08 09:20:44 PM