From: Andrea Campera on 12 Jan 2006 03:26
I know that matrix multiplication algorithms can run lower than n^3; for example, the Strassen algorithm is n^2.81, and there exist algorithms of order n^2.3. I have also read somewhere that those algorithms are slightly unstable in some cases. I have also tried MATMUL (f90), but the code runs slower. Maybe there is a smart way to compute (I+A*B)^(-1). Thanks again

Andrea

"Gordon Sande" <g.sande(a)worldnet.att.net> wrote in message news:2006011109410016807-gsande(a)worldnetattnet...
> On 2006-01-11 04:14:42 -0400, "Andrea Campera"
> <andrea.campera(a)iet.unipi.it> said:
>
>> To be more precise, I have to do an operation like (A*B+I)^(-1). Up to
>> now I have used
>>
>> call multi(...)
>> call summa(...)
>> call gaussj(...)
>>
>> I also post the profile:
>>
>> Each sample counts as 0.01 seconds.
>>   %   cumulative   self             self     total
>>  time   seconds   seconds   calls  Ks/call  Ks/call  name
>
> <snip>
>
>> If I increase the dimension Nc of the matrices, gaussj and multi become
>> the most time-consuming routines. I have seen the subroutine cgemm.f of
>> BLAS for the multiplication, but what about the inversion? Is there
>> anything faster than gaussj? I use g77 (3.2 or 3.4) on Pentium IV, Xeon,
>> Opteron or AMD64 (we have several machines).
>>
>> Andrea
>
> The fastest way to invert matrices is to not invert matrices. This
> apparently self-contradictory piece of advice really means that most
> uses of inversion are better done some other way.
>
> Solving is usually what is wanted, and that can be better done by
> factoring and then solving with the factors. The question is "Why do
> you think you actually need the inverse?", other than from your simple
> derivation in matrix algebra.
>
> Matrix multiply will always be n^3, and inversion and/or solving will
> also be n^3. You can fiddle the coefficient of multiply by polishing
> code, or by having someone else do it for you in a library. You can
> also polish inversion, but beware the "highly polished" codes that pay
> no attention to numerical stability. Many Gauss-Jordan matrix inversion
> programs are the trivial algebra coded naively, so you have to read and
> understand the documentation. Factoring gives a bigger reduction in the
> coefficient over inversion than you will ever get with polishing. And
> it may be that, with a bit of insight into your real problem, there are
> other things that can be done. There are examples of easy problems with
> awkward derivations that are turned into awkward, and numerically
> harder, problems with easy derivations. For more details, consult a
> competent numerical analyst.
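Gordon's "factor and solve" advice maps directly onto LAPACK. Below is a minimal sketch, assuming LAPACK and a BLAS are linked in; the routine name solve_iab, the argument layout, and the choice of single-precision complex are illustrative, not taken from the thread. It replaces the multi/summa/gaussj sequence with one CGEMM, one CGETRF and one CGETRS:

  subroutine solve_iab(nc, a, b, c, x, info)
    ! Solve (I + A*B) * X = C without ever forming the inverse.
    implicit none
    integer, intent(in)  :: nc
    complex, intent(in)  :: a(nc,nc), b(nc,nc), c(nc,nc)
    complex, intent(out) :: x(nc,nc)
    integer, intent(out) :: info
    complex :: m(nc,nc)        ! holds I + A*B, then its LU factors
    integer :: ipiv(nc), j
    ! Build M = I + A*B: start from the identity and let CGEMM
    ! accumulate the product on top of it (BETA = 1).
    m = (0.0, 0.0)
    do j = 1, nc
       m(j,j) = (1.0, 0.0)
    end do
    call cgemm('N', 'N', nc, nc, nc, (1.0,0.0), a, nc, &
               b, nc, (1.0,0.0), m, nc)
    ! Factor once, then solve for all right-hand sides with the
    ! factors; this replaces the explicit gaussj inversion.
    call cgetrf(nc, nc, m, nc, ipiv, info)
    if (info /= 0) return
    x = c
    call cgetrs('N', nc, nc, m, nc, ipiv, x, nc, info)
  end subroutine solve_iab

If the inverse matrix itself is truly needed, CGETRI can be applied to the same CGETRF factors, but solving with the factors is cheaper and is usually what the surrounding algebra actually wants.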
From: Joost on 12 Jan 2006 04:03
You might want to read this paper:
http://surj.stanford.edu/2004/pdfs/kakaradov.pdf
'Ultra-Fast Matrix Multiplication: An Empirical Analysis of Highly Optimized Vector Algorithms'
It has the following interesting sentence: "The value of n for which Strassen's algorithm (in this implementation) becomes better than the naive algorithm is n = 60,750."

Having said this, I have the following timings comparing multi with CGEMM on an Opteron using GOTO BLAS:

Nc=351    multi 0.28795699999999996    CGEMM 0.04499200000000003
Nc=1351   multi 35.40061800000001      CGEMM 2.246658999999994

Cheers,
Joost
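A comparison like Joost's can be reproduced with a small driver; this is a sketch only, with arbitrary fill values, and it assumes an optimized BLAS such as GOTO is linked in at build time:

  program time_mm
    implicit none
    integer, parameter :: nc = 1351     ! matches the larger case above
    complex, allocatable :: a(:,:), b(:,:), c(:,:)
    integer :: t0, t1, rate
    allocate(a(nc,nc), b(nc,nc), c(nc,nc))
    a = (1.0, 0.0)                      ! arbitrary test data
    b = (0.5, 0.0)
    call system_clock(t0, rate)
    ! C = A*B in one BLAS call; which implementation runs is
    ! decided purely by what the program is linked against.
    call cgemm('N', 'N', nc, nc, nc, (1.0,0.0), a, nc, &
               b, nc, (0.0,0.0), c, nc)
    call system_clock(t1)
    print *, 'CGEMM:', real(t1 - t0) / real(rate), 's'
  end program time_mm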
From: David Jones on 12 Jan 2006 04:51
Andrea Campera wrote:
> I know that matrix multiplication algorithms can run lower than n^3;
> for example, the Strassen algorithm is n^2.81, and there exist
> algorithms of order n^2.3. I have also read somewhere that those
> algorithms are slightly unstable in some cases. I have also tried
> MATMUL (f90), but the code runs slower. Maybe there is a smart way to
> compute (I+A*B)^(-1). Thanks again
>
The result wanted is

(I+A*B)^(-1) = I - A*( (I+B*A)^(-1) )*B

Savings arise if B*A is smaller in dimension than A*B.
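The identity can be checked directly from the "push-through" relation (I+A*B)*A = A + A*B*A = A*(I+B*A):

  (I+A*B) * ( I - A*((I+B*A)^(-1))*B )
      = I + A*B - (I+A*B)*A * ((I+B*A)^(-1))*B
      = I + A*B - A*(I+B*A) * ((I+B*A)^(-1))*B
      = I + A*B - A*B
      = I

so the right-hand side really is the inverse, and only I+B*A ever has to be inverted.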
From: Andrea Campera on 12 Jan 2006 05:00
"Unfortunately" A and B are square matrices, so A*B and B*A have the same dimensions. Using cgemm, or multi plus summa, to do (I+A*B) with Nc=300, both take 0.34 seconds on a 2.4 GHz Pentium IV, but I don't use GOTO BLAS. For Nc=1000, cgemm and multi plus summa take, respectively, 8.5 and 13.2 seconds. Maybe GOTO BLAS, once installed, can save a lot of time in the multiplication.

Andrea

"David Jones" <dajxxx(a)ceh.ac.uk> wrote in message news:43c6268d$1(a)news.nwl.ac.uk...
> Andrea Campera wrote:
>> I know that matrix multiplication algorithms can run lower than n^3;
>> for example, the Strassen algorithm is n^2.81, and there exist
>> algorithms of order n^2.3. I have also read somewhere that those
>> algorithms are slightly unstable in some cases. I have also tried
>> MATMUL (f90), but the code runs slower. Maybe there is a smart way to
>> compute (I+A*B)^(-1). Thanks again
>>
> The result wanted is
>
> (I+A*B)^(-1) = I - A*( (I+B*A)^(-1) )*B
>
> Savings arise if B*A is smaller in dimension than A*B.
From: Joost on 12 Jan 2006 06:27
> Using cgemm, or multi plus summa, to do (I+A*B) with Nc=300, both take
> 0.34 seconds on a 2.4 GHz Pentium IV, but I don't use GOTO BLAS.
> Maybe GOTO BLAS, once installed, can save a lot of time in the
> multiplication.

:-) The last sentence makes the point... In my opinion, BLAS is just a specification of an interface to some basic linear algebra operations. The implementation available on netlib is just a reference implementation that provides 'correct results' in a reasonable time, but it is not fast (i.e. the Fortran code for cgemm in the reference implementation is very similar to your multi). Certain specific implementations of BLAS do much better when it comes to the 'fast' bit. In these implementations, people make sure that the operations are ordered in such a way that the code is optimal for the CPU.

Joost
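To make "ordered in such a way that the code is optimal for the CPU" concrete, here is a toy sketch of loop blocking, one of the reorderings an optimized BLAS applies. Real implementations add packing, register tiling and SIMD on top of this, and the block size bs = 64 is an illustrative guess rather than a tuned value:

  subroutine blocked_mm(n, a, b, c)
    ! Same arithmetic as the naive triple loop; only the traversal
    ! order changes, so each bs-by-bs tile is reused while it is
    ! still in cache instead of being refetched from memory.
    implicit none
    integer, intent(in)  :: n
    complex, intent(in)  :: a(n,n), b(n,n)
    complex, intent(out) :: c(n,n)
    integer, parameter :: bs = 64
    integer :: ii, jj, kk, i, j, k
    c = (0.0, 0.0)
    do jj = 1, n, bs
       do kk = 1, n, bs
          do ii = 1, n, bs
             do j = jj, min(jj+bs-1, n)
                do k = kk, min(kk+bs-1, n)
                   ! innermost loop runs down a column: unit stride
                   ! in both a and c for Fortran's column-major layout
                   do i = ii, min(ii+bs-1, n)
                      c(i,j) = c(i,j) + a(i,k) * b(k,j)
                   end do
                end do
             end do
          end do
       end do
    end do
  end subroutine blocked_mm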