From: Tamas K Papp on 16 Jul 2010 08:47 On Thu, 15 Jul 2010 22:58:23 +0300, Captain Obvious wrote: > But I'd say it is actually pretty fast -- library I'm using can do SVD > of 17 seconds worth of a sound in 2 seconds on 2 GHz CPU. Faster than > realtime. I am just curious, which library are you using? LLA currently uses LAPACK's divide-and-conquer algorithm for SVD by default, which I found to be pretty fast, even for large matrices. LAPACK's lower-level SVD functions also have other possible speedups, if you only need some of the singular values/vectors etc. It is really amazing how fast SVD is these days, nowadays I tend to use it in places where I would have used other decompositions (QR, Cholesky) before, it is so stable and robust. Tamas
From: Captain Obvious on 16 Jul 2010 09:19 ??>> But I'd say it is actually pretty fast -- library I'm using can do SVD ??>> of 17 seconds worth of a sound in 2 seconds on 2 GHz CPU. Faster than ??>> realtime. TKP> I am just curious, which library are you using? PROPACK. From the description: "The software package PROPACK contains a set of functions for computing the singular value decomposition of large and sparse or structured matrices. The SVD routines are based on the Lanczos bidiagonalization algorithm with partial reorthogonalization (BPRO). The Lanczos routines can also be used directly, and form the basis of efficient algorithms for solving linear systems of equations and linear least squares problems, in particular for systems with multiple right-hand sides." I was using it for large sparse matrices and it worked pretty well. Now as for compression I have dense matrices perhaps I could switch to another library, but I didn't bother, as I already have glue code to run PROPACK and it works good enough. TKP> LLA currently uses LAPACK's divide-and-conquer algorithm for SVD by TKP> default, which I found to be pretty fast, even for large matrices. TKP> LAPACK's lower-level SVD functions also have other possible speedups, TKP> if you only need some of the singular values/vectors etc. It is TKP> really amazing how fast SVD is these days, nowadays I tend to use it TKP> in places where I would have used other decompositions (QR, Cholesky) TKP> before, it is so stable and robust. Can you please check how fast it will do, say, 256*2930 matrix? I'd consider replacement to a thing I have now. It's not like I think PROPACK is bad, just my glue code use files (!) to communicate with it and writing to files is the most time consuming operation. And I guess it might be easier to swich to another library than to rewrite glue code using FFI myself :).
From: Tamas K Papp on 16 Jul 2010 09:56 On Fri, 16 Jul 2010 16:19:25 +0300, Captain Obvious wrote: > ??>> But I'd say it is actually pretty fast -- library I'm using can do > SVD > ??>> of 17 seconds worth of a sound in 2 seconds on 2 GHz CPU. Faster > than ??>> realtime. > > TKP> I am just curious, which library are you using? > > PROPACK. > From the description: > "The software package PROPACK contains a set of functions for computing > the singular value decomposition of large and sparse or structured > matrices. The SVD routines are based on the Lanczos bidiagonalization > algorithm with partial reorthogonalization (BPRO). The Lanczos routines > can also be used directly, and form the basis of efficient algorithms > for solving linear systems of equations and linear least squares > problems, in particular for systems with multiple right-hand sides." > > I was using it for large sparse matrices and it worked pretty well. > > Now as for compression I have dense matrices perhaps I could switch to > another library, but I didn't bother, as I already have glue code to run > PROPACK and it works good enough. > > TKP> LLA currently uses LAPACK's divide-and-conquer algorithm for SVD > by TKP> default, which I found to be pretty fast, even for large > matrices. TKP> LAPACK's lower-level SVD functions also have other > possible speedups, TKP> if you only need some of the singular > values/vectors etc. It is TKP> really amazing how fast SVD is these > days, nowadays I tend to use it TKP> in places where I would have used > other decompositions (QR, Cholesky) TKP> before, it is so stable and > robust. > > Can you please check how fast it will do, say, 256*2930 matrix? I'd > consider replacement to a thing I have now. > > It's not like I think PROPACK is bad, just my glue code use files (!) to > communicate with it and writing to files is the most time consuming > operation. > And I guess it might be easier to swich to another library than to > rewrite glue code using FFI myself :). Sparse matrices are a different issue, the have special algorithms, which are likely to be a bit faster than general ones. I am not entirely sure what you need from the SVD (do you need both U and V^T?), so I ran a whole lot of different benchmarks (included below). Even if you want to stick with PROPACK for now, LLA has a set of macros which make interfacing to FORTRAN a lot easier (see linear-algebra.lisp for examples). Here is the code I ran: (in-package #:common-lisp-user) (require :lla) (defparameter *m* (lla:make-matrix 256 2930 :double :initial-element (lambda () (random 100d0)))) (defmacro time* (form comment) `(progn (format t "Timing ~A (~A)" ',form ,comment) (time ,form))) (time* (lla:svd *m*) "singular values only") (time* (lla:svd *m* :left :singular) "singular vectors from U") (time* (lla:svd *m* :left :all) "all vectors from U") (time* (lla:svd *m* :right :singular) "singular vectors from V^T") (time* (lla:svd *m* :right :all) "all vectors from V^T") (time* (lla:svd *m* :left :singular :right :singular) "singular vector from both U and V^T") Here are the results (Dell Latitude laptop, Intel(R) Core(TM)2 Duo CPU P8700 @ 2.53GHz, 3 MB cache, single thread): Timing (SVD *M*) (singular values only) Evaluation took: 0.648 seconds of real time 0.640000 seconds of total run time (0.640000 user, 0.000000 system) 98.77% CPU 1,637,341,188 processor cycles 18,037,008 bytes consed Timing (SVD *M* LEFT SINGULAR) (singular vectors from U) Evaluation took: 0.742 seconds of real time 0.740000 seconds of total run time (0.740000 user, 0.000000 system) 99.73% CPU 1,873,691,270 processor cycles 18,566,016 bytes consed Timing (SVD *M* LEFT ALL) (all vectors from U) Evaluation took: 0.747 seconds of real time 0.740000 seconds of total run time (0.730000 user, 0.010000 system) 99.06% CPU 1,889,345,019 processor cycles 18,560,688 bytes consed Timing (SVD *M* RIGHT SINGULAR) (singular vectors from V^T) Evaluation took: 1.595 seconds of real time 1.600000 seconds of total run time (1.600000 user, 0.000000 system) 100.31% CPU 4,030,383,369 processor cycles 24,028,528 bytes consed Timing (SVD *M* RIGHT ALL) (all vectors from V^T) Evaluation took: 8.161 seconds of real time 8.170000 seconds of total run time (8.130000 user, 0.040000 system) 100.11% CPU 20,623,094,542 processor cycles 86,801,424 bytes consed Timing (SVD *M* LEFT SINGULAR RIGHT SINGULAR) (singular vector from both U and V^T) Evaluation took: 1.656 seconds of real time 1.650000 seconds of total run time (1.650000 user, 0.000000 system) 99.64% CPU 4,185,266,126 processor cycles 24,551,440 bytes consed Best, Tamas
From: Captain Obvious on 18 Jul 2010 14:01 TKP> Even if you want to stick with PROPACK for now, LLA has a set TKP> of macros which make interfacing to FORTRAN a lot easier (see TKP> linear-algebra.lisp for examples). Cool, I'll check this if I return to huge sparse matrices case. TKP> I am not entirely sure what you need from the SVD (do you need both U TKP> and V^T?) Yep, in this particular case both of them, and singular values too. TKP> Timing (SVD *M* LEFT SINGULAR RIGHT SINGULAR) (singular vector from TKP> both U and V^T) Evaluation took: TKP> 1.656 seconds of real time TKP> 1.650000 seconds of total run time (1.650000 user, 0.000000 system) TKP> 99.64% CPU TKP> 4,185,266,126 processor cycles TKP> 24,551,440 bytes consed I guess that's it. Approximately same time as with PROPACK. TKP> LAPACK's lower-level SVD functions also have other possible speedups, TKP> if you only need some of the singular values/vectors etc. Do you mean that it is already implemented and it is easy to enable it?
From: Tamas K Papp on 19 Jul 2010 03:26
On Sun, 18 Jul 2010 21:01:29 +0300, Captain Obvious wrote: > TKP> LAPACK's lower-level SVD functions also have other possible > speedups, TKP> if you only need some of the singular values/vectors > etc. > > Do you mean that it is already implemented and it is easy to enable it? I realize I have been thinking about eigenvalue/vector problems, not SVD. The divide-and-conquer algorithm is about as fast as it gets on dense matrices. Tamas |