From: Archie on 3 Mar 2006 10:23

I am looking for Fortran 77/95 source code for a multiple key sort.

Input records can contain a variable number of (integer) fields, where the number of fields is typically between 4 and 40. For a file of 40-field records, the records need to be sorted into ascending order of field 40 within field 39, within field 38 .... within field 1. In other words, field 1 is the most significant key and field 40 the least.

File sizes are usually relatively small, with between, say, 1,000 and 20,000 records.

It would generally be better to sort within memory, for subsequent processing, rather than writing to an output file.

The sort might optionally be repeated based on subsets of the input data file.

Most sort code I have been able to find is for a single sort key.

Is anyone able to recommend suitable Fortran source code?
From: Herman D. Knoble on 3 Mar 2006 10:47

Syncsort isn't Fortran source code, but you might contact them about this capability; as I recall, it had a Fortran interface years ago under VM/CMS. http://www.syncsort.com/products/ss/home.htm

Also you might contact Michel Olagnon: http://www.fortran-2000.com/rank/index.html

Skip
From: Gordon Sande on 3 Mar 2006 11:03

On 2006-03-03 11:23:43 -0400, "Archie" <infocorp(a)ozemail.com.au> said:
> I am looking for Fortran 77/95 source code for a multiple key sort.

Confusing terminology at play. Your single key is constructed by concatenating several fields, if I have it straight. This is the sort of thing that requires a 30-line sort template into which you either paste the concatenation directly or invoke a function that does the concatenation. The elaborate commercial sort programs deal with constructing the concatenation and with files that are larger than memory.

Multi-key sorting is also a misnaming of nearest neighbor searching, which is a rather different issue. When you mention re-sorting of subsets, it starts to sound like you may have a nearest neighbor search problem without realizing it.
From: Michel OLAGNON on 3 Mar 2006 12:26

Herman D. Knoble wrote:
> Syncsort isn't Fortran source code, but you might
> contact them about this capability as I recall a Fortran
> interface years ago under VM/CMS.
> http://www.syncsort.com/products/ss/home.htm
>
> Also you might contact Michel Olagnon:
> http://www.fortran-2000.com/rank/index.html

For instance, you might modify routine mrgref to deal with your kind of records. There are only 3 places where records are compared; replace each with a call to a function that you write to test the various fields successively. Then carry out the ranking and store the sorted records into a new memory location. It requires a little work, but should give reasonable performance.
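[Editor's note] The ranking approach described above can be illustrated with a minimal sketch. It is written in C and is not the mrgref source: it only shows the same idea of ranking by merge sort with a user-written comparison that tests the fields one after another. The names `cmp_fields`, `merge_rank` and `rank_records`, and the flat row-by-row record layout, are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout: record r occupies recs[r*nfields .. r*nfields + nfields - 1]. */

/* Compare records i and j field by field; field 0 is the most significant. */
static int cmp_fields(const int *recs, size_t nfields, size_t i, size_t j)
{
    const int *a = recs + i * nfields;
    const int *b = recs + j * nfields;
    for (size_t k = 0; k < nfields; k++) {
        if (a[k] != b[k])
            return a[k] < b[k] ? -1 : 1;
    }
    return 0; /* all fields equal */
}

/* Stable top-down merge sort of idx[lo..hi), using tmp as scratch space. */
static void merge_rank(const int *recs, size_t nfields,
                       size_t *idx, size_t *tmp, size_t lo, size_t hi)
{
    if (hi - lo < 2)
        return;
    size_t mid = lo + (hi - lo) / 2;
    merge_rank(recs, nfields, idx, tmp, lo, mid);
    merge_rank(recs, nfields, idx, tmp, mid, hi);

    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        /* "<= 0" takes from the left half on ties, which keeps the sort stable */
        if (cmp_fields(recs, nfields, idx[i], idx[j]) <= 0)
            tmp[k++] = idx[i++];
        else
            tmp[k++] = idx[j++];
    }
    while (i < mid) tmp[k++] = idx[i++];
    while (j < hi)  tmp[k++] = idx[j++];
    memcpy(idx + lo, tmp + lo, (hi - lo) * sizeof idx[0]);
}

/* Fill idx[0..count) so that idx[r] is the position of the r-th smallest
 * record. The records themselves are not moved.
 * Returns 0 on success, -1 if scratch memory cannot be allocated. */
int rank_records(const int *recs, size_t count, size_t nfields, size_t *idx)
{
    assert(nfields > 0);
    for (size_t r = 0; r < count; r++)
        idx[r] = r;
    if (count < 2)
        return 0;
    size_t *tmp = malloc(count * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    merge_rank(recs, nfields, idx, tmp, 0, count);
    free(tmp);
    return 0;
}
```

Once `idx` is filled, the sorted records can be copied into a new array by reading record `idx[r]` for r = 0, 1, 2, and so on. Re-sorting a subset only needs a new index array over the chosen records, so the original data stays untouched.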
From: glen herrmannsfeldt on 3 Mar 2006 12:49
Archie <infocorp(a)ozemail.com.au> wrote:
> I am looking for Fortran 77/95 source code for a multiple key sort.
> Input records can contain a variable number of (integer) fields where
> the number of fields is typically between 4 and 40. For a file of 40
> field records, the records need to be sorted into ascending order of
> field 40 within field 39, within field 38 .... within field 1.

The usual Unix sort program, and its GNU implementation, will do any number of fields, numeric or alphanumeric, and most likely faster than anything you can write.

> File sizes are usually relatively small with between, say, 1,000 and
> 20,000 records.

That is small. Yesterday I sorted 98 million records in an 11GB file in less than an hour with the GNU sort program on a Windows box.

> It would generally be better to sort within memory, for subsequent
> processing, rather than writing to an output file.

For that small a file, I suppose, though it depends on how big each record is.

> The sort might optionally be repeated based on subsets of the input
> data file.
> Most sort code I have been able to find is for a single sort key.

The standard C library qsort can sort any number of keys. You supply a function that will compare two records given pointers to those records. With F2003 C interop that should be easy, without it maybe not so easy.

> Is anyone able to recommend suitable Fortran source code?

qsort isn't very complicated, though you need generic pointers, which I understand are not so easy in Fortran. It shouldn't take long to rewrite in Fortran.

-- glen
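[Editor's note] The qsort suggestion above can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the thread: the `Record` type, the `MAX_FIELDS` limit of 40 (taken from the upper bound in the original question), and the names `compare_records` and `sort_records` are all hypothetical. Only `qsort` itself is the standard C library routine. A shorter record that is a prefix of a longer one is ordered first, which is a choice made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

enum { MAX_FIELDS = 40 };   /* assumed upper bound on fields per record */

typedef struct {
    int nfields;            /* number of fields actually used */
    int field[MAX_FIELDS];  /* field[0] is the most significant key */
} Record;

/* Comparison callback for qsort: walks the fields in order and stops at
 * the first difference, so later fields only break ties in earlier ones. */
static int compare_records(const void *pa, const void *pb)
{
    const Record *a = pa;
    const Record *b = pb;
    int n = a->nfields < b->nfields ? a->nfields : b->nfields;

    for (int i = 0; i < n; i++) {
        if (a->field[i] != b->field[i])
            return a->field[i] < b->field[i] ? -1 : 1;
    }
    /* Common fields are equal: the record with fewer fields sorts first. */
    return (a->nfields > b->nfields) - (a->nfields < b->nfields);
}

/* Sort an in-memory array of records into ascending multi-key order. */
void sort_records(Record *recs, size_t count)
{
    assert(recs != NULL || count == 0);
    if (count > 1)
        qsort(recs, count, sizeof recs[0], compare_records);
}
```

Note that `qsort` is not guaranteed to be stable, so records with identical keys may come out in any relative order. Comparing with `<` rather than subtracting the fields avoids integer overflow for large values.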