From: chris on 3 May 2010 07:27

Hi all,

Given this data:

0 + AGAAATCTAACACAAAATCATTAACTTAT-TAGTTTCCAA
0 + AAGAGAAACGATATTAGTCCAAAAATGTAAACATA
0 - TCGTTGGTAACAATATCTAC-TTT-CT
3 - TTTTTGTCTTTTTTTTTTTTTTTGTTTAGTTA-GT
0 - GACGATAAAGAAATAAAATCT-ATT-GCTTCTT-GT
1 - TTTTTTTTTTTAAAAATA-ATTTC-TTAATATCTT
1 - CTATATAGTTTGTGGACATTATATTATGTTCTCTCTTGACTAA-ATGT
0 - TTTGTCCAAGTCAACTAAGTGCACTA-AAAAGGATCTTCTAT
2 + ATTATTGGCTTATTATTGCCAAAACAGAAAA-AAA
0 - C-TACGTGTCTGATGCAATAATGGAAATGGAGTTGTGTGT
0 - TGTTGTATGACATCATAATTATGGAATTTTTTTT-GTT
0 + AATAATAAGAAAA-AAAA-AAAAA-AAAAAAA
0 - TGTTGAAAAGCATCTAACTTGA--AGGACGGTCTGAGGCTT
0 - ATTTTTTTGTTTTTTTATCA-C--AAATTA-T-AT
1 + ACTATCGGAAAAAATCAAGACGCACGGATATATAAA
2 + GACATCAAAGATACTTT-CTTGAACAAGACCAGGAATA
0 + AAACAAACCAGAAACTTTCATATCAATAATACATAGAA
0 - TTCTATGTGATATTTTGGTTCGCTGTGTG
0 - TTTTTTTTTTTTT-TTTTTTTCTTTTTACT--T
0 + GTCGACCATAAAAGTTTACATAAAGAATCAAGGTT

sort v5.97 (as per Centos5.4) gives this:

> $ sort -k2 file
0 + AAACAAACCAGAAACTTTCATATCAATAATACATAGAA
0 + AAGAGAAACGATATTAGTCCAAAAATGTAAACATA
0 + AATAATAAGAAAA-AAAA-AAAAA-AAAAAAA
1 + ACTATCGGAAAAAATCAAGACGCACGGATATATAAA
0 + AGAAATCTAACACAAAATCATTAACTTAT-TAGTTTCCAA
2 + ATTATTGGCTTATTATTGCCAAAACAGAAAA-AAA
0 - ATTTTTTTGTTTTTTTATCA-C--AAATTA-T-AT
0 - C-TACGTGTCTGATGCAATAATGGAAATGGAGTTGTGTGT
1 - CTATATAGTTTGTGGACATTATATTATGTTCTCTCTTGACTAA-ATGT
2 + GACATCAAAGATACTTT-CTTGAACAAGACCAGGAATA
0 - GACGATAAAGAAATAAAATCT-ATT-GCTTCTT-GT
0 + GTCGACCATAAAAGTTTACATAAAGAATCAAGGTT
0 - TCGTTGGTAACAATATCTAC-TTT-CT
0 - TGTTGAAAAGCATCTAACTTGA--AGGACGGTCTGAGGCTT
0 - TGTTGTATGACATCATAATTATGGAATTTTTTTT-GTT
0 - TTCTATGTGATATTTTGGTTCGCTGTGTG
0 - TTTGTCCAAGTCAACTAAGTGCACTA-AAAAGGATCTTCTAT
3 - TTTTTGTCTTTTTTTTTTTTTTTGTTTAGTTA-GT
1 - TTTTTTTTTTTAAAAATA-ATTTC-TTAATATCTT
0 - TTTTTTTTTTTTT-TTTTTTTCTTTTTACT--T

i.e. it's sorting on column 3 not 2.

sort v5.93 (as per Mac OS 10.5.8) gives:

0 + AAACAAACCAGAAACTTTCATATCAATAATACATAGAA
0 + AAGAGAAACGATATTAGTCCAAAAATGTAAACATA
0 + AATAATAAGAAAA-AAAA-AAAAA-AAAAAAA
1 + ACTATCGGAAAAAATCAAGACGCACGGATATATAAA
0 + AGAAATCTAACACAAAATCATTAACTTAT-TAGTTTCCAA
2 + ATTATTGGCTTATTATTGCCAAAACAGAAAA-AAA
2 + GACATCAAAGATACTTT-CTTGAACAAGACCAGGAATA
0 + GTCGACCATAAAAGTTTACATAAAGAATCAAGGTT
0 - ATTTTTTTGTTTTTTTATCA-C--AAATTA-T-AT
0 - C-TACGTGTCTGATGCAATAATGGAAATGGAGTTGTGTGT
1 - CTATATAGTTTGTGGACATTATATTATGTTCTCTCTTGACTAA-ATGT
0 - GACGATAAAGAAATAAAATCT-ATT-GCTTCTT-GT
0 - TCGTTGGTAACAATATCTAC-TTT-CT
0 - TGTTGAAAAGCATCTAACTTGA--AGGACGGTCTGAGGCTT
0 - TGTTGTATGACATCATAATTATGGAATTTTTTTT-GTT
0 - TTCTATGTGATATTTTGGTTCGCTGTGTG
0 - TTTGTCCAAGTCAACTAAGTGCACTA-AAAAGGATCTTCTAT
3 - TTTTTGTCTTTTTTTTTTTTTTTGTTTAGTTA-GT
1 - TTTTTTTTTTTAAAAATA-ATTTC-TTAATATCTT
0 - TTTTTTTTTTTTT-TTTTTTTCTTTTTACT--T

Which looks like it's sorting column 2 then 3. Anyone else seen this and is it a bug?

Cheers,
Chris

From: Gordon Henderson on 3 May 2010 08:25

In article <op.vb4nfilos4ghqh(a)caterpillar.compbio.dundee.ac.uk>,
chris <ithinkiam(a)gmail.com> wrote:
>Hi all,
>
>Which looks like it's sorting column 2 then 3. Anyone else seen this and
>is it a bug?

It's not a bug, but a feature.

See:
http://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Sort-does-not-sort-in-normal-order_0021

where it says:

  # If you use bash or some other Bourne-based shell,
  export LC_ALL=POSIX
  # If you use a C-shell,
  setenv LC_ALL POSIX

Gordon

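A minimal sketch of the fix Gordon points to, assuming a POSIX-style sort and a three-line stand-in for Chris's data (the file name `sample.txt` and its contents are hypothetical). Rather than exporting `LC_ALL` for the whole session, it sets the variable for the one command only:

```shell
# Hypothetical sample in the same three-column layout as the data above.
printf '%s\n' '0 - ATTT' '0 + GTCG' '1 + ACTA' > sample.txt

# Byte-order collation for this single command; '+' (0x2B) sorts before '-' (0x2D).
LC_ALL=C sort -k2 sample.txt
# Output in the C/POSIX locale:
#   1 + ACTA
#   0 + GTCG
#   0 - ATTT
```

Setting the variable inline leaves the rest of the shell session in its usual locale, which may matter if other tools rely on it.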
From: chris on 3 May 2010 08:32

On Mon, 03 May 2010 13:25:22 +0100, Gordon Henderson
<gordon+usenet(a)drogon.net> wrote:

> In article <op.vb4nfilos4ghqh(a)caterpillar.compbio.dundee.ac.uk>,
> chris <ithinkiam(a)gmail.com> wrote:
>> Hi all,
>>
>> Which looks like it's sorting column 2 then 3. Anyone else seen this and
>> is it a bug?
>
> It's not a bug, but a feature.
> See:
> http://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Sort-does-not-sort-in-normal-order_0021
>
> where it says:
>
> # If you use bash or some other Bourne-based shell,
> export LC_ALL=POSIX
> # If you use a C-shell,
> setenv LC_ALL POSIX
>
> Gordon

LOCALES strikes again! Thanks Gordon.

From: Tom Anderson on 3 May 2010 08:56

On Mon, 3 May 2010, Gordon Henderson wrote:

> In article <op.vb4nfilos4ghqh(a)caterpillar.compbio.dundee.ac.uk>,
> chris <ithinkiam(a)gmail.com> wrote:
>> Hi all,
>>
>> Which looks like it's sorting column 2 then 3. Anyone else seen this and
>> is it a bug?
>
> It's not a bug, but a feature.
>
> See: http://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Sort-does-not-sort-in-normal-order_0021

I don't get it. I tried that locally, and got the same problem as Chris, and the solution there fixed it. But I don't understand why. Why does the locale affect sorting of a column containing only + and -? Is it something to do with how the columns are defined? Is it that the collation sequence for en_GB.UTF-8 sorts '-' and '+' equally, and so sort falls back to comparing the whole line? If the latter, is that not an astonishing bug?

tom

-- 
Science is the outcome of being prepared to live without certainty and therefore a mark of maturity. -- AC Grayling

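One possible reading of Tom's question, offered as an assumption and not as an answer given in this thread: `-k2` with no end position defines a key that runs from field 2 to the end of the line, so the sequence in column 3 is part of the key. If the locale's collation gives punctuation very little weight, the letters after `+` or `-` would then dominate the comparison. The sketch below bounds the key with `-k2,2` and uses `-s` to switch off the last-resort whole-line comparison; `sample.txt` is a hypothetical stand-in for the real data.

```shell
# Hypothetical sample in the same three-column layout as Chris's data.
printf '%s\n' '0 - ATTT' '0 + GTCG' '1 + ACTA' > sample.txt

# -k2,2 : the key starts AND ends at field 2, so column 3 is not compared.
# -s    : stable sort; lines with equal keys keep their input order.
LC_ALL=C sort -s -k2,2 sample.txt
# Output in the C/POSIX locale:
#   0 + GTCG
#   1 + ACTA
#   0 - ATTT
```

Note that the two `+` lines stay in input order here, whereas plain `-k2` would order them by the sequence as well.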
From: chris on 3 May 2010 10:22

On Mon, 03 May 2010 13:32:09 +0100, chris <ithinkiam(a)gmail.com> wrote:

> On Mon, 03 May 2010 13:25:22 +0100, Gordon Henderson
> <gordon+usenet(a)drogon.net> wrote:
>
>> In article <op.vb4nfilos4ghqh(a)caterpillar.compbio.dundee.ac.uk>,
>> chris <ithinkiam(a)gmail.com> wrote:
>>> Hi all,
>>>
>>> Which looks like it's sorting column 2 then 3. Anyone else seen this and
>>> is it a bug?
>>
>> It's not a bug, but a feature.
>> See:
>> http://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Sort-does-not-sort-in-normal-order_0021
>>
>> where it says:
>>
>> # If you use bash or some other Bourne-based shell,
>> export LC_ALL=POSIX
>> # If you use a C-shell,
>> setenv LC_ALL POSIX
>>
>> Gordon
>
> LOCALES strikes again! Thanks Gordon.

Wait a sec! This issue initially cropped up with a multi-column sort and I thought I'd whittled it down to a 'simple' example. However, the original problem is still not solved.

Given this file:

2 20140192 +
0 25394313 +
0 17128576 -
1 19332581 -
2 5214084 -
0 9019334 -
2 1232272 -
2 11075440 -
3 242532 +
3 7434705 -
1 19397725 -
1 8621880 +
2 17445849 -
1 6685383 -
4 15377341 +
1 14265470 +
3 796183 +
3 13285233 -
2 5241794 -
0 2370091 +

I want to sort on -k1n -k3 -k2n, but it still doesn't work even with LC_ALL=POSIX? I can sort on columns 1 and 3 or 1 and 2, but three gives:

> $ sort -k1n -k3 -nk2 file
0 2370091 +
0 9019334 -
0 17128576 -
0 25394313 +
1 6685383 -
1 8621880 +
1 14265470 +
1 19332581 -
1 19397725 -
2 1232272 -
2 5214084 -
2 5241794 -
2 11075440 -
2 17445849 -
2 20140192 +
3 242532 +
3 796183 +
3 7434705 -
3 13285233 -
4 15377341 +

It's sorting cols 1 and 2, but not 3. What's wrong here?
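A hedged sketch of one likely cause, not an answer taken from this thread: in `-nk2` the `-n` is parsed as a global option, and global ordering options are inherited by any key that carries none of its own. That would make `-k3` numeric too, so `+` and `-` both compare as zero and the tie falls through to column 2, which matches the output above. Attaching `n` to the individual keys and bounding each key to a single field avoids this. The file name `multi.txt` is hypothetical; its six rows are taken from Chris's data.

```shell
# Six rows from the file above, enough to show the tie-breaking order.
printf '%s\n' \
  '0 25394313 +' '0 17128576 -' '0 9019334 -' \
  '0 2370091 +'  '1 8621880 +'  '1 6685383 -' > multi.txt

# Column 1 numeric, then column 3 as text, then column 2 numeric.
# No global -n, so the middle key is compared byte by byte.
LC_ALL=C sort -k1,1n -k3,3 -k2,2n multi.txt
# Output in the C/POSIX locale:
#   0 2370091 +
#   0 25394313 +
#   0 9019334 -
#   0 17128576 -
#   1 8621880 +
#   1 6685383 -
```

Writing the modifier after the field number (`-k2,2n`) keeps it local to that key, so the other keys are unaffected.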