From: evilweasel on 28 Jan 2010 10:07

Hi folks,

I am a newbie to python, and I would be grateful if someone could point out the mistake in my program. Basically, I have a huge text file similar to the format below:

AAAAAGACTCGAGTGCGCGGA 0
AAAAAGATAAGCTAATTAAGCTACTGG 0
AAAAAGATAAGCTAATTAAGCTACTGGGTT 1
AAAAAGGGGGCTCACAGGGGAGGGGTAT 1
AAAAAGGTCGCCTGACGGCTGC 0

The text is nothing but DNA sequences, and there is a number next to each one. What I have to do is ignore the lines that have 0 in them and print all other lines (excluding the number) to a new text file (in a particular format called FASTA). This is the program I wrote for that:

seq1 = []
list1 = []
lister = []
listers = []
listers1 = []
a = []
d = []
i = 0
j = 0
num = 0

file1 = open(sys.argv[1], 'r')
for line in file1:
    if not line.startswith('\n'):
        seq1 = line.split()
        if len(seq1) == 0:
            continue

        a = seq1[0]
        list1.append(a)

        d = seq1[1]
        lister.append(d)


b = len(lister)
for j in range(0, b):
    if lister[j] == 0:
        listers.append(j)
    else:
        listers1.append(j)


print listers1
resultsfile = open("sequences1.txt", 'w')
for i in listers1:
    resultsfile.write('\n>seq' + str(i) + '\n' + list1[i] + '\n')

But this isn't working. I am not able to find the bug in this. I would be thankful if someone could point it out. Thanks in advance!

Cheers!
From: Alf P. Steinbach on 28 Jan 2010 10:17

* evilweasel:
<snip>
> But this isn't working.

What do you mean by "isn't working"?

> I am not able to find the bug in this. I would
> be thankful if someone could point it out. Thanks in advance!

What do you expect as output, and what do you actually get as output?

Cheers,

- Alf
From: Mark Dickinson on 28 Jan 2010 10:22

On Jan 28, 3:07 pm, evilweasel <karthikramaswam...(a)gmail.com> wrote:
> Hi folks,
>
> I am a newbie to python, and I would be grateful if someone could
> point out the mistake in my program.
<snip>
> for j in range(0, b):
>     if lister[j] == 0:

At a guess, this line should be:

    if lister[j] == '0':

...

--
Mark
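Mark's guess can be checked directly. `str.split()` returns strings, so the second field of each line is the string '0' or '1', and a string never compares equal to the integer 0. In the posted program that means every index ends up in listers1. A minimal sketch, using a made-up sample line in the poster's format (the variable names here are illustrative, not from the original program):

```python
# Hypothetical sample line in the format shown in the question.
line = "AAAAAGACTCGAGTGCGCGGA 0\n"

fields = line.split()        # split() always yields a list of strings
flag = fields[1]

print(type(flag).__name__)   # the flag is a str, not an int
print(flag == 0)             # str compared with int: never equal
print(flag == '0')           # compare against the string instead
print(int(flag) == 0)        # or convert to int before comparing
```

Either comparing against '0' or converting with int() works; converting is the safer choice if the counts can be larger than 1.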
From: Krister Svanlund on 28 Jan 2010 10:28

On Thu, Jan 28, 2010 at 4:07 PM, evilweasel <karthikramaswamy88(a)gmail.com> wrote:
<snip>
> But this isn't working. I am not able to find the bug in this. I would
> be thankful if someone could point it out. Thanks in advance!
>
> Cheers!

I'm not totally sure what you want to do but try this (python2.6+):

import sys

newlines = []

with open(sys.argv[1], 'r') as f:
    text = f.read()
    for line in text.splitlines():
        # keep non-blank lines whose trailing number is 1
        if line.strip() and line.strip().endswith('1'):
            newlines.append('seq'+line)

with open(sys.argv[2], 'w') as f:
    f.write('\n'.join(newlines))
From: Krister Svanlund on 28 Jan 2010 10:31

On Thu, Jan 28, 2010 at 4:28 PM, Krister Svanlund <krister.svanlund(a)gmail.com> wrote:
<snip>
> I'm not totally sure what you want to do but try this (python2.6+):
>
> import sys
>
> newlines = []
>
> with open(sys.argv[1], 'r') as f:
>     text = f.read()
>     for line in text.splitlines():
>         # keep non-blank lines whose trailing number is 1,
>         # and strip that number from the output
>         if line.strip() and line.strip().endswith('1'):
>             newlines.append('seq'+line.strip()[:-1].strip())
>
> with open(sys.argv[2], 'w') as f:
>     f.write('\n'.join(newlines))
>

Gah, made some errors
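Since the input is described as a huge file, it may be worth avoiding f.read() and processing one line at a time. The following is only a sketch under a few assumptions: the function name fasta_records is made up for illustration, any number other than 0 counts as "keep", the record number is the position among non-blank lines (as in the original program), and the leading blank line the original wrote before each header is dropped.

```python
import io

def fasta_records(lines):
    """Yield one FASTA record per line whose number field is not '0'.

    Hypothetical helper, not part of any library.
    """
    index = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 2:          # skip blank or malformed lines
            continue
        sequence, count = fields[0], fields[1]
        if count != '0':             # the field is a string, so compare to '0'
            yield '>seq%d\n%s\n' % (index, sequence)
        index += 1                   # number every non-blank line

# Small in-memory stand-in for the real input file.
sample = io.StringIO(
    "AAAAAGACTCGAGTGCGCGGA 0\n"
    "AAAAAGATAAGCTAATTAAGCTACTGGGTT 1\n"
)
output = ''.join(fasta_records(sample))
print(output)
```

With real files, the same generator can be fed an open input file and each record written to the output file as it is produced, so the whole input never has to sit in memory.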