Need help with a program [Python]


From: evilweasel on 28 Jan 2010 10:07

Hi folks,

I am a newbie to python, and I would be grateful if someone could
point out the mistake in my program. Basically, I have a huge text
file similar to the format below:

AAAAAGACTCGAGTGCGCGGA 0
AAAAAGATAAGCTAATTAAGCTACTGG 0
AAAAAGATAAGCTAATTAAGCTACTGGGTT 1
AAAAAGGGGGCTCACAGGGGAGGGGTAT 1
AAAAAGGTCGCCTGACGGCTGC 0

The text is nothing but DNA sequences, each followed by a number.
What I have to do is ignore the lines that have 0 in them and write
all the other lines (excluding the number) to a new text file, in a
particular format called FASTA. This is the program I wrote for that:

seq1 = []
list1 = []
lister = []
listers = []
listers1 = []
a = []
d = []
i = 0
j = 0
num = 0

file1 = open(sys.argv[1], 'r')
for line in file1:
    if not line.startswith('\n'):
        seq1 = line.split()
        if len(seq1) == 0:
            continue

        a = seq1[0]
        list1.append(a)

        d = seq1[1]
        lister.append(d)

b = len(lister)
for j in range(0, b):
    if lister[j] == 0:
        listers.append(j)
    else:
        listers1.append(j)

print listers1
resultsfile = open("sequences1.txt", 'w')
for i in listers1:
    resultsfile.write('\n>seq' + str(i) + '\n' + list1[i] + '\n')

But this isn't working. I am not able to find the bug in this. I would
be thankful if someone could point it out. Thanks in advance!

Cheers!

From: Alf P. Steinbach on 28 Jan 2010 10:17

* evilweasel:
> But this isn't working.

What do you mean by "isn't working"?

> I am not able to find the bug in this. I would
> be thankful if someone could point it out. Thanks in advance!

What do you expect as output, and what do you actually get as output?

Cheers,

- Alf

From: Mark Dickinson on 28 Jan 2010 10:22

On Jan 28, 3:07 pm, evilweasel <karthikramaswam...(a)gmail.com> wrote:
> Hi folks,
>
> I am a newbie to python, and I would be grateful if someone could
> point out the mistake in my program.

<snip>

> for j in range(0, b):
>     if lister[j] == 0:

At a guess, this line should be:

    if lister[j] == '0':
        ...

--
Mark
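
A brief illustration of why the comparison against the integer 0 never matches, assuming the values in lister come straight from line.split() as in the original program: split() returns strings, and in Python a string never compares equal to an integer. The sample line is taken from the original post; the variable names are only for illustration.

```python
# str.split() yields strings, so the flag column is '0' or '1', never 0 or 1.
fields = "AAAAAGACTCGAGTGCGCGGA 0".split()
flag = fields[1]

matches_int = (flag == 0)         # False: a str never equals an int
matches_str = (flag == '0')       # True: compare against the string instead
matches_conv = (int(flag) == 0)   # True: or convert the field to int first
```

Either fix works; comparing against '0' avoids a ValueError if a flag field ever contains something that is not a number.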

From: Krister Svanlund on 28 Jan 2010 10:28

On Thu, Jan 28, 2010 at 4:07 PM, evilweasel
<karthikramaswamy88(a)gmail.com> wrote:
> But this isn't working. I am not able to find the bug in this. I would
> be thankful if someone could point it out. Thanks in advance!
>
> Cheers!

I'm not totally sure what you want to do, but try this (Python 2.6+):

newlines = []

with open(sys.argv[1], 'r') as f:
    text = f.read()
for line in text.splitlines():
    if not line.strip() and line.strip().endswith('1'):
        newlines.append('seq'+line)

with open(sys.argv[2], 'w') as f:
    f.write('\n'.join(newlines))
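
A quick check of the condition in the snippet above, as posted: `not line.strip()` is true only for blank lines, and a blank line cannot end with '1', so the two halves joined by `and` can never both hold and newlines stays empty. The sketch below compares that test with one that requires a non-blank line, which is only a guess at the intent; the helper names as_posted and non_blank are hypothetical.

```python
def as_posted(line):
    # Condition exactly as written in the snippet above.
    return not line.strip() and line.strip().endswith('1')


def non_blank(line):
    # Assumed intent: keep non-blank lines whose last character is '1'.
    return bool(line.strip()) and line.strip().endswith('1')


samples = ["AAAAAGATAAGCTAATTAAGCTACTGGGTT 1", "AAAAAGGTCGCCTGACGGCTGC 0", ""]
posted_results = [as_posted(s) for s in samples]      # [False, False, False]
intended_results = [non_blank(s) for s in samples]    # [True, False, False]
```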

From: Krister Svanlund on 28 Jan 2010 10:31

On Thu, Jan 28, 2010 at 4:28 PM, Krister Svanlund
<krister.svanlund(a)gmail.com> wrote:
> On Thu, Jan 28, 2010 at 4:07 PM, evilweasel
> <karthikramaswamy88(a)gmail.com> wrote:
>> But this isn't working. I am not able to find the bug in this. I would
>> be thankful if someone could point it out. Thanks in advance!
>>
>> Cheers!
>
> I'm not totally sure what you want to do, but try this (Python 2.6+):
>
> newlines = []
>
> with open(sys.argv[1], 'r') as f:
>     text = f.read()
> for line in text.splitlines():
>     if not line.strip() and line.strip().endswith('1'):
        newlines.append('seq'+line.strip()[:-1].strip())
>
> with open(sys.argv[2], 'w') as f:
>     f.write('\n'.join(newlines))
>

Gah, made some errors
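
For reference, a minimal sketch of the whole task under a few assumptions: every data line is a sequence followed by a whitespace-separated flag, lines whose flag is '0' are skipped, and each kept sequence gets a ">seqN" header where N is its position among the data lines, mirroring the list index used in the original program. The names iter_fasta and convert are hypothetical, and unlike the original it does not write a blank line before each header. It reads one line at a time, so a huge input file is never held in memory.

```python
def iter_fasta(lines):
    """Yield one FASTA record per input line whose flag is not '0'."""
    index = 0  # position among data lines, like the original list index
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines without counting them
        sequence, flag = fields[0], fields[1]
        if flag != '0':  # compare with the string '0', not the integer 0
            yield '>seq%d\n%s\n' % (index, sequence)
        index += 1


def convert(in_path, out_path):
    """Stream in_path to out_path, keeping only the flagged sequences."""
    with open(in_path) as infile:
        with open(out_path, 'w') as outfile:
            outfile.writelines(iter_fasta(infile))
```

A script could call convert(sys.argv[1], sys.argv[2]) after importing sys; taking both paths from the command line is an assumption borrowed from the snippet quoted above, not something the original program did.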
