simple indexing in Perl? [Perl]


From: Tad McClellan on 10 Aug 2010 10:26

Jens Thoms Toerring <jt(a)toerring.de> wrote:
> ela <ela(a)yantai.org> wrote:

>> print '($listfile, $accfile, $infofile)'; <STDIN>;
>
> What's that at end of the line good for?

Pausing the program until something is typed on STDIN.

--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
The above message is a Usenet post.
I don't recall having given anyone permission to use it on a Web site.

From: Jens Thoms Toerring on 10 Aug 2010 18:40

Tad McClellan <tadmc(a)seesig.invalid> wrote:
> Jens Thoms Toerring <jt(a)toerring.de> wrote:
> > ela <ela(a)yantai.org> wrote:

> >> print '($listfile, $accfile, $infofile)'; <STDIN>;
> >
> > What's that at end of the line good for?

> Pausing the program until something is typed on STDIN.

Oh, I see. I was a bit confused why to wait for some input
in that situation when one is complaining that the program
is taking so long;-)
Regards, Jens
--
\ Jens Thoms Toerring ___ jt(a)toerring.de
\__________________________ http://toerring.de

From: ela on 10 Aug 2010 22:51

After testing different approaches, Jens Thoms Toerring's works better and
therefore I modified the code accordingly. Now I just don't know why the
array content cannot be retrieved but only a number "1" is returned. Can
anyone tell me the reason? In fact I can simply pass $line instead of @cells,
but what I finally want to achieve is to print out only several cells
instead of all of them.

my %ahash;
while ( my $line = <$afp> ) {
    my @cells = split /\t/, $line;
    $ahash{ $cells[ 5 ] } = $cells[ 1 ];
}
close $afp;

open my $ifp, '<', $infofile or die "Can't open $infofile for reading\n";

my %ihash;
while ( my $line = <$ifp> ) {
    my @cells = split /\t/, $line;
    $ihash{ $cells[ 1 ] } = @cells;
}
close $ifp;

while ( my $line = <$fp> ) {
    if ( $line eq "\n" ) {
        print $ofp "\n";
        next;
    }
    chomp $line;

    if ( $format eq "" ) {
        @cells = split /:/, $line;
        $tag = $cells[ 0 ];
    } else {
        @cells = split /\t/, $line;
        $tag = $cells[ $acci ];
    }

    $gid = $ahash{ $tag } if exists $ahash{ $tag };
    @gene_info = $ihash{$gid};
    print $ofp "$line\t@gene_info";
}

close $fp;

From: Xho Jingleheimerschmidt on 10 Aug 2010 22:06

ela wrote:
> I'm new to database programming and just previously learnt to use loops to
> look up and enrich information using the following codes. However, when the
> tables are large,

How large?

> I find this process is very slow. Then, somebody told me I
> can build a database for one of the file real time and so no need to read
> the file from the beginning till the end again and again.

Not sure what you mean by "real time" here.

> However, perl DBI
> has a lot of sophisticated functions there and in fact my tables are only
> large but nothing special, linked by an ID.

Data is data. It doesn't need to be "something special" in order to be put
into a database. Databases themselves are nothing special, just
specialized tools to do a specialized job.

> Is there any simple way to
> achieve the same purpose? I just wish the ID can be indexed and then
> everytime I access the record through memory and not through I/O...

You can read the data into a hash, depending on just how large it is,
and exactly how it needs to be matched.
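A minimal sketch of that idea, assuming the match is an exact string match on one column. The sample rows, the column positions and the name %id_for are made up for illustration, and the data is read from an in-memory filehandle so that the sketch runs without any files:

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the accession file:
# tab-separated, ID in column index 1, tag in column index 5.
my $acc_data = join '',
    "a\tGENE1\tc\td\te\tTAG_A\n",
    "a\tGENE2\tc\td\te\tTAG_B\n";

# Read from an in-memory filehandle so the sketch needs no real file.
open my $afp, '<', \$acc_data or die "Can't open in-memory data: $!\n";

# Build the index once: tag => ID.
my %id_for;
while ( my $line = <$afp> ) {
    chomp $line;
    my @cells = split /\t/, $line;
    $id_for{ $cells[5] } = $cells[1];
}
close $afp;

# Every later lookup is a hash access, not another pass over the file.
for my $tag (qw( TAG_B TAG_X )) {
    my $id = exists $id_for{$tag} ? $id_for{$tag} : 'No gene info available';
    print "$tag\t$id\n";
}
```

The file is read once, and each lookup afterwards costs about the same no matter how many rows were loaded. Whether the whole table fits in memory still depends on how large it is.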

> open (OFP, ">$outname");
>
> open(FP, $listfile);

You should check that your open commands succeed.
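One common way to do that check, sketched here with a hypothetical file name that is expected not to exist. The three-argument form of open and the lexical filehandle are my choices, not something your code has to use:

```perl
use strict;
use warnings;

# Hypothetical file name, chosen so that the open is expected to fail.
my $listfile = 'no-such-dir/list.txt';

# eval traps the die so that the message can be shown instead of
# ending the program; in a real script you would usually just die.
my $ok = eval {
    open my $fp, '<', $listfile
        or die "Can't open $listfile for reading: $!\n";
    close $fp;
    1;
};
print "open failed: $@" unless $ok;
```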

>
> print OFP "$line\tgene info\n";
>
> $nl = '\n';

This is never used, and I don't see what one would use it for.

>
> while (<FP>) {
....

>
> open(AFP, $accfile);

Again, you should check that the open succeeds.

>
> while (<AFP>) {
> @cells = split (/\t/, $_);
> if ($cells[5] =~ /$tag/) {
> $des = $cells[1];
> last;
> }
> }
> close AFP;

This would actually be quite hard to optimize if the match really needs
to be as written, $cells[5] =~ /$tag/. Are you sure it wouldn't still
be correct (or even be more correct) to test $cells[5] eq $tag, or at
least $cells[5] =~ /^\Q$tag/ ?
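To illustrate the difference between the three tests, here is a small sketch with made-up values. The tag deliberately contains a regex metacharacter (the dot), which is an assumption about what your tags might look like:

```perl
use strict;
use warnings;

# Hypothetical values: a tag with a dot in it, and an unrelated cell
# that merely looks similar.
my $tag  = 'AB1.2';
my $cell = 'XAB132';

# Unanchored and unquoted: '.' matches any character and the match may
# start anywhere in the cell, so the unrelated cell matches.
my $loose = $cell =~ /$tag/ ? 1 : 0;

# Quoted and anchored at the start: the dot must be a literal dot and
# the cell must begin with the tag.
my $prefix = $cell =~ /^\Q$tag/ ? 1 : 0;

# Exact string comparison, which is what a hash lookup relies on.
my $exact = $cell eq $tag ? 1 : 0;

print "loose=$loose prefix=$prefix exact=$exact\n";
# prints "loose=1 prefix=0 exact=0"
```

Only the eq form maps directly onto a hash lookup. A prefix match would need a different index, for example one keyed on the prefix.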

>
> if ($found == 0) {
> print OFP "$line\tNo gene info available\n";
> }
> }

In your code, $found never gets set to anything, or changed.

Xho

From: Tad McClellan on 10 Aug 2010 23:17

ela <ela(a)yantai.org> wrote:

> Now I just don't know why the
> array content cannot be retrieved but only a number "1" is returned. Can
> anyone tell me the reason?

> open my $ifp, '<', $infofile or die "Can't open $infofile for reading\n";

You should include the reason (in $!) for open's failure in your diag message:

open my $ifp, '<', $infofile or die "Can't open $infofile for reading: $!\n";
                                                                       ^^
                                                                       ^^

> my %ihash;
> while ( my $line = <$ifp> ) {
> my @cells = split /\t/, $line;
> $ihash{ $cells[ 1 ] } = @cells;

@cells is in scalar context there, so it returns the number of
elements in the array. See the "Context" section in:

perldoc perldata
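A short sketch of the two contexts, using a made-up row:

```perl
use strict;
use warnings;

my @cells = ( 'id', 'G1', 'kinase' );   # hypothetical row

my $count   = @cells;    # scalar context: the number of elements
my ($first) = @cells;    # list context: the first element

# A hash value is a single scalar slot, so the assignment below puts
# @cells in scalar context and stores the element count.
my %ihash;
$ihash{ $cells[1] } = @cells;

print "count=$count first=$first stored=$ihash{G1}\n";
# prints "count=3 first=id stored=3"
```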

You cannot store an array in a hash value, you need to instead
store a *reference* to an array.

You should read all of:

perldoc perlreftut

Apply "Make Rule 1" from perlreftut:

$ihash{ $cells[ 1 ] } = \@cells;

> if ( $format eq "" ) {
> @cells = split /:/, $line;
> $tag = $cells[ 0 ];

You can do away with the @cells temporary variable by making
use of a List Assignment, replacing the 2 lines above with this line:

($tag) = split /:/, $line;
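For example, with a made-up input line (the parentheses on the left are what make it a list assignment; extra fields returned by split are simply discarded):

```perl
use strict;
use warnings;

my $line = 'TAG_A:some:other:fields';   # hypothetical input line

# List assignment: only the first field is kept.
my ($tag) = split /:/, $line;

# The same idea with two variables on the left.
my ( $first, $second ) = split /:/, $line;

print "$tag $first $second\n";
# prints "TAG_A TAG_A some"
```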

> $gid = $ahash{ $tag } if exists $ahash{ $tag };
> @gene_info = $ihash{$gid};

To retrieve the cells stored earlier, apply "Use Rule 1"
from perlreftut to dereference the stored reference:

@gene_info = @{ $ihash{$gid} };
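Putting the two rules together, here is a sketch with made-up rows. Since you said you only want to print several cells, it also shows an array slice taken through the reference; the column numbers are arbitrary:

```perl
use strict;
use warnings;

my %ihash;

# Hypothetical rows standing in for the lines of the info file.
for my $line ( "x\tG1\tkinase\tchr1", "y\tG2\tligase\tchr2" ) {
    my @cells = split /\t/, $line;
    # "my" inside the loop gives a fresh @cells each time round,
    # so every stored reference points at its own array.
    $ihash{ $cells[1] } = \@cells;
}

my $gid = 'G2';

# Dereference to get all of the cells back ...
my @gene_info = @{ $ihash{$gid} };

# ... or slice through the reference to get only some of them.
my @some = @{ $ihash{$gid} }[ 2, 3 ];

print "@gene_info\n";   # prints "y G2 ligase chr2"
print "@some\n";        # prints "ligase chr2"
```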

--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
The above message is a Usenet post.
I don't recall having given anyone permission to use it on a Web site.
