building a hash path index? [Perl]

Prev: FAQ 4.67 How can I make my hash remember the order I put elements into it?
Next: FAQ 6.17 How do I efficiently match many regular expressions at once?

From: Bo Lindbergh on 9 Jul 2010 16:10

In article <87y6dksr5e.fsf(a)lifelogs.com>, Ted Zlatanov <tzz(a)lifelogs.com>
wrote:

> On Fri, 09 Jul 2010 09:10:20 +0100 bugbear <bugbear(a)trim_papermule.co.uk_trim> wrote:
>
> b> What's annoying is how trivial this is for fixed-length field lists
> b> e.g. 2:
>
> b> sub mk_2_index {
> b> my ($list, $fields) = @_;
> b> my $index;
> b> foreach my $h (@$list) {
> b> push @{$index->{$h->{$fields->[0]}}->{$h->{$fields->[1]}}}, $h;
> b> }
> b> return $index;
> b> }
>
> That's the right direction, but by condensing and using so many
> shortcuts you've robbed yourself of the chance to see the general
> solution.
>
> An alternative would have been to use Hash::Merge; construct each
> entry's tree (e.g. { 10 => { 20 => { a => 10, b => 20 } } } )
> individually and merge them all into one hash. But since that will be
> less efficient (I think) I went with the recursive standalone version
> below. It produces the results you want and will work as long as all
> the entries have the keys required.
>
> #!/usr/bin/perl
>
> use warnings;
> use strict;
> use Data::Dumper;
>
> my $data = [
> {
> a => 10,
> b => 20,
> },
> {
> a => 10,
> b => 25,
> c => "thing",
> },
> {
> a => 10,
> b => 25,
> c => "other thing",
> },
> {
> a => 12,
> b => 25,
> },
> ];
>
> my $index = mk_index($data, [ 'a', 'b'] );
> print Dumper($index);
>
> sub mk_index
> {
> my $data = shift @_;
> my $fields = shift @_;
>
> return $data unless scalar @$fields;
>
> my @fields = @$fields;
> my $field = shift @fields;
>
> my %uniques;
> foreach my $entry (@$data)
> {
> push @{$uniques{$entry->{$field}}}, $entry;
> }
>
> my %h;
>
> foreach my $unique (keys %uniques)
> {
> $h{$unique} = mk_index($uniques{$unique}, \@fields);
> }
>
> return \%h;
> }

No need for recursion.

sub mk_index
{
my($data, $fields) = @_;
my(@firstfields, $lastfield, %result);

@firstfields = @{$fields};
$lastfield = pop(@firstfields);
return $data unless defined($lastfield);
foreach my $entry (@{$data}) {
my($walk);

$walk = \%result;
foreach my $field (@firstfields) {
$walk = $walk->{$entry->{$field}} ||= {};
}
push(@{$walk->{$entry->{$lastfield}}}, $entry);
}
return \%result;
}

Any more contestants, or is it time to use Benchmark qw(cmpthese)? :-)

/Bo Lindbergh

From: sln on 9 Jul 2010 19:05

On Fri, 09 Jul 2010 22:10:29 +0200, Bo Lindbergh <blgl(a)stacken.kth.se> wrote:

>
>No need for recursion.
>
>sub mk_index
>{
> my($data, $fields) = @_;
> my(@firstfields, $lastfield, %result);
>
> @firstfields = @{$fields};
> $lastfield = pop(@firstfields);
> return $data unless defined($lastfield);
> foreach my $entry (@{$data}) {
> my($walk);
>
> $walk = \%result;
> foreach my $field (@firstfields) {
> $walk = $walk->{$entry->{$field}} ||= {};
> }
> push(@{$walk->{$entry->{$lastfield}}}, $entry);
> }
> return \%result;
>}
>
>Any more contestants, or is it time to use Benchmark qw(cmpthese)? :-)
>
>
>/Bo Lindbergh

Fire away!

From: bugbear on 19 Jul 2010 04:28

Bo Lindbergh wrote:
>
> Any more contestants, or is it time to use Benchmark qw(cmpthese)? :-)
>

I was far more interested in elegance and correctness
than performance - as it happens my data structures
are not especially large (500 hashes).

Thanks to all who helped.

BugBear

First | Prev |
Pages: 1 2
Prev: FAQ 4.67 How can I make my hash remember the order I put elements into it?
Next: FAQ 6.17 How do I efficiently match many regular expressions at once?