From: bugbear on 7 Jul 2010 04:48

I have a large array of hashes, and each hash has several fields.

I would like to be able to group the hashes by some of the fields, so I thought of creating a hash in which I can find an array of selected hashes via:

    $index->{field1}->{field2}->{field3}

I would like to create a function which could be called like this:

    my $hash_index = make_index($list_of_hashes, [ 'date', 'name' ])

resulting in $hash_index being a hash such that

    $hash_index->{"jul-10"}->{paul}

is an array of all hashes with the corresponding date and name.

This is easy to do with a fixed field list, but I can't see a clear road to parameterising it as per my example call.

It's the variable length of the index-name-array that causes me difficulty.

BugBear
From: bugbear on 7 Jul 2010 05:05

bugbear wrote:
> This is easy to do with a fixed field list,
> but I can't see a clear road to parameterising
> it as per my example call.
>
> It's the variable length of the index-name-array
> that causes me difficulty.

Here's my inelegant code; I suspect there's a MUCH more elegant solution to be had:

    sub _mk_index {
        my ($dst, $hash, $fields) = @_;
        if(scalar(@$fields) == 0) {
            if(!defined($dst)) {
                $dst = [];
            }
            push @$dst, $hash;
        } else {
            if(!defined($dst)) {
                $dst = {};
            }
            my $key = $hash->{$fields->[0]};
            my @tail = @$fields;
            shift @tail;
            $dst->{$key} = _mk_index($dst->{$key}, $hash, \@tail);
        }
        return $dst;
    }

    sub mk_index {
        my ($list, $fields) = @_;
        my $index;
        foreach my $h (@$list) {
            $index = _mk_index($index, $h, $fields);
        }
        return $index;
    }

BugBear
From: sln on 8 Jul 2010 18:50

On Wed, 07 Jul 2010 10:05:54 +0100, bugbear <bugbear(a)trim_papermule.co.uk_trim> wrote:

>> I would like to be able to group
>> the hashes by some of the fields,
>> so I thought of creating a hash so
>> that I find an array of selected hashes via:
>>
>> $index->{field1}->{field2}->{field3}

Is this what you mean?

    my $index = {
        'field1' => {
            'field2' => {
                'field3' => {},
            },
        },
    };

>Here's my inelegant code; I suspect there's a MUCH
>more elegant solution to be had:
>
>[snip]

It would be better if you post a working example.
I imagine you call mk_index() from the main code?

-sln
From: bugbear on 9 Jul 2010 04:10

sln(a)netherlands.com wrote:
>
> It would be better if you post a working example.
> I imagine you call mk_index() from the main code?

Of course - it's a library-style utility method.

Here's my "test":

    my $data = [
        { a => 10, b => 20, },
        { a => 10, b => 25, c => "thing", },
        { a => 10, b => 25, c => "other thing", },
        { a => 12, b => 25, },
    ];

    my $index = mk_index($data, [ 'a', 'b'] );
    print Dumper($index);

And the desired result:

    $VAR1 = {
        '10' => {
            '25' => [
                { 'c' => 'thing', 'a' => 10, 'b' => 25 },
                { 'c' => 'other thing', 'a' => 10, 'b' => 25 }
            ],
            '20' => [
                { 'a' => 10, 'b' => 20 }
            ]
        },
        '12' => {
            '25' => [
                { 'a' => 12, 'b' => 25 }
            ]
        }
    };

What's annoying is how trivial this is for fixed-length field lists, e.g. 2:

    sub mk_2_index {
        my ($list, $fields) = @_;
        my $index;
        foreach my $h (@$list) {
            push @{$index->{$h->{$fields->[0]}}->{$h->{$fields->[1]}}}, $h;
        }
        return $index;
    }

BugBear
From: Ted Zlatanov on 9 Jul 2010 09:47
On Fri, 09 Jul 2010 09:10:20 +0100 bugbear <bugbear(a)trim_papermule.co.uk_trim> wrote:

b> What's annoying is how trivial this is for fixed-length field lists
b> e.g. 2:

b> sub mk_2_index {
b>     my ($list, $fields) = @_;
b>     my $index;
b>     foreach my $h (@$list) {
b>         push @{$index->{$h->{$fields->[0]}}->{$h->{$fields->[1]}}}, $h;
b>     }
b>     return $index;
b> }

That's the right direction, but by condensing and using so many shortcuts you've robbed yourself of the chance to see the general solution.

An alternative would have been to use Hash::Merge; construct each entry's tree (e.g. { 10 => { 20 => { a => 10, b => 20 } } } ) individually and merge them all into one hash. But since that will be less efficient (I think) I went with the recursive standalone version below. It produces the results you want and will work as long as all the entries have the keys required.

    #!/usr/bin/perl

    use warnings;
    use strict;
    use Data::Dumper;

    my $data = [
        { a => 10, b => 20, },
        { a => 10, b => 25, c => "thing", },
        { a => 10, b => 25, c => "other thing", },
        { a => 12, b => 25, },
    ];

    my $index = mk_index($data, [ 'a', 'b'] );
    print Dumper($index);

    sub mk_index
    {
        my $data   = shift @_;
        my $fields = shift @_;

        # No fields left: this list of entries is the leaf.
        return $data unless scalar @$fields;

        my @fields = @$fields;
        my $field  = shift @fields;

        # Partition the entries by their value for the current field.
        my %uniques;
        foreach my $entry (@$data)
        {
            push @{$uniques{$entry->{$field}}}, $entry;
        }

        # Index each partition by the remaining fields.
        my %h;
        foreach my $unique (keys %uniques)
        {
            $h{$unique} = mk_index($uniques{$unique}, \@fields);
        }

        return \%h;
    }
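
For comparison, the fixed-length mk_2_index one-liner quoted in this thread can be stretched to any number of fields without recursion, by walking a reference-to-a-slot down the tree and letting autovivification create the intermediate hashes. What follows is only a minimal sketch, assuming every entry has every indexed field; the name mk_index_iter and the $slot cursor are made up for illustration and do not come from any poster's code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical name: an iterative take on the thread's mk_index().
sub mk_index_iter {
    my ($list, $fields) = @_;
    my $index;
    foreach my $h (@$list) {
        # $slot refers to the place where the next level will live.
        my $slot = \$index;
        # Descend one level per field. Taking a reference to the hash
        # element autovivifies the intermediate hashes as needed.
        $slot = \$$slot->{ $h->{$_} } foreach @$fields;
        # The innermost slot autovivifies into an array of entries.
        push @$$slot, $h;
    }
    return $index;
}

# Same test data as posted in the thread.
my $data = [
    { a => 10, b => 20, },
    { a => 10, b => 25, c => "thing", },
    { a => 10, b => 25, c => "other thing", },
    { a => 12, b => 25, },
];

my $index = mk_index_iter($data, [ 'a', 'b' ]);
print scalar @{ $index->{10}{25} }, "\n";    # prints 2
print join(',', sort keys %$index), "\n";    # prints 10,12
```

As with the recursive version, an entry that lacks one of the fields is not handled specially: under "use warnings" it triggers an uninitialized-value warning and gets filed under the empty-string key. With an empty field list the result is simply an array of all the entries.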