From: nobody on 13 Nov 2009 19:12

I'm trying to process flat files with many thousands of records. In
these files several rows comprise the information for a single customer.
In the example __DATA__ below, I'm trying to fill the variables with the
customer information while the customer number is 06020004293, then for
customer number 07020000279, and finally customer number 09020000251. I
believe my problem is looping while the customer number remains the same,
then move on to the next customer numbers. I've been pulling my hair out
with nested while and do loops. I've included the desired output below.
Here's what I'm working with so far:

#!/usr/bin/perl

use strict;
use warnings;

my (
    $Name,
    $City,
    $Street
);

while (<DATA>) {

    chomp;

    if (substr($_, 12, 1) eq 'A') {
        $Name = substr($_, 14, 17);
    }

    if (substr($_, 12, 1) eq 'B') {
        $City = substr($_, 14, 17);
    }

    if (substr($_, 12, 1) eq 'C') {
        $Street = substr($_, 33, 19);
    }

}

print "Name: $Name\n";
print "City: $City\n";
print "Street: $Street\n";

# Desired output:

#Name: Fred Flintstone
#City: Bedrock
#Street: 123 Bedrock Road

#Name: George Washington
#City: Washington D.C.
#Street:

#Name: Joe Smith
#City: Smallville
#Street:

__DATA__
06020004293 A Fred Flintstone 123 Bedrock Road
06020004293 B Bedrock Gravel Pit
06020004293 C Loney Toons 123 Bedrock Road
07020000279 A George Washington 234 Washington Ave.
07020000279 B Washington D.C. 234 Washington Ave.
09020000251 A Joe Smith 54 Abbey Road
09020000251 B Smallville 54 Abbey Road

From: Ben Morrow on 13 Nov 2009 20:56

Quoth nobody <nobody(a)nowhere.com>:
> I'm trying to process flat files with many thousands of records. In
> these files several rows comprise the information for a single customer.
> In the example __DATA__ below, I'm trying to fill the variables with the
> customer information while the customer number is 06020004293, then for
> customer number 07020000279, and finally customer number 09020000251. I
> believe my problem is looping while the customer number remains the same,
> then move on to the next customer numbers. I've been pulling my hair out
> with nested while and do loops. I've included the desired output below.
> Here's what I'm working with so far:

Since you want to print out the information you have every time you see
a new customer number, you need to extract and remember the number from
each line. I'll make minimal additions to your code to achieve this,
then talk about general style later.

> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> my (
>     $Name,
>     $City,
>     $Street,
      $Customer,
      $Last_Customer,

Note that Perl explicitly allows you to include a trailing comma in
lists, so that you can add and remove lines without worrying about
whether this was the last entry or not.

> );
>
> while (<DATA>) {
>
>     chomp;
>

    # Get the customer number (the first 11 characters) for the new line
    $Customer = substr($_, 0, 11);

    if (
        # ...we've seen at least one line already, and...
        defined $Last_Customer and
        # ...the new line is for a different customer from the last...
        $Customer ne $Last_Customer
    ) {
        # ...print out the data for the old customer before we proceed
        # to extract the data for the new one.
        print "Name: $Name\n";
        print "City: $City\n";
        print "Street: $Street\n";
        print "\n";

        # Clear the old values so they don't carry over to a customer
        # that has no 'B' or 'C' line.
        $Name = $City = $Street = '';
    }

    # Remember which customer we were on for next time round the loop.
    $Last_Customer = $Customer;

>     if (substr($_, 12, 1) eq 'A') {
>         $Name = substr($_, 14, 17);
>     }
>
>     if (substr($_, 12, 1) eq 'B') {
>         $City = substr($_, 14, 17);
>     }
>
>     if (substr($_, 12, 1) eq 'C') {
>         $Street = substr($_, 33, 19);
>     }
>
> }

We need to keep this final section in this version of the program, since
otherwise the very last customer will never get their information
printed. *However*, that fact should immediately make you say to
yourself 'I've just written the same thing twice. How could I have
avoided that?'.

> print "Name: $Name\n";
> print "City: $City\n";
> print "Street: $Street\n";

The first comment to make about style is, IMHO, that multiple 'print'
statements are always a bad idea. Perl has a special form of multi-line
quoting called 'here documents' which allow you to avoid that:

print <<OUTPUT;
Name: $Name
City: $City
Street: $Street
OUTPUT

See the section "<<EOF" in perldoc perlop for more details.

The second is that it would be much easier to split the line into fields
first, rather than picking out pieces as you need them. For this I would
use a regex, which will additionally let you check that the line looks
as you expect. So, I might write something like

    my @record = /^(\d{11}) ([ABC]) (.{17}) (.{19})$/
        or die "Invalid record: [$_]";

which does rather a lot of things in one statement. First the /.../
expression matches $_ against the given pattern, and returns a list of
substrings. Start with perldoc perlretut to understand the syntax used
for the patterns. Next, the 'my @record =' takes that list of
substrings, and puts it in a newly-declared array. Finally, if the
pattern match failed, the whole expression is 'false', so the
'or die "..."' will fire to alert you of the error. (The reason for
putting the offending line in [] in the error message is so you can
easily see if there is extra whitespace at either end.)

Using this array is then straightforward: the customer number is in
$record[0], the line code in $record[1], and the two data fields in
$record[2] and $record[3].

(The next step would be to turn the printing into a subroutine, so you
don't have to duplicate the code, and to build up a hash for each
customer rather than using global variables; but this post is already
quite long enough... :).)

Ben

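For readers following along, the "next step" mentioned at the end of the post above (a regex to split each line, a subroutine for the printing, and one hash per customer) might look roughly like the sketch below. It is not code from the thread. The names parse_record, group_customers, format_customer and $LAYOUT are hypothetical, and the column layout (11-digit number, code at column 12, a 17-character field at column 14, a 19-character field at column 33) is an assumption taken from the substr offsets in the original question. The sample lines are built with sprintf so that they are guaranteed to match that assumed layout.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ASSUMED layout (from the substr offsets in the question, not confirmed):
# 11-digit customer number, code at column 12, 17-char field at column 14,
# 19-char field at column 33.
my $LAYOUT = "%-11s %1s %-17s  %-19s";

# Split one fixed-width line into (number, code, field1, field2).
sub parse_record {
    my ($line) = @_;
    my @record = $line =~ /^(\d{11}) ([A-Z]) (.{17})  (.{19})$/
        or die "Invalid record: [$line]";
    s/\s+$// for @record;    # strip the fixed-width padding
    return @record;
}

# Build one hash per customer, starting a new hash whenever the
# customer number changes.
sub group_customers {
    my @lines = @_;
    my (@customers, $current);
    for my $line (@lines) {
        my ($number, $code, $first, $second) = parse_record($line);
        if (!$current or $current->{Number} ne $number) {
            $current = { Number => $number };
            push @customers, $current;
        }
        $current->{Name}   = $first  if $code eq 'A';
        $current->{City}   = $first  if $code eq 'B';
        $current->{Street} = $second if $code eq 'C';
    }
    return @customers;
}

# Format one customer; fields that were never seen are left empty.
sub format_customer {
    my ($c) = @_;
    my $out = '';
    for my $key (qw(Name City Street)) {
        my $value = defined $c->{$key} ? $c->{$key} : '';
        $out .= "$key: $value\n";
    }
    return $out;
}

# Sample rows taken from the question, padded to the assumed layout.
my @lines = map { sprintf $LAYOUT, @$_ } (
    [ '06020004293', 'A', 'Fred Flintstone',   '123 Bedrock Road'    ],
    [ '06020004293', 'B', 'Bedrock',           'Gravel Pit'          ],
    [ '06020004293', 'C', 'Loney Toons',       '123 Bedrock Road'    ],
    [ '07020000279', 'A', 'George Washington', '234 Washington Ave.' ],
    [ '07020000279', 'B', 'Washington D.C.',   '234 Washington Ave.' ],
    [ '09020000251', 'A', 'Joe Smith',         '54 Abbey Road'       ],
    [ '09020000251', 'B', 'Smallville',        '54 Abbey Road'       ],
);

print format_customer($_), "\n" for group_customers(@lines);
```

Because each customer gets a fresh hash, nothing carries over from the previous customer, and the printing lives in one place instead of being repeated after the loop. For a file with many thousands of records you would probably print each customer as soon as the number changes rather than collecting them all first.
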
From: Tad McClellan on 13 Nov 2009 21:37

nobody <nobody(a)nowhere.com> wrote:
> I'm trying to process flat files with many thousands of records. In
> these files several rows comprise the information for a single customer.
> In the example __DATA__ below, I'm trying to fill the variables with the
> customer information while the customer number is 06020004293, then for
> customer number 07020000279, and finally customer number 09020000251.

Another way of saying that is: fill the variables with the customer
information until the start of the next customer record (marked by an
'A' row).

> I
> believe my problem is looping while the customer number remains the same,

or looping until an 'A' line is found...

> then move on to the next customer numbers.

[snip]

> # Desired output:
>
> #Name: Fred Flintstone
> #City: Bedrock
> #Street: 123 Bedrock Road
>
> #Name: George Washington
> #City: Washington D.C.
> #Street:
>
> #Name: Joe Smith
> #City: Smallville
> #Street:

--------------------------------
#!/usr/bin/perl
use warnings;
use strict;

my %buffer;
while ( <DATA> ) {
    chomp;
    my $code = substr $_, 12, 1;

    if ( $code eq 'A' ) {
        # An 'A' row starts a new customer: flush the previous one first.
        if ( keys %buffer ) {
            output(%buffer);
            %buffer = ();
        }
        $buffer{Name} = substr $_, 14, 17;
    }
    elsif ( $code eq 'B' ) {
        $buffer{City} = substr $_, 14, 17;
    }
    elsif ( $code eq 'C' ) {
        $buffer{Street} = substr $_, 34, 18;
    }
    else {
        warn "code '$code' is invalid\n";
    }
}
output(%buffer);    # flush the last customer

sub output {
    my %h = @_;

    # The parentheses around qw// are required from Perl 5.18 onwards.
    foreach my $key ( qw/Name City Street/ ) {
        print "#$key: ";
        print $h{$key} if defined $h{$key};
        print "\n";
    }
    print "\n";
}

__DATA__
06020004293 A Fred Flintstone 123 Bedrock Road
06020004293 B Bedrock Gravel Pit
06020004293 C Loney Toons 123 Bedrock Road
07020000279 A George Washington 234 Washington Ave.
07020000279 B Washington D.C. 234 Washington Ave.
09020000251 A Joe Smith 54 Abbey Road
09020000251 B Smallville 54 Abbey Road
--------------------------------

-- 
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"

From: nobody on 14 Nov 2009 10:01

On Fri, 13 Nov 2009 20:37:20 -0600, Tad McClellan wrote:

Thanks for your answer, it does exactly as I asked. However, the data
files I'm dealing with are more complicated. In the __DATA__ below, as
part of the 06020004293 records, Fred has two daughters, each with her
own 'B' record. Fred's 06020004293 data outputs Sue Flintstone twice
like:

Name: Fred Flintstone
Daughter: Sue Flintstone
Daughter2: Sue Flintstone
Company: Gravel Pit
OldCompany: Loney Toons
Street: 123 Bedrock Road

The first should be Jane Flintstone, so I'm trying to do something in
the code below where it says "NEED Daughter2". Any help would be greatly
appreciated again!

#!/usr/bin/perl
use warnings;
use strict;

my $flag = 0;
my %buffer;
while ( <DATA> ) {
    chomp;
    my $code = substr $_, 12, 1;

    if ( $code eq 'A' ) {
        if ( keys %buffer ) {
            output(%buffer);
            %buffer = ();
        }
        $buffer{Name}   = substr $_, 14, 17;
        $buffer{Street} = substr $_, 32, 17;
    }
    elsif ( $code eq 'B' ) {
        $buffer{Daughter} = substr $_, 14, 17;
        $flag = 1;
        ####### NEED Daughter2
        if ($buffer{Daughter}) {
            $buffer{Daughter2} = substr $_, 14, 17;
        }
    }
    elsif ( $code eq 'C' ) {
        $buffer{Company} = substr $_, 32, 18;
    }
    elsif ( $code eq 'D' ) {
        $buffer{OldCompany} = substr $_, 14, 18;
    }
    else {
        warn "code '$code' is invalid\n";
    }
}
output(%buffer);

sub output {
    my %h = @_;

    foreach my $key ( qw/Name Daughter Daughter2 Company OldCompany Street/ ) {
        print "$key: ";
        print $h{$key} if defined $h{$key};
        print "\n";
    }
    print "\n";
}

__DATA__
06020004293 A Fred Flintstone 123 Bedrock Road
06020004293 B Jane Flintstone 123 Bedrock Road
06020004293 B Sue Flintstone 123 Bedrock Road
06020004293 C Bedrock Gravel Pit
06020004293 D Loney Toons 123 Bedrock Road
07020000279 A George Washington 234 Washington Ave.
07020000279 C Washington D.C. 234 Washington Ave.
09020000251 A Joe Smith 54 Abbey Road
09020000251 C Smallville 54 Abbey Road

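The thread as archived here contains no reply to the "NEED Daughter2" question, so what follows is only one possible approach, not an answer given by any of the posters. In the code above, both assignments run on every 'B' row, so the second row overwrites Daughter and Daughter2 alike. One way around that is to push each 'B' value onto an array inside the customer's hash and number the labels when printing. The names build_customers and format_customer are hypothetical, and the records are passed in as already-extracted [code, value] pairs so that the sketch does not depend on the real column offsets.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch only: each record is a [code, value] pair, i.e. the substr
# extraction is assumed to have happened already.
sub build_customers {
    my @records = @_;
    my (@customers, $current);
    for my $rec (@records) {
        my ($code, $value) = @$rec;
        if ($code eq 'A') {
            # An 'A' record starts a new customer with an empty list.
            $current = { Name => $value, Daughters => [] };
            push @customers, $current;
        }
        elsif (!$current) {
            warn "record '$code' seen before any 'A' record\n";
        }
        elsif ($code eq 'B') {
            # Repeated 'B' records accumulate instead of overwriting.
            push @{ $current->{Daughters} }, $value;
        }
        elsif ($code eq 'C') { $current->{Company}    = $value; }
        elsif ($code eq 'D') { $current->{OldCompany} = $value; }
        else                 { warn "code '$code' is invalid\n"; }
    }
    return @customers;
}

# Label the first daughter "Daughter", later ones "Daughter2", "Daughter3"...
sub format_customer {
    my ($c) = @_;
    my $out = "Name: $c->{Name}\n";
    my $n = 0;
    for my $daughter (@{ $c->{Daughters} }) {
        $n++;
        my $label = $n == 1 ? 'Daughter' : "Daughter$n";
        $out .= "$label: $daughter\n";
    }
    for my $key (qw(Company OldCompany)) {
        $out .= "$key: $c->{$key}\n" if defined $c->{$key};
    }
    return $out;
}

my @customers = build_customers(
    [ A => 'Fred Flintstone' ],
    [ B => 'Jane Flintstone' ],
    [ B => 'Sue Flintstone' ],
    [ C => 'Gravel Pit' ],
    [ D => 'Loney Toons' ],
    [ A => 'George Washington' ],
);
print format_customer($_), "\n" for @customers;
```

An array also copes with a third or fourth 'B' row without any further changes, which a fixed pair of Daughter and Daughter2 keys would not.
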
From: nobody on 14 Nov 2009 13:10

On Sat, 14 Nov 2009 08:37:33 -0800, sln wrote:

> On Sat, 14 Nov 2009 00:12:20 GMT, nobody <nobody(a)nowhere.com> wrote:
>
>>I'm trying to process flat files with many thousands of records. In
>>these files several rows comprise the information for a single customer.
>>In the example __DATA__ below, I'm trying to fill the variables with the
>>customer information while the customer number is 06020004293, then for
>>customer number 07020000279, and finally customer number 09020000251. I
>>believe my problem is looping while the customer number remains the
>>same, then move on to the next customer numbers. I've been pulling my
>>hair out with nested while and do loops. I've included the desired
>>output below. Here's what I'm working with so far:
>>
>>
> Hey dude, I see your on your 5th or 7th incarnation of this so called
> problem. Several threads later this flat file is in a fixed width form,
> but at least its in the same flintstone janra. You should work for WB's
> or try upgrading your cable subscription to something other than toon
> tv.
>

Hey dude, you're confused. I'm working with various data files in
various formats. Some are flat files, some are delimited. You should
learn some manners, give up your lame attempts at humor, And learn how
to spell genera.