From: Jürgen Exner on 8 Jun 2010 21:08

ccc31807 <cartercc(a)gmail.com> wrote:
>I get some of the data in CSV format. One of my sources switched from
>an Access database to an Excel file. Turns out that Excel strips out
>the leading zeros if it thinks that the datum is an integer.

Which I would argue is the correct behaviour for a numerical data field.
If you don't want a canonical numerical form, then declare the data
field to be text. Problem solved.

jue
From: Uri Guttman on 8 Jun 2010 21:26

>>>>> "TM" == Tad McClellan <tadmc(a)seesig.invalid> writes:

  >> my ($order, $first, $last, @years) = split /\|/;
  >> __DATA__
  >> 1|George|Washington|1788 1792
  >> 2|John|Adams|1796
  >> 3|Thomas|Jefferson|1800 1804
  >> 4|James|Madison|1808 1812
  >> 32|Franklin|Roosevelt|1932 1936 1940 1944

  TM> @years always contains exactly one element, it is a non-arrayish array.

  TM> $years would work as well, and would avoid looking like it wouldn't
  TM> work...

gack, i didn't see that! no wonder it 'worked'. i was so caught up in
the wrong use of an array there i didn't notice it was only getting one
value which had the whole string with numbers. he never split that field
into a list of numbers. do'h!!

uri

--
Uri Guttman ------ uri(a)stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
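The point about @years can be checked with a small sketch. It assumes the years in the fourth field are separated by whitespace, as in the quoted __DATA__ lines; the helper name years_from_record is hypothetical and not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the years of one pipe-delimited record as a list.
# Assumes the fourth field holds whitespace-separated years.
sub years_from_record {
    my ($line) = @_;
    chomp $line;
    my (undef, undef, undef, $years) = split /\|/, $line;
    return defined $years ? split(' ', $years) : ();
}

my $line = '32|Franklin|Roosevelt|1932 1936 1940 1944';

# The original assignment: the array soaks up the remaining fields,
# but only one field remains, so it holds a single string.
my (undef, undef, undef, @raw) = split /\|/, $line;
print "elements from the first split: ", scalar(@raw), "\n";

# A second split on whitespace yields the individual years.
my @years = years_from_record($line);
print "elements after splitting the field: ", scalar(@years), "\n";
```

The first split only separates on the pipe character, so the array receives one element however many years the record lists; the second split is what turns that string into a list.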
From: bugbear on 9 Jun 2010 03:57

Uri Guttman wrote:
>>>>>> "TM" == Tad McClellan <tadmc(a)seesig.invalid> writes:
>
>   >> my ($order, $first, $last, @years) = split /\|/;
>   >> __DATA__
>   >> 1|George|Washington|1788 1792
>   >> 2|John|Adams|1796
>   >> 3|Thomas|Jefferson|1800 1804
>   >> 4|James|Madison|1808 1812
>   >> 32|Franklin|Roosevelt|1932 1936 1940 1944
>
>   TM> @years always contains exactly one element, it is a non-arrayish array.
>
>   TM> $years would work as well, and would avoid looking like it wouldn't
>   TM> work...
>
> gack, i didn't see that! no wonder it 'worked'. i was so caught up in
> the wrong use of an array there i didn't notice it was only getting one
> value which had the whole string with numbers. he never split that field
> into a list of numbers. do'h!!

Chuckle. *MY* experience tells me that bugs are never where
you're looking ;-)

BugBear
From: ccc31807 on 9 Jun 2010 10:39

On Jun 8, 9:08 pm, Jürgen Exner <jurge...(a)hotmail.com> wrote:
> ccc31807 <carte...(a)gmail.com> wrote:
> >I get some of the data in CSV format. One of my sources switched from
> >an Access database to an Excel file. Turns out that Excel strips out
> >the leading zeros if it thinks that the datum is an integer.
>
> Which I would argue is the correct behaviour for a numerical data field.
> If you don't want a canonical numerical form, then declare the data
> field to be text. Problem solved.

I get these kinds of files as user input. My supposition is that prior
to this experience, the user was using Access, and configured the ID
field as text (even though it consists entirely of digits), so that
when exported as CSV it kept all seven digits, which it would have
done as a text field, a string.

Users don't normally bother to set the data type of Excel columns
unless they are currency, dates, or specific numeric fields, so Excel
treats a column with numeric characters as numeric, which is entirely
reasonable. When you save the Excel file as CSV, it only saves the
significant digits, not leading zeros. Again, this is entirely
reasonable.

My problem was that I wasn't aware of the switch (might have been told
but wasn't really aware of it) and ASSUMED that the numeric IDs were
all present including the leading zeros. When I figured out that the
errors were associated with the records whose IDs had leading zeros, I
ASSUMED that it was a software problem, a bug I had introduced, a
programming error.

As bugbear notes, it was indeed a programming error, but related to
validation of data, not conversion of data types. When I converted the
numeric fields to strings, I got the same error, which ultimately led
me to examine the data file.

As to use of the @courses variable, I'll change that to $courses.
I've already explained why that happened, and I honestly don't feel too
bad about that, as that's the kind of error we all make when we write
in different languages at the same time.

CC
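One way to catch stripped leading zeros at the validation step is sketched below. It assumes the IDs are meant to be exactly seven digits wide, as the post describes; the helper name normalize_id is hypothetical, and rejecting anything that is not one to seven digits is an assumption about what counts as valid input.

```perl
use strict;
use warnings;

# Hypothetical helper: validate an ID read from a CSV file and restore
# any leading zeros that a spreadsheet export dropped.
# Assumption: a valid ID is one to seven digits before padding.
# Returns nothing (undef in scalar context) for anything else.
sub normalize_id {
    my ($raw) = @_;
    return unless defined $raw && $raw =~ /\A[0-9]{1,7}\z/;
    return sprintf '%07d', $raw;    # pad to the assumed width of seven
}

for my $raw (qw(12345 0012345 12a45)) {
    my $id = normalize_id($raw);
    print defined $id ? "$raw -> $id\n" : "$raw -> rejected\n";
}
```

Padding is only safe if every ID really is seven digits in the source system; if the width can vary, the shortened values cannot be recovered from the CSV alone.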
From: Peter J. Holzer on 9 Jun 2010 16:00

On 2010-06-08 21:12, ccc31807 <cartercc(a)gmail.com> wrote:
> #! perl
> # array.plx
> use strict;
> use warnings;
> my %presidents;
> while (<DATA>)
> {
>     chomp;
>     my ($order, $first, $last, @years) = split /\|/;
>     $presidents{$order} = {
>         first => $first,
>         last => $last,
>         years => @years,
>     };
> }
>
> foreach my $k (sort keys %presidents)
> {
>     print "$k => $presidents{$k}\n";
>     foreach my $k2 (sort keys %{$presidents{$k}})
>     {
>         print " $k2 => $presidents{$k}{$k2}\n";
>     }
> }
> exit(0);

This script never pads $order to two digits.

> __DATA__
> 1|George|Washington|1788 1792
  ^
here $order has only one digit.

> 2|John|Adams|1796
> 3|Thomas|Jefferson|1800 1804
> 4|James|Madison|1808 1812
> 32|Franklin|Roosevelt|1932 1936 1940 1944
>
> ----------OUTPUT----------------
> D:\PerlLearn>perl array.plx
> 01 => HASH(0x248e5c)
  ^^
Thus I do not believe that this output is from the script above.

hp
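For comparison, here is a sketch of a variant that would produce a key such as 01. It is not the original poster's code: the helper name build_presidents is hypothetical, the two-digit width is an assumption taken from the quoted output, and the years are stored as an array reference because a bare @years inside the hash constructor is flattened into the key/value list.

```perl
use strict;
use warnings;

# Hypothetical helper: build a hash of presidents from pipe-delimited lines.
sub build_presidents {
    my @lines = @_;
    my %presidents;
    for my $line (@lines) {
        chomp $line;
        my ($order, $first, $last, $years) = split /\|/, $line;
        my $key = sprintf '%02d', $order;    # "1" becomes "01"
        $presidents{$key} = {
            first => $first,
            last  => $last,
            # Array reference; a bare array here would be flattened.
            years => [ split ' ', defined $years ? $years : '' ],
        };
    }
    return \%presidents;
}

my $p = build_presidents(
    '1|George|Washington|1788 1792',
    '32|Franklin|Roosevelt|1932 1936 1940 1944',
);
for my $k (sort keys %$p) {
    print "$k => $p->{$k}{first} $p->{$k}{last}: @{ $p->{$k}{years} }\n";
}
```

With zero-padded keys, the plain string sort in the loop also happens to give numeric order, which it would not for unpadded keys such as 4 and 32.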