multiple files into one file based on unique entry in one of the files [TCL]


From: Cesear on 17 Mar 2010 09:55

On Mar 17, 9:16 am, Alexandre Ferrieux <alexandre.ferri...(a)gmail.com>
wrote:
> On Mar 17, 1:47 pm, Cesear <ces...(a)gmail.com> wrote:
>
> > I have 4 flat files where each field is separated by a pipe |. In
> > each file the second field has a unique value that is in all the
> > files. Each line is terminated by a newline. I want to combine the
> > files into one file. In this one file, there should be one line
> > for each "unique" entry that was found in the second file. I was
> > able to do this in MS Access by creating each file as a table and
> > linking the "unique" field in the 2nd file to the other files. I want
> > to do this in Tcl, so I can automate the process. Can someone please
> > point me in a starting direction? I know enough Tcl to get by, but not
> > much in the I/O region. Any ideas would be great!
>
> Your description is a bit unclear, please provide an example.
>
> -Alex

Here is an example. In File 2, the second field is the key shared with the
other 3 files. In File 2, a field 2 value could occur on multiple lines.
I want to combine all 4 files into ONE file, with ONE line FOR EACH instance
of field 2 in File 2. Also, I ONLY need field 1 and field 2 to occur once in
the combined file. So for "D0000012345678" (from File 2), the one line would
be this-->

D0001234|D0000012345678|CHUCK|BROWNSTOWN|123 WOODSON ROAD||ANYTOWN|USA|
111111|(111)111-1111||02/15/1970|F|111|||51|41.9|||28.4|JOESPH|M|UNIT|
03/13/2010||I|AAAA|BBBBBB|CCCCCCC|DDDDDDD|BLAH|B|C|DD"newline here"
.......
and so on for each value of field 2.

Does that make sense?

File 1-->
D0001234|D00000123456||51|41.9|||28.4|JOE||DEPARTMENT|03/13/2010||I
D00012345|D0000012345678||51|41.9|||28.4|JOESPH|M|UNIT|03/13/2010||I
D0001234|D00000123456||51|41.9|||28.4||FRANK|UNIT|03/13/2010||I

File 2-->
D0001234|D00000123456|CHARLIE|BROWN|123 WOODS ROAD|APT 2110|ANYTOWN|
USA|111111|(111)111-1111||02/15/1952|M|111113
D0001234|D0000012345678|CHUCK|BROWNSTOWN|123 WOODSON ROAD||ANYTOWN|USA|
111111|(111)111-1111||02/15/1970|F|111

File 3-->
D00012345|D0000012345678|AAAA|BBBBBB|CCCCCCC|DDDDDDD
D0001234|D00000123456|KKKKK|YYYYYY|CC|DDD

File 4-->
D00012345|D0000012345678|BLAH|B|C|DD
D0001234|D00000123456|K|YY|CC|DDD
D0001234|D00000123456|KKKK|YYYY|CC|DDD

From: Andrew Mangogna on 17 Mar 2010 10:18

Arjen Markus wrote:

> What you could do is:
>
> while {[gets $infile line] } {
>     set fields [split $line |]
>     set uniqueId [lindex $fields 1]
>     set data1($uniqueId) [lreplace $fields 1 1]
> }
>
> (Same for the other files)
>
> Now you have four arrays, data1, ... data4, that hold the
> non-unique information for each unique ID.
>
> Joining them into one file:
>
> foreach id [array names data1] {
>     puts $outfile [join [concat $id $data1($id) $data2($id) \
>         $data3($id) $data4($id)] |]
> }
>
> Or code along those lines - this is mostly a sketch.
>
> Regards,
>
> Arjen
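A runnable variant of the sketch quoted above might look like the following. It is only a sketch under some assumptions: Tcl 8.5 or later (for `dict` and `{*}`), File 2 as the driving file, and a "first line wins" policy when a key appears on more than one line of a file. The proc names `loadFile` and `mergeFiles` are made up for illustration.

```tcl
# Read one pipe-delimited file into a dict keyed on field 2.
# Assumption: if a key occurs on several lines, only the first line is kept.
proc loadFile {path} {
    set data [dict create]
    set fh [open $path r]
    while {[gets $fh line] >= 0} {
        if {$line eq ""} continue          ;# skip blank lines
        set fields [split $line |]
        set id [lindex $fields 1]
        if {![dict exists $data $id]} {
            dict set data $id $fields
        }
    }
    close $fh
    return $data
}

# Write one combined line per key found in keyPath (File 2).
# Fields 1 and 2 are taken from the key file only; the other files
# contribute their remaining fields, in the order given in otherPaths.
proc mergeFiles {keyPath otherPaths outPath} {
    set keyData [loadFile $keyPath]
    set others {}
    foreach p $otherPaths {
        lappend others [loadFile $p]
    }
    set out [open $outPath w]
    dict for {id fields} $keyData {
        set merged $fields
        foreach d $others {
            if {[dict exists $d $id]} {
                # drop fields 1 and 2 of the matching line
                lappend merged {*}[lrange [dict get $d $id] 2 end]
            }
        }
        puts $out [join $merged |]
    }
    close $out
}
```

With hypothetical file names, a call such as `mergeFiles file2.txt {file1.txt file3.txt file4.txt} combined.txt` would write the File 2 fields first, followed by the remaining fields of Files 1, 3 and 4. Keys missing from one of the other files simply contribute nothing for that file.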

If all you want to do is manipulate file contents, then certainly Arjen's
example (with the small correction) is a straightforward way to accomplish
that. But the original post referred to MS Access, so if you want to
actually reason about the data, you could consider casting the file data into
relational terms, where the reasoning operations are much easier to formulate.
Either SQLite or TclRAL could be brought to bear for that.
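As a rough illustration of the SQLite route, the sketch below loads abbreviated versions of the sample lines into an in-memory database and joins them on field 2. It assumes the `sqlite3` Tcl package is installed. The table names (`f1` to `f4`), column names (`fld1`, `id`, `rest`) and the `loadLine` proc are invented for the example, and the data is shortened from the lines posted earlier.

```tcl
package require sqlite3

# In-memory database; one table per input file (names are illustrative).
sqlite3 db :memory:
foreach t {f1 f2 f3 f4} {
    db eval "CREATE TABLE $t (fld1 TEXT, id TEXT, rest TEXT)"
}

# Split one pipe-delimited line and store it in the given table.
# "rest" holds everything after field 2, re-joined with pipes.
proc loadLine {table line} {
    set fields [split $line |]
    set fld1 [lindex $fields 0]
    set id   [lindex $fields 1]
    set rest [join [lrange $fields 2 end] |]
    # \$fld1 etc. reach SQLite unsubstituted and are bound as Tcl variables
    db eval "INSERT INTO $table VALUES (\$fld1, \$id, \$rest)"
}

# Abbreviated sample rows (one per file) for the key D0000012345678.
loadLine f1 {D00012345|D0000012345678||51|41.9}
loadLine f2 {D0001234|D0000012345678|CHUCK|BROWNSTOWN}
loadLine f3 {D00012345|D0000012345678|AAAA|BBBBBB}
loadLine f4 {D00012345|D0000012345678|BLAH|B}

# Join the other tables to File 2 on the key and build one line per match.
set merged {}
db eval {
    SELECT f2.fld1 AS fld1, f2.id AS id,
           f2.rest AS r2, f1.rest AS r1, f3.rest AS r3, f4.rest AS r4
    FROM f2
    JOIN f1 ON f1.id = f2.id
    JOIN f3 ON f3.id = f2.id
    JOIN f4 ON f4.id = f2.id
} row {
    lappend merged [join [list $row(fld1) $row(id) \
        $row(r2) $row(r1) $row(r3) $row(r4)] |]
}
db close

puts [join $merged \n]
```

Note that a plain inner join produces one output row per combination of matching lines, so a key that appears twice in File 1 and twice in File 4 would yield four rows. Whether that is wanted, or whether duplicates should be collapsed first, depends on the data.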

--
Andrew Mangogna

From: Cesear on 17 Mar 2010 10:41

On Mar 17, 10:18 am, Andrew Mangogna <amango...(a)mindspring.com> wrote:
> If all you want to do is manipulate file contents, then certainly Arjen's
> example (with the small correction) is a straightforward way to accomplish
> that. But the original post referred to MS Access, so if you want to
> actually reason about the data, you could consider casting the file data into
> relational terms, where the reasoning operations are much easier to formulate.
> Either SQLite or TclRAL could be brought to bear for that.
>
> --
> Andrew Mangogna

What small correction to Arjen's post are you referring to?

From: Arjen Markus on 17 Mar 2010 10:49

On 17 Mar, 15:42, Cesear <ces...(a)gmail.com> wrote:
> On Mar 17, 10:18 am, Andrew Mangogna <amango...(a)mindspring.com> wrote:
> > If all you want to do is manipulate file contents, then certainly Arjen's
> > example (with the small correction) is a straightforward way to accomplish
> > that. But the original post referred to MS Access, so if you want to
> > actually reason about the data, you could consider casting the file data into
> > relational terms, where the reasoning operations are much easier to formulate.
> > Either SQLite or TclRAL could be brought to bear for that.
>
> > --
> > Andrew Mangogna
>
> What small correction to Arjen's post are you referring to?

The condition for terminating the loop:

while { [gets $infile line] >= 0 } { ... }
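The reason the comparison matters: when `gets` is given a variable name, it returns the number of characters read, or -1 at end of file. Used as a bare condition, 0 (an empty line) counts as false and -1 (end of file) counts as true, so the loop would stop at the first empty line and would not stop at the end of the file. The small sketch below records the return values for a three-line scratch file; the proc name `getsResults` and the file name `gets_demo.txt` are made up for the demonstration.

```tcl
# Collect the return value of every [gets] call until end of file.
proc getsResults {path} {
    set fh [open $path r]
    set results {}
    while {1} {
        set n [gets $fh line]
        lappend results $n
        if {$n < 0} break      ;# -1 signals end of file
    }
    close $fh
    return $results
}

# Scratch file in the current directory: "abc", an empty line, "de".
set path gets_demo.txt
set fh [open $path w]
puts -nonewline $fh "abc\n\nde\n"
close $fh

puts [getsResults $path]   ;# prints: 3 0 2 -1
file delete $path
```

The 0 in the middle is the empty line and the trailing -1 is end of file, which is why `>= 0` is the test that reads every line and then terminates.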

(Indeed, Andrew has a much more sophisticated and flexible solution
for you)

Regards,

Arjen
