From: Ed Morton on 2 Mar 2010 08:45

On 3/2/2010 7:34 AM, Ed Morton wrote:
> On 3/2/2010 3:09 AM, pk wrote:
>> Janis Papanagnou wrote:
>>
>>>> works, but I have the impression that I'm overcomplicating it.
>>>> However, I cannot find a simpler way. Any suggestion?
>>>
>>> awk '{ print | "command" }
>>> /^END$/ { close("command") }'
>>
>> Yes, thanks (and to Bill). I was thinking of something more shell-ish
>> rather than calling external commands in awk, but that'll do.
>>
>> Thank you!
>>
>
> How about something like (untested, but I know you know awk...):
>
> awk -v RS="END" -v ORS="\n" -v FS="\n" -v OFS="^L" '{$1=$1}1' file |
> while IFS= read -r block; do
>     echo "$block" | tr '^L' '\n' | command
> done
>
> where the ^L is control-L or some other control character that's not in
> your input.
>
> Regards,
>
> Ed.

Actually, you could use "END" instead of control-L as the OFS, since you
know there won't be any "END"s in the current records: the RS="END" is
taking care of that.

Ed.
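A minimal sketch of the record-per-line idea above, for trying it on sample data. Several things here are assumptions and not from the post: the wrapper name split_blocks is made up, the per-block command is passed as arguments, and RS is '\nEND\n' rather than the bare "END" so that blocks do not pick up stray empty lines. It also needs an awk that accepts a multi-character RS (gawk and mawk do; POSIX leaves that case unspecified). printf stands in for echo so that blocks containing backslashes or starting with a dash pass through unchanged.

```shell
# Sketch only: split stdin into blocks at lines consisting of END and run
# the given command once per block. split_blocks is a hypothetical name.
split_blocks() {
    # awk joins the lines of each block with form feed (control-L), which
    # is assumed to be absent from the input, and prints one block per line.
    awk -v RS='\nEND\n' -v FS='\n' -v OFS='\f' '{ $1 = $1 } 1' |
    while IFS= read -r block; do
        # Turn the form feeds back into newlines and feed the block to "$@".
        printf '%s\n' "$block" | tr '\f' '\n' | "$@"
    done
}
```

For example, `printf 'a\nb\nEND\nc\nd\nEND\n' | split_blocks paste -s -d , -` runs paste once for each of the two blocks.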
From: pk on 2 Mar 2010 09:26

Ed Morton wrote:

>> How about something like (untested, but I know you know awk...):
>>
>> awk -v RS="END" -v ORS="\n" -v FS="\n" -v OFS="^L" '{$1=$1}1' file |
>> while IFS= read -r block; do
>>     echo "$block" | tr '^L' '\n' | command
>> done
>>
>> where the ^L is control-L or some other control character that's not in
>> your input.
>>
>> Regards,
>>
>> Ed.
>
> Actually, you could use "END" instead of control-L as the OFS since you
> know there won't be any "END"s in the current records since the RS="END"
> is taking care of that.

Yes, that's a clever solution. I prefer the ^L as separator, however (or any
other single character), as it's easier to turn into a "\n" with tr.

Thanks!
From: Janis on 2 Mar 2010 09:57

On 2 Mar, 15:26, pk <p...(a)pk.invalid> wrote:
> Ed Morton wrote:
> >> How about something like (untested, but I know you know awk...):
> >>
> >> awk -v RS="END" -v ORS="\n" -v FS="\n" -v OFS="^L" '{$1=$1}1' file |
> >> while IFS= read -r block; do
> >>     echo "$block" | tr '^L' '\n' | command
> >> done
> >>
> >> where the ^L is control-L or some other control character that's not in
> >> your input.
> >>
> >> Regards,
> >>
> >> Ed.
> >
> > Actually, you could use "END" instead of control-L as the OFS since you
> > know there won't be any "END"s in the current records since the RS="END"
> > is taking care of that.
>
> Yes, that's a clever solution. I prefer the ^L as separator however (or any
> other single character), as it's easier to turn into a "\n" with tr.

In such situations I sometimes just take SUBSEP for convenience, since it is
predefined and a control character.
(If you do shell post-processing you would of course have to know what
SUBSEP actually is.)

> Thanks

BTW, I wonder why you said upthread

>>> "I was thinking of something more shell-ish [...]"

and prefer shell loops and, in this case, quite bulky shell code.

Janis
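On knowing "what SUBSEP actually is": one way to find out is to ask the awk in use. The sketch below is an illustration only; the function name subsep_octal is made up. In gawk, mawk and the one-true-awk the default is the byte with octal value 034, but POSIX leaves the default implementation-defined, so checking is safer than assuming.

```shell
# Sketch only: print the default SUBSEP of the local awk as octal digits.
# subsep_octal is a hypothetical helper name; this assumes SUBSEP is a
# single byte, as it is in the common awk implementations.
subsep_octal() {
    # od -An -to1 dumps the byte in octal without an address column;
    # tr strips the padding and the trailing newline.
    awk 'BEGIN { printf "%s", SUBSEP }' | od -An -to1 | tr -d ' \n'
}
```

The result can then be handed to tr on the shell side, e.g. `tr "\\$(subsep_octal)" '\n'`.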
From: pk on 2 Mar 2010 10:10

Janis wrote:

> BTW, I wonder why you said upthread
>
>>>> "I was thinking of something more shell-ish [...]"
>
> and prefer shell loops and in this case quite bulky shell code.

Don't get me wrong: awk is perfectly fine (no, I don't want to start the
debate "shell loops vs. dedicated tools" again). But at the point I described
in my first post I was just feeling like I was overlooking some more "natural"
shell way (i.e., involving pipelines, file descriptors, IFS or other trickery)
to complete the task. Efficiency is not a concern here, as the input is just a
few hundred lines in the worst case, and it's semi-throwaway code anyway, i.e.
it will be used for a limited time only as part of a bigger data migration
task.

But it turns out it was just a wrong feeling.

Thanks again.
From: Ed Morton on 2 Mar 2010 10:10

On Mar 2, 8:57 am, Janis <janis_papanag...(a)hotmail.com> wrote:
> On 2 Mar, 15:26, pk <p...(a)pk.invalid> wrote:
> > Ed Morton wrote:
> > >> How about something like (untested, but I know you know awk...):
> > >>
> > >> awk -v RS="END" -v ORS="\n" -v FS="\n" -v OFS="^L" '{$1=$1}1' file |
> > >> while IFS= read -r block; do
> > >>     echo "$block" | tr '^L' '\n' | command
> > >> done
> > >>
> > >> where the ^L is control-L or some other control character that's not in
> > >> your input.
> > >>
> > >> Regards,
> > >>
> > >> Ed.
> > >
> > > Actually, you could use "END" instead of control-L as the OFS since you
> > > know there won't be any "END"s in the current records since the RS="END"
> > > is taking care of that.
> >
> > Yes, that's a clever solution. I prefer the ^L as separator however (or any
> > other single character), as it's easier to turn into a "\n" with tr.
>
> In such situations I sometimes just take SUBSEP for convenience; being
> predefined and a control character.
> (If you do shell post-processing you would of course have to know what
> SUBSEP actually is.)

Yeah, I thought about that, but I don't like to use SUBSEP outside of awk for
the reason you stated, so I'd only use it if the solution was going to be:

    awk -v RS="END" -v ORS="\n" -v FS="\n" 'BEGIN{OFS=SUBSEP}{$1=$1}1' file |
    while IFS= read -r block; do
        awk -v block="$block" 'BEGIN{gsub(SUBSEP,"\n",block); print block; exit}' |
        command
    done

which just seemed unnecessarily complicated compared to the tr solution. You're
as likely to be able to use any other absent control character as you are to
be able to use the SUBSEP one, and you might have to know what it is anyway to
be sure it can't appear in your input.

Ed.
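One caveat with the all-awk variant above: awk interprets backslash escape sequences in -v assignments, so a block containing text such as \t would be altered on the way in. A possible workaround, sketched here as an assumption and not something proposed in the thread, is to pass the block through the environment and read it from ENVIRON. The function name restore_newlines is made up for the illustration.

```shell
# Sketch only: turn the SUBSEP characters in one block back into newlines
# without awk touching any backslashes in the data.
# restore_newlines is a hypothetical helper name; $1 is the block.
restore_newlines() {
    # The block travels in the environment of this one awk invocation,
    # so no escape processing is applied to it.
    block=$1 awk 'BEGIN { b = ENVIRON["block"]; gsub(SUBSEP, "\n", b); print b }'
}
```

Inside the while loop this would be called as `restore_newlines "$block" | command`.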