From: John DuBois on 25 May 2010 14:41

In article <htgnf1$s5d$1(a)news.m-online.net>,
Janis Papanagnou <janis_papanagnou(a)hotmail.com> wrote:
>It seems that ANSI sequences can terminate in a digit. How could one
>distinguish in a sequence like, say, \x1b[0A whether the A is part of
>the ANSI sequence or part of the subsequent data.

No, I don't think they can. The patterns I've used in the past for
excising ANSI sequences:

gsub(/\033\[[^a-zA-Z]*./, "")
gsub(/\033./, "")

Apparently the terminating character can actually be characters 64
through 95, not just letters, though I haven't seen that. And of course
you may also encounter the single-character CSI, character 155, in place
of \033[.

John
--
John DuBois spcecdt(a)armory.com KC6QKZ/AE http://www.armory.com/~spcecdt/
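For anyone doing the same job outside awk, the two substitutions above port almost unchanged to other regex engines. A minimal Python sketch, assuming the data is handled as bytes; the function name strip_ansi_loose is only illustrative and does not come from the thread:

```python
import re

# Python ports of the two awk patterns quoted above, applied to bytes.
# CSI_LOOSE: ESC [ then any run of non-letters, then one more character.
CSI_LOOSE = re.compile(rb"\033\[[^a-zA-Z]*.")
# ESC_PAIR: any remaining ESC followed by a single character.
ESC_PAIR = re.compile(rb"\033.")


def strip_ansi_loose(data: bytes) -> bytes:
    """Remove escape sequences using the two loose patterns, in order."""
    data = CSI_LOOSE.sub(b"", data)
    return ESC_PAIR.sub(b"", data)


if __name__ == "__main__":
    # In ESC [ 0 A the letter A terminates the sequence, so it is removed
    # together with the introducer and the parameter.
    print(strip_ansi_loose(b"up\033[0Adown"))
```

The order matters: the ESC [ pattern has to run first, otherwise the two-character pattern would eat only "ESC [" and leave the parameters and terminator behind in the text.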
From: Ben Bacarisse on 25 May 2010 17:15

pk <pk(a)pk.invalid> writes:
<snip>
> For reference, here are some tables with most ANSI escape sequences:
>
> http://isthe.com/chongo/tech/comp/ansi_escapes.html
> http://ascii-table.com/ansi-escape-sequences.php

Yes, I found both of those but they seem less than comprehensive (my
test being if they tell you about \e[J and \e[1J as well as \e[2J).

ECMA-48 seems to be the most definitive reference I can find online. It
gives a more restrictive pattern:

  (\x1b\[|\x9b)[\x30-\x3f]*[\x40-\x7e]

In fact, trailing bytes in the range \x70-\x7e ('p' to '~' in ASCII) are
reserved for private or experimental use so this could be made even more
restricted.

--
Ben.
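The ECMA-48 pattern above can be tried directly as a bytes regex. A rough Python sketch, not a complete ECMA-48 parser; the names CSI, CSI_STANDARD and strip_csi are made up for illustration, and CSI_STANDARD is just one way to read "even more restricted" (final bytes \x40-\x6f only):

```python
import re

# The control-sequence pattern quoted above, as a bytes regex:
# introducer (ESC [ or the single byte 0x9b), parameter bytes 0x30-0x3f,
# then one final byte in 0x40-0x7e.
CSI = re.compile(rb"(?:\x1b\[|\x9b)[\x30-\x3f]*[\x40-\x7e]")
# Narrower variant that leaves out the private-use finals 0x70-0x7e.
CSI_STANDARD = re.compile(rb"(?:\x1b\[|\x9b)[\x30-\x3f]*[\x40-\x6f]")


def strip_csi(data: bytes, pattern: re.Pattern = CSI) -> bytes:
    """Delete every match of the given control-sequence pattern."""
    return pattern.sub(b"", data)


if __name__ == "__main__":
    sample = b"\x1b[2Jcleared \x1b[1;31mred\x1b[0m \x9b0Aup"
    print(strip_csi(sample))
    # A private-use final byte ('p') survives the narrower pattern.
    print(strip_csi(b"\x1b[5pX", CSI_STANDARD))
```

Two caveats, as far as I understand the standard: ECMA-48 also allows intermediate bytes (\x20-\x2f) between the parameters and the final byte, which the quoted pattern does not cover, and in UTF-8 data the byte \x9b can be part of an ordinary character, so matching it as CSI is only safe for single-byte encodings.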
From: Janis Papanagnou on 25 May 2010 18:25

Ben Bacarisse wrote:
> pk <pk(a)pk.invalid> writes:
> <snip>
>> For reference, here are some tables with most ANSI escape sequences:
>>
>> http://isthe.com/chongo/tech/comp/ansi_escapes.html
>> http://ascii-table.com/ansi-escape-sequences.php
>
> Yes, I found both of those but they seem less than comprehensive (my
> test being if they tell you about \e[J and \e[1J as well as \e[2J).
>
> ECMA-48 seems to be the most definitive reference I can find online. It
> gives a more restrictive pattern:
>
> (\x1b\[|\x9b)[\x30-\x3f]*[\x40-\x7e]

I wonder, though, why, e.g.,

  ESC ( B
  ESC =
  ESC >

(which, incidentally, are all in the data that I parse) are not covered
by the pattern that you've found in the ECMA-48 reference.

> In fact, trailing bytes in the range \x70-\x7e ('p' to '~' in ASCII) are
> reserved for private or experimental use so this could be made even more
> restricted.
>

BTW, in one of the references there are also escape sequences that seem
to be terminated by a digit; ESC 7 and ESC 8, for example.

Janis
From: Ben Bacarisse on 25 May 2010 19:06

Janis Papanagnou <janis_papanagnou(a)hotmail.com> writes:
> Ben Bacarisse wrote:
>> pk <pk(a)pk.invalid> writes:
>> <snip>
>>> For reference, here are some tables with most ANSI escape sequences:
>>>
>>> http://isthe.com/chongo/tech/comp/ansi_escapes.html
>>> http://ascii-table.com/ansi-escape-sequences.php
>>
>> Yes, I found both of those but they seem less than comprehensive (my
>> test being if they tell you about \e[J and \e[1J as well as \e[2J).
>>
>> ECMA-48 seems to be the most definitive reference I can find online. It
>> gives a more restrictive pattern:
>>
>> (\x1b\[|\x9b)[\x30-\x3f]*[\x40-\x7e]
>
> I wonder, though, why, e.g.,
>
> ESC ( B
> ESC =
> ESC >
>
> (which, incidentally, are all in the data that I parse) are not covered
> by the pattern that you've found in the ECMA-48 reference.

What I quoted was a pattern for what ECMA-48 calls control sequences.
There are four other categories (the C0 set, the C1 set, independent
control functions and control strings) and I have not gone through and
worked them all out. I think there is a lot of history being codified
here.

>> In fact, trailing bytes in the range \x70-\x7e ('p' to '~' in ASCII) are
>> reserved for private or experimental use so this could be made even more
>> restricted.
>>
>
> BTW, in one of the references there are also escape sequences that seem
> to be terminated by a digit; ESC 7 and ESC 8, for example.

That may well be possible. I was only describing "control sequences",
that is, those that start with CSI (the Control Sequence Introducer)
\e[.

There ought to be an ANSI document, of course, but they are not always
easily available. It might be easier to read than ECMA-48, though, which
is rather hard going.

--
Ben.
From: stan on 26 May 2010 17:23

Janis Papanagnou wrote:
> Ben Bacarisse wrote:
>> pk <pk(a)pk.invalid> writes:
>> <snip>
>>> For reference, here are some tables with most ANSI escape sequences:
>>>
>>> http://isthe.com/chongo/tech/comp/ansi_escapes.html
>>> http://ascii-table.com/ansi-escape-sequences.php
>>
>> Yes, I found both of those but they seem less than comprehensive (my
>> test being if they tell you about \e[J and \e[1J as well as \e[2J).
>>
>> ECMA-48 seems to be the most definitive reference I can find online. It
>> gives a more restrictive pattern:
>>
>> (\x1b\[|\x9b)[\x30-\x3f]*[\x40-\x7e]
>
> I wonder, though, why, e.g.,
>
> ESC ( B
> ESC =
> ESC >

I don't know of a handy online reference but I have an old copy of an
actual VT100 user guide with a pretty good description that seems
comprehensive. For example:

ESC ( B is shown as ANSI SCS control which switches from G0 to G1 char set.
ESC = is shown as DECKPAM Keypad App Mode (DEC private)
ESC > is shown as DECKPNM Keypad Numeric Mode (DEC private)

> (which, incidentally, are all in the data that I parse) are not covered
> by the pattern that you've found in the ECMA-48 reference.
>
>> In fact, trailing bytes in the range \x70-\x7e ('p' to '~' in ASCII) are
>> reserved for private or experimental use so this could be made even more
>> restricted.
>>
> BTW, in one of the references there are also escape sequences that seem
> to be terminated by a digit; ESC 7 and ESC 8, for example.

Ok, I'm back and it seems there is a copy at:

www.piesoftwareinc.co.uk/textonly/VT100_User_Guide.pdf

I don't know if it helps but it has a lot of pages :)
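The sequences discussed in this subthread (ESC ( B, ESC =, ESC >, ESC 7, ESC 8) are not control sequences, so the ECMA-48 CSI pattern leaves them alone. Below is a rough Python sketch that strips them too. It rests on an assumption that does not come from the thread: that these sequences follow the ECMA-35 (ISO 2022) escape-sequence shape of ESC, optional intermediate bytes \x20-\x2f, then one final byte \x30-\x7e. The names ESC_SEQ, ANY_SEQ and strip_sequences are hypothetical.

```python
import re

# Control sequences, using the ECMA-48 pattern quoted in the thread.
CSI = rb"(?:\x1b\[|\x9b)[\x30-\x3f]*[\x40-\x7e]"
# Assumption (ECMA-35 shape, not from the thread): other escape sequences
# are ESC, optional intermediate bytes 0x20-0x2f, one final byte 0x30-0x7e.
ESC_SEQ = rb"\x1b[\x20-\x2f]*[\x30-\x7e]"
# CSI is listed first so that ESC [ is consumed as a whole control sequence
# rather than as the two-byte escape sequence ESC [.
ANY_SEQ = re.compile(CSI + b"|" + ESC_SEQ)


def strip_sequences(data: bytes) -> bytes:
    """Remove control sequences and plain escape sequences from data."""
    return ANY_SEQ.sub(b"", data)


if __name__ == "__main__":
    sample = b"\x1b(B\x1b=a\x1b>b\x1b7c\x1b8d\x1b[0Ae"
    print(strip_sequences(sample))
```

This still does not handle control strings (another of the categories Ben lists), so treat it as a starting point and check it against the actual data.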