From: J on 7 Apr 2010 17:40

Can someone make me un-crazy?

I have a bit of code that right now looks like this:

    status = getoutput('smartctl -l selftest /dev/sda').splitlines()[6]
    status = re.sub(' (?= )(?=([^"]*"[^"]*")*[^"]*$)', ":",status)
    print status

Basically, it pulls the first actual line of data from the output you get when you use smartctl to look at a hard disk's selftest log.

The raw data looks like this:

    # 1  Short offline       Completed without error       00%       679         -

Unfortunately, all that whitespace is an arbitrary number of single space characters. And I am interested in the string that appears in the third column, which changes as the test runs and then completes. So in the example, "Completed without error".

The regex I have up there doesn't quite work, as it seems to be subbing EVERY space (or at least every space in a run of more than one space) to a ':' like this:

    # 1: Short offline:::::: Completed without error:::::: 00%:::::: 679:::::::: -

Ultimately, what I'm trying to do is either replace any run of more than one space with a delimiter, then split the result into a list and get the third item.

OR, if there's a smarter, shorter, or better way of doing it, I'd love to know.

The end result should pull the whole string in the middle of that output line, and then I can use that to compare to a list of possible output strings to determine if the test is still running, has completed successfully, or failed.

Unfortunately, my google-fu fails right now, and my Regex powers were always rather weak anyway...

So any ideas on what the best way to proceed with this would be?
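
A short sketch of why the pattern in the post produces one ':' per space, and of the "replace each run, then split" idea described above. It is written for Python 3 (the post uses Python 2), and the sample line's run lengths are an assumption reconstructed from the colon-substituted output quoted in the post. The names `per_space`, `per_run`, and `fields` are illustrative only.

```python
import re

# Sample line; run lengths are assumed, inferred from the ':' output in the post.
line = "# 1  Short offline       Completed without error       00%       679         -"

# Pattern from the post: it matches ONE space at a time, provided the next
# character is also a space. The second lookahead only excludes spaces that
# sit inside double-quoted text, which this line does not contain.
per_space = re.sub(r' (?= )(?=([^"]*"[^"]*")*[^"]*$)', ":", line)

# Matching the whole run in a single match gives one delimiter per gap.
per_run = re.sub(r" {2,}", ":", line)

fields = per_run.split(":")
status = fields[2]  # third column

print(per_space)
print(per_run)
print(status)
```

Because every space in a run except the last is followed by another space, a run of n spaces becomes n-1 colons plus one leftover space, which matches the output shown in the post. A ':' delimiter would misbehave if a column ever contained a colon itself, so splitting directly on the run avoids picking a delimiter at all.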
From: Grant Edwards on 7 Apr 2010 17:47

On 2010-04-07, J <dreadpiratejeff(a)gmail.com> wrote:

> Can someone make me un-crazy?

Definitely. Regex is driving you crazy, so don't use a regex.

    inputString = "# 1  Short offline       Completed without error       00%       679         -"
    print ' '.join(inputString.split()[4:-3])

> So any ideas on what the best way to proceed with this would be?

Anytime you have a problem with a regex, the first thing you should ask yourself: "do I really, _really_ need a regex?"

Hint: the answer is usually "no".

--
Grant Edwards (grant.b.edwards at gmail.com)
Yow! I'm continually AMAZED at th'breathtaking effects of WIND EROSION!!
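
The word-slicing approach above can be sketched as a small Python 3 function. It works for a status of any length, but it assumes the columns before the status always total four words ("#", the test number, and a two-word test description) and that exactly three single-word columns follow it. The function name, its parameters, and the second sample line are hypothetical and not taken from real smartctl output.

```python
def status_by_words(line, lead_words=4, tail_words=3):
    """Return the status column, assuming fixed word counts on both sides."""
    words = line.split()  # splits on any run of whitespace
    return " ".join(words[lead_words:-tail_words])


# Sample from the thread (run lengths assumed) and a made-up in-progress line.
done = "# 1  Short offline       Completed without error       00%       679         -"
busy = "# 1  Short offline       Self-test routine in progress 90%       679         -"

print(status_by_words(done))
print(status_by_words(busy))
```

If the test description could be one word or three, the leading count would shift and the slice would pick up the wrong words, which is the main trade-off against splitting on runs of spaces.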
From: Patrick Maupin on 7 Apr 2010 20:49

On Apr 7, 4:40 pm, J <dreadpiratej...(a)gmail.com> wrote:

> Can someone make me un-crazy?
> [...]
> Ultimately, what I'm trying to do is either replace any space that is
> > one space with a delimiter, then split the result into a list and
> get the third item.
> [...]
> So any ideas on what the best way to proceed with this would be?

You mean like this?

    >>> import re
    >>> re.split(' {2,}', '# 1  Short offline       Completed without error       00%')
    ['# 1', 'Short offline', 'Completed without error', '00%']
    >>>

Regards,
Pat
From: Patrick Maupin on 7 Apr 2010 20:50

On Apr 7, 4:47 pm, Grant Edwards <inva...(a)invalid.invalid> wrote:

> On 2010-04-07, J <dreadpiratej...(a)gmail.com> wrote:
>
> > Can someone make me un-crazy?
>
> Definitely. Regex is driving you crazy, so don't use a regex.
>
>     inputString = "# 1  Short offline       Completed without error       00%       679         -"
>     print ' '.join(inputString.split()[4:-3])
>
> > So any ideas on what the best way to proceed with this would be?
>
> Anytime you have a problem with a regex, the first thing you should
> ask yourself: "do I really, _really_ need a regex?"
>
> Hint: the answer is usually "no".

OK, fine. Post a better solution to this problem than:

    >>> import re
    >>> re.split(' {2,}', '# 1  Short offline       Completed without error       00%')
    ['# 1', 'Short offline', 'Completed without error', '00%']
    >>>

Regards,
Pat
From: Patrick Maupin on 7 Apr 2010 21:03

On Apr 7, 7:49 pm, Patrick Maupin <pmau...(a)gmail.com> wrote:

> On Apr 7, 4:40 pm, J <dreadpiratej...(a)gmail.com> wrote:
>
> > Can someone make me un-crazy?
> > [...]
> > So any ideas on what the best way to proceed with this would be?
>
> You mean like this?
>
>     >>> import re
>     >>> re.split(' {2,}', '# 1  Short offline       Completed without error       00%')
>     ['# 1', 'Short offline', 'Completed without error', '00%']
>
> Regards,
> Pat

BTW, I find it annoying when people say "don't do that" when "that" is a perfectly good thing to do, and I also find it annoying when people tell you what not to do without telling you what *to* do. And although I find the regex solution to this problem to be quite clean, the equivalent non-regex solution is not terrible, so I will present it as well, for your viewing pleasure:

    >>> [x for x in '# 1  Short offline       Completed without error       00%'.split('  ') if x.strip()]
    ['# 1', 'Short offline', ' Completed without error', ' 00%']

Note that it splits on two spaces, so a run with an odd number of spaces leaves one leading space on the next field.

Regards,
Pat
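
A possible follow-up to the non-regex version, sketched in Python 3: strip each piece so that an odd-length run of spaces does not leave a leading space, then compare the third column against known status strings, as the original question describes. The sets `PASSED` and `RUNNING` and the function names are hypothetical. "Completed without error" comes from the thread, while the in-progress string is an assumption that should be checked against the output of the smartctl version in use.

```python
def split_columns(line):
    """Split on double spaces, strip each piece, and drop the empty ones."""
    pieces = (piece.strip() for piece in line.split("  "))
    return [piece for piece in pieces if piece]


# Hypothetical status groups; replace with the strings smartctl really prints.
PASSED = {"Completed without error"}
RUNNING = {"Self-test routine in progress"}


def classify(line):
    """Map a selftest log line to 'passed', 'running', 'other', or 'unknown'."""
    columns = split_columns(line)
    if len(columns) < 3:
        return "unknown"  # not a data line in the expected layout
    status = columns[2]
    if status in PASSED:
        return "passed"
    if status in RUNNING:
        return "running"
    return "other"  # anything unrecognised, including failures


# Sample line from the thread; run lengths are assumed.
SAMPLE = "# 1  Short offline       Completed without error       00%       679         -"

print(split_columns(SAMPLE))
print(classify(SAMPLE))
```

Returning "other" rather than "failed" for unrecognised strings is a deliberate choice in this sketch, since an unexpected status is not necessarily a failure.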