Prev: ANN: Leo 4.7 rc1 released
Next: Configuring apache to execute python scripts using mod_pythonhandler
From: Jean-Michel Pichavant on 15 Feb 2010 09:03

Martin wrote:
> Hi,
>
> I am trying to come up with a more generic scheme to match and replace
> a series of regex, which look something like this...
>
> 19.01,16.38,0.79,1.26,1.00 ! canht_ft(1:npft)
> 5.0, 4.0, 2.0, 4.0, 1.0 ! lai(1:npft)
>
> Ideally match the pattern to the right of the "!" sign (e.g. lai), I
> would then like to be able to replace one or all of the corresponding
> numbers on the line. So far I have a rather unsatisfactory solution,
> any suggestions would be appreciated...
>
> The file read in is an ascii file.
>
> f = open(fname, 'r')
> s = f.read()
>
> if CANHT:
>     s = re.sub(r"\d+.\d+,\d+.\d+,\d+.\d+,\d+.\d+,\d+.\d+ ! canht_ft", CANHT, s)
>
> where CANHT might be
>
> CANHT = '115.01,16.38,0.79,1.26,1.00 ! canht_ft'
>
> But this involves me passing the entire string.
>
> Thanks.
>
> Martin

I removed all lines containing things like 9*0.0 from your file, because I
don't know what they mean or how to handle them. These are not numbers.

import re

replace = {
    'snow_grnd' : (1, '99.99,'), # replace the 1st number by 99.99
    't_soil' : (2, '88.8,'), # replace the 2nd number by 88.8
}

testBuffer = """
0.749, 0.743, 0.754, 0.759 ! stheta(1:sm_levels)(top to bottom)
0.46 ! snow_grnd
276.78,277.46,278.99,282.48 ! t_soil(1:sm_levels)(top to bottom)
19.01,16.38,0.79,1.26,1.00 ! canht_ft(1:npft)
200.0, 4.0, 2.0, 4.0, 1.0 ! lai(1:npft)
"""

outputBuffer = ''
for line in testBuffer.split('\n'):
    for key, (index, repl) in replace.items():
        if key in line:
            parameters = {
                # given your example you will have to change this one,
                # I don't know what 9*0.0 means in your file
                'n' : r'[\d\.]+',
                'index' : index - 1,
            }
            # the following pattern silently matches every number before
            # the <index>th one, and uses a capturing group for that one
            pattern = r'(\s*(?:(?:%(n)s)[,\s]+){0,%(index)s})(?:(%(n)s)[,\s]+)(.*!.*)' % parameters
            # regexps are sometimes a nightmare to read
            line = re.sub(pattern, r'\1 ' + repl + r'\3', line)
            break
    outputBuffer += line + '\n'

print(outputBuffer)
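For comparison, the per-position replacement above can also be sketched without the long pattern, by splitting each line at the "!" and editing the list of values directly. This is only an illustration and not code from the thread: the name replace_nth_value is hypothetical, the sketch is written for Python 3, and it assumes the values are comma-separated and that the line is passed without its trailing newline.

```python
import re


def replace_nth_value(line, tag, index, new_value):
    """Return line with its index-th value (1-based) replaced by new_value.

    The line is only changed when the first word after '!' equals tag;
    otherwise it is returned untouched. Hypothetical helper, shown as a sketch.
    """
    values_part, sep, comment = line.partition("!")
    name = re.match(r"\s*(\w+)", comment)
    if not sep or name is None or name.group(1) != tag:
        return line  # no '!' on this line, or a different tag

    # Each comma-separated token is treated as opaque text.
    values = [v.strip() for v in values_part.split(",")]
    if not 1 <= index <= len(values):
        return line  # nothing at that position, leave the line alone

    values[index - 1] = new_value
    return ",".join(values) + " ! " + comment.strip()


if __name__ == "__main__":
    sample = "276.78,277.46,278.99,282.48 ! t_soil(1:sm_levels)(top to bottom)"
    print(replace_nth_value(sample, "t_soil", 2, "88.8"))
```

Because the tag is compared against the whole first word of the comment, a request for lai leaves a line tagged dcatch_lai alone. The trade-off is that the original spacing between values is not preserved.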
From: Martin on 15 Feb 2010 09:13

On Feb 15, 2:03 pm, Jean-Michel Pichavant <jeanmic...(a)sequans.com> wrote:
> I remove all lines containing things like 9*0.0 in your file, cause I
> don't know what they mean and how to handle them. These are not numbers.
>
> [...]

Thanks, I will take a look. I think perhaps I was having a very slow
day when I posted: I have realised I could solve the original problem more
efficiently, and the problem wasn't quite as I first perceived it. It is
enough to match the tag to the right of the "!" sign and use this to
replace what lies on the left of the "!" sign. Currently I have
this... if anyone thinks there is a neater solution I am happy to hear
it. Many thanks.

variable_tag = 'lai'
variable = [200.0, 60.030, 0.060, 0.030, 0.030]

# generate adjustment string
variable = ",".join(["%s" % i for i in variable]) + ' ! ' + variable_tag

# call func to adjust input file
adjustStandardPftParams(variable, variable_tag, in_param_fname, out_param_fname)

and the inside of this func looks like this

def adjustStandardPftParams(self, variable, variable_tag, in_fname, out_fname):

    f = open(in_fname, 'r')
    of = open(out_fname, 'w')
    pattern_found = False

    while True:
        line = f.readline()
        if not line:
            break
        pattern = re.findall(r"!\s+"+variable_tag, line)
        if pattern:
            print('yes')
            # write the replacement line instead of the original
            of.write("%s\n" % variable)
            pattern_found = True

        if pattern_found:
            pattern_found = False
        else:
            of.write(line)

    f.close()
    of.close()

    return
From: Jean-Michel Pichavant on 15 Feb 2010 09:27

Martin wrote:
> [...]
> Currently I have this...if anyone thinks there is a neater solution
> I am happy to hear it. Many thanks.
>
> [...]
>     while True:
>         line = f.readline()
>         if not line:
>             break
>         pattern = re.findall(r"!\s+"+variable_tag, line)
>         if pattern:
> [...]

Are you sure a simple

    if variable_tag in line:
        # do some stuff

is not enough?

People will usually prefer to write

    for line in open(in_fname, 'r'):

instead of your ugly while loop ;-)

JM
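Put together, the two suggestions above might look like the following. This is a sketch rather than anyone's posted code: it is written for Python 3, the function name adjust_params is hypothetical, and the with statement is an addition that the thread does not mention (it closes both files even if an error occurs).

```python
def adjust_params(variable, variable_tag, in_fname, out_fname):
    """Copy in_fname to out_fname, replacing every line that mentions the tag.

    variable is the complete replacement line (without a newline), e.g.
    "200.0,60.03 ! lai". Hypothetical helper, shown as a sketch.
    """
    with open(in_fname, "r") as f, open(out_fname, "w") as of:
        for line in f:  # iterate over the file directly, no while loop
            if variable_tag in line:  # plain substring test
                of.write(variable + "\n")
            else:
                of.write(line)
```

The if/else also removes the need for the pattern_found flag. Note that a plain substring test matches the tag anywhere in the line, including inside a longer tag name.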
From: Martin on 15 Feb 2010 17:26

On Feb 15, 2:27 pm, Jean-Michel Pichavant <jeanmic...(a)sequans.com> wrote:
> Are you sure a simple
>
>     if variable_tag in line:
>         # do some stuff
>
> is not enough?
>
> People will usually prefer to write
>
>     for line in open(in_fname, 'r'):
>
> instead of your ugly while loop ;-)
>
> JM

My while loop is suitably offended. I have changed it as you
suggested... though if I use

    if variable_tag in line:

as you suggested, I would in my example correctly pick up the tag lai, but
also one called dcatch_lai, which I wouldn't want. No doubt there is an
obvious solution I am again missing!

of = open(out_fname, 'w')
pattern_found = False

for line in open(in_fname, 'r'):
    pattern = re.findall(r"!\s+"+variable_tag, line)
    if pattern:
        # write the replacement line instead of the original
        of.write("%s\n" % variable)
        pattern_found = True

    if pattern_found:
        pattern_found = False
    else:
        of.write(line)

of.close()

Many thanks.
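One way to tell lai apart from dcatch_lai is to anchor the tag: require it to come directly after the "!" and to end at a word boundary. The sketch below is an illustration, not a reply from the thread. It is written for Python 3, the name replace_tagged_line is hypothetical, and it works on any iterable of lines so that it can be tried without files. Like the code above, the replacement line it writes carries only the bare tag.

```python
import re


def replace_tagged_line(lines, tag, values):
    """Return a list of lines in which the line tagged with tag is replaced.

    A line counts as tagged when tag follows the '!' directly (optional
    whitespace in between) and ends at a word boundary. Hypothetical helper.
    """
    # re.escape guards against tags containing regex metacharacters;
    # \b stops "lai" from matching a longer name such as "lai_max".
    pattern = re.compile(r"!\s*" + re.escape(tag) + r"\b")
    replacement = ",".join(str(v) for v in values) + " ! " + tag + "\n"

    out = []
    for line in lines:
        if pattern.search(line):
            out.append(replacement)
        else:
            out.append(line)
    return out


if __name__ == "__main__":
    sample = [
        "5.0, 4.0, 2.0, 4.0, 1.0 ! lai(1:npft)\n",
        "0.5 ! dcatch_lai\n",
    ]
    for line in replace_tagged_line(sample, "lai", [200.0, 60.03, 0.06, 0.03, 0.03]):
        print(line, end="")
```

Since "_" counts as a word character, the pattern cannot match in the middle of dcatch_lai, while the "(" in lai(1:npft) still satisfies the trailing boundary. To apply it to files, the result can be passed to writelines() on the output file while iterating over the input file.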