From: deltaquattro on 9 Jun 2010 12:31

Hi,

this is really more of a "numerical computing" question, so I cross-post
to sci.math.num.analysis too. I decided to post on comp.lang.fortran
anyway, because it is full of computational scientists and some aspects
of the issue are specific to the Fortran language.

The problem is this: I am modifying a legacy code, and I need to compute
some REAL values which I then store in large arrays. Sometimes it's
impossible to compute these values: for example, when interpolating a
table at a given abscissa, the abscissa may fall outside the curve
boundaries. I have code which checks for this possibility, and if this
happens the interpolation is not performed. However, now I must "store"
somewhere the information that interpolation was not possible for that
array element, and inform the user of it. Since the values can be either
positive or negative, I cannot use tricks like initializing the array
element to a negative value.

I'm sure this has happened to you before: which solution did you use?
Basically, I can think of three ways:

1. For each REAL array, I declare a LOGICAL array of the same shape,
which contains 0 for correct values and 1 for missing values. I guess
that's the cleanest way, but I have a lot of arrays and I'd rather not
declare an extra array for each of them. I know it's not a memory issue
(LOGICAL arrays don't occupy a lot of space, even if they are big in my
case!), but to me it seems like I'm adding redundant code. It would be
better to declare arrays of a derived type, each element containing a
REAL and a LOGICAL (sketched below), but this would force me to modify
the code in all the places where the arrays are used, and it's quite a
big code.

2. I initialize a missing value to an extremely large positive or
negative value, like 9e99. I think that's how the problem is usually
solved in practice, isn't it? I'm a bit worried that this is not
entirely "clean", since such values could in theory also result from the
interpolation. However, since reasonable values of all the interpolated
quantities are usually in the range -100/100, when this happens it is
usually related to errors in the interpolation table data. So most
likely it indicates an error which must be signaled to the user.

3. One could initialize the "missing" values to NaN. However, I then
have to test for the array element being a NaN when I produce my output
for the user. From what I remember about Fortran and NaN, there's (or
there was) no portable way to do this... am I wrong?

I would really appreciate your help on this issue, since I really don't
know which way to choose and currently I'm stuck! Thanks in advance,

Best Regards

Sergio Rossi
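A minimal sketch of the derived-type variant from option 1 might look
like the following; the type and component names are invented for
illustration and are not taken from the actual code:

    program checked_demo
      implicit none

      type :: checked_real
         real    :: value = 0.0
         logical :: valid = .false.   ! stays .false. for "missing" entries
      end type checked_real

      type(checked_real) :: results(5)
      integer :: i

      ! pretend entry 3 could not be interpolated
      do i = 1, 5
         if (i /= 3) then
            results(i)%value = real(i)
            results(i)%valid = .true.
         end if
      end do

      do i = 1, 5
         if (results(i)%valid) then
            print *, i, results(i)%value
         else
            print *, i, ' (missing)'
         end if
      end do
    end program checked_demo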
From: Ian Bush on 9 Jun 2010 13:58

On 9 June, 17:31, deltaquattro <deltaquat...(a)gmail.com> wrote:
>
> 3. One could initialize the "missing" values to NaN. However, I then
> have to test for the array element being a NaN when I produce my
> output for the user. From what I remember about Fortran and NaN,
> there's (or there was) no portable way to do this... am I wrong?
>
Well in f2003 there is the ieee_is_nan function. You'll need the
ieee_arithmetic module (I think) to use it. I must admit, however, I
haven't used this and have no feel for how widely implemented it is yet,

Ian
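For what it's worth, a minimal sketch of how that could look under
F2003, assuming the compiler ships the intrinsic ieee_arithmetic module
(the array and its contents are just for illustration):

    program nan_demo
      use, intrinsic :: ieee_arithmetic, only: &
           ieee_value, ieee_quiet_nan, ieee_is_nan
      implicit none
      real :: y(4)

      y = 1.0
      y(2) = ieee_value(y(2), ieee_quiet_nan)   ! mark element 2 as "missing"

      ! ieee_is_nan is elemental, so it can be applied to the whole array
      print *, 'missing entries:', count(ieee_is_nan(y))
    end program nan_demo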
From: glen herrmannsfeldt on 9 Jun 2010 14:23

In comp.lang.fortran deltaquattro <deltaquattro(a)gmail.com> wrote:
(snip on applicability)

> The problem is this: I am modifying a legacy code, and I need to
> compute some REAL values which I then store in large arrays. Sometimes
> it's impossible to compute these values: for example, when
> interpolating a table at a given abscissa, the abscissa may fall
> outside the curve boundaries. I have code which checks for this
> possibility, and if this happens the interpolation is not performed.
> However, now I must "store" somewhere the information that
> interpolation was not possible for that array element, and inform the
> user of it. Since the values can be either positive or negative, I
> cannot use tricks like initializing the array element to a negative
> value.
>
> I'm sure this has happened to you before: which solution did you use?
> Basically, I can think of three ways:
>
> 1. For each REAL array, I declare a LOGICAL array of the same shape,
> which contains 0 for correct values and 1 for missing values. I guess
> that's the cleanest way, but I have a lot of arrays and I'd rather not
> declare an extra array for each of them. I know it's not a memory
> issue (LOGICAL arrays don't occupy a lot of space, even if they are
> big in my case!),

In Fortran, the default LOGICAL is the same size as default REAL.
You may have a smaller size available, though.

> but to me it seems like I'm adding redundant code. It would be better
> to declare arrays of a derived type, each element containing a REAL
> and a LOGICAL, but this would force me to modify the code in all the
> places where the arrays are used, and it's quite a big code.

It is hard to say. There are some cache issues, as well as readability.

> 2. I initialize a missing value to an extremely large positive or
> negative value, like 9e99. I think that's how the problem is usually
> solved in practice, isn't it? I'm a bit worried that this is not
> entirely "clean", since such values could in theory also result from
> the interpolation.

Well, you could check for the (unlikely) accidental occurrence and
substitute a different (nearby) value. Note that 9e99 is too big for
single precision REAL on most systems. I believe that before IEEE,
this was the usual solution. Likely it still is, as long as non-IEEE
machines are around.

> However, since reasonable values of all the interpolated quantities
> are usually in the range -100/100, when this happens it is usually
> related to errors in the interpolation table data. So most likely it
> indicates an error which must be signaled to the user.

In general, testing for a specific floating point value isn't a good
idea, but it is likely done in this case.

> 3. One could initialize the "missing" values to NaN. However, I then
> have to test for the array element being a NaN when I produce my
> output for the user. From what I remember about Fortran and NaN,
> there's (or there was) no portable way to do this... am I wrong?

You have to test in all cases, so I don't see the difference. Many
systems will print out a readable value, such as NaN, in the NaN case,
so that you don't have to test before printing. Portable NaN testing is
fairly new to Fortran. This is probably the best solution going forward.

> I would really appreciate your help on this issue, since I really
> don't know which way to choose and currently I'm stuck! Thanks in
> advance,
I don't recommend the LOGICAL variable method, unless it is necessary
for every REAL value to be legal data. If you need portability to older
compilers, you could do conditional compilation on a test for NaN or for
a sentinel such as 9.9999e30 (fits in single precision on most machines)
or 9.9999e99 (fits in double precision on many machines).

-- glen
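A sketch of the sentinel variant for compilers without the F2003 IEEE
module might look like this (module, constant and function names are
made up; the constant should be chosen to fit the REAL kind actually
used):

    module missing_value
      implicit none
      real, parameter :: missing = 9.9999e30   ! representable in default REAL
    contains
      elemental logical function is_missing(x)
        real, intent(in) :: x
        ! exact comparison is fine here: "missing" entries are assigned
        ! exactly this constant, never computed
        is_missing = (x == missing)
      end function is_missing
    end module missing_value

If all filling and reporting code refers only to missing and
is_missing(), switching to a NaN-based test later means changing this
one module.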
From: Thomas Koenig on 9 Jun 2010 15:22

On 2010-06-09, deltaquattro <deltaquattro(a)gmail.com> wrote:

> 1. For each REAL array, I declare a LOGICAL array of the same shape,

Default logical has the same size as default real. You could check
whether your compiler has a smaller kind. gfortran, for example,
supports logical(kind=1), which occupies one byte.

> which contains 0 for correct values and 1 for missing values.

A nit: logical arrays only contain .TRUE. and .FALSE. How they are
implemented internally is processor dependent.

> 2. I initialize a missing value to an extremely large positive or
> negative value, like 9e99.

I don't like in-line signalling much. It is likely to bite you in the
one place in your program where you didn't think to check for it.
Murphy dictates that there will be at least one such place ;-)

> 3. One could initialize the "missing" values to NaN.

Probably the best way. You could isolate the check in a module, in a
single, system-dependent function. Chances are most compilers will have
either isnan() or the F2003 IEEE feature.

There is a fourth method, which depends on the way you process your
data. If you walk through it linearly and have few invalid points, you
could keep a list of the invalid points, and check for the presence of
each point on that list.
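A rough sketch of that fourth method, assuming the invalid points are
identified by their array index (names invented; uses F2003's
move_alloc to grow the list):

    module invalid_points
      implicit none
      integer, allocatable :: bad_index(:)
      integer :: n_bad = 0
    contains
      subroutine mark_invalid(i)
        integer, intent(in) :: i
        integer, allocatable :: tmp(:)
        if (.not. allocated(bad_index)) allocate (bad_index(16))
        if (n_bad == size(bad_index)) then       ! grow the list when full
           allocate (tmp(2*size(bad_index)))
           tmp(1:n_bad) = bad_index(1:n_bad)
           call move_alloc(tmp, bad_index)
        end if
        n_bad = n_bad + 1
        bad_index(n_bad) = i
      end subroutine mark_invalid

      logical function is_invalid(i)
        integer, intent(in) :: i
        is_invalid = .false.
        if (n_bad > 0) is_invalid = any(bad_index(1:n_bad) == i)
      end function is_invalid
    end module invalid_points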
From: Harald Anlauf on 9 Jun 2010 16:02
On Jun 9, 6:31 pm, deltaquattro <deltaquat...(a)gmail.com> wrote:

> 2. I initialize a missing value to an extremely large positive or
> negative value, like 9e99. I think that's how the problem is usually
> solved in practice, isn't it? I'm a bit worried that this is not
> entirely "clean", since such values could in theory also result from
> the interpolation. However, since reasonable values of all the
> interpolated quantities are usually in the range -100/100, when this
> happens it is usually related to errors in the interpolation table
> data. So most likely it indicates an error which must be signaled to
> the user.

Some external data formats like NetCDF have the concept of a "missing
value" that you define for a variable or an arbitrary-rank array, and
which can be inquired when reading in the data. It need not be the same
fixed value for all variables in a file. I would therefore recommend
using a variable which you compare against, instead of a certain "magic"
number.

Cheers,
Harald
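In other words, something along these lines, where the comparison value
lives in a variable rather than being hard-wired into the code (the
names are illustrative; with NetCDF the value could be read from the
variable's "_FillValue" attribute, e.g. via nf90_get_att, instead of
being set in the program):

    program fill_demo
      implicit none
      real :: fill_value
      real :: temp(5)
      integer :: n_missing

      ! in real code fill_value would come from the file's metadata
      fill_value = -9999.0
      temp = [1.0, 2.0, fill_value, 4.0, fill_value]

      n_missing = count(temp == fill_value)
      if (n_missing > 0) print *, n_missing, 'points could not be interpolated'
    end program fill_demo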