How to deal with "missing points" in arrays [Fortran]


From: deltaquattro on 10 Jun 2010 07:26

On 9 Jun, 19:58, Ian Bush <ianbush.throwaway.acco...(a)googlemail.com>
wrote:
> On 9 June, 17:31, deltaquattro <deltaquat...(a)gmail.com> wrote:
>
>
>
> > 3. One could initialize the "missing" values to NaN. However, I then
> > have to test for the array element being a NaN, when I produce my
> > output for the user. From what I remember about Fortran and NaN,
> > there's (or there was) no portable way to do this...am I wrong?
>
> Well in f2003 there is the ieee_is_nan function. You'll need the
> ieee_arithmetic module (I think) to use it.
>
> I must admit, however, I haven't used this and have no feel for how
> widely implemented this is yet,
>
> Ian

Hi Ian,

thanks for the information: you're right. I know of that
function because (following a suggestion in this great ng) I bought the
excellent reference MR&C. Unfortunately, I should have specified that
I cannot write Fortran 2003-only code...I must stick to F95 with
TRs...

Best Regards

Sergio Rossi
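
For reference, the ieee_is_nan route Ian mentions can be sketched as below. This is only an illustration, assuming the compiler provides the ieee_arithmetic module; the program name, array and sizes are made up for the example and are not from the thread.

```fortran
program nan_check_demo
  ! Assumption: the compiler supplies the intrinsic ieee_arithmetic module.
  use, intrinsic :: ieee_arithmetic, only: ieee_is_nan, ieee_value, ieee_quiet_nan
  implicit none
  integer, parameter :: r8 = selected_real_kind(9, 99)
  real(r8) :: arr(3)
  integer :: i

  arr = (/ 1.0_r8, 2.0_r8, 3.0_r8 /)
  ! Mark the second point as missing with a quiet NaN.
  arr(2) = ieee_value(arr(2), ieee_quiet_nan)

  do i = 1, size(arr)
     if (ieee_is_nan(arr(i))) then
        print *, i, ' missing'
     else
        print *, i, arr(i)
     end if
  end do
end program nan_check_demo
```

Whether this is usable depends on how completely a given compiler implements the IEEE modules, which is the doubt Ian raises above.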

From: deltaquattro on 10 Jun 2010 07:45

On 9 Jun, 20:23, glen herrmannsfeldt <g...(a)ugcs.caltech.edu> wrote:
[..]
> > 1. For each REAL array, I declare a LOGICAL array of the same shape,
> > which contains 0 for correct values and 1 for missing values. I guess
> > that's the cleanest way, but I have a lot of arrays and I'd rather not
> > declare an extra array for each of them. I know it's not a memory
> > issue (obviously LOGICAL arrays don't occupy a lot of space, even if
> > they are big in my case!),
>
> In Fortran, the default LOGICAL is the same size as default REAL.
> You may have a smaller size available, though.

Wow! Really? I didn't know...it does seem a bit strange to me,
though. Wouldn't a single bit be sufficient for LOGICAL? Oh, btw,
sorry for forgetting to mention it, but when I write REAL I really
mean REAL(kind=r8) where

integer, parameter :: r8= SELECTED_REAL_KIND(9,99)

I apologize for the mistake.

> > but to me it seems like I'm adding
> > redundant code. It would be better to declare arrays of a derived
> > type, each element containing a REAL and a LOGICAL, but this would
> > force me to modify the code in all the places where the arrays are
> > used, and it's quite a big code.
>
> It is hard to say. There are some cache issues, as well as
> readability.
>

Yes, I was worried about readability too; that's why I was inclined to
discard this solution.

[..]
>
> Well, you could check for the (unlikely) accidental occurrence and
> substitute a different (nearby) value.

Hmmm, not sure I got your point, could you show me a short example?

> Note that 9e99 is too big
> for the single precision REAL on most systems. I believe that
> before IEEE, this was the usual solution. Likely still even with
> IEEE, as long as non-IEEE machines are around.

No problem, as said before I made a mistake, it's really a DOUBLE
PRECISION variable.
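
A quick way to see glen's point about range is to ask the compiler directly. The sketch below is illustrative only (the program name is made up); it prints the largest finite value and the decimal exponent range for default REAL and for the r8 kind defined above, without assuming any particular result.

```fortran
program kind_range_demo
  implicit none
  integer, parameter :: r8 = selected_real_kind(9, 99)

  ! Largest finite value of default REAL and of the r8 kind.
  print *, 'huge(default real)  =', huge(1.0)
  print *, 'huge(real(r8))      =', huge(1.0_r8)

  ! Decimal exponent range of each kind; a sentinel near 1e99
  ! needs a kind whose range is at least 99.
  print *, 'range(default real) =', range(1.0)
  print *, 'range(real(r8))     =', range(1.0_r8)
end program kind_range_demo
```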

[..]
>
> > 3. One could initialize the "missing" values to NaN. However, I then
> > have to test for the array element being a NaN, when I produce my
> > output for the user. From what I remember about Fortran and NaN,
> > there's (or there was) no portable way to do this...am I wrong?
>
> You have to test in all cases, so I don't see the difference.

Sure, I have to test in all cases, but the difference is that the other
tests are portable, while, as far as I remember, testing for NaN
(under F95 with TRs) is not.
For example, I recall that the classic test

IF (x /= x) THEN
   ! It's a NaN, do something here
END IF

could fail on some compilers. Of course, I may be wrong.

> Many will print out a nice value, such as NaN, in the case of NaN,
> such that you don't have to test. Portable NaN testing is fairly
> new to Fortran. This is probably the best solution going forward.
>
> > I would really appreciate your help on this issue, since I really
> > don't know which way to choose and currently I'm stuck! Thanks in
> > advance,
>
> I don't recommend the LOGICAL variable method, unless it is
> necessary to have all REAL values legal. If you need portability
> to older compilers, you could do conditional compilation on a
> test for NaN or 9.9999e30 (fits in single precision on most
> machines), or 9.9999e99 (in double precision on many machines).
>
> -- glen

Thanks! I think I'll go for 9.9999e99, but just in case I decide to
switch to NaN, could you please suggest a good way to fill the cell
with a NaN? Something like

A=0
arr(i,j)=A/0

should work if I'm not compiling with debugging options on, I think.
Thank you very much,

Best Regards,

Sergio Rossi
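
A minimal sketch of the sentinel approach chosen above, assuming the r8 kind from earlier in the thread. The name "missing" and the array sizes are hypothetical. The sentinel is a named constant, so every test compares against the same value rather than a literal repeated through the code.

```fortran
program sentinel_demo
  implicit none
  integer, parameter :: r8 = selected_real_kind(9, 99)
  ! Hypothetical name. The _r8 suffix matters: a bare 9.9999e99 is a
  ! default-REAL literal and is out of range on most systems.
  real(r8), parameter :: missing = 9.9999e99_r8
  real(r8) :: arr(2,2)
  integer :: i, j

  arr = 1.5_r8
  arr(1,2) = missing            ! flag one cell as missing

  do i = 1, 2
     do j = 1, 2
        ! Exact comparison is safe here because the cell was assigned
        ! the sentinel itself, not a computed value.
        if (arr(i,j) == missing) then
           write (*, '(2i4, a)') i, j, '   missing'
        else
           write (*, '(2i4, f10.4)') i, j, arr(i,j)
        end if
     end do
  end do
end program sentinel_demo
```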

From: deltaquattro on 10 Jun 2010 07:52

On 9 Jun, 21:22, Thomas Koenig <tkoe...(a)netcologne.de> wrote:
> On 2010-06-09, deltaquattro <deltaquat...(a)gmail.com> wrote:
>
> > 1. For each REAL array, I declare a LOGICAL array of the same shape,
>
> Default logical has the same size as default real. You could check
> if your compiler has a smaller kind. gfortran, for example, supports
> logical(kind=1), which occupies one byte.

You're right, glen pointed that out too. I was really surprised to
learn that LOGICAL and REAL occupy the same space.
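
The parallel-mask idea (option 1) with a smaller LOGICAL kind might look like the sketch below. It assumes that kind=1 is a one-byte LOGICAL, as Thomas says is the case for gfortran; other compilers may use different kind numbers. The array names and sizes are made up for the example.

```fortran
program mask_demo
  implicit none
  integer, parameter :: r8 = selected_real_kind(9, 99)
  ! Assumption: kind=1 is a one-byte LOGICAL (true for gfortran,
  ! not guaranteed elsewhere).
  integer, parameter :: lk = 1
  real(r8)    :: arr(3,2)
  logical(lk) :: is_missing(3,2)

  arr = 2.0_r8
  is_missing = .false.
  is_missing(2,1) = .true.      ! flag one point as missing

  ! Intrinsics that accept a mask can skip the missing points.
  print *, 'missing points:     ', count(is_missing)
  print *, 'sum of valid points:', sum(arr, mask=.not. is_missing)
end program mask_demo
```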

>
> > which contains 0 for correct values and 1 for missing values.
>
> A nit: logical arrays only contain .TRUE. and .FALSE. How they are
> implemented internally is processor dependent.

Ok.

>
> > 2. I initialize a missing value to an extremely large positive or
> > negative value, like 9e99.
>
> I don't like in-line signalling much. It is likely to bite you in the
> one place in your program where you didn't think to check for it.
> Murphy dictates that there will be at least one such place ;-)
>
> > 3. One could initialize the "missing" values to NaN.
>
> Probably the best way. You could isolate the check in a module, in
> a single, system-dependent function. Chances are most compilers will
> have either isnan() or the F2003 IEEE feature.
>
> There is a fourth method, which depends on the way that you process your
> data. If you walk through it linearly and you have few invalid points, you
> could keep a list of invalid points, and check for the presence of that
> point on the list.

Hmmmm, I have multiple arrays, so I would need either multiple lists
or one big list of integers with some chosen ordering of the different
arrays...I don't like this solution too much. The point is that
each array describes one of four physical properties of an "object",
and I have many of these objects, stored in a doubly linked list. I'd
rather not be forced to do the checks always in the same order, since
I'm not sure I'll always access the "objects" in the same order. Also,
some objects may be deleted, so I would have to update the list of
invalid points accordingly.

Thanks,

Best Regards

Sergio Rossi

From: deltaquattro on 10 Jun 2010 07:56

On 9 Jun, 22:02, Harald Anlauf <anlauf.2...(a)arcor.de> wrote:
> On Jun 9, 6:31 pm, deltaquattro <deltaquat...(a)gmail.com> wrote:
>
> > 2. I initialize a missing value to an extremely large positive or
> > negative value, like 9e99. I think that's how the problem is usually
> > solved in practice, isn't it? I'm a bit worried that this is not
> > entirely "clean", since such values could in theory also result from
> > the interpolation. However, since reasonable values of all the
> > interpolated quantities are usually in the range -100/100, when this
> > happens usually it is related to errors in the interpolation table
> > data. So most likely it indicates an error which must be signaled to
> > the user.
>
> Some external data formats like NetCDF have the concept of a "missing
> value" that you define for a variable or an arbitrary-rank array, which
> can be inquired when reading in the data. It need not be the same fixed
> value for all variables in a file. I would therefore recommend to use a
> variable which you compare to instead of a certain "magic" number.
>
> Cheers,
> Harald

Ok, so you're for the solution using LOGICAL arrays. Do you think it
would be better to implement a "parallel" logical array for each real
array, or to define arrays of derived type like:

type safearray
   logical  :: isvalid
   real(r8) :: cell
end type safearray

type(safearray), dimension(imax,jmax) :: arr1, arr2, ....

Thanks

Sergio Rossi
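
To make the derived-type alternative concrete, here is a small self-contained sketch built around the safearray type from the post. The module name, the sizes imax and jmax, and the values are hypothetical.

```fortran
module safearray_mod
  implicit none
  integer, parameter :: r8 = selected_real_kind(9, 99)
  type safearray
     logical  :: isvalid
     real(r8) :: cell
  end type safearray
end module safearray_mod

program safearray_demo
  use safearray_mod
  implicit none
  integer, parameter :: imax = 2, jmax = 2   ! hypothetical sizes
  type(safearray), dimension(imax,jmax) :: arr1
  integer :: i, j

  ! Component references on the whole array set every element at once.
  arr1%isvalid = .true.
  arr1%cell    = 0.5_r8
  arr1(2,1)%isvalid = .false.   ! flag one cell as missing

  do j = 1, jmax
     do i = 1, imax
        if (arr1(i,j)%isvalid) then
           write (*, '(2i4, f10.4)') i, j, arr1(i,j)%cell
        else
           write (*, '(2i4, a)') i, j, '   missing'
        end if
     end do
  end do
end program safearray_demo
```

Note that every existing reference of the form arr1(i,j) becomes arr1(i,j)%cell, which is the code-wide change mentioned earlier in the thread as the drawback of this design.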

From: deltaquattro on 10 Jun 2010 08:47

Hi all,

The discussion is getting very interesting, but after reading all the
answers I am getting a little confused about which option would be
best. Let's try to recap:

1) William Clodius suggests an enhanced version of solution 1.: I
address this in my reply to him.

2) Most people are against option 2. because it's not safe: in your
experience in-line signaling is too error-prone, so I should refrain
from doing this. Ok, point taken.

3) Richard Maine and others advocate testing for NaN with a separately
compiled function, because that will not be "optimized out" by the
compiler. So that should be portable enough even under F95 with TRs.
Let's say I choose this solution: what I'm missing here is, how do I
fill an array element with a NaN? Will this be portable?

a=0
arr(i,j)=a/0

Thanks to all,

Sergio Rossi
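
One possible shape for option 3 in plain F95 is sketched below: the NaN test and the NaN construction are both isolated in one module, so any system dependence lives in a single place. The names nan_utils, is_nan and make_nan are hypothetical. This is not guaranteed portable: it assumes IEEE arithmetic, no trapping on invalid operations, and no aggressive floating-point optimization flags. In practice the module would sit in its own source file so that it is compiled separately.

```fortran
module nan_utils
  implicit none
  private
  public :: r8, is_nan, make_nan
  integer, parameter :: r8 = selected_real_kind(9, 99)
contains

  ! Classic self-comparison test, kept in one place so it can be
  ! swapped for a compiler-specific test if it proves unreliable.
  function is_nan(x) result(res)
    real(r8), intent(in) :: x
    logical :: res
    res = (x /= x)
  end function is_nan

  ! Assumption: 0/0 evaluated at run time yields a quiet NaN
  ! (IEEE arithmetic, floating-point trapping disabled).
  function make_nan() result(x)
    real(r8) :: x
    real(r8) :: zero
    zero = 0.0_r8
    x = zero / zero
  end function make_nan

end module nan_utils

program nan_utils_demo
  use nan_utils
  implicit none
  real(r8) :: arr(2,2)

  arr = 1.0_r8
  arr(1,1) = make_nan()         ! flag one cell as missing
  if (is_nan(arr(1,1))) print *, 'arr(1,1) is flagged as missing'
end program nan_utils_demo
```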
