From: Arjan on 4 Feb 2010 14:23

> I also put in heavy duty checks, where it is possible, that are under
> control of a DEBUG parameter in an IF statement. Leave them in as they
> are very handy

In a similar way, most of my functions/subroutines have a line:

INTEGER, PARAMETER :: DebugLevel = 0

The "heaviness"/verbosity of the checks activated by the IF statements can be increased by increasing the value of DebugLevel. After some years I have really come to appreciate having done this...

A.
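[A minimal sketch of what Arjan's DebugLevel idiom might look like; the routine name and the particular checks are illustrative, not taken from his post:]

```fortran
! Sketch of a per-routine DebugLevel constant gating progressively
! heavier sanity checks.  Because DebugLevel is a compile-time
! PARAMETER, an optimizing compiler can remove the dead branches.
subroutine interp_demo(xtab, x)
  implicit none
  real, intent(in) :: xtab(:), x
  integer, parameter :: DebugLevel = 0   ! raise to 1 or 2 while debugging
  integer :: i

  if (DebugLevel >= 1) then
     ! cheap scalar check: x inside the table range
     if (x < xtab(1) .or. x > xtab(size(xtab))) &
        print *, 'interp_demo: x outside table range'
  end if
  if (DebugLevel >= 2) then
     ! heavier O(n) check: table must be strictly increasing
     do i = 2, size(xtab)
        if (xtab(i) <= xtab(i-1)) &
           print *, 'interp_demo: table not sorted at index ', i
     end do
  end if
  ! ... actual interpolation work would go here ...
end subroutine interp_demo
```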
From: Ron Shepard on 4 Feb 2010 16:26

In article <de9d0fd6-33f8-44e4-8e03-20428eaf7451(a)21g2000yqj.googlegroups.com>, deltaquattro <deltaquattro(a)gmail.com> wrote:

> when writing code which performs a numerical task such as for example
> interpolation, would you add the checks on the input data inside the
> sub itself, or would you delegate them to other code to be called before
> the interpolation subroutine? For example, in 1D linear interpolation
> you have to look up an ordered table for the position of a value x
> before performing interpolation. Should checking that x falls inside the
> interval spanned by the table be done inside or outside the
> interpolation sub? I'd say outside, but then when I reuse the code I
> risk forgetting the call to the checking sub...

There isn't a universal answer to this. If the checks are relatively inexpensive (time, memory, cache, etc.), then you should add them in the low-level routine so they are always active. If they are expensive, or in the gray area between, then you have options.

You might write two versions of the routine, one with and one without the checks. You would call the fast one when speed is critical (e.g. inside innermost do-loops) and where it makes sense to test the arguments outside the low-level routine (e.g. outside of the innermost do-loops), and you would call the safe one when speed is not critical or where nothing is saved by testing outside of the call.

Or, you might have a single version of the routine, but do the internal tests conditionally based on the value of an argument (or the existence of an optional argument). This is one of the common uses for optional arguments.

The same issue applies within the routine regarding what to do when it detects an error. Should it return an error code, or should it abort internally? You can write a single routine with an optional return argument that handles both situations. If the argument is present, it can be set with an appropriate error code before returning to the caller; otherwise the routine can abort on the spot.

$.02 -Ron Shepard
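[Ron's single-routine approach might be sketched as follows; the routine name `linear_interp` and the `stat` convention are illustrative assumptions, not from his post:]

```fortran
! Sketch: an OPTIONAL status argument decides between "return an error
! code to the caller" and "abort on the spot".
subroutine linear_interp(xtab, ytab, x, y, stat)
  implicit none
  real, intent(in)  :: xtab(:), ytab(:), x
  real, intent(out) :: y
  integer, intent(out), optional :: stat
  integer :: i

  if (present(stat)) stat = 0
  if (x < xtab(1) .or. x > xtab(size(xtab))) then
     if (present(stat)) then
        stat = 1           ! caller asked to handle errors itself
        y = 0.0
        return
     else
        stop 'linear_interp: x outside table range'
     end if
  end if

  ! locate the bracketing interval (linear search, for brevity)
  do i = 2, size(xtab)
     if (x <= xtab(i)) exit
  end do
  y = ytab(i-1) + (ytab(i) - ytab(i-1)) * (x - xtab(i-1)) / (xtab(i) - xtab(i-1))
end subroutine linear_interp
```

A caller that passes `stat` gets the error code back and keeps running; a caller that omits it gets the hard abort.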
From: glen herrmannsfeldt on 4 Feb 2010 17:05

Richard Maine <nospam(a)see.signature> wrote:
> deltaquattro <deltaquattro(a)gmail.com> wrote:
>> ps one of the reasons why I'm asking this is because I've seen a lot
>> of interp subs in libraries which don't do any checks on input data,
>> while mine are filled to the brim with checks, so I started doubting
>> my coding practices :)

> Yes, one sees lots of code that fails to do basic sanity checks. One
> also sees lots of examples of resulting failures. At times it seems like
> about half of the software security flaws out there boil down to failing
> to check such things. Buffer overrun, anyone? That's more often in C
> code than in Fortran, but the principles apply.

As a first approximation, you should test user-supplied values thoroughly, and be less strict with data supplied by other parts of your program.

Consider, for example, a pair of cubic-spline routines, where the first computes the spline coefficients and another evaluates the spline at a specified point. The second should not be expected to do extensive tests on the supplied coefficients, but should test that the user-supplied interpolation point is within range. (Unless you allow for extrapolation, but don't do that.)

You should only worry about the time taken by the tests if they are expected to be inside deeply nested loops (or otherwise executed billions of times). Integer and logical tests are extremely fast on modern, and even not so modern, processors. Floating point is a little slower in many cases. Scalar tests are faster than array tests. Again considering the cubic-spline routines, which might be executed many times with one set of computed coefficients, extensive retesting of the coefficient arrays could be very slow.

There are still some tests that could be done, though. If you test the first and last sets of coefficients, that protects against some of the more obvious mistakes that someone might make. You could also add a value near the beginning of the data structure that should match another value near the end, and test that at evaluation time. That protects against many cases of the user passing the wrong data structure to the evaluation routine. Put the length into the data structure instead of depending on the user to resupply it each time.

This reminds me of the computed GOTO statement. In Fortran 66, it was the programmer's responsibility to be sure that the selection value was in range. Some compilers tested it and branched to the next statement for out-of-range values, but the standard didn't require that. I believe that was changed to require the test in Fortran 77. It seems that sometime between 1966 and 1977 the expected level of testing changed...

-- glen
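[One way glen's matching-values-plus-length suggestion could look in modern Fortran; the type, module, and constant names are illustrative assumptions:]

```fortran
! Sketch: store the length and a matching pair of "magic" values inside
! the spline data structure, and verify them cheaply at evaluation time.
module spline_check
  implicit none
  integer, parameter :: MAGIC = 271828

  type :: spline_data
     integer :: magic_head = 0      ! set to MAGIC by the setup routine
     integer :: n = 0               ! length, so the caller need not resupply it
     real, allocatable :: coef(:,:)
     integer :: magic_tail = 0      ! must match magic_head
  end type spline_data

contains

  logical function spline_ok(s)
    type(spline_data), intent(in) :: s
    ! Cheap scalar tests catch an uninitialized or wrong structure
    ! without rescanning the whole coefficient array on every call.
    spline_ok = (s%magic_head == MAGIC) .and. &
                (s%magic_tail == MAGIC) .and. &
                (s%n > 1)
  end function spline_ok

end module spline_check
```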
From: deltaquattro on 5 Feb 2010 07:52

On 4 Feb, 23:05, glen herrmannsfeldt <g...(a)ugcs.caltech.edu> wrote:

> As a first approximation, you should test user-supplied values
> thoroughly, and be less strict with data supplied by other parts of
> your program.
>
> [spline example and sentinel-value suggestions snipped]
>
> It seems that sometime between 1966 and 1977 the expected level
> of testing changed...
>
> -- glen

Hi guys, thank you all for the replies. This group is a treasure trove of Fortran wisdom, as always :)

I hadn't thought of using a Debug parameter and/or optional input/output arguments for this issue -- very handy suggestions. Dividing the checks between checks on user-supplied data and checks on data generated by other code is also interesting; my case falls under the second category, so in the end the check stays where it is, in LinearInterp.

Thanks again, Best Regards

deltaquattro
From: George White on 6 Feb 2010 10:02
On Thu, 4 Feb 2010, Ron Shepard wrote:

> There isn't a universal answer to this. If the checks are relatively
> inexpensive (time, memory, cache, etc.), then you should add them in the
> low level routine so they are always active. If they are expensive, or
> in the gray area between, then you have options. [...]

If the chances of bad values are small, sometimes it is better to just go ahead and compute garbage, then catch the bad values outside the routine. Many functions are designed to return a special value in case of errors; e.g., a function whose value should be positive can return negative values to signal various types of errors.

When using lookup tables it is often simple to add two extra bins for the out-of-range values and adjust the code to map out-of-range values to those bins. This is generally much easier to handle in parallel processing than putting error handling in low-level code -- you end up with 99 values that were computed quickly but the program stalls waiting on the one value that triggered an error handler. This has been a problem with some implementations of basic math libraries, where exp(-bignum) triggers a slow handler to decide whether to underflow and return 0 or raise an exception.

> The same issue applies also within the routine regarding what to do if
> it detects an error. Should it return an error code, or should it abort
> internally? You can write a single routine with an optional return
> argument that handles both situations. If the argument is present, then
> it can be set with an appropriate error code and return to the caller,
> otherwise the routine can abort on the spot.

In some situations it is useful to keep a count of the errors so you can provide a table. In my work (remote sensing) you have data sets with >>10^6 records, many with missing/invalid data. You want to compute what makes sense for each record, and keep statistics for the various data problems so you can generate a summary table. The SLATEC xerror package provides this capability.

-- George White <aa056(a)chebucto.ns.ca> <gnw3(a)acm.org>
189 Parklea Dr., Head of St. Margarets Bay, Nova Scotia B3Z 2G6
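[George's extra-bin idea might be sketched like this; the bin layout and names are illustrative assumptions, not from his post:]

```fortran
! Sketch: clamp out-of-range lookups into two dedicated under/overflow
! bins instead of branching into an error handler, so a hot loop (or a
! parallel one) never stalls on a single bad value.
program extra_bins
  implicit none
  integer, parameter :: NBINS = 10
  ! bins 0 and NBINS+1 collect below-range and above-range values
  integer :: count(0:NBINS+1)
  real :: lo, hi, width, x(6)
  integer :: i, k

  lo = 0.0; hi = 1.0; width = (hi - lo) / NBINS
  count = 0
  x = [ -0.5, 0.05, 0.55, 0.95, 1.5, 0.25 ]

  do i = 1, size(x)
     k = int((x(i) - lo) / width) + 1
     k = max(0, min(NBINS + 1, k))   ! out-of-range maps to bin 0 or NBINS+1
     count(k) = count(k) + 1
  end do

  ! the edge bins double as the error-count table George mentions
  print *, 'below range:', count(0)
  print *, 'above range:', count(NBINS+1)
end program extra_bins
```

After the loop, the two edge bins give exactly the per-problem statistics needed for a summary table, with no branching inside the loop body.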