Conversion of data & associated logic from ISAM to RDB [Cobol]

Prev: How would you do this?
Next: COBOL dynamic allocation (putenv) of output-file won't release extra space when closed

From: DaveM on 3 Apr 2007 14:55

We are using Micro Focus Net Express 4.0 and Microsoft SQL Server
2000. The concepts/examples we are seeking however can be more
generic, i.e., not necessarily shown within COBOL source code per se.

Right now we have two realized problems, both of which appear to stem
from our mutual and still-thriving ignorance......

The first problem is as follows: We are having surprising difficulty
in our attempts to find working examples of SQL-related code sequences
for handling low-volume user-entered updates vs large-volume batch
updates. The working examples we need can be represented as pseudo-
code and/or actual code - we don't really care which - we just need
something representative to work from. (See fictitious example
enclosed)

The second problem involves how to handle record locking issues among
multiple users. The lead analyst wants us to code logic that requires
maintenance of a date-time field in every record - this date-time
stamp would then be used for determining the availability of a given
record such that unilaterally-applied changes are not given an
opportunity to sneak in while another user has said record in a state
of flux. I'll spare you the further gory details of this terrifying
scheme for now, but suffice it to say we do not like it because it
seems that we would end up re-inventing the wheel, given that the rdb
is supposed to have various locking detection/tools already built into
it.

Here is a simple/fictitious representation of the type of code
sequence samples that we are looking for...

Typical LOW-VOLUME USER UPDATE module:
1. Open rdb
2. EXEC SQL WHENEVER SQLERROR DO sql_error;
3. Accept record key from user
4. Read matching record w/ shared lock (presume REC-FOUND for this
example)
5. Display fields on screen
6. Accept field updates from user
7. Edit field updates (presume EDIT-PASSED for this example)
8. BEGIN TRANSACTION
9. Read record from table with exclusive lock
10. Move new field values to table
11. Rewrite table record
12. COMMIT
13. END TRANSACTION
14. Close rdb

Typical HIGH-VOLUME BATCH UPDATE module:
1. Open rdb
2. ~?~?~
3. ~?~?~
4. ~?~?~

Perhaps there is a site somewhere that includes sql-related coding
examples? I appreciate in advance any input that anyone may have
about how we should be approaching this data conversion effort.

Sincerely,
Dave Miner

From: Anonymous on 3 Apr 2007 21:13

In article <1175626528.510733.311750(a)n59g2000hsh.googlegroups.com>,
DaveM <renfrew76(a)xemaps.com> wrote:
>We are using Micro Focus Net Express 4.0 and Microsoft SQL Server
>2000. The concepts/examples we are seeking however can be more
>generic, i.e., not necessarily shown within COBOL source code per se.
>
>Right now we have two realized problems, both of which appear to stem
>from our mutual and still-thriving ignorance......
>
>The first problem is as follows: We are having surprising difficulty
>in our attempts to find working examples of SQL-related code sequences
>for handling low-volume user-entered updates vs large-volume batch
>updates.

That's interesting... where have you looked for these, so that others here
might not duplicate your efforts?

>The working examples we need can be represented as pseudo-
>code and/or actual code - we don't really care which - we just need
>something representative to work from. (See fictitious example
>enclosed)

I'll do that.

>
>The second problem involves how to handle record locking issues among
>multiple users. The lead analyst wants us to code logic that requires
>maintenance of a date-time field in every record - this date-time
>stamp would then be used for determining the availability of a given
>record such that unilaterally-applied changes are not given an
>opportunity to sneak in while another user has said record in a state
>of flux.

How interesting... this is a time-honored technique that I first heard
described by someone who worked on one of the original airline reservation
systems.

Things have changed a bit since then... your lead analyst seems to want to
apply techniques for indexed files to a database; this has, in my
experience, usually resulted in disappointment for the system's users and
those who maintain the code.

>I'll spare you the further gory details of this terrifying
>scheme for now, but suffice it to say we do not like it because it
>seems that we would end up re-inventing the wheel, given that the rdb
>is supposed to have various locking detection/tools already built into
>it.

I am not sure about Microsoft SQL Server 2000 but I know that Oracle has
some pretty good internals to avoid deadlocks; I suggest that someone dig
into the appropriate manual and present the necessary pages to the lead
analyst.

>
>Here is a simple/fictitious representation of the type of code
>sequence samples that we are looking for...
>
>Typical LOW-VOLUME USER UPDATE module:
>1. Open rdb
>2. EXEC SQL WHENEVER SQLERROR DO sql_error;
>3. Accept record key from user
>4. Read matching record w/ shared lock (presume REC-FOUND for this
>example)
>5. Display fields on screen
>6. Accept field updates from user
>7. Edit field updates (presume EDIT-PASSED for this example)
>8. BEGIN TRANSACTION
>9. Read record from table with exclusive lock
>10. Move new field values to table
>11. Rewrite table record
>12. COMMIT
>13. END TRANSACTION
>14. Close rdb

Hmmmmm... is there anyone on this particular job who knows the difference
between a 'record' and a 'row'?

>
>
>Typical HIGH-VOLUME BATCH UPDATE module:
>1. Open rdb
>2. ~?~?~
>3. ~?~?~
>4. ~?~?~

2. Do a bunch of stuff.
3. Close rdb
4. Get promoted before this comes back to fasten its teeth in one's
gluteals.

>
>
>Perhaps there is a site somewhere that includes sql-related coding
>examples? I appreciate in advance any input that anyone may have
>about how we should be approaching this data conversion effort.

My suggestion is that you find people who know what they are doing and pay
them a lot of money to do it while you train the on-site staff to deal
with the new technology.

DD

From: Pete Dashwood on 4 Apr 2007 00:13

"DaveM" <renfrew76(a)xemaps.com> wrote in message
news:1175626528.510733.311750(a)n59g2000hsh.googlegroups.com...
> We are using Micro Focus Net Express 4.0 and Microsoft SQL Server
> 2000. The concepts/examples we are seeking however can be more
> generic, i.e., not necessarily shown within COBOL source code per se.
>
> Right now we have two realized problems, both of which appear to stem
> from our mutual and still-thriving ignorance......
>
> The first problem is as follows: We are having surprising difficulty
> in our attempts to find working examples of SQL-related code sequences
> for handling low-volume user-entered updates vs large-volume batch
> updates. The working examples we need can be represented as pseudo-
> code and/or actual code - we don't really care which - we just need
> something representative to work from. (See fictitious example
> enclosed)
>
Fair enough.

> The second problem involves how to handle record locking issues among
> multiple users. The lead analyst wants us to code logic that requires
> maintenance of a date-time field in every record - this date-time
> stamp would then be used for determining the availability of a given
> record such that unilaterally-applied changes are not given an
> opportunity to sneak in while another user has said record in a state
> of flux. I'll spare you the further gory details of this terrifying
> scheme for now, but suffice it to say we do not like it because it
> seems that we would end up re-inventing the wheel, given that the rdb
> is supposed to have various locking detection/tools already built into
> it.

Yes, this was fairly normally practice once upon a time. I remember using it
in 1978 on the first IBM 3790 deployment in the UK. This was to be a
"distributed application" that was to be "Networked" (Cutting edge stuff at
the time... We were given a very thick Assembler manual and told to learn
it. A week later we were writing applications, which had to be Assembled and
run on a mainframe because there was no hardware available yet :-)). There's
nothing wrong with date/timestamping rows on an RDB, even today, but not for
the reasons your Lead Analyst wants to do so. Nowadays it serves as an audit
trail, rather than a locking or rollback/recovery device.

You are absolutely correct in that your DBMS (SQL Server) can manage its own
transaction isolation, rollback and recovery. Perhaps the Lead Analyst needs
to do a quick course on modern Database Management?

>
> Here is a simple/fictitious representation of the type of code
> sequence samples that we are looking for...
>
> Typical LOW-VOLUME USER UPDATE module:
> 1. Open rdb

That would be a CONNECT...

> 2. EXEC SQL WHENEVER SQLERROR DO sql_error;
> 3. Accept record key from user
> 4. Read matching record w/ shared lock (presume REC-FOUND for this
> example)

Despite the ISAM terminology, you are simply requesting a row. Don't worry
too much about the locking; the advantage of a DBMS is that it takes much of
this concern off you, and, at this point it doesn't matter anyway.

> 5. Display fields on screen
> 6. Accept field updates from user
> 7. Edit field updates (presume EDIT-PASSED for this example)
> 8. BEGIN TRANSACTION (ONLY if it is a distributed transaction and will use
> MS Transact-SQL to access distributed servers)
> 9. Read record from table with exclusive lock

Get a row, with update intent.

> 10. Move new field values to table
> 11. Rewrite table record

That would be an UPDATE... :-)

> 12. COMMIT
> 13. END TRANSACTION (ONLY if it is a distributed transaction and will use
> MS Transact-SQL to access distributed servers)
> 14. Close rdb

No, we don't close the database; other people are using it... :-) Instead we
might DISCONNECT from it.
>
>
> Typical HIGH-VOLUME BATCH UPDATE module:
> 1. Open rdb
> 2. ~?~?~
> 3. ~?~?~
> 4. ~?~?~
>

In fact, you could use almost exactly the same algorithm you did above for
the transaction bit, but without the connection and disconnection, so....

1. establish a connection.
2. start a transaction. (ONLY if it is a distributed transaction and will
use MS Transact-SQL to access distributed servers)
3. read your batch input and get a key and data for the update.
These must be loaded into Host Variables in your WORKING-STORAGE SECTION
(See DECLARE in your SQL manual)
Set a count somewhere of the records you have read. You will use this to
issue a COMMIT after say, 500 updates.

4. issue an UPDATE something like this:

(Sample of COBOL with embedded SQL...)

PROCEDURE DIVISION.
....
EXEC SQL
UPDATE ourTable
SET dataColumn1 = :field-1 (from the batch record. Note that Host
Variables start with a special character which can
vary across
environments. Try using a colon...)
dataColumn2 = :field-2
dataColumn3 = :field-3
dataColumnN = :field-N
...
WHERE ourTableKey = :Batch-record-key

END-EXEC

if function REM (input-rec-count 500) = zero
EXEC SQL
COMMIT WORK (OR COMMIT TRANSACTION if you are running
distributed servers using Transact-SQL)
END-EXEC
(Start a new transaction at this point if you are accessing
distributed servers. If you are not, SQL Server automatically
assigns a transaction where one is needed and you don't need to worry about
it.)
end-if

....

5. When you have hit EOF on your batch file, do the final COMMIT and then
DISCONNECT.

>
> Perhaps there is a site somewhere that includes sql-related coding
> examples?

Most people are a bit cagey about publicising their code :-)

Try the following:

http://developer.mimer.com/interfaces/interface_5.htm
http://www.pdc.kth.se/doc/SP/manuals/db2-5.0/html/db2a0/db2a002.htm

> I appreciate in advance any input that anyone may have
> about how we should be approaching this data conversion effort.

Now THAT's an entirely different matter... If you are looking for free
strategic advice, what you get is probably worth the price...:-)

Having done a number of successful conversions from COBOL file system to
RDBMS, built tools to automatically analyse COBOL File Definitions and
generate RDB equivalents in third normal form with all repeating groups,
foreign keys, constraints and indexes carried over, and having worked with
RDB since the second IBM course on them in Reading, England, in 1983, I have
a nodding acquaintance with the problems you are facing.

(However, I have never loaded 60 million records to a database so I must
state that caveat up front...)

If you were to ask for my advice (and pay for it) I can tell you that I
could not, in all conscience, recommend what you are doing.

Given the stated environment (SQL Server) you are utilising a tiny fraction
of what is available to you. It's a bit like buying a Ferrari and never
getting out of first gear...

Embedded SQL is not the way to go.

Have a look at ADO.Net (this is not the same as ADO; it is light years
ahead). You would do much better embedding ADO.Net calls against your SQL
Server DB. This allows data and table adapters, automatic binding to
datasources, processing result sets with a single command, and manipulating
SETS of data rather than a row at a time.It is also makes MUCH less
connection demand on the server, so overall throughput is improved.)

Here's an overview:
http://www.developer.com/net/vb/article.php/10926_1540311_1

(Ironically, because it can employ Reader objects, this approach MAY use
OPEN, BEGIN... and CLOSE... :-))

I have not used it from COBOL (I use C#), but I see no reason why it
couldn't be called from COBOL as a normal COM server. I must have a go at
this when I get some time...:-)

However, given the realities of your current situation, embedded SQL is
probably what will happen.

If you need help with it, post here.

Good luck.

Pete.

From: DaveM on 4 Apr 2007 14:45

Pete,

Your response was VERY helpful, and I am grateful to you for taking
the time and patience to put it together. We will check out the sites
you recommend, and I will pass along all of your suggestions and
examples to my team as well as management.

Also, please forgive me if I misspoke in my final para, as I would
never presume the right to ask anyone for the favor of writing me up
an entire detailed game plan on how we should carry out the whole
conversion project. The kind of information that I meant to ask for
is what you have indeed since given me, and I thank you!

Dave Miner

From: DaveM on 4 Apr 2007 15:00

On Apr 3, 9:13 pm, docdw...(a)panix.com () wrote:
> In article <1175626528.510733.311...(a)n59g2000hsh.googlegroups.com>,
>
> DaveM <renfre...(a)xemaps.com> wrote:
> >We are using Micro Focus Net Express 4.0 and Microsoft SQL Server
> >2000. The concepts/examples we are seeking however can be more
> >generic, i.e., not necessarily shown within COBOL source code per
> >se.
>
> >Right now we have two realized problems, both of which appear to
> >stem from our mutual and still-thriving ignorance......
>
> >The first problem is as follows: We are having surprising
> >difficulty in our attempts to find working examples of SQL-
> >related code sequences for handling low-volume user-entered
> >updates vs large-volume batch updates.
>
> That's interesting... where have you looked for these, so that
> others here might not duplicate your efforts?
>

If I'd kept a log of every single place I've been to while seeking
this information and then posted it here, as you seem to be kindly
requesting now, I am afraid that my inquiry would have become entirely
too enormous for most folks to bother with. Lets just say that I have
looked everywhere, with the obvious exception of those places where
the answers I am looking for are actually being kept.

The gist of what I am currently seeking should exist mainly within the
minds of many of those who frequent this newsgroup, hence my inquiry.
Forgive me for not making it clearer, but I am not asking anyone to go
out on a google tour on my behalf; I am only asking for information
relevant to people's own real-world experiences such that we might
gain a better perspective about the options we are faced with.

> >The working examples we need can be represented as pseudo-
> >code and/or actual code - we don't really care which - we
> >just need something representative to work from. (See
> >fictitious example enclosed)
>
> I'll do that.
>
>
>
> >The second problem involves how to handle record locking issues
> >among multiple users. The lead analyst wants us to code logic
> >that requires maintenance of a date-time field in every record
> > - this date-time stamp would then be used for determining the
> >availability of a given record such that unilaterally-applied
> >changes are not given an opportunity to sneak in while another
> >user has said record in a state of flux.
>
> How interesting... this is a time-honored technique that I first
> heard described by someone who worked on one of the original
> airline reservation systems.
>
> Things have changed a bit since then... your lead analyst seems
> to want to apply techniques for indexed files to a database;
> this has, in my experience, usually resulted in disappointment
> for the system's users and those who maintain the code.

The handling of record (or I guess I should now say ROW) locking
conflicts, esp among multiple users, are supposed to be handled
primarily by functions that are internal to the rdb itself.
Regardless of how time-honored a given technique may be, it seems to
me that by going to the trouble of coding our own locking handler we
will only end up sidestepping what the rdb is designed to take care of
for us, and all in exchange for a manual (and inferior) version of
that capability.

This is rather like harnessing a team of mules up to a tractor to plow
the field. Sure, its possible, but we'd be foolishly wasting the very
reasons that we'd paid extra money to buy the damn tractor in the
first place.

> >I'll spare you the further gory details of this terrifying
> >scheme for now, but suffice it to say we do not like it because
> >it seems that we would end up re-inventing the wheel, given that
> >the rdb is supposed to have various locking detection/tools
> >already built into it.
>
> I am not sure about Microsoft SQL Server 2000 but I know that
> Oracle has some pretty good internals to avoid deadlocks; I
> suggest that someone dig into the appropriate manual and
> present the necessary pages to the lead analyst.
>

Deadlocks per se are not of particular concern to us, as these are
supposed to be automatically handled by the rdb. What we are trying
to learn about are the protocols of dealing with wait-locks, time-
outs, and prevention of unilateral changes etc.

>
>
>
>
> >Here is a simple/fictitious representation of the type of code
> >sequence samples that we are looking for...
>
> >Typical LOW-VOLUME USER UPDATE module:
> >1. Open rdb
> >2. EXEC SQL WHENEVER SQLERROR DO sql_error;
> >3. Accept record key from user
> >4. Read matching record w/ shared lock (presume REC-FOUND
> > for this example)
> >5. Display fields on screen
> >6. Accept field updates from user
> >7. Edit field updates (presume EDIT-PASSED for this example)
> >8. BEGIN TRANSACTION
> >9. Read record from table with exclusive lock
> >10. Move new field values to table
> >11. Rewrite table record
> >12. COMMIT
> >13. END TRANSACTION
> >14. Close rdb
>
> Hmmmmm... is there anyone on this particular job who knows
> the difference between a 'record' and a 'row'?

Yes, we do. Pseudo-code, at least within our organization, is written
for the purpose of communicating ideas and logic flow. Given that you
figured out that my usage of 'record' should have instead been
expressed as 'row', then this pseudo-code has apparently accomplished
its purpose. :)

In any case, point taken.

>
> >Typical HIGH-VOLUME BATCH UPDATE module:
> >1. Open rdb
> >2. ~?~?~
> >3. ~?~?~
> >4. ~?~?~
>
> 2. Do a bunch of stuff.
> 3. Close rdb
> 4. Get promoted before this comes back to fasten its teeth
> in one's gluteals.
>
>
>
> >Perhaps there is a site somewhere that includes sql-related
> > coding examples? I appreciate in advance any input that
> > anyone may have about how we should be approaching this
> > data conversion effort.
>
> My suggestion is that you find people who know what they are
> doing and pay them a lot of money to do it while you train the
> on-site staff to deal with the new technology.
>
> DD

Your final suggestion is excellent but I am afraid it is not
practicable because I do not control any of the purse strings. We
have no choice but to work with this analyst, along with a rather
tight-fisted management team who is unwilling to spring for the cost
of formal training. Complaints I have aplenty, of course, but that
won't resolve anything. The only realistic and proactive plan we have
at this point is to continue doing what we are doing, namely, to
research usenet groups and knowledge bases, to ask questions, read
manuals, ask more questions, setup & run tests, analyze results,
borrow or buy additional manuals, and humbly beg somebody in the real
world to lead us to some working examples.

Thank you for your help.

Dave Miner

| Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10 11
Prev: How would you do this?
Next: COBOL dynamic allocation (putenv) of output-file won't release extra space when closed