From: Alistair on 3 Feb 2010 12:54

On Feb 3, 4:16 pm, SomeGuy <jimgr...(a)nc.rr.com> wrote:
> On Feb 3, 7:01 am, Alistair <alist...(a)ld50macca.demon.co.uk> wrote:
> > On Feb 2, 9:11 pm, SomeGuy <jimgr...(a)nc.rr.com> wrote:
> > > On Feb 2, 7:18 am, Alistair <alist...(a)ld50macca.demon.co.uk> wrote:
> > > > On Feb 1, 9:43 pm, Richard <rip...(a)Azonic.co.nz> wrote:
> > > > > On Feb 2, 6:11 am, SomeGuy <jimgr...(a)nc.rr.com> wrote:
> > > > > > Need to identify some database files used by a PC COBOL program
> > > > > > written in the mid-90's. The extensions are .DB and .IDX. Given the
> > > > > > date, language and OS, are there any candidates you can think of? I
> > > > > > can send a sample of the files if that would help.
> > > > > >
> > > > > > Thanks,
> > > > > > Jim
> > > > >
> > > > > The .DB is probably a user choice. The .IDX is most likely an index
> > > > > file for the .DB. If the first two bytes of the .IDX is 0xFE53 then it
> > > > > is probable that these are MicroFocus LevelII/CISAM format indexed
> > > > > files.
> > > > >
> > > > > The first block of the .IDX should have further information giving
> > > > > record length and key information (size and start position).
> > > > >
> > > > > If the files are LevelII/CISAM then the data records in the .DB will
> > > > > be fixed length with CR/LF record terminators. Other formats may have
> > > > > variable length records with record headers and/or may have compressed
> > > > > data.
> > > > >
> > > > > Without an FD entry you are unlikely to be able to know what the data
> > > > > fields are or even where they start/end within the record.
> > > >
> > > > Am I the only person who remembers dBase files which (IIRC) were
> > > > suffixed .DB?
> > > >
> > > > Since the .DB files are unlikely to contain cobol specific data items
> > > > then importing the flat files in to MS Access would be an option. It
> > > > would require some understanding of data formats and intelligent
> > > > guessing of layouts. Not too difficult, even I have done that in the
> > > > past.
> > >
> > > How would one go about guessing the layout of a COBOL-generated file
> > > for which you know next to nothing about the layout? Note that,
> > > unlike DBase files, I can discern no field descriptors (name, type,
> > > start, length, etc...) in the file.
> > >
> > > Thanks.
> >
> > The same way that I went about guessing the contents of a VDF (Visual
> > Data Flex) file: look at the reports or screens produced/used and tie
> > their contents to the data in the file. It takes a bit of brain
> > processing and is not guaranteed to be 100% foolproof especially if
> > you don't know the data formats available to the database.
> >
> > If you don't have the file layouts and probably you don't have report
> > or screen shots then you probably won't be able to resolve the issue.
>
> We do have screen shots and output, but I can't imagine that approach
> would be economical for this project.
>

It worked for me on a project with about 8 database files. It was a
right pain as the compression of the data resulted in rubbish
characters in the data stream. I was faced with a lot of manual
editing.
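The two checks Richard describes in the quoted text (the 0xFE53 marker at the start of the .IDX, and fixed-length CR/LF-terminated records in the .DB) can be tried mechanically before anyone starts guessing layouts. What follows is only a rough sketch in Python, not a description of the real Micro Focus format: the function names are made up for illustration, the marker is assumed to be stored as the byte 0xFE followed by 0x53, and the .DB is assumed to have no file header in front of the first record.

```python
from typing import Optional

# Marker value quoted in the thread; the on-disk byte order is an assumption.
LEVEL2_INDEX_MAGIC = bytes([0xFE, 0x53])


def looks_like_level2_index(idx_bytes: bytes) -> bool:
    """Return True if the .IDX data starts with the bytes 0xFE 0x53."""
    return idx_bytes[:2] == LEVEL2_INDEX_MAGIC


def guess_crlf_record_length(db_bytes: bytes, max_len: int = 4096) -> Optional[int]:
    """Return the smallest N such that every N-byte slice ends in CR/LF.

    N includes the two terminator bytes. Returns None when no length up to
    max_len divides the data evenly with CR/LF at the end of every slice.
    """
    size = len(db_bytes)
    for n in range(3, min(max_len, size) + 1):
        if size % n:
            continue  # the data is not a whole number of N-byte records
        if all(db_bytes[i + n - 2:i + n] == b"\r\n" for i in range(0, size, n)):
            return n
    return None
```

In use, read both files in binary mode (for example `open(path, "rb").read()`) and pass the bytes in. A None result from `guess_crlf_record_length` does not rule anything out; it only means the simple fixed-length assumption did not hold for that file.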
From: James J. Gavan on 3 Feb 2010 17:37

>>SomeGuy wrote:
>>
>>
>>pic 9(04)v9(02) comp-3 (contains 012345) shows as '0123.45' in the
>>display dialog. BUT it doesn't give you a column-header description, nor
>>when you look at the size does it indicate whether or not the field was
>>specified as :-
>>
>>- pic 9(04).9(02), pic 9(04)v9(02) or pic 9(04)v9(02) comp-3, and
>>depending upon the compiler, other variations on the 'comp-3', such as
>>comp-1, comp-5.
>>
>>The only thing it specifically does, in the case of ISAM, is define the
>>positioning of the PrimeKey and any Alternate Keys, such as :-
>>
>>- PrimeKey (1:20) Alt-Key-1 (30:40) Alt-Key-2 (78:20)
>>
>>I drafted something, but I don't think I sent it. Either the end-user
>>has got to show you what they have, from reports, (which will still
>>entail you doing a lot of messing about to extract the data), or bite
>>the bullet, and for an acceptable fee get the file formats from the
>>original developers - *IF* they will sell them to you !
>>
>>Did you google on COBOL data conversions, or look at the COBOL FAQ for
>>help ?
>>
>
> To be honest, never having used COBOL, I didn't really follow
> everything you posted. I looked at the Net Express website but
> couldn't find information about the DFE. I did try the Siber Systems
> Data Viewer. It manages to report a lot of columns, many of them with
> legitimate-looking values.

Doesn't surprise me; no doubt you could clearly see text fields such as :-

01 CustomerRecord.
   05 CustomerKey      pic x(05).
   05 CustomerName     pic x(40).   ---> 'Encana Construction Ltd..................'

or Usage Display values :-

   05 GSTorVatNumber   pic 9(10).   ----> '8282233344'

Your gibberish will be (which accounts for "GUESSED-535-1" ):-

   05 SalesThisWeek    pic s9(04)v9(02) comp-3.
   05 SalesYTD         pic s9(10)v9(02) comp-3.

> But the rest have basically gibberish and
> all have only generated names (like "GUESSED-535-1").
>
> Another thought: is there a tool that will scan COBOL source and
> produce a report (copybook? FDD?) with the layout? If so, perhaps I
> can give the tool to the client to run for me (assuming they have
> source).
>

No, there just aren't such tools; as previously indicated, a limited
amount of information is held in the File Header record, primarily to
do with sizing and accessing, but the real answer is in the record
formats.

Even assuming you are very proficient at 'bit-fiddling', and given you
had the record layouts, you've still got to locate the fields by size
and translate them into numeric values that you want. (Seeing as Bob
has just recently done a low key sales pitch :-), check out Flexus.com.
They have a document from Michael Mattias, a contributor here,
explaining COBOL binary fields).

I can assure you GIVEN a COBOL programmer had BOTH the record layouts,
and the appropriate compiler, (We're still into assuming it's Micro
Focus), it wouldn't take too much time to knock out a conversion per
file. (Which was what DD was alluding to). The steps are :-

1 - create a COBOL source that contains the copyfiles for the FD and
    Record layouts, which Alistair and I mentioned.
2 - above includes the record layout additionally for a CSV file
3 - Open Your file as input and the CSV file as Output
4 - read records sequentially and just move the input fields to text
    and non-binary fields as appropriate.

01 CSV-Record.
   05 CustomerKey      pic x(06).
   05                  pic x value ",".
   05 CustomerName     pic x(40).
   05                  pic x value ",".
   05 GSTorVatNumber   pic 9(10).
   05                  pic x value ",".
   05 SalesThisWeek    pic -9999.99.         *> shows +/-
   05                  pic x value ",".
   05 SalesYTD         pic -9999999999.99.   *> shows +/-
   05                  pic x value ",".
   05                  pic x value ",".
   05 etc....

It is not a challenging exercise; I've done it moving RM/COBOL data
files to Micro Focus; but in my case I additionally had the advantage
of bit routines from Micro Focus, to convert RM binaries to M/F format.

Why my Two Step approach ?
Without going into details, I took the opportunity to enhance the
application, so wrote to the output CSVs, which meant incoming Record-A
might finish up as output Records B and C, particularly when I got into
(R)DBMS and SQL. Bear in mind I had the compiler for RM/COBOL as well,
and MOST IMPORTANTLY, having programmed the application in RM, I also
had the RM RECORD FORMATS !

Jimmy, Calgary AB
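For anyone without a COBOL compiler to hand, the same read-and-move conversion can be sketched in Python once the record layout is known. This is only an illustration under stated assumptions, not Jimmy's program: the byte offsets below are hypothetical and are derived from the example CustomerRecord in this post (x(05), x(40), 9(10), then a 4-byte and a 7-byte comp-3 field, 66 bytes in all), the text is assumed to be ASCII, and the records are assumed to sit back to back with no record headers or terminators. The comp-3 decoding follows the usual packed-decimal convention of two digits per byte with the sign in the last half-byte.

```python
import csv
import io
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int) -> Decimal:
    """Decode a packed-decimal (comp-3) field with `scale` implied decimals.

    Each byte holds two 4-bit digits; the final half-byte is the sign
    (0xD or 0xB means negative, anything else is treated as non-negative).
    """
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign_nibble = nibbles.pop()
    if any(digit > 9 for digit in nibbles):
        raise ValueError("not a packed-decimal field")
    value = int("".join(str(digit) for digit in nibbles) or "0")
    if sign_nibble in (0x0B, 0x0D):
        value = -value
    return Decimal(value).scaleb(-scale)


def record_to_row(rec: bytes) -> list:
    """Split one 66-byte record into CSV-ready strings (hypothetical offsets)."""
    return [
        rec[0:5].decode("ascii").rstrip(),    # CustomerKey    pic x(05)
        rec[5:45].decode("ascii").rstrip(),   # CustomerName   pic x(40)
        rec[45:55].decode("ascii"),           # GSTorVatNumber pic 9(10)
        str(unpack_comp3(rec[55:59], 2)),     # SalesThisWeek  s9(04)v9(02) comp-3
        str(unpack_comp3(rec[59:66], 2)),     # SalesYTD       s9(10)v9(02) comp-3
    ]


def records_to_csv(data: bytes, record_length: int = 66) -> str:
    """Read fixed-length records sequentially and return them as CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for start in range(0, len(data) - record_length + 1, record_length):
        writer.writerow(record_to_row(data[start:start + record_length]))
    return out.getvalue()
```

The point of the sketch is the one made above: none of the offsets or field types can be read off the data file itself, so without the record layouts the slicing in `record_to_row` is guesswork.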
From: Michael Wojcik on 4 Feb 2010 11:02

Alistair wrote:
>
> The same way that I went about guessing the contents of a VDF (Visual
> Data Flex) file: look at the reports or screens produced/used and tie
> their contents to the data in the file. It takes a bit of brain
> processing and is not guaranteed to be 100% foolproof especially if
> you don't know the data formats available to the database.
>
> If you don't have the file layouts and probably you don't have report
> or screen shots then you probably won't be able to resolve the issue.

In other words, this is a forensic exercise. It's impossible to
reconstruct the data format with guaranteed complete accuracy in the
general case, and difficult in many specific cases. You'd need to
perform a cost/benefit analysis to determine how much effort is
reasonable to expend on it.

--
Michael Wojcik
Micro Focus
Rhetoric & Writing, Michigan State University
From: Michael Wojcik on 4 Feb 2010 10:59

SomeGuy wrote:
> On Feb 2, 8:23 am, Fred Mobach <f...(a)mobach.nl> wrote:
>> Did you already try to use the file command ? See :http://www.darwinsys.com/file/
>
> Never heard of it before, but just tried online at http://swoag.webhop.org/
> (which per Wikipedia uses it internally). Reports both the DB and IDX
> as "data". Thanks.

The standard Unix file command does not contain information about file
types. Instead, it uses a side file, /etc/magic, which describes
heuristic identifiers for various kinds of files.

Originally, /etc/magic was just a list of "magic cookie" values that
appeared as the first few bytes of a handful of specific file types on
various Unix implementations - executables, shell scripts, archive
libraries, etc. Later implementations of the file command and
/etc/magic are more sophisticated, and entries in /etc/magic can be
fairly complicated rules (along the lines of search for this regular
expression, then get the value of this byte at this offset from the
match, and so on).

So just using some random implementation of the file command doesn't
guarantee that you're using one with a very comprehensive /etc/magic
file. And many ISVs add entries to /etc/magic as part of product
installation, to recognize the particular file types their code
generates. Since http://swoag.webhop.org/ provides no information (that
I could find) about what implementation it's using, who knows if it's
any good?

Of course, this won't help you identify the files in question, unless
you want to go around trying different file implementations (which you
probably don't).

Incidentally, file implementations are available for Windows, for
example as part of Cygwin. (For the record, the Cygwin /etc/magic
doesn't recognize MF ISAM files or their index files as anything but
"data". I'm not sure there *is* anything in MF ISAM files that can be
used to distinguish them.) While they may not help in this particular
case, they can be useful in others.

--
Michael Wojcik
Micro Focus
Rhetoric & Writing, Michigan State University
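To make the "magic cookie" idea concrete, here is a toy version of the original scheme in Python: a table of (offset, bytes, description) rules checked in order, with "data" as the fallback when nothing matches. It is a sketch of the idea only and does not reproduce the real /etc/magic syntax or any actual file implementation; the rule table and the `identify` function are invented for illustration. The three built-in signatures (ELF, ar archives, "#!" scripts) are well-known ones.

```python
# Each rule: (offset into the file, bytes expected there, description).
MAGIC_RULES = [
    (0, b"\x7fELF", "ELF executable or object"),
    (0, b"!<arch>\n", "ar archive library"),
    (0, b"#!", "script with interpreter line"),
]


def identify(data: bytes, rules=MAGIC_RULES) -> str:
    """Return the description of the first matching rule, else "data"."""
    for offset, magic, description in rules:
        if data[offset:offset + len(magic)] == magic:
            return description
    return "data"


# An ISV-style extension: append a product-specific rule to the table.
# The 0xFE 0x53 value is the one Richard mentioned for the .IDX; treating
# it as a reliable signature is purely hypothetical.
EXTENDED_RULES = MAGIC_RULES + [(0, b"\xfe\x53", "possible LevelII index (guess)")]
```

With the default table an unknown binary file comes back as "data", which is exactly the answer the online tool gave; whether a file is recognized depends entirely on which rules happen to be in the table.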
From: Alistair on 5 Feb 2010 09:45
On Feb 4, 4:02 pm, Michael Wojcik <mwoj...(a)newsguy.com> wrote:
> Alistair wrote:
>
> > The same way that I went about guessing the contents of a VDF (Visual
> > Data Flex) file: look at the reports or screens produced/used and tie
> > their contents to the data in the file. It takes a bit of brain
> > processing and is not guaranteed to be 100% foolproof especially if
> > you don't know the data formats available to the database.
> >
> > If you don't have the file layouts and probably you don't have report
> > or screen shots then you probably won't be able to resolve the issue.
>
> In other words, this is a forensic exercise. It's impossible to
> reconstruct the data format with guaranteed complete accuracy in the
> general case, and difficult in many specific cases. You'd need to
> perform a cost/benefit analysis to determine how much effort is
> reasonable to expend on it.
>

The application of a cost benefit analysis is quite a good idea as I
found the effort excessive (but I had very little choice in the
matter). I think SOMEGUY is banging his head against a brick wall (ce
taper la tete contre le mur as they say in Germany) without the
copylibs.