From: David W. Fenton on 25 Feb 2010 13:40

"Allen Browne" <AllenBrowne(a)SeeSig.Invalid> wrote in
news:uSKvaydtKHA.4492(a)TK2MSFTNGP05.phx.gbl:

> While there's some truth in that simple view, the reality is much
> more complex than that. Access uses pointers to store the data. If
> you define a Text field as 255 characters but enter 2 characters
> in it, it does not use up 510 bytes (unicode) of disk space to
> store those 2 characters. Internally, the database keeps track of
> where each field starts in each record. Some fields (memo, OLE,
> attachments, MVFs) are not actually stored in-line with the other
> data in the record either. So it's not a matter of a single record
> being faster to retrieve. There are many other factors at work
> here, including whether the database has been compacted recently,
> whether you are using page- or record-locking, whether the disk is
> badly fragmented, how intelligent the various layers of caching
> are (physical on the drive, or the controller, or the operating
> system, or in Access), the Rushmore algorithms in JET, and so on.
> Then we may have to start another mindset to handle SSD storage
> issues as well.

And none of this considers the issue of disk caching. There's not
that much difference between data loaded into RAM and data being
read from or written to disk, since the latter usually goes through
the caching mechanism and is limited not by the speed of the actual
storage medium but by RAM.

This has been the case in Windows since at least c. 1991-92, when
Windows 3.1 was released with disk caching turned on by default.
Caching was essential for decent performance in Windows, but it also
meant that your databases were sped up by the disk cache, too
(although back then it was largely a read-only improvement, as lazy
writes and the like had not been implemented in the DOS disk cache;
any modern version of Windows, though, caches both reads and
writes).

--
David W. Fenton                  http://www.dfenton.com/
usenet at dfenton dot com    http://www.dfenton.com/DFA/
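A minimal Access/JET SQL sketch of Browne's point about declared
size versus stored size, with hypothetical table and field names
(nothing here comes from the thread itself): declaring a Text field
at 255 characters only sets an upper bound, and the engine stores
variable-length data by pointer rather than padding it out.

    -- Hypothetical table. TEXT(255) is only an upper bound on length:
    -- per the discussion above, JET stores variable-length Text data by
    -- pointer, so a 2-character value does not occupy 510 bytes just
    -- because the field was declared at its maximum size. MEMO data is
    -- stored out of line with the rest of the record.
    CREATE TABLE tblContact (
        ContactID  AUTOINCREMENT,
        LastName   TEXT(255),
        Notes      MEMO,
        CONSTRAINT pkContact PRIMARY KEY (ContactID)
    );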
From: David W. Fenton on 25 Feb 2010 13:45

Dennis <Dennis(a)discussions.microsoft.com> wrote in
news:E7D7C241-A566-4E28-8857-E6744C3C82F2(a)microsoft.com:

> My background is with large commercial insurance companies. All
> of the companies have had relational databases designed to the 3rd
> normal form.

Whether those tables were actually normalized depends on the chosen
definition of the entity being modelled in the table. I would
consider the 200-field table you mention later to be unquestionably
denormalized, even though I know nothing at all about the
content/function of those fields.

That sounds like a table with a bunch of fields that are used only
for a single record type, so that an auto insurance policy has one
set of fields, but a renter's insurance policy has a different set
of fields. Any time you're using some fields for some records and
not for others, it's an indication to me that the entity has been
misdefined and should probably be broken into at least two tables,
with a narrow header table and a long child table, where each row
stores what was formerly a field in the wide table.

All that said, my conclusion could be wrong for any particular
application. But "fields are expensive, rows are cheap" is a
generalized rule of thumb, not a hard-and-fast law of nature. It
allows for exceptions for certain purposes, but it is a starting
point for evaluating a schema design.

--
David W. Fenton                  http://www.dfenton.com/
usenet at dfenton dot com    http://www.dfenton.com/DFA/
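A minimal JET SQL sketch of the narrow-header / long-child shape
described above, assuming a hypothetical insurance-policy example
(all table and field names are invented): the header table holds
only the fields common to every policy, and each row of the child
table stores what was formerly a column in the wide 200-field
table, so type-specific fields simply have no rows for policies of
other types.

    -- Hypothetical narrow header table: one row per policy, holding
    -- only the fields that apply to every policy type.
    CREATE TABLE tblPolicy (
        PolicyID      AUTOINCREMENT,
        PolicyType    TEXT(20),
        HolderName    TEXT(100),
        EffectiveDate DATETIME,
        CONSTRAINT pkPolicy PRIMARY KEY (PolicyID)
    );

    -- Hypothetical long child table: one row per attribute that used
    -- to be a column in the wide table. Auto-only or renter-only
    -- attributes simply have no rows for policies of the other type.
    CREATE TABLE tblPolicyAttribute (
        PolicyAttributeID AUTOINCREMENT,
        PolicyID          LONG NOT NULL,
        AttributeName     TEXT(50),
        AttributeValue    TEXT(255),
        CONSTRAINT pkPolicyAttribute PRIMARY KEY (PolicyAttributeID),
        CONSTRAINT fkPolicyAttribute FOREIGN KEY (PolicyID)
            REFERENCES tblPolicy (PolicyID)
    );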
From: David W. Fenton on 25 Feb 2010 13:49

Dennis <Dennis(a)discussions.microsoft.com> wrote in
news:A72A1230-A337-4483-AB36-2E8892D7E19B(a)microsoft.com:

> I agree with the comment about a field being expensive in an
> improperly normalized table structure resulting in all sorts of
> workarounds. 10 of the 30 new fields had to do with funeral
> information for a deceased member. Since all of the information
> is specific to the individual member, I included it on the member
> table. You might be able to argue that it should go in its own
> table, but for data entry and reporting simplicity I included it
> in my member table.

That sounds like the type of data I'd put in a separate 1:1 table,
as it only applies once a particular threshold has been crossed. A
record in that table also means the person is deceased (though you
may not have the information yet and might still need to store a
date of death in the main record).

I wouldn't call that denormalized, and I have always found a certain
utility in using a 1:1 table for things that apply only after the
entity has reached a certain milestone. However, I would likely
avoid having multiple 1:1 records, as it then becomes complicated to
get a single editable row, unless the 1:1 records are related in a
logical chain that is modelled in the database in that way.

--
David W. Fenton                  http://www.dfenton.com/
usenet at dfenton dot com    http://www.dfenton.com/DFA/
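A minimal JET SQL sketch of the 1:1 extension table described above,
with hypothetical table and field names: the extension table's
primary key doubles as its foreign key back to the member table, so
at most one funeral record can exist per member, and the row's very
existence signals that the member is deceased.

    -- Hypothetical 1:1 extension table: MemberID is both the primary
    -- key and the foreign key to tblMember, so each member can have at
    -- most one funeral record, created only once that milestone is
    -- reached.
    CREATE TABLE tblMemberFuneral (
        MemberID     LONG NOT NULL,
        DateOfDeath  DATETIME,
        FuneralHome  TEXT(100),
        ServiceDate  DATETIME,
        CONSTRAINT pkMemberFuneral PRIMARY KEY (MemberID),
        CONSTRAINT fkMemberFuneral FOREIGN KEY (MemberID)
            REFERENCES tblMember (MemberID)
    );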
From: Dennis on 25 Feb 2010 23:23

David,

I can see where disk caching would help in a sequential process, but
does disk caching really help in a randomly accessed database during
data entry? The first data entry might access the first record, the
next the 1,000th record, the next the 5,000th record, and so on. So,
unless the entire table is cached, does it really help?

Dennis
From: Dennis on 25 Feb 2010 23:27

David,

Hmmm, I see your point and kind of agree with it. My background is
on large and midrange computers, where it is nothing to have a
200-field, 30K record. However, I realize that Access is a different
beast and I'm having to learn to adjust for its restrictions.

Thanks for the insight. Just more to think about. But then I learn
something new also.

Thanks,
Dennis