From: m on 22 Mar 2010 20:11

If the transaction volume is low (i.e. less than ten per employee per year for a small company), then this solution will be okay, but once the transaction volume increases, you will have no choice except to benchmark account transactions to form a new datum. It may not be much to compute the balance for a single employee, but in an organization with half a million employees, some of whom may have 30+ years of transactions, it becomes much more onerous to re-compute the totals on demand. And in applications where there may be millions of transactions per day, it absolutely makes sense to materialize and store the computed information - which, paralleling the way a Jacobian can transform n-space complex relations into m-space functions, means the benchmarked computed data can form the basis of a 1NF schema - usually in different systems, however.

"--CELKO--" <jcelko212(a)earthlink.net> wrote in message news:64b96480-5514-4af8-8e1c-d14a3ff3ee9e(a)t32g2000prg.googlegroups.com...
> CREATE VIEW EducationAccountSummary (emp_id, claim_tot,
>        claim_balance, most_recent_claim_date)
> AS
> SELECT P.emp_id, SUM(E.claim_amt) AS claim_tot,
>        600.00 - SUM(E.claim_amt) AS claim_balance,
>        MAX(E.claim_date) AS most_recent_claim_date
>   FROM Personnel AS P
>        LEFT OUTER JOIN
>        EducationAccounts AS E
>        ON P.emp_id = E.emp_id
>  GROUP BY P.emp_id;
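For concreteness, the stored-computation approach m describes might look something like the following minimal sketch. The Personnel and EducationAccounts names and the 600.00 allowance are taken from the quoted view; the summary table and its columns are assumptions for illustration only.

-- Hypothetical sketch: materialize the per-employee balance instead of
-- recomputing it through the view on every read.
CREATE TABLE EducationAccountSummary_T
(
    emp_id                 INT           NOT NULL PRIMARY KEY,
    claim_tot              DECIMAL(10,2) NOT NULL DEFAULT 0.00,
    claim_balance          DECIMAL(10,2) NOT NULL DEFAULT 600.00,
    most_recent_claim_date DATE          NULL
);

-- Periodic (e.g. nightly) refresh from the transaction table.
UPDATE S
   SET claim_tot              = T.claim_tot,
       claim_balance          = 600.00 - T.claim_tot,
       most_recent_claim_date = T.most_recent_claim_date
  FROM EducationAccountSummary_T AS S
  JOIN (SELECT emp_id,
               SUM(claim_amt)  AS claim_tot,
               MAX(claim_date) AS most_recent_claim_date
          FROM EducationAccounts
         GROUP BY emp_id) AS T
    ON T.emp_id = S.emp_id;

Reads then hit the narrow summary table rather than aggregating the full transaction history on demand; the cost is the refresh batch and the latency it introduces.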
From: Tony Rogerson on 23 Mar 2010 01:29

Absolutely right m; I do a few days a week for a debt management company, and over the last 3 years I've been steadily migrating them from flat files into SQL Server.

We have accounts, and accounts may have hundreds of transactions; we need rolling balances. Doing it Celko's way just does not scale; I use a trigger to populate rolling totals as the transactions come in (there is additional logic to calculate other things like retention, which is a complex calculation).

Some would argue you should put that complex calculation in the middle tier; I'd argue back - have you actually tried that in the real world?

Use a trigger and have summary tables or additional columns on the row.

Tony.
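A minimal sketch of the trigger approach Tony describes, maintaining a rolling balance on the account row as transactions arrive. The Accounts and AccountTransactions tables and their columns are assumptions for illustration, not a real schema.

-- Keep a rolling balance on the parent account as transactions are inserted.
CREATE TRIGGER trg_AccountTransactions_Insert
ON AccountTransactions
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Add the newly inserted amounts to each affected account's running balance.
    UPDATE A
       SET A.rolling_balance = A.rolling_balance + I.amount_sum
      FROM Accounts AS A
      JOIN (SELECT account_id, SUM(amount) AS amount_sum
              FROM inserted
             GROUP BY account_id) AS I
        ON I.account_id = A.account_id;
END;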
From: --CELKO-- on 23 Mar 2010 16:47

>> It may not be much to compute the balance for a single employee, but in an organization with half a million employees, some of whom may have 30+ years of transactions, it becomes much more onerous to re-compute the totals on demand. <<

Commercial applications do not have decades of data on line; they have to close out their books at the end of a fiscal reporting period (year, quarter or whatever) and archive it. The real work is done on data in the current fiscal reporting period. Even data warehouses are not that deep -- Wal-Mart keeps 18 months of data, and theirs is one of the largest data warehouses on Earth.

The only scenario I can imagine is a scientific DB with a long history of measurements. But the math you do on them is not a simple running total; it would feed a statistical package. It would be a very different DB from what we run on SQL Server.

During the current reporting period, there will be corrections that appear out of temporal sequence. The most common one that comes to mind is that deposits are applied before debits on the same day, no matter when they were timestamped. When I worked at a bank, this was a huge percentage of the retail banking volume (payroll deposits were actually made after withdrawals when we loaded company payroll tapes after the close of business). Keeping a running total in the accounts in real time would have been a disaster on paydays.

I don't know what the percentage is today, but a computer can do the math faster than it can read a disk.

If you want a more theoretical reason, you don't store computed columns in an RDBMS. Different levels of aggregation, mixing data and metadata, blah, blah, blah.
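To make the same-day ordering rule concrete: if balances are computed on demand for only the current reporting period, the credits-before-debits rule can be expressed in the ordering of a windowed sum. This is only a sketch; the AccountTransactions table, its columns, and the cut-off date are hypothetical, and it assumes a platform with windowed aggregates (SQL Server 2012 or later, or any database supporting SUM ... OVER with ORDER BY).

-- Running balance for the current period, applying same-day credits
-- (positive amounts) before debits regardless of timestamp.
SELECT account_id,
       tran_date,
       amount,
       SUM(amount) OVER (PARTITION BY account_id
                         ORDER BY tran_date,
                                  CASE WHEN amount >= 0 THEN 0 ELSE 1 END,
                                  tran_id
                         ROWS UNBOUNDED PRECEDING) AS running_balance
  FROM AccountTransactions
 WHERE tran_date >= '20100101';   -- start of the current fiscal period (assumed)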
From: m on 23 Mar 2010 22:52

Not wanting to provoke much further debate, but feeling compelled to respond:

Retail sales (i.e. Wal-Mart) is certainly a high transaction volume business, but the transactions are nearly all independent. In retail wealth management, however, you would be very upset if your broker didn't have records of your last transactions just because they were last year; and even more upset if your banker didn't keep complete records of your mortgage payments, spanning perhaps the last couple of decades - unless of course your performance on that loan wasn't quite what it ought to have been ;)  Similarly, escrow receipt and inventory reconciliation business functions often span decades, and are fundamentally concerned with what is rather than with how it has changed.

The point is not that 1NF etc. is bad - absolutely it is not - but rather that 1NF for a particular application can depend on one's point of view and on what the system is designed to achieve. Transaction processing systems must work with transactions; inventory systems must work with holdings; and appointment calendars must work with days and meetings - and in my case, must work out how I am to attend the three overlapping ones where I am listed as a required attendee!

For the OP's case, he has low transactional volume per employee, but the time over which an employee's records must be kept is long (years). And since it is unlikely that every employee will use the complete allowance each year, and carry-forward is allowed, the simplest design for a large number of employees is a monthly or annual benchmark (sketched below). This design requires two systems in one (transaction processing + inventory), and it results in a storage-efficient and easily queried schema, but the drawback is the necessity for batch processing outside of the RDBMS and some latency from transaction posting until inventory reconciliation.

BTW: the problems that you are likely discussing regarding payroll are likely the effect of check clearing and the cash-forward service offered by many retail banks. The problem relates mostly to the inter-institution delivery of physical currency and to the way that many banking systems book transactions, but it has nothing to do with the client liability or accounting except in the minds of those who forget that the bank is _paid_ to take that risk.
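One way the periodic benchmark design referenced above might be sketched, reusing the Personnel and EducationAccounts names from the quoted view. The benchmark table, its columns, and the period dates are assumptions for illustration; the idea is that each period close writes one datum per employee, and the current balance is that datum plus the transactions posted since it was taken.

-- One benchmark ("datum") row per employee per period close.
CREATE TABLE EducationAccountBenchmarks
(
    emp_id         INT           NOT NULL,
    benchmark_date DATE          NOT NULL,
    claim_tot      DECIMAL(10,2) NOT NULL,
    PRIMARY KEY (emp_id, benchmark_date)
);

-- Periodic batch: roll the prior benchmark forward by the transactions
-- posted since it was taken (dates are hypothetical period boundaries).
INSERT INTO EducationAccountBenchmarks (emp_id, benchmark_date, claim_tot)
SELECT P.emp_id,
       '20101231',
       COALESCE(B.claim_tot, 0.00) + COALESCE(SUM(E.claim_amt), 0.00)
  FROM Personnel AS P
  LEFT JOIN EducationAccountBenchmarks AS B
    ON B.emp_id = P.emp_id
   AND B.benchmark_date = '20091231'
  LEFT JOIN EducationAccounts AS E
    ON E.emp_id = P.emp_id
   AND E.claim_date > '20091231'
   AND E.claim_date <= '20101231'
 GROUP BY P.emp_id, B.claim_tot;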
From: Tony Rogerson on 24 Mar 2010 01:34
> Commercial applications do not have decades of data on line; they have
> to close out their books at the end of a fiscal reporting period

Obviously never worked in the financial sector. Obviously not much experience in business, "period".

Do you actually try any of this theory on large data sets with multiple concurrent users? Or, as I suspect, do you tell your clients something entirely different when you consult?

--ROGGIE--