From: topmind on 21 Jan 2006 14:06 > >>>>The point is that there are alternative /implementations/ for > >>>>persistence to RDBs in the computing space. SQL has already made that > >>>>implementation choice. > >>> > >>> > >>>SQL is not an implementation. What is the difference between locking > >>>yourself to SQL instead of locking yourself to Java? If you want > >>>open-source, then go with PostgreSQL. What is the diff? Java ain't no > >>>universal language either. > >> > >>Of course it's an implementation! It implements access to physical > >>storage. > > > > > > Just as Java implements access to physical RAM etc. > > Exactly. Java is a specific implementation of a 3GL. 3GL is the > abstraction, Java is an implementation. Persistence access is the > abstraction, SQL is an implementation. Why do you keep saying "persistence"? I don't think you get the idea of RDBMS and query languages. Like I said, think of a RDBMS as an "attribute management system". Forget about disk drives for now. Saying it is only about "persistence" is simply misleading. > > >> More important to the context here, that implementation is > >>quite specific to one single paradigm for stored data. > > > > > > Any language or API is pretty much going to target a specific paradigm > > or two. I don't see any magic way around this, at least not that you > > offer. UML is no different. > > 4GLs get around it because they are independent of /all/ computing space > implementations. I am not sure UML qualifies as 4th Gen. Just because it can be translated into multiple languages does not mean anything beyond Turing Equivalency. C can be translated into Java and visa verse. > > However, that's not the point. SQL is a 3GL but comparing it to Java is > specious because Java is a general purpose 3GL. Again, this gets into the definiton of "general purpose". I agree that query languages are not meant to do the *entire* application, but that does not mean it is not general purpose. 
File systems are "general purpose", but that does not mean that one writes an entire application in *only* a file system. It is a general purpose *tool*, NOT intended to be the whole enchilata. A hammer is a general purpose tool, but that does not mean one is supposed to ONLY use a hammer. You need to clarify your working definition of "general purpose", and then show it the consensus definition for 4GL. > SQL represents a > solution to persistence access that is designed around a particular > model of persistence itself. So one can't even use it for general > purpose access to persistence, much less general computing. Please clarify. Something can still be within a paradigm and be general purpose. Further GP does not necessarily mean "all purpose", for nothing is practially all purpose. > > >>Requirements -> 4GL -> 3GL -> Assembly -> machine code executable > >> > >>Everything on the left is a specification for what is immediately to its > >>right. Similarly, everything to the right is a solution implementation > >>for the specification on its immediate left. > > > > > > Well that is a bit outdated. For one, the distinction between 4GL and > > 3GL is fuzzy, and many compilers/interpreters don't use assembler. > > My 4GL definition isn't ambiguous, which is why I like it. Reviewers of > OOA models have no difficulty recognizing implementation pollution. Argument by authority. > > All compilers generate object code (relocatable Assembly). Most modern > interpreters can produce storable bytecodes that are equivalent to > Assembly from the VM's viewpoint. At run time one can view an > interpreter as simply combining link and load functions that transform > the bytecode to a machine instruction. But at some level the > interpreter still has to understand that MUL,R1,R2 maps into bits. > > But you reverting to ploys again by deflecting. The context is > specification vs. implementation, not how machine instructions are encoded. 
You have not finished your analogy on the 3G and 4G side. Besides, analogies often make poor evidence, being better for teaching or illuminating. > > >>Go look at an SA/D Data Flow Diagram or a UML Activity Diagram. They > >>express data store access at a high level of abstraction that is > >>independent of the actual storage mechanism. SQL, ISAM, CODASYL, gets, > >>or any other access mechanism, is then an implementation of that generic > >>access specification. > > > > > > SQL is independent of the "actual storage mechanism". It is an > > interface. You may not like the interface, but that is another matter. > > Repeat after me: "SQL is an interface, SQL is an interface, SQL is an > > interface".... > > Try using SQL vs. flat files if you think it is independent of the > actual storage mechanism. (Actually, you probably could if the flat > files happened to be normalized to the RDM, but the SQL engine would be > a doozy and would have to be tailored locally to the files.) SQL > implements the RDB view of persistence and only the RDB view. How is that different than ANY other interface? You are claiming magic powers of UML that it simply does not have. And as somebody pointed out, one can use SQL on flat files too. ODBC drivers can be created to hook SQL to spreadsheets, flat files, etc. > >>Java is certainly a general purpose 3GL. Like most 3GLs there are > >>situations where there are better choices (e.g., lack of BCD arithmetic > >>support makes it a poor choice for a General Ledger), but one could > >>still use it in those situations. SQL, in contrast, is a niche language > >>that just doesn't work for many situations outside its niche. > > > > > > You could be right, but I have yet to see a good case outside of > > split-second timing issues where there is a limit to the max allowed > > response time. (This does not mean that rdbms are "slow", just less > > predictable WRT response time.) > > > > If you can give an example outside of timing, please do. 
(I don't doubt > > they exist, but I bet they are rarer than you imply. Some scientic > > applications that use imaginary numbers and lots of calculus may also > > fall outside.) > > Compute a logarithm. You can't hedge by dismissing "scientific" > computations. I didn't. Nothing is ideal for everything under the sun. Nothing. See above about general-purpose tools. > Try doing forecasting in an inventory control system w/o > "scientific" computations. I am not sure what you are implying here. I did not claim that scientific computation was not necessary. > Or try encoding the pattern recognition that > the user of a CRUD/USER application applies to the presented data. The > reality is that IT is now solving a bunch of problems that are > computationally intensive. As usual, "it depends". Problems where there is a lot of "chomping" on a small set of data are probably not something DB's are good at (at this time). An example might be the Travaling Salesman puzzle. However, problems where the input is large and from multiple entities are more up the DB's alley. (It may be possible to use a DB to solve Salesmen quickly, but few bother to research that area.) > > >>BTW, remember that I am a translationist. When I do a UML model, I > >>don't care what language the transformation engine targets in the model > >>implementation. (In fact, transformation engines for R-T/E typically > >>target straight C from the OOA models for performance reasons.) Thus > >>every 3GL (or Assembly) represents a viable alternative implementation > >>of the notion of '3GL'. > > > > > > > > Well, UML *is* language. It is a visual language just like LabView is. > > Exactly. But solutions at the OOA level are 4GLs because they can be > unambiguously implemented without change on any platform with any > computing technologies. So can any Turing Complete language. > > >>>>>>UML with a compliant AAL is an example of a 4GL. 
If I build an OOA > >>>>>>model for, say, a POS Order Entry System, that model can be > >>>>>>unambiguously implemented without change either manually as a print mail > >>>>>>order catalogue or as software for a browser-based 'net application. > >>>>>>The fundamental processing logic of catalogue organization and order > >>>>>>entry is expressed the same way regardless of the implementation context. > >>>>> > >>>>> > >>>>>And if other people/vendors made their own flavor of this tool with > >>>>>differences between the implimentation, then it would be in the same > >>>>>boat. Why should implementation A1 and A2 demote the "generation" > >>>>>ranking of A? > >>>> > >>>>It is not the same thing at all. The 4GL solution does not care if > >>>>persistence is /implemented/ with RDBs, OODBs, flat files, paper files, > >>>>or clay tablets. > >>> > >>> > >>>For the zillionth time, RDBMS are far more than just "persistence". > >> > >>It is only if one refuses to manage complexity by separating logical > >>concerns. > > > > > > > > "Separation" is generally irrelavent in cyber-land. It is a phsycial > > concept, not a logical one. Perhaps you mean "isolatable", which can be > > made to be dynamic, based on needs. "Isolatable" means that there is > > enough info to produce a seperated *view* if and when needed. This is > > the nice thing about DB's: you don't have to have One-and-only-one > > separation/taxonomy up front. OO tends to want one-taxonomy-fits-all > > and tries to find the One True Taxonomy, which is the fast train the > > Messland. Use the virtual power of computers to compute as-need > > groupings based on metadata. > > You know very well what I mean by 'separation of concerns' in a software > context, so don't waste our time recasting it. Modularity has been a > Good Practice since the late '50s. If there is only one concern set where each concern is mutually exclusive, then we have no disagreement. 
In practice there are usually multiple "partitioning" candidates, and
that is where the disagreements usually arise. File and text systems
don't make it easy to partition along all dimensions, so compromises
must be made. It is "my factor is more important than your factor,
neener neener". If there is only one way to slice the pizza, then there
is no problem. But if there are multiple ways, then a fight breaks out.
This is one reason why DB's are useful: the more info you put into the
DB instead of code, the more ad-hoc, situational partitionings you can
view. You are not forced to pick the One Right Taxonomy of partitioning.
Philosophers of categorization came to the consensus that there is no
One Right Taxonomy for most real-world things.

>
> >>Render unto the Disk generic static storage and render unto
> >>the Application context-dependent dynamics.
> >>
> >>           *               1
> >>[Context] ----------------- [Data]
> >>
> >>
> >>                    1      *
> >>[Problem Solution] -------- [Data]
> >>
> >>The first view is the basis of the RDB paradigm -- generic storage of
> >>the same data for access by many different contexts. The second view is
> >>the one that is relevant for solving large problems -- access of data
> >>that is carefully tailored to the problem in hand. Storing and
> >>accessing data for many different contexts is a quite different problem
> >>than formatting and manipulating data to solve a specific problem.
> >
> >
> >
> > Again, DB's are not JUST for "storage". There are RAM-only RDBMS's.
>
> I agree they are used for more than that, but it is not my problem if
> developers are determined to shoot themselves in the foot by bleeding
> cohesion all over the place. It is plain bad software practice to
> ignore logical modularity.

Again, in practice there are multiple incompatible modularity candidates
in non-trivial software. Life is multi-dimensional, and the more complex
the software, the more factors there are.
Change impact analysis often does not help either, because I have found
that people perceive change and change probabilities differently. It is
hard to plan for change when people don't perceive the future the same
way.

>
> As far as RAM RDBs go, for any large non-CRUD/USER problem I can
> formulate a solution (which doesn't have to be OO) that will beat your
> RAM RDB for performance, and often by integer factors.

Claims claims claims. Yaawwwn.

> The RDB paradigm
> is not designed for context-dependent problem solving; it is designed
> for generic static data storage and access in a context-independent manner.

I think what you view as context-dependent is not really
context-dependent after all. It is just your pet way of viewing the
world because of all the OOP anti-DB hype.

>
> Before you argue that the RAM RDB saves developer effort because it is
> largely reusable and that may be worth more than performance, I agree.
> But IME for /large/ non-CRUD/USER problems the computer is usually too
> small and performance is critical.

Please clarify. Ideally the RDBMS would determine what goes into RAM and
what goes to disk such that the app developer doesn't have to give a
rat's rear. Cache management generally does this, but a both-way system
is probably not as fast as a dedicated RAM DB. Even if the two-way ideal
is not fully reached, one will soon have the *option* to switch some or
all of an app to a full-RAM DB as needed without rewriting the app. The
query language abstracts/encapsulates/hides that detail away.

>
> [I could also argue that an OO solution will provide one with optimum
> performance "for free" because it falls out of basic OOA/D for the
> solution logic. IOW, one doesn't need that sort of reuse. But I won't
> argue that because that would be going down the rabbit hole. B-)]

No, it often hard-wires in the early usage paths such that future usage
paths that go against those early paths turn into a mess.
OO tends to be really lousy at many-to-many relationships, for example. > > >>For a non-CRUD/USER application, SQL and the DBMS provide the first > >>relationship while a persistence access subsystem provides the > >>reformatting for the second relationship. > > > > > > Reformatting? Please clarify. > > The solution needs a different view of the data that is tailored to the > problem in hand. So the RDB view needs to be converted to the solution > view (and vice versa). IOW, one needs to reformat the RDB data > structures to the solution data structures. This is called a "result set" or "view". Most queries customize the data to a particular task. Thus, it *is* a solution view. > > >>>>I am talking about the abstracting the domain where the original problem > >>>>exists rather than the computing domain where a software solution will > >>>>be executed. SQL only abstracts a very narrow part of the computing domain. > >>> > >>> > >>>I disagree. A large part of *most* apps I have seen involves > >>>database-oriented stuff. P. May mentioned security. Security can be > >>>viewed as a dealing with large ACL tables. Most algorithms can be > >>>reduced to mostly DB-oriented operations. I had to build a 3D graphics > >>>system in college, and most of it could be reduced to DB-operations: > >>>having "parts" reference each other in many-to-many tables, > >>>transformation steps tracking, looking up polygons, cross-referencing > >>>those polygons with their "parent part", storing scan-lines for later > >>>inspection, etc. I will agree that DB's are not (currently) fast at > >>>such, but still from a logical perspective the operations were > >>>essentially DB-oriented. (Because I couldn't use a DB, I ended up > >>>reinventing a lot of DB idioms and it was not very fun.) > >> > >>When the only tool you have is a Hammer, then everything looks like a > >>Nail. > > > > > > > > No, out of necessity I started my career without DB usage, and I never > > want to return there. 
> > That's because you are in a CRUD/USER environment where P/R works quite > well. Try a problem like allocating a fixed marketing budget to various > national, state, and local media outlets in an optimal fashion for a > Fortunate 500. Again, I never said that DB's are good for every problem. I don't know enough about that particular scenario to propose a DB-centric solution and to know whether it is an exception or not. Unless you provide some specific use-case or detailed sceneria, it is anecdote against anecdote here. RDBMS are a common tool. The sales of Oracle, DB2, and Sybase are gigantic. > >> >><moved>What I implied was that CRUD/USER applications tend to be not very > >> >>complex. Report generation was never very taxing even back in the COBOL > >> >>days, long before SQL, RDBs, or even the RDM. Substituting a GUI or > >> >>browser UI for a printed report doesn't change the fundamental nature of > >> >>the processing. > >> > > >> > > >> > Please clarify. If a process was "not taxing", then you are simply > >> > given more duties and projects to work on. Management loads you up > >> > based on your productivity and work-load. > >> > >>Back in the '60s and early '70s writing COBOL code to extract data and > >>format reports was a task given to the entry level programmers. That's > >>where the USER acronym (Update, Sort, Extract, Report) came from. The > >>stars went on to coding Payroll and Inventory Control where one had to > >>encode business rules and policies to solve specific problems. > > > > > > > > Fine, show how OO better solves business rule management. (Many if not > > most biz rules can be encoded as data, BTW, if you know how.) > > Why? That has nothing to do with whether a DBMS should execute dynamic > business rules and policies. This isn't an OO vs. P/R discussion, much > as you would like to make it so. Are you saying it is a UML-versus-RDB debate? 
> > >>It is only when the > >>problem solution gets drawn into the software that one leaves the realm > >>of CRUD/USER processing and thing start to get tricky. > >> > >> > >>>>Unfortunately, I agree with May that the rest of the paragraph makes no > >>>>sense; it just seems to be your personal jargon and mantras. > >>> > >>> > > >>They are part of > >>the predictable collection of forensic ploys you use when debating OO > >>people. It's all designed to have an emotional effect to put the > >>opponent on tilt. > >> > >>You seem to get your amusement out of having OO people go bonkers. > > > > > > > > I will admit there is a certain satisfaction of using other people's > > own logic against themselves, especially if they have insulted me > > prior. > > That doesn't answer why you went to the trouble of creating an > inflammatory website and have been here for years. A simple dislike of > OO? I don't think so. How many converts have you made to justify your > crusade? I get enough "amen brother's" to provide all the social satisfaction I need from it. > It just wouldn't be worth the effort of beating your head > against the wall all these years. So you have to have some other > reason. The only plausible reasons I see are Quixotic masochism or you > enjoy pulling people's chains. Perhaps an Asperger's Syndrome: obsession with a specific narrow topic. Whatever, if you want to sit around and speculate on my motivation, be my guest. Frankly, I am not that important to waste time on. > > As far as insulting you is concerned, what do you expect? You throw out > inflammatory statements, especially misconceptions about what OO > development is about, that are designed to drive anyone who understands > OO up a tree. If I used my knowledge of OO and tried to design a > website that would drive OO people to outrage, it would be your > geocities website. It pushes all the buttons in admirable fashion. 
> (That you can push all the right buttons is what makes me believe you > actually understand a lot more about OO than you pretend; it would be > difficult to be so inflammatory without that knowledge.) So I have to > conclude it is intentional. When you jump up and down on the bellows > long enough, you will get burned. Whatever. If OO was truely great you could demonstrate it with a coded business example that many if not most OO proponents would agree is good OO. You can't. BilliOOns of dOOllars spent based on anecdotes, bragging, and brochure-talk. [....] > >>>You are so cute when you paint me as bad, manipulative, and evil. > >> > >>Not bad or evil, but definitely manipulative. You just find it amusing > >>to pull people's chains and the OO community is providing plenty of soft > >>targets. As I've said before, I think you are actually know a lot more > >>about OO development than you let on and you are pretty clever about the > >>way you tweak the OO people who engage with you. > > > > > > > > You are spreading falsehoods about RDBMS. They are NOT low-level. You > > only treat them as low level. > > Where did I say that? I said that once one is out of the realm of > CRUD/USER processing, /talking/ to persistence is a low level service > _within the application_. How persistence is implemented outside the > application is a whole other story. > We'll, we both agree that DB's are not for everything. However, we disagree widely on where the limits lay. RDBMS tend to not be the right tool where performance, hardware packaging, or timing is more critical than change management. If something changes often, then a RDBMS is a more general-purpose solution. This is not to say that RDBMS are slow, they just will not be competitive with a critial system designed for a very specific, slow-changing purpose. But for the budget-minded who don't want to build low-level tools from scratch and want flexibility, DB's are the way to go. 
I believe most cases where DB's are not appropriate for the application will fall into the above category. > > ************* > There is nothing wrong with me that could > not be cured by a capful of Drano. > > H. S. Lahman -T-
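
A minimal sketch of the "as-needed groupings" point argued in this post, assuming Python's built-in sqlite3 module stands in for an RDBMS. The `staff` table, its columns, and the `partition_by` helper are hypothetical names invented for illustration; nothing here comes from either poster's actual systems.

```python
import sqlite3

# Hypothetical columns that a caller may group by. A whitelist is used
# because column names cannot be bound as SQL parameters.
ALLOWED_COLUMNS = ("dept", "site", "role")


def make_db():
    """Build a small in-memory table of made-up staff records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE staff (name TEXT, dept TEXT, site TEXT, role TEXT)"
    )
    conn.executemany(
        "INSERT INTO staff VALUES (?, ?, ?, ?)",
        [
            ("Ann", "sales", "Austin", "manager"),
            ("Bob", "sales", "Boston", "clerk"),
            ("Cy", "ops", "Austin", "clerk"),
            ("Di", "ops", "Boston", "manager"),
            ("Ed", "ops", "Austin", "clerk"),
        ],
    )
    return conn


def partition_by(conn, column):
    """Return {value: head count} for one grouping chosen at query time."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown grouping column: {column}")
    rows = conn.execute(
        f"SELECT {column}, COUNT(*) FROM staff GROUP BY {column}"
    )
    return dict(rows.fetchall())


conn = make_db()
by_dept = partition_by(conn, "dept")
by_site = partition_by(conn, "site")
by_role = partition_by(conn, "role")
```

The same rows are sliced three different ways without any one grouping being built into the data's structure up front. Whether that flexibility outweighs a hand-tailored structure is exactly what the two posters disagree about.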
From: frebe on 22 Jan 2006 05:33

> SQL is not limited to persistence.

Finally we can agree about something. Does this mean that you will stop
making this claim?

> However, that is probably where 99.99% of the usage lies.

I suppose that you are talking about your usage of SQL. In an average
enterprise application, non-persistence features like queries,
transactions, referential integrity, caching, etc., are heavily used.

> SQL is specifically designed for the RDB implementation paradigm of the
> RDM.

Because you have created a new definition of the term RDM that is
different from Codd's definition, the distinction between RDB and RDM
is your own invention.

> You could develop a SQL driver to use file names as table identity and
> read lines via an implied line number as a key, but good luck on
> correctly dealing with line insertions and deletions without an embedded
> key.

Why would line number be the key?

> The RDM is basic set theory.

Are you saying that the RDM is based on basic set theory or that the
RDM is nothing more than basic set theory?

> Codd was explicitly dealing with
> persistence in a computing environment so he expressed the rules in
> terms of embedded identity attributes (keys). However the set theory
> only requires that each tuple have unique identity.

In what way do embedded identity attributes limit Codd's RDM to
persistence? The alternative appears to be pointers, which were used in
network databases. Embedded keys vs. pointers is orthogonal to
persistent vs. in-memory.

> Thus you will see a discussion of normalization of
> Class Models in most standard OOA/D books

I was asking for the second definition of the RDM, the one broader than
Codd's definition. I was not asking for discussions about class model
normalization.

Fredrik Bertilsson
http://butler.sourceforge.net
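
A minimal sketch of the claim that transactions and referential integrity are usable with no disk involved, assuming Python's built-in sqlite3 module with an in-memory database as a stand-in for a RAM-only RDBMS. The `customer` and `orders` tables and the `place_orders` helper are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory database; nothing is ever written to disk."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customer(id),"
        " total_cents INTEGER NOT NULL)"
    )
    return conn


def place_orders(conn, rows):
    """Insert all rows atomically. Return True, or False if rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany(
                "INSERT INTO orders (customer_id, total_cents) VALUES (?, ?)",
                rows,
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_db()
with conn:
    conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ada')")

ok = place_orders(conn, [(1, 1000), (1, 550)])
# Customer 99 does not exist, so the whole batch is rejected, including
# the first row, which on its own would have been valid.
bad = place_orders(conn, [(1, 300), (99, 700)])
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The second batch leaves no partial result behind, which is the transaction and integrity behaviour described above operating purely on in-memory data.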
From: frebe on 22 Jan 2006 08:09

> There
> is a reason why the CRUD work is typically given to new hires and
> junior developers.

The reason why CRUD work is given to junior developers is the fact that
using OO design, CRUD applications are very bloated. If RAD tools were
used instead, the same work would be done in a few minutes, saving a lot
of money instead of hiring an army of (junior) developers.

Fredrik Bertilsson
http://butler.sourceforge.net
From: H. S. Lahman on 22 Jan 2006 12:13

Responding to Frebe...

>>SQL is not limited to persistence.
>
> Finally we can agree about something. Does this mean that you will stop
> making this claim?

I never made that claim.

>>However, that is probably where 99.99% of the usage lies.
>
> I suppose that you are talking about your usage of SQL. In an average
> enterprise application, non-persistence features like queries,
> transactions, referential integrity, caching, etc., are heavily used.

If they are using SQL in a non-CRUD/USER application for anything other
than persistence access, then they are misusing SQL. Even in a CRUD/USER
application it doesn't make much sense from a performance viewpoint if
the data is in memory.

>>SQL is specifically designed for the RDB implementation paradigm of the
>>RDM.
>
> Because you have created a new definition of the term RDM that is
> different from Codd's definition, the distinction between RDB and RDM
> is your own invention.

Codd's definition /is/ the RDB view; it is a specialized application of
more general set theory...

>>You could develop a SQL driver to use file names as table identity and
>>read lines via an implied line number as a key, but good luck on
>>correctly dealing with line insertions and deletions without an embedded
>>key.
>
> Why would line number be the key?

How else would you uniquely identify each line for individual access in
a text file?

>>The RDM is basic set theory.
>
> Are you saying that the RDM is based on basic set theory or that the
> RDM is nothing more than basic set theory?

The RDM is a combination of basic set theory and predicate logic that
deals with relational calculus using terminology like relation, tuple,
and attribute. Codd's data model is an application of the RDM that deals
with relational algebra using terminology like table, row, and field
(see his original 1970 paper, "A Relational Model of Data for Large
Shared Data Banks", Communications of the ACM, 13, pgs. 377-387, where
he introduced the notion of representing data in tables). While Codd was
the first to provide a formal and consistent view of the RDM, the RDM
itself has been greatly expanded over the years beyond the RDB view.
Today it can be applied to such disparate arenas as OO development and
OODBs...

>>Codd was explicitly dealing with
>>persistence in a computing environment so he expressed the rules in
>>terms of embedded identity attributes (keys). However the set theory
>>only requires that each tuple have unique identity.
>
>
> In what way do embedded identity attributes limit Codd's RDM to
> persistence?

It doesn't. But Codd's goal was to describe persisted data and he
developed the initial view of the RDM around the notion of RDB storage.
Just look at the titles of Codd's early books and papers and try to
convince me that his research wasn't /focused/ on RDBs and persistence:

A Relational Model of Data for Large Shared Data Banks, 1970
Normalized Data Base Structure: A Tutorial, 1971
A Data Base Sublanguage Founded on the Relational Calculus, 1971
Further Normalization of the Data Base Relational Model, 1972
Relational Completeness of Data Base Languages, 1972
The Gamma-0 n-ary Relational Data Base Interface Specifications of
Objects and Operations, 1973
Recent Investigations in Relational Data Base Systems, 1974
Implementation of Relational Data Base Management Systems, 1975

He was a researcher in IBM's hard disk division, for Pete's sake!

>>Thus you will see a discussion of normalization of
>>Class Models in most standard OOA/D books
>
> I was asking for the second definition of the RDM, the one broader than
> Codd's definition. I was not asking for discussions about class model
> normalization.

For starters, try set theory. Though I am not a fan, you might also look
at Chris Date's work for descriptions of the RDM well beyond the RDB
view.

*************
There is nothing wrong with me that could
not be cured by a capful of Drano.

H. S. Lahman
hsl(a)pathfindermda.com
Pathfinder Solutions -- Put MDA to Work
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
(888)OOA-PATH
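
A minimal sketch of the line-number question debated above, in plain Python with no database at all. The record format (`key,value` per line) and the helper names are assumptions made up for illustration. It only shows why an implied line number is a fragile identity once lines are inserted, and why a key embedded in the data is not.

```python
def by_line_number(lines):
    """Index records by implied position: line 1, line 2, and so on."""
    return {number: text for number, text in enumerate(lines, start=1)}


def by_embedded_key(lines):
    """Index records by a key stored in the line itself ('key,value')."""
    index = {}
    for line in lines:
        key, _, value = line.partition(",")
        index[key] = value
    return index


original = ["a17,widget", "b42,gadget", "c03,gizmo"]
# The same file after someone inserts a new first line.
edited = ["z99,doohickey"] + original

# Two ways of remembering "the gadget record" before the edit.
ref_line = 2
ref_key = "b42"

before = by_line_number(original)[ref_line]       # the gadget line
after = by_line_number(edited)[ref_line]          # now a different record
stable = by_embedded_key(edited)[ref_key]         # still the gadget
```

After the insertion, the remembered line number silently points at the widget record, while the embedded key still finds the gadget. This illustrates the insertion problem only; it takes no side on whether a SQL driver over such files is practical.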
From: H. S. Lahman on 22 Jan 2006 14:57
Responding to Jacobs... >>>>>SQL is not an implementation. What is the difference between locking >>>>>yourself to SQL instead of locking yourself to Java? If you want >>>>>open-source, then go with PostgreSQL. What is the diff? Java ain't no >>>>>universal language either. >>>> >>>>Of course it's an implementation! It implements access to physical >>>>storage. >>> >>> >>>Just as Java implements access to physical RAM etc. >> >>Exactly. Java is a specific implementation of a 3GL. 3GL is the >>abstraction, Java is an implementation. Persistence access is the >>abstraction, SQL is an implementation. > > > > Why do you keep saying "persistence"? I don't think you get the idea of > RDBMS and query languages. Like I said, think of a RDBMS as an > "attribute management system". Forget about disk drives for now. Saying > it is only about "persistence" is simply misleading. Persistent data is data that is stored externally between executions of an application. RDBs are a response to that need combined with a requirement that access be generic (i.e., the data can be accessed by many different applications, each with unique usage contexts). That's what DBMSes do -- they manage persistent data storage and provide generic, context-independent access to that data storage. My point in this subthread is that such responsibilities are complicated enough in practice that one does not want the DBMS to also manage and execute dynamic business rules and policies. IOW, the DBMS should just mind its own store. [This thread has been a veritable hotbed of puns. I've probably made more in this thread than I've done in the last decade. B-)] >>>>More important to the context here, that implementation is >>>>quite specific to one single paradigm for stored data. >>> >>> >>>Any language or API is pretty much going to target a specific paradigm >>>or two. I don't see any magic way around this, at least not that you >>>offer. UML is no different. 
>> >>4GLs get around it because they are independent of /all/ computing space >>implementations. > > > I am not sure UML qualifies as 4th Gen. Just because it can be > translated into multiple languages does not mean anything beyond Turing > Equivalency. C can be translated into Java and visa verse. A UML OOA model can be implemented unambiguously and without change in a manual system. In fact, that is a test reviewers use to detect implementation pollution. The OOA model for, say, a catalogue-driven order entry system will look exactly the same whether it is implemented as a 19th century mail-in Sears catalogue or a modern broswer-based web application. That is not true for any 3GL. >>However, that's not the point. SQL is a 3GL but comparing it to Java is >>specious because Java is a general purpose 3GL. > > > Again, this gets into the definiton of "general purpose". I agree that > query languages are not meant to do the *entire* application, but that > does not mean it is not general purpose. File systems are "general > purpose", but that does not mean that one writes an entire application > in *only* a file system. It is a general purpose *tool*, NOT intended > to be the whole enchilata. Huh?!? If you can't write the entire application in it, then it isn't general purpose by definition. > A hammer is a general purpose tool, but that does not mean one is > supposed to ONLY use a hammer. You need to clarify your working > definition of "general purpose", and then show it the consensus > definition for 4GL. huh**2?!? A hammer is not a general purpose tool by any stretch of the imagination. >>SQL represents a >>solution to persistence access that is designed around a particular >>model of persistence itself. So one can't even use it for general >>purpose access to persistence, much less general computing. > > > Please clarify. Something can still be within a paradigm and be general > purpose. 
Further GP does not necessarily mean "all purpose", for > nothing is practially all purpose. SQL is designed around the RDB paradigm for persistence. It can't be used for, say, accessing lines in a text flat file because the text file is not does organize the data the way SQL expects. So SQL is not a general purpose interface to stored data. Apropos of your point, though, SQL is quite general purpose for accessing /any/ data in a uniform way from a data store _organized like an RDB_. >>>>Requirements -> 4GL -> 3GL -> Assembly -> machine code executable >>>> >>>>Everything on the left is a specification for what is immediately to its >>>>right. Similarly, everything to the right is a solution implementation >>>>for the specification on its immediate left. >>> >>> >>>Well that is a bit outdated. For one, the distinction between 4GL and >>>3GL is fuzzy, and many compilers/interpreters don't use assembler. >> >>My 4GL definition isn't ambiguous, which is why I like it. Reviewers of >>OOA models have no difficulty recognizing implementation pollution. > > > Argument by authority. I prefer to think of it as argument by rational practicality. >>>>Go look at an SA/D Data Flow Diagram or a UML Activity Diagram. They >>>>express data store access at a high level of abstraction that is >>>>independent of the actual storage mechanism. SQL, ISAM, CODASYL, gets, >>>>or any other access mechanism, is then an implementation of that generic >>>>access specification. >>> >>> >>>SQL is independent of the "actual storage mechanism". It is an >>>interface. You may not like the interface, but that is another matter. >>>Repeat after me: "SQL is an interface, SQL is an interface, SQL is an >>>interface".... >> >>Try using SQL vs. flat files if you think it is independent of the >>actual storage mechanism. (Actually, you probably could if the flat >>files happened to be normalized to the RDM, but the SQL engine would be >>a doozy and would have to be tailored locally to the files.) 
>>SQL implements the RDB view of persistence and only the RDB view.
>
> How is that different than ANY other interface? You are claiming magic
> powers of UML that it simply does not have.

There is a distinction between describing an interface and designing its
semantics. UML is quite capable of describing the semantics of any
interface. Deciding what the semantics should be is quite another thing
that the developer owns.

When I have a subsystem in my application to access persistent data,
that subsystem has an interface that the rest of the application talks
to. That interface is designed around the rest of the application's
data needs, not the persistence mechanisms. It is the job of the
persistence access subsystem to convert the problem solution's data
needs into the access mechanisms du jour. If the persistence is an RDB,
then the subsystem implementation will <probably> use SQL. If the
persistence is flat text files, it will use the OS file manager and
streaming facilities. If it is clay tablets, it will use an OCR and
stylus device driver API.

That allows me to plug & play the persistence mechanisms without
touching the application solution because it still talks to the same
interface regardless of the implementation of the subsystem.

IOW, the semantics of the interface to the subsystem is /designed/ at a
different level of abstraction than that of the subsystem
implementation. UML doesn't care about the design process; it just
represents the results.

> And as somebody pointed out, one can use SQL on flat files too. ODBC
> drivers can be created to hook SQL to spreadsheets, flat files, etc.

Only if the data is organized around embedded identity and normalized.
Even then such drivers carry substantial overhead and tend to be highly
tailored to specific applications. IOW, you need a different driver for
every context (e.g., a spreadsheet) and then it won't be as efficient as
an access paradigm designed specifically for the storage paradigm.
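
The plug & play persistence access subsystem described above can be
sketched roughly as follows. This is only an illustration under stated
assumptions: the names OrderStore, SqlOrderStore, FlatFileOrderStore,
reorder_needed and demo are hypothetical and appear nowhere in this
thread, Python is used purely for brevity, and an in-memory SQLite
database stands in for "an RDB". The point it tries to show is that the
application function is written against the interface alone, so either
implementation can be swapped in without touching it.

```python
import os
import sqlite3
import tempfile
from typing import List, Protocol


class OrderStore(Protocol):
    """Hypothetical interface shaped by the application's data needs."""

    def save_order(self, order_id: str, item: str, quantity: int) -> None: ...

    def quantity_ordered(self, item: str) -> int: ...


class SqlOrderStore:
    """RDB-backed implementation; the SQL never leaves the subsystem."""

    def __init__(self) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE orders "
            "(order_id TEXT PRIMARY KEY, item TEXT, quantity INTEGER)"
        )

    def save_order(self, order_id: str, item: str, quantity: int) -> None:
        self._db.execute(
            "INSERT INTO orders VALUES (?, ?, ?)", (order_id, item, quantity)
        )

    def quantity_ordered(self, item: str) -> int:
        row = self._db.execute(
            "SELECT COALESCE(SUM(quantity), 0) FROM orders WHERE item = ?",
            (item,),
        ).fetchone()
        return row[0]


class FlatFileOrderStore:
    """Flat-text implementation; one comma-separated line per order."""

    def __init__(self, path: str) -> None:
        self._path = path
        open(path, "a").close()  # make sure the file exists

    def save_order(self, order_id: str, item: str, quantity: int) -> None:
        with open(self._path, "a") as f:
            f.write(f"{order_id},{item},{quantity}\n")

    def quantity_ordered(self, item: str) -> int:
        total = 0
        with open(self._path) as f:
            for line in f:
                _, line_item, qty = line.rstrip("\n").split(",")
                if line_item == item:
                    total += int(qty)
        return total


def reorder_needed(store: OrderStore, item: str, threshold: int) -> bool:
    """Application-side logic; it sees only the interface."""
    return store.quantity_ordered(item) >= threshold


def demo() -> List[bool]:
    """Run the same application logic against both implementations."""
    with tempfile.TemporaryDirectory() as tmp:
        stores = [
            SqlOrderStore(),
            FlatFileOrderStore(os.path.join(tmp, "orders.txt")),
        ]
        results = []
        for store in stores:
            store.save_order("A1", "widget", 3)
            store.save_order("A2", "widget", 4)
            store.save_order("A3", "gadget", 1)
            results.append(reorder_needed(store, "widget", 5))
        return results
```

Calling demo() runs reorder_needed unchanged against both stores; only
the subsystem implementation differs between the two runs.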
>>>>Java is certainly a general purpose 3GL. Like most 3GLs there are
>>>>situations where there are better choices (e.g., lack of BCD arithmetic
>>>>support makes it a poor choice for a General Ledger), but one could
>>>>still use it in those situations. SQL, in contrast, is a niche language
>>>>that just doesn't work for many situations outside its niche.
>>>
>>>You could be right, but I have yet to see a good case outside of
>>>split-second timing issues where there is a limit to the max allowed
>>>response time. (This does not mean that rdbms are "slow", just less
>>>predictable WRT response time.)
>>>
>>>If you can give an example outside of timing, please do. (I don't doubt
>>>they exist, but I bet they are rarer than you imply. Some scientic
>>>applications that use imaginary numbers and lots of calculus may also
>>>fall outside.)
>>
>>Compute a logarithm. You can't hedge by dismissing "scientific"
>>computations.
>
> I didn't. Nothing is ideal for everything under the sun. Nothing. See
> above about general-purpose tools.
>
>>Try doing forecasting in an inventory control system w/o
>>"scientific" computations.
>
> I am not sure what you are implying here. I did not claim that
> scientific computation was not necessary.

I was just anticipating your deflection; you've been using the
give-me-an-example ploy for years. B-) When the example is provided you
deflect by attacking it on grounds unrelated to the original point.
That's usually easy to do because examples are deliberately kept simple
to focus on the point in hand. That allows you to bring in unstated
requirements, programming practices designed for other contexts, and
whatnot to attack the example on grounds unrelated to the original
point.

In this case, though, you screwed up by setting up a basis for the
deflection ahead of time. You asked for an example outside of "timing".
The main reason SQL isn't a general purpose 3GL is that it can't handle
dynamics (algorithmic processing) very well. So the obvious examples
are going to tend to be algorithmic, such as computing a logarithm. But
your parenthetical hedge set up a basis for dismissing any obvious
example as "scientific" when you subsequently deflect. Then later you
can argue the point was never demonstrated.

>>Or try encoding the pattern recognition that
>>the user of a CRUD/USER application applies to the presented data. The
>>reality is that IT is now solving a bunch of problems that are
>>computationally intensive.
>
> As usual, "it depends". Problems where there is a lot of "chomping" on
> a small set of data are probably not something DB's are good at (at
> this time). An example might be the Travaling Salesman puzzle. However,
> problems where the input is large and from multiple entities are more
> up the DB's alley.

The Traveling Salesman problem can be arbitrarily large and the RDB
model will still probably not be useful because...

<aside>
FYI, most of the Operations Research algorithms are actually pretty
simple when written out in equations and the core processing doesn't
require a lot of code. Typically most of the code is involved in
getting the data into the application, setting up data structures, and
reporting the results. In addition, the interesting problems are huge
and involve vast amounts of data. For example, the logistics for the
'44 D-Day invasion of Normandy held the record as the largest linear
programming problem ever solved well into the '70s. The equations for
the Simplex solution were written in a few lines but the pile of data
processed was humongous and the actual execution took months. (It had
to be split up into many chunks because of the MTTF of the computer
hardware and a lot of preprocessing was done by acres of clerks with
hand-cranked calculators.)
</aside>

> (It may be possible to use a DB to solve Salesmen quickly, but few
> bother to research that area.)

Unlikely. It's an NP-complete problem so the worst case always involves
an exhaustive search of all possible combinations (i.e., O(N!)). The
exotic algorithms just provide /average/ performance that approaches
O(NlogN). But those algorithms require data structures that are highly
tailored to the solution. And because of the crunching one wants
identity in the form of array indices, not embedded in tables, or the
problem doesn't get solved in a lifetime.

>>>>BTW, remember that I am a translationist. When I do a UML model, I
>>>>don't care what language the transformation engine targets in the model
>>>>implementation. (In fact, transformation engines for R-T/E typically
>>>>target straight C from the OOA models for performance reasons.) Thus
>>>>every 3GL (or Assembly) represents a viable alternative implementation
>>>>of the notion of '3GL'.
>>>
>>>Well, UML *is* language. It is a visual language just like LabView is.
>>
>>Exactly. But solutions at the OOA level are 4GLs because they can be
>>unambiguously implemented without change on any platform with any
>>computing technologies.
>
> So can any Turing Complete language.

And your point is...?

On separation of concerns of problem solving dynamics vs. data
persistence and access:

>>>"Separation" is generally irrelavent in cyber-land. It is a phsycial
>>>concept, not a logical one. Perhaps you mean "isolatable", which can be
>>>made to be dynamic, based on needs. "Isolatable" means that there is
>>>enough info to produce a seperated *view* if and when needed. This is
>>>the nice thing about DB's: you don't have to have One-and-only-one
>>>separation/taxonomy up front. OO tends to want one-taxonomy-fits-all
>>>and tries to find the One True Taxonomy, which is the fast train the
>>>Messland. Use the virtual power of computers to compute as-need
>>>groupings based on metadata.
>>
>>You know very well what I mean by 'separation of concerns' in a software
>>context, so don't waste our time recasting it. Modularity has been a
>>Good Practice since the late '50s.
>
> If there is only one concern set where each concern is mutually
> exclusive, then we have no disagreement. In practice there are usually
> multiple "partioning" candidates, and that is where the disagreements
> usually arise. File and text systems don't make it easy to have
> partitioning in all dimensions, so compromises must be made. It is "my
> factor is more important than your factor, neener neener". If there is
> only one way to slice the pizza, then there is no problem. But if there
> are multiple ways, then a fight breaks out.
>
> This is one reason why DB's are useful: the more info you put into the
> DB instead of code, the more ad-hoc, situational partitionings you can
> view. You are not forced to pick the One Right Taxonomy of
> partitioning. Categorizational philosphers came to the consensus that
> there is no One Right Taxonomy for most real-world things.

There are three accepted criteria for application partitioning (i.e.,
separating concerns at the scale of subsystems): Subject matter, level
of abstraction, and requirements allocation via client/service
relationships. (BTW, this has nothing to do with OO; it is basic
Systems Engineering.)

Subject matter: Clearly static data storage and providing generic access
to it is a different subject matter than solving Problem X.

Level of abstraction: Outside CRUD/USER processing the detailed
manipulation of data storage (e.g., two-phase commit) is clearly at a
much lower level of abstraction than the algorithmic processing that
solves a particular problem. IOW, the application solution is
completely indifferent to where and how data is stored. One should be
able to solve the problem the same way regardless of what the
persistence mechanisms are.
That substitutability means that the problem solution is at a higher
level of abstraction than the persistence mechanisms.

Requirements Allocation: Clearly the requirements for persistence
implementation and access are quite different than the requirements on
the specific solution of Problem X.

So under all three of these criteria it makes sense to separate the
concerns of persistence from individual problem solutions. That's
exactly what DBMSes do. The problems only come into play when one
violates that separation of concerns and starts bleeding specific
problem solutions into the DBMS itself.

>>The RDB paradigm
>>is not designed for context-dependent problem solving; it is designed
>>for generic static data storage and access in a context-independent manner.
>
> I think what you view as context-dependent is not really context
> dependent after all. It is just your pet way of viewing the world
> because of all the OOP anti-DB hype.

My view of context-dependence is the solution to a /particular/ problem.
Each application solves a unique problem. IOW, the problem is the
context. RDBs provide persistence that allows all the applications to
access the data in a uniform way regardless of what specific problem
they are solving.

Whether one can solve the problem in a reasonable fashion with the data
structures mapped to the RDB structure depends on the nature of the
problem. For CRUD/USER processing one can. For problems outside that
realm one can't so one needs to convert data into structures tailored to
the problem in hand.

[Note that this is relevant to the point above about providing SQL
drivers for different storage paradigms. That makes sense for CRUD/USER
environments because one is already employing SQL as the norm. So long
as the exceptions requiring a special driver are fairly rare, one can
justify the single access paradigm. However, it makes no sense at all
for non-CRUD/USER environments because one has to reformat the data to
the problem solution anyway.
So rather than reformatting twice, one should just reformat once from a
driver that optimizes for the storage paradigm.]

>>Before you argue that the RAM RDB saves developer effort because it is
>>largely reusable and that may be worth more than performance, I agree.
>>But IME for /large/ non-CRUD/USER problems the computer is usually too
>>small and performance is critical.
>
> Please clarify. Ideally the RDBMS would determine what goes into RAM
> and what to disk such that the app developer doesn't have to give a
> rat's rear. Cache management generally does this, but a both-way system
> is probably not as fast as a dedicated RAM DB. Even if the two-way
> ideal is not fully reached, one will soon have the *option* to switch
> some or all of an app to a full-RAM DB as needed without rewriting the
> app. The query language abstracts/encapsulates/hides that detail way.

This is another non sequitur deflection. Caching and whatnot is not
relevant to the point I was making.

There is a business trade-off between run-time performance and developer
development time that every shop must make. Sometimes greater developer
productivity can justify reusing the RDB paradigm when more efficient
specific solutions are available. However, my point was that those
situations tend to map to CRUD/USER processing. Once problems become
more complex than format conversions in UI/DB pipeline applications,
performance becomes the dominant consideration. I spent years solving
large NP-complete problems on machines like PDP-11s and there was no
contest on that issue; customers simply would not spring for Crays in
their systems but they would spring for a marginal extra developer cost
prorated across all systems.

>>>>For a non-CRUD/USER application, SQL and the DBMS provide the first
>>>>relationship while a persistence access subsystem provides the
>>>>reformatting for the second relationship.
>>>
>>>Reformatting? Please clarify.
>>
>>The solution needs a different view of the data that is tailored to the
>>problem in hand. So the RDB view needs to be converted to the solution
>>view (and vice versa). IOW, one needs to reformat the RDB data
>>structures to the solution data structures.
>
> This is called a "result set" or "view". Most queries customize the
> data to a particular task. Thus, it *is* a solution view.

That formatting is cosmetic. The most sophisticated reformatting is
combining data from multiple tables in a join into a single table
dataset. I am talking about data structures whose semantics are
different, whose access paradigms are different, whose relationships are
different, and/or whose structure is different. IOW, there isn't a 1:1
mapping to the RDB. For example, if my solution requires the data to be
organized hierarchically, SQL queries can't do that.

>>>>When the only tool you have is a Hammer, then everything looks like a
>>>>Nail.
>>>
>>>No, out of necessity I started my career without DB usage, and I never
>>>want to return there.
>>
>>That's because you are in a CRUD/USER environment where P/R works quite
>>well. Try a problem like allocating a fixed marketing budget to various
>>national, state, and local media outlets in an optimal fashion for a
>>Fortunate 500.
>
> Again, I never said that DB's are good for every problem. I don't know
> enough about that particular scenario to propose a DB-centric solution
> and to know whether it is an exception or not.
>
> Unless you provide some specific use-case or detailed sceneria, it is
> anecdote against anecdote here.
>
> RDBMS are a common tool. The sales of Oracle, DB2, and Sybase are
> gigantic.

Of course they are. They provide a generic, context-independent access
to stored data that any application can use. That's why they exist.
But that is beside the point.

The issue here is where individual business problems should get solved.
My assertion is that is an application responsibility.
For CRUD/USER processing one can use the same data structures in the
solution as in the RDB so P/R as a software development paradigm works
well. Generally, though, one can't use the same data structures once
one is out of the CRUD/USER realm so P/R doesn't work very well.

>>>>>><moved>What I implied was that CRUD/USER applications tend to be not very
>>>>>>complex. Report generation was never very taxing even back in the COBOL
>>>>>>days, long before SQL, RDBs, or even the RDM. Substituting a GUI or
>>>>>>browser UI for a printed report doesn't change the fundamental nature of
>>>>>>the processing.
>>>>>
>>>>>Please clarify. If a process was "not taxing", then you are simply
>>>>>given more duties and projects to work on. Management loads you up
>>>>>based on your productivity and work-load.
>>>>
>>>>Back in the '60s and early '70s writing COBOL code to extract data and
>>>>format reports was a task given to the entry level programmers. That's
>>>>where the USER acronym (Update, Sort, Extract, Report) came from. The
>>>>stars went on to coding Payroll and Inventory Control where one had to
>>>>encode business rules and policies to solve specific problems.
>>>
>>>Fine, show how OO better solves business rule management. (Many if not
>>>most biz rules can be encoded as data, BTW, if you know how.)
>>
>>Why? That has nothing to do with whether a DBMS should execute dynamic
>>business rules and policies. This isn't an OO vs. P/R discussion, much
>>as you would like to make it so.
>
> Are you saying it is a UML-versus-RDB debate?

Another deflection. How do you get from how complex report generation
software is to UML vs. RDB? The topic here has nothing to do with OO,
P/R, or UML. It is about the complexity of processing for CRUD/USER
applications vs. other applications.

*************
There is nothing wrong with me that could
not be cured by a capful of Drano.

H. S.
Lahman
hsl(a)pathfindermda.com
Pathfinder Solutions -- Put MDA to Work
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
(888)OOA-PATH