From: Lew on 27 Jul 2010 08:00 Andreas Leitgeb wrote: > On some deeper level, a relational DB seems to actually use the "separate > arrays" approach, too. Otherwise I cannot explain the relatively low cost > of adding another column to a table of 100 million entries already in it. There's a big difference between a database with tens of thousands, maybe far more manhours of theory, development, testing, user feedback, optimization efforts, commercial competition and evolution behind it, and an ad-hoc use of in-memory arrays by a solo programmer. A database system is far, far more than a simple "separate arrays" approach. There are B[+]-trees, caches, indexes, search algorithms, stored procedures, etc., etc., etc. Your comment is like saying that "on some deeper level" a steel-and-glass skyscraper is like the treehouse you built for your kid in the back yard. -- Lew
From: Tom McGlynn on 27 Jul 2010 09:53 On Jul 26, 1:33 pm, Tom Anderson <t...(a)urchin.earth.li> wrote: > On Mon, 26 Jul 2010, Tom McGlynn wrote: > > On Jul 26, 8:01 am, Tom Anderson <t...(a)urchin.earth.li> wrote: > > >> But Joshua was talking about using instances of Color, where those > >> instances are singletons (well, flyweights is probably the right term > >> when there are several of them), exposed in static final fields on > >> Color, and > > > While I agree with Tom's main point here, I am dubious about his > > suggestion for what the static instances should be called in a class > > that only creates a unique closed set of such instances. > > > I don't think flyweights is the right word. For me flyweights are > > classes where part of the state is externalized for some purpose. This > > is orthogonal to the concept of singletons. E.g., suppose I were running > > a simulation of galaxy mergers of two 100-million-star galaxies. Stars > > differ only in position, velocity and mass. Rather than creating 200 > > million Star objects I might create a combination flyweight/singleton > > Star where each method call includes an index that is used to find the > > mutable state in a few external arrays. > > I am 90% sure that is absolutely not how 'flyweight' is defined in the > Gang of Four book, from which its use in the programming vernacular > derives. If you want to use a different definition, then that's fine, but > you are of course wrong. > > > So is there a better word than flyweights for the extension of a > > singleton to a set with cardinality > 1? > > Multipleton. > > More seriously, enumeration. > Thanks, Tom, for responding to my real question about the vocabulary rather than the example. That was just intended to clarify a distinction between a singleton and my idea of a flyweight. Here's a bit of what the GOF has to say about flyweights. (Page 196 in my version).... "A flyweight is a shared object that can be used in multiple contexts simultaneously. 
The flyweight acts as an independent object in each context--it's indistinguishable from an instance of the object that's not shared.... The key concept here is the distinction between intrinsic and extrinsic state. Intrinsic state is stored in the flyweight. It consists of information that's independent of the flyweight's context, thereby making it shareable. Extrinsic state depends on and varies with the flyweight's context and therefore can't be shared. Client objects are responsible for passing extrinsic state to the flyweight when it needs it." That's reasonably close to what I had in mind. In my simple example stars may share some common state (e.g., age), but the information about the position and velocity is extrinsic and supplied when the object is used. In a more realistic example I might have multiple flyweights for different types of stars. Getting back to my original concern, I don't think enumeration is a good word for the concept either. Enumerations are often used for an implementation of the basis set -- favored in Java by special syntax. However, the word enumeration strongly suggests a list. In general the set of special values may have a non-list relationship (e.g., they could form a hierarchy). I like the phrase 'basis set' I used above, but that suggests that other elements can be generated by combining the elements of the basis, so it's not really appropriate either. Regards, Tom McGlynn
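Tom McGlynn's star example above (shared intrinsic state in the flyweight, extrinsic position, velocity and mass passed in by index) might look roughly like the following in Java. This is only a sketch under assumptions: the names `StarStore` and `StarFlyweight`, the `type` field, and the two methods are hypothetical and do not come from the thread.

```java
// Hypothetical sketch: all class, field, and method names are assumptions.

// Extrinsic state: one slot per star, held in parallel primitive arrays.
final class StarStore {
    final double[] x, y, z;    // position
    final double[] vx, vy, vz; // velocity
    final double[] mass;

    StarStore(int count) {
        x = new double[count];
        y = new double[count];
        z = new double[count];
        vx = new double[count];
        vy = new double[count];
        vz = new double[count];
        mass = new double[count];
    }
}

// Flyweight: holds only the state shared by every star of one type.
final class StarFlyweight {
    private final String type; // intrinsic, shareable state

    StarFlyweight(String type) {
        this.type = type;
    }

    String type() {
        return type;
    }

    // The caller supplies the extrinsic state as (store, index) on each call.
    double kineticEnergy(StarStore s, int i) {
        double v2 = s.vx[i] * s.vx[i] + s.vy[i] * s.vy[i] + s.vz[i] * s.vz[i];
        return 0.5 * s.mass[i] * v2;
    }

    // Move star i forward by dt at its current velocity.
    void advance(StarStore s, int i, double dt) {
        s.x[i] += s.vx[i] * dt;
        s.y[i] += s.vy[i] * dt;
        s.z[i] += s.vz[i] * dt;
    }
}
```

With this shape, 200 million stars cost seven `double[]` arrays plus a handful of `StarFlyweight` instances, rather than 200 million object headers and references.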
From: Alan Gutierrez on 27 Jul 2010 13:34 Lew wrote: > Andreas Leitgeb wrote: >> On some deeper level, a relational DB seems to actually use the "separate >> arrays" approach, too. Otherwise I cannot explain the relatively low >> cost >> of adding another column to a table of 100 million entries already in it. > > There's a big difference between a database with tens of thousands, > maybe far more manhours of theory, development, testing, user feedback, > optimization efforts, commercial competition and evolution behind it, > and an ad-hoc use of in-memory arrays by a solo programmer. > > A database system is far, far more than a simple "separate arrays" > approach. There are B[+]-trees, caches, indexes, search algorithms, > stored procedures, etc., etc., etc. Your comment is like saying that > "on some deeper level" a steel-and-glass skyscraper is like the > treehouse you built for your kid in the back yard. In other words, nobody ever got fired for buying IBM. -- Alan Gutierrez - alan(a)blogometer.com - http://twitter.com/bigeasy
From: Alan Gutierrez on 27 Jul 2010 14:11 Andreas Leitgeb wrote: > Lew <lew(a)lewscanon.com> wrote: >> Lew wrote: >>>> Except that that "mutable state in a few external arrays" still has to >>>> maintain state for order 10^8 objects, coordinating position, velocity >>>> (relative to ...?) and mass. So you really aren't reducing very much, >>>> except simplicity in the programming model and protection against >>>> error. >> Andreas Leitgeb wrote: >>> About 2.4GB are not much? >>> (200 mill * (8bytes plain Object + 4bytes for some ref to each)) >> No, it isn't. Around USD 75 worth of RAM. > > Except, if the machine is already RAM-stuffed to its limits... > > Even if the machine wasn't yet fully RAM'ed, then buying more RAM > *and* using the arrays-kludge(yes, that's it, afterall) would allow > even larger galaxies to be simulated. The "RAM is cheaper than programmer time" argument is useful to salt the tail of the newbie who seeks to dive down every micro-optimization rabbit hole that he comes across on the path to the problems that truly deserve such intense consideration. You have to admire the moxie of the newbie who wants to catenate last name first as fast as possible, but you explain to them that there are plenty of dragons to slay further down the road. It is not a good argument for someone who brings a problem that is truly limited by available memory. Memory management is an appropriate consideration for the problem. Memory management is the problem. Memory procurement is the non-programmer solution. Throw money at it. Scale up rather than scaling out, because we can scale up with cash, but scaling out requires programmers who understand algorithms. You're right that scaling up hits a foreseeable limit. I like to have the limitations of my program be unforeseeable. 
That is, if I'm going to read something into memory, say, every person in the world who would loan money to me personally without asking questions, I'd like to know that hitting the limits of the finite resource employed on a contemporary computer system correlates to a situation in reality that is unimaginable. Moore's Law does not excuse brute force. Which is why I am similarly taken aback to hear RAM prices quoted for something that has obvious solutions in plain old Java. >>> I for myself might choose the perhaps less cpu-efficient (due to all >>> the repeated indexing and for the defied locality) and also less simple >>> programming model, if it allowed me to solve larger problems with the >>> available RAM. >> With that much data to manage, I'd go with the straightforward object >> model and a database or the $75 worth of RAM chips. Or both. > > On some deeper level, a relational DB seems to actually use the "separate > arrays" approach, too. Otherwise I cannot explain the relatively low cost > of adding another column to a table of 100 million entries already in it. On some deeper level, a relational database through an object relational mapping layer will be paging information in and out of memory, on and off of disk, as you need it. That is the feature you need to address your memory problem. Lately, I've been mucking about with `MappedByteBuffer`, so I imagine for your (hypothetical) problem of modeling the Galaxy, you would model it by keeping the primitives you describe in the `MappedByteBuffer` and creating objects from them as needed. This is not `Flyweight` to my mind. With a flyweight, you keep objects that map to a finite set of values, and these values are assembled into a larger structure in an infinite number of permutations. These atomic components exist within the larger structure, but they are reused. Interned `String` is a flyweight to my mind. 
I'm not sure what the pattern is for the short term objectification of a record, but that is a lot of what Hibernate is about. Making objecty that which is stringy, just long enough for you to do your CRUD in the security of your type-safe world. >> And what about when the model changes, and you want to track star age, >> brightness, color, classification, planets, name, temperature, >> galactic quadrant, ...? With parallel arrays the complexity and risk >> of bugs just goes up and up. With an object model, the overhead of >> maintaining that model becomes less and less significant, but the >> complexity holds roughly steady. > > 100% agree to these points. You create a `Star` object that can read the information from a `MappedByteBuffer` at a particular index, and when the model changes you simply change the `read` and `write` methods of the star. You've reached down to the deeper level of the ORM+RDBMS stack and extracted the only design pattern you need to address the problem of reading the Universe into memory. -- Alan Gutierrez - alan(a)blogometer.com - http://twitter.com/bigeasy
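The `Star`-over-`MappedByteBuffer` idea in Alan Gutierrez's post above could be sketched as follows. Everything specific here is an assumption made for illustration: the record layout of seven doubles, the helper class `StarFile`, and the method signatures are hypothetical, and only `FileChannel.map` and the absolute `getDouble`/`putDouble` calls are standard `java.nio` API. Note also that a single mapping is limited to `Integer.MAX_VALUE` bytes, so 200 million stars at 56 bytes each would need several mappings; this sketch handles only one.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical record layout (an assumption, not from the thread):
// seven doubles per star, in the order x, y, z, vx, vy, vz, mass.
final class Star {
    static final int RECORD_BYTES = 7 * Double.BYTES;

    final double x, y, z, vx, vy, vz, mass;

    Star(double x, double y, double z,
         double vx, double vy, double vz, double mass) {
        this.x = x; this.y = y; this.z = z;
        this.vx = vx; this.vy = vy; this.vz = vz;
        this.mass = mass;
    }

    // Build a short-lived object from the record stored at the given index.
    static Star read(MappedByteBuffer buf, int index) {
        int base = index * RECORD_BYTES;
        return new Star(
            buf.getDouble(base),
            buf.getDouble(base + 1 * Double.BYTES),
            buf.getDouble(base + 2 * Double.BYTES),
            buf.getDouble(base + 3 * Double.BYTES),
            buf.getDouble(base + 4 * Double.BYTES),
            buf.getDouble(base + 5 * Double.BYTES),
            buf.getDouble(base + 6 * Double.BYTES));
    }

    // Store this star's fields into the record at the given index.
    void write(MappedByteBuffer buf, int index) {
        int base = index * RECORD_BYTES;
        buf.putDouble(base, x);
        buf.putDouble(base + 1 * Double.BYTES, y);
        buf.putDouble(base + 2 * Double.BYTES, z);
        buf.putDouble(base + 3 * Double.BYTES, vx);
        buf.putDouble(base + 4 * Double.BYTES, vy);
        buf.putDouble(base + 5 * Double.BYTES, vz);
        buf.putDouble(base + 6 * Double.BYTES, mass);
    }
}

// Hypothetical helper that maps a file large enough for starCount records.
final class StarFile {
    static MappedByteBuffer map(Path file, int starCount) {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // The mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_WRITE, 0,
                          (long) starCount * Star.RECORD_BYTES);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the model later grows a field such as age or temperature, only `RECORD_BYTES`, `read`, and `write` change; callers keep working with `Star` objects, and the operating system pages the backing file in and out as records are touched.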
From: Martin Gregorie on 27 Jul 2010 14:58
On Tue, 27 Jul 2010 12:34:27 -0500, Alan Gutierrez wrote: > > In other words, nobody ever got fired for buying IBM. > Regardless of what you might think of their business methods, and in the past they didn't exactly smell of roses, their software quality control and their hardware build quality are both hard to beat. I've used S/38 and AS/400 quite a bit and never found bugs in their system software or lost work time due to hardware problems. For elegant systems design ICL had them beat hands down, but although ICL quality was OK by IT standards the IBM kit was more reliable. IME anyway. -- martin@ | Martin Gregorie gregorie. | Essex, UK org | |