From: Tom McGlynn on 29 Jul 2010 08:51 On Jul 29, 7:29 am, Tom Anderson <t...(a)urchin.earth.li> wrote: > On Wed, 28 Jul 2010, Tom McGlynn wrote: > > I'm a little intrigued by the discussion of the appropriate choices for > > the architecture for an n-body calculation by a group which likely has > > little experience if any in the field. > > > Direct n-body calculations need to compute the distance between pairs of > > objects. The distances between nearby objects need to be calculated > > orders of magnitude more frequently than between distant objects. If > > the data can be organized such that nearby in [simulated] space stars > > tend to be nearby in memory, then cache misses may be substantially > > reduced. This can improve performance by an order of magnitude or more. > > In Java the only structure you have available that allows for managing > > this (since nearby pairs change with time) is a primitive array. Java > > gives no way, as far as I know, to manage the memory locations of > > distinct objects. > > True, although it doesn't prevent cache locality - whereas the parallel > arrays approach immediately rules out locality of the coordinates of a > single star, because they'll be in different arrays. If you want locality, > you'd have to pack the values of all three coordinates into one big array, > which of course is possible. > > The dual of the fact that java doesn't let you control locality is that > JVMs are free to control it. There is research going back at least ten > years now into allocation and GC strategies that improve locality. Indeed, > for some popular kinds of collectors, locality is a standard side-effect - > any moving collector where objects are moved in a depth-first traversal of > (some subgraph of) the object graph will tend to put objects shortly after > some other object that refers to them, and thus also close to objects that > are also referred to by that object. It may not help enormously for > mesh-structured object graphs, but it works pretty well for the trees that > are common in real life. If these Stars are held in an octree, for > example, we might expect decent locality. > Putting everything in a single array is certainly fine. It doesn't change the basic flyweight idea here. Since in most n-body codes stars are never destroyed, garbage collection as such may not come into play much. I'd be curious how much storage reallocation is done in current JVMs where there is very little creation or destruction of objects. Of course if one built a changeable star hierarchy then one would be continually creating and destroying branch nodes of the hierarchy (as stars move) even though the leaf nodes would be immortal. Perhaps the churn of branch nodes would be enough for the gc to move the leafs appropriately. It is certainly possible that a clever enough JVM could address this automatically and efficiently. The n-body code may have the advantage in that it can do things predictively rather than reactively, but a system approach would likely be able to adapt to changes in the local environment better. But I'm not trying to persuade people that using flyweights in the sense that I suggested in my example is necessarily the right thing to do in all circumstances, merely to note that it may be a rational or even desirable approach in some. That example was given to give a concrete realization of an object that simultaneously implemented flyweight and singleton, not for its intrinsic merit but I was a little bemused by reaction to it. Regards, Tom McGlynn
From: Lew on 29 Jul 2010 09:00 Tom McGlynn wrote: > But I'm not trying to persuade people that using flyweights in the > sense that I suggested in my example is necessarily the right thing to > do in all circumstances, merely to note that it may be a rational or > even desirable approach in some. That example was given to give a > concrete realization of an object that simultaneously implemented > flyweight and singleton, not for its intrinsic merit but I was a > little bemused by reaction to it. When dealing with 200 M stars, one might be tempted to use a multi-threaded approach. Sharing a singleton among threads introduces the complexity and overhead of synchronization. The straightforward object approach, combined with appropriate caching (the buffer approach or DBMS approach) simplifies concurrent implementations. -- Lew
From: Tom McGlynn on 29 Jul 2010 09:24 On Jul 29, 7:20 am, Tom Anderson <t...(a)urchin.earth.li> wrote: > On Tue, 27 Jul 2010, Tom McGlynn wrote: > > On Jul 26, 1:33 pm, Tom Anderson <t...(a)urchin.earth.li> wrote: > >> On Mon, 26 Jul 2010, Tom McGlynn wrote: > >>> On Jul 26, 8:01 am, Tom Anderson <t...(a)urchin.earth.li> wrote: > > >>>> But Joshua was talking about using instances of Color, where those > >>>> instances are singletons (well, flyweights is probably the right term > >>>> when there are several of them) > > >>> I don't think flyweights is the right word. For me flyweights are > >>> classes where part of the state is externalized for some purpose. This > >>> is orthogonal to the concept of singletons. E.g., suppose I were > >>> running a simulation of galaxy mergers of two 100-million-star > >>> galaxies. Stars differ only in position, velocity and mass. Rather > >>> than creating 200 million Star objects I might create a combination > >>> flyweight/singleton Star where each method call includes an index that > >>> is used to find the mutable state in a few external arrays. > > >> I am 90% sure that is absolutely not how 'flyweight' is defined in the > >> Gang of Four book > > > Here's a bit of what the GOF has to say about flyweights. (Page 196 > > in my version).... > > > "A flyweight is a shared object that can be used in multiple contexts > > simultaneously. The flyweight acts as an independent object in each > > context--it's indistinguishable from an instance of the object that's > > not shared.... The key concept here is the distinction between intrinsic > > and extrinsic state. Intrinsic state is stored in the flyweight. It > > consists of information that's independent of the flyweight's context, > > thereby making it shareable. Extrinsic state depends on and varies with > > the flyweights context and therefore can't be shared. Client objects > > are responsible for passing extrinsic state to the flyweight when it > > needs it." > > > That's reasonably close to what I had in mind. > > Yes, point taken. I'm still not happy with your usage, though. > > IIRC, the example in GoF is of a Character class in a word processor. So, > a block of text is a sequence of Character objects. Each has properties > like width, height, vowelness, etc and methods like paintOnScreen. But > because every lowercase q behaves much the same as every other lowercase > q, rather than having a separate instance for every letter in the text, we > have one for every distinct letter. > > The extrinsic state in this example is the position in the text, the > typeface, the style applied to the paragraph, etc. Certainly, things that > are not stored in the Character. But also not things that intrinsically > belong in the Character anyway; rather, things inherited from enclosing > objects. > > Whereas in your case, the array offset *is* something intrinsic to the > Star. If it had been something else, say the coordinates of the centre of > mass of the local cluster, then i'd agree that that was Flyweightish. But > i'm not so sure about the array index. > Hmmm.... The GOF has the location of the letter as extrinsic state, while I'm suggesting the location of the star. Seems pretty comparable. My use of the array index is simply the way I supply the extrinsic information, it's not intrinsic to a given Star [and in fact the index of the same 'Star' might change during the simulation, since the array is likely to be resorted continually in the process]. I wonder if the sticking point is the lack of any internal state. But the GOF notes that the FlyWeight is particularly applicable when "Most object state can be made extrinsic". This doesn't mean that Star isn't a real class. We can have lots of methods in the Star class, e.g., distanceFromNeighbor(i), move(), accelerate(), force(), ... .....back to what word to use... > > I can't think of a good word for this. Do we need one? What are some > examples of this pattern in the wild? > I was inspired to start this subthread by the sense that something was missing given your use of flyweight for this concept. As for examples: Enumerations are one of course. Another might be the states in finite state machines. I think the concept of "An a priori known and unalterable set of instances" comes up fairly commonly in code and it might be useful to be able to convey that quickly. If thats true perhaps the first step is to agree about what the concept is and then worry about the word. Regards, Tom McGlynn
From: Tom Anderson on 29 Jul 2010 09:25 On Thu, 29 Jul 2010, Lew wrote: > Tom McGlynn wrote: > >> But I'm not trying to persuade people that using flyweights in the >> sense that I suggested in my example is necessarily the right thing to >> do in all circumstances, merely to note that it may be a rational or >> even desirable approach in some. That example was given to give a >> concrete realization of an object that simultaneously implemented >> flyweight and singleton, not for its intrinsic merit but I was a little >> bemused by reaction to it. > > When dealing with 200 M stars, one might be tempted to use a > multi-threaded approach. Sharing a singleton among threads introduces > the complexity and overhead of synchronization. McGlynn's flyweights are immutable, so sharing them between threads should be fine. The mutable state lives in the parallel arrays, and access to those would need to be controlled. *That* could be tricky, because there's no natural object to lock on if you want to access a given row. You certainly can't do it with a flyweight, because the same flyweight instance will be in use by other threads to represent entirely different stars! > The straightforward object approach, combined with appropriate caching > (the buffer approach or DBMS approach) simplifies concurrent > implementations. Hmm. I don't think Star-level locking would be a good idea, regardless of how Stars work. You'd want to partition larger units between threads. Given that you'll have to build a mechanism for controlling concurrency at that level anyway, whether you have real stars or flyweights might not matter very much. tom -- The art of medicine consists in amusing the patient while nature cures the disease. -- Voltaire
From: Alan Gutierrez on 29 Jul 2010 10:39
Tom McGlynn wrote: > On Jul 29, 7:20 am, Tom Anderson <t...(a)urchin.earth.li> wrote: >> On Tue, 27 Jul 2010, Tom McGlynn wrote: >>> On Jul 26, 1:33 pm, Tom Anderson <t...(a)urchin.earth.li> wrote: >>>> On Mon, 26 Jul 2010, Tom McGlynn wrote: >>>>> On Jul 26, 8:01 am, Tom Anderson <t...(a)urchin.earth.li> wrote: >>>>>> But Joshua was talking about using instances of Color, where those >>>>>> instances are singletons (well, flyweights is probably the right term >>>>>> when there are several of them) >>>>> I don't think flyweights is the right word. For me flyweights are >>>>> classes where part of the state is externalized for some purpose. This >>>>> is orthogonal to the concept of singletons. E.g., suppose I were >>>>> running a simulation of galaxy mergers of two 100-million-star >>>>> galaxies. Stars differ only in position, velocity and mass. Rather >>>>> than creating 200 million Star objects I might create a combination >>>>> flyweight/singleton Star where each method call includes an index that >>>>> is used to find the mutable state in a few external arrays. >>>> I am 90% sure that is absolutely not how 'flyweight' is defined in the >>>> Gang of Four book >>> Here's a bit of what the GOF has to say about flyweights. (Page 196 >>> in my version).... >>> "A flyweight is a shared object that can be used in multiple contexts >>> simultaneously. The flyweight acts as an independent object in each >>> context--it's indistinguishable from an instance of the object that's >>> not shared.... The key concept here is the distinction between intrinsic >>> and extrinsic state. Intrinsic state is stored in the flyweight. It >>> consists of information that's independent of the flyweight's context, >>> thereby making it shareable. Extrinsic state depends on and varies with >>> the flyweights context and therefore can't be shared. Client objects >>> are responsible for passing extrinsic state to the flyweight when it >>> needs it." >>> That's reasonably close to what I had in mind. >> Yes, point taken. I'm still not happy with your usage, though. >> >> IIRC, the example in GoF is of a Character class in a word processor. So, >> a block of text is a sequence of Character objects. Each has properties >> like width, height, vowelness, etc and methods like paintOnScreen. But >> because every lowercase q behaves much the same as every other lowercase >> q, rather than having a separate instance for every letter in the text, we >> have one for every distinct letter. >> >> The extrinsic state in this example is the position in the text, the >> typeface, the style applied to the paragraph, etc. Certainly, things that >> are not stored in the Character. But also not things that intrinsically >> belong in the Character anyway; rather, things inherited from enclosing >> objects. >> >> Whereas in your case, the array offset *is* something intrinsic to the >> Star. If it had been something else, say the coordinates of the centre of >> mass of the local cluster, then i'd agree that that was Flyweightish. But >> i'm not so sure about the array index. >> > > Hmmm.... The GOF has the location of the letter as extrinsic state, > while I'm suggesting the location of the star. Seems pretty > comparable. My use of the > array index is simply the way I supply the extrinsic information, it's > not intrinsic to a given Star [and in fact the index of the same > 'Star' might change during the simulation, since the array is likely > to be resorted continually in the process]. I wonder if the sticking > point is the lack of any internal state. But the GOF notes that the > FlyWeight is particularly applicable when "Most object state can be > made extrinsic". This doesn't mean that Star isn't a real class. We > can have lots of methods in the Star class, e.g., > distanceFromNeighbor(i), move(), accelerate(), force(), ... I don't think the state of the `Star` can be extract from the `Star`. The idea behind the Word Processor example is that you can cache the font size in an object along with the character code itself, and have an object that can be reused and reset into the document at any location, and then participate in a `Composite` pattern, where the characters participate as tiny graphical objects. But this implies that 11pt Helvetica 'C' is one object that is reused. I'm assuming that each of these `Star` objects will have different vales entirely, therefore `Flyweight` does not apply much at all. I've called this a tiny `Adaptor` because you're going to take a `MappedByteBuffer` or parallel arrays of primitives, or something structure that stores the state of the object, and when you need an object, create a temporary wrapper around the state of a `Star` stored at a particular index. -- Alan Gutierrez - alan(a)blogometer.com - http://twitter.com/bigeasy |