From: Robert Myers on 2 Mar 2010 14:48 On Mar 2, 2:12 pm, Stephen Fuld <SF...(a)alumni.cmu.edu.invalid> wrote: > On 3/2/2010 9:56 AM, Robert Myers wrote: > > > > Well, some of the details of Mitch's proposal aren't clearly specified. > I can't tell if he intends the off chip DRAM to be part of the > processor's address space or not. If it is, then the on-chip DRAM is > essentially a level 4 cache. But it could be that it isn't, in which > case, it is more like a fast paging device with some extra features. > Help me out here. Isn't the page file part of the processor's address space? When something is paged out, you don't have to worry about coherence because you can't touch it. Maybe there is some genius-level subtlety I'm missing. Robert.
From: Stephen Fuld on 2 Mar 2010 16:30 On 3/2/2010 11:48 AM, Robert Myers wrote: > On Mar 2, 2:12 pm, Stephen Fuld<SF...(a)alumni.cmu.edu.invalid> wrote: >> On 3/2/2010 9:56 AM, Robert Myers wrote: >> >> >> >> Well, some of the details of Mitch's proposal aren't clearly specified. >> I can't tell if he intends the off chip DRAM to be part of the >> processor's address space or not. If it is, then the on-chip DRAM is >> essentially a level 4 cache. But it could be that it isn't, in which >> case, it is more like a fast paging device with some extra features. >> > Help me out here. Isn't the page file part of the processor's address > space? No. In most systems, the process of bringing in a page to main memory from the page file causes the page to be mapped into the processor address space (I know, Del, not true of AS/400). That is, the page tables are updated. If this were not the case, you couldn't have multiple programs all running whose aggregate size totals more than the processor address space. So on a 32-bit processor, you couldn't have more than 4 GB of programs. But, of course, you can do this, as the pages that aren't active are in the page file and don't take up processor address space. > When something is paged out, you don't have to worry about > coherence because you can't touch it. That's true, but not the point. > Maybe there is some genius-level subtlety I'm missing. That's too easy, so I won't respond. :-) -- - Stephen Fuld (e-mail address disguised to prevent spam)
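For readers following the paging argument above, here is a minimal sketch of the idea, not a description of any real operating system. It assumes a toy machine with a fixed number of physical frames and a simple least-recently-used eviction policy; the names `ToyPager`, `allocate`, and `access` are hypothetical and exist only for illustration. A page is mapped only while it is resident; everything else sits in the page file, which is why the aggregate size of all programs can exceed what is mapped at any one time.

```python
from collections import OrderedDict


class ToyPager:
    """Toy demand pager (illustrative assumption, not a real OS design).

    Resident pages are "mapped"; all other pages live in the page file
    and are not reachable until a page fault brings them in.
    """

    def __init__(self, frames=4):
        self.frames = frames            # number of physical page frames
        self.resident = OrderedDict()   # (pid, vpage) -> data, oldest first
        self.page_file = {}             # (pid, vpage) -> data, unmapped
        self.faults = 0

    def allocate(self, pid, vpage, data):
        # New pages start out in the page file, i.e. unmapped.
        self.page_file[(pid, vpage)] = data

    def access(self, pid, vpage):
        key = (pid, vpage)
        if key in self.resident:
            # Hit: mark the page as most recently used.
            self.resident.move_to_end(key)
            return self.resident[key]
        # Page fault: fetch from the page file and update the mapping.
        self.faults += 1
        data = self.page_file.pop(key)
        if len(self.resident) >= self.frames:
            # Evict the least recently used page back to the page file.
            victim, victim_data = self.resident.popitem(last=False)
            self.page_file[victim] = victim_data
        self.resident[key] = data
        return data
```

With three processes of four pages each and only four frames, every page stays reachable, but at most four are mapped at once; the other eight sit in the page file until touched.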
From: Robert Myers on 2 Mar 2010 16:50 On Mar 2, 4:30 pm, Stephen Fuld <SF...(a)alumni.cmu.edu.invalid> wrote: > On 3/2/2010 11:48 AM, Robert Myers wrote: > > When something is paged out, you don't have to worry about > > coherence because you can't touch it. > > That's true, but not the point. > Sorry, but it does seem like the critical point. If what used to be the main memory is now nothing but a page file, then only one processor can have that page in Level 4 cache, or whatever you choose to call it, just as there can only be one copy of something in main memory. Processors would have shared access through some kind of NUMA architecture. If, on the other hand, the Level 4 cache operates like a cache, then processors share access through cache snooping. If there is some other detail that Mitch left out, I'm still missing it. I will admit that I confused the issue by using the term "Level 4 cache," although I might point out that IBM keeps Level 4 cache in main memory in some architectures. Robert.
From: Stephen Fuld on 2 Mar 2010 17:16 On 3/2/2010 1:50 PM, Robert Myers wrote: > On Mar 2, 4:30 pm, Stephen Fuld<SF...(a)alumni.cmu.edu.invalid> wrote: >> On 3/2/2010 11:48 AM, Robert Myers wrote: > >>> When something is paged out, you don't have to worry about >>> coherence because you can't touch it. >> >> That's true, but not the point. >> > Sorry, but it does seem like the critical point. If what used to be > the main memory is now nothing but a page file, then only one > processor can have that page in Level 4 cache, or whatever you choose > to call it, just as there can only be one copy of something in main > memory. Yes. > Processors would have shared access through some kind of NUMA > architecture. Mitch's original proposal was multiple cores on a single chip. When you say multiple processors, are you talking about the multiple cores on the chip or multiple chips? > If, on the other hand, the Level 4 cache operates like a cache, then > processors share access through cache snooping. If it is a single L4 cache on a chip, then there is no coherence issue for it on the chip. Multiple chips have the same coherence issues that current Intel and AMD chips have now with their on chip caches. Let me try a different explanation. Consider a system with DRAM main memory, but with the page file resident on a solid state disk (SSD) attached via, say, a SATA port. Except for the "details" of interconnect speed and I/O, this is entirely analogous, yet no one would call the DRAM memory of this system a level 4 cache. Please remember that I am not saying that what Mitch proposed actually operates this way, but that given the information he has provided, it is possible. -- - Stephen Fuld (e-mail address disguised to prevent spam)
From: Robert Myers on 2 Mar 2010 17:57
On Mar 2, 5:16 pm, Stephen Fuld <SF...(a)alumni.cmu.edu.invalid> wrote: > On 3/2/2010 1:50 PM, Robert Myers wrote: > > > On Mar 2, 4:30 pm, Stephen Fuld<SF...(a)alumni.cmu.edu.invalid> wrote: > >> On 3/2/2010 11:48 AM, Robert Myers wrote: > > >>> When something is paged out, you don't have to worry about > >>> coherence because you can't touch it. > > >> That's true, but not the point. > > > Sorry, but it does seem like the critical point. If what used to be > > the main memory is now nothing but a page file, then only one > > processor can have that page in Level 4 cache, or whatever you choose > > to call it, just as there can only be one copy of something in main > > memory. > > Yes. > > > Processors would have shared access through some kind of NUMA > > architecture. > > Mitch's original proposal was multiple cores on a single chip. When you > say multiple processors, are you talking about the multiple cores on the > chip or multiple chips? > > > If, on the other hand, the Level 4 cache operates like a cache, then > > processors share access through cache snooping. > > If it is a single L4 cache on a chip, then there is no coherence issue > for it on the chip. Multiple chips have the same coherence issues that > current Intel and AMD chips have now with their on chip caches. > > Let me try a different explanation. Consider a system with DRAM main > memory, but with the page file is resident on a solid state disk (SSD) > attached via say a SATA port. Except for the "details" of interconnect > speed and I/O, this is entirely analogous, yet no one would call the > DRAM memory of this system a level 4 cache. > > Please remember that I am not saying that what Mitch proposed actually > operates this way, but that given the information he has provided, it is > possible. > I have only myself to blame for my cavalier use of language. It seemed natural to call memory resident on the die "cache." 
Nothing would have tempted me to call the main memory in your SSD proposal cache, although, as I pointed out, IBM has already blurred the lines. I try to avoid arguments over terminology whenever possible. Never again will I refer to something that doesn't in every way conform to your notion of cache as "cache." Assuming that the on-chip memory is acting like main memory, then all cores on a die would have equal access to the main memory resident on the die (or chip, which could conceivably have multiple dies). If a core on another chip with its own distinct memory needed that data, it could only store a copy of that data in Level 3 cache, and you would have to deal with cache coherence. The situation seems completely analogous to what you would have with a multiple Nehalem system, except that the main memory has migrated to the chip and your page file would reside in motherboard memory (in all likelihood). Such an arrangement *still* leaves an interconnect bandwidth problem if there are multiple sockets in the system, as surely there would be where super-high bandwidth is a requirement. Maybe if you can jam enough into a single socket, you can justify an optical interconnect between sockets. Robert.
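The multi-chip case described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not what Mitch proposed and not how Nehalem or any other real part keeps caches coherent: it assumes a write-through, write-invalidate scheme in which the chip that owns the memory remembers which remote caches hold a copy. The names `HomeMemory` and `RemoteL3` are hypothetical.

```python
class HomeMemory:
    """Main memory resident on one chip (toy model, assumed for illustration).

    Tracks which remote caches hold a copy of each address so that a
    write can invalidate stale copies.
    """

    def __init__(self):
        self.data = {}
        self.sharers = {}   # addr -> set of RemoteL3 caches holding a copy

    def read_for(self, cache, addr):
        # Record the reader as a sharer and hand back the current value.
        self.sharers.setdefault(addr, set()).add(cache)
        return self.data.get(addr, 0)

    def write(self, addr, value, writer=None):
        # Invalidate every cached copy except the writer's own.
        for cache in self.sharers.pop(addr, set()):
            if cache is not writer:
                cache.invalidate(addr)
        self.data[addr] = value
        if writer is not None:
            self.sharers.setdefault(addr, set()).add(writer)


class RemoteL3:
    """Level 3 cache on another chip, holding copies of remote data."""

    def __init__(self, home):
        self.home = home
        self.lines = {}
        self.misses = 0

    def read(self, addr):
        if addr not in self.lines:
            # Miss: fetch across the interconnect from the home chip.
            self.misses += 1
            self.lines[addr] = self.home.read_for(self, addr)
        return self.lines[addr]

    def write(self, addr, value):
        # Write-through: update the local copy, then the home memory.
        self.lines[addr] = value
        self.home.write(addr, value, writer=self)

    def invalidate(self, addr):
        self.lines.pop(addr, None)
```

Every miss and every invalidation in this sketch stands for traffic between sockets, which is the interconnect bandwidth concern raised in the post.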