From: Paul A. Clayton on 1 Feb 2010 17:34

Given that L2 (and farther) reads are usually transmitted in about four transfers (e.g., 64B blocks --> four 16B transfers), would it be profitable to place the predicted critical block in a nearby (lower latency) area? (Even a predictor as simple as first read on previous access might provide some benefit--at the cost of two tag bits and some miss handling complexity.)

An extension of this might be used in an L2 shared by two cores: critical blocks could be placed near the appropriate core. (Obviously, such would involve more complex allocation and placement issues.)

While I have not read many NUCA papers, I have not yet seen any that use predictability of access (prefetchability) to bias placement. (It looks like the POWER7 L3 cache might be something like a Reactive NUCA [Hardavellas et al., 2009] cache!)

Paul A. Clayton
just a technophile
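
For illustration only, here is a minimal Python sketch of the "first read on previous access" predictor mentioned in the post. It assumes 64B blocks split into four 16B pieces (called "chunks" here), so the prediction fits in the two tag bits the post refers to. The class and function names (`CriticalChunkPredictor`, `hit_rate`) are hypothetical and not taken from any real cache design; a dictionary stands in for the per-block tag bits, and placement itself is not modeled.

```python
# Hypothetical sketch: names and parameters are illustrative assumptions,
# not taken from any real cache design.
BLOCK_BYTES = 64
CHUNK_BYTES = 16
CHUNKS_PER_BLOCK = BLOCK_BYTES // CHUNK_BYTES  # 4 chunks -> 2 predictor bits


class CriticalChunkPredictor:
    """Remembers, per block, which 16B chunk was read first last time."""

    def __init__(self):
        # block number -> chunk index (0..3); models two extra tag bits
        self.first_chunk = {}

    def predict(self, addr):
        """Return the chunk predicted to be critical, or None if untrained."""
        return self.first_chunk.get(addr // BLOCK_BYTES)

    def record_first_read(self, addr):
        """Call on the first read of a block after it is (re)fetched."""
        block = addr // BLOCK_BYTES
        self.first_chunk[block] = (addr % BLOCK_BYTES) // CHUNK_BYTES


def hit_rate(predictor, first_reads):
    """Fraction of first reads whose chunk matched the stored prediction."""
    hits = 0
    for addr in first_reads:
        chunk = (addr % BLOCK_BYTES) // CHUNK_BYTES
        if predictor.predict(addr) == chunk:
            hits += 1
        # Train on the observed first read, as the post suggests.
        predictor.record_first_read(addr)
    return hits / len(first_reads)
```

Feeding `hit_rate` a trace of first-read addresses gives a rough feel for how often the previously critical chunk is critical again, which is the property that would decide whether placing that chunk in a lower-latency area pays off.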