Tue, 24 Mar 2009, 00:19
A student writes. The first comment is clearly beyond what I expect you to deal with in 360N, so feel free to skip over it if you wish. The second comment is more relevant to the immediate goals of 360N.

Dr. Patt,

I have a question about cache lines from today's class. A hypothetical situation: suppose we have a direct-mapped cache such as the example you proposed in class, with an 8-byte cache line. Let's also assume that the cache line represented by physical address 01001XYZ is in the cache. Suppose this is a very "dense" line. I don't really know what to call it, but by "dense" I mean that something like 7 of the 8 bytes in the line are frequently used by the executing program. Now, suppose that the program needs the line represented by physical address 10001XYZ. However, this line is very "sparsely" populated, so that only one or two of the bytes on that line are ever used by the program. Is there a way to filter out the necessary data and do some sort of partial replacement? Or does memory perhaps get reorganized somehow?

Filter is the correct word to use for this.

It just seems to me that it is a waste of processor time to keep switching out cache lines when you might need part of one line frequently and another part of another line frequently. Or perhaps I am simply thinking wishfully.

Sounds like this undergraduate is a computer architecture PhD student in the making. Excellent insights. Actually, if you are interested, we have a couple of papers dealing with exactly that issue, although there is plenty of research still to do:

"Line Distillation: Increasing Cache Capacity by Filtering Unused Words in Cache Lines," Moinuddin Qureshi, Muhammed Aater Suleman, and Yale N. Patt, Proceedings of the 13th Annual IEEE High Performance Computer Architecture Symposium (HPCA-13), Phoenix, February 2007.

"The V-Way Cache: Demand Based Associativity via Global Replacement," Moinuddin Qureshi, David Thompson, and Yale N. Patt, Proceedings of the 32nd International Symposium on Computer Architecture, Madison, Wisconsin, June 2005.

Moin is now Dr. Qureshi and is at IBM Research in Yorktown Heights, Aater is almost finished with his PhD at UT, and Dave is working at TI in Dallas.

Also, you mentioned that the cache addresses were all physical addresses. Are we assuming that they have already been translated from virtual addresses?

Yes.

Or, it seems to me that caches are part of the microarchitecture and not the ISA, since they are simply one implementation choice on a chip. In that case, would one not need a virtual address for the cache, since the executing code would simply see its own picture of the memory, regardless of whether or not the chip has a cache?

Indeed, the executing code only sees its own picture of the memory whether or not the chip has a cache. However, since the code needs to read what is in this "picture," this picture had better correspond to physical locations. It is true that we could forgo the translation and cache this information using only its virtual addresses. The problem is that two processes may have the same virtual address corresponding to two different physical locations. For example, both have a virtual page 0. What would happen if I bring a location into the cache based on process A's virtual address, and then get a hit when process B later accesses this same virtual address? Note they are the same virtual address, but they correspond to two very different physical locations. What to do? Actually, there are two usual ways to deal with this: either store a process ID in every tag store entry, or flush the cache on every context switch. Both ways have serious negatives. What are they? Thanks for asking. I was going to talk about this on Wednesday, so now I can dismiss the class on Wednesday at 6:23 instead of 6:25.

Hope this helps.

Yale Patt

Thanks.

<<name withheld to protect the future PhD student>>
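A postscript on the student's direct-mapped example: the reason the "dense" line and the "sparse" line keep evicting each other is that both addresses carry the same index bits. A small sketch of the address split, assuming the 8-byte line from class (3 offset bits, the "XYZ") and a made-up 4-line cache (2 index bits) just for illustration:

```python
# Splitting a physical address in a direct-mapped cache.
# Line size (8 bytes) is from the class example; the 4-line
# cache size is an assumption made only for this sketch.
LINE_BYTES  = 8   # -> 3 offset bits (the "XYZ" in 01001XYZ)
NUM_LINES   = 4   # assumed -> 2 index bits
OFFSET_BITS = 3
INDEX_BITS  = 2

def split(addr):
    offset = addr & (LINE_BYTES - 1)
    index  = (addr >> OFFSET_BITS) & (NUM_LINES - 1)
    tag    = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

a = 0b01001000   # 01001XYZ with XYZ = 000
b = 0b10001000   # 10001XYZ with XYZ = 000

print(split(a))  # -> (0b010, 0b01, 0) : tag 2, index 1
print(split(b))  # -> (0b100, 0b01, 0) : tag 4, same index 1
# Same index, different tags: each fetch of one line evicts the
# other, even though most bytes of the evicted line are still hot.
```

That conflict, combined with the unused bytes the student noticed, is exactly the waste the Line Distillation and V-Way Cache papers attack.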
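The "store a process ID in every tag store entry" option can be sketched in a few lines. This is a toy model, not any real design: the 4-entry cache, field widths, and method names are all assumptions made only to show why the ID check prevents process B from hitting on process A's data.

```python
# Toy virtually-addressed cache whose tag store also records a
# process ID (often called an ASID). All sizes here are assumptions
# for illustration, not a real machine's parameters.
class VCache:
    def __init__(self):
        self.store = {}              # index -> (asid, tag, data)

    def fill(self, asid, vaddr, data):
        index = vaddr & 0x3          # assumed 4-entry cache
        self.store[index] = (asid, vaddr >> 2, data)

    def lookup(self, asid, vaddr):
        index, tag = vaddr & 0x3, vaddr >> 2
        entry = self.store.get(index)
        if entry and entry[0] == asid and entry[1] == tag:
            return entry[2]          # hit: same process AND same tag
        return None                  # miss: without the asid check,
                                     # process B would wrongly hit here

c = VCache()
c.fill(asid=0, vaddr=0x00, data="A's page 0 data")   # process A fills
print(c.lookup(asid=0, vaddr=0x00))  # A hits its own data
print(c.lookup(asid=1, vaddr=0x00))  # B misses: same vaddr, different process
```

Deleting the `entry[0] == asid` comparison reproduces the bug in the question: B would receive A's data for the same virtual address. The alternative, flushing the whole cache on every context switch, needs no extra tag bits but throws away every line, hot or not, on each switch.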