Tue, 24 Mar 2009, 00:19

A student writes.  The first comment is clearly beyond what I expect you
to deal with in 360N, so feel free to skip over it if you wish.  The second
comment is more relevant to the immediate goals of 360N.

	Dr. Patt,

	I have a question about cache lines from today's class.

	A hypothetical situation:

	Suppose we have a direct mapped cache such as the example that you 
	proposed in class with an 8 byte cache line. Let's also assume that 
	the cache line represented by physical address 01001XYZ is in the 
	cache. Suppose this is a very dense line. I don't really know what 
	to call it, but by "dense" I mean that something like 7 of the 8 bytes 
	in the line are frequently used by the executing program. Now, suppose 
	that the program needs the line represented by physical address 
	10001XYZ. However, this line is very "sparsely" populated, so that 
	only one or two of the bytes on that line are ever used by the program.
	Is there a way to filter out the necessary data, and do some sort of 
	partial replacement? Or perhaps does memory get reorganized somehow?

Filter is the correct word to use for this.

	It just seems to me that it is a waste of processor time to keep 
	switching out cache lines when you might need part of one line 
	frequently and another part of another line frequently. Or perhaps 
	I am simply thinking wishfully.
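
To make the student's scenario concrete, here is a minimal sketch of how a
direct-mapped cache splits the two addresses.  The 8-byte line (3 offset bits)
comes from the question; the 2 index bits (a 4-line cache) and the names
split and count_misses are assumptions made only for illustration.

```python
OFFSET_BITS = 3  # 8-byte cache line, as in the question
INDEX_BITS = 2   # assumption: a 4-line cache; the real size is not stated


def split(addr):
    """Split an address into (tag, index, offset) for a direct-mapped cache."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


def count_misses(trace):
    """Count misses for a sequence of addresses in an initially empty cache."""
    tags = {}  # index -> tag currently held in that cache line
    misses = 0
    for addr in trace:
        tag, index, _ = split(addr)
        if tags.get(index) != tag:
            misses += 1
            tags[index] = tag  # the whole line is replaced on a miss
    return misses


dense = 0b01001000   # 01001XYZ with XYZ = 000
sparse = 0b10001000  # 10001XYZ with XYZ = 000

print(split(dense))   # tag 0b010, index 0b01
print(split(sparse))  # tag 0b100, index 0b01: same line, different tag
print(count_misses([dense, sparse] * 4))
```

Under these assumptions both addresses select the same cache line, so
alternating between them evicts the other line every time, no matter how
many of its bytes are actually in use.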

Sounds like this undergraduate is a computer architecture PhD student in the
making.  Excellent insights.  Actually, if you are interested, we have a couple
of papers dealing with exactly that issue, although there is plenty of 
research still to do.

"Line Distillation: Increasing Cache Capacity by Filtering Unused Words in
Cache Lines," Moinuddin Qureshi, Muhammed Aater Suleman, and Yale N. Patt,
Proceedings of the 13th Annual IEEE High Performance Computer Architecture 
Symposium (HPCA-13), Phoenix, February, 2007.

"The V-Way Cache: Demand Based Associativity via Global Replacement,"
Moinuddin Qureshi, David Thompson, and Yale N. Patt, Proceedings, 32nd Int'l 
Symposium on Computer Architecture, Madison, Wisconsin, June, 2005.

Moin is now Dr. Qureshi and is at IBM Research in Yorktown Heights, Aater
is almost finished with his PhD at UT, and Dave is working at TI in Dallas.

	Also, you mentioned that the cache addresses were all physical 
	addresses.  Are we assuming that they have been already translated 
	from virtual addresses? 


	Or, it seems to me that caches are part of the microarchitecture
	and not the ISA since they are simply one implementation on a chip. In that
	case, would one not need a virtual address for the cache since the 
	executing code would simply see its own picture of the memory, 
	regardless of whether or not the chip has a cache?

Indeed, the executing code only sees its own picture of the memory whether
or not the chip has a cache.  However, since the code needs to read what is
in this "picture," this picture had better correspond to physical locations.  It is
true that we could forgo the translation and only cache the virtual addresses
of this information.  The problem is that two processes may have the same
virtual address corresponding to two different physical locations.  For example,
both have a virtual page 0.  What would happen if I bring a location into the
cache based on process A's virtual address and then get a hit when process B 
later accesses this virtual address?  Note they are the same virtual address,
but correspond to two very different physical locations.  What to do?
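
The problem can be seen in a toy model.  This is only a sketch: the page
tables, frame numbers, and the VirtualCache class are hypothetical, and the
"cache" holds one entry per virtual page rather than real cache lines.

```python
# Hypothetical per-process page tables: virtual page -> physical frame.
PAGE_TABLES = {"A": {0: 5}, "B": {0: 9}}
# Hypothetical contents of each physical frame.
MEMORY = {5: "data of A", 9: "data of B"}


class VirtualCache:
    """A cache tagged only by virtual page, with no translation on a hit."""

    def __init__(self):
        self.lines = {}  # virtual page -> cached data

    def read(self, pid, vpage):
        """Return (data, hit) for process pid reading virtual page vpage."""
        if vpage in self.lines:
            return self.lines[vpage], True  # hit: pid is never consulted
        data = MEMORY[PAGE_TABLES[pid][vpage]]  # miss: translate, then fetch
        self.lines[vpage] = data
        return data, False


cache = VirtualCache()
print(cache.read("A", 0))  # miss, brings in A's page 0
print(cache.read("B", 0))  # hit, but the data returned belongs to A
```

Process B gets a hit on virtual page 0 and reads process A's data, even
though B's page 0 lives in a different physical frame.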

Actually, there are two usual ways to deal with this: Either store a process ID
in every tag store entry, or flush the cache on every context switch.  Both
ways have serious negatives.  What are they? 
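
Both approaches can be sketched in a toy model.  The page tables, frame
numbers, and class names below are hypothetical, and the sketch only shows
that each approach returns the right data; it says nothing about cost.

```python
# Hypothetical per-process page tables: virtual page -> physical frame.
PAGE_TABLES = {"A": {0: 5}, "B": {0: 9}}
# Hypothetical contents of each physical frame.
MEMORY = {5: "data of A", 9: "data of B"}


class PidTaggedCache:
    """Virtually addressed cache whose tag also includes the process ID."""

    def __init__(self):
        self.lines = {}  # (pid, virtual page) -> cached data

    def read(self, pid, vpage):
        key = (pid, vpage)
        hit = key in self.lines
        if not hit:
            self.lines[key] = MEMORY[PAGE_TABLES[pid][vpage]]
        return self.lines[key], hit


class FlushingCache:
    """Virtually addressed cache that is emptied on every context switch."""

    def __init__(self):
        self.lines = {}      # virtual page -> cached data
        self.current = None  # process that ran most recently

    def read(self, pid, vpage):
        if pid != self.current:  # context switch: discard everything
            self.lines.clear()
            self.current = pid
        hit = vpage in self.lines
        if not hit:
            self.lines[vpage] = MEMORY[PAGE_TABLES[pid][vpage]]
        return self.lines[vpage], hit


for cache in (PidTaggedCache(), FlushingCache()):
    print(cache.read("A", 0), cache.read("B", 0), cache.read("A", 0))
```

In both sketches process B misses on virtual page 0 and fetches its own
data.  What differs is what happens when process A runs again.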

Thanks for asking.  I was going to talk about this on Wednesday, so now I can
dismiss the class on Wednesday at 6:23 instead of 6:25.

Hope this helps.

Yale Patt  

	<<name withheld to protect the future PhD student>>