Wed, 3 Dec 2014, 22:19

A student writes:

> Dear Dr. Patt,
> I am a graduate student and I am taking your class
> EE460N/EE382N.1 this semester. The class today was an "open-to-all" session
> but I missed the chance to ask you technical questions.

I am sorry to hear that.  Did you raise your hand?

> Can I ask you
> those questions by email?

Of course, and I will share my answers with the class.

> I have the following three questions, which
> have puzzled me for quite a while. Sorry if I ask some silly questions,
> since I don't have an architecture background from my undergraduate
> studies. Correct me if I am wrong.
> 1. On chip memory:
> In the modern architecture, the memory is off-chip. But I heard that there
> is a trend of “on-chip memory processor”. I think the idea is pretty cool,
> since the data transfer rate between CPU and memory is really slow
> compared to the amount of memory. What do you think of this trend? Do you
> believe it is possible? As far as I know, I have not found any practical
> architecture with an “on-chip memory processor”. Why? What is the trade-off
> for memory on-chip versus off-chip?

Actually this idea has been around for a long time.  Years ago it was more
problematic to implement since DRAM cells store the bit as charge on a
capacitor (large capacitance is good) while the processor wants to minimize
the RC time constant.  Although it was hard to implement, there were at least
three major projects on this idea: PIM (processor in memory), MOP (memory on
the processor chip), and IRAM (intelligent RAM, as opposed to Dumb RAM!).
The idea of course was that for chasing pointers in a linked list, for
example, one would save an enormous amount of latency if the processing and
memory accesses could both be done on chip.  I personally think you are going
to see this revisited a lot in the near future.
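
To put rough numbers on the pointer-chasing argument, here is a minimal
sketch. Both latency values are assumptions chosen for illustration, not
measurements of any real part, and chase_time_ns is a hypothetical helper.
The only point is that each hop depends on the previous load, so the
per-access latency is paid once per node.

```python
# Illustrative latencies only; these are assumptions, not measurements.
OFF_CHIP_NS = 100.0  # assumed latency of one off-chip DRAM access
ON_CHIP_NS = 20.0    # assumed latency if the memory sits on the processor chip


def chase_time_ns(num_nodes, latency_ns):
    """Time to walk a linked list when every hop is a dependent memory access.

    The address of node i+1 is known only after node i has been loaded,
    so the accesses cannot overlap and the latencies simply add up.
    """
    return num_nodes * latency_ns


nodes = 1_000_000
off_chip = chase_time_ns(nodes, OFF_CHIP_NS)
on_chip = chase_time_ns(nodes, ON_CHIP_NS)
print(f"off-chip: {off_chip / 1e6:.1f} ms, on-chip: {on_chip / 1e6:.1f} ms")
```

Under these assumed numbers the saving scales directly with the ratio of the
two latencies, since nothing in a dependent chain can be overlapped.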

> 2. Cache coherency:
> Is cache coherency the main issue for the scalability of multi-core
> processors? What are other reasons that we cannot have too many cores on a
> single chip?

No, I think most people would agree that memory bandwidth and energy
consumption are the biggest problems.  ...which I thought I talked about on
the first day of class.  One more reason to skip most of that early
introduction and save it for the last week of class.

> Is there any architecture which doesn’t necessarily guarantee cache
> coherency? I searched on the Internet, and it looks like GPU/Intel Xeon
> Phi/Intel SCC are such architectures, and the cache coherency is left to
> software side instead of hardware.

There are many ways to skin a cat.  One of my students, Sanjay Patel, a
Professor at UIUC, designed Rigel a few years ago.  Rigel had coherency among
a small number of first level caches, but not across all first level caches
on the chip.  Software would then need to carry the day.

> 3. Virtual Memory:
> Currently the PTE is mapping one virtual page to one frame. Is it possible
> that we can also have one PTE mapping several contiguous virtual pages to
> several contiguous frames? Note that some scientific computing applications
> (such as dgemm, which I am working on) always target large data, and the
> data occupy contiguous pages and should always reside in memory to get
> better performance. For such applications, if we can map several contiguous
> virtual pages to several contiguous frames with only one PTE, we may reduce
> the page table size, and maybe we can reduce TLB size, so we can reduce
> the virtual-to-physical address lookup time.

Very good insight!  Many ISAs have multiple page sizes.  I think the first one
I remember was the DEC Alpha (in 1992).  This does what you suggest, where
what I call here one large page is what you call many contiguous small pages.
The PTE has two bits to identify the size of the page.  The sizes generally
form a geometric progression.  And the result is the positive property you
mention, one PTE instead of many for the same mapping.
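
The effect of a 2-bit page-size field can be worked through with a short
sketch. The 8 KB base page and the factor of 8 between successive sizes are
assumptions picked for illustration, the function names are hypothetical, and
the region is assumed to be aligned to the page size in use.

```python
BASE_PAGE = 8 * 1024  # assumed base page size in bytes
RATIO = 8             # assumed ratio between successive page sizes


def page_size(size_code):
    """Page size in bytes selected by a 2-bit size code (0..3)."""
    if not 0 <= size_code <= 3:
        raise ValueError("size code must fit in two bits")
    return BASE_PAGE * RATIO ** size_code


def ptes_needed(region_bytes, size_code):
    """Number of PTEs needed to map a contiguous, suitably aligned region."""
    size = page_size(size_code)
    return -(-region_bytes // size)  # ceiling division


def tlb_reach(entries, size_code):
    """Bytes of memory covered by a TLB whose entries all use one page size."""
    return entries * page_size(size_code)


region = 64 * 1024 * 1024  # a 64 MB array, e.g. a large matrix
for code in range(4):
    print(code, page_size(code), ptes_needed(region, code))
```

With these assumed sizes, the same 64 MB region needs 8192 PTEs at the
smallest page size and 16 at the largest, and each TLB entry covers 512 times
as much memory.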

> Thank you for the semester!
> <<name withheld to protect the student who did not get to ask his questions>>

Thank you for your questions.  Good luck on the final exam and the 6th lab.

Yale Patt