Thurs, 16 Oct 2014, 01:28

A student writes:

> Hi Dr. Patt,
>  Stephen and I had a question regarding the efficiency of the Cray 1 at
> different stride lengths, in particular stride lengths which were multiples
> of 16 that seem to cripple the optimization introduced by its banked
> memory. I was wondering if the system did anything to improve this type of
> access.
> Regards,
> << name withheld to protect the student who sees a problem with strides>>

Excellent question!  And, indeed, you correctly note that if the stride is
a multiple of 16, you may as well not have interleaving at all.  In fact,
what if the stride is 8?  or 4?  or 2?  The purpose of interleaving
(consecutive addresses go to different banks) is to allow an access to a
bank to complete before the next time you wish to access that bank.  To make
that happen, it is important that the values of "stride" and "interleaving"
be relatively prime.  So a stride of 9 for example will work fine with
interleaving of 16.  Do you see why?
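
To see why, here is a minimal sketch in Python.  It is an illustration only,
not anything the Cray 1 did: it assumes the simplest mapping, where word
address a lives in bank (a mod 16), and the function names are made up for
this example.  It counts how many accesses go by before a strided walk
returns to the same bank.

```python
from math import gcd

NUM_BANKS = 16  # degree of interleaving discussed above (assumed mapping: bank = address mod 16)

def banks_touched(stride, num_banks=NUM_BANKS, accesses=64):
    """Return the distinct banks hit, in order of first use, by a strided walk."""
    seen = []
    for i in range(accesses):
        bank = (i * stride) % num_banks
        if bank not in seen:
            seen.append(bank)
    return seen

def reuse_distance(stride, num_banks=NUM_BANKS):
    """Number of accesses issued before the walk comes back to the same bank."""
    return num_banks // gcd(stride, num_banks)

if __name__ == "__main__":
    for stride in (1, 2, 4, 8, 9, 16):
        print(stride, reuse_distance(stride), banks_touched(stride))
```

With a stride of 9 the walk visits all 16 banks before it returns to any one
of them.  With a stride of 8 it alternates between two banks, and with a
stride of 16 every access lands on the same bank.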

As far as I know, the Cray 1 did nothing to help, but relied on the
programmer to choose dimensions that were relatively prime to the degree
of interleaving.  Too much to expect of your average programmer?  Perhaps.
But, as they have told me many times, Cray 1 programmers were not average.

Thanks for the question.
Yale Patt