Wed, 29 Apr 2009, 01:54

Guess it must be the night before the midterm!  A student writes:

	Dr. Patt, 

	This is a question relating to branch prediction. I understand the
	prediction happens during the decode stage of the branch instruction. If
	that is the case, there will always be a stall ( or a bubble )  in the
	pipeline since the correct instruction cannot be fetched until 2 cycles
	after the fetch of the branch. Is this correct ? 

I actually do not know what to do with this question.  The simple answer
is 360N stuff.  The more detailed answer is 382N stuff, and you have a
360N exam tomorrow.  In fact, I think I outlined the 382N answer in the
review session last night.

So, the 360N answer is yes, it produces a bubble in the pipe, since you can
not get the branch target until you have decoded the instruction which you
need to access the I cache.  In fact the DEC Alpha 21064 did exactly that.
On a taken branch, a one stage bubble between successive fetches.

But (the 382N answer), we learned a long time ago to include in the fetch
stage a structure called the BTB (branch target buffer) which stores all
this information including br prediction information at an address corresponding
to the PC of the branch instruction.

So, while the PC is accessing the I cache, it is also accessing the BTB.  If
this is not the first time the branch instruction has been encountered and
therefore, information about the branch is stored in the BTB, that information 
is accessed, and the predicted target address loaded into the PC at the same 
time the instruction from the I cache is loaded into the instruction buffer
for the decode stage.  In that case, no pipeline bubble.  The Intel Pentium
chip, released around the same time as the 21064 had this structure. 

	Is it that the branch is fetched and predicted in the FETCH 
	stage itself ? Is so , isn't that timing path impossible ? Also , 
	in that case, what is the need for a Branch instruction to go 
	through a decode stage - it could simply zoom from Fetch to Execute.

You still have to check to be sure the information in the BTB is correct.
The address of the BTB is a hash function of the PC, not the PC.

	Also, is a "bubble" the same as a stall in the pipeline? Or does a 
	"bubble" refer specifically to a NOP passing through the pipeline? 
	Will a NOP have a Execute and a WB stage? 

Yes.  An instruction stalling means the instruction is stuck in some stage
of the pipeline.  Thus as the cycles pass, NOPs are effectively present in
the subsequent stages until the stalled instruction gets unstalled.  Yes,
the NOPs wil go through Execute and WB but they will not write results.  A 
single bit inhibits all destination writes.  In fact, you are going to have
fun dealing with that in the final programming lab, lab 6.

Hope this helps.  Good luck.
Yale Patt

	Thanks for your patience in answering so many questions!

	<<name withheld to protect...>>