Wed, 29 Apr 2009, 01:35

A student writes:

	Dr. Patt, 

	I have two questions: 

	1)      PREDICATED BRANCHING:  We did an example of a single 
	instruction and then a merged execution flow.  In case there are 
	more than a few instructions within the branches, isn't it the case 
	that predicated branching could prove more expensive than taking a 
	penalty, since the instructions on one side of the branch will be 
	executed but never committed?

Yup!  ...and I thought we said that in class.  The compiler has to decide
whether the merge point (where the true and false paths come together) is
sufficiently far away that it is better to treat the branch the
old-fashioned way, and take a misprediction penalty if necessary.
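
A rough sketch of that compiler decision, under assumptions that are not
from the class example: a machine that executes one instruction per cycle,
a known probability that the branch is taken, and a fixed misprediction
penalty.  The function names and all of the numbers are hypothetical; a
real compiler would use a much richer cost model.

```python
def branch_cost(taken_len, not_taken_len, p_taken, mispredict_rate, penalty):
    """Expected cycles if the branch is predicted the old-fashioned way.

    Only one path executes; a misprediction adds `penalty` cycles.
    """
    expected_path = p_taken * taken_len + (1 - p_taken) * not_taken_len
    return expected_path + mispredict_rate * penalty


def predicated_cost(taken_len, not_taken_len):
    """Cycles if both paths execute under predicates up to the merge point.

    Assumes one instruction per cycle, so the cost is simply the sum of
    both path lengths; one side's work is thrown away.
    """
    return taken_len + not_taken_len


def should_predicate(taken_len, not_taken_len, p_taken, mispredict_rate, penalty):
    """True when predication is cheaper than the expected branch cost."""
    return predicated_cost(taken_len, not_taken_len) < branch_cost(
        taken_len, not_taken_len, p_taken, mispredict_rate, penalty
    )
```

With one instruction on each side, a 10% misprediction rate and a 20-cycle
penalty, should_predicate(1, 1, 0.5, 0.1, 20) is True (2 cycles against an
expected 3).  With 20 instructions on each side it is False (40 cycles
against an expected 22), which is the case the student is asking about.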

	Also, is it true that predication will have much greater benefit 
	with an out-of-order execution machine?

I think the simplest answer is: it depends. 

	2)       Out of order Execution: The examples show we can have 
	multiple write backs in the same cycle. Is this possible even in 
	the example case of a non-OOO machine that has one adder 
	(5 cycles), a separate mul (6 cycles), and the add follows the 
	mul instruction?  In this case both the add and mul would finish 
	at the same time, and hence be ready to write. 

	Would this capability make it an out of order execution machine?   

No, it would not make it an o-o-o execution machine, although if I change
the numbers a little, it could make it an o-o-o "completion" machine.
In that case, it is the hardware's job to make sure the instructions are
retired in order: they started execution in order, but they may complete
out of order.

We normally think of an o-o-o execution machine as one in which the tags
allow an instruction that is waiting for something to get out of the
pipeline and let younger instructions move ahead.  We can do
this because of the tags, register alias table, and mechanism for updating
the results.  The scenario you describe still presents a problem, but the
problem (as described above) can be handled with much less hardware.  In your
case, the older instruction stays in the pipeline, keeping the younger 
instruction from starting execution ahead of it.  Instructions start 
execution in program order.  But, as you correctly point out, if they 
complete o-o-o, then the hardware has to be sure that they retire in order.
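
A small sketch of the in-order-start case, to make the timing concrete.
The assumptions are mine: one instruction starts execution per cycle, each
functional unit has a fixed latency, the instructions are independent, and
more than one instruction may retire in the same cycle.  The function
simulate and its arguments are hypothetical names for illustration.

```python
def simulate(program, latency):
    """Start instructions in program order and retire them in program order.

    program: list of (name, unit) pairs in program order.
    latency: dict mapping each unit to its execution latency in cycles.
    Returns a list of (name, done_cycle, retire_cycle) tuples.
    """
    results = []
    last_retire = 0
    for start_cycle, (name, unit) in enumerate(program):
        # Instruction i starts at cycle i: execution begins in order.
        done = start_cycle + latency[unit]
        # An instruction may not retire before any older instruction.
        retire = max(done, last_retire)
        results.append((name, done, retire))
        last_retire = retire
    return results


program = [("mul", "mul"), ("add", "add")]

# The student's numbers: both results are ready in cycle 6.
same_cycle = simulate(program, {"mul": 6, "add": 5})

# Numbers changed a little: the add is done in cycle 4, before the mul.
ooo_completion = simulate(program, {"mul": 6, "add": 3})
```

With the student's latencies, the mul and the add are both done in cycle 6.
With a 3-cycle adder, the add is done in cycle 4 but is held until cycle 6
so that it retires no earlier than the mul: out-of-order completion with
in-order retirement, and no tags or register alias table required.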

	Thanks and Regards,

	<<name withheld to protect the student fine tuning his understanding>>

Hope this helps.
Yale Patt