Assembly Optimizations
Hand-coded optimizations
- Issue instructions in parallel
    || SUB .L2 B1,B2,B1 ; "||" makes this instruction execute in parallel with the previous one
- Fill NOP delay slots with useful instructions
- Manual loop unrolling
- Pack two 16-bit numbers into one 32-bit register: replace two LDH instructions with a single LDW instruction
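
The unrolling and packing items can be illustrated in C rather than assembly. The following is a minimal sketch, not a tuned DSP kernel: the function names dot_simple and dot_packed are hypothetical, the dot-product workload is an assumed example, and the 32-bit memcpy only stands in for what an LDW would do on the target. It assumes n is even and that narrowing to int16_t wraps in the usual two's-complement way.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Baseline: one 16-bit load per operand per iteration (analogous to LDH). */
int32_t dot_simple(const int16_t *a, const int16_t *b, int n)
{
    int32_t sum = 0;
    for (int i = 0; i < n; i++)
        sum += (int32_t)a[i] * b[i];
    return sum;
}

/* Unrolled by two: each 32-bit load fetches two packed 16-bit samples
 * (analogous to replacing two LDH with one LDW). Assumption: n is even. */
int32_t dot_packed(const int16_t *a, const int16_t *b, int n)
{
    int32_t sum0 = 0, sum1 = 0;   /* two independent accumulators */

    assert(n % 2 == 0);
    for (int i = 0; i < n; i += 2) {
        uint32_t wa, wb;
        memcpy(&wa, &a[i], sizeof wa);   /* one 32-bit load = two samples of a */
        memcpy(&wb, &b[i], sizeof wb);   /* one 32-bit load = two samples of b */

        /* Split each word into its low and high 16-bit halves. */
        int16_t a_lo = (int16_t)(wa & 0xFFFFu);
        int16_t a_hi = (int16_t)(wa >> 16);
        int16_t b_lo = (int16_t)(wb & 0xFFFFu);
        int16_t b_hi = (int16_t)(wb >> 16);

        sum0 += (int32_t)a_lo * b_lo;
        sum1 += (int32_t)a_hi * b_hi;
    }
    return sum0 + sum1;
}
```

Because low halves are multiplied with low halves and high with high, the result does not depend on byte order. The two accumulators are independent, which is what gives a scheduler room to run the two multiplies in parallel.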
Assembler optimizations
- Assign functional units when not specified
- Pack and parallelize linear assembly code
- Software pipelining
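
Software pipelining overlaps work from different loop iterations, giving the loop a prologue, a steady-state kernel, and an epilogue. The tools do this at the instruction level; the sketch below only shows the shape of such a schedule in C. The function name scale_pipelined and the scaling workload are hypothetical, chosen purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Computes out[i] = in[i] * k with the load for iteration i+1
 * overlapped with the multiply/store for iteration i. */
void scale_pipelined(const int16_t *in, int32_t *out, size_t n, int16_t k)
{
    if (n == 0)
        return;

    /* Prologue: start the first iteration's load. */
    int16_t loaded = in[0];

    /* Kernel: every pass does work for two different iterations. */
    for (size_t i = 0; i + 1 < n; i++) {
        int16_t next = in[i + 1];        /* load stage of iteration i+1 */
        out[i] = (int32_t)loaded * k;    /* multiply/store stage of iteration i */
        loaded = next;
    }

    /* Epilogue: finish the last iteration already in flight. */
    out[n - 1] = (int32_t)loaded * k;
}
```

In the kernel, the load and the multiply belong to different iterations and have no dependence on each other, so on a machine with several functional units they can be issued in the same cycle.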