2-D FIR Implementation #3 on C6x
; registers: A5=&a(0,0) B5=&x(n1,n2) A2=M B7=M2 B8=N2
fir2d3   ZERO  .D1  A4          ; initialize accumulator #1
      || SUB   .D2  B8,B7,B9    ; index offset between rows
      || ZERO  .L2  B2          ; offset into image data
      || MVK   .S1  0xFF,A8     ; mask to get lowest 8 bits
      || SHR   .S2  B7,1,B7     ; divide by 2: 16-bit address
         ZERO  .D2  B4          ; initialize accumulator #2
      || ZERO  .L1  A6          ; current coefficient value
      || ZERO  .L2  B6          ; current image value
      || SHR   .S1  A2,1,A2     ; divide by 2: 16-bit address
      || SHR   .S2  B9,1,B9     ; divide by 2: 16-bit address