Department of Electrical and Computer Engineering

The University of Texas at Austin

EE 460N, Fall 2014
Problem Set 4 Solution
Yale N. Patt, Instructor
Stephen Pruett, Emily Bragg, Siavash Zangeneh, TAs
      • Byte on bus Addr[1:0]
      • Interleave bits Addr[4:2]
      • Chip address Addr[7:5]
      • Rank bits Addr[11:8]
    1. 577 Cycles. The first 8 memory accesses, A[0][0] to A[0][7], must occur sequentially with no overlap since they are all acesses to the same bank. Thus, it would take 80 cycles for the 1st 8 memory accesses, with the 8th access starting in cycle 70. Since the 8th and 9th memory accesses, A[0][7] and A[1][0], respectively, are to different banks, the accesses can overlap, and the 9th access can start in cycle 71 (70 cycles for the 1st 7 accesses plus 1 additional cycle of the 8th access). Continuing with this logic, the access to A[2][0] could start in cycle 142 (71x2). Finally, the access to A[7][0] could start in cycle 497 (71x7). Now all that remains are 8 more memory accesses, all to the same bank (A[7][0] to A[7][7]). This takes another 80 cycles, bringing the total to 577 cycles (497 + 80).

      If the memory were not interleaved, all 64 memory accesses must happen sequentially with no overlap, so it would take a total of 640 cycles (64*10). Therefore, we do gain some benefit from this interleaving scheme, but not that much.

    2. Yes, a change can be made. The new bits are:

      • Byte on bus Addr[1:0]
      • Interleave bits Addr[4:2]
      • Chip address Addr[11:9]
      • Rank bits Addr[8:5]

      73 Cycles. With the new interleaving scheme, consecutive memory accesses are to either to different banks of the same rank, or to different ranks all together. In both cases, the consecutive accesses can start immediately after each other. Therefore the latency of all memory accesses would be hidden except the first access. Total number of cycles = 10 + 63 = 73

    3. Only one line of code needs to be changed:

      sum = sum + A[i][j];
      to
      sum = sum + A[j][i];

      Alternatively, you could keep that line the same, but swap the variable (i/j) of the inner and outer loops as shown below.

      Original code:

           for(i = 0; i < 8; ++i){
             for(j = 0; j < 8; ++j){
               sum = sum + A[i][j];
             }
           }

      New code:

           for(j = 0; j < 8; ++j){
             for(i = 0; i < 8; ++i){
               sum = sum + A[i][j];
             }
           }

      87 Cycles. Now consecutive memory accesses are to different banks, so the accesses can overlap. The 1st access, A[0][0], would begin at cycle 0, the 2nd, A[1][0], at cycle 1, and so on. The 8th access, A[7][0], would start at cycle 7. However, the 9th access, A[0][1], cannot start at cycle 8. It would have to wait 2 more cycles for the 1st access to finish since it is on the same bank as the 1st access; therefore, it would start at cycle 10. Continuing this logic, the access to A[0][2] would start at cycle 20, and finally, the access to A[0][7] would start at cycle 70. Now, all that is left are 8 accesses, but they are all to different banks so they can start 1 cycle after each other. The access to A[1][7] would begin at cycle 71, A[2][7] at 72, and finally A[7][7], the last memory access, would begin at cycle 77 and, therefore, end at cycle 87

  1. The following table summarizes the differences between exceptions and interrupts.

    ExceptionsInterrupts
    Cause Internal to the running process External to the running process
    When are they handled? When detected (mostly) When convenient (mostly)
    Are they maskable? Almost never Almost always
    Priority Same as the process that caused the exception. Depends (on the priority of the interrupting device)
    Context Process Handling is done withing the system context.
    The process to take an interrupt or exception is explained in lab 4 descriptor.
    1. The probability of a single bit flipping is pf = 10-7. Therefore, the probability of a bit remaining correct is pc = (1 - 10-7). The probability that a transmitted nine bit message will have zero flipped bits is pc9 = (1 - 10-7)9. Thus, the probability of at least one of the bits being flipped is 1 - pc9 = 1 - (1 - 10-7)9 ≈ 9 × 10-7.
    2. Parity check logic can detect an odd number of errors: 1, 3, 5, 7, and 9.
    3. Parity check logic cannot detect an even number of errors: 2, 4, 6, 8.
      • There are choose(9, 1) = 9 possible combinations that result in a single flipped bit. Thus, the probability of one bit being flipped is p1 = 9 × pf × pc8 ≈ 9 × 10-7.
      • There are choose(9, 2) = 36 possible combinations that result in two flipped bits. Thus, the probability of two bits being flipped is p2 = 36 × pf2 × pc7 ≈ 3.6 × 10-13.
      • There are choose(9, 3) = 84 possible combinations that result in three flipped bits. Thus, the probability of three bit being flipped is p3 = 84 × pf3 × pc6 ≈ 8.4 × 10-20.
    4. Ignoring the probability of three or more bit errors, the probability of a detected error is just the probability of a single bit error (calculated above). Thus, the rate of detected errors is p1 × 109 ÷ 9 ≈ 100 errors per second.
    5. Similarly, the probability of an undetected error is approximately the probability of a double error. Thus, the rate of undetected errors is p2 × 109 ÷ 9 ≈ 4 × 10-5 corrupt messages per second (or twice as many undetected bit errors, since we assume each undetected corrupt message contains two flipped bits), which is equivalent to one undetected corrupt message about every 7 hours.
  2. postponed
  3. Size of a page is 512 bytes.

    Number of bits of address required to calculate the offset within a page is 9.

    Number of frames in physical memory is (8K bytes) ÷ (512 bytes) = 213 ÷ 29 = 24.

    Size of PFN is 4 bits.

    Size of PTE equals 1 (Valid) + 1 (Modified) + 2 (access control) + 4 (PFN) = 8 bits = 1 byte.

    Number of virtual pages in System Space is (3 × 212) ÷ (29) = 24 pages.

    Size of System Page Table is 24 × 1 byte = 24 bytes = 24 × 8 bits = 192 bits.

  4. We can determine from the given information that:

    The breakdown for a Virtual Address (VA) must be:

    The breakdown for a Physical Address (PA) must be:

    1. Answer: x825C

      Given the VAX 2-level translation scheme, we know that this VA must be in system space. Therefore, the top 2 bits of the VA (VA[15:14]) must be 10 (in binary). We also know that the offset of the VA is the same as the offset of the PA, so the bottom 8 bits of the VA (VA[7:0]) must be x5C (01011100 in binary). Now, all we have to figure out is the 6 bit Virtual Page Number (VPN). It was given that the 1st access to physical memory (let's call this PA1) was at location x108. Once again, given the VAX 2-level translation scheme, we know that PA1 = SBR + (size of PTE in bytes) × VPN. Solving this equation for VPN we get VPN = (PA1 - SBR) ÷ (size of PTE in bytes).

      It was given that PA1 is x108, SBR is x100, and the size of a PTE is 4 bytes. Therefore, the VPN is x2 (000010 in binary). The complete VA is therefore 10 000010 01011100 (in binary) which is x825C.

    2. Answer: x80000009

      The contents of physical address x45C (the 2nd access to physical memory) is a PTE. We know that a PTE is a 32 bit value that consists of 1 valid bit (PTE[31]), and a 4-bit PFN (PTE[3:0]). All other bits of the PTE (PTE[30:4]) are 0. The contents of this PTE are used to form the address of the 3rd access to physical memory (x942). Therefore, the PFN bits of the PTE must be x9, and the valid bit must be 1. This implies that the PTE is x80000009.

    3. Answer x0342

      First, we must determine if this VA is in P0 space or P1 space. To determine this, we have to figure out if P0BR or P1BR was used to compute the virtual address x825C (the answer to part a). It was given that P0BR is x8250, and that P1BR is x8350. Since x8350 is greater than x825C, we know that we could not have used P1BR to compute x825C, and therefore we must have used P0BR which means the VA is in P0 space. Therefore, the top 2 bits of the virtual address (VA[15:14]) must be 00 (in binary). We also know that the offset of the VA is the same as the offset of the PA, so the bottom 8 bits of the VA (VA[7:0]) must be x42 (01000010 in binary). Now, all we have to figure out is the 6 bit Virtual Page Number (VPN). Once again, given the VAX 2-level translation scheme, we know that x825C = P0BR + ((size of PTE in bytes) × VPN ).

      Solving this equation for VPN we get VPN = (x825C - P0BR) ÷ (size of PTE in bytes).

      It was given that P0BR is x8250, and the size of a PTE is 4 bytes. Therefore, the VPN is x3 (000011 in binary). The complete VA is therefore 00 000011 01000010 (in binary) which is x0342.

    1. # virtual pages = virtual address space ÷ size of page = 212 Bytes ÷ 25 Bytes/page = 27 pages
    2. # physical frames = physical address space ÷ size of frame = 29Bytes ÷ 25 Bytes/frame = 24 frames. Therefore, 4 bits are needed to specify the PFN.

      Size of PTE = Valid bit + Modified bit + access control bits + PFN bits = 1 + 1 + 2 + 4 = 8 bits (1 Byte)

    3. User space = (3/4) × Virtual address space = (3/4) × 27 pages = 3 × 25 pages. Each page of user space will have a PTE in the user space page table.

      Size of user page table is # of entries × size of PTE = (3 × 25 entries) × 1 Byte/entry = 96 Bytes.

      # of pages = 96 Bytes ÷ 25 Bytes/page = 3 pages.

    4. We'll use the prefix “b” to indicate a binary number in this solution. Also, for clarity, we will call the virtual address x7AC the virtual address of X (VA_X).

      Virtual address VA_X = x7AC
      The three parts of this virtual address are:
      VA_X[11:10]: b01 (indicates that this is an address in user space)
      VA_X[11:5] (7 bits): Virtual Page Number = b0111101
      VA_X[4:0] Offset within page: b01100
      • X is on page x03D of user space
      • VA of the PTE of the page containing x is VA_PTE_X = PTBR + (x03D × 1) = x380 + x03D = x3BD.
      • Virtual page of System Space of PTE is VA_PTE_X[11:5] = x01D.
      • PA of the PTE of this page of System Space is PA_PTE_PTE = SBR + VA_PTE_X[11:5] × 1 = x1E0 + x01D = x1FD.
      • The PTE of this page of System Space is:
        PTE_PTE_X = Memory[x1FD] = xB8
        PFN_PTE_X = PTE_PTE_X[3:0] = x8
        PA of the PTE of the page containing X:
        PA_PTE_X = PFN_PTE_X concatenated with VA_PTE_X[4:0] = x11D
        PTE_X = Memory[x11D] = x83
        PFN_X = PTE_X[3:0] = x3
        PA_X = PFN_X concantenated with VA_X[4:0] = x06C
  5. We'll use the prefix “b” to indicate a binary number in this solution.

    Virtual address VA_X = x3456789A
    The three parts of this virtual address are:
    VA_X[31:30]: b00 (indicates that this is an address in P0 space)
    VA_X[29:9] (21 bits): Virtual Page Number = b 11 0100 0101 0110 0111 100 (x1A2B3C)
    VA_X[8:0] Offset within page: b010011010
  6. Including instruction fetch, every instruction can generate a page fault. Ignoring instruction fetch, LDB, LDW, STB, STW, TRAP, RTI can generate a page fault (If the trap vector table or system stack is always in physical memory, then the TRAP or RTI won't generate a page fault).

    Including the instruction fetch, RTI can generate the maximum number of page faults (3) and LDB, LDW, STB, STW, TRAP can generate the next most number of page faults (2). (Points will not be deducted for RTI)