According to the 21066 HRM, the processor loads its initial I-stream from the
 SROM.  All Icache bits are loaded from the SROM, including the cache block
 metadata.  The blocks are loaded in sequential order starting with block 0
 and ending with block 255.  For the 21066, the Icache is loaded LSB first,
 filling from left to right (i.e. bit 0 of LW0 will be the first bit loaded).
 This is the resulting order of each cache block:
 BHT LW7 LW5 LW3 LW1 V ASM ASN TAG LW6 LW4 LW2 LW0
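
A rough Python sketch of pulling one block apart according to that layout
follows. Several things in it are assumptions, not taken from the text above:
the field widths (BHT 8, V 1, ASM 1, ASN 6, TAG 21, each LW 32, giving 293
bits per block), the reading that fields arrive right to left through the
list (LW0 first, BHT last) with each field LSB first, and the idea that the
image file packs the serial stream into bytes LSB first. All names are
hypothetical, and the widths should be checked against the HRM before
trusting any output. Whether any of this carries over to a 21064 is not
established here.

```python
# Field order as printed in the block layout, left to right.
FIELDS_LEFT_TO_RIGHT = ["BHT", "LW7", "LW5", "LW3", "LW1", "V", "ASM",
                        "ASN", "TAG", "LW6", "LW4", "LW2", "LW0"]

# ASSUMED field widths in bits; verify against the hardware reference manual.
WIDTHS = {"BHT": 8, "V": 1, "ASM": 1, "ASN": 6, "TAG": 21}
WIDTHS.update({f"LW{i}": 32 for i in range(8)})

BLOCK_BITS = sum(WIDTHS[name] for name in FIELDS_LEFT_TO_RIGHT)


def bits_lsb_first(data):
    """Yield the bits of a byte string, bit 0 of each byte first (ASSUMED packing)."""
    for byte in data:
        for i in range(8):
            yield (byte >> i) & 1


def decode_block(bits):
    """Split one block's worth of bits into named fields.

    ASSUMPTION: the first bit in the stream is bit 0 of LW0, so fields are
    consumed from the right-hand end of the printed list, each LSB first.
    """
    pos = 0
    fields = {}
    for name in reversed(FIELDS_LEFT_TO_RIGHT):
        value = 0
        for i in range(WIDTHS[name]):
            value |= bits[pos + i] << i
        pos += WIDTHS[name]
        fields[name] = value
    return fields


def decode_image(data, blocks=256):
    """Decode up to `blocks` cache blocks from a raw SROM image."""
    bits = list(bits_lsb_first(data))
    result = []
    for b in range(blocks):
        chunk = bits[b * BLOCK_BITS:(b + 1) * BLOCK_BITS]
        if len(chunk) < BLOCK_BITS:
            break  # image shorter than expected; stop at the last full block
        result.append(decode_block(chunk))
    return result


def block_longwords(block):
    """Return the eight instruction longwords of a decoded block, LW0..LW7."""
    return [block[f"LW{i}"] for i in range(8)]
```

Concatenating block_longwords() over all decoded blocks would give the
I-stream in memory order, which could then be fed to a disassembler. If the
result looks like nonsense, the field order or the byte packing assumption is
the first thing to flip.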
 
No wonder I am not making sense of the image!
 I thought I had some code to unmultiplex each bit stream from an SROM image
 and then reconstruct the resulting memory image, but I can't find it, or
 perhaps I only thought about writing it.
If it turns up and it works for a 21064, I'd be very interested in a copy.
It looks like I may have to program a new SROM image in order to disable the
cache at startup to see if that works around the issues my machines are
having (assuming I can figure out how to disable the cache).
Regards,
Peter Coghlan.