???
12/15/11 07:35
Modified:
  12/15/11 07:47



 
#185093 - jump cache miss penalty
Responding to: ???'s previous message
Erik Malund said:
so, if you have a jump to some different place it takes 4 cycles to get to the first byte.

if that is the first byte of the 4 loaded all is well.

if it is the 2nd or 3rd - figure it out from the below

if it is the last byte of the 4 there will be a 3 clock delay to fetch the next word to the cache.

Thanks, Erik.

This is roughly how I understand it too, except that I would expect exactly this behaviour only if all of the instructions at the jump target were single-word, single-cycle (NOP-like). As most instructions (even the single-word ones) take longer than that, I would expect a slightly smaller penalty on average.

So, in effect, a jump to a non-cached position may cause a delay of 4-7 cycles. (The 1-3 extra cycles would "happen" at a later instruction, and only under the circumstances outlined above, but I think this is adequate as a worst-case figure for the table.) A jump to a cached position may cause a delay of 1-3 cycles if the next word is not cached yet. Correct?
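
To make the numbers above concrete, here is a minimal sketch of the model as I read it. It is not taken from the datasheet. The function name, the 4-byte word, the 4-cycle fetch, and the assumption that the target code is NOP-like (one byte consumed per cycle) are all mine, for illustration only.

```python
WORD_BYTES = 4    # assumed size of one cached FLASH word
FETCH_CYCLES = 4  # assumed cycles per FLASH word read (hypothetical default)


def worst_case_jump_penalty(target_addr, target_word_cached, next_word_cached=False):
    """Worst-case delay in cycles for a jump, under the assumptions above.

    The initial cost is one word fetch if the target word is not cached.
    The follow-up stall models NOP-like code running out of bytes in the
    target word before the fetch of the next word has finished.
    """
    offset = target_addr % WORD_BYTES          # position of target inside its word
    initial = 0 if target_word_cached else FETCH_CYCLES
    if next_word_cached:
        stall = 0
    else:
        bytes_left = WORD_BYTES - offset       # bytes usable before the next word is needed
        stall = max(0, FETCH_CYCLES - bytes_left)
    return initial + stall


# Penalty for each byte position of the target, non-cached then cached
uncached = [worst_case_jump_penalty(0x100 + i, False) for i in range(WORD_BYTES)]
cached = [worst_case_jump_penalty(0x100 + i, True) for i in range(WORD_BYTES)]
print(uncached, cached)
```

With these assumptions the non-cached case spans 4 to 7 cycles and the cached case 0 to 3 cycles, the 0 being the "all is well" case where the target is the first byte of its word.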


It would also be interesting to investigate the corner case of a multi-byte instruction that is the target of the jump and straddles the boundary between two non-cached four-byte words. Would the core stall until all the bytes of the instruction are fetched (i.e. 4 + 4 cycles), or would it partially execute the instruction while the remaining bytes are fetched from the next word? If it stalls, the "jump penalty" would be 1-8 cycles rather than 1-7.
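
The corner case can be written down as a small check. This is only a sketch of the "core stalls until the whole instruction is fetched" hypothesis, which is unconfirmed. The function names, the 4-byte word and the 4-cycle fetch are assumptions of mine.

```python
WORD_BYTES = 4    # assumed size of one cached FLASH word
FETCH_CYCLES = 4  # assumed cycles per FLASH word read (hypothetical default)


def straddles_word_boundary(addr, length):
    """True if an instruction of `length` bytes at `addr` crosses into the next word."""
    return (addr % WORD_BYTES) + length > WORD_BYTES


def stall_until_complete_penalty(addr, length):
    """Delay if the core waits for every byte of the target instruction.

    Assumes neither word is cached and the two fetches happen back to back.
    '51 instructions are at most 3 bytes, so at most two words are involved.
    """
    words_needed = 2 if straddles_word_boundary(addr, length) else 1
    return words_needed * FETCH_CYCLES


# A 3-byte instruction (e.g. LJMP) at the last byte of a word needs both words
print(stall_until_complete_penalty(0x103, 3))
# The same instruction at the start of a word fits in one fetch
print(stall_until_complete_penalty(0x100, 3))
```

If the core instead overlaps execution with the second fetch, the first figure would be lower. Only a measurement on real silicon would tell which applies.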

I am also curious whether this behaviour can be suppressed when the 100MHz-er runs at a lower speed, and how this mechanism works with the 50MHz-ers. [*]


Thanks again,

Jan

PS. [*] I found it. I had read only the "branch cache" chapter, and it is hidden elsewhere: the exact number of cycles for a FLASH word read is given by the FLRT bits in the FLSCL register.

