??? 12/15/11 07:35 Modified: 12/15/11 07:47 Read: times |
#185093 - jump cache miss penalty Responding to: ???'s previous message |
Erik Malund said:
so, if you have a jump to some different place it takes 4 cycles to get to the first byte. if that is the first byte of the 4 loaded all is well. if it is the 2nd or 3rd - figure it out by the below. if it is the last byte of the 4 there will be a 3 clock delay to fetch the next word to the cache.

Thanks, Erik.

This is roughly how I would understand it, too; except that I would expect exactly this behaviour only if all of the instructions at the jump target were single-word single-cycle (NOP-like). As most instructions (even the single-word ones) take more than that, I would expect a slightly smaller penalty on average.

So, in effect, a jump to a non-cached position may result in a delay of 4-7 cycles (the 1-3 extra cycles would "happen" at a different instruction and under the circumstances outlined above, but I think it is an adequate description for the purpose of a worst-case figure for the table). A jump to a cached position may result in a delay of 1-3 cycles, if the next word is not cached yet. Correct?

It would also be interesting to investigate the corner case of a multi-byte instruction which is the target of the jump and lies across the boundary between two non-cached four-byte words. Would the core stop until all the bytes of the instruction are fetched (i.e. 4 + 4 cycles), or would it execute the instruction partially during the fetch of the remaining bytes, which lie in the next word? If it does not overlap the two, the "jump penalty" would be 1-8 cycles rather than 1-7.

I am also curious whether this behaviour can be suppressed if the 100MHz-er runs at a lower speed, and also how this mechanism works with the 50MHz-ers. [*]

Thanks again,
Jan

PS. [*] I found it. I had read only the "branch cache" chapter and it is hidden elsewhere: the exact number of cycles for a FLASH word read is given by the FLRT bits in the FLSCL register.
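To make the arithmetic above concrete, here is a rough sketch of the penalty model as I read it from this thread, not from the datasheet. The function name and all parameters are made up for illustration. It assumes 4-byte FLASH words, a fetch of the next word that starts as soon as the target word is available, and a uniform number of execution cycles per code byte (1 for the NOP-like worst case). The word fetch time is a parameter because, as noted in the PS, it depends on the FLRT setting. It does not cover the corner case of an instruction straddling two non-cached words.

```python
WORD_BYTES = 4  # size of one FLASH/cache word in bytes, as discussed in the thread


def jump_penalty(target_addr, target_cached, next_word_cached,
                 fetch_cycles=4, cycles_per_byte=1):
    """Estimate the extra cycles caused by a jump (hypothetical model).

    fetch_cycles    -- assumed cycles to read one FLASH word
    cycles_per_byte -- assumed execution cycles per code byte at the target
                       (1 models NOP-like single-byte, single-cycle code)
    """
    offset = target_addr % WORD_BYTES  # position of the target inside its word

    # Cost of getting the target word itself, if it is not in the cache.
    miss = 0 if target_cached else fetch_cycles

    if next_word_cached:
        stall = 0
    else:
        # Cycles the core stays busy with the bytes left in the target word
        # while the next word is being fetched in the background.
        busy = (WORD_BYTES - offset) * cycles_per_byte
        stall = max(0, fetch_cycles - busy)

    return miss + stall


if __name__ == "__main__":
    # Print the modelled penalty for each byte offset within a word.
    for off in range(WORD_BYTES):
        print(off,
              jump_penalty(0x100 + off, False, False),
              jump_penalty(0x100 + off, True, False))
```

With the defaults this gives 4 to 7 cycles for a non-cached target and 0 to 3 cycles for a cached target whose next word is not cached, which matches the ranges above. Raising cycles_per_byte shrinks the stall, which is the "slightly smaller penalty on average" effect.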