???
05/15/09 05:46
#165360 - It's not that difficult ...
Responding to: ???'s previous message
Per Westermark said:
The only problem with a pipeline is the startup time after an unexpected jump.

You don't have to worry about any reset conditions. Normal speculative execution just means that you can throw away results at the last step of the pipeline. From the program's point of view, an instruction hasn't happened until the last step in the pipeline is accepted.

With a short pipeline, you need not even see the startup time. After a reset, it may take a couple of extra cycles until the first results get produced. In the same way, an interrupt will require these extra cycles before it is activated, but the cycles would normally be so much shorter (the pipeline allows a higher clock frequency, since it limits the sequential state changes required for each step in the pipeline) that the total number of nanoseconds for the interrupt response will not be affected. The ALU etc. of the 8051 are so tiny that it is very easy for the pipeline to compute both sides of a conditional branch and throw away the wrong alternative.

In the end, a pipeline need not affect any predictability. You could still count your cycles for the individual instructions. It isn't until you start with instruction reordering or concurrent (accepted, instead of speculative) instructions that you will lose the ability to compute exact timing.
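The quoted point about interrupt response is just cycles times clock period. A minimal sketch of that arithmetic follows; the cycle counts and clock frequencies are invented for illustration and do not describe any real 8051 derivative.

```python
def latency_ns(cycles: int, clock_mhz: float) -> float:
    """Time in nanoseconds taken by `cycles` clock cycles at `clock_mhz`."""
    return cycles * 1000.0 / clock_mhz

# Hypothetical figures, chosen only to illustrate the argument:
# the pipelined core needs 2 extra refill cycles but runs at twice the clock.
plain_ns = latency_ns(cycles=3, clock_mhz=25.0)
piped_ns = latency_ns(cycles=3 + 2, clock_mhz=50.0)

print(f"non-pipelined: {plain_ns:.0f} ns, pipelined: {piped_ns:.0f} ns")
```

With these assumed numbers the pipelined core spends more cycles yet responds in fewer nanoseconds; whether that holds for a given part depends entirely on its actual figures.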

It's still predictable, since computer programs can and will "know" exactly what the state of the pipeline is, in the event that the core is pipelined.
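To make "still predictable" concrete, here is a rough sketch of cycle counting on a simple in-order pipeline, assuming every taken jump costs the same fixed refill penalty. The base cycle counts and the penalty are made-up values, not datasheet figures.

```python
REFILL_PENALTY = 2  # assumed: extra cycles to refill the pipeline after a taken jump

def count_cycles(trace, penalty=REFILL_PENALTY):
    """Sum cycles over a trace of (mnemonic, base_cycles, taken_jump) tuples."""
    total = 0
    for _mnemonic, base_cycles, taken_jump in trace:
        total += base_cycles + (penalty if taken_jump else 0)
    return total

# A 10-pass loop: DJNZ jumps back 9 times and falls through on the last pass.
loop_pass = [("MOV A,@R0", 1, False), ("INC R0", 1, False), ("DJNZ R7,loop", 1, True)]
last_pass = loop_pass[:-1] + [("DJNZ R7,loop", 1, False)]

total_cycles = 9 * count_cycles(loop_pass) + count_cycles(last_pass)
print(total_cycles)
```

The count is a plain sum, so it can be worked out by hand exactly as on a non-pipelined core; only reordering or truly concurrent completion would break that.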

I disagree with the notion that the 805x ALU is small, as compared with the remainder of the logic. I also disagree with the notion that there has to be a lot of propagation delay when reaching into code space, internal data space, SFR space, external data space, etc. The ALU, from where I sit, is pretty large, wide, not in data path but in function, and has a substantial pair of data selectors (muxes) at its input, and a substantial data distributor at its output. A mux does the shifts and rotates, and a 16-bit adder-subtractor does the address and data arithmetic, though it's 16 bits wide only to support the DPTR and PC/Address-Bus operations.
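As a behavioural illustration of the ALU described above (not a netlist, and not taken from any actual core), rotates are pure rewiring that a mux can select, and one 16-bit adder-subtractor can serve both 8-bit data and 16-bit address arithmetic. The function names are hypothetical.

```python
def rl8(a: int) -> int:
    """Rotate an 8-bit value left by one: pure wiring, selectable by a mux."""
    return ((a << 1) | (a >> 7)) & 0xFF

def rr8(a: int) -> int:
    """Rotate an 8-bit value right by one."""
    return ((a >> 1) | (a << 7)) & 0xFF

def addsub16(x: int, y: int, subtract: bool = False):
    """16-bit adder-subtractor: subtract by adding the complement plus one.

    Returns (result, carry_out); on subtract, carry_out == 1 means no borrow.
    """
    if subtract:
        y ^= 0xFFFF      # one's complement of the second operand
    total = x + y + (1 if subtract else 0)
    return total & 0xFFFF, total >> 16

print(hex(rl8(0x81)), hex(rr8(0x81)))
print(addsub16(0xFFFF, 1))             # a PC or DPTR increment wrapping around
print(addsub16(0x1234, 0x0034, True))  # subtraction through the same adder
```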

Viewing many bytes of the instruction stream without a pipeline gives the processor information it can't make any use of. The ALU, address busses, etc. will not be any faster just because you have knowledge about following instructions.

Yes, you're right ... the logic depth, which can be fairly well equalized, using short, wide paths rather than narrow long ones, will provide the rate-determining step. However, if a 3-byte instruction, e.g. MOV DPTR,#HHHH, takes just as long as a single-byte instruction, e.g. MOV A,R0, or a two-byte instruction, e.g. MOV A,VNAME, things will go quite a bit faster even though the individual cycles are longer.
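A back-of-the-envelope version of that trade-off, assuming one core fetches a byte per (short) cycle and the other sees the whole instruction in one (longer) cycle. The instruction mix and both cycle times are invented for the example.

```python
def byte_serial_ns(lengths, cycle_ns):
    """One cycle per instruction byte: time grows with total byte count."""
    return sum(lengths) * cycle_ns

def wide_fetch_ns(lengths, cycle_ns):
    """One cycle per instruction regardless of its length in bytes."""
    return len(lengths) * cycle_ns

# Hypothetical mix of 1-, 2- and 3-byte instructions (lengths in bytes).
lengths = [1, 2, 3, 2, 1, 3, 2, 2]

serial_ns = byte_serial_ns(lengths, cycle_ns=20.0)  # assumed short cycle
wide_ns = wide_fetch_ns(lengths, cycle_ns=30.0)     # assumed 50% longer cycle

print(f"byte-serial: {serial_ns:.0f} ns, wide fetch: {wide_ns:.0f} ns")
```

With this mix the average instruction is two bytes long, so the wide fetch wins as long as its cycle is less than twice as long; a mix dominated by single-byte instructions would shrink that margin.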

Think about discrete logic. How much can you manage to do in your discrete logic with just one clock transition? Each gate will have a delay, and your information may in some situations have to ripple through the logic gates and flip-flops. Using a two-phase clock, you would still have quite interesting times to get data from the code space, decode, retrieve input data, compute and store back the result within one low-to-high and one high-to-low clock transition.


I've not built an 805x core ... yet ... though I've done considerable preliminary work on it. I've built other cores, and have found that one can build nearly any MCU core with a simple two-phase clock, e.g. the sort which was used on 6801 or 6502, etc, doing data arithmetic on one phase and address arithmetic on the other. The two internal data spaces, SFR space, code space, and external data space, are all segments of an otherwise contiguous memory space. Since address arithmetic doesn't affect the data, and since data arithmetic doesn't affect the addresses, the address arithmetic cycle can be used to access the composite memory space. Consequently, the data arithmetic result can be transferred to memory during the address arithmetic cycle, and the address arithmetic result can be transferred to the appropriate resource (not data or code memory, but possibly to stack or registers) during the data arithmetic cycle. Because the ALU is wide but shallow, it can easily be used for both sets of arithmetic, thereby eliminating the need for long clearable and presettable up/down counters, which require quite large concatenated gates. I could go on, but I imagine people's eyes are already glazing over.
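For anyone whose eyes haven't glazed over, here is a very simplified behavioural model of the two-phase idea: one shared ALU does data arithmetic on the first phase and address arithmetic on the second, and memory is touched only during the address phase. It is a sketch under my own assumptions (a single accumulator, a store-and-advance loop, hypothetical names), not a description of a finished core.

```python
def alu_add(x: int, y: int, width: int) -> int:
    """The single shared adder, truncated to `width` bits."""
    return (x + y) & ((1 << width) - 1)

def run(memory, start, count, step):
    """Each clock: add `step` to the accumulator, store it, advance the pointer."""
    acc = 0
    pointer = next_pointer = start
    alu_uses = []
    for _ in range(count):
        # Phase 1: latch the previous address result; ALU does 8-bit data arithmetic.
        pointer = next_pointer
        acc = alu_add(acc, step, 8)
        alu_uses.append("data")
        # Phase 2: data result goes to memory; ALU does 16-bit address arithmetic.
        memory[pointer] = acc
        next_pointer = alu_add(pointer, 1, 16)
        alu_uses.append("address")
    return alu_uses

memory = [0] * 4
uses = run(memory, start=1, count=3, step=0x60)
print(memory, uses)
```

The ALU is busy on every phase and never needed twice in the same one, which is the reason no separate up/down counter is required for the pointer in this model.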

RE






