??? 05/15/09 05:46
#165360 - It's not that difficult
Responding to: ???'s previous message
Per Westermark said:
The only problem with a pipeline is the startup time after an unexpected jump.
You don't have to worry about any reset conditions. Normal speculative execution just means that you can throw away results at the last step of the pipeline. From the program's view, an instruction hasn't happened unless the last step in the pipeline gets accepted. With a short pipeline, you won't really be able to see the startup time either. After a reset, it may take a couple of extra cycles until the first results get produced. In the same way, an interrupt will require these cycles until it is activated, but these cycles would normally be so much shorter (the pipeline allows a higher clock frequency, since you limit the sequential state changes required for each step of the pipeline) that the total number of nanoseconds for the interrupt response will not be affected. The ALU etc. of the 8051 are so tiny that it is very easy for the pipeline to compute both sides of a conditional branch and throw away the wrong alternative. In the end, a pipeline need not affect predictability at all. You could still count your cycles for the individual instructions. It isn't until you start with instruction reordering or concurrent (accepted, instead of speculative) instructions that you will lose the ability to compute exact timing. It's still predictable, since computer programs can and will "know" exactly what the state of the pipeline is, in the event that the core is pipelined.

I disagree with the notion that the 805x ALU is small compared with the remainder of the logic. I also disagree with the notion that there has to be a lot of propagation delay when reaching into code space, internal data space, SFR space, external data space, etc. The ALU, from where I sit, is pretty large, wide not in data path but in function, and has a substantial pair of data selectors (muxes) at its input and a substantial data distributor at its output. A mux does the shifts and rotates, and a 16-bit adder/subtractor does the address and data arithmetic, though it's 16 bits wide only to support the DPTR and PC/address-bus operations.

Per Westermark said:
Viewing many bytes of the instruction stream without a pipeline gives the processor information it can't make any use of. The ALU, address busses, etc. will not be any faster just because you have knowledge about the following instructions.

Yes, you're right ... the logic depth, which can be fairly well equalized by using short, wide paths rather than narrow, long ones, will provide the rate-determining step. However, if a 3-byte instruction, e.g. MOV DPTR,#HHHH, takes just as long as a single-byte instruction, MOV A,R0, or a two-byte instruction, MOV A,VNAME, things will go quite a bit faster even though the individual cycles are longer.

Per Westermark said:
Think about discrete logic. How much can you manage to do in your discrete logic with just one clock transition? Each gate has a delay, and your information may in some situations have to ripple through the logic gates and flip-flops. Using a two-phase clock, you would still have quite interesting times getting data from the code space, decoding, retrieving the input data, computing, and storing back the result within one low-to-high and one high-to-low clock transition.

I've not built an 805x core ... yet ... though I've done considerable preliminary work on it. I've built other cores, and have found that one can build nearly any MCU core with a simple two-phase clock, e.g. the sort used on the 6801 or 6502, doing data arithmetic on one phase and address arithmetic on the other.
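To make the quoted timing argument concrete, here is a minimal sketch, in C rather than HDL, of why a short in-order pipeline keeps timing exactly computable: every instruction keeps a fixed cycle cost, and the only extra term is a constant pipeline-fill delay after reset, an interrupt, or a taken jump. The opcode set, the cycle counts, and the three-stage depth below are assumptions invented for the illustration, not the timings of any actual 805x core.

```c
/*
 * Toy timing model only: fixed per-opcode cycle costs plus a constant
 * pipeline-fill term.  The depth and the numbers are made up for the
 * illustration, not taken from any real core.
 */
#include <stdio.h>

#define PIPE_DEPTH 3                    /* assumed fetch/decode/execute   */

typedef enum { OP_MOV, OP_ADD, OP_MOVX, OP_SJMP } Op;

static const int cycles_per_op[] = {
    [OP_MOV]  = 1,
    [OP_ADD]  = 1,
    [OP_MOVX] = 2,                      /* external access assumed slower */
    [OP_SJMP] = 1 + (PIPE_DEPTH - 1)    /* a taken jump refills the pipe  */
};

/* The time of a linear run of instructions is just the sum of the table
 * entries plus one pipeline fill after reset or interrupt entry; nothing
 * about the pipeline makes the result data-dependent or unpredictable.  */
static int run_cycles(const Op *code, int n)
{
    int total = PIPE_DEPTH - 1;         /* initial fill                   */
    for (int i = 0; i < n; i++)
        total += cycles_per_op[code[i]];
    return total;
}

int main(void)
{
    /* a hypothetical interrupt handler body, ending with a jump out */
    const Op isr[] = { OP_MOV, OP_MOVX, OP_ADD, OP_MOV, OP_SJMP };
    printf("worst-case ISR latency: %d cycles\n",
           run_cycles(isr, (int)(sizeof isr / sizeof isr[0])));
    return 0;
}
```

Nothing in it has to guess: the total is a straight sum, which is exactly the "you could still count your cycles" claim in the quoted text.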
The two internal data spaces, the SFR space, the code space, and the external data space are all segments of an otherwise contiguous memory space. Since address arithmetic doesn't affect the data, and data arithmetic doesn't affect the addresses, the address arithmetic cycle can be used to access the composite memory space. Consequently, the data arithmetic result can be transferred to memory during the address arithmetic cycle, and the address arithmetic result can be transferred to the appropriate resource (not data or code memory, but possibly the stack or registers) during the data arithmetic cycle. Because the ALU is wide but shallow, it can easily be used for both sets of arithmetic, thereby eliminating the need for long clearable and presettable up/down counters, which require quite large concatenations of gates.

I could go on, but I imagine people's eyes are already glazing over.

RE
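For anyone trying to picture the "segments of one contiguous memory space" arrangement described above, here is a rough behavioural sketch: each architectural space becomes a fixed offset into one flat array, so a single address-arithmetic step can reach any of them. The base offsets and segment sizes are illustrative assumptions, not the layout of any particular core.

```c
/*
 * Behavioural sketch only: the 805x spaces modelled as segments of one
 * flat store.  Bases and sizes below are illustrative assumptions.
 */
#include <stdint.h>
#include <stdio.h>

enum Space { IRAM_LO, IRAM_HI, SFR, CODE, XDATA };

static const uint32_t base[] = {
    [IRAM_LO] = 0x00000,    /* lower 128 bytes of internal RAM        */
    [IRAM_HI] = 0x00080,    /* indirect-only upper 128 bytes          */
    [SFR]     = 0x00100,    /* SFR page (direct addresses 80h..FFh)   */
    [CODE]    = 0x00180,    /* 64K code space                         */
    [XDATA]   = 0x10180     /* 64K external data space                */
};

static uint8_t flat[0x20180];           /* one contiguous backing store   */

/* One unified access path: segment base plus the offset within it. */
static uint8_t *cell(enum Space s, uint16_t offset)
{
    return &flat[base[s] + offset];
}

int main(void)
{
    *cell(SFR, 0x00)     = 0x55;        /* e.g. P0, direct address 80h    */
    *cell(XDATA, 0x1234) = 0xAA;        /* an external data byte          */
    printf("SFR+00h=%02X  XDATA[1234h]=%02X\n",
           *cell(SFR, 0x00), *cell(XDATA, 0x1234));
    return 0;
}
```

The addition base[s] + offset is the sort of address arithmetic the post assigns to the second clock phase, and it never touches the data itself.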