I made NO optimisation

???
03/07/09 21:07

#163218 - I made NO optimisation
Responding to: ???'s previous message

Per Westermark said:

Comparing the end pointer involved two LOADs.

What optimization setting do you use? In my test, the RealView compiler did place the end pointer in a register so it could directly compare the source pointer with the end pointer before the conditional jump.

I made NO optimisation settings. As I said, I am not an experienced user.

After your advice, I chose -O3 (plus optimise for speed) and got 36.98us / 39.06us for the end_pointer / count methods respectively. The generated code looked longer, but the cycle count is shorter.

Modern compilers are definitely very clever. I am not surprised that you have to give the odd hint, i.e. the compiler could work out that you are doing a fixed number of iterations, and then choose for itself between an integer loop count and a pointer compare.
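For anyone following along, the two loop forms can be sketched roughly like this. It is only an illustration: it assumes a plain byte copy, and the function names are made up here, not taken from the test code used earlier in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Count method: an integer counter controls the loop.
 * (Hypothetical name, for illustration only.) */
void copy_count(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        *dst++ = *src++;
    }
}

/* End-pointer method: the source pointer is compared with a
 * precomputed end pointer, so no separate counter is needed.
 * (Hypothetical name, for illustration only.) */
void copy_end_pointer(uint8_t *dst, const uint8_t *src, size_t n)
{
    const uint8_t *end = src + n;
    while (src != end) {
        *dst++ = *src++;
    }
}
```

Both functions do the same work. Which one comes out faster depends on the compiler and its optimisation level, which is the point of the timings above.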

You always have to drop hints to an 8 bit compiler. One day the 32 bit compiler will know better than you do.

I am learning things about an ARM7 (or the LPC2103 anyway). It does not have an external memory bus, so it could not do Richard's trick.

However, it may internally read ahead on the instruction stream, but it does not execute instructions from a cache. Or can it?

I have no personal problem with an improved execution speed due to pipelining. Neither do I complain if a Compiler has removed some redundant instructions. Richard and many others are horrified by Compiler optimisation. But as far as I can see the ARM7 core has a consistent execution cycle count. So a Simulator or pencil and paper can always calculate exact times from the ASM instructions.
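The pencil-and-paper sum is just cycles divided by clock frequency. A minimal sketch, assuming the cycle count you feed in already includes any memory wait states, and with a made-up function name:

```c
#include <stdint.h>

/* Convert a cycle count to microseconds for a given core clock.
 * Assumes the cycle count is exact, i.e. any wait states are
 * already included in it. (Hypothetical helper.) */
double cycles_to_us(uint32_t cycles, uint32_t clock_hz)
{
    return (double)cycles * 1e6 / (double)clock_hz;
}
```

For example, 600 cycles at an assumed 60 MHz clock works out to 10 us. Those numbers are purely illustrative and are not the figures from my test.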

There was a recent AVR thread on a ultoa() function. A clever trick with a 32 x 16 multiply and subsequent shifts reduced the worst case from 2700 cycles to 87. However good an instruction set you have, a good algorithm can make a dramatic difference.
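To give a flavour of that kind of trick, here is a small sketch of dividing by 10 with a multiply and a shift. It is NOT the routine from the AVR thread (which is not reproduced here). It is a simpler 16-bit variant, and the names div10_u16 and utoa16 are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: n / 10 for any 16-bit n, without a divide.
 * 0xCCCD / 2^19 is slightly more than 1/10; the error is small
 * enough that the result is exact for all n in 0..65535. */
static uint16_t div10_u16(uint16_t n)
{
    return (uint16_t)(((uint32_t)n * 0xCCCDu) >> 19);
}

/* Hypothetical helper: write n as decimal text into buf, which must
 * hold at least 6 bytes (5 digits plus the terminator). Returns buf. */
char *utoa16(uint16_t n, char *buf)
{
    char tmp[5];   /* digits, least significant first */
    int len = 0;
    int i = 0;

    do {
        uint16_t q = div10_u16(n);
        tmp[len++] = (char)('0' + (n - q * 10u));  /* remainder digit */
        n = q;
    } while (n != 0);

    /* Reverse the digits into the caller's buffer. */
    while (len > 0) {
        buf[i++] = tmp[--len];
    }
    buf[i] = '\0';
    return buf;
}
```

The saving comes from replacing each software divide with one multiply and one shift, which matters a great deal on a core with a hardware multiplier but no divide instruction.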

Modern Compilers translate accurately. It is reassuring that they are pretty efficient too.

David.

List of 76 messages in thread

Topic Author Date
NXP P89LPC936 Auxilary RAM              01/01/70 00:00
   Compiler Cannot Save the Day At Runtime              01/01/70 00:00
      I found it.....              01/01/70 00:00
         Incorrect!              01/01/70 00:00
            To be fair,...              01/01/70 00:00
               Its good info to know...              01/01/70 00:00
   how is it done in C?              01/01/70 00:00
   Maybe you should try doing it ASM!              01/01/70 00:00
      You should start a new thread on that!              01/01/70 00:00
      Not in this case!              01/01/70 00:00
         A compiler should support ALL MCU features ...              01/01/70 00:00
            A compiler should translate the language              01/01/70 00:00
               Still C - just not strict.              01/01/70 00:00
            I didn't say that!              01/01/70 00:00
               It is not that clean and clear              01/01/70 00:00
                  Deviants are deviant              01/01/70 00:00
                     Keil and SDCC need no macros ...              01/01/70 00:00
                        Are you sure?              01/01/70 00:00
                     That's probably correct, but ...              01/01/70 00:00
                        A C compiler can map the hardware quite well              01/01/70 00:00
                           compiler vendors are looking at the new processors              01/01/70 00:00
                              I once mentioned that ...              01/01/70 00:00
                           It's not what I'd choose, but it is a matter of perefernce              01/01/70 00:00
                              As flash gets bigger, the code steps do too.              01/01/70 00:00
                              Still not needed for other architectures              01/01/70 00:00
                                 We will have to agree to disagree ... I guess              01/01/70 00:00
                                    Which single-clocker is cheaper than an ARM?              01/01/70 00:00
                                       RE: Which single-clocker is cheaper than an ARM?              01/01/70 00:00
                                       differs considerably from the classic microcontroller?              01/01/70 00:00
                                       Horses for courses              01/01/70 00:00
                                          Always start each project by scanning the market              01/01/70 00:00
                                             On that we can agree              01/01/70 00:00
                                          Maybe, but what are they comparing?              01/01/70 00:00
                                             Did you actually look?              01/01/70 00:00
                                                Yes, I did.              01/01/70 00:00
                                       it's a tradeoff              01/01/70 00:00
                                          Is it 2006 already?              01/01/70 00:00
                                             Really?              01/01/70 00:00
                                                that's outright silly              01/01/70 00:00
                                                   Maybe ... which is why it is not yet the case              01/01/70 00:00
                                                      the eyes of the beholder              01/01/70 00:00
                                                         Look at it from another viewpoint for a moment              01/01/70 00:00
                                          RE: I'm not the one to ask about IC prices              01/01/70 00:00
                                             doesn't mean I'm totally in the dark              01/01/70 00:00
                                                Richard Erlacher has left the planet              01/01/70 00:00
                                                   Maybe Andy's the one who's lost              01/01/70 00:00
                                                      I cannot believe you even looked at ARM              01/01/70 00:00
                                                         It's all relative              01/01/70 00:00
                                                            Price doesn't directly follow processor size              01/01/70 00:00
                                                            What about performance?              01/01/70 00:00
                                                         Be specific please.              01/01/70 00:00
                                                            You can use your own supplier              01/01/70 00:00
                                                               Your test simulates as 41.04us              01/01/70 00:00
                                                                  RE: ARM compiles do not like byte-addressing              01/01/70 00:00
                                                                     A typo on my part              01/01/70 00:00
                                                                        Case of full unroll              01/01/70 00:00
                                                                           ... and what does THAT ARM MCU cost?              01/01/70 00:00
                                                                              WHO CARES              01/01/70 00:00
                                                                                 Absolutely true!              01/01/70 00:00
                                                                              $5 or lower              01/01/70 00:00
                                                                                 My Simulation times were wrong.              01/01/70 00:00
                                                                                    67% of your loop was your loop              01/01/70 00:00
                                                                                    re-written the loop in C              01/01/70 00:00
                                                                                       Try without loop counter              01/01/70 00:00
                                                                                          You're right!              01/01/70 00:00
                                                                                             Similar trick with ARM7 would require 66.67MHz              01/01/70 00:00
                                                                                             60ns with no store instruction.              01/01/70 00:00
                                                                                                25% speedup              01/01/70 00:00
                                                                                                   Actually, I found the for loop faster.              01/01/70 00:00
                                                                                                      Compiler setting?              01/01/70 00:00
                                                                                                         I made NO optimisation              01/01/70 00:00
                                                                                                            Avoid variable aliasing if you like high optimization levels              01/01/70 00:00
                                                                                                               The RealView compiler is very competent              01/01/70 00:00
                                                                                                                  Yes, caching              01/01/70 00:00
                                                                                                                  Yes, caching              01/01/70 00:00
                                                                                                It's much easier with the 8-bit single-clocker              01/01/70 00:00
