??? 08/20/05 14:54
#99736 - tradeoff Responding to: ???'s previous message
There is usually a tradeoff between code size and speed.
For example, an algorithm-based CRC-16 takes around 20 asm lines (circa 30 bytes); the table-driven version takes 512 bytes for the tables alone... Here, the increase in code size is less pronounced, but the "tricks" certainly take up some code space. My "fast" solution (I called it "behemoth") is huge compared to "standard code", as it is a quite enormous tree with a different branch for almost each input.

It started with dividing the shift into bigger parts according to the '51's capabilities, performed one after the other: first the byte-shifts, then the nibble-shifts and finally the bit-shifts (#1). Then I built the "tree", so that bytes which are completely zero don't get shifted by nibbles and bits (#2). The final version contains some tune-ups, e.g. replacing two clr c by one anl Rx,#3Fh (and two other "tricks" - see the description in the header) (#3).

Here is the evolution of times and size:

      -------- cycles -------|- bytes -
      worst   best   average    size
#1     81      20     50.6       77
#2     69      18     34.8      208
#3     64      18     32.5      204

But nowadays we see increasing code memory sizes in microcontrollers, so maybe it's time to change strategy. Btw., isn't it possible to make the choice of "strategy" (speed vs. size) an option in the compilers?

One more word of caution - the solutions are "speed-optimal" for the standard 12-clocker '51s and for the 2- and 6-clockers, which have the same instruction-cycle structure as the standard. The 4- and 1-clockers have a different number of instruction cycles for the various instruction groups. My "behemoth" uses quite a lot of jumps, and jumps take longer than other instructions on single-clockers; they also might spoil the jump-cache on the >=40MHz variants (SiLabs, uPSD34xx). Multiply and divide also tend to execute significantly longer on single-clockers compared to other instructions, so it might turn out that the "conventional" solution is comparable in execution time to these "tuned-up" ones.

Jan Waclawek

PS. IMHO you should ask Craig regarding use of the results in SDCC - although IANAL.
PS2. Craig, isn't it possible to see the other solutions, too?
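
To illustrate the CRC-16 size/speed tradeoff mentioned above, here is a minimal C sketch of both variants. The post does not say which CRC-16 is meant; this assumes the common reflected polynomial 0xA001 with a zero initial value (often called CRC-16/ARC). The function names are made up for illustration. On a real '51 the 512-byte table would normally sit precomputed in code memory; it is built at run time here only to keep the example self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: reflected polynomial 0xA001, init 0 (CRC-16/ARC).
   The original post does not name the CRC variant. */
#define CRC16_POLY 0xA001u

/* Bitwise version: tiny code, but eight loop passes per input byte. */
uint16_t crc16_bitwise(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;
    while (len--) {
        crc ^= *data++;
        for (int i = 0; i < 8; i++) {
            if (crc & 1u)
                crc = (uint16_t)((crc >> 1) ^ CRC16_POLY);
            else
                crc = (uint16_t)(crc >> 1);
        }
    }
    return crc;
}

/* Lookup table: 256 entries * 2 bytes = 512 bytes. */
static uint16_t crc16_table[256];

/* Fill the table once; each entry is the bitwise CRC of one byte value. */
void crc16_table_init(void)
{
    for (unsigned n = 0; n < 256; n++) {
        uint16_t c = (uint16_t)n;
        for (int i = 0; i < 8; i++) {
            if (c & 1u)
                c = (uint16_t)((c >> 1) ^ CRC16_POLY);
            else
                c = (uint16_t)(c >> 1);
        }
        crc16_table[n] = c;
    }
}

/* Table-driven version: one lookup per input byte, no inner loop. */
uint16_t crc16_table_driven(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;
    while (len--)
        crc = (uint16_t)((crc >> 8) ^ crc16_table[(crc ^ *data++) & 0xFFu]);
    return crc;
}
```

Call crc16_table_init() once before crc16_table_driven(); both functions should then return the same value for any input.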
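
The byte/nibble/bit staging of version #1 can be sketched in C as follows. This is not the "behemoth" itself: the post gives neither the operand width nor the shift direction, so the sketch assumes a logical right shift of a 32-bit value held as four bytes, and the function name is hypothetical. It also leaves out the zero-byte "tree" of #2 and the tune-ups of #3.

```c
#include <stdint.h>

/* ASSUMPTIONS (not stated in the post): 32-bit operand, logical right
   shift, v[0] is the most significant byte, n is in the range 0..31. */
void shr32_staged(uint8_t v[4], uint8_t n)
{
    uint8_t bytes = (uint8_t)(n >> 3); /* whole-byte part of the shift */
    uint8_t bits  = (uint8_t)(n & 7u); /* remaining 0..7 bit positions */

    /* Stage 1: byte shifts, which are plain register moves on a '51. */
    if (bytes) {
        for (int i = 3; i >= 0; i--)
            v[i] = (i >= bytes) ? v[i - bytes] : 0;
    }

    /* Stage 2: one nibble shift if at least 4 bit positions remain. */
    if (bits & 4u) {
        for (int i = 3; i > 0; i--)
            v[i] = (uint8_t)((v[i] >> 4) | (v[i - 1] << 4));
        v[0] >>= 4;
        bits = (uint8_t)(bits - 4u);
    }

    /* Stage 3: the remaining 0..3 single-bit shifts, carrying the low
       bit of each byte into the top bit of the next lower byte. */
    while (bits--) {
        uint8_t carry = 0;
        for (int i = 0; i < 4; i++) {
            uint8_t next = (uint8_t)(v[i] & 1u);
            v[i] = (uint8_t)((v[i] >> 1) | (carry << 7));
            carry = next;
        }
    }
}
```

The point of the staging is that no input needs more than three single-bit passes over the operand, instead of up to 31 in a naive bit-by-bit loop.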