??? 11/26/07 18:31
#147428 - re: nevertheless
Responding to: ???'s previous message
Richard Erlacher said:
Andy Peters said:
Before condemning muxes built from simple 4-LUTs as being too slow, it might be worth your while to code your buffer or mux or whatever, then synthesize and implement it, choosing a Spartan 3E as the target. Betcha it'll be "fast enough."

It won't be fast enough unless it's at least as fast, with respect to claimed RAM speed, as the tristate option was in earlier generations.

... but as I pointed out, internal tristates were slower (certainly in the XC4000 parts) than muxes.

You're right, certainly, about the amount of RAM currently available as block RAM, even in BIG devices. However, even if it's only 8 bits wide, if it's a deep mux, it's still slowed by the delay of a LUT for each one that's needed to address the RAM. If the RAM is comprised of 4Kb blocks, then there have to be 8 of them to generate a 32k word depth. Moreover, there have to be 36 of them to make a 36-bit width. You've not addressed the question, which was how much delay is added by each LUT in the chain.

Wait a minute. Your muxes are still only 8:1, replicated 36 times. There might be a really tiny additional delay due to increased fan-out on the select lines (which ARE registered, right?). And look further: Spartan3 had new wide dedicated muxes. An 8:1 mux using these resources is one level of logic. You can do a 16:1 mux in one CLB and a 32:1 mux in two CLBs. The mux resources have dedicated local routing for performance improvements. But if you can't fit what you want in the FPGA, I suppose it doesn't matter.

From TFDS: Spartan 3E, -5 (faster) speed grade, prop delay through a CLB is 0.66 ns. Routing will add to that delay. Make it 1.66 ns.

The delay would be the same with a narrow or a wide word, but the width does multiply the amount of logic linearly with width.

OK, you see that.

What is hidden in the fine print is the delay that using the highly touted block RAM imposes.

From TFDS: Spartan 3E clock-to-out time for block RAMs is 2.45 ns. Spartan 2E BRAM clock-to-out is slower, 3.1 ns.
Even more obscure, at least in the Spartan-II family with which I'm somewhat familiar, and which DOES have tristate resources, is the penalty for concatenating "distributed" RAM (unused LUTs) via muxes rather than tristate buses.

OK, back to the S2E data sheet. Yes, the tristates are fast: 0 ns prop delay and 0.1 ns enable/disable times. And yes, 6-input functions in the S2E CLB are slower (0.9 ns). S3E has no tristates and clock-to-out is 2.05 ns. Back to my point: It's probably worth retargeting a newer family to see what sort of performance you get. I still say that you'll do better in a newer family.

Since the fastest FPGA families are also the most costly, I'd think it important to preserve the high speed for which I'm paying. If I can't do that, thanks to the lack of a fast way to utilize all those otherwise unused gates (LUTs) as RAM when I need it, I lose interest in paying the high price. The availability of a hard PowerPC core doesn't impress me when I don't want to use it. Likewise, the fact that a single LUT is capable of high speed in the large, costly FPGA doesn't help me if I just need 200K "gates" (as defined by the marketing guys), which really only amounts to 40k "real" gates. The ultra-fast multipliers don't help much, either, if I'm not using 'em.

So don't choose a Virtex-4 or -5. Try a Spartan 3A. Really. No PowerPC core, simplified power supply requirements.

(snip irrelevant cell-phone rant)

As far as I'm concerned, FPGA technology is what it is, but, to me, it's a big disappointment, as I know what it could have been.

What could it have been? Stop using S2 parts and use something newer (and cheaper). What sort of speed requirements are you trying to meet, anyways?

-a
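For anyone who wants to tally the figures quoted above, here is a rough back-of-the-envelope sketch in Python. It is only arithmetic on the numbers from the post, not a timing analysis: the names are made up for illustration, the 1 ns routing allowance is the post's own guess, treating each 4Kb block as 4K x 1 is an assumption, and the 8:1 mux is assumed to be a single level of logic as claimed for the dedicated wide muxes. Setup time, clock skew and real routing are all ignored.

```python
# Back-of-the-envelope tally of the figures quoted in the post.
# All names are hypothetical; values are in picoseconds to keep the sums exact.

BLOCK_DEPTH_WORDS = 4 * 1024     # one 4Kb block used as 4K x 1 (assumption)
TARGET_DEPTH_WORDS = 32 * 1024   # the 32k-word depth from the post
TARGET_WIDTH_BITS = 36           # the 36-bit width from the post

S3E_CLB_PS = 660                 # quoted Spartan 3E -5 CLB prop delay
ROUTING_PS = 1000                # the post's routing allowance (a guess)
S3E_BRAM_CLK_TO_OUT_PS = 2450    # quoted Spartan 3E BRAM clock-to-out


def mux_inputs(target_depth, block_depth):
    """Number of blocks stacked in depth, i.e. inputs per output mux."""
    return -(-target_depth // block_depth)  # ceiling division


def read_path_ps(bram_clk_to_out_ps, logic_ps, routing_ps):
    """BRAM clock-to-out plus one level of mux logic plus routing."""
    return bram_clk_to_out_ps + logic_ps + routing_ps


inputs = mux_inputs(TARGET_DEPTH_WORDS, BLOCK_DEPTH_WORDS)
total_blocks = inputs * TARGET_WIDTH_BITS
path_ps = read_path_ps(S3E_BRAM_CLK_TO_OUT_PS, S3E_CLB_PS, ROUTING_PS)

print(f"{inputs}:1 mux, replicated {TARGET_WIDTH_BITS} times")
print(f"{total_blocks} blocks in total")
print(f"estimated read path: {path_ps / 1000:.2f} ns")
```

The mux depth depends only on how many blocks are stacked in depth, so widening the word replicates the mux without lengthening the path, which is the point both posters agree on.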