??? 03/24/10 09:18
#174463 - Incomplete byte order. But pad is still dangerous
Responding to: ???'s previous message
No. The example code does not completely take care of endianness.
Why? It mentions 1234 and 4321. But what about 3412 or 2143, for example? Yes, there exist machines where the byte order within 16-bit numbers differs from the order of the 16-bit halves within 32-bit numbers. This can happen when a processor has a native format for 16-bit numbers, and then either the compiler decides in software how to lay out 32-bit numbers, or the hardware actually writes the two 16-bit halves in the opposite order from the bytes within each half.

But byte order is still quite an easy thing to handle. In the majority of cases the speed difference isn't important, so the code can always split/merge into 1-byte accesses. The reason is that byte order only matters when you are handing data over between two different programs. Within the same program you will always have the same byte order, unless you are using a processor like the PowerPC, where you can switch byte order on the fly to allow faster emulation speeds.

Note that the runtime if statement that chooses between a raw read/write of a 32-bit value and a pack/unpack of four bytes may - on a lot of faster processors - take as much time as always performing the pack/unpack. Another thing is that many compilers recognize the pack/unpack code and can themselves, if they know the alignment of the data, generate 16-bit or 32-bit accesses. So the extra code added as an optimization may not really gain anything.

The next thing to think about when switching between packed bytes and word accesses is alignment. The x86 processors are very nice in that they support unaligned memory accesses, so you only lose speed. Many other processor architectures will kill your program for an unaligned access. Making the alignment decision at runtime would then mean that you both compare the byte order against four different combinations and check the low bits of the pointer for alignment.
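The "always split/merge into 1-byte accesses" approach mentioned above could look roughly like the sketch below. The function names are my own (hypothetical, not from the article), and I assume the external format is a 32-bit value in either little-endian or big-endian order. Because only single bytes are read and written, the code never needs to know the host byte order and never performs an unaligned word access.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names, chosen for this sketch only. */

/* Store v as four bytes, least significant byte first. */
void store_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFFu);
    p[1] = (unsigned char)((v >> 8) & 0xFFu);
    p[2] = (unsigned char)((v >> 16) & 0xFFu);
    p[3] = (unsigned char)((v >> 24) & 0xFFu);
}

/* Rebuild a 32-bit value from four bytes, least significant byte first. */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Store v as four bytes, most significant byte first. */
void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)((v >> 24) & 0xFFu);
    p[1] = (unsigned char)((v >> 16) & 0xFFu);
    p[2] = (unsigned char)((v >> 8) & 0xFFu);
    p[3] = (unsigned char)(v & 0xFFu);
}

/* Rebuild a 32-bit value from four bytes, most significant byte first. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         | (uint32_t)p[3];
}

/* Usage: round-trip through a deliberately odd offset in the buffer. */
void byte_order_demo(void)
{
    unsigned char buf[5] = {0};

    store_le32(buf + 1, 0x12345678u);
    assert(buf[1] == 0x78 && buf[4] == 0x12);
    assert(load_le32(buf + 1) == 0x12345678u);

    store_be32(buf + 1, 0x12345678u);
    assert(buf[1] == 0x12 && buf[4] == 0x78);
    assert(load_be32(buf + 1) == 0x12345678u);
}
```

The same code compiles unchanged for 1234, 4321 and the mixed orders, so no #ifdef on the host byte order is needed.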
Inspecting the low bits of a pointer is itself one more thing to think about when talking about portability.

One small thing about your article. The author writes "Some people classify a register as a big-endian, because it stores its most significant byte at the lowest memory address." It seems he has decided that the address of a register is a specific end of the register. That suggests the author isn't really 100% comfortable with the problem. Most processors can't take the address of a register. And for processors with memory-mapped registers, the architecture decides the byte order.

Next thing: "and the PowerPC® families are all big-endian". For some processors it's important to also mention the OS or framework, since the processor may support multiple behaviours.

But you have still ignored a large part of the issue when playing with structures and unions, and that is alignment. As soon as a struct has members of different sizes and isn't packed, funny things will happen. What do you think the #ifdef blocks will look like when you start to take care of all the possible combinations of padding?

What we know is that multiple objects in a union will have the same start address. But as soon as there are multiple fields in a struct, there are no more general rules about the distance between these fields. That is a big reason why unions should be used as little as possible for type conversions. The standard added them for a very different reason - to store one of several different values in the same memory space. The standard did not consider storing a value of one type and then retrieving the data through a different union member.
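To make the padding point above concrete, here is a minimal sketch. The struct, the 7-byte little-endian wire layout and all names are assumptions of mine for illustration, not something from the article. The idea is to copy field by field into a byte buffer instead of overlaying a struct or union on the data, so the compiler's choice of padding never leaks into the external format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record, used only for illustration. */
struct record {
    uint8_t  tag;
    uint32_t value;
    uint16_t count;
};

/* Assumed wire format: 1 + 4 + 2 bytes, little-endian, no padding. */
enum { RECORD_WIRE_SIZE = 7 };

/* Padding bytes the compiler added to the in-memory struct.
   The result depends on compiler and target; it is not fixed by the language. */
size_t record_padding(void)
{
    return sizeof(struct record) - RECORD_WIRE_SIZE;
}

/* Write each field to its fixed position in the wire buffer. */
void record_pack(unsigned char out[RECORD_WIRE_SIZE], const struct record *r)
{
    out[0] = r->tag;
    out[1] = (unsigned char)(r->value & 0xFFu);
    out[2] = (unsigned char)((r->value >> 8) & 0xFFu);
    out[3] = (unsigned char)((r->value >> 16) & 0xFFu);
    out[4] = (unsigned char)((r->value >> 24) & 0xFFu);
    out[5] = (unsigned char)(r->count & 0xFFu);
    out[6] = (unsigned char)((r->count >> 8) & 0xFFu);
}

/* Read each field back from its fixed position in the wire buffer. */
void record_unpack(struct record *r, const unsigned char in[RECORD_WIRE_SIZE])
{
    r->tag   = in[0];
    r->value = (uint32_t)in[1]
             | ((uint32_t)in[2] << 8)
             | ((uint32_t)in[3] << 16)
             | ((uint32_t)in[4] << 24);
    r->count = (uint16_t)(in[5] | (in[6] << 8));
}

/* Usage: the round trip works whatever padding the compiler chose. */
void record_demo(void)
{
    struct record in = { 0xA5, 0x12345678u, 0xBEEF };
    struct record out;
    unsigned char wire[RECORD_WIRE_SIZE];

    record_pack(wire, &in);
    record_unpack(&out, wire);
    assert(out.tag == in.tag && out.value == in.value && out.count == in.count);

    /* The only portable statement about the layout: value starts at or after the end of tag. */
    assert(offsetof(struct record, value) >= sizeof in.tag);
}
```

On many 32-bit and 64-bit targets sizeof(struct record) comes out as 12 rather than 7, but nothing in the code above relies on that number.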