Yeah, it exists and working, through i do not test exactly that loader on linux ppc, but there is lot of loaders of all kin of 3d formats, and they mostly working (as they works on amigaos4).
Through i still think the reasson why that loader keepts as it , because probabaly linux ppc kernel emulate in softwate those unalignment access instead of crashes, so they just didn't noticed issues with that loader before
Seeing the various code snippets, isn't the one with the lwz asm faster or is your/corto's code optimized to lwbrx?
Dunno about speed, that need somehow to meassure it.. But probabaly differences not that big (expectually with CPUs which are faster than 100 mhz :) ).
@Corto
Did you maybe have any idea, of how to fix it all more cleaner : what i mean, is to avoid creating the same structures with the same name as it feels kind of bad. Those structs are generally only used for loading and discarded afterwards. So as long as we document the "real" type we wouldn't need a copy of struct .. Maybe something like this :
Ok, you're right about the management of this kind of exception by Linux.
About speed: no difference, that's a matter of few instructions at load time.
What I proposer could certainly be cleaner but how you can avoid a copy in a structure that add an abstraction level, as the initial structure is badly aligned and playing with fields and padding won't give the expected result.
I will have to look at your example as I'm not so familiar with C++ and so I'm not sure to understand what you want to do there.
What I proposer could certainly be cleaner but how you can avoid a copy in a structure that add an abstraction level, as the initial structure is badly aligned and playing with fields and padding won't give the expected result.
I will have to look at your example as I'm not so familiar with C++ and so I'm not sure to understand what you want to do there.
The idea behind the TPack1 template was to let the compiler create the copy and don't do it yourself. So we only have the structs with padding, no more PACK_STRUCT and irrPack includes. And for loading we create a new variable of that struct with TPack1 which creates a copy of the same struct without padding . After loading we copy that one back to the padded struct. On big endian just with = copy and on little endian with the swapping stuff and get_unaligned_le_float. And floats in the structs are replaced by the new union.
We probably anyway can forget about that TPack1 template thing, maybe we not need it ? Just replace all float's with UnalignedFloat32() for example:
union UnalignedFloat32
{
u32 uval;
u8 cval[4];
f32 fval;
};
After we did that we get errors in any place that uses those floats. And in that place we can fix it. The byteswap can be done with the uval then. And for the assignment to the real structures we can use some function which either creates a copy or copies the bytes to the float.
When the float is used directly like with swapping the sign (+-) then we need to use a copy of UnalignedFloat32 in that place. Then copy it back with the uval. Or alternatively move the sign-swapping code completely to the place where it's assigned to the real mesh-vertices later.
pPtr += sizeof(u16);
HybridFloat tmpfloat2;
char *pVPtr2 = (char *)pPtr; // address of the first packed vertex in memory
MS3DTriangle *triangles = new MS3DTriangle[numTriangles];
pPtr += sizeof(MS3DTrianglePacked) * numTriangles;
if (pPtr > buffer+fileSize)
{
delete [] buffer;
os::Printer::log("Loading failed. Corrupted data found.", file->getFileName(), ELL_ERROR);
return false;
}
#ifdef __BIG_ENDIAN__
for (u16 tmp=0; tmp<numTriangles; ++tmp)
{
// because it u16 , do nothing new
triangles[tmp].Flags = os::Byteswap::byteswap(triangles[tmp].Flags);
// because it also u16, do nothing new, just repat 3 times as it was previosly in "for" loop
triangles[tmp].VertexIndices[0] = os::Byteswap::byteswap(triangles[tmp].VertexIndices[0]);
triangles[tmp].VertexIndices[1] = os::Byteswap::byteswap(triangles[tmp].VertexIndices[1]);
triangles[tmp].VertexIndices[2] = os::Byteswap::byteswap(triangles[tmp].VertexIndices[2]);
//our floats start, align them.
// Read per byte with swapping and fill 9 VertexNormals ([3][3]), 3 S and 3 T , in the final structure
// pVPtr[0] and pVPtr[1] is ignored, as it contains u16 Flags;
// pVPtr[2] and pVPtr[3] , pVPtr[4] and pVPtr[5], pVPtr[6] and pVPtr[7] is ignored, as it contains u16 VertexIndices[3].
// So we start from pVPtr[8]
But there i crashes. I am sure something wrong there..
I mostly got the logic behind it, but not sure if i undersand things right how to skip first bytes (so see i count everything since pvrt[8], because structure have 1 u16 (2 bytes) and 3 u 16 (6 bytes), at top before floats starts).
Not sure if i not mess with those [x][x] things there, but probabaly everything is placed in memory one after another, so should be about right as i do it all ?
EDIT: thinking a bit more , probably that part is wrong in my rewrite:
// because fist filed in structure "u16 Flags", do nothing new there. No "for j<3" loop there was.
triangles[tmp].Flags = os::Byteswap::byteswap(triangles[tmp].Flags);
// because second field in structure "u16 VertexIndices[3];", do also nothing new (just repeat it 3 times, as we get rid of "for (u16 j=0; j<3; ++j)" loop:
triangles[tmp].VertexIndices[0] = os::Byteswap::byteswap(triangles[tmp].VertexIndices[0]);
triangles[tmp].VertexIndices[1] = os::Byteswap::byteswap(triangles[tmp].VertexIndices[1]);
triangles[tmp].VertexIndices[2] = os::Byteswap::byteswap(triangles[tmp].VertexIndices[2]);
I.e. i didn't align them as they not floats but u16, but then from another side, they now doesn't contain anything yet at that point (as we no longer start out with a pointer to the memory we loaded). So it contains random values probabaly .. ?
Edited by kas1e on 2019/10/28 18:52:25 Edited by kas1e on 2019/10/28 18:53:59 Edited by kas1e on 2019/10/28 18:59:35 Edited by kas1e on 2019/10/28 19:02:12 Edited by kas1e on 2019/10/28 19:09:35 Edited by kas1e on 2019/10/28 19:36:58 Edited by kas1e on 2019/10/28 20:00:23 Edited by kas1e on 2019/10/28 20:01:55 Edited by kas1e on 2019/10/28 20:17:40 Edited by kas1e on 2019/10/29 11:42:30
of course it will need a triangle structure for read and another for writing the reordered floats but anyway the code will become more readable/maintainable
Feel free to show on latest example from 2 posts above how to fix code with your getfloat().
We specially read byte per byte, for purpose.. The snipet you quote didnt fits here in code i want to fix now.
But of course, feel free plz to post code how you think it should ve in case of that MS3DTriangle structure and later byteswapping.
Currently i have no problems how code reads, i just need that boring and non-interesting shit works on our cpu. I have lots of ideas from everywhere, but at moment can only follow corto's way as it seems works (and i not about first problem which Salas00 help to fix, now its another issue, with different returns and in different case, so code you quote didnt fits)
I will be busy in the coming days but on the week-end, I will attend the Alchimie show here in France so I will be 2 days programming on Amiga with my friends (hey thellier!)
That could be nice to try to make it working there.
Could you provide a ms3d model? All files I find are corrupted archives or they failed to be extracted ...
But it need to be fixed exactly that file which we discuss, because we can then ask mantainer to put it to irrlicht's repo, and because i need exactly that loader working in whole engine.
Seems that Theiller's way also may work, just strange why it crashes on me on 3st structure when time to load materials (so material structure with 6 floats).
Another question is : "register" got deprecated recently in C++. And it never was more than a hint to the compiler as one of devs told me. So is that really necesary here for us ? newer compilers should start ignoring it.
Also, don't lines like tri->VertexNormals[j][0] still write to a float which doesn't care about alignment? (don't know where this specific float sits now in memory, but basically - not sure that if it failed before why it can works now then ..)
@All Are floating point exceptions only on reading the float, but we are allow to write ?
@kas1e There isn't much benefit to using the register keyword nowadays. The optimizers in modern compilers are smart enough to put variables into registers whenever it's possible.
This is just like television, only you can see much further.
@BSZili Yeah, register can be out , but i still have issues with understanding "don't lines like tri->VertexNormals[j][0] still write to a float which doesn't care about alignment? " ? Because as far as i understand for now alignemnt issues with floats mean and load and writing , right ?
When asked to load an unaligned floating-point number from memory, modern PowerPC processors will throw an exception and have the operating system perform the alignment chores in software. Performing alignment in software is much slower than performing it in hardware.
Ok "moder kernel" its not about amigaos4 one, but then they refer only "when asked to load an unaligned floating-point number from memory".
So did it mean only _reading_ are issues for alignment, not writing , so that "tri->VertexNormals[j][0] = getfloat(Balbal)" are fine ?
Also why that damn thing didn't crashes in first 2 strucutre parsings, but still crashes with alignemtn issues when i use it in "load materials" block. Yeah it different, as in first 2 stages where we load vertices and triangles we use struct[x].field->aaaa[x] , but then when we load materials its struct->field[x] , so maybe those GetFloats messed with pointers somehow dunno.. As well as "for" loop there a little bit different, but still about the same imho.
Damn why kernel just can't handle it all, let it me slower , its for sure should't be _that_ slower that anyone can notice visually imho , but we all will forget about possilbe issues with aligments ..
@Thellier And don't we are trying to correct things in the wrong place with your example at all, i mean don't we simply byte reversing data in the same place, while that place still can be unaligned ?
Thank you for the links on models. I've sent you a modified loader by email, that could fix alignment problems in an easier way, padding the beginning of structures (still packed) in order to get the float fields aligned on 4 bytes. I dynamically allocate structures and copy the data to the address of the field Flags (just after the 3-byte padding).
And stack trace point out firstly on that line "vertices[tmp].Vertex[2] = -os::Byteswap::byteswap(vertices[tmp].Vertex[2]);" (i.e. last one), and then on memcpy line.