Try setting the unused arrays to UINT8 as well. That should work for now, until I add a proper option to disable endianness conversion.
I did and unfortunately this doesn't work at all (upload taking seconds, app becoming blocked, not crashing though, can be closed gracefully). Again, even tried with size/stride 0. Is there a chance that this gets fixed soon? Best would be a proper implementation of VBOSetLArray(W3DNEF_NONE) though, abusing UINT8 for "disable" seems like a bad idea anyway.
I did and unfortunately this doesn't work at all (upload taking seconds, app becoming blocked, not crashing though). Again, even tried with size/stride 0.
That makes no sense whatsoever. The code in question literally compares the element sizes of all arrays in the VBO. If they're all the same, then it enables the global conversion mode.
Sorry, I cannot commit to fixing anything any time soon.
Yes, there's lots inside Nova which doesn't make sense
Quote:
That makes no sense whatsoever. The code in question literally compares the element sizes of all arrays in the VBO. If they're all the same, then it enables the global conversion mode.
Global conversion mode?! It should not do any conversion at all! That's the idea of having plain bytes
So now you are saying that what's important is that the element sizes of all arrays are identical? That makes no sense once more. Why should the size of the elements be of any interest when it comes to deciding whether to copy stuff with your no-endian-double-copier?! The only thing that could matter is if they are tightly packed (which is the case for size/stride 0 btw.) and if they are all of endian-uncritical types (which is the case for NONE or UINT8 regardless of their element size).
Also, some minutes ago you were sure that W3DNEF_NONE would do the job, which clearly isn't the case. So the question is whether your code actually does what you think it should do (it's called a "bug" if it doesn't).
But let's see, let's make it easy, please just answer the question I had some posts ago: is any special parameter combination required for VBOSetArray with W3DNEF_NONE (or W3DNEF_UINT8) to make it work as promised? Let's assume I create a VBO with 2 arrays, VBO size is 1024 bytes, I want array #0 to be my uint8 raw data, so it's uint8, elem size 1, elem count 1024, and I want to disable array #1 temporarily. What's the correct VBOSetLayout command for that, so that the result is fast upload just as if the VBO had been created with just 1 array?
Quote:
Sorry, I cannot commit to fixing anything any time soon.
That's bad. Especially because more and more bug reports have been piling up for almost a year now. Not to mention the famous -O0 story.
Edited by Daytona675x on 2019/9/18 11:41:18 Edited by Daytona675x on 2019/9/18 11:45:33
Sorry, I cannot commit to fixing anything any time soon.
Pretty sad to hear that. I was hoping you would start working on it again, at least to deal with the -O0 issue and those issues with shaders.
Why don't you want to spend even a few hours to at least help Daniel be able to disable conversion entirely (even if it requires fixing/adding something)?
Is the reason money? If so, I can pay $100 for each bug report I file. Will you take that offer?
But what is even sadder is that not only you, but all of us spent time on all this, and now you say "sorry, no bug fixing any time soon", which means never.
Why did we all spend our time getting to the point where almost everything works and just needs some more work, only for you to say sorry guys, no fixes anytime soon? And what should we do? Drop it all, put it in the trash and forget about this crap?
Sure, you have priorities, family and stuff (as all of us do), but why not take even a bit of time to fix things in Nova? You see that we are all trying to improve things, and that it is all important. Why do you want it all to stop? :(
@kas1e No worries, I experimented a bit more: The question "do all arrays of a VBO have the same size?" which Nova falsely uses to select fast-copy (it's the wrong question to ask, it makes no sense) works if applied that way:
- create VBO with N arrays
- set VBO byte size to a multiple of N, add extra bytes if necessary
- if you want to switch to fast endian-conv-free data transfer then do
  for(uint32 i=0;i<N;++i) VBOSetArray(vbo_handle,i,W3DNEF_UINT8,FALSE,1,1,i*(vbo_size/N),vbo_size/N);
This works. Now let's get rid of yet another Nova snail effect.
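The recipe above boils down to a bit of integer arithmetic. Here is a minimal sketch of it, assuming the VBO size may be rounded up freely. The helper names `padded_vbo_size`, `raw_slice_offset` and `raw_slice_length` are made up for illustration and are not part of Warp3D Nova; the VBOSetArray call appears only in a comment, with its arguments copied from the post above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round a VBO byte size up to the next multiple of
   the array count n, so it can be split into n equally sized slices. */
uint32_t padded_vbo_size(uint32_t vbo_size, uint32_t n)
{
    assert(n > 0);
    uint32_t rest = vbo_size % n;
    return rest == 0 ? vbo_size : vbo_size + (n - rest);
}

/* Hypothetical helper: byte length of each raw slice. */
uint32_t raw_slice_length(uint32_t padded_size, uint32_t n)
{
    assert(n > 0 && padded_size % n == 0);
    return padded_size / n;
}

/* Hypothetical helper: byte offset of slice i (0 <= i < n). */
uint32_t raw_slice_offset(uint32_t padded_size, uint32_t n, uint32_t i)
{
    assert(i < n);
    return i * raw_slice_length(padded_size, n);
}

/* Intended use, mirroring the loop from the post:
 *
 *   uint32_t size = padded_vbo_size(wanted_size, N);
 *   for (uint32_t i = 0; i < N; ++i)
 *       VBOSetArray(vbo_handle, i, W3DNEF_UINT8, FALSE, 1, 1,
 *                   raw_slice_offset(size, N, i),
 *                   raw_slice_length(size, N));
 */
```

As an illustration, a 160-byte VBO with 3 arrays would be padded to 162 bytes and split into three 54-byte slices at offsets 0, 54 and 108.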
>The only thing that helps is to create a simple 1 array VBO in the first place.
?? Please, explain more. Do you mean a simple array of floats declared as W3DNEF_NONE + uint8? Because later in your text you say it doesn't work but say "we still have a potential up-to-factor-4"
I mean, did you really obtain a x4 speed-up, or is it theory?
@thellier You misread / skipped stuff. And maybe overlooked the latest post.
Quote:
The only thing that helps is to create a simple 1 array VBO in the first place
This one is outdated. See post above your post. If the unused arrays are "disabled" in the way (and only this way) I outlined there then it works with VBOs with arbitrary numbers of arrays.
Quote:
Do you mean a simple array of floats declared as W3DNEF_NONE + uint8 ?
I don't understand what you mean by that.
Quote:
Because later in your text you say it doesn't work but say "we still have a potential up-to-factor-4"
As I stated: 7 fps vs 30 fps (~ factor 4) was what I measured:
a) 30 fps if I used a 1-array index VBO with the no-endian-conv-trick.
b) 7 fps if I instead made it a 2-array index VBO with the 2nd array disabled via W3DNEF_NONE, which weirdly enough results in Nova's slow (standard) copy-conv being triggered.
What did not work at that time was getting (b) to be as fast as (a), because I hadn't found a way to trick Nova into the raw-byte-copy-mode if the VBO had more than 1 array.
Hans' W3DNEF_NONE info turned out to simply not work at all. And his W3DNEF_UINT8 hint lacked proper usage info, and Nova has a logic bug when it comes to mode selection, which is why I did not try it in a way that would suit Nova.
NOW I found such a way. So now I get (b) with the speed of (a), which means that if I apply that wisdom to my internal interleaved vertex data VBOs, then the upload of those should speed up accordingly.
Important: note that this expected speedup is usually not going to result in a 4x higher framerate!! What's being sped up will be the VBO upload only! If the respective app uses its own VBOs then there most likely won't be too many uploads... Also, ogles2 does a lot of tricks to avoid uploads at all costs. So you can expect the biggest improvements for situations where a) the ogles2 client uses client-RAM instead of its own VBOs and b) that vertex data is frequently changing.
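A back-of-the-envelope model shows why a 4x faster upload rarely means a 4x higher framerate. This is only a sketch, assuming a frame's time splits cleanly into an upload part and everything else. The function name and the numbers in the note below are illustrative, not measurements.

```c
#include <assert.h>

/* Overall frame-rate gain when only the upload part of a frame gets faster.
   upload_fraction: share of the frame time spent uploading (0..1).
   upload_speedup:  how many times faster the upload becomes (> 0). */
double overall_speedup(double upload_fraction, double upload_speedup)
{
    assert(upload_fraction >= 0.0 && upload_fraction <= 1.0);
    assert(upload_speedup > 0.0);
    /* The non-upload share stays as it is, the upload share shrinks. */
    return 1.0 / ((1.0 - upload_fraction) + upload_fraction / upload_speedup);
}
```

If a frame consisted of nothing but uploads (fraction 1.0), a 4x faster upload would give the full factor 4. If uploads made up only a tenth of the frame time, the same 4x would yield roughly a factor of 1.08.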
@Daytona675x Sorry, you are right. I only read until page 34 :-/ so I missed your last explanations. I am not accustomed to a coding subject producing so many answers in so little time.
> b) that vertex data is frequently changing
So for any minigl to nova (or warp3d to nova) wrapper it will make sense
I am not accustomed to a coding subject producing so many answers in so little time
LOL
Quote:
So for any minigl to nova (or warp3d to nova) wrapper it will make sense
In general yes, but... The real effective per-frame performance gain highly depends on the respective application / game. The highest gain is to be expected when every existing upload-avoidance strategy inside ogles2 fails. This is the case for quickly changing, non-repetitive data, e.g. a procedural effect. Or if there are so many different objects per frame that ogles2's caches are constantly overwritten again. Or if there are very large objects which don't fit the internal cache buffers. As long as there are only a handful of not too big uploads per frame, the performance gain will still exist but probably be negligible. We'll see.
@all I hate to bring yet another bug report, but it is probably still worth discussing to understand where the issues come from.
So, to make it short, there is something wrong with the BGRA texture hardware extension. There are 2 issues: a cosmetic one, and a hard crash when those BGRA textures are used with/in an FBO.
The first one is purely cosmetic and doesn't cause any issues except visual differences: colors are swapped. Where they should be blue, they are red. Where they should be red, they are blue.
For example, here is one of the Irrlicht examples (13.rendertotexture) and how it looks on our side now (@all: don't be alarmed by the number "6" for the fps in the title; the whole string just doesn't fit in my theme, it's 600 fps):
And this is how it looks once I tell gl4es via its environment variables to ignore the BGRA texture hardware extension:
On win32 and on Pandora the colors are blue too, i.e. that is the correct look, and the red one is wrong.
The same issue happens in SuperTuxKart on the car choice screen: where cars should be red they are blue, and where they should be blue they are red. And likewise, once I disable the use of BGRA for textures, it starts to show the correct colors.
At first I of course thought it was gl4es and something about endian formats that needs to be done inside, but then ptitSeb said that no, there isn't much that the Irrlicht engine and gl4es do here. Textures with the GL_BGRA format are supported on AmigaOS, as there are other games that use them, so it's an issue with this format. It may very well be that on our side reading is fine, but writing to the texture is not.
The second issue is that I get a very hard ISI crash when I choose a car on the car selection screen, and before it all crashes I get this warning:
[error] Irrlicht: FBO has one or serveral incomplete image attachements
[error] Irrlicht: FBO error
[error] main: Exception caught : std::bad_alloc
[error] main: Aborting SuperTuxKart
One time I even got:
[error] Irrlicht: FBO has one or serveral incomplete image attachements
[error] Irrlicht: FBO error
[error] Irrlicht: Fatal Error: Tried to set a texture not owned by this driver
At first we thought it could be related to the swapped-colors issue, but when I set gl4es to ignore the BGRA texture extension, the colors become fine but the crash is still there. So the issues are probably not related, but they are in the same area of textures and the BGRA format. Or maybe they are related; hard to say now.
Now the crash looks like this:
Crash log for task "supertuxkart"
Generated by GrimReaper 53.19
Crash occured in module at address 0x590F9168
Type of crash: ISI (Instruction Storage Interrupt) exception
Alert number: 0x80000003
Having had a look at the logs, ptitSeb says that all looks fine: the log trace from the "crash" version looks clean. There are 3 FBOs created, exactly in the same way, but the 3rd one didn't work, for some reason I can't explain. And there isn't any more useful log you can get from gl4es. In the current logs, the issue is when glCheckFramebufferStatus(...) is not 0x8CD5... (this one means FBO Complete).
So, he guesses that on our side we need to check whether we see an issue in binding a GL_BGRA texture to an FBO.
And that we (me, probably :) ) need to remember what binding a texture to an FBO means: all the rendering that is done doesn't go to the screen but to the bound FBO, so the drawing is done directly into the texture => so maybe GL_BGRA is fine when you read it, but is handled wrongly when "writing" into an FBO.
And the last thing I did was to create a glSnoop trace of the whole game from the start until it crashes. Maybe that will also give us a clue about what is wrong with the FBO and that texture binding:
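To make such logs easier to read, the raw status value can be translated into its symbolic name. A minimal sketch: the numeric values are the framebuffer status codes defined by OpenGL ES 2.0, repeated locally so the snippet compiles without GL headers. The helper names `fbo_status_name` and `fbo_is_complete` are made up for illustration and exist in neither gl4es nor ogles2.

```c
/* Framebuffer status values as defined by OpenGL ES 2.0. */
#define FBO_COMPLETE                      0x8CD5u
#define FBO_INCOMPLETE_ATTACHMENT         0x8CD6u
#define FBO_INCOMPLETE_MISSING_ATTACHMENT 0x8CD7u
#define FBO_INCOMPLETE_DIMENSIONS         0x8CD9u
#define FBO_UNSUPPORTED                   0x8CDDu

/* Hypothetical helper: symbolic name for a glCheckFramebufferStatus result. */
const char *fbo_status_name(unsigned int status)
{
    switch (status) {
    case FBO_COMPLETE:
        return "GL_FRAMEBUFFER_COMPLETE";
    case FBO_INCOMPLETE_ATTACHMENT:
        return "GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT";
    case FBO_INCOMPLETE_MISSING_ATTACHMENT:
        return "GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT";
    case FBO_INCOMPLETE_DIMENSIONS:
        return "GL_FRAMEBUFFER_INCOMPLETE_DIMENSIONS";
    case FBO_UNSUPPORTED:
        return "GL_FRAMEBUFFER_UNSUPPORTED";
    default:
        return "unknown status";
    }
}

/* Hypothetical helper: non-zero only for a usable FBO. */
int fbo_is_complete(unsigned int status)
{
    return status == FBO_COMPLETE;
}

/* Intended use after attaching the texture to the bound FBO:
 *
 *   GLenum status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
 *   if (!fbo_is_complete(status))
 *       printf("FBO not usable: %s\n", fbo_status_name(status));
 */
```

Logging the name instead of just "not 0x8CD5" would at least show whether the third FBO fails because of its attachment or because the combination of formats is unsupported.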
@kas1e Regarding the blue/red color swap: it's a Nova bug, although, well, to be fair: it's more a question of definition who's responsible for what and what's to be expected from what.
Anyway, Nova ignores any channel swizzle settings a texture may have when it's bound to an FBO. Unfortunately this channel-swizzling is how I implemented BGRA texture support: it's actually a texture of format W3DNPF_RGBA (which actually only means that it should have 4 color channels) and then I modify its default channel mapping.
I submitted a small bug report against Nova. However, because Hans signaled that he won't fix anything anytime soon and because it's probably not really his fault this time, I just added a temporary workaround to ogles2:
whenever a texture is now used as FBO render target, I reset its swizzling so that it at least can be rendered as expected. This may have other side effects but for most use cases this should do.
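The effect can be illustrated with a toy model. This is only a sketch and not Nova or ogles2 code: `Texture`, `sample`, `render_into` and `reset_swizzle` are hypothetical names. Reads go through a swizzle table, while writes to the texture as a render target ignore it, as described above.

```c
#include <stdint.h>

enum { CH_R, CH_G, CH_B, CH_A };

/* Toy texture: a single stored texel plus a channel mapping. */
typedef struct {
    uint8_t texel[4];    /* stored channels */
    uint8_t swizzle[4];  /* swizzle[c] = stored channel that feeds output c */
} Texture;

/* Sampling honours the swizzle table. */
void sample(const Texture *t, uint8_t out[4])
{
    for (int c = 0; c < 4; ++c)
        out[c] = t->texel[t->swizzle[c]];
}

/* Rendering into the texture ignores the swizzle table
   (the behaviour described in the post above). */
void render_into(Texture *t, const uint8_t rgba[4])
{
    for (int c = 0; c < 4; ++c)
        t->texel[c] = rgba[c];
}

/* The workaround: fall back to the identity mapping. */
void reset_swizzle(Texture *t)
{
    for (int c = 0; c < 4; ++c)
        t->swizzle[c] = (uint8_t)c;
}
```

In this model, rendering pure red into a texture whose swizzle swaps R and B and then sampling it yields blue. After `reset_swizzle` the same texel reads back as red again, which matches the swapped colors seen in the screenshots and the effect of the workaround.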
Edited by Daytona675x on 2019/9/19 10:09:59 Edited by Daytona675x on 2019/9/19 10:40:07
You can't blame Hans for not fixing Nova immediately: after all, we don't know about his life, and perhaps he has more urgent/important personal things to do than fixing a hobby computer program.
Usually the answer is "will be done in a future release" from a subcontractor, or from an employee "I will take care of it as soon as I have time", and you put the paper on top of a big stack of folders, meaning "but I have so much to do..."
@Daniel Yeah, I tested the new ogles2.library, the colors are fine now, thanks!
I also have some progress on the crash front. I already wrote it all on Facebook, but maybe others are interested in reading it :)
So, the crash is gone once I set gl4es's LIBGL_RECYCLEFBO environment variable, which avoids multiple FBO creations / deletions. However, it produces a wrong look and a mess instead of the actual data:
But that can easily be a gl4es issue as well; I need to check with ptitSeb first.
- create VBO with N arrays
- define VBOsize as VBO byte size
- if you want to switch to fast endian-conv-free data transfer then do
  for(uint32 i=0;i<VBOsize;++i) VBOSetArray(vbo_handle,i,W3DNEF_UINT8,FALSE,1,1,i,1);
Edit: by VBOsize I mean the size per line of the array. Example with 3 arrays: xyz uvw rgba
@thellier No, that's not the way to go. The second parameter to VBOSetArray must be the index of an array of the respective VBO, namely a value between 0 and N-1, where N is the value you used when creating your VBO. Also, your 40 is not the VBOsize, it's your vertex-size; your VBO certainly contains more than 1 vertex. In your example N is 3, but you wrongly try to set array layouts for the non-existing arrays 3 to 39.
Keep in mind that for this trick the real layout of your arrays is not of interest. What's important is that all arrays of the VBO use W3DNEF_UINT8 (otherwise endian conversion would kick in) and the same size (otherwise the driver acts somewhat dumb and always selects complex-slow-copy). That's why you may have to add some extra bytes to your VBO (and I suppose a size divisible by 8 won't hurt either). The idea simply is to split the VBO memory temporarily into N sequential raw-ubyte areas of the same size.
Let's assume your VBO should contain 4 of your vertices. Then it looks like this:
which is
for(uint32 i=0;i<N;++i) VBOSetArray(vbo_handle,i,W3DNEF_UINT8,FALSE,1,1,i*(vbo_size/N),vbo_size/N);
or here
for(uint32 i=0;i<3;++i) VBOSetArray(vbo_handle,i,W3DNEF_UINT8,FALSE,1,1,i*54,54);
@All While porting SuperTuxKart, I found another simple shader which fails on our side. Daniel's glslangvalidator_redux compiles it fine (so ogles2 is probably also fine there, because the code from Daniel's latest glslangvalidator_redux is inside ogles2, if I remember correctly), but then the compiled .spv version of that shader fails on Nova with a reference to problems with shared_ptr:
8/0.Work:Warp3DNova/my_tests> W3DNShaderInfo motion_blur.vert.spv
W3DNShaderInfo - Get shader information
Shader: motion_blur.vert.spv
Compiling motion_blur.vert.spv failed (12) with error: unknown error
Log:
ERROR: An exception occurred during compilation:
ERROR: Assertion failed: px != 0, in T* boost::shared_ptr<T>::operator->() const [with T = GPUProg::CGRegisterAlloc], defined in /SDK/local/common/include/boost/shared_ptr.hpp in line 253
ERROR: An exception occurred during compilation:
ERROR: Assertion failed: px != 0, in T* boost::shared_ptr<T>::operator->() const [with T = GPUProg::CGRegisterAlloc], defined in /SDK/local/common/include/boost/shared_ptr.hpp in line 253
ERROR: Code generation failed for an unknown reason.
Done.
And there is verbose output from nova:
8/0.Work:Warp3DNova/my_tests> W3DNShaderInfo -v motion_blur.vert.spv
W3DNShaderInfo - Get shader information
I did some googling and found that this variable is used pretty often in all tutorials and even in the book "OpenGL Shading Language".
Then I found that it is all described in section 5.5 of the GLSL spec, and in the end I found an explanation on Google that this "st" part is a swizzle mask which lets you recombine your vector. The texture coordinates are four-component vectors, but the st mask selects the first two (you can use "xy", it would be the same).
So I also tried with "xy", but it fails as well.
Any ideas whether it is expected to work (so it's a bug), or is it some non-implemented feature?
@kas1e The crash is obviously due to a typical programming bug in Nova, a wrong usage of boost::shared_ptr. Basically it's a classic invalid function-call-attempt on a nullptr. Which somewhat fits into the discussion we had about potential boost misapplication some weeks ago, which has to be fixed asap.
I explained it badly. I should have called the variable VertexSize, not VBOsize, but we agree. [For Wazp3D57 I have wrapped all those VBO functions in simpler functions, so I no longer use them...]
I was thinking that "recasting" on the fly a VBO created with 3 "fields" into a VBO with 40 "fields" might be possible (after all, Nova works so strangely that it may not check the "fields" count) as long as the global VBOsize stays the same (160). That might have made it unnecessary to change the VertexSize to a multiple of 3, but if it doesn't work, it doesn't work...