@Hans About that lighting bug, maybe it can be approached from another side: there are other visual differences related to lighting, and maybe they all have the same root cause.
So, if I fully remove the lighting code from both shaders, this is how it looks for
amigaos4:
win32/opengl, pandora/gl4es:
So, they are the same at this point, without the lighting code. That means the other parts of the shader are not related to the issue I will show now.
Now, here is how the shaders look "as is", with nothing changed and the full lighting code enabled:
amigaos4:
win32/opengl, pandora/gl4es:
See, on amigaos4 it not only becomes darker overall (and also has that "black" effect I showed on the video), but if you check how the tanks look, you can see they also have issues like "overlighting". Some lines of the tanks are not even visible.
To see that issue better, I set g_ambdiffspec to vec4(1.0) again (i.e. how we got rid of the black areas before), so it's no longer dark, and you can see how the tanks look now: they look as if the lighting code were fully disabled:
Probably it all has the same roots, but maybe from this point of view it will bring more ideas.
As you can see, with the lighting code fully enabled, we have a bunch of issues:
1) Everything goes black when you hit an enemy, as we discussed before.
2) Everything is too dark overall.
3) Tanks (and of course other enemies, and probably some textures too) have overlighting or something like it, so some parts (the black ones?) are just not visible; the boscage is probably also a little overlighted.
There are two possibilities:
1. A shader compiler bug
2. Corrupt lighting input data
Could you check what happens to the value of g_ActiveLights? This is on a hunch that something blowing up adds an extra light, and that light's data is corrupt. For example, if the explosion's light value happened to be huge and negative, then that would cause the effect we see.
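For illustration, here's a minimal GLSL sketch (not the game's actual shader; the per-light color array is hypothetical) of why a single corrupt light can black everything out:

    precision mediump float;
    uniform int g_ActiveLights;
    uniform vec4 u_lightColor[8]; // hypothetical per-light colors

    void main()
    {
        vec4 total = vec4(0.0);
        for (int i = 0; i < 8; i++) { // constant bound for GLES2
            if (i < g_ActiveLights) {
                total += u_lightColor[i];
            }
        }
        // One huge negative u_lightColor[i] drags 'total' far below
        // zero, and the clamp then snaps the pixel to solid black.
        gl_FragColor = clamp(total, 0.0, 1.0);
    }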
Hans
P.S., Yes, I'm pretty sure that the waves disappearing is caused by the cos() instruction not handling large values. Simple test cases are welcome, though.
Quote:
There are two possibilities: 1. A shader compiler bug 2. Corrupt lighting input data
If it were corrupt lighting input data, it should then (probably) also fail on gl4es/pandora and on win32/opengl.
Quote:
Could you check what happens to the value of g_ActiveLights?
Yeah.
With the original shaders (i.e. when they show us those 3 issues on amigaos4 only), when I just start a game or level (so when everything has the too-dark bug), gl_ActiveLights is "1". The same if I make the shader use g_ambdiffspec = vec4(1.0): then everything is no longer too dark, but the gl_ActiveLights value is still 1 in both cases, so at least that bug is 100% not because of a corrupted gl_ActiveLights.
Then, when I see enemies, the values start to vary, from 1 to 5. When I hit an enemy and get the all-black screen, the values don't change much: the same 1,2,3,4,5. Not bigger, and not negative. And that's the same both with the original shader and when we set g_ambdiffspec to vec4(1.0).
Quote:
This is on a hunch that something blowing up adds an extra light, and that light's data is corrupt. For example, if the explosion's light value happened to be huge and negative, then that would cause the effect we see.
If it were that, it could only explain that one bug, but it wouldn't explain why everything is too dark (always, right from starting the game), and why the enemies' textures are overlighted or over-something, as I showed on the screenshots (that is always there too, not only when I hit them).
Anyway, as the gl_ActiveLights values are small ones, and not negative, in the end it's probably a shader compiler bug. The question is how to find out where and why, so you can fix it...
Any more ideas? :)
Quote:
Yes, I'm pretty sure that the waves disappearing is caused by the cos() instruction not handling large values. Simple test cases are welcome, though.
As I suck at shader programming, I can't create a simple test case myself. But I put the relevant fragment shader part into the ticket, so with your knowledge you will surely be able to make a simple test case quickly, if it's needed. But probably you can check internally what happens with cos() anyway, faster than creating a test case.
Quote:
If it were corrupt lighting input data, it should then (probably) also fail on gl4es/pandora and on win32/opengl.
Not if the reason for the corrupt data is Amiga-specific (e.g., a GLES2 lib bug). The shaders in question use an old variant of GLSL with predefined light uniforms (the gl_LightSource struct array), which modern GLSL doesn't use any more. Fricking Shark is probably one of the few programs to use those on AmigaOS so far, and it's entirely possible that there's an issue with them.
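For reference, the legacy built-in effectively declares something like this (abridged); gl4es has to recreate it as an ordinary uniform for GLES2:

    // What old desktop GLSL provides implicitly (abridged); you don't
    // declare this yourself, and modern GLSL dropped it entirely.
    struct gl_LightSourceParameters {
        vec4  ambient;
        vec4  diffuse;
        vec4  specular;
        vec4  position;
        float constantAttenuation;
        float linearAttenuation;
        float quadraticAttenuation;
        // spot parameters omitted for brevity
    };
    uniform gl_LightSourceParameters gl_LightSource[gl_MaxLights];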
Quote:
Then, when I see enemies, the values start to vary, from 1 to 5. When I hit an enemy and get the all-black screen, the values don't change much: the same 1,2,3,4,5. Not bigger, and not negative. And that's the same both with the original shader and when we set g_ambdiffspec to vec4(1.0).
What do you mean by "didn't change much?" Does it change at that instant or not? Being in the range of 1-5 doesn't clarify anything.
Quote:
Any more ideas? :)
Yes, find out what changes occur with the lights when something is hit. Which light(s) change? Which values change? Knowing what the game is doing with the lights should help track down where it's failing.
Quote:
What do you mean by "didn't change much?" Does it change at that instant or not? Being in the range of 1-5 doesn't clarify anything.
Retested again: when you just fly around the level, gl_ActiveLights is 1. Once you hit a tank, gl_ActiveLights is 2. Once you hit a paraplane, gl_ActiveLights is 2, sometimes 3.
Do you think it's better to concentrate on that lighting bug when enemies are hit, rather than on the other 2: "why is the whole game too dark with the lighting shader" and "why do textures look overlighted"? At least the hit bug happens only on a hit, while the other two are always there (maybe a little easier to debug?).
Quote:
Retested again: when you just fly around the level, gl_ActiveLights is 1. Once you hit a tank, gl_ActiveLights is 2. Once you hit a paraplane, gl_ActiveLights is 2, sometimes 3.
Interesting. What happens if you lock gl_ActiveLights to 1? Is there a way for you to print out the values that the lights are set to?
Quote:
Do you think it's better to concentrate on that lighting bug when enemies are hit, rather than on the other 2: "why is the whole game too dark with the lighting shader" and "why do textures look overlighted"? At least the hit bug happens only on a hit, while the other two are always there (maybe a little easier to debug?).
Yes, that one's likely to be easier to track down, because it happens under specific circumstances. Hopefully the other two are related, but we can deal with that later.
Quote:
Interesting. What happens if you lock gl_ActiveLights to 1?
If I catch all the cases where gl_ActiveLights is more than 1 and set it to 1 before calling adduniform(), then the "all black when hit" bug disappears!
Quote:
Is there a way for you to print out the values that the lights are set to?
As far as I can see, there is only one uniform in the shader related to lighting (that gl_ActiveLights), and that one I can printf easily from the game's source code; but the other ones you mean are probably shader-side values, and those are probably very hard to print. I remember reading on Google that it's all hard, and people only somehow "print" them by drawing values on screen, etc.
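(That on-screen trick is basically this; a sketch, not the real shader: route the value you want to inspect into the output color and read it off the screen.)

    // Debug sketch: replace the real shading with a gray level that
    // encodes the suspect value. The picked parameter is hypothetical.
    void main()
    {
        float suspect = gl_LightSource[1].diffuse.r;
        gl_FragColor = vec4(vec3(clamp(suspect, 0.0, 1.0)), 1.0);
    }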
EDIT: oh, and I notice now that there's probably another lighting issue. Originally, when the shaders work as expected, when you hit an enemy everything around gets some lighting effect, which we don't have.
On win32/opengl and on pandora/gl4es it doesn't matter whether you set gl_ActiveLights to 1 all the time or not: that effect is still there, and visually not much changes, whether it's 1 all the time or varies.
Quote:
If I catch all the cases where gl_ActiveLights is more than 1 and set it to 1 before calling adduniform(), then the "all black when hit" bug disappears!
Okay, so there's definitely a problem with the additional lights. It could still be either corrupt uniform data passed to the GPU, or a shader compiler issue.
Quote:
As far as I can see, there is only one uniform in the shader related to lighting (that gl_ActiveLights), and that one I can printf easily from the game's source code; but the other ones you mean are probably shader-side values, and those are probably very hard to print. I remember reading on Google that it's all hard, and people only somehow "print" them by drawing values on screen, etc.
The lighting parameters that end up in the gl_LightSource array are set via glLightfv(). This is a legacy OpenGL function which GL4ES must somehow process to work with GLES2. For that matter, the raw shaders must be modified before being passed to the GLES2 library to add the legacy gl_LightSource uniform.
@Hans OK, I'll go that way then: I will remove from the original shaders everything which is not light: fog, shadows, water ripple, etc. Then I will strip the light functions down to something small which still gives black-when-hit on os4, but not on win32/opengl or pandora/gl4es. Then we can continue, but from then on I will post the shaders after gl4es conversion: that will indeed be better than checking the original shaders, which gl4es later converts, adding/removing/regrouping things.
@Hans So, I was able to strip the shaders down so that the "black-when-hit" bug is still there, and this is how the shaders look after being converted by gl4es and sent to ogles2:
As I see, the vertex one still has an array of uniforms, and it is exactly this uniform: _gl4es_LightSourceParameters _gl4es_LightSource[_gl4es_MaxLights];
So maybe the problem is still because of uniform arrays! It looks just the same as before, when all the textures were black: once we got rid of array usage there, the textures became visible. Now there is another uniform array which also gives us "black-when-hit", so maybe it's still the same issue?
I asked ptitSeb whether, for the sake of testing, it is possible to get rid of the uniform array usage there, and he said:
Quote:
Don't forget all those variables are normally "built-in" in OpenGL, and gl4es already needs to track them. Transforming the array into non-array form basically doubles the work (tracking both the array and non-array forms). Also, more importantly, on the shader side you can see the lights array is accessed by a variable index. It's inside a function called from inside a loop over the number of active lights (look at the PointLight function in the vertex shader). Unrolling that will result in a mess of a shader: that really is difficult, and will not result in the same shader anyway.
Lighting most often needs arrays, because most of the time you have more than 1 light.
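The pattern he means looks roughly like this in the converted vertex shader (a sketch only; real gl4es output differs in details, and the struct is abridged):

    struct _gl4es_LightSourceParameters {
        vec4 diffuse; // other members omitted
    };
    uniform _gl4es_LightSourceParameters _gl4es_LightSource[8]; // 8 = assumed _gl4es_MaxLights
    uniform int g_ActiveLights; // the game's own light-count uniform

    void PointLight(in int i, inout vec4 acc)
    {
        // 'i' is a run-time value, so this array access can't be
        // reduced to a constant index - that's what defeats unrolling.
        acc += _gl4es_LightSource[i].diffuse;
    }

    void main()
    {
        vec4 acc = vec4(0.0);
        for (int i = 0; i < 8; i++) {
            if (i < g_ActiveLights) {
                PointLight(i, acc);
            }
        }
        gl_Position = acc; // placeholder; the real shader transforms the vertex
    }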
Which means that even if the root of the problem is probably the same (seems very possible), i.e. those non-working uniform arrays, we can't quick-check it; we can only test it once the uniform array issue is fixed.
So fingers crossed that Daniel can find the roots; then we can first test whether everything works with arrays, and then we can see how the lighting starts to work (or not), and how the other lighting issues look.
Oh, btw, I need to mention that it's not just an array of uniforms, but an array of uniform structures (!) there, so there can probably be some other bug with the same "uniform arrays" roots.
Nice job, we're getting closer to discovering the root cause. It definitely looks like a uniform array of structures problem.
I just checked, and I have a test for uniform arrays of structures. It's a relatively simple one (a single uniform variable with a struct containing 3 floats), though, so there could still be a bug in there.
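(If it helps to compare, the shape of such a test is roughly this; the names and the array size here are assumptions, not the real test's identifiers:)

    precision mediump float;

    struct TestStruct {
        float a;
        float b;
        float c;
    };
    uniform TestStruct u_test[4]; // array size assumed

    void main()
    {
        // Constant indices only - variable indexing, as gl4es emits,
        // may be exactly the case this simple test doesn't cover.
        gl_FragColor = vec4(u_test[0].a, u_test[1].b, u_test[2].c, 1.0);
    }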
I agree with ptitSeb that reworking the code to remove the arrays would create a mess. It's not worth the effort.
For now, I can't think of anything else to try to narrow it down further.
Good news about the water-ripple bug; while it's probably the easiest bug of them all, still...
So, I was able to create a simple test case with a _VERY_ simplified version of the ripple-water effect. I grabbed Hans's tutorial 3 (drawing a texture), reduced the gl4es shaders a lot + adapted them a little, as well as adapting main.cpp so it sets the CurrentTime uniform in a loop, and voilà, the bug is 100% reproducible on amigaos4 once CurrentTime hits _exactly_ 400. Dunno why 400, but it is.
I first created it for win32/opengles (ANGLE) (yeah, it's really handy now to test bugs when you can run 1:1 the same code on both win32 and aos4), and of course there is no such bug there. The CurrentTime value increased as much as I wished without problems (I tested up to 2000, all fine). While on amigaos4, once 400 is hit: boom, nothing happens, the effect stops.
win32 (x64 with all .dlls inside) and amigaos4 versions, source code and binaries included. Just unpack, run Effect_bug.exe, and hold any button while the window with the texture is active: in the shell you will see printouts of the current value (a = xxx; this is that fCurrentTime), and once it hits 400 the effect disappears. On win32 all is fine; I tested even up to 2000.
@Hans And it's not only cos(): sin() behaves exactly the same way. I.e. all fine on win32/opengl (ANGLE) and on pandora/gl4es, but the same issue on aos4. And it also stops at the same CurrentTime value as cos().
I also shortened the fragment shader even more. Now the effect looks wrong, of course, but that doesn't matter; what matters is that I can still see cos() or sin() stop working. I removed all the "* 4", all the "+" and so on, so now the function which uses cos()/sin() looks just like this:
void ApplyWaterEffect(in sampler2D sampler, in vec2 vCoords, out vec3 color)
{
    // Body reconstructed from the description above: with every "* 4"
    // and "+" stripped, only the bare cos() of the time uniform is left.
    color = texture2D(sampler, vCoords + vec2(cos(fCurrentTime), 0.0)).rgb;
}
Now, even with such a small function, everything stops once we reach 1600 loops. For both cos() and sin(). Before, it was 400, because in the shader code we multiplied it and so on.
Is there anything you can think of as to why exactly 1600 is the threshold that breaks sin() and cos()?
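(In case it's useful as a workaround while the cause is unknown: a common trick for GPUs whose sin()/cos() misbehave on large arguments is to range-reduce the argument first. A sketch, assuming mod() itself handles the large values correctly:)

    // Wrap the argument into [0, 2*pi) before calling the built-in,
    // so the hardware never sees a large input.
    const float TWO_PI = 6.28318530718;
    float safeCos(float x) { return cos(mod(x, TWO_PI)); }
    float safeSin(float x) { return sin(mod(x, TWO_PI)); }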
I don't know if it's worth reporting more warp3dnova things, as nothing from the previous reports has been fixed for too long already, even the simple ones, leaving aside the ones which need more time :( But hope dies last, so here is another thing to discuss; maybe not only Hans, but also Daniel or Capehill will bring some ideas:
So, I am porting the Irrlicht engine over gl4es, and while generally speaking it all works fine, it's pretty slow in the test cases: just on the level of some 10-year-old notebook with a 1.6 GHz AMD and some built-in Intel gfx card. I'm not sure what causes all of it (but knowing that there is no GART support, that some parts of warp3dnova have optimisation disabled, and that we have no DMA for the x5k in graphics.library, we can leave that aside until it's all fixed/done).
But the issue I want to discuss here is more than just "it's slow": 2 of the examples which come with Irrlicht behave pretty strangely (and only on amigaos4, of course). As the main example we can take "12.TerrainRendering". This is how it looks (visuals and source):
Now, I tested it on that old 10-year-old 1.6 GHz notebook, and it gives 600 fps, with no problems when I move the camera, etc.
On amigaos4 it gives at best only 90 fps, and when I start to move the camera it progressively eats memory. At some point it eats up the whole 256 MB of video memory and the OS even freezes or crashes in the warp3d_si library.
You can see in the video, at the top where the small dockies are placed, how the gfx-ram dockie shows the video RAM filling up to the maximum.
With pure software rendering it gives 43 fps on the 10-year-old 1.6 GHz AMD and 34 fps on amigaos4. So roughly the same level, we can say. But with OpenGL it's just about 40-90 fps at maximum, while that 10-year-old AMD notebook gives ~600 fps (!).
Now, I asked ptitSeb to check it on the Pandora; he says all is fine. I quote him:
--- Nothing special on the Pandora. Can't really tell the speed, but it's fine. I made a GLES2 capture: the drawing loop is pretty, with only 10 calls per frame! There is just a background "cube map" (well, just 6 planes), then the large terrain in 1 drawing command, then the Irrlicht logo, and a last untextured drawing at the end (not sure what it is). The only thing I can see that could have some effect is that, before that last drawing command, it binds texture 0 on GL_TEXTURE0 (that is, the default unattached texture). That's a classic way to avoid texture drawing in OpenGL 1.x: you bind texture #0 and disable texture mapping. So here there is a drawing with a shader that doesn't use a texture, with an invalid texture bound. Not sure if this can be the source of the issue you have. But I seriously haven't seen anything wrong in the draw loop. ---
Everything looks fine, but we still have 2 issues on amigaos4:
1. All the video memory gets eaten when I just move the camera in such a simple test case.
2. It's just sloooow. But maybe that's because of issue #1.
Now, any ideas, anyone? :) I thought at first that maybe there is a "calloc" or "realloc" hiding somewhere which could cause this kind of issue, but all the other 25 examples (which are bigger and heavier than these) don't have such issues...
Well, thinking about it, the terrain in that frame is a large set of almost 29k vertices. That's a big object. What I'm not sure about is whether the arrays are rebuilt each frame (to include only the triangles that are visible) or stay the same every frame. My guess is that it's a new, very large array each frame. That means GLES2 will check whether that large array is the same as a previous one (not sure how fast that is for large arrays) and then create a VBO with BigEndian->LittleEndian conversion (which will, again, take some time).
Of course that doesn't explain the memory eating and THAT slow a speed.
@kas1e Those two examples trigger one of the many performance boosters in ogles2.library (without which things would crawl) into resizing the internal multibuffer's VBOs, all of them, to rather huge sizes. I have now added a simple safety mechanism to avoid such situations. This quick workaround comes at a rather huge performance penalty (only) for such situations, though. I will improve that for the next official ogles2 version.
@Hans However, it's important to note that there is no memory leak or bug in ogles2.library! All those allocations are legal; Nova always reports "success". What is causing the crash here is something else, and it's pretty interesting actually:
As I said, all VBO allocations are reported as successful by Nova. This is true even if physical gfx memory is already practically fully depleted. The stuff still runs, albeit with a significant slowdown. Something else is causing Nova to freeze the system, namely a call to DestroyVertexBufferObject. This is what happened in ogles2: when gfx RAM was already at critical levels, yet another VBO was in the process of being resized, which involves destroying it before recreating it: bang. Note that the call to DestroyVertexBufferObject is fully legal, and the respective VBO is not in use anywhere or anything like that.
So, to sum it up: Nova freezes if you call DestroyVertexBufferObject when gfx RAM is low, at least.
EDIT: filed this as new W3D Nova bug report 0000447.
@All OK, so for that Irrlicht example called TerrainRendering we are done: Daniel added a workaround for the Nova bug where Nova freezes if you call DestroyVertexBufferObject when gfx RAM is low, and he also added that safety mechanism to avoid such situations.
But now more news: ptitSeb added real use of VBOs for glBindBuffer() (at the moment only for it), so all the apps/games which use it a lot benefit from it. For example, in FrickingShark I got +10 fps (not a lot, but still), but that TerrainRendering example now runs at 550 fps instead of 50 (!). Yeah, 10 (ten) times faster.
But the TerrainRendering example was also slow on Linux with gl4es, so this time it was gl4es slowing things down. Now all the Irrlicht examples run on Linux with gl4es at almost the same speed as on Linux with Mesa, so gl4es as a layer doesn't add speed issues anymore.
Now, probably the last thing I want to understand with Irrlicht (and maybe it's possible to speed it up somehow) is why some Irrlicht examples are 2-3 times slower than on a 10-year-old 1.6 GHz AMD with some shitty built-in gfx card.
To explain it more: Irrlicht has a few different rendering modes, OpenGL and software rendering.
Now, let's see the software rendering figures on 3 different machines:
As we can see, the x5000 is on the level of the 10-year-old 1.6 GHz AMD notebook.
This time we don't take into account opengl, or warp3d, or anything of that sort. It is pure software rendering, and the only things that have an impact there are the CPU, graphics.library and the RadeonHD driver.
Though if you think about it more, that's probably OK. Everyone knows the x5000 is 10 years behind the current computer world, so, kind of OK.
Now, with those results in mind, and knowing that we are on the level of the 1.6 GHz AMD, we expect the x5000 with OpenGL rendering to be at least on the same level (of course, as we have a better gfx card, we would wish it to be faster, but it's OK to be just on the same level).
And here is the table with OpenGL. The same 3 machines are used:
But I also added results from the Core i5 (the first config) under Linux with Mesa, and under Linux with GL4ES (to show that GL4ES is OK and on the same level as Mesa, so we can't blame it for the issues I will point out now). I also added MiniGL results (I was able to make Irrlicht work with MiniGL somehow; it often crashes, has rendering bugs, and some examples don't work either, but it's still something to compare with).
What can we say here? First, MiniGL sucks. Only 2 examples are at least on the same speed level (09.MeshViewer and 20.ManagedLights). For the others it's just far too slow.
The next thing we can notice is that the X5000 with GL4ES is, in some examples, quite a bit faster than the old AMD notebook (at least that's what we expect when we have a modern RadeonHD), like 03.CustomSceneNode and 04.Movement, and most of them are, if not on the same level, a little faster here and there.
The issue I see now is that some examples show pretty degraded results, which I want to discuss and find out why: maybe it will again be possible to speed things up somehow/somewhere, or at least we can find out WHY.
The examples I am talking about are:
02.Quake3Map (50% slower than even the 10-year-old AMD with the shitty Intel gfx card).
16.Quake3MapShader (again 50% slower than the 10-year-old notebook, but I assume it's the same issue as with the first example). But with that example, DISABLING compositing makes it run at 90 fps, while with compositing ENABLED it is around 60-65 (maybe that points at something).
18.SplitScreen (slower by 300%! This example again loads a quake3 map and splits the screen into 4 parts, with rendering happening independently in each).
So those 3 examples probably share the same single issue (I hope), as all of them use a quake3 map.
And, the last one:
20.ManagedLights. That one is the same as MiniGL in speed; nothing changes, so I assume this time it's that issue with GART, or the non-DMA graphics.library for the x5k.
Now, to have something to discuss, I have uploaded all those 4 examples ready to run, so everyone can try them:
I would be interested if any X1000 user (so, with DMA in graphics.library) could run them and report the maximum fps they get in each (FPS is written in the window title), so we can rule DMA in or out. Quitting the examples can be done via the close gadget or via "alt+f4". Run them from the directory where they are (bin/amigaos4), as they want the root "media" directory.
Next, I made a trace/profile of all those 4 examples via today's glSnoop, which catches almost everything now (Capehill, thank you very much for that!). Profilings for both warp3dnova and ogles2 are at the end of the files.
To be honest, the capture is very clean. Only 27 drawing calls in the entire frame! No suspicious state changes. Only the bare minimum per drawing call, and that's all. And most of the drawing uses a quite simple shader that does multitexture rendering: Vertex shader:
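(His shader listing isn't reproduced here; purely as a generic illustration of that kind of shader, not the actual capture:)

    // Vertex shader (generic multitexture sketch):
    attribute vec4 a_position;
    attribute vec2 a_texcoord;
    varying vec2 v_texcoord;
    void main()
    {
        v_texcoord = a_texcoord;
        gl_Position = a_position;
    }

    // Fragment shader (modulates a base texture with a second one):
    precision mediump float;
    uniform sampler2D u_tex0;
    uniform sampler2D u_tex1;
    varying vec2 v_texcoord;
    void main()
    {
        gl_FragColor = texture2D(u_tex0, v_texcoord)
                     * texture2D(u_tex1, v_texcoord);
    }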
His first guess was that the time difference comes from converting from BigEndian to LittleEndian to send the ever-changing vertex array data to the GPU.
But yeah, sure that conversion takes place, but it can't take THAT much. I mean, the old 1.6 GHz AMD with its slow Intel gfx card gives us ~350 fps, and our setup ~150 fps. I can't believe such a conversion can take that much time. It's probably something else?
Then we checked the tracing/profiling logs, and we can see that it's the drawing itself that is the main bottleneck, not the VBO creation/handling, it seems (well, that takes some time too, but it's OK-ish).
And we've currently run out of ideas. It's a pretty simple thing, as can be seen from the log, yet there's a big difference compared to the 10-year-old AMD with its shitty gfx card. So something is wrong somewhere, and we can't see why or where.