What if you put the printf() or Delay() at the very beginning of your function, before making any of the SDL calls? Does it work there? What if you put it at the very end, just before the "return SDL_TRUE;"? (Which means it won't even be called if compiling the shaders fails.) Or does it have to be in between the two sets of SDL calls in order to make a difference?
Putting Delay or Printf right at the top of InitShaders() (before the SDL calls) or right at the bottom before return SDL_TRUE; didn't fix shader compilation, but once I put it anywhere after the first SDL call and up to this block:
/* Compile all the shaders */
for (i = 0; i < NUM_SHADERS; ++i) {
    if (!CompileShaderProgram(&shaders[i])) {
        SDL_LogError(SDL_LOG_CATEGORY_APPLICATION, "Unable to compile shader!\n");
        return SDL_FALSE;
    }
}
Then it works. I can even put the Delay/Printf inside CompileShaderProgram() to make the shaders compile. That means our shaders compile fine when we put a Delay/Printf anywhere after the first SDL call in InitShaders(), including inside CompileShaderProgram() itself.
Quote:
And what if you use IExec->DebugPrintF() rather than printf()?
IExec->DebugPrintF("aaaa\n"); does not fix the issue. But plain printf/Delay do fix it.
@tonyw
As far as I can see, putting in IExec->DebugPrintF() does NOT fix the shader-compiling issue. I then tried IExec->Reschedule(); instead, and that also didn't fix anything. So far only printf() from newlib and IDOS->Delay / IDOS->Printf / IDOS->PutStr do.
Quote:
don't know anything about these advanced graphics, so I don't even understand what you are "compiling". But presumably there is a Process that has to operate on some pseudo-code or tokenised code, and it fails unless something else gets control of the CPU sometimes. Correct?
Here I compile shaders that run directly on the GPU (shaders = small blocks of C-like code which run directly on the GPU). To make shader compilation possible back in the age of OpenGL 1.x/2.x, the *ARB extension functions were introduced, which allow even old OpenGL to compile/execute shaders (in later OpenGL versions all of that was merged in properly, and dedicated GL functions to compile/execute/attach shaders were added).
But many "old" OpenGL apps (old for the whole world, but not that old for us) use the ARB functions.
So here I took a simple SDL2 example which uses ARB functions to load and execute shaders, and use it as a test case.
And what we have is that everything builds, gl4es supports the ARB functions, etc., and it all seems to work, but for some reason the shaders still fail to compile when the code is left as it is, and compile when we add Printf("aa\n"); or IDOS->Delay(1) right after the first SDL calls.
So we are trying to understand why this happens at all, and then what should be fixed and where.
Now, as I said, adding printf/Delay after the first SDL call in InitShaders() but before the call to CompileShaderProgram() fixes the compilation. Next, I removed those printf/Delay calls from InitShaders() and started adding them to CompileShaderProgram(), and found that things get "fixed" when we add one before the call to CompileShader().
glGetObjectParameterivARB(shader, GL_OBJECT_INFO_LOG_LENGTH_ARB, &length);
info = SDL_stack_alloc(char, length+1);
glGetInfoLogARB(shader, length, NULL, info);
SDL_LogError(SDL_LOG_CATEGORY_APPLICATION, "Failed to compile shader:\n%s\n%s", source, info);
SDL_stack_free(info);
return SDL_FALSE;
} else {
return SDL_TRUE;
}
}
Adding any printf anywhere inside that function does _not_ fix the issue. But adding printf/Delay before that function is called, in any of the "top" functions (like InitShaders() or CompileShaderProgram()), does fix it.
So I added a few printfs of "status", with and without the printf/Delay in the top-level functions, and found that when we fail, our "status" value _IS_ 0. But when we add printf/Delay in the "top" functions calling CompileShader(), status is _NOT_ 0 and all works.
In other words, glGetObjectParameterivARB(shader, GL_OBJECT_COMPILE_STATUS_ARB, &status); returns a "0" status when we don't add printf/Delay, and a "not 0" status when we do. Though, as I said, the printf/Delay is added not right before this call but in the top-level functions which call the whole CompileShader() function (which means it could be the stack getting messed up, as calling or not calling a function involves the stack and the registers/params stored on it).
As said, IMO it's very likely a problem of some uninitialized variable in this shader compiling stuff. Maybe somewhat hidden (the compiler may not warn about it) if, for example, it is caused as a side effect of unimplemented/unhandled functionality/attributes/whatever.
The thing with calling the magic functions (printf(), Delay(1)) that make things work is that calling these functions modifies things on the stack, so they influence what uninitialized variables look like when they end up being used in other code (shader compilation) which is executed after that = ~"in the future".
Funny experiment: If
Delay(1);
works, but Delay(0) doesn't, it is probably possible to make the working thing not work again by calling both:
Delay(1); Delay(0);
Nope, it just starts to work as expected, and Delay(0); does not break it again :)
Btw, what I also found is that when we put in printf/Delay to make things work, the returned "status" is:
status = 1723453592 status = 1 status = 1 status = 1 status = 1 status = 1
So that means the returned status is already bad in all cases. It's just that when we add printf/Delay, we get that strange big value for status (which is an "int"), so we don't go inside the if (status == 0) check and take it as all fine.
But when we add nothing, status is 0, so we enter the "check" and report that we failed.
In other words, in all cases the first "status" check is bad. With printf/Delay added it is just a different, big value.
All of this points to us overwriting things on the stack (and yeah, like you point out, it seems some value is uninitialized and by some luck is 0 at the beginning). When we do printf/Delay, it changes the stack layout and overwrites things, and our uninitialized status then looks like that.
To me your problem is stack related: the code is trashing the stack somewhere, and depending on how the functions you are calling depend on the stack, it may or may not crash. On the other hand, as soon as you call printf you are realigning it and everything gets back on track. The worst part is that printf itself is often the culprit of such stack trashing: for example, you are passing too many or too few parameters compared to the format string, or you are passing the wrong type (e.g. a 64-bit parameter while only consuming 32 bits).
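To illustrate the printf point above, here is a minimal sketch of passing a 64-bit value with a matching format. The function name format_counters() is made up for this example and is not from the test case; the idea is only that PRIu64 from <inttypes.h> keeps the vararg size in step with what the format string consumes.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not from the SDL2 test case): formats a 64-bit
 * counter and a 32-bit status into buf. PRIu64 expands to the right
 * conversion for uint64_t on the target, so the 64-bit argument is
 * consumed as 64 bits and the int that follows is read from the right place. */
int format_counters(char *buf, size_t size, uint64_t big, int status)
{
    assert(buf != NULL);
    return snprintf(buf, size, "big=%" PRIu64 " status=%d", big, status);
}
```

Writing "%d" or "%lu" for the 64-bit argument instead would be the kind of mismatch described above: the call compiles, but the following arguments may be read from the wrong place.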
@trgswe Tried -Wshadow: no, no warnings appear. And I assume the test case itself is OK (as it is multiplatform and works everywhere else); the problem is somewhere in the AmigaOS components (sdl2/gl4es/ogles/w3dnova/etc.).
Did you ask Daniel yet about glCompileShader issue? According to your trace Nova didn't fail calls, but OGLES2 did 2 calls.
If not uninitialized variable or stack related, could it have something to do with filesystem then? When shader is compiled from GLSL to SPIR-V, could it be that something might fail in some rainy day case? Of course, pure speculation but maybe Snoopy can be checked as well.
@kas1e If not uninitialized variable or stack related
It's only the GetObjectParameterivARB() function call to check the compile status which fails, at least in the first call, for some reason (either the function is buggy or, since it is a function pointer, it is not pointing to the correct function), and the biggest failure is that it does not set the status variable at all, which should never happen. That's why the status variable stays undefined after the first call, and why he can ~"fix" it with a magic printf() or other function calls in the parent function, or by setting it to a defined value (like 1234) before the call to GetObjectParameterivARB().
But where are these #?ARB functions listed? In https://github.com/kas1e/SDL2_GL4ES/bl ... SDL_os4opengles2wrapper.c they don't show up in the table in there.
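The "set it to a defined value before the call" idea can be sketched like this. Everything here is a stand-in: fake_query_status() only mimics the reported behaviour (no write on the first call) and is not the real gl4es/ogles2 code, and STATUS_SENTINEL is an arbitrary value picked for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Arbitrary marker value that a real status query is assumed never to return. */
#define STATUS_SENTINEL (-12345)

/* Counts calls so the fake can misbehave on the first one only. */
int fake_query_calls = 0;

/* Hypothetical stand-in for a query like glGetObjectParameterivARB().
 * It skips the write on its first call to mimic what is seen in this thread. */
void fake_query_status(int *out)
{
    assert(out != NULL);
    fake_query_calls++;
    if (fake_query_calls == 1) {
        return;          /* first call: output is left untouched */
    }
    *out = 1;            /* later calls: report success */
}

/* Pre-loads status with the sentinel, runs the query, and returns
 * 1 if the query actually wrote something, 0 if status was left alone. */
int query_wrote_status(int *status)
{
    assert(status != NULL);
    *status = STATUS_SENTINEL;
    fake_query_status(status);
    return *status != STATUS_SENTINEL;
}
```

With the real call in place of the fake, a result of 0 from query_wrote_status() on the first shader would confirm that the function never touched the variable, independent of whatever happened to be on the stack.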
We don't use os4opengles2wrapper.c here, because it's for SDL2's direct OpenGL ES 2.0 rendering. And OpenGL ES does not need any support for the ARB extensions of the 2000 era, as all of that is already inside the main OpenGL ES 2.0 functions.
What we use here is gl4es, to have support for "old" OpenGL up to version 2.1, where shaders were driven not through direct functions but through special ARB extensions.
But as our MiniGL does not support ARB shaders, there are none there for us either. And MiniGL probably needs that table because it doesn't provide us with a function like "getprocaddress", while ogles2.library does (don't mix things up: opengles2 in SDL2 is direct usage of opengles2, gl4es is the layer to have old OpenGL working over opengles2.library, and SDL2's OpenGL ES 2.0 rendering plays no role here).
So for GL4ES we don't have that table, as we have "aglGetProcAddress", but we do still have a kind of "table" in GL4ES, just to show gl4es where to get the ogles2 functions (to which the implemented/usual OpenGL 2.x functions are re-routed):
So when we need the address of a function in SDL, we use SDL_GL_GetProcAddress() (which is basically aglGetProcAddress()) and get the address of the function or extension. Then, if it is supported/implemented in gl4es, some additional stuff is added around it, and it is passed on to ogles2 in a shape that works for ogles2.
Btw, if you check the SDL_os4gl4es.c file, you will see how our SDL_GL_GetProcAddress() is implemented:
if (func == NULL) {
dprintf("Failed to load '%s'\n", proc);
SDL_SetError("Failed to load function");
}
return func;
}
It is so small that I don't know what could fail there.
ptitSeb (author of gl4es) explained a little more about how things work inside gl4es:
Quote:
For SDL_GL_GetProcAddress(..) this function should use aglGetProcAddress(..) This function is split in 2 parts: the Amiga specific one is in agl/lookup.c (this is mostly the "agl" version of the linux "glX" function), and the generic one (that contains all the shaders functions and more) defined in gl/gl_lookup.c
Edited by kas1e on 2022/3/7 14:42:23 Edited by kas1e on 2022/3/7 14:45:05
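To picture the two-part lookup ptitSeb describes, here is a rough sketch with two small name tables searched in order. This is not the gl4es source: the table contents, the function names (sketch_get_proc_address() and friends) and the "examplePlatformCall" entry are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*proc_t)(void);

struct proc_entry {
    const char *name;
    proc_t func;
};

/* Hypothetical stand-ins for real entry points. */
void sketch_platform_call(void) {}
void sketch_compile_shader(void) {}

/* Part 1: platform-specific names (the "agl" side in the description). */
const struct proc_entry platform_table[] = {
    { "examplePlatformCall", sketch_platform_call },
};

/* Part 2: generic GL names, including the shader functions. */
const struct proc_entry generic_table[] = {
    { "glCompileShaderARB", sketch_compile_shader },
};

/* Linear search of one table; returns NULL when the name is not listed. */
proc_t find_in(const struct proc_entry *table, size_t count, const char *name)
{
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(table[i].name, name) == 0) {
            return table[i].func;
        }
    }
    return NULL;
}

/* Tries the platform table first, then falls back to the generic one. */
proc_t sketch_get_proc_address(const char *name)
{
    assert(name != NULL);
    proc_t f = find_in(platform_table,
                       sizeof platform_table / sizeof platform_table[0], name);
    if (f == NULL) {
        f = find_in(generic_table,
                    sizeof generic_table / sizeof generic_table[0], name);
    }
    return f;
}
```

With a lookup of this shape, a pointer that silently resolves to the wrong entry (or to NULL that goes unchecked) is one of the things worth ruling out on the caller's side.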
dunno if this should work but noticed that you don't initialize status which means it will be initialized to undefined value (that's your status='1234').
don't know how SDL/opengles/opengl/Warp3D/stdlib/newlib handles it, but it might be unable to write to the variable as nonexistent, and when it exits the function its dummy value gets written to status. gdb would be good here... a long shot. But status should always get something written... Windows/Linux/Mac probably use C++ (if it is relatively new code... let's say at least C++17), and that is of course something that can come in and change everything. Do you compile with gcc or g++? c++/g++ initializes variables a lot earlier than C/gcc. Why it would work after using Delay(1); is a mystery, probably something with the stack (status is put on the stack?!)
Quote:
dunno if this should work but noticed that you don't initialize status which means it will be initialized to an undefined value (that's your status='1234').
It should then be set to a sane value, but that doesn't happen for us on the first pass. By adding printf/IDOS->Delay we only change the default "status" value to something other than 0, but the issue remains that this call does not set status on the first pass.
Quote:
don't know how SDL/opengles/opengl/Warp3D/stdlib/newlib handles it, but it might be unable to write to the variable as nonexistent and when it exits the function its dummy value gets written to status,
It's uninitialized at the beginning, yeah, and holds 0 or something by default, but then, when we call glGetObjectParameterivARB, the status should change. That doesn't happen for the first call, but does happen for all the following calls. The root cause is still unknown. printf/Delay only make that default uninitialized value different, and fix things purely by luck (as the error check in the test case only tests for "if not 0", and if so, it thinks all is OK, while with printf/Delay added our uninitialized value is of the 29872523923 kind).
Calling the same function twice in a row still didn't make it work on the first pass.
Quote:
gdb would be good here...
Any idea what we can check from the debugger side? I ask because Alfkil is currently working heavily on the Spotless debugger, and I help him beta test it and request features, so what would we need in the debugger for such a test case to help us understand what happens? A memory surfer?
Quote:
but status should always get written something..
Of course, after we call glGetObjectParameterivARB, status should be set. It does not matter to what, bad or good or whatever, but it should be set. In our case, the first call sets nothing.
Quote:
Windows/Linux/mac probably uses a C++ (if it is relatively new code... let's say at least c++-17) and that if of course something that can come in and change everything, do you compile with gcc or g++, c++/g++ initializes variables a lot earlier than C/gcc
Our code, for now, is C, so we compile it of course as C everywhere.
Quote:
why it would work after using Delay(1); is a mystery, probably something with the stack (status is put on the stack?!)
The reason (as far as Georg and I understand it now in this topic) is the failed first call of:
By default, this uninitialized "status" value holds 0 (which, while uninitialized, is still sane enough). Then, when we call glGetObjectParameterivARB for the first time, status doesn't change and is still 0.
SDL_LogError(SDL_LOG_CATEGORY_APPLICATION, "Failed to compile shader:\n%s\n%s", source, info);
SDL_stack_free(info);
return SDL_FALSE;
} else {
return SDL_TRUE;
}
}
See, if "status" is 0, we fail.
Now, what happens when we add Printf/Delay: the stack changes a bit, and the uninitialized variables change too. And now, instead of 0, we have 1723453592. So it is not that printf/Delay fix it; printf/Delay just cause the stack memory areas to change a little, and the uninitialized variable gets overwritten. At least with printf I can understand why: things are placed on the stack, and maybe our "aaaa" values for printf just by luck overwrite the memory where the uninitialized vars are placed. It would be interesting to know why only printf/Delay and no other functions do this. But it is not very important, as we know that calling libs/functions changes the stack layout, and uninitialized vars can be overwritten by all kinds of crap.
So, when we add printf/Delay, the uninitialized variable is 1723453592, and the code I quote above doesn't fail. Of course, that is mostly an issue with the test case, because it should check not just for 0 but for anything which is not 1.
Though, even if the test case's error check is not good enough, it still shows us that we have an issue with getting "status" back correctly for the first call of glGetObjectParameterivARB.
Whether the shaders fail to compile or not, "status" should be set no matter what, and as long as it is not, we have some issue with that get-status function, or with something deeper in the mess of the built binary code.
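The stricter check mentioned above (treat anything which is not 1 as a failure) could look roughly like this. It is only a sketch: SKETCH_GL_TRUE and count_failed() are names made up here, and the value 1 is assumed to match GL_TRUE as in the OpenGL headers so that the snippet builds without them.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to equal GL_TRUE; redefined locally so no GL headers are needed. */
#define SKETCH_GL_TRUE 1

/* Counts how many entries in statuses[] are anything other than an
 * explicit GL_TRUE, so stack garbage such as 1723453592 is reported
 * as a failure instead of slipping past an "== 0" test. */
int count_failed(const int *statuses, int n)
{
    assert(statuses != NULL);
    int failed = 0;
    for (int i = 0; i < n; ++i) {
        if (statuses[i] != SKETCH_GL_TRUE) {
            failed++;
        }
    }
    return failed;
}
```

Fed with the six values printed earlier in the thread, this reports one failure (the first shader), which matches the suspicion that only the first status query goes wrong.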