For a full normal debug build, use -DCMAKE_BUILD_TYPE=Debug instead of -DCMAKE_BUILD_TYPE=Release. If you want only "-gstabs" without the gl4es debug output, use -DCMAKE_BUILD_TYPE=RelWithDebInfo.
So in the meantime, you will need to disable .psa generation in gl4es's init code, in the function void initialize_gl4es(), by adding:
Quote:
// temporary fix until Daniel adds the necessary things to ogles2
#ifdef __amigaos4__
globals4es.nopsa = 1;
#endif
But that is only needed if you want the very latest version of gl4es before I make a new version of the SDK.
Also, here are two more speed hints that may help speed up your OpenGL-over-gl4es ports.
In most cases the gl4es default settings are enough, but in some cases additional tricks can speed things up (sometimes radically).
One of them is environment LIBGL_BATCH:
Quote:
LIBGL_BATCH
BATCH simply tries to merge subsequent glDrawXXXXX calls (glDrawArrays, glDrawElements, ...). It only tries to merge arrays whose size is between MINBATCH and MAXBATCH (inclusive). Batching stops when there is a change of GL state, but also if an array of more than 100*N vertices is encountered.
0 : (default) don't try to merge glDrawXXXXX calls
N : any number: try to merge arrays; the first must be between 0 and 100*N vertices
MIN-MAX : two numbers separated by a minus sign: try to merge arrays that are between MIN and MAX vertices
What this means in practice: when you build one of your apps and it doesn't reach an FPS you are satisfied with, or you simply want as much as possible, first play with this environment variable in the shell before running your port, like "setenv LIBGL_BATCH 0-40", or 0-20, or 20-50; you get the idea.
If some values give you clearly better performance, then, to avoid issues and worry for users, do as I do for my NeverBall/NeverPutt ports: make your own special gl4es build that sets the necessary value in gl4es's initialization function, just like disabling .psa (if you build the latest version). That trick brings quite a lot more FPS in NeverBall, and the same in my local test of the Cube port. So it is always worth checking.
A second speed hint you may try for your ports is something that so far happens for me only in the FrikingShark game, and I do not know the reason for it; I found it by luck. It's the MAX_TEX (maximum textures) setting in gl4es.
For FrikingShark I found that if I use #define MAX_TEX 8, or 4, or 2 in config.h instead of 16, I get better and better FPS in-game. And by better, I mean almost a 50% increase with MAX_TEX 2. But that happens ONLY in FrikingShark: in no other game does it make any difference (and I tried all my ports). The reason is still unknown, but we think it may be some weird stuff in FrikingShark itself (i.e. the game's code) that enables a vertex array for every existing TMU (without actually using them). At least that would explain the performance differences. But that's just a guess. Still, it is worth noting as the second hint that helped me increase speed.
So that is all you need to get the current maximum out of gl4es if you build your own versions.
All the environment variables there work on AmigaOS4 as well, via "setenv LIBGL_xxxx blablabla". There are a lot of them, and some surely can bring some performance too; I just haven't found, together with ptitSeb, anything else as important as LIBGL_BATCH and that MAX_TEX switch in FrikingShark.
@Raziel I can easily build a new gl4es and a new SDL2 at any time if you need them, but not a new full SDK, as I have more plans for things to add to it. For your use with ScummVM etc. that will be more than fine; just tell me when you need it.
To build recent SDL1 and SDL2 versions with the changes necessary to make them work with GL4ES, you can just check the commit history for both SDLs and apply the same changes to the recent SDLs:
@kas1e Looks like I got gl4es built. How did you build GLU_gl4es? Also when testing my compiled version of libgl4es.a it works way slower than yours. Any ideas about that?
It's the usual ./configure ; make of ptitSeb's version of GLU. Just be sure that the SDK/Local/common/GL/ directory contains the GL4ES includes, not the MiniGL ones. And since the repo contains some older versions of the files needed by libtool etc., you may want to run autoupdate etc. as well. Like this for a clib2 build:
git clone https://github.com/ptitSeb/GLU
cd GLU
autoupdate
autoreconf --force --install
./configure --build=x86_64 --host=ppc-amigaos --target=ppc-amigaos CFLAGS="-mcrt=clib2" CXXFLAGS="-mcrt=clib2"
make -j4
And the same if you need a newlib version; just omit the clib2 flags (newlib is used by default):
git clone https://github.com/ptitSeb/GLU
cd GLU
autoupdate
autoreconf --force --install
./configure --build=x86_64 --host=ppc-amigaos --target=ppc-amigaos
make -j4
The ready library will be in the .libs directory, named libGLU.a, which you then rename, as I did for the SDK, to libGLU_gl4es.a.
Of course you can keep it as libGLU.a as well, but that makes a logical mess with MiniGL's one in the SDK, and you would always need to keep an eye on it to not mix one up with the other, so better to rename it.
Quote:
Also, when testing my compiled version of libgl4es.a it works way slower than yours. Any ideas about that?
How did you detect that it is slower? On what tests? And which versions do you compare: mine from the SDK, or the one I uploaded there in a previous message? The one in the SDK is for newlib and is 10-15 months old; the one I uploaded a few posts back is for the latest experimental in-progress clib2 and from the recent gl4es repo (which may have regressions too, of course).
Maybe you didn't reboot before testing and some LIBGL_* leftovers from Huno's EGL_Wrap were left in ENV:, or maybe you built a debug version by default, with -gstabs and without optimization turned on for some reason?
@kas1e thankyou for the info. Very much appreciated.
Quote:
Quote:
Also, when testing my compiled version of libgl4es.a it works way slower than yours. Any ideas about that?
How did you detect that it is slower? On what tests? And which versions do you compare: mine from the SDK, or the one I uploaded there in a previous message? The one in the SDK is for newlib and is 10-15 months old; the one I uploaded a few posts back is for the latest experimental in-progress clib2 and from the recent gl4es repo (which may have regressions too, of course).
Maybe you didn't reboot before testing and some LIBGL_* leftovers from Huno's EGL_Wrap were left in ENV:, or maybe you built a debug version by default, with -gstabs and without optimization turned on for some reason?
It was on a program that I wrote. It's about as fast as a software-only renderer when linking against the lib I compiled, versus yours, which runs nice and fast. I am using newlib and your version is from your SDK archive. I compiled using your instructions, but obviously using a new clone of gl4es. Is there any chance you could do a new build for newlib so I can have a direct lib to compare against (file size etc.) to help see what's going on?
I tried the newly kas1e-compiled newlib version but it's still slow (slower than software-only rendering actually).
My code isn't ready for prime time yet, but I did notice a couple of differences in the GL4ESBANNER output, so not sure if that is relevant.
old sdk gl4es:
LIBGL: Initialising gl4es
LIBGL: v1.1.5 built on Apr 17 2021 23:02:30
LIBGL: Using GLES 2.0 backend
LIBGL: Using Warp3DNova.library v54 revision 16
LIBGL: Using OGLES2.library v3 revision 3
LIBGL: OGLES2 Library and Interface open successfuly
LIBGL: Targeting OpenGL 2.1
LIBGL: NPOT texture handled in hardware
LIBGL: Not trying to batch small subsequent glDrawXXXX
LIBGL: try to use VBO
LIBGL: Force texture for Attachment color0 on FBO
LIBGL: Hack to trigger a SwapBuffers when a Full Framebuffer Blit on default FBO is done
LIBGL: Current folder is:Work:Programming/displwo
LIBGL: Hardware test on current Context...
LIBGL: Hardware Full NPOT detected and used
LIBGL: Extension GL_EXT_blend_minmax detected and used
LIBGL: FBO are in core, and so used
LIBGL: PointSprite are in core, and so used
LIBGL: CubeMap are in core, and so used
LIBGL: BlendColor is in core, and so used
LIBGL: Blend Substract is in core, and so used
LIBGL: Blend Function and Equation Separation is in core, and so used
LIBGL: Texture Mirrored Repeat is in core, and so used
LIBGL: Extension GL_OES_mapbuffer detected
LIBGL: Extension GL_OES_element_index_uint detected and used
LIBGL: Extension GL_OES_packed_depth_stencil detected and used
LIBGL: Extension GL_EXT_texture_format_BGRA8888 detected and used
LIBGL: Extension GL_OES_texture_float detected and used
LIBGL: Extension GL_AOS4_texture_format_RGB332 detected
LIBGL: Extension GL_AOS4_texture_format_RGB332REV detected
LIBGL: Extension GL_AOS4_texture_format_RGBA1555REV detected and used
LIBGL: Extension GL_AOS4_texture_format_RGBA8888 detected and used
LIBGL: Extension GL_AOS4_texture_format_RGBA8888REV detected and used
LIBGL: high precision float in fragment shader available and used
LIBGL: Extension GL_EXT_frag_depth detected and used
LIBGL: Max vertex attrib: 16
LIBGL: Max texture size: 16384
LIBGL: Max Varying Vector: 32
LIBGL: Texture Units: 16/16 (hardware: 32), Max lights: 8, Max planes: 6
LIBGL: Extension GL_EXT_texture_filter_anisotropic detected and used
LIBGL: Max Anisotropic filtering: 16
LIBGL: Max Color Attachments: 1 / Draw buffers: 1
LIBGL: Hardware vendor is A-EON Technology Ltd. Written by Daniel 'Daytona675x' M ener @ GoldenCode.eu
LIBGL: GLSL 300 es supported
LIBGL: GLSL 310 es supported and used
new kas1e newlib compile:
LIBGL: Initialising gl4es
LIBGL: v1.1.5 built on Sep 26 2023 07:40:34
LIBGL: Using GLES 2.0 backend
LIBGL: Using Warp3DNova.library v54 revision 16
LIBGL: Using OGLES2.library v3 revision 3
LIBGL: OGLES2 Library and Interface open successfuly
LIBGL: Targeting OpenGL 2.1
LIBGL: Not trying to batch small subsequent glDrawXXXX
LIBGL: try to use VBO
LIBGL: Force texture for Attachment color0 on FBO
LIBGL: Hack to trigger a SwapBuffers when a Full Framebuffer Blit on default FBO is done
LIBGL: Current folder is:Work:Programming/displwo
LIBGL: Loaded a PSA with 2 Precompiled Programs
LIBGL: Hardware test on current Context...
LIBGL: Hardware Full NPOT detected and used
LIBGL: Extension GL_EXT_blend_minmax detected and used
LIBGL: FBO are in core, and so used
LIBGL: PointSprite are in core, and so used
LIBGL: CubeMap are in core, and so used
LIBGL: BlendColor is in core, and so used
LIBGL: Blend Subtract is in core, and so used
LIBGL: Blend Function and Equation Separation is in core, and so used
LIBGL: Texture Mirrored Repeat is in core, and so used
LIBGL: Extension GL_OES_mapbuffer detected
LIBGL: Extension GL_OES_element_index_uint detected and used
LIBGL: Extension GL_OES_packed_depth_stencil detected and used
LIBGL: Extension GL_EXT_texture_format_BGRA8888 detected and used
LIBGL: Extension GL_OES_texture_float detected and used
LIBGL: Extension GL_AOS4_texture_format_RGB332 detected
LIBGL: Extension GL_AOS4_texture_format_RGB332REV detected
LIBGL: Extension GL_AOS4_texture_format_RGBA1555REV detected and used
LIBGL: Extension GL_AOS4_texture_format_RGBA8888 detected and used
LIBGL: Extension GL_AOS4_texture_format_RGBA8888REV detected and used
LIBGL: high precision float in fragment shader available and used
LIBGL: Extension GL_EXT_frag_depth detected and used
LIBGL: Max vertex attrib: 16
LIBGL: Extension GL_OES_get_program_binary detected and used
LIBGL: Number of supported Program Binary Format: 1
LIBGL: Max texture size: 16384
LIBGL: Max Varying Vector: 32
LIBGL: Texture Units: 16/16 (hardware: 32), Max lights: 8, Max planes: 6
LIBGL: Extension GL_EXT_texture_filter_anisotropic detected and used
LIBGL: Max Anisotropic filtering: 16
LIBGL: Max Color Attachments: 1 / Draw buffers: 1
LIBGL: Hardware vendor is A-EON Technology Ltd. Written by Daniel 'Daytona675x' M ener @ GoldenCode.eu
LIBGL: GLSL 300 es supported and used
@Dave It could be a gl4es regression then, but just in case: be sure you delete the .psa directory (it's the precompiled-shaders dir created after you run an app built with gl4es, so maybe old ones can cause harm).
Anyway, I need a test case so I can find what's wrong (in case I am not able to reproduce it with my own tests).