GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Daytona675x

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/18 9:48 #681

Not too shy to talk

@Hans
Quote:

Try setting the unused arrays to UINT8 as well. That should work for now, until I add a proper option to disable endianness conversion.

I did and unfortunately this doesn't work at all (upload taking seconds, app becoming blocked, not crashing though, can be closed gracefully). Again, even tried with size/stride 0.
Is there a chance that this gets fixed soon? Best would be a proper implementation of VBOSetArray(W3DNEF_NONE) though, abusing UINT8 for "disable" seems like a bad idea anyway.


Hans

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/18 9:51 #682

Home away from home

@Daytona675x

Quote:

I did and unfortunately this doesn't work at all (upload taking seconds, app becoming blocked, not crashing though). Again, even tried with size/stride 0.

That makes no sense whatsoever. The code in question literally compares the element sizes of all arrays in the VBO. If they're all the same, then it enables the global conversion mode.

Sorry, I cannot commit to fixing anything any time soon.

Hans


Daytona675x

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/18 10:10 #683

Not too shy to talk

@Hans
Quote:

That makes no sense whatsoever.

Yes, there's a lot inside Nova which doesn't make sense.

Quote:

That makes no sense whatsoever. The code in question literally compares the element sizes of all arrays in the VBO. If they're all the same, then it enables the global conversion mode.

Global conversion mode?! It should not do any conversion at all! That's the whole idea of having plain bytes.

So now you are saying that what's important is that the element sizes of all arrays are identical?
That makes no sense once more.
Why should the size of the elements be of any interest when it comes to deciding whether to copy stuff with your no-endian-double-copier?!
The only thing that could matter is if they are tightly packed (which is the case for size/stride 0 btw.) and if they are all of endian-uncritical types (which is the case for NONE or UINT8 regardless of their element size).
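To make the argument above concrete, here is a minimal sketch of a selection rule based on endian sensitivity instead of equal sizes. It is not Nova's code: the enum, the function names and the idea of a per-array format list are assumptions made for illustration, and the tight-packing condition is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical element formats; only the byte width per component matters here. */
typedef enum { FMT_NONE, FMT_UINT8, FMT_UINT16, FMT_FLOAT32 } ElemFormat;

/* Bytes per component; 0 for a disabled (NONE) array. */
size_t component_width(ElemFormat f)
{
    switch (f) {
    case FMT_UINT8:   return 1;
    case FMT_UINT16:  return 2;
    case FMT_FLOAT32: return 4;
    default:          return 0;
    }
}

/* A plain byte copy is safe when no array holds multi-byte components,
   because single bytes have no byte order that could need swapping. */
bool can_copy_raw(const ElemFormat *arrays, size_t n)
{
    assert(arrays != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if (component_width(arrays[i]) > 1)
            return false;
    }
    return true;
}
```

Under this rule a VBO with one UINT8 array and one disabled array would qualify for the fast path regardless of how large each array is.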

Also, some minutes ago you were sure that W3DNEF_NONE would do the job, which clearly isn't the case. So the question is whether your code actually does what you think it should do (it's called a "bug" if it doesn't).

But let's see, let's make it easy, please just answer the question I had some posts ago:
is any special parameter combination required for VBOSetArray with W3DNEF_NONE (or W3DNEF_UINT8) to make it work as promised?
Let's assume I create a VBO with 2 arrays, VBO size is 1024 bytes, I want array #0 to be my uint8 raw data, so it's uint8, elem size 1, elem count 1024, and I want to disable array #1 temporarily. What's the correct VBOSetArray call for that, so that the result is fast upload just as if the VBO had been created with just 1 array?

Quote:

Sorry, I cannot commit to fixing anything any time soon.

That's bad, especially because more and more bug reports have been piling up for almost a year now.

Not to mention the famous -O0 story.

Edited by Daytona675x on 2019/9/18 10:41:18
Edited by Daytona675x on 2019/9/18 10:45:33


kas1e

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/18 10:44 #684

Home away from home

@Hans
Quote:

Sorry, I cannot commit to fixing anything any time soon.

That's pretty sad to hear. I was hoping you would start working on it again, at least to deal with the -O0 issue and those issues with shaders.

Why don't you want to spend even a few hours to at least help Daniel disable the conversion entirely (even if that requires fixing or adding something)?

Is the reason money? If so, I can pay $100 for each bug report I file. Will you take that offer?

But what is sadder is that not only you, but all of us have spent time on this, and now you say "sorry, no bug fixing any time soon", which means never.

Why did we all spend our time getting to the point where almost everything works and just needs a bit more work, only for you to say sorry guys, no fixes any time soon? And what should we do? Drop it all, throw it in the trash and forget about it?

Sure, you have priorities, family and so on (as we all do), but why not take even a little time to fix things in Nova? You can see that we are all trying to improve things, and that it is all important. Why do you want it all to stop? :(


Daytona675x

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/18 11:06 #685

Not too shy to talk

@kas1e
No worries, I experimented a bit more:
The question "do all arrays of a VBO have the same size?", which Nova falsely uses to select fast-copy (it's the wrong question to ask, it makes no sense), can be exploited like this:

- create VBO with N arrays
- set VBO byte size to a multiple of N, add extra bytes if necessary
- if you want to switch to fast endian-conv-free data transfer then do
for(uint32 i=0;i<N;++i) VBOSetArray(vbo_handle,i,W3DNEF_UINT8,FALSE,1,1,i*(vbo_size/N),vbo_size/N);

This works
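The arithmetic behind the steps above can be sketched as follows. This is only an illustration: it does not call the Nova API, the type and function names are made up, and the resulting offset and count are simply the values that would go into the last two arguments of the VBOSetArray call shown in the post.

```c
#include <assert.h>
#include <stdint.h>

/* One raw-byte slice of the VBO: where it starts and how many bytes it covers. */
typedef struct {
    uint64_t offset;
    uint64_t count;
} RawSlice;

/* Round the VBO byte size up to the next multiple of the array count. */
uint64_t padded_vbo_size(uint64_t vbo_size, uint32_t num_arrays)
{
    assert(num_arrays > 0);
    uint64_t rem = vbo_size % num_arrays;
    return rem == 0 ? vbo_size : vbo_size + (num_arrays - rem);
}

/* Slice i of N: all slices are equally sized, sequential and non-overlapping. */
RawSlice raw_slice(uint64_t padded_size, uint32_t num_arrays, uint32_t i)
{
    assert(num_arrays > 0 && i < num_arrays);
    assert(padded_size % num_arrays == 0);
    RawSlice s;
    s.count = padded_size / num_arrays;
    s.offset = (uint64_t)i * s.count;
    return s;
}
```

For example, a 160-byte VBO with 3 arrays would be padded to 162 bytes, giving three slices of 54 bytes at offsets 0, 54 and 108.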

Now let's get rid of yet another Nova snail effect

Edited by Daytona675x on 2019/9/18 16:09:33


thellier

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/18 11:45 #686

Not too shy to talk

>The only thing that helps is to create a simple 1 array VBO in the first place.

?? Please explain more.
Do you mean a simple array of floats declared as W3DNEF_NONE + uint8 ?
Because later in your text you say it doesn't work,
but you also say "we still have a potential up-to-factor-4".

I mean, did you really obtain a x4 speed-up, or is it theory?

Daytona675x

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/18 12:04 #687

Not too shy to talk

@thellier
You misread / skipped stuff. And maybe overlooked the latest post.

Quote:

The only thing that helps is to create a simple 1 array VBO in the first place

This one is outdated. See post above your post. If the unused arrays are "disabled" in the way (and only this way) I outlined there then it works with VBOs with arbitrary numbers of arrays.

Quote:

Do you mean a simple array of floats declared as W3DNEF_NONE + uint8 ?

I don't understand what you mean by that.

Quote:

Because later in your text you say it doesn't work, but you also say "we still have a potential up-to-factor-4".

As I stated: 7 fps vs 30 fps (~ factor 4) was what I measured:
a) 30 fps if I used a 1-array index VBO with the no-endian-conv-trick.
b) 7 fps if I instead made it a 2-array index VBO with the 2nd array disabled via W3DNEF_NONE, which weirdly enough results in Nova's slow (standard) copy-conv being triggered.

What did not work at that time was making (b) as fast as (a), because I hadn't found a way to trick Nova into the raw-byte-copy mode if the VBO had more than 1 array.

Hans' W3DNEF_NONE info turned out to simply not work at all. And his W3DNEF_UINT8 hint lacked proper usage info, and Nova has a logic bug when it comes to mode selection, which is why I did not try it in a way that would suit Nova.

NOW I found such a way. So now I get (b) with the speed of (a), which means that if I apply that wisdom to my internal interleaved vertex data VBOs, then the upload of those should speed up accordingly.

Important: note that this expected speedup is usually not going to result in a 4x higher framerate!! What's being sped up is the VBO upload only! If the respective app uses its own VBOs then there most likely won't be too many uploads... Also, ogles2 does a lot of tricks to avoid uploads at all costs.
So you can expect the biggest improvements in situations where a) the ogles2 client uses client RAM instead of its own VBOs and b) that vertex data is frequently changing.
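The caveat above is essentially Amdahl's law: only the upload part of a frame gets faster. A minimal sketch, assuming uploads take a fraction of the frame time and become a given factor faster; the function name and the example numbers are illustrative, not measurements from this thread.

```c
#include <assert.h>

/* Overall frame-rate gain when a fraction `upload_fraction` (0..1) of the
   frame time becomes `upload_speedup` times faster and the rest is unchanged. */
double overall_speedup(double upload_fraction, double upload_speedup)
{
    assert(upload_fraction >= 0.0 && upload_fraction <= 1.0);
    assert(upload_speedup > 0.0);
    return 1.0 / ((1.0 - upload_fraction) + upload_fraction / upload_speedup);
}
```

If uploads made up the whole frame time, a 4x faster upload would give a 4x framerate; if they made up half of it, the overall gain would be 1.6x; with no uploads at all nothing changes.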

Edited by Daytona675x on 2019/9/18 13:37:30


thellier

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/18 14:01 #688

Not too shy to talk

@Daytona675x
Sorry, you are right, I only read up to page 34 :-/ so I missed your latest explanations.
I am not accustomed to a coding topic producing so many answers in so little time.

> b) that vertex data is frequently changing
So for any minigl to nova (or warp3d to nova) wrapper it will make sense

Daytona675x

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/18 16:08 #689

Not too shy to talk

@thellier
Quote:

I am not accustomed to a coding topic producing so many answers in so little time.

LOL

Quote:

So for any minigl to nova (or warp3d to nova) wrapper it will make sense

In general yes, but... The real effective per-frame performance gain highly depends on the respective application / game. The highest gain is to be expected when every existing upload-avoidance strategy inside ogles2 fails. This is the case for quickly changing, non-repetitive data, e.g. a procedural effect. Or if there are so many different objects per frame that ogles2's caches are constantly overwritten again. Or if there are very large objects which don't fit the internal cache buffers.
As long as there are only a handful of not too big uploads per frame, the performance gain will still exist but will probably be negligible. We'll see.


Kamelito

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/18 16:30 #690

Just popping in

Since A-Eon owns Nova, including the source code, I suppose another talented developer could fix it, right?
https://amigaworld.net/modules/news/article.php?storyid=7787

kas1e

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/18 18:08 #691

Home away from home

@all
I hate to bring up yet another bug report, but it is probably still worth discussing to understand where the issues come from.

To keep it short: there is something wrong with the BGRA texture hardware extension. There are two issues: a cosmetic one, and a hard crash when those BGRA textures are used with an FBO.

The first one is purely cosmetic and causes no issues apart from visual differences: the colors are swapped. Where they should be blue they are red, and where they should be red they are blue.
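To illustrate what such a red/blue swap means at the byte level, here is a minimal sketch for 8-bit, 4-channel pixels. It is not code from gl4es, ogles2 or Nova; the function name is made up, and it only shows that RGBA and BGRA data differ by exchanging the first and third byte of every pixel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the first and third byte of every 4-byte pixel in place (RGBA <-> BGRA). */
void swap_red_blue(uint8_t *pixels, size_t pixel_count)
{
    assert(pixels != NULL || pixel_count == 0);
    for (size_t i = 0; i < pixel_count; ++i) {
        uint8_t *p = pixels + i * 4;
        uint8_t tmp = p[0];
        p[0] = p[2];
        p[2] = tmp;
    }
}
```

If one side of a pipeline interprets the data as RGBA and the other as BGRA without such a swap (or an equivalent channel mapping), pure red shows up as pure blue and vice versa, which matches the symptom described.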

For example, here is one of the Irrlicht examples (13.rendertotexture) and how it currently looks on our side (@all: don't be alarmed by the "6" shown for the fps in the title bar; the whole string just doesn't fit in my theme, it's actually 600 fps):

[Screenshot: Irrlicht example 13.rendertotexture with the BGRA texture extension in use]

And this is how it looks once I tell gl4es, via its environment settings, to ignore the BGRA texture hardware extension:

[Screenshot: the same example with the BGRA texture extension ignored]

On win32 and on Pandora the colors are blue too, i.e. that is the correct look, and the red one is wrong.

The same issue happens in SuperTuxKart on the car selection screen: where the cars should be red they are blue, and where they should be blue they are red. And again, once I disable the use of BGRA textures, it shows the correct colors.

At first I of course thought it was gl4es and something about endian formats that needed handling inside it, but ptitSeb says no, there isn't much that the Irrlicht engine and gl4es do here. Textures with the GL_BGRA format are supported on AmigaOS, as there are other games that use them, so it's an issue with this format. It may very well be that on our side reading is fine, but writing to the texture is not.

The second issue is a very hard ISI crash when I choose a car on the car selection screen. Before it all crashes I get this output:

[error] Irrlicht: FBO has one or serveral incomplete image attachements
[error] Irrlicht: FBO error
[error] main: Exception caught : std::bad_alloc
[error] main: Aborting SuperTuxKart

Once I even got:

[error] Irrlicht: FBO has one or serveral incomplete image attachements
[error] Irrlicht: FBO error
[error] Irrlicht: Fatal Error: Tried to set a texture not owned by this driver

At first we thought it could be related to the swapped colors issue, but when I set gl4es to ignore the BGRA texture extension, the colors become fine but the crash is still there. So the issues are probably not related, although they are in the same area of textures and the BGRA format. Or maybe they are related; it's hard to say right now.

The crash looks like this:


Crash log for task "supertuxkart"

Generated by GrimReaper 53.19

Crash occured in module  at address 0x590F9168

Type of crash: ISI (Instruction Storage Interrupt) exception

Alert number: 0x80000003



Register dump:

GPR (General Purpose Registers):

   0: 7EF95EB4 63DB1470 00000002 590F9170 590F9170 63DB1590 63DB1568 00000000 

   8: 00000000 590F9168 00000000 7F296AFC 33955353 63DC606C 63DC0000 634B0000 

  16: 63DC0000 634B2B58 FF007F00 634B2B70 FF7F7F00 634B2B88 FF00007F 634B2B9C 

  24: 63DC0000 00000001 63DB1590 590F9170 63DB1568 00000000 00000000 63D4C550 





FPR (Floating Point Registers, NaN = Not a Number):

   0:              nan                1                1                1 

   4:                1              nan              nan              nan 

   8:              888                1                0       4.5036e+15 

  12:       4.5036e+15                0     9.36758e-110     -3.22894e+49 

  16:     -8.5225e-173     2.08306e-235    -5.90513e-258     -2.28164e-24 

  20:      6.53871e-38    -1.11483e-105     1.67461e-200     3.42169e-265 

  24:    -1.82693e-178     -5.17867e-09     1.04918e-114     2.20234e-226 

  28:    -2.12876e-252    -3.49188e-264     2.92964e-255            0.028 



FPSCR (Floating Point Status and Control Register): 0xAE104100





SPRs (Special Purpose Registers):

           Machine State (msr) : 0x0002F030

                Condition (cr) : 0x00000000

      Instruction Pointer (ip) : 0x590F9168

       Xtended Exception (xer) : 0x5D311F10

                   Count (ctr) : 0x00000000

                     Link (lr) : 0x587CEEA0

            DSI Status (dsisr) : 0x587CEDB0

            Data Address (dar) : 0x59953999







680x0 emulated registers:

DATA: 83580D00 00000000 00000000 00000000 00000000 00000000 00000000 00000000 

ADDR: 6FFA4000 8280F300 00000000 00000000 00000000 00000000 00000000 63DB1070 

FPU0:                0                0                0                0 

FPU4:                0                0                0                0 







Symbol info:

Instruction pointer 0x590F9168 belongs to module "" (HUNK/Kickstart)



Stack trace:

    0x590F9168 symbol not available

    [/amiga/SuperTuxKart-0.8.1/lib/irrlicht/source/Irrlicht/COpenGLDriver.cpp:2184] supertuxkart:_ZN3irr5video13COpenGLDriver11draw2DImageEPKNS0_8ITextureERKNS_4core4rectIiEES9_PS8_PKNS0_6SColorEb()+0x78 (section 1 @ 0x3CFB70)

    [/amiga/SuperTuxKart-0.8.1/src/guiengine/skin.cpp:1849] supertuxkart:_ZN9GUIEngine4Skin13process3DPaneEPN3irr3gui11IGUIElementERKNS1_4core4rectIiEEb()+0x4ac (section 1 @ 0xD0B74)

    [/amiga/SuperTuxKart-0.8.1/src/guiengine/skin.cpp:2015] supertuxkart:_ZN9GUIEngine4Skin24draw3DButtonPaneStandardEPN3irr3gui11IGUIElementERKNS1_4core4rectIiEEPS8_()+0x84 (section 1 @ 0xD183C)

    [/amiga/SuperTuxKart-0.8.1/lib/irrlicht/source/Irrlicht/CGUIButton.cpp:244] supertuxkart:_ZN3irr3gui10CGUIButton4drawEv()+0x68c (section 1 @ 0x60BEE0)

    [/amiga/SuperTuxKart-0.8.1/lib/irrlicht/source/Irrlicht/CGUIEnvironment.cpp:318] supertuxkart:_ZN3irr3gui15CGUIEnvironment7drawAllEv()+0x1a4 (section 1 @ 0x48DCDC)

    [/amiga/SuperTuxKart-0.8.1/src/guiengine/engine.cpp:1164] supertuxkart:_ZN9GUIEngine6renderEf()+0x43c (section 1 @ 0xA937C)

    [/amiga/SuperTuxKart-0.8.1/src/graphics/irr_driver.cpp:1731] supertuxkart:_ZN9IrrDriver6updateEf()+0x858 (section 1 @ 0x6F9DC)

    [/amiga/SuperTuxKart-0.8.1/src/main_loop.cpp:164] supertuxkart:_ZN8MainLoop3runEv()+0x2b4 (section 1 @ 0x1967D0)

    [/amiga/SuperTuxKart-0.8.1/src/main.cpp:1536] supertuxkart:main()+0x8bc (section 1 @ 0x19534C)

    native kernel module newlib.library.kmod+0x0000257c

    native kernel module newlib.library.kmod+0x00003294

    native kernel module newlib.library.kmod+0x000037c8

    supertuxkart:_start()+0x170 (section 1 @ 0x1920)

    native kernel module dos.library.kmod+0x0002a5d8

    native kernel module kernel+0x0006b590

    native kernel module kernel+0x0006b5d8



PPC disassembly:

 590f9160: 00000025   .word             0x00000025

 590f9164: 00000000   .word             0x00000000

*590f9168: 00000038   .word             0x00000038

 590f916c: 00000099   .word             0x00000099

 590f9170: 67783088   oris              r24,r27,12424

As it is an ISI crash, a meaningful disassembly is not possible, which is why the disassembly points at some random memory.

The line where we crash (Irrlicht/COpenGLDriver.cpp:2184) is in COpenGLDriver::draw2DImage() and looks like this:

const core::dimension2d<u32>& ss = texture->getOriginalSize();

But that doesn't help much, as everything looks correct (and it works as usual on Pandora and Linux with gl4es).

Then we enabled debug output in gl4es's src/framebuffer.c to see what happens during FBO creation and use, and this is what we get when we crash:


glGenerateMipmap(GL_TEXTURE_2D)

glGenerateMipmap(GL_TEXTURE_2D)

glGenFramebuffers(1, 0x58e7e738)

glBindFramebuffer(GL_FRAMEBUFFER, 256), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, 568, 0) glstate->fbo.current_fb=256 (draw=256, read=256)

found texture, glname=568, size=512x512(512x512), format/type=GL_BGRA/GL_UNSIGNED_BYTE

Attach Texture 568 to FBO 256 as Attachement GL_COLOR_ATTACHMENT0

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=256 (draw=256, read=256)

glGenRenderbuffers(1, 0x58e4fb04)

glBindRenderbuffer(GL_RENDERBUFFER, 256), binded Fbo=0

glRenderbufferStorage(GL_RENDERBUFFER, 0x81A5, 512, 512)

glBindFramebuffer(GL_FRAMEBUFFER, 256), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, 256)

glCheckFramebufferStatus(0x8D40)=0x8CD5

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=256 (draw=256, read=256)

glBindFramebuffer(GL_FRAMEBUFFER, 256), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=256 (draw=256, read=256)

glDeleteRenderbuffer(1, 0x58e4fb04)

glDeleteFramebuffers(1, 0x58e7e738), framebuffers[0]=256

glGenFramebuffers(1, 0x58e7e738)

glBindFramebuffer(GL_FRAMEBUFFER, 257), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, 569, 0) glstate->fbo.current_fb=257 (draw=257, read=257)

found texture, glname=569, size=512x512(512x512), format/type=GL_BGRA/GL_UNSIGNED_BYTE

Attach Texture 569 to FBO 257 as Attachement GL_COLOR_ATTACHMENT0

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=257 (draw=257, read=257)

glGenRenderbuffers(1, 0x58e4f4cc)

glBindRenderbuffer(GL_RENDERBUFFER, 257), binded Fbo=0

glRenderbufferStorage(GL_RENDERBUFFER, 0x81A5, 512, 512)

glBindFramebuffer(GL_FRAMEBUFFER, 257), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, 257)

glCheckFramebufferStatus(0x8D40)=0x8CD5

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=257 (draw=257, read=257)

glBindFramebuffer(GL_FRAMEBUFFER, 257), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=257 (draw=257, read=257)

glDeleteRenderbuffer(1, 0x58e4f4cc)

glDeleteFramebuffers(1, 0x58e7e738), framebuffers[0]=257

glGenFramebuffers(1, 0x58e7e738)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, 570, 0) glstate->fbo.current_fb=258 (draw=258, read=258)

found texture, glname=570, size=512x512(512x512), format/type=GL_BGRA/GL_UNSIGNED_BYTE

Attach Texture 570 to FBO 258 as Attachement GL_COLOR_ATTACHMENT0

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glGenRenderbuffers(1, 0x58e4f4cc)

glBindRenderbuffer(GL_RENDERBUFFER, 258), binded Fbo=0

glRenderbufferStorage(GL_RENDERBUFFER, 0x81A5, 512, 512)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, 258)

glCheckFramebufferStatus(0x8D40)=0x8CD6

[error  ] Irrlicht: FBO has one or several incomplete image attachments

[error  ] Irrlicht: FBO error

glDeleteRenderbuffer(1, 0x58e4f4cc)

glDeleteFramebuffers(1, 0x58e7e738), framebuffers[0]=258

And here is the log from a run where, by some luck (about 1 in 5 times), I didn't crash:


glGenerateMipmap(GL_TEXTURE_2D)

glGenerateMipmap(GL_TEXTURE_2D)

glGenFramebuffers(1, 0x592932a0)

glBindFramebuffer(GL_FRAMEBUFFER, 256), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, 568, 0) glstate->fbo.current_fb=256 (draw=256, read=256)

found texture, glname=568, size=512x512(512x512), format/type=GL_BGRA/GL_UNSIGNED_BYTE

Attach Texture 568 to FBO 256 as Attachement GL_COLOR_ATTACHMENT0

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=256 (draw=256, read=256)

glGenRenderbuffers(1, 0x592934a4)

glBindRenderbuffer(GL_RENDERBUFFER, 256), binded Fbo=0

glRenderbufferStorage(GL_RENDERBUFFER, 0x81A5, 512, 512)

glBindFramebuffer(GL_FRAMEBUFFER, 256), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, 256)

glCheckFramebufferStatus(0x8D40)=0x8CD5

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=256 (draw=256, read=256)

glBindFramebuffer(GL_FRAMEBUFFER, 256), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=256 (draw=256, read=256)

glDeleteRenderbuffer(1, 0x592934a4)

glDeleteFramebuffers(1, 0x592932a0), framebuffers[0]=256

glGenFramebuffers(1, 0x59292e60)

glBindFramebuffer(GL_FRAMEBUFFER, 257), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, 569, 0) glstate->fbo.current_fb=257 (draw=257, read=257)

found texture, glname=569, size=512x512(512x512), format/type=GL_BGRA/GL_UNSIGNED_BYTE

Attach Texture 569 to FBO 257 as Attachement GL_COLOR_ATTACHMENT0

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=257 (draw=257, read=257)

glGenRenderbuffers(1, 0x594284f4)

glBindRenderbuffer(GL_RENDERBUFFER, 257), binded Fbo=0

glRenderbufferStorage(GL_RENDERBUFFER, 0x81A5, 512, 512)

glBindFramebuffer(GL_FRAMEBUFFER, 257), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, 257)

glCheckFramebufferStatus(0x8D40)=0x8CD5

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=257 (draw=257, read=257)

glBindFramebuffer(GL_FRAMEBUFFER, 257), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=257 (draw=257, read=257)

glDeleteRenderbuffer(1, 0x594284f4)

glDeleteFramebuffers(1, 0x59292e60), framebuffers[0]=257

glGenFramebuffers(1, 0x59292e60)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, 570, 0) glstate->fbo.current_fb=258 (draw=258, read=258)

found texture, glname=570, size=512x512(512x512), format/type=GL_BGRA/GL_UNSIGNED_BYTE

Attach Texture 570 to FBO 258 as Attachement GL_COLOR_ATTACHMENT0

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glGenRenderbuffers(1, 0x5977652c)

glBindRenderbuffer(GL_RENDERBUFFER, 258), binded Fbo=0

glRenderbufferStorage(GL_RENDERBUFFER, 0x81A5, 512, 512)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, 258)

glCheckFramebufferStatus(0x8D40)=0x8CD5

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0)

glBindFramebuffer(GL_FRAMEBUFFER, 0), list=none, glstate->fbo.current_fb=258 (draw=258, read=258)

glBindFramebuffer(GL_FRAMEBUFFER, 258), list=none, glstate->fbo.current_fb=0 (draw=0, read=0) 

...

Having looked at the logs, ptitSeb says that everything looks fine: the log trace from the "crash" version looks clean. Three FBOs are created in exactly the same way, but the third one didn't work, for some reason he can't explain. And there isn't any more useful log output to be had from gl4es. In the current logs, the issue is when glCheckFramebufferStatus(...) is not 0x8CD5 (which means FBO complete).
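For reading the logs above, the raw status values can be decoded with a small helper. This is just a sketch: the function names are made up, while the numeric values are the standard OpenGL ES 2.0 framebuffer status constants. In the crashing run the third FBO returns 0x8CD6, i.e. GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT.

```c
#include <assert.h>

/* Map a glCheckFramebufferStatus() result to its OpenGL ES 2.0 constant name. */
const char *fbo_status_name(unsigned int status)
{
    switch (status) {
    case 0x8CD5: return "GL_FRAMEBUFFER_COMPLETE";
    case 0x8CD6: return "GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT";
    case 0x8CD7: return "GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT";
    case 0x8CD9: return "GL_FRAMEBUFFER_INCOMPLETE_DIMENSIONS";
    case 0x8CDD: return "GL_FRAMEBUFFER_UNSUPPORTED";
    default:     return "unknown";
    }
}

/* Nonzero only for GL_FRAMEBUFFER_COMPLETE (0x8CD5). */
int fbo_is_complete(unsigned int status)
{
    return status == 0x8CD5u;
}

/* Debug helper: stop right at the first FBO that is not complete. */
void fbo_require_complete(unsigned int status)
{
    assert(fbo_is_complete(status));
}
```

The other hex values in the log decode the same way: 0x8D40 is GL_FRAMEBUFFER and 0x81A5 is GL_DEPTH_COMPONENT16.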

So his guess is that on our side we need to check whether there is an issue with binding a GL_BGRA texture to an FBO.

And we (probably me :) ) need to remember what binding a texture to an FBO means: all the rendering that is done goes not to the screen but to the bound FBO, so the drawing is done directly into the texture. So maybe GL_BGRA is fine when you read it, but is handled wrongly when "writing" into an FBO.

The last thing I did was to create a glSnoop trace of the whole game from start until it crashes. Maybe that will also give us a clue about what is wrong with the FBO and that texture binding:

http://kas1e.mikendezign.com/aos4/gl4es/BGRA/stk_crash_trace.zip

Packed 700 KB, unpacked 26 MB

Edited by kas1e on 2019/9/18 20:02:54


Daytona675x

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/19 8:40 #692

Not too shy to talk

@kas1e
Regarding the blue/red color swap: it's a Nova bug, although, well, to be fair: it's more a question of definition, of who's responsible for what and what's to be expected from what.

Anyway, Nova ignores any eventual channel swizzle settings of a texture when it's bound to an FBO.
Unfortunately this channel swizzling is how I implemented BGRA texture support: it's actually a texture of format W3DNPF_RGBA (which actually only means that it should have 4 color channels) and then I modify its default channel mapping.

I submitted a small bug report against Nova.
However, because Hans signaled that he won't fix anything anytime soon and because it's probably not really his fault this time, I just added a temporary workaround to ogles2:

Whenever a texture is now used as an FBO render target, I reset its swizzling so that it can at least be rendered as expected. This may have other side effects, but for most use cases this should do.
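The idea of a per-texture channel mapping, and of resetting it for render targets, can be sketched like this. The struct and function names are hypothetical and this is not the ogles2 or Nova implementation; it only illustrates that a BGRA texture can be modelled as RGBA storage plus a mapping, and that the identity mapping leaves channels untouched.

```c
#include <assert.h>
#include <stdint.h>

/* Output channel i reads input channel src[i] (0=R, 1=G, 2=B, 3=A in storage order). */
typedef struct {
    uint8_t src[4];
} Swizzle;

/* Default mapping: every channel reads itself. */
Swizzle swizzle_identity(void)
{
    Swizzle s = { { 0, 1, 2, 3 } };
    return s;
}

/* BGRA view of RGBA storage: red and blue exchanged. */
Swizzle swizzle_bgra(void)
{
    Swizzle s = { { 2, 1, 0, 3 } };
    return s;
}

/* Apply the mapping to one pixel; `in` and `out` must not overlap. */
void apply_swizzle(Swizzle s, const uint8_t in[4], uint8_t out[4])
{
    for (int i = 0; i < 4; ++i) {
        assert(s.src[i] < 4);
        out[i] = in[s.src[i]];
    }
}
```

In these terms, the workaround described above amounts to switching a texture from the BGRA mapping back to the identity mapping whenever it becomes an FBO render target.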

Edited by Daytona675x on 2019/9/19 9:09:59
Edited by Daytona675x on 2019/9/19 9:40:07


thellier

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/19 9:16 #693

Not too shy to talk

@kas1e

You can't blame Hans for not fixing Nova immediately: after all, we don't know about his life, and perhaps he has more urgent or important personal things to do than fixing a hobby computer program.

Usually the answer from a subcontractor is "will be done in a future release", or from an employee "I will take care of it as soon as I have time", and then the paper goes on top of a big stack of folders, meaning "but I have so much to do..."

Don't blame him for not saying such bullshit.

kas1e

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/19 11:32 #694

Home away from home

@Daniel
Yeah, I tested the new ogles2.library, colors are fine now, thanks!

I also have some progress on the crash front. I already wrote it all up on Facebook, but maybe others are interested in reading it :)

So, the crash is gone once I set gl4es's LIBGL_RECYCLEFBO environment variable, which avoids multiple FBO creations / deletions. However, it produces a wrong look and a mess instead of the actual data:

(press open in new tab for fullsize):

Resized Image

But that can be easy gl4es issue as well , need to check with ptitSeb firstly.

Join us to improve dopus5!
AmigaOS4 on youtube

thellier

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/20 9:21 #695

Not too shy to talk

@Daytona675x

If I understood correctly, this should work too, no?

- create a VBO with N arrays
- define VBOsize as the VBO byte size
- if you want to switch to fast endian-conversion-free data transfer, then do
for(uint32 i=0;i<VBOsize ;++i) VBOSetArray(vbo_handle,i,W3DNEF_UINT8,FALSE,1,1,i,1);

Edit: by VBOsize I mean per line of the array
Example
3 arrays:
xyz
uvw
rgba

VBOsize=(3+3+4)*4=40

So no need to have a multiple of 3, it can stay 40

Edited by thellier on 2019/9/20 12:28:47

Daytona675x

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/20 14:24 #696

Not too shy to talk

@thellier
No, that's not the way to go. The second parameter to VBOSetArray must be the index of an array of the respective VBO, namely a value between 0 and N-1, where N is the value you used when creating your VBO. Also, your 40 is not the VBOsize, it's your vertex size; your VBO certainly contains more than 1 vertex.

In your example N is 3, but you wrongly try to set array layouts for the non-existent arrays 3 to 39.

Keep in mind that for this trick the real layout of your arrays is not of interest.
What's important is that you use W3DNEF_UINT8 (otherwise endian conversion would kick in) and the same size for all arrays of the VBO (otherwise the driver acts somewhat dumb and always selects the complex slow copy). That is why you may have to add some extra bytes to your VBO, and I suppose a size divisible by 8 won't hurt either.
The idea simply is to temporarily split the VBO memory into N sequential raw-ubyte areas of the same size.

Let's assume your VBO should contain 4 of your vertices. Then it looks like this:

ArrayCount=3 (xyz, uvw, rgba)
VertexSize=40
VBOsize=4*VertexSize=160

But for the trick to work we must ensure that the VBOsize is divisible by our ArrayCount, therefore:

if(VBOsize % ArrayCount) VBOsize+=(ArrayCount-(VBOsize % ArrayCount)), so
VBOsize=162

So you first create a VBO with 3 arrays and a byte size of 162.

In reality your buffer will look like this:


..0: xyzuvwrgba

.40: xyzuvwrgba

.80: xyzuvwrgba

120: xyzuvwrgba

160: bb

162:

However, for the anti-endian-conversion trick you temporarily make it appear like this:


..0: bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb

.54: bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb

108: bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb

162:

which is
for(uint32 i=0;i<N;++i) VBOSetArray(vbo_handle,i,W3DNEF_UINT8,FALSE,1,1,i*(vbo_size/N),vbo_size/N);
or here
for(uint32 i=0;i<3;++i) VBOSetArray(vbo_handle,i,W3DNEF_UINT8,FALSE,1,1,i*54,54);
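
The arithmetic behind the padding rule and the N equal raw-ubyte areas can be sketched in plain C as follows. This is only an illustration under my own naming: pad_vbo_size, RawArea and raw_area are hypothetical helpers, not part of Nova, and no Nova call is made. The two values in RawArea are meant to correspond to the last two arguments of the VBOSetArray calls above.

```c
#include <assert.h>
#include <stdint.h>

/* Round vbo_size up to the next multiple of array_count,
   same rule as the if(VBOsize % ArrayCount) line above. */
static uint32_t pad_vbo_size(uint32_t vbo_size, uint32_t array_count)
{
    assert(array_count > 0);
    uint32_t rest = vbo_size % array_count;
    return rest ? vbo_size + (array_count - rest) : vbo_size;
}

/* Hypothetical helper type: the byte range of one raw-ubyte area. */
typedef struct { uint32_t offset; uint32_t size; } RawArea;

/* Area number `index` when the padded VBO is cut into
   array_count sequential pieces of identical size. */
static RawArea raw_area(uint32_t padded_size, uint32_t array_count,
                        uint32_t index)
{
    assert(array_count > 0 && index < array_count);
    assert(padded_size % array_count == 0);
    RawArea a;
    a.size = padded_size / array_count;
    a.offset = index * a.size;
    return a;
}
```

With the numbers from the example, pad_vbo_size(160, 3) gives 162, and the three areas start at 0, 54 and 108 with 54 bytes each.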


kas1e

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/20 20:23 #697

Home away from home

@All
While porting SuperTuxKart, I found another simple shader which fails on our side. Daniel's glslangvalidator_redux compiles it fine (so ogles2 is probably fine there too, because the code from Daniel's latest glslangvalidator_redux is inside ogles2, if I remember correctly), but then the compiled .spv version of that shader fails on Nova with a reference to problems with shared_ptr.

Here is the shader in question:


#version 120

// motion_blur.vert

void main()
{
    gl_TexCoord[0].st = vec2(gl_MultiTexCoord0.s, gl_MultiTexCoord0.t);
    gl_Position = gl_Vertex;
}

glslangvalidator compiles it fine:

Quote:

8/0.Work:Warp3DNova/my_tests> glslangvalidator_redux -G -o motion_blur.vert.spv motion_blur.vert

Success.

Then Nova fails to compile it:


8/0.Work:Warp3DNova/my_tests> W3DNShaderInfo motion_blur.vert.spv 

W3DNShaderInfo - Get shader information



Shader: motion_blur.vert.spv

Compiling motion_blur.vert.spv failed (12) with error: unknown error

Log:

ERROR: An exception occurred during compilation: 

ERROR: Assertion failed: px != 0, in T* boost::shared_ptr<T>::operator->() const [with T = GPUProg::CGRegisterAlloc], defined in /SDK/local/common/include/boost/shared_ptr.hpp in line 253

ERROR: An exception occurred during compilation: 

ERROR: Assertion failed: px != 0, in T* boost::shared_ptr<T>::operator->() const [with T = GPUProg::CGRegisterAlloc], defined in /SDK/local/common/include/boost/shared_ptr.hpp in line 253

ERROR: Code generation failed for an unknown reason.







Done.

And here is the verbose output from Nova:


8/0.Work:Warp3DNova/my_tests> W3DNShaderInfo -v motion_blur.vert.spv 

W3DNShaderInfo - Get shader information



Verbose mode

Shader: motion_blur.vert.spv

Compiling motion_blur.vert.spv failed (12) with error: unknown error

Log:

Shader size: 760 bytes

Parsing SPIR-V code



Module Version: 1.2.0

Generator Magic Number: 0x80003

Upper bound on ids: 32



Parsed instructions:

            OpCapability: : Shader

         1: OpExtInstImport: : GLSL.std.450

            OpMemoryModel: : addressing: Logical, memory: GLSL450

         4: OpEntryPoint: : main, execution model: Vertex

            OpSource: : GLSL ver 120

         4: OpName: : main

        12: OpName: : gl_TexCoord

        16: OpName: : gl_MultiTexCoord0

        29: OpName: : gl_Position

        30: OpName: : gl_Vertex

        29: OpDecorate: : BuiltIn(Position)

         2: OpTypeVoid: Void

         3: OpTypeFunction: 2 << func()

         6: OpTypeFloat: Float: 32 bits

         7: OpTypeVector: ???Vector4: num-elements: 4, element type id: 6

         8: OpTypeInt: UInt: 32 bits, unsigned

         9: OpConstant: 8 const9 = 0x1

        10: OpTypeArray: ???[]: length id: 9, element type id: 7

        11: OpTypePointer: ???Ptr: storage class: Output

        12: OpVariable: 11: var12: storage class: Output

        13: OpTypeInt: Int: 32 bits, signed

        14: OpConstant: 13 const14 = 0x0

        15: OpTypePointer: ???Ptr: storage class: Input

        16: OpVariable: 15: var16: storage class: Input

        17: OpConstant: 8 const17 = 0x0

        18: OpTypePointer: ???Ptr: storage class: Input

        23: OpTypeVector: ???Vector2: num-elements: 2, element type id: 6

        25: OpTypePointer: ???Ptr: storage class: Output

        29: OpVariable: 25: var29: storage class: Output

        30: OpVariable: 15: var30: storage class: Input

         4: OpFunction: func4(type: 3)



         5: OpLabel: 

        19: OpAccessChain: 18: 16[17]

        20: OpLoad: 6: tmp20 << 19

        21: OpAccessChain: 18: 16[9]

        22: OpLoad: 6: tmp22 << 21

        24: OpCompositeConstruct: 23: tmp24 << 20, 22

        26: OpAccessChain: 25: 12[14]

        27: OpLoad: 7: tmp27 << 26

        28: OpVectorShuffle: 7: tmp28 << 27, 24, 4, 5, 2, 3

            OpStore: : 28 >> 26

        31: OpLoad: 7: tmp31 << 30

            OpStore: : 31 >> 29

            OpReturn: 

            OpFunctionEnd: 



Linking the instructions

Initial Disassembly:

Module Info:

            OpSource: : GLSL ver 120

         1: OpExtInstImport: : GLSL.std.450

            OpMemoryModel: : addressing: Logical, memory: GLSL450



Capabilities:

            OpCapability: : Shader



Inputs:

        16: OpVariable: FloatVector4*: gl_MultiTexCoord0: storage class: Input

        30: OpVariable: FloatVector4*: gl_Vertex: storage class: Input



Outputs:

        12: OpVariable: FloatVector4[1]*: gl_TexCoord: storage class: Output

        29: OpVariable: FloatVector4*: gl_Position: storage class: Output Decorators: BuiltIn(Position)



Entry Points: 

         4: OpEntryPoint: : main, execution model: Vertex, Function: Void main()



Constants:

         9: OpConstant: UInt const9 = 1

        14: OpConstant: Int const14 = 0

        17: OpConstant: UInt const17 = 0



Disassembled Code:

         4: OpFunction: Void main()

                 5: lb5:

                        19: OpAccessChain: Float*: gl_MultiTexCoord0[0]

                        20: OpLoad: Float: tmp20 << gl_MultiTexCoord0[0]

                        21: OpAccessChain: Float*: gl_MultiTexCoord0[1]

                        22: OpLoad: Float: tmp22 << gl_MultiTexCoord0[1]

                        24: OpCompositeConstruct: FloatVector2: tmp24 << tmp20, tmp22

                        26: OpAccessChain: FloatVector4*: gl_TexCoord[0]

                        27: OpLoad: FloatVector4: tmp27 << gl_TexCoord[0]

                        28: OpVectorShuffle: FloatVector4: tmp28 << tmp27, tmp24, 4, 5, 2, 3

                            OpStore: : tmp28 >> gl_TexCoord[0]

                        31: OpLoad: FloatVector4: tmp31 << gl_Vertex

                            OpStore: : tmp31 >> gl_Position

                            OpReturn: 







Performing hardware-independent optimization...

Can't merge stores to array variable: gl_TexCoord

Optimization done.



Optimized Disassembly:

Module Info:

            OpSource: : GLSL ver 120

         1: OpExtInstImport: : GLSL.std.450

            OpMemoryModel: : addressing: Logical, memory: GLSL450



Capabilities:

            OpCapability: : Shader



Inputs:

        16: OpVariable: FloatVector4*: gl_MultiTexCoord0: storage class: Input

        30: OpVariable: FloatVector4*: gl_Vertex: storage class: Input



Outputs:

        12: OpVariable: FloatVector4[1]*: gl_TexCoord: storage class: Output

        29: OpVariable: FloatVector4*: gl_Position: storage class: Output Decorators: BuiltIn(Position)



Entry Points: 

         4: OpEntryPoint: : main, execution model: Vertex, Function: Void main()



Constants:

         9: OpConstant: UInt const9 = 1

        14: OpConstant: Int const14 = 0

        17: OpConstant: UInt const17 = 0



Disassembled Code:

         4: OpFunction: Void main()

                 5: lb5:

                        19: OpAccessChain: Float*: gl_MultiTexCoord0[0]

                        20: OpLoad: Float: tmp20 << gl_MultiTexCoord0[0]

                        21: OpAccessChain: Float*: gl_MultiTexCoord0[1]

                        22: OpLoad: Float: tmp22 << gl_MultiTexCoord0[1]

                        24: OpCompositeConstruct: FloatVector2: tmp24 << tmp20, tmp22

                        26: OpAccessChain: FloatVector4*: gl_TexCoord[0]

                        27: OpLoad: FloatVector4: tmp27 << gl_TexCoord[0]

                        28: OpVectorShuffle: FloatVector4: tmp28 << tmp27, tmp24, 4, 5, 2, 3

                            OpStore: : tmp28 >> gl_TexCoord[0]

                        31: OpLoad: FloatVector4: tmp31 << gl_Vertex

                            OpStore: : tmp31 >> gl_Position

                            OpReturn: 







Generating the compiled code...

ERROR: An exception occurred during compilation: 

ERROR: Assertion failed: px != 0, in T* boost::shared_ptr<T>::operator->() const [with T = GPUProg::CGRegisterAlloc], defined in /SDK/local/common/include/boost/shared_ptr.hpp in line 253

Intermediate disassembly (pre optimization):

Program Type: Vertex

Input Variables:

offset: 0, size: 16, FloatVector4 gl_Vertex

offset: 1, size: 16, FloatVector4 gl_MultiTexCoord0



Output Variables:

offset: 32, size: 16, FloatVector4 gl_TexCoord[1]



Special Output Variables:

offset: 12, size: 16, FloatVector4 gl_Position BuiltIn(Position)



Constants:

UInt32 const9: 1

Int32 const14: 0

UInt32 const17: 0



Instructions:

V_ADD_I32 vDst(VGPR0) src0(SGPR2) src1(VGPR0) // VOP2

# Void main()

Function: Void main()

# lb5

Label: lb5

#         19: OpAccessChain: Float*: gl_MultiTexCoord0[0]

#         20: OpLoad: Float: tmp20 << gl_MultiTexCoord0[0]

S_LOAD_DWORDX4_IMM offset(4) sBase(SGPR[6:7]) sDst(SGPR[8:11])

S_WAITCNT 0 

BUFFER_LOAD_FORMAT_X offset(0) offEn(0) idxEn(1) glc(0) addr64(0) lds(0) vAddr(VGPR[0:1]) vData(VGPR2) srSrc(SGPR[8:11]) slc(0) tfe(0) sOffset(0)

S_WAITCNT 0 

#         21: OpAccessChain: Float*: gl_MultiTexCoord0[1]

#         22: OpLoad: Float: tmp22 << gl_MultiTexCoord0[1]

S_LOAD_DWORDX4_IMM offset(4) sBase(SGPR[6:7]) sDst(SGPR[12:15])

S_WAITCNT 0 

BUFFER_LOAD_FORMAT_XYZW offset(0) offEn(0) idxEn(1) glc(0) addr64(0) lds(0) vAddr(VGPR[0:1]) vData(VGPR4) srSrc(SGPR[12:15]) slc(0) tfe(0) sOffset(0)

S_WAITCNT 0 

V_MOV_B32 vDst(VGPR3) src0(VGPR5)

#         24: OpCompositeConstruct: FloatVector2: tmp24 << tmp20, tmp22

V_MOV_B32 vDst(VGPR8) src0(VGPR2)

V_MOV_B32 vDst(VGPR9) src0(VGPR3)

#         26: OpAccessChain: FloatVector4*: gl_TexCoord[0]

#         27: OpLoad: FloatVector4: tmp27 << gl_TexCoord[0]



Performing GPU-specific optimization...

Pre register allocation control-flow processing...

Intermediate disassembly (pre register allocation):

Program Type: Vertex

Input Variables:

offset: 0, size: 16, FloatVector4 gl_Vertex

offset: 1, size: 16, FloatVector4 gl_MultiTexCoord0



Output Variables:

offset: 32, size: 16, FloatVector4 gl_TexCoord[1]



Special Output Variables:

offset: 12, size: 16, FloatVector4 gl_Position BuiltIn(Position)



Constants:

UInt32 const9: 1

Int32 const14: 0

UInt32 const17: 0



Instructions:

V_ADD_I32 vDst(VGPR0) src0(SGPR2) src1(VGPR0) // VOP2

# Void main()

Function: Void main()

# lb5

Label: lb5

#         19: OpAccessChain: Float*: gl_MultiTexCoord0[0]

#         20: OpLoad: Float: tmp20 << gl_MultiTexCoord0[0]

S_LOAD_DWORDX4_IMM offset(4) sBase(SGPR[6:7]) sDst(SGPR[8:11])

S_WAITCNT 0 

BUFFER_LOAD_FORMAT_X offset(0) offEn(0) idxEn(1) glc(0) addr64(0) lds(0) vAddr(VGPR[0:1]) vData(VGPR2) srSrc(SGPR[8:11]) slc(0) tfe(0) sOffset(0)

S_WAITCNT 0 

#         21: OpAccessChain: Float*: gl_MultiTexCoord0[1]

#         22: OpLoad: Float: tmp22 << gl_MultiTexCoord0[1]

S_LOAD_DWORDX4_IMM offset(4) sBase(SGPR[6:7]) sDst(SGPR[12:15])

S_WAITCNT 0 

BUFFER_LOAD_FORMAT_XYZW offset(0) offEn(0) idxEn(1) glc(0) addr64(0) lds(0) vAddr(VGPR[0:1]) vData(VGPR4) srSrc(SGPR[12:15]) slc(0) tfe(0) sOffset(0)

S_WAITCNT 0 

V_MOV_B32 vDst(VGPR3) src0(VGPR5)

#         24: OpCompositeConstruct: FloatVector2: tmp24 << tmp20, tmp22

V_MOV_B32 vDst(VGPR8) src0(VGPR2)

V_MOV_B32 vDst(VGPR9) src0(VGPR3)

#         26: OpAccessChain: FloatVector4*: gl_TexCoord[0]

#         27: OpLoad: FloatVector4: tmp27 << gl_TexCoord[0]

ERROR: An exception occurred during compilation: 

ERROR: Assertion failed: px != 0, in T* boost::shared_ptr<T>::operator->() const [with T = GPUProg::CGRegisterAlloc], defined in /SDK/local/common/include/boost/shared_ptr.hpp in line 253







Done.



8/0.Work:Warp3DNova/my_tests>

I did some googling and found that this variable is used pretty often in tutorials and even in the book "OpenGL Shading Language".

Then I found that it is all described in section 5.5 of the GLSL spec, and in the end I found an explanation via Google that this "st" part is a swizzle mask, which lets you recombine your vector. The texture coordinates are four-component vectors, but the st mask selects the first two components (you can use "xy", it would be the same).

So I also tried with "xy", but it fails as well.
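
For reference, the write to ".st" shows up in the verbose log above as OpLoad, OpVectorShuffle with the selectors 4, 5, 2, 3, and OpStore. A small C sketch of what that shuffle does, assuming the usual SPIR-V rule that the selectors index into the first vector followed by the second (vector_shuffle and write_st are my own names for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of OpVectorShuffle: out[i] is picked from the concatenation
   of v1 (indices 0..n1-1) and v2 (indices n1..n1+n2-1). */
static void vector_shuffle(const float *v1, size_t n1,
                           const float *v2, size_t n2,
                           const unsigned *sel, size_t n_out, float *out)
{
    for (size_t i = 0; i < n_out; ++i) {
        assert(sel[i] < n1 + n2);
        out[i] = sel[i] < n1 ? v1[sel[i]] : v2[sel[i] - n1];
    }
}

/* gl_TexCoord[0].st = vec2(s, t), expressed as load + shuffle + store. */
static void write_st(float texcoord[4], float s, float t)
{
    const float tmp24[2] = { s, t };         /* OpCompositeConstruct */
    const unsigned sel[4] = { 4, 5, 2, 3 };  /* selectors from the log */
    float tmp27[4], tmp28[4];
    for (int i = 0; i < 4; ++i) tmp27[i] = texcoord[i];  /* OpLoad */
    vector_shuffle(tmp27, 4, tmp24, 2, sel, 4, tmp28);
    for (int i = 0; i < 4; ++i) texcoord[i] = tmp28[i];  /* OpStore */
}
```

So selectors 4 and 5 take s and t from the new vec2, while 2 and 3 keep the old third and fourth components. That is why "st" and "xy" produce the same SPIR-V and fail in the same way.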

Any ideas whether this is expected to work (so it's a bug), or is it some non-implemented feature?


Daytona675x

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/21 7:43 #698

Not too shy to talk

@kas1e
The crash is obviously due to a typical programming bug in Nova, a wrong usage of boost::shared_ptr. Basically it's a classic invalid function-call-attempt on a nullptr.
Which somewhat fits into the discussion we had about potential boost misapplication some weeks ago, which has to be fixed asap.

Edited by Daytona675x on 2019/9/21 8:09:55


kas1e

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/21 17:24 #699

Home away from home

@Daniel
Yeah, I submitted a bug report about it and added your remark to the report (if you don't mind).


thellier

Re: GL4ES: another OpenGL over OpenGLES2 emulation - some tech. info and porting progress

Posted on: 2019/9/23 7:25 #700

Not too shy to talk

@Daytona675x

I explained it badly, I should have called the variable VertexSize, not VBOsize: but we agree.

[ For Wazp3D57 I have encapsulated all those VBO functions into simpler functions, so I no longer use them... ]

I was thinking that "recasting" on the fly a VBO created with 3 "fields" to a VBO with 40 "fields" might be possible (after all, Nova works so strangely that it may not check the "fields" count) as long as the global VBOsize stays the same (160). It might have made it unnecessary to change the VertexSize to a multiple of 3,
but if it doesn't work, it doesn't work...
