Pegasos2 with RadeonHD/RX via bridge

balaton

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 4/28 21:43 #181

Quite a regular

@joerg
Yes I know, maybe my wording was not clear. What the GPL prevents is one party turning GPL software into closed source software, which is what I really meant by "commercial code", so maybe that's what caused the misunderstanding. The BSD license has no such restriction, so somebody can take the code, use it in part or change it, and make it closed source; they just cannot claim they wrote it and have to keep the copyright notices to acknowledge the original developers. The GPL also requires keeping the sources free and available to everybody, and not using them in code that does not allow the same. So if you substitute "closed source code" for "commercial code", that's what I meant but failed to describe properly.

By the way, it's not the GPL code itself that is sold commercially, because the code is freely available (and no more can be charged for it than the fair cost of transferring it), but additional services around that code, like making distros or providing services based on it, and so on. But GPL software can certainly be developed commercially.

msteed

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 4/29 7:23 #182

Just popping in

@kas1e

Quote:

ok# 16 config-l@ ok# config-l@ ok# . 646011AB ok#

I think you should only execute config-l@ once, not twice. When you execute it the second time it takes the value that the first execution left on the stack (the result you're trying to see) as the address to report on, which is not what you want (and may be why you're getting strange results). You should just do:


16 config-l@ .
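To make the stack effect concrete, here is a minimal C sketch of what happens when config-l@ runs once versus twice. It is only a model: the stack helpers and `fake_config_read` (which returns a made-up value, not real config-space contents) are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Tiny model of the Forth data stack. All names here are hypothetical. */
#define STACK_MAX 8
static uint32_t stack[STACK_MAX];
static int depth = 0;

static void push(uint32_t v) { assert(depth < STACK_MAX); stack[depth++] = v; }
static uint32_t pop(void)    { assert(depth > 0); return stack[--depth]; }

/* Stand-in for a real config-space read: returns a made-up value. */
static uint32_t fake_config_read(uint32_t addr) { return addr * 3u + 1u; }

/* Models config-l@ ( addr -- value ): consumes the address, leaves the value. */
static void config_l_fetch(void) { push(fake_config_read(pop())); }

/* "16 config-l@ ." : one fetch, the printed value is the register contents. */
static uint32_t fetch_once(uint32_t addr)
{
    push(addr);
    config_l_fetch();
    return pop();
}

/* "16 config-l@ config-l@ ." : the second fetch takes the first result
   as its address, so the printed value belongs to some other location. */
static uint32_t fetch_twice(uint32_t addr)
{
    push(addr);
    config_l_fetch();
    config_l_fetch();
    return pop();
}
```

With this fake read, `fetch_once(0x16)` and `fetch_twice(0x16)` return different values, which is the same effect as in the quoted session: the second result has nothing to do with offset 16.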

Quote:

(Also realised where you got 16 for the BAR address. Forth is hex by default, so decimal 16 is written as 10 in Forth.)

Good point. Forth itself is decimal by default, but OpenFirmware may not be. The value '646011AB' is clearly in hex, so the number '16' is probably taken as hex as well. You could try something like:


20 1 - .

If you get 19 then the numbers are being interpreted as decimal. If you get 1F then they're being interpreted as hex.
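Assuming the firmware behaves like standard Forth, where BASE controls both how numbers are parsed and how `.` prints them, the two possible outcomes of this test can be reproduced with a small C sketch. The function name is made up for the illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Evaluate "20 1 -" the way a Forth system would for a given BASE:
   parse both numbers in that base, subtract, print in the same base. */
static void eval_20_minus_1(int base, char *out, size_t out_size)
{
    assert(base == 10 || base == 16);
    long a = strtol("20", NULL, base);
    long b = strtol("1", NULL, base);
    snprintf(out, out_size, base == 16 ? "%lX" : "%ld", a - b);
}
```

In base 10 this yields the string 19, in base 16 it yields 1F (0x20 is decimal 32, minus 1 is 31).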

kas1e

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 4/29 12:16 #183

Home away from home

@msteed
Quote:

I think you should only execute config-l@ once, not twice. When you execute it the second time it takes the value that the first execution left on the stack (the result you're trying to see) as the address to report on, which is not what you want (and may be why you're getting strange results). You should just do:

16 config-l@ .

Strangely, it returns 0 no matter where I am, in /pci or in /pci/pci@7 (the bridge), see:


ok cd /pci
ok pwd
/pci@80000000
ok 16 config-l@ .
0
ok cd pci@7
ok pwd
/pci@80000000/pci@7
ok 16 config-l@ .
0
ok

At first I thought the issue might be specific to the bridge, but the same happens if I go to any PCI-based directory: always zero.

Quote:

If you get 1F then they're being interpreted as hex.

Thanks for the pointer, it is hex:


ok 20 1 - .
1F
ok

Join us to improve dopus5!
AmigaOS4 on youtube

joerg

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 4/29 16:21 #184

Just can't stay away

@kas1e
With hex you have to use


10 config-l@ .

instead of 16 to get the first BAR.
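For reference, in the standard PCI configuration header the BARs start at offset 0x10 (decimal 16) and are 4 bytes apart. A short C sketch of that layout follows; the helper names are invented here.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_BASE_ADDRESS_0 0x10u  /* offset of BAR0 in the standard PCI header */

/* Config-space offset of BAR n (0..5 for a normal device, 0..1 for a bridge). */
static uint8_t bar_offset(unsigned n)
{
    assert(n < 6);
    return (uint8_t)(PCI_BASE_ADDRESS_0 + 4u * n);
}

/* A 32-bit config read is only meaningful at a dword-aligned offset. */
static int is_dword_aligned(uint8_t offset) { return (offset & 3u) == 0; }
```

So with hex input, "16" means offset 0x16 (decimal 22), which lands in the middle of BAR1 (0x14 to 0x17) and is not dword-aligned, while "10" is BAR0.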

kas1e

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 4/30 6:53 #185

Home away from home

@Joerg
Same 0 as expected :( (there should be values anyway, even with the wrong 16). The question is: which BAR did we try to read with this "16", I mean the first BAR of what, if the result is the same no matter which directory (root /pci, /pci/pci@7, or any other pci node) I happen to be in? It looks like just a general offset of 16 into the whole PCI thing, but we need the first BAR of the card behind the bridge: how do we specify/calculate that?
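As far as I know, in the Open Firmware PCI binding the number given to config-l@ is not only a register offset: the upper bits carry the bus, device and function numbers, so a bare 10 would address bus 0, device 0, function 0 regardless of the current directory. Whether this firmware follows the binding exactly is an assumption. A C sketch of that packing, with a made-up function name:

```c
#include <assert.h>
#include <stdint.h>

/* Pack bus/device/function/register into one config address:
   bits 23..16 bus, 15..11 device, 10..8 function, 7..0 register. */
static uint32_t pci_config_addr(unsigned bus, unsigned dev, unsigned fn, unsigned reg)
{
    assert(bus < 256 && dev < 32 && fn < 8 && reg < 256);
    return ((uint32_t)bus << 16) | ((uint32_t)dev << 11) | ((uint32_t)fn << 8) | reg;
}
```

For example, BAR0 of device 7 on bus 0 would be `pci_config_addr(0, 7, 0, 0x10)`, i.e. 3810 in hex. A card behind the bridge sits on the bridge's secondary bus; that bus number is stored in the bridge's own config space at offset 0x19 of the standard bridge header, so it would have to be read from there rather than guessed.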

Edited by kas1e on 2024/4/30 7:55:02

kas1e

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 7/2 10:27 #186

Home away from home

@Sailor
It took a while, but I finally received the AGP-to-PCI(e) adapter we talked about on the first page, from here: https://www.ebay.com/itm/125723908233

I took 2 just in case, but I will probably test them only once Hans has dealt with the ordinary bridge, as there is a chance that with this kind of adapter and the tests around it my Pegasos2 will burn and die. So first I want to finish testing what Hans is working on, and then I will test this adapter.

@All
Hans is progressing pretty well with replacing the RTAS method the OS4 kernel uses for PCI access on the Pegasos2, which does not work with bridges, with direct PCI reads/writes. There are currently a few bits left to fix before he can come up with something, but at least we surely have the correct video memory addresses now. What remains is to fix some register reads, and then there is a high chance it will work! Rise of Frankenstein!

joerg

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 7/2 10:39 #187

Just can't stay away

@kas1e
Quote:

It took a while, but I finally received the AGP-to-PCI(e) adapter we talked about on the first page, from here: https://www.ebay.com/itm/125723908233

It's not PCIe but PCI-X!

kas1e

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 7/2 11:13 #188

Home away from home

@joerg
Right, but I mean to put the PCI-to-PCIe bridge into this adapter, so as to use the AGP slot for double the speed (as AGP in the Peg2 is PCI, just at 66 MHz instead of 33).

Edited by kas1e on 2024/7/2 11:32:24

kas1e

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 7/12 6:29 #189

Home away from home

@All
Finally tried this adapter: https://www.ebay.com/itm/125723908233

Tried it with both bridges, the Pericom and the PEX one. In both cases, when I go to pci@C0000000 (the AGP area) in the firmware, all I have is pci@8, whose properties show that it is a bridge (so both bridges are detected correctly via the adapter), but neither sees the graphics card plugged into it.

One time I was lucky (probably the adapter was badly seated or something), and instead of just pci@8 in pci@C0000000 I had about 20 different pci nodes (pci@1, pci@2, pci@3, etc.), among which the card was detected! (both the audio and video parts). But that was just one time, and no matter how hard I tried, I could not reproduce it. Whereas when the bridge is in a plain PCI slot without adapters, everything is fine and the firmware detects it.

So the conclusion is probably: the missing "lock" signal is what makes it not work. The one-time detection was probably some glitch in the handling of this lock signal or something.

kas1e

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 7/24 7:13 #190

Home away from home

@All

Did I understand correctly that on the Pegasos2 we have PCI slots which are 32-bit at 33 MHz, and an AGP slot which is in reality the same 32-bit PCI, just at 66 MHz instead of 33 MHz, and that's the whole difference?

If so, did I get it right that the maximum of the PCI bus is 133.33 MB/s, while AGP (in our case 66 MHz PCI) is 266 MB/s? I.e., with a PCI-to-PCIe bridge, we can only reach the limit of the PCI bus, which is 133 MB/s?
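Those two figures follow directly from bus width times clock. A minimal C sketch of the arithmetic, under the premise stated above that the AGP slot behaves like 32-bit PCI at 66 MHz, and ignoring all protocol overhead (the function name is made up):

```c
#include <assert.h>

/* Theoretical peak of a parallel PCI-style bus in MB/s (1 MB = 10^6 bytes):
   one transfer of width_bits per clock, ignoring all protocol overhead. */
static double peak_mb_per_s(unsigned width_bits, double clock_mhz)
{
    assert(width_bits % 8 == 0 && clock_mhz > 0.0);
    return (width_bits / 8) * clock_mhz;
}
```

`peak_mb_per_s(32, 33.33)` gives about 133.3 MB/s and `peak_mb_per_s(32, 66.66)` about 266.6 MB/s. Real transfers stay below these peaks because address phases, wait states and arbitration also take bus cycles.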

What I mean is that so far I have tested my Radeon 9250 in the AGP (so 32-bit PCI at 66 MHz) slot with gfxbench, and got these results:

Copy from RAM to VRAM:
Transfer size: 16296960 bytes
Src: 0x63c3f000, Dest: 0xc27a4160
copy32: 105.892 MiB/s (took 0.146772 seconds)
copy64: 107.069 MiB/s (took 0.145159 seconds)
copy64f: 135.885 MiB/s (took 0.114376 seconds)
copy64x2: 107.573 MiB/s (took 0.144479 seconds)
copy64fx2: 134.772 MiB/s (took 0.115321 seconds)
copy64fx2PF: 141.174 MiB/s (took 0.110091 seconds)
copy64fx4PF: 140.885 MiB/s (took 0.110317 seconds)
useMemcpy: 53.291 MiB/s (took 0.291645 seconds)
useExecCopyMem: 53.431 MiB/s (took 0.290878 seconds)
copyToVRAM: 207.152 MiB/s (took 0.075027 seconds)
WritePixelArray: 216.664 MiB/s (took 0.071733 seconds).

As far as I can see, only WritePixelArray almost hits the limit of our AGP (216 MiB/s = ~226 MB/s, while the limit is 266 MB/s). But copy32, copy64 and the rest are 2 times slower than the limit.
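Since gfxbench reports MiB/s while the bus limit is usually quoted in MB/s, the comparison needs a unit conversion. A small C sketch, with helper names invented for this post:

```c
#include <assert.h>

/* Convert MiB/s (2^20 bytes) to MB/s (10^6 bytes). */
static double mib_to_mb(double mib_per_s)
{
    assert(mib_per_s >= 0.0);
    return mib_per_s * 1048576.0 / 1000000.0;
}

/* Fraction of a peak rate (in MB/s) reached by a measured rate in MiB/s. */
static double fraction_of_peak(double mib_per_s, double peak_mb_per_s)
{
    assert(peak_mb_per_s > 0.0);
    return mib_to_mb(mib_per_s) / peak_mb_per_s;
}
```

With the numbers above, WritePixelArray at 216.664 MiB/s is about 227 MB/s, roughly 85% of a 266.67 MB/s peak, while copy32 at 105.892 MiB/s reaches roughly 42%.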

Does it mean the Radeon 9250 just can't reach the AGP bus maximum in some cases?

Then:

Copy from VRAM to RAM:
Transfer size: 16296960 bytes
Src: 0xc27a4160, Dest: 0x63c3f000
copy32: 37.399 MiB/s (took 0.415567 seconds)
copy64: 35.307 MiB/s (took 0.440196 seconds)
copy64f: 49.172 MiB/s (took 0.316077 seconds)
useMemcpy: 36.144 MiB/s (took 0.429999 seconds)
useExecCopyMem: 35.728 MiB/s (took 0.435004 seconds)
copyFromVRAM: 50.216 MiB/s (took 0.309504 seconds)
ReadPixelArray: 48.550 MiB/s (took 0.320125 seconds).

This one does not hit the limits of AGP at all, as all the values are 5 times lower than the AGP limit.

Is that again because the Radeon 9250 can't reach the AGP limit, or is it AmigaOS itself and its kernel/driver/graphics.library causing the issue?

Basically, if I got it right, with the PCI bridge in a PCI (33 MHz) slot, no matter what graphics card we use, WritePixelArray can reach ~130 MiB/s at most, but copy from VRAM to RAM could be the same at worst, or faster, up to 130 MB/s, in all tests, as even with the Radeon 9250 they did not hit the limit.

Did I understand all that correctly?

Hans

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 7/24 14:30 #191

Home away from home

@kas1e

Reads are slower because they involve sending a request to the card, and then receiving the response (i.e., the returned value) from the card. This is inherently slower than shoveling data to the card.

DMA transfers can reduce the overhead by sending data in large blocks, so you need much fewer requests.
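A rough way to see the per-request cost is to turn the benchmark figures into an average time per access. The C sketch below does that; it assumes each copy32 iteration results in exactly one 4-byte bus transaction, which may not hold on real hardware, and the function name is made up.

```c
#include <assert.h>

/* Average time per access in nanoseconds, given the throughput in MiB/s and
   the number of bytes moved by each access. */
static double ns_per_access(double mib_per_s, unsigned bytes_per_access)
{
    assert(mib_per_s > 0.0 && bytes_per_access > 0);
    double bytes_per_s = mib_per_s * 1048576.0;
    return bytes_per_access / bytes_per_s * 1e9;
}
```

Using the Radeon 9250 copy32 results from earlier in the thread, `ns_per_access(37.399, 4)` is about 102 ns per read, versus about 36 ns per write for `ns_per_access(105.892, 4)`.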

Hans

Join Kea Campus' Amiga Corner and support Amiga content creation
https://keasigmadelta.com/ - see more of my work

joerg

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 7/24 15:11 #192

Just can't stay away

@kas1e
Quote:

As far as I can see, only WritePixelArray almost hits the limit of our AGP (216 MiB/s = ~226 MB/s, while the limit is 266 MB/s). But copy32, copy64 and the rest are 2 times slower than the limit.

If you have a G4 CPU WritePixelArray and useExecCopyMem use AltiVec transferring 128 bits at a time, copy64f the FPU with 64 bits at a time, copy64 probably 2 * 32 bit integers and copy32 32 bit integer accesses.
AFAIK copyToVRAM and copyFromVRAM use AltiVec as well.
Each access to VRAM has PCI overhead, more bits transferred per access results in faster speeds.
The useMemcpy and useExecCopyMem results are much slower than they should be, but I don't know what the problem is.

@Hans
Quote:

DMA transfers can reduce the overhead by sending data in large blocks, so you need much fewer requests.

On Classic Amigas, AmigaOne and Pegasos2 there is no OS DMA copy (graphics (Read|Write)PixelArray(), exec CopyMemQuick(), etc.), only Sam4x0, X1000, X5000 and maybe A1222 have DMA based copy functions.
AFAIK GART is disabled in your drivers on AmigaOne and Pegasos2 as well, therefore no DMA at all.

Depending on the CPU the OS copy functions may use AltiVec on AmigaOne and Pegasos2, but for gfx card VRAM accesses that can only be about twice as fast as FPU based copy functions, if at all.
The DRAM read part of an AltiVec based copy between DRAM and VRAM should be more than twice as fast as a FPU one, but DRAM writes shouldn't make a difference (using DCBA or DCBZ with FPU writes is about the same speed as AltiVec writes using the streaming instructions).

Edited by joerg on 2024/7/24 15:30:50

kas1e

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 7/29 19:28 #193

Home away from home

@All
First, the very good news: Hans did it! After replacing the RTAS way of accessing PCI registers in the Pegasos2 kernel with direct access, and fixing some issues in the process, we were able to get both RadeonHD and RadeonRX to work via a PCI-to-PCIe bridge!

So, good news first: everything works. Hardware video acceleration via the VA library, GL4ES, Warp3DNova, ogles2.library, etc. While I am making a big video about it, see the short one as a little teaser:

Then the bad news: while copies from RAM to VRAM are slow (2-3 times slower than the Radeon 9250 in the AGP slot), copies from VRAM to RAM are abnormally bad: 25(!) times slower than the Radeon 9250 in AGP.

Yes, what you see in the video is the VA library and the Spencer game in use, which seem to transfer data in "large enough" blocks (or so) and do not suffer much from those small operations. But when you use Workbench you can see that some operations (like scrolling icons in a directory) are quite a bit slower (while moving a window with transparency is very fast).

I tested 2 bridges:

One from startech.com with "Pericom - PI7C9X111SL" chipset: https://www.startech.com/en-us/cards-adapters/pci1pex1

And another one with the PLX PEX8112 chip, named "PCI/PCI Express X16 adapter PXE8112", like this one: https://www.amazon.co.uk/Mumuve-Expres ... ter-Pxe8112/dp/B0D2D84M7C

The Pericom PI7C9X111SL, while it also suffers from those speed issues, is still a few times faster than the PEX8112-based one. Dunno what the reasons are, but that's how it is. It feels almost "OK", but not enough to be called fluid, while the PLX one is a real pain.

I made a small graph of the copies to/from RAM/VRAM to show visually how it all looks in one table, but you can also take the 3 gfxbench files directly to see everything:

Copy from RAM to VRAM:

                    Radeon 9250 AGP    Pericom PI7C9X111SL bridge    PLX PEX8112-AA66BI F Bridge
copy32:               105.892 MiB/s                  43.034 MiB/s                   16.262 MiB/s
copy64:               107.069 MiB/s                  43.017 MiB/s                   16.251 MiB/s
copy64f:              135.885 MiB/s                  58.078 MiB/s                   21.570 MiB/s
copy64x2:             107.573 MiB/s                  43.008 MiB/s                   16.159 MiB/s
copy64fx2:            134.772 MiB/s                  58.235 MiB/s                   21.415 MiB/s
copy64fx2PF:          141.174 MiB/s                  58.149 MiB/s                   21.404 MiB/s
copy64fx4PF:          140.885 MiB/s                  58.139 MiB/s                   21.410 MiB/s
useMemcpy:             53.291 MiB/s                  21.259 MiB/s                    8.135 MiB/s
useExecCopyMem:        53.431 MiB/s                  21.436 MiB/s                    8.130 MiB/s
copyToVRAM:           207.152 MiB/s                  77.449 MiB/s                   35.642 MiB/s
WritePixelArray:      216.664 MiB/s                  70.913 MiB/s                   30.078 MiB/s

Copy from VRAM to RAM:

                    Radeon 9250 AGP    Pericom PI7C9X111SL bridge    PLX PEX8112-AA66BI F Bridge
copy32:                37.399 MiB/s                   2.764 MiB/s                    1.891 MiB/s
copy64:                35.307 MiB/s                   2.766 MiB/s                    1.891 MiB/s
copy64f:               49.172 MiB/s                   2.646 MiB/s                    1.837 MiB/s
useMemcpy:             36.144 MiB/s                   2.768 MiB/s                    1.893 MiB/s
useExecCopyMem:        35.728 MiB/s                   2.768 MiB/s                    1.893 MiB/s
copyFromVRAM:          50.216 MiB/s                   2.589 MiB/s                    1.795 MiB/s
ReadPixelArray:        48.550 MiB/s                   2.583 MiB/s                    1.793 MiB/s
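To put numbers on "how many times slower", the ratios can be computed straight from the two tables. A trivial C helper for that (the name is made up):

```c
#include <assert.h>

/* How many times slower `bridge` is than `agp` (same units for both). */
static double slowdown(double agp, double bridge)
{
    assert(agp > 0.0 && bridge > 0.0);
    return agp / bridge;
}
```

For WritePixelArray, `slowdown(216.664, 70.913)` is about 3.1 for the Pericom bridge and `slowdown(216.664, 30.078)` about 7.2 for the PLX one. For ReadPixelArray, `slowdown(48.550, 2.583)` is about 18.8 (Pericom) and `slowdown(48.550, 1.793)` about 27.1 (PLX).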

gfxbench result files:

Radeon 9250_AGP.txt
Radeon RX PI7C9X111SL_BRIDGE.txt
Radeon RX PEX8112_BRIDGE.txt

So... now the question: wtf, and how can we improve the situation? It doesn't necessarily need to be a gazillion times faster, but at least it should not suffer from visual pauses.
Does anyone know how the bridges need to be programmed? Maybe they have some features we can enable/disable to speed things up?

I found datasheets on:

Pericom - PI7C9X111SL: https://kas1e.mikendezign.com/pegasos2/bridge/PI7C9X111SL.pdf
PLX - PEX 8112: https://kas1e.mikendezign.com/pegasos2 ... e/PEX-8112.2_20081029.pdf

Does anyone have a clue what can be done?

Thanks !

Edited by kas1e on 2024/8/4 6:48:35

m3x

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 7/29 20:06 #194

Just popping in

@kas1e

you may try to enable prefetch on the 8112, like I've done in u-boot for the Sam440-flex board:


void config_pex8112(void) {
    pci_dev_t dev;      // special configuration for PCI-PCI Express bridge PEX8112

    // 0x10b5 is the PLX vendor ID, 0x8112 the device ID; index 0 = first match
    if ((dev = pci_find_device(0x10b5,0x8112,0)) >= 0) {
        // cache line size register is in units of 32-bit words
        pci_write_config_byte(dev, PCI_CACHE_LINE_SIZE, 0x10);
        // maximum latency timer on both the primary and the secondary side
        pci_write_config_byte(dev, PCI_LATENCY_TIMER, 0xff);
        pci_write_config_byte(dev, PCI_SEC_LATENCY_TIMER, 0xff);
        pci_write_config_byte(dev, 0x48, 0x11);
        pci_write_config_byte(dev, 0x84, 0x0c); // index = 0x100c
        pci_write_config_dword(dev, 0x88, 0xcf008020); // data
    }
}
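Going only by the snippet's own comments ("index", "data"), the last two writes look like an index/data register pair: a register number goes to one config offset, and the value for that register to another. The C sketch below models that general pattern against an in-memory fake device. The struct, the helper names, and the idea that offsets 0x84/0x88 behave exactly this way are assumptions for illustration, not taken from the PEX 8112 datasheet.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW_REGS 0x2000u  /* size of the fake internal register space */

/* Fake device: an internal register file reachable only through a window. */
struct fake_bridge {
    uint32_t index;                      /* which internal register is selected */
    uint32_t internal[WINDOW_REGS / 4];  /* internal 32-bit registers */
};

/* Select an internal register (models a write to the "index" offset). */
static void write_index(struct fake_bridge *b, uint32_t index)
{
    assert(index < WINDOW_REGS && (index & 3u) == 0);
    b->index = index;
}

/* Store a value in the selected register (models a write to the "data" offset). */
static void write_data(struct fake_bridge *b, uint32_t value)
{
    b->internal[b->index / 4] = value;
}

/* Inspect an internal register directly, for checking the model only. */
static uint32_t read_internal(const struct fake_bridge *b, uint32_t index)
{
    assert(index < WINDOW_REGS && (index & 3u) == 0);
    return b->internal[index / 4];
}
```

In this model, `write_index(&b, 0x100c)` followed by `write_data(&b, 0xcf008020)` leaves that value in internal register 0x100c and nothing else changes. Note the original snippet writes only the byte 0x0c to the index offset, so it presumably relies on the upper index bits already holding 0x10.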

Max Tretene, ACube Systems Srl, Soft3

kas1e

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 7/29 20:27 #195

Home away from home

@max
Interesting! Do you have any numbers on how much the speed increased on the Sam440 with gfx cards once prefetch was enabled in the PEX bridge?

smf

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 7/29 21:29 #196

Just popping in

@kas1e

I can't help you but cool progress!

This makes me wanna keep my peg2 for a little bit longer.

m3x

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 7/29 21:46 #197

Just popping in

@kas1e

There were some improvements in read/write benchmarks, but I don't remember the exact figures at the moment.

Hans

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 7/30 12:33 #198

Home away from home

@m3x

Looks like the driver's PEX bridge code needs updating. The driver enabled blind prefetching. But, if your little code snippet can increase the read transfer rates, then the driver's bridge code is sub-optimal.

BTW, what exactly does this code do:


pci_write_config_byte(dev, 0x48, 0x11);
pci_write_config_byte(dev, 0x84, 0x0c); // index = 0x100c
pci_write_config_dword(dev, 0x88, 0xcf008020); // data

Hans

kas1e

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 7/30 18:47 #199

Home away from home

@Hans,Max
Quote:

then the driver's bridge code

Did I understand right that somewhere in OS4 we have a driver for the PLX bridge? (That's a surprise to me.) Or a driver for the Pericom bridge? I was under the impression that the "bridge driver" in OS4 is just some kernel code that handles bridges in a "generic" way: the same code for any PCI-to-PCIe bridge, living in the kernel, so that it only matters how the firmware (?) configures the bridge?

What is stranger to me is how much slower the PLX bridge is compared with the Pericom one. It's as if we have no bridge-specific configuration code at all and the kernel just does the "usual" stuff, and the Pericom simply has better default values while the PLX one does not.

joerg

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 7/31 6:38 #200

Just can't stay away

@kas1e
Quote:

Did I understand right that somewhere in OS4 we have a driver for the PLX bridge? (That's a surprise to me.) Or a driver for the Pericom bridge?

The Sam440ep and Sam440ep-flex include a Pericom 8150B PCI bridge.
I don't know if any code for it is required in the AmigaOS kernel (expansion.library) PCI functions, or if the initialization in U-Boot is enough, but in case special code for it is required in AmigaOS it's of course as usual only included in the kernel versions for hardware including it, i.e. the Sam4x0 kernels, not in any other kernel versions like the Pegasos2, classic Amiga, X1000, X5000, etc. ones without such hardware.
The 8150B is a PCI-to-PCI, not a PCI-to-PCIe, bridge, but if you already have code for something similar, adding support for the 8112 PCIe one probably wasn't much additional work.
The AmigaOne SE/XE/µA1 has PCI bridges as well, the only hardware without any bridge is probably the Pegasos2.

I guess what Hans meant with "Looks like the driver's PEX bridge code needs updating." is bridge code in his Radeon HD/RX drivers, not something in the AmigaOS kernel.
