Sam460ex available with AmigaOS 4.1
Posted by elwood at 20110127 23:59
ACube Systems Srl is pleased to announce the immediate release of Hyperion's AmigaOS 4.1 Update 2 for Sam460ex. Boards and installation CDs are now shipping to resellers.
Sinan  Re: Other SATA-Controllers  20110130 22:17  #1586
@kas1e

Here it is:

http://hotfile.com/dl/100677180/cdbb18c/Sam3.divx.html

kas1e  Re: Other SATA-Controllers  20110130 23:37  #1593
@Sinan

Thanks again. From that I can say that the Sam460 is a little faster, but only a very little. Well, maybe CPU speed itself plays a big role as well. That could explain why it's only a bit faster: I have a 1 GHz CPU on the Peg2, but with a bigger L2 cache, while the Sam460 has a faster CPU but a smaller L2 cache, which results in the same CPU speed in real apps.

But it's strange: I thought the new video bus speed and all the other stuff would give a bigger speed-up. Well, from this test alone we can't say too much, only that "the Sam460 with all the stuff is a bit faster than the Peg2".

Maybe a plain SDL test then? :) (just to make a more exact benchmark) There is an archive. Just unpack it, reboot, run the "bench" binary from the shell, and post all the output here.

(It's all annoying of course, but just to find the truth.) Thanks for
Sinan  Re: Other SATA-Controllers  20110201 19:53  #1682
@kas1e

Here are the results of bench utility:

Pitch = 640
Hardware surfaces avail = 1
Window manager avail = 1
Blitter hardware = 1
Colorkey blit hardware = 0
Alpha blit hardware = 0
Software->Hardware accel = 0
Video memory = 0

                            320x240    320x240    640x480    640x480
                            software   hardware   software   hardware
Slow points (frames/sec):   2.1036     126.984    0.264454   32.1285
Fast points (frames/sec):   207.12     64.4512    52.2876    16.125
Rect fill (rects/sec):      6311.25    66064.5    1629.28    20686.9
32x32 blits (blits/sec):    22755.6    75851.9    23011.2    75851.9

kas1e  Re: Other SATA-Controllers  20110201 20:12  #1685
@Sinan

Hm! Here are mine:

                            320x240    320x240    640x480    640x480
                            software   hardware   software   hardware
Slow points (frames/sec):   12.0301    222.222    1.56464    58.3942
Fast points (frames/sec):   711.111    118.738    181.689    29.8368
Rect fill (rects/sec):      20898      120471     6481.01    58514.3
32x32 blits (blits/sec):    54613.3    157538     53894.7    157538

Mine are much, much faster. Do you have the very latest sdl.so? (Here it is.) And did you run the test after a reboot?
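
To put a number on "much faster", the two result tables can be divided column by column. The sketch below only reuses the figures posted above (kas1e's Peg2 run and Sinan's first Sam460 run); the dictionary and function names are made up for illustration.

```python
# Figures copied from the two "bench" outputs posted above.
# Columns: 320x240 software, 320x240 hardware, 640x480 software, 640x480 hardware.
SAM460 = {
    "Slow points": (2.1036, 126.984, 0.264454, 32.1285),
    "Fast points": (207.12, 64.4512, 52.2876, 16.125),
    "Rect fill": (6311.25, 66064.5, 1629.28, 20686.9),
    "32x32 blits": (22755.6, 75851.9, 23011.2, 75851.9),
}
PEG2 = {
    "Slow points": (12.0301, 222.222, 1.56464, 58.3942),
    "Fast points": (711.111, 118.738, 181.689, 29.8368),
    "Rect fill": (20898.0, 120471.0, 6481.01, 58514.3),
    "32x32 blits": (54613.3, 157538.0, 53894.7, 157538.0),
}


def speedups(fast, slow):
    """Return, for each test, the ratio fast/slow for every column."""
    return {
        name: tuple(f / s for f, s in zip(fast[name], slow[name]))
        for name in fast
    }


if __name__ == "__main__":
    for name, ratios in speedups(PEG2, SAM460).items():
        print(f"{name:12s}", "  ".join(f"{r:5.2f}x" for r in ratios))
```

With these figures every ratio comes out above 1, so the Peg2 run is ahead in all sixteen cells, not only in the software columns.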
Sinan  Re: Other SATA-Controllers  20110201 20:23  #1686
@kas1e

With this version of libsdl, the results look the same.

Is the first part the same for you? Colorkey blit hardware? Alpha blit hardware?
The RadeonHD driver isn't as finished as the Radeon 9250 one. That may be one of the reasons.
I guess Hans can comment...

Pitch = 640
Hardware surfaces avail = 1
Window manager avail = 1
Blitter hardware = 1
Colorkey blit hardware = 0
Alpha blit hardware = 0
Software->Hardware accel = 0
Video memory = 0

                            320x240    320x240    640x480    640x480
                            software   hardware   software   hardware
Slow points (frames/sec):   2.10194    125        0.264297   32.1285
Fast points (frames/sec):   206.952    64.3863    52.2769    16.125
Rect fill (rects/sec):      6282.21    66064.5    1625.4     20686.9
32x32 blits (blits/sec):    22382.5    75851.9    22260.9    75851.9
kas1e  Re: Other SATA-Controllers  20110201 20:34  #1687
@Sinan

My whole output looks like this:



11/0.RAM Disk:bench-1.0> bench
Mode = 320x240, software
Pitch = 320
Hardware surfaces avail = 1
Window manager avail = 1
Blitter hardware = 1
Colorkey blit hardware = 0
Alpha blit hardware = 0
Software->Hardware accel = 0
Video memory = 0

Slow points test
Fast points test
Rect fill test
32x32 Blitter test
Mode = 320x240, hardware
Slow points test
Fast points test
Rect fill test
32x32 Blitter test
Mode = 640x480, software
Slow points test
Fast points test
Rect fill test
32x32 Blitter test
Mode = 640x480, hardware
Slow points test
Fast points test
Rect fill test
32x32 Blitter test
                            320x240    320x240    640x480    640x480
                            software   hardware   software   hardware
Slow points (frames/sec):   12.2511    222.222    1.58103    58.3942
Fast points (frames/sec):   711.111    118.683    181.947    29.8334
Rect fill (rects/sec):      20686.9    124121     6410.02    58514.3
32x32 blits (blits/sec):    55351.4    151704     53894.7    151704

11/0.RAM Disk:bench-1.0>

Sinan  Re: Other SATA-Controllers  20110201 22:48  #1695
@kas1e

The only difference is that Pitch = 640 on my side, while it is 320 on yours...
hans  Re: Other SATA-Controllers  20110202 00:21  #1708
@kas1e

There are a few things to bear in mind with these benchmarks. Firstly, Radeon HD cards have more per-blit overhead than the Radeon X1000 series and older. As a result, testing small 32x32 blits is likely to favour the older cards. I also have no idea what the SDL port is doing.

Secondly, there is a greater performance hit with PCIe for non-burst transfers. This is particularly bad for read-modify-write operations. Burst transfers mean transferring blocks of contiguous data, preferably with DMA. For this reason I am encouraging people to use blitter/compositing/3D operations instead of software rendering operations. If that isn't possible, then render to a bitmap in main memory, and copy it over, or perform direct writes if appropriate. If that is also not possible, then copy a bitmap region to main memory, update it, and copy it back. Do not perform read-modify-write operations on pixels in VRAM.
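
The difference between the two access patterns can be sketched with a toy model. This is only an illustration: `ToyVRAM` and both functions are hypothetical names, not an AmigaOS or Picasso96 API, and the model counts bus transactions rather than measuring real PCIe timings.

```python
class ToyVRAM:
    """Toy stand-in for video memory that counts bus transactions.

    Hypothetical model for illustration only; it is not a real driver API.
    """

    def __init__(self, size):
        self.data = bytearray(size)
        self.single_reads = 0   # non-burst reads, one per pixel
        self.single_writes = 0  # non-burst writes, one per pixel
        self.block_copies = 0   # burst transfers of contiguous data

    def read(self, i):
        self.single_reads += 1
        return self.data[i]

    def write(self, i, value):
        self.single_writes += 1
        self.data[i] = value

    def copy_out(self, start, length):
        self.block_copies += 1
        return bytearray(self.data[start:start + length])

    def copy_in(self, start, block):
        self.block_copies += 1
        self.data[start:start + len(block)] = block


def brighten_in_vram(vram, start, length, amount):
    """Discouraged pattern: read-modify-write each pixel directly in VRAM."""
    for i in range(start, start + length):
        vram.write(i, min(255, vram.read(i) + amount))


def brighten_via_main_memory(vram, start, length, amount):
    """Fallback pattern: copy the region out, update it in RAM, copy it back."""
    region = vram.copy_out(start, length)
    for i in range(length):
        region[i] = min(255, region[i] + amount)
    vram.copy_in(start, region)
```

Both functions leave the same pixels behind. The first one issues one read and one write across the bus per pixel, while the second needs two block transfers regardless of the region size.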

Hans

hans  Graphics benchmarks  20110202 00:39  #1709
@kas1e

Here are some preliminary results from a benchmark tool that I'm writing:


Radeon 9000 pro A1-XE G4:
Opening screen: P96-0:Radeon 9000:1920x1080

Copy from RAM to VRAM:
Transfer size: 1048576 bytes
Src: 0x5eb47000, Dest: 0x8139d200
copy32: 112.676 MiB/s (took 0.008875 seconds)
copy64: 113.572 MiB/s (took 0.008805 seconds)
copy64f: 113.161 MiB/s (took 0.008837 seconds)
copy64x2: 112.727 MiB/s (took 0.008871 seconds)
copy64fx2: 110.828 MiB/s (took 0.009023 seconds)
copy64fx2PF: 113.714 MiB/s (took 0.008794 seconds)
copy64fx4PF: 113.766 MiB/s (took 0.008790 seconds)
useMemcpy: 48.172 MiB/s (took 0.020759 seconds)
useExecCopyMem: 45.450 MiB/s (took 0.022002 seconds)
copyToVRAMNoAltivec: 113.649 MiB/s (took 0.008799 seconds)

Copy from VRAM to RAM:
Transfer size: 1048576 bytes
Src: 0x8139d200, Dest: 0x5eb47000
copy32: 18.320 MiB/s (took 0.054585 seconds)
copy64: 18.628 MiB/s (took 0.053684 seconds)
copy64f: 24.771 MiB/s (took 0.040370 seconds)
useMemcpy: 28.145 MiB/s (took 0.035530 seconds)
useExecCopyMem: 20.392 MiB/s (took 0.049038 seconds)
copyFromVRAMNoAltivec: 28.270 MiB/s (took 0.035373 seconds)

FillRect:
Size    	Time (s) 	Ops/s    	MPixel/s 
(16, 16)	    0.026344	  379593.076	      92.674
(32, 32)	    0.052177	  191655.327	     187.163
(64, 64)	    0.142935	   69961.871	     273.289
(128, 128)	    0.360604	   27731.251	     433.301
(256, 256)	    0.969295	   10316.777	     644.799
(512, 512)	    3.165158	    3159.400	     789.850
(1024, 1024)	   12.616102	     792.638	     792.638
(1920, 1061)	   23.213250	     430.788	     836.914

... Sorry, no blit operation results for the Radeon 9000 as this was obtained with an older version ...


Radeon X1950 pro Sam 460ex:
Opening screen: P96-0:Radeon X195:1920x1080

Copy from RAM to VRAM:
Transfer size: 16327680 bytes
Src: 0x4dcfd000, Dest: 0xb13ea8a0
copy32: 95.083 MiB/s (took 0.163766 seconds)
copy64: 95.692 MiB/s (took 0.162723 seconds)
copy64f: 160.942 MiB/s (took 0.096751 seconds)
copy64x2: 95.687 MiB/s (took 0.162731 seconds)
copy64fx2: 160.947 MiB/s (took 0.096748 seconds)
copy64fx2PF: 160.957 MiB/s (took 0.096742 seconds)
copy64fx4PF: 160.955 MiB/s (took 0.096743 seconds)
useMemcpy: 49.990 MiB/s (took 0.311491 seconds)
useExecCopyMem: 95.755 MiB/s (took 0.162616 seconds)
copyToVRAMNoAltivec: 160.959 MiB/s (took 0.096741 seconds)

Copy from VRAM to RAM:
Transfer size: 16327680 bytes
Src: 0xb13ea8a0, Dest: 0x4dcfd000
copy32: 32.812 MiB/s (took 0.474566 seconds)
copy64: 36.172 MiB/s (took 0.430480 seconds)
copy64f: 40.700 MiB/s (took 0.382585 seconds)
useMemcpy: 30.906 MiB/s (took 0.503820 seconds)
useExecCopyMem: 32.802 MiB/s (took 0.474705 seconds)
copyFromVRAMNoAltivec: 46.506 MiB/s (took 0.334824 seconds)

FillRect:
Size    	Time (s) 	Ops/s    	MPixel/s 
(16, 16)	    0.082763	  120826.940	      29.499
(32, 32)	    0.082625	  121028.744	     118.192
(64, 64)	    0.082810	  120758.363	     471.712
(128, 128)	    0.083481	  119787.736	    1871.683
(256, 256)	    0.295890	   33796.343	    2112.271
(512, 512)	    1.155281	    8655.903	    2163.976
(1024, 1024)	    4.595531	    2176.027	    2176.027
(1920, 1063)	    8.865945	    1127.911	    2195.379

BltBitMap:
Size    	Time (s) 	Ops/s    	MPixel/s 
(16, 16)	    0.054462	  183614.263	      44.828
(32, 32)	    0.054388	  183864.088	     179.555
(64, 64)	    0.054494	  183506.441	     716.822
(128, 128)	    0.096309	  103832.456	    1622.382
(256, 256)	    0.379871	   26324.726	    1645.295
(512, 512)	    1.481053	    6751.953	    1687.988
(1024, 1024)	    5.879482	    1700.830	    1700.830
(1920, 1063)	   12.302927	     812.815	    1582.072



Radeon HD 4650 Sam 460ex
Opening screen: P96-0:Radeon RV73:1920x1080

Copy from RAM to VRAM:
Transfer size: 4194304 bytes
Src: 0x4ebe6000, Dest: 0xb1392840
copy32: 78.070 MiB/s (took 0.051236 seconds)
copy64: 71.342 MiB/s (took 0.056068 seconds)
copy64f: 147.934 MiB/s (took 0.027039 seconds)
copy64x2: 78.201 MiB/s (took 0.051150 seconds)
copy64fx2: 154.208 MiB/s (took 0.025939 seconds)
copy64fx2PF: 154.267 MiB/s (took 0.025929 seconds)
copy64fx4PF: 153.722 MiB/s (took 0.026021 seconds)
useMemcpy: 47.368 MiB/s (took 0.084445 seconds)
useExecCopyMem: 78.198 MiB/s (took 0.051152 seconds)
copyToVRAMNoAltivec: 154.226 MiB/s (took 0.025936 seconds)

Copy from VRAM to RAM:
Transfer size: 4194304 bytes
Src: 0xb1392840, Dest: 0x4ebe6000
copy32: 36.445 MiB/s (took 0.109753 seconds)
copy64: 40.870 MiB/s (took 0.097871 seconds)
copy64f: 46.569 MiB/s (took 0.085894 seconds)
useMemcpy: 28.936 MiB/s (took 0.138234 seconds)
useExecCopyMem: 36.391 MiB/s (took 0.109917 seconds)
copyFromVRAMNoAltivec: 54.418 MiB/s (took 0.073505 seconds)

FillRect:
Size    	Time (s) 	Ops/s    	MPixel/s 
(16, 16)	    0.531550	   18812.906	       4.593
(32, 32)	    0.529701	   18878.575	      18.436
(64, 64)	    0.531329	   18820.731	      73.518
(128, 128)	    0.531754	   18805.688	     293.839
(256, 256)	    0.660159	   15147.866	     946.742
(512, 512)	    2.313944	    4321.626	    1080.406
(1024, 1024)	    8.925382	    1120.400	    1120.400
(1920, 1063)	   17.761982	     563.000	    1095.830

BltBitMap:
Size    	Time (s) 	Ops/s    	MPixel/s 
(16, 16)	    0.570604	   17525.289	       4.279
(32, 32)	    0.570299	   17534.662	      17.124
(64, 64)	    0.571488	   17498.180	      68.352
(128, 128)	    0.571608	   17494.507	     273.352
(256, 256)	    0.857107	   11667.155	     729.197
(512, 512)	    2.982864	    3352.483	     838.121
(1024, 1024)	   10.719018	     932.921	     932.921
(1920, 1063)	   17.708562	     564.699	    1099.136


As you can see, with larger blits, the Radeon HD 4650 beats the Radeon 9000. However, 2D blits don't test the GPU's processing power, which is where the newer cards leave the old ones in the dust.
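
For anyone reading the raw output, the throughput columns appear to follow directly from the sizes and timings. The sketch below assumes that both "MiB" and "MPixel" use 2**20 as the unit; that assumption is not stated in the post, but it reproduces the posted figures (112.676 MiB/s for copy32 and 92.674 MPixel/s for the 16x16 FillRect on the Radeon 9000). Multiplying Time by Ops/s in the tables also gives 10000 for each row, which suggests each size was timed over 10000 operations. The function names are made up for illustration.

```python
MIB = 1024 * 1024  # assumption: "MiB" and "MPixel" both mean 2**20 units


def mib_per_second(num_bytes, seconds):
    """Copy throughput from a transfer size and its duration."""
    return num_bytes / MIB / seconds


def mpixel_per_second(width, height, ops_per_second):
    """Fill or blit throughput from the rectangle size and the operation rate."""
    return width * height * ops_per_second / MIB


if __name__ == "__main__":
    # Radeon 9000, copy32 from RAM to VRAM: 1048576 bytes in 0.008875 s
    print(f"{mib_per_second(1048576, 0.008875):.3f} MiB/s")
    # Radeon 9000, FillRect (16, 16) at 379593.076 ops/s
    print(f"{mpixel_per_second(16, 16, 379593.076):.3f} MPixel/s")
```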

NOTES:
- When I did the Sam 460ex tests, my board was configured differently from what users will be getting (different clocking configuration). This also lowered the write speed compared to tests run by others.
- While the Radeon X1950 beats all of the cards in 2D operations, the Radeon HD 4650 has more processing power. The X1950 has much greater VRAM bandwidth, which is characteristic of high-end cards, and which matters most with 2D blitter operations.

I should add that I will be looking to improve performance of the Radeon HD cards after I have 2D and 3D acceleration done. There is a lot of room for improvement.

Hans

P.S. Before anyone asks, no, I'm not releasing my benchmark tool until it is finished.
amigo1  Re: Other SATA-Controllers  20110203 19:26  #1746
@samo79

I was about to post the same... well, now I did! :-)
gregthecanuck  Re: Graphics benchmarks  20110204 05:22  #1757
@hans

Thanks very much for this first round of results. It's looking really good, even for a 4650, which is a low-end-ish card. The 4650 only has roughly half the "oomph" of the X1950 according to my favourite Wikipedia page:

http://en.wikipedia.org/wiki/Ati_gpu

Would be interesting to see 4850 results - this card is a better equivalent to the X1950.

Are you using DMA over the PCI-Express bus or still stuck with polled I/O?


Thanks and nice work!
hans  Re: Graphics benchmarks  20110204 06:45  #1758
@gregthecanuck

Still no DMA yet. I'm focusing on functionality, not speed. When I do start using DMA for the command processor (i.e., pull mode), I expect the speed with smaller blits to improve.

Hans
gregthecanuck  Re: Graphics benchmarks  20110204 07:38  #1759
@hans

Thanks for the update. One step at a time... :)
thellier  Re: Wazp3D on Os4 machine  20110314 14:25  #2862
@Sinan
Hello
>Supertuxkart is slow like 3-4 fps with wazp3d... My point is
>Blender, aquaria e.t.c doesn't work with wazp3d of course

So there is someone using my Wazp3D on a Sam460 :-)

I just wanted to give some details about Wazp3D on real Amiga hardware.

1) Wazp3D processes pixels with all the nice effects that Warp3D allows (blending, color modulation, etc.), so it is slower than a software renderer that uses only 256-color textures and no effects at all.

2) Wazp3D processes pixels, but to avoid reading and writing all the time, it reads lots of pixels, stores them in fragbuffer3D structures, then processes them.
Typically it uses a buffer made of 4096 fragbuffer3D entries (4096 x 24 = 98304 bytes) on an AmigaOS4 machine (for WinUAE/PC it uses a bigger cache).
As the Sam440 doesn't have a big data cache ==> Wazp3D is desperately slow on this machine.
The fragbuffer3D size should certainly be optimized for the various Amiga hardware.

3) Wazp3D can draw in two ways.

If "Directly draw in Bitmap" is enabled,
it reads some pixels from the AmigaBitmap into the fragbuffer3D, processes them, then writes the pixels from the fragbuffer3D back to the AmigaBitmap
==> more Warp3D compatible, but it reads/writes the bitmap every 4096 pixels

If "Directly draw in Bitmap" is disabled,
a buffer is allocated in fast RAM (i.e. a bitmap made of fast RAM) and all pixels are written to this buffer.
When a frame is finished, Wazp3D does a WritePixelArray() from the buffer to the AmigaBitmap.

With the option disabled it was faster on the Sam440; perhaps it is also faster on the Sam460.

4) There is some code in the latest Wazp3D 49 to do Wazp3D->Mesa on AROS (so using hardware Mesa/Gallium); unfortunately I have neither compiled nor tested this part
(coders can look in soft3d_opengl.c)
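
The buffering described in point 2 can be sketched as a simple chunked loop. This is not Wazp3D code: the function names are hypothetical, and only the two constants (4096 entries of 24 bytes) come from the post.

```python
FRAGMENT_BYTES = 24      # per-entry size quoted in the post
BUFFER_FRAGMENTS = 4096  # number of buffered entries quoted in the post


def buffer_bytes(fragments=BUFFER_FRAGMENTS, fragment_bytes=FRAGMENT_BYTES):
    """Memory footprint of the fragment buffer in bytes."""
    return fragments * fragment_bytes


def process_in_chunks(bitmap, shade, chunk=BUFFER_FRAGMENTS):
    """Read up to `chunk` pixels, process them, then write them back.

    Returns the number of read/write round trips made to the bitmap.
    """
    round_trips = 0
    for start in range(0, len(bitmap), chunk):
        fragments = bitmap[start:start + chunk]           # read from the bitmap
        fragments = [shade(p) for p in fragments]         # process in the buffer
        bitmap[start:start + len(fragments)] = fragments  # write back
        round_trips += 1
    return round_trips
```

In this model a 10000-pixel bitmap needs three round trips with the default chunk size. Tuning the chunk size, as the post suggests, trades the number of round trips against how much of the buffer fits in the data cache.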

Also, I have read that Blender wasn't working at all with Wazp3D on OS4.
But I see no reason why Blender should never work with Wazp3D,
so I did some investigation on my Sam440 ===> fixed 2 bugs
(I can send or upload this updated Wazp3D 49b somewhere)
Now it begins to work without crashing, but some weird display problems remain in the menus.
So I investigated further: for the menus, the bug was due to the fact that Blender draws a panel and then writes the text on top of it ... but as pixels are buffered
in Wazp3D, the panel has not finished being drawn by Wazp3D when graphics.library begins to draw the fonts
==> This bug can be avoided if you check "[Step] DrawPoly" in Wazp3D-Prefs
(It will force a pixel flush for each polygon drawn)
Unfortunately Blender still has another display problem: when an object is loaded
it is badly drawn (zbuffer? lighting? ...)

BTW: for drawing menus, Blender would do better to draw all the background panels first, then draw all the text
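
The menu problem is an ordering issue between a buffered writer and an unbuffered one. A toy model, assuming nothing about the real Wazp3D internals (`BufferedRenderer` and `draw_menu` are hypothetical names):

```python
class BufferedRenderer:
    """Toy renderer that queues draw operations until flush() is called.

    Hypothetical model for illustration; this is not Wazp3D code.
    """

    def __init__(self, screen, flush_each_polygon=False):
        self.screen = screen
        self.pending = []
        self.flush_each_polygon = flush_each_polygon

    def draw_polygon(self, name):
        self.pending.append(name)
        if self.flush_each_polygon:
            self.flush()

    def flush(self):
        self.screen.extend(self.pending)
        self.pending.clear()


def draw_menu(flush_each_polygon):
    """Return the order in which items actually reach the screen."""
    screen = []
    renderer = BufferedRenderer(screen, flush_each_polygon)
    renderer.draw_polygon("panel")  # buffered 3D draw
    screen.append("text")           # unbuffered draw by another library
    renderer.flush()                # end of frame
    return screen
```

Without the per-polygon flush the panel reaches the screen after the text and covers it; with the flush the order is panel first, text second. Drawing all panels before all text, as suggested above, would avoid the problem without flushing after every polygon.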

Also, about the slow TuxKart: I am almost sure that it is TuxKart and/or MiniGL itself that has a problem, not Wazp3D in particular
(certainly related to texture allocation/reallocation/conversion)

Wazp3D can display the more complex BlitzQuake faster in WinUAE.
Even cow3d with 5000 faces is faster on wazp3d/os4/sam440 than tuxkart.


Alain Thellier - Wazp3D's author