Re: NVMe device driver
@Rolar

I've seen some reports of drives that fail to work with my driver.

I've searched for similar issues on other platforms, and it looks like the problem might be related to legacy interrupts (emulated pin interrupts). I've come across multiple websites which recommend using MSI/MSI-X interrupts only with NVMe. Windows even offers a tool (LOGO) which checks which types of interrupt work for an attached NVMe drive and which don't.
The publicly available OS4 kernels support legacy interrupts only, but the latest SDK contains traces of MSI support. So if a new kernel is ever released, it might solve that issue.

The driver on os4depot is purely interrupt driven. If an interrupt is not received within the timeout window, it generates an error. In your case, this error likely occurs during initialisation and therefore shows the same symptoms as if no NVMe drive was found at all (a bug in the cleanup routine).
My current beta driver checks for an active interrupt inside the NVMe drive itself. But since NVMe completion is much faster than the interrupt response, I might as well simply poll the completion queues. So stay tuned.

Edit1: The good news is that ignoring the interrupt and checking the completion queue works. The bad news is that it has a negative impact on performance, because I need to flush caches while polling.
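For readers wondering what "polling the completion queue" means in practice, below is a minimal, hypothetical C sketch, not geennaam's actual driver code: each NVMe completion entry carries a phase bit that flips every time the controller wraps the queue, so the host can spot new completions without any interrupt. The structure layout follows the NVMe specification; the queue bookkeeping names and the cache-invalidate stub are illustrative assumptions.

#include <stdint.h>

/* NVMe completion queue entry, per the NVMe base specification (16 bytes).
 * Endian conversion (relevant on PowerPC) is omitted for brevity.           */
struct nvme_cqe {
    uint32_t result;     /* command specific result (DW0) */
    uint32_t reserved;
    uint16_t sq_head;    /* submission queue head pointer */
    uint16_t sq_id;      /* submission queue identifier   */
    uint16_t cid;        /* command identifier            */
    uint16_t status;     /* bit 0 = phase tag, bits 1..15 = status field */
};

/* Hypothetical per-queue state; the names are illustrative, not from the driver. */
struct cq_state {
    volatile struct nvme_cqe *cqes;  /* completion queue in DMA memory   */
    uint16_t head;                   /* next entry to consume            */
    uint16_t entries;                /* queue depth                      */
    uint8_t  phase;                  /* expected phase tag (starts at 1) */
    volatile uint32_t *doorbell;     /* CQ head doorbell register        */
};

/* Platform cache maintenance stub: the DMA'd entry must be re-read from RAM,
 * which is exactly the per-poll cost mentioned in the Edit above.            */
extern void dcache_invalidate(const void *addr, unsigned len);

/* Poll for the completion of command 'cid'; returns its status field,
 * or -1 if nothing arrives within 'spins' iterations.                        */
static int nvme_poll_completion(struct cq_state *cq, uint16_t cid, unsigned spins)
{
    while (spins--) {
        dcache_invalidate((const void *)&cq->cqes[cq->head], sizeof(struct nvme_cqe));
        volatile struct nvme_cqe *cqe = &cq->cqes[cq->head];

        if ((cqe->status & 1) == cq->phase) {   /* a new entry has been posted */
            int status = cqe->status >> 1;
            uint16_t done_cid = cqe->cid;

            if (++cq->head == cq->entries) {    /* wrap around and flip phase  */
                cq->head = 0;
                cq->phase ^= 1;
            }
            *cq->doorbell = cq->head;           /* tell the controller we consumed it */

            if (done_cid == cid)
                return status;
        }
    }
    return -1;                                  /* timed out */
}

The dcache_invalidate() placeholder stands for whatever cache maintenance the platform needs so the CPU sees the freshly DMA'd entry; that is the flushing cost geennaam mentions.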


Edited by geennaam on 2023/6/1 14:23:38
Re: NVMe device driver
@geennaam

I have it on the best authority that the kernel does not, and never has, supported MSI interrupts. Nothing has changed in that regard with newer kernels.

cheers
tony
Re: NVMe device driver
@tonyw

Weird. This is what I can find in pci.h inside the latest SDK:
enum enCapMSIRegs
{
    PCI_MSI_CONTROL      = 2,   /* Message Control Register */
    PCI_MSI_ADDRESS_LOW  = 4,   /* Message Address Register, low 32 bit */
    PCI_MSI_ADDRESS_HIGH = 8,   /* Message Address Register, high 32 bit */
    PCI_MSI_DATA_32      = 8,   /* Message Data Register for 32 bit structure */
    PCI_MSI_DATA_64      = 12,  /* Message Data Register for 64 bit structure */
    PCI_MSI_MASK_32      = 12,  /* Message Mask Register for 32 bit */
    PCI_MSI_MASK_64      = 16,  /* Message Mask Register for 64 bit */
    PCI_MSI_PENDING_32   = 16,  /* Message Pending Register for 32 bit */
    PCI_MSI_PENDING_64   = 20,  /* Message Pending Register for 64 bit */
};

enum enCapMSIBits
{
    PCI_MSI_CONTROL_ENABLE = 0x0001,  /* Enable MSI */
    PCI_MSI_CONTROL_MCAP   = 0x000e,  /* Multi Message Capable Mask */
    PCI_MSI_CONTROL_MEN    = 0x0070,  /* Multi Message Enable Mask */
    PCI_MSI_CONTROL_64     = 0x0080,  /* Structure is 64 bit */
    PCI_MSI_CONTROL_MASK   = 0x0100,  /* Individual masking allowed */
};

/* Message Signaled Interrupt CAP */
struct PCICapability_MSI
{
    struct PCICapability CapHeader;

    BOOL   Is64Bit;          /* True if the device is capable of 64 bit MSI addresses */
    uint64 MessageAddress;   /* The message target address. Note that the interrupt controller code
                              * has to set this up accordingly. 0 means MSI is disabled for this device
                              */
};


I thought this was the groundwork for MSI support in new kernel versions. But apparently not.
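For what it's worth, here is a hypothetical sketch of how those definitions would be used if the kernel ever wired MSI up: program the message address and data through the capability, then set the enable bit. The pci_cfg_read16()/pci_cfg_write16()/pci_cfg_write32() helpers are placeholders, not an existing OS4 API, which is of course exactly the missing piece.

#include <stdint.h>

/* Offsets and bits within the MSI capability, matching the values quoted above. */
#define PCI_MSI_CONTROL        2
#define PCI_MSI_ADDRESS_LOW    4
#define PCI_MSI_ADDRESS_HIGH   8
#define PCI_MSI_DATA_32        8
#define PCI_MSI_DATA_64        12

#define PCI_MSI_CONTROL_ENABLE 0x0001
#define PCI_MSI_CONTROL_MEN    0x0070
#define PCI_MSI_CONTROL_64     0x0080

/* Placeholder config-space accessors; the real API would be whatever the
 * kernel's PCI interface exposes.                                            */
extern uint16_t pci_cfg_read16(void *dev, uint32_t offset);
extern void     pci_cfg_write16(void *dev, uint32_t offset, uint16_t value);
extern void     pci_cfg_write32(void *dev, uint32_t offset, uint32_t value);

/* Program a single MSI vector: write the message address/data the interrupt
 * controller expects, then set the enable bit in the control register.
 * 'cap' is the config-space offset of the MSI capability structure.          */
static void msi_enable_single(void *dev, uint32_t cap,
                              uint64_t msg_addr, uint16_t msg_data)
{
    uint16_t control = pci_cfg_read16(dev, cap + PCI_MSI_CONTROL);

    pci_cfg_write32(dev, cap + PCI_MSI_ADDRESS_LOW, (uint32_t)msg_addr);

    if (control & PCI_MSI_CONTROL_64) {
        pci_cfg_write32(dev, cap + PCI_MSI_ADDRESS_HIGH, (uint32_t)(msg_addr >> 32));
        pci_cfg_write16(dev, cap + PCI_MSI_DATA_64, msg_data);
    } else {
        pci_cfg_write16(dev, cap + PCI_MSI_DATA_32, msg_data);
    }

    control &= ~PCI_MSI_CONTROL_MEN;         /* request a single vector */
    control |= PCI_MSI_CONTROL_ENABLE;
    pci_cfg_write16(dev, cap + PCI_MSI_CONTROL, control);
}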

Re: NVMe device driver
@geennaam
Quote:
I've searched for similar issues on other platforms, and it looks like the problem might be related to legacy interrupts (emulated pin interrupts). I've come across multiple websites which recommend using MSI/MSI-X interrupts only with NVMe. Windows even offers a tool (LOGO) which checks which types of interrupt work for an attached NVMe drive and which don't.
The publicly available OS4 kernels support legacy interrupts only, but the latest SDK contains traces of MSI support. So if a new kernel is ever released, it might solve that issue.

Ah, my usual luck to choose a model which has unexpected issues... Unfortunately, it's too late to return it. Is there a tool for Linux similar to 'LOGO'?

Quote:
The driver on os4depot is purely interrupt driven. If an interrupt is not received within the timeout window, it generates an error. In your case, this error likely occurs during initialisation and therefore shows the same symptoms as if no NVMe drive was found at all (a bug in the cleanup routine).
My current beta driver checks for an active interrupt inside the NVMe drive itself. But since NVMe completion is much faster than the interrupt response, I might as well simply poll the completion queues. So stay tuned.

Edit1: The good news is that ignoring the interrupt and checking the completion queue works. The bad news is that it has a negative impact on performance, because I need to flush caches while polling.

OK, let me know when a new version is available... And if you need beta testers for prerelease versions, just drop me a PM.

Re: NVMe device driver
@geennaam

Yes, those #defines are in the headers but not implemented in the code, I have been told.

cheers
tony
Re: NVMe device driver
I have updated DiskSpeed (not SCSISpeed) with the changes suggested by Joerg (to fix the counter overflow problem) and added a 1 MB buffer setting for testing.

I have submitted DiskSpeed V4.5 to OS4Depot; it should be available soon.

Meanwhile, here are the results of Geennaam's driver with a 512 GB Kingston "device" on my X5000-20:

>DiskSpeed drive NVMeDrive: all
DiskSpeed 4.5, OS4 version
Copyright © 1989-92 MKSoft Development
Copyright © 2003-04 Daniel J. Andrea II & Stéphane Guillard
Modified June 2023 by A. W. Wyatt for VP DOS API

------------------------------------------------------------
CPU: X5000-20 AmigaOS Version: 54.56 Normal Video DMA
Device: NVMeDrive: Buffers: <information unavailable>

Testing directory manipulation speed.
File Create: 5610 files/sec
File Open: 56.79 kfiles/sec
Directory Scan: 66.18 kfiles/sec
File Delete: 7578 files/sec

Seek/Read: 423.75 kseeks/sec

Testing with a 512 byte, LONG-aligned buffer.
Create file: 30.60 MiB/sec
Write to file: 42.26 MiB/sec
Read from file: 190.60 MiB/sec

Testing with a 4096 byte, LONG-aligned buffer.
Create file: 74.44 MiB/sec
Write to file: 288.38 MiB/sec
Read from file: 709.14 MiB/sec

Testing with a 32768 byte, LONG-aligned buffer.
Create file: 95.62 MiB/sec
Write to file: 1.15 GiB/sec
Read from file: 1.05 GiB/sec

Testing with a 262144 byte, LONG-aligned buffer.
Create file: 98.28 MiB/sec
Write to file: 1.58 GiB/sec
Read from file: 1.08 GiB/sec

Testing with a 1048576 byte, LONG-aligned buffer.
Create file: 95.12 MiB/sec
Write to file: 932.00 MiB/sec
Read from file: 986.50 MiB/sec

cheers
tony
Re: NVMe device driver
@tonyw

Thanks for the DiskSpeed updates, Tony! I am a little surprised there isn't a bigger performance delta between your NVMe drive and my SSD. The numbers posted below are from an X5000/20 with a Samsung EVO SSD attached to the on-board SATA interface. The volume under test is a 400 GB NGFS\01 partition on that disk:

6.RAM Disk:DiskSpeed> diskspeed drive Scratch: all
DiskSpeed 4.5, OS4 version
Copyright © 1989-92 MKSoft Development
Copyright © 2003-04 Daniel J. Andrea II & Stéphane Guillard
Modified June 2023 by A. W. Wyatt for VP DOS API

------------------------------------------------------------
CPU: X5000-20 AmigaOS Version: 54.56 Normal Video DMA
Device: Scratch: Buffers: <information unavailable>

Testing directory manipulation speed.
File Create: 4109 files/sec
File Open: 72.24 kfiles/sec
Directory Scan: 297.27 kfiles/sec
File Delete: 14.45 kfiles/sec

Seek/Read: 530.67 kseeks/sec

Testing with a 512 byte, LONG-aligned buffer.
Create file: 18.18 MiB/sec
Write to file: 45.39 MiB/sec
Read from file: 223.65 MiB/sec

Testing with a 4096 byte, LONG-aligned buffer.
Create file: 24.90 MiB/sec
Write to file: 323.09 MiB/sec
Read from file: 785.62 MiB/sec

Testing with a 32768 byte, LONG-aligned buffer.
Create file: 27.61 MiB/sec
Write to file: 1.17 GiB/sec
Read from file: 1.10 GiB/sec

Testing with a 262144 byte, LONG-aligned buffer.
Create file: 27.69 MiB/sec
Write to file: 1.62 GiB/sec
Read from file: 1.10 GiB/sec

Testing with a 1048576 byte, LONG-aligned buffer.
Create file: 26.88 MiB/sec
Write to file: 959.12 MiB/sec
Read from file: 1.00 GiB/sec

-- eliyahu

"Physical reality is consistent with universal laws. When the laws do not operate, there is no reality. All of this is unreal."
Re: NVMe device driver
@eliyahu
Quote:
Thanks for the DiskSpeed updates, Tony! I am a little surprised there isn't a bigger performance delta between your NVMe drive and my SSD. The numbers posted below are from an X5000/20 with a Samsung EVO SSD attached to the on-board SATA interface. The volume under test is a 400 GB NGFS\01 partition on that disk:
DiskSpeed is a tool for comparing different file systems on the same driver and hardware, not for comparing different drivers/hardware; that's what SCSISpeed is for.
In tonyw's results the most important details are missing: Which file system is used? Which block size is used on the test partition? In the case of NGFS: is it a beta version with the strange 128 KB transfer size limit already fixed, or an old version with this limit, which makes fast reads and writes impossible with any driver and hardware?
Adding a 1 MB buffer size is better than the old versions (512 bytes - 256 KB only), but still way too small to get fast read/write speeds on any current hardware. For example, the C:Copy tests geennaam did were using a 16 MB buffer. For any usable test, no matter if DiskSpeed, SCSISpeed or C:Copy, the buffer size has to be larger than the disk cache used, or all you'll get is the performance of the IExec->CopyMemQuick() implementation on your system instead of anything related to the disk speed.
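To make that point concrete, here is a bare-bones throughput loop (my own illustration, not DiskSpeed or SCSISpeed code; now_seconds() is a placeholder timer). Unless the amount of data read is well beyond any disk cache in the path, repeated runs of such a loop mostly measure how fast memory is copied out of that cache.

#include <stdio.h>
#include <stdlib.h>

/* Placeholder wall-clock source; substitute whatever timer the OS provides. */
extern double now_seconds(void);

/* Read 'total_bytes' from 'path' in 'bufsize' chunks and return MiB/s.
 * If the file (or the amount of data read) fits inside the disk cache,
 * the result is dominated by memory copies out of that cache rather than
 * by the drive or the driver.                                             */
static double read_throughput(const char *path, size_t bufsize, size_t total_bytes)
{
    unsigned char *buf = malloc(bufsize);
    FILE *f = fopen(path, "rb");
    if (!buf || !f) { free(buf); if (f) fclose(f); return -1.0; }

    double start = now_seconds();
    size_t done = 0;
    while (done < total_bytes) {
        size_t got = fread(buf, 1, bufsize, f);
        if (got == 0) { rewind(f); continue; }  /* wrap on small files: even more cache-bound */
        done += got;
    }
    double elapsed = now_seconds() - start;

    fclose(f);
    free(buf);
    return ((double)total_bytes / (1024.0 * 1024.0)) / elapsed;
}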

Re: NVMe device driver
@eliyahu

It looks like there's some DDR3 memory cache benchmarking going on.

I can assure you that the X5k is SATA2, hence a 300 MByte/s theoretical limit. However, the raw read speed is about 250 MB/s with my SATA SSD. With larger transfer sizes, the P50x20sata.device starts chopping up the transfer into smaller chunks (the driver reports this with debug output on the terminal). As a result, the read speed drops a little. I will post benchmarks later today with my SCSISpeed alternative.
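As an aside, the chunking described above is easy to picture. Here is a generic sketch, not the actual P50x20sata.device code, of how a driver might split a request that exceeds its per-command transfer limit; the 128 KB limit is an assumed example.

#include <stdint.h>
#include <stddef.h>

#define MAX_XFER_BYTES (128 * 1024)   /* hypothetical per-command limit */

/* Hypothetical single-command transfer; returns 0 on success. */
extern int do_one_transfer(uint64_t byte_offset, void *buf, size_t len);

/* Split an arbitrarily large request into controller-sized chunks. Each chunk
 * is a separate command, which is why very large requests gain little once
 * the per-command limit is reached.                                          */
static int transfer_chunked(uint64_t offset, uint8_t *buf, size_t len)
{
    while (len > 0) {
        size_t chunk = len > MAX_XFER_BYTES ? MAX_XFER_BYTES : len;
        if (do_one_transfer(offset, buf, chunk))
            return -1;
        offset += chunk;
        buf    += chunk;
        len    -= chunk;
    }
    return 0;
}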

EDIT1:

X5000 SATA:
SSDBenchmark V0.3

device: p50x0sata.device

--------------------------------------
Read size 512 bytes: 9 Mbyte/s
Read size 1024 bytes: 26 Mbyte/s
Read size 2048 bytes: 48 Mbyte/s
Read size 4096 bytes: 80 Mbyte/s
Read size 8192 bytes: 127 Mbyte/s
Read size 16384 bytes: 165 Mbyte/s
Read size 32768 bytes: 200 Mbyte/s
Read size 65536 bytes: 222 Mbyte/s
Read size 131072 bytes: 232 Mbyte/s
Read size 262144 bytes: 242 Mbyte/s
Read size 524288 bytes: 246 Mbyte/s
Read size 1048576 bytes: 249 Mbyte/s
Read size 2097152 bytes: 249 Mbyte/s
Read size 4194304 bytes: 247 Mbyte/s
Read size 8388608 bytes: 248 Mbyte/s
Read size 16777216 bytes: 248 Mbyte/s
Read size 33554432 bytes: 243 Mbyte/s
Read size 67108864 bytes: 241 Mbyte/s
Read size 134217728 bytes: 240 Mbyte/s
--------------------------------------
DONE!


X5000 NVMe:
SSDBenchmark V0.3

device: nvme.device

--------------------------------------
Read size 512 bytes: 10 Mbyte/s
Read size 1024 bytes: 20 Mbyte/s
Read size 2048 bytes: 31 Mbyte/s
Read size 4096 bytes: 68 Mbyte/s
Read size 8192 bytes: 131 Mbyte/s
Read size 16384 bytes: 54 Mbyte/s
Read size 32768 bytes: 95 Mbyte/s
Read size 65536 bytes: 153 Mbyte/s
Read size 131072 bytes: 281 Mbyte/s
Read size 262144 bytes: 454 Mbyte/s
Read size 524288 bytes: 626 Mbyte/s
Read size 1048576 bytes: 768 Mbyte/s
Read size 2097152 bytes: 857 Mbyte/s
Read size 4194304 bytes: 861 Mbyte/s
Read size 8388608 bytes: 875 Mbyte/s
Read size 16777216 bytes: 1103 Mbyte/s
Read size 33554432 bytes: 1282 Mbyte/s
Read size 67108864 bytes: 1389 Mbyte/s
Read size 134217728 bytes: 1484 Mbyte/s
--------------------------------------
DONE!


Edited by geennaam on 2023/6/5 9:49:35
Re: NVMe device driver
@joerg

Thanks for the background. I appreciate the explanation.

@geennaam

I'd be interested in running your SSDBenchmark tool on my systems at some point. Any possibility of a public release?

-- eliyahu

"Physical reality is consistent with universal laws. When the laws do not operate, there is no reality. All of this is unreal."
Re: NVMe device driver
@eliyahu

This is it for now. There will be no new public release in the near future.


Edited by geennaam on 2023/6/16 18:57:33
Edited by geennaam on 2023/6/16 18:58:27
Re: NVMe device driver
@joerg

The transfer size limit is set by the size of the disk cache, the read-ahead cache and the number of available "buffers". Since NGFS has a write-through cache, all Reads and Writes go through the cache. Also, since it is a journalling file system, all Writes to disk (of meta data) take three Write operations, not just one.

The cache "buffers" are permanently allocated from the system and controlled by internal allocation code. Allocating and de-allocating cache buffers from the Exec imparts a heavy speed penalty. For a partition of 100 GB+, 4096-byte blocks are used, which requires 16 MB of cache for each such partition. I have 23 such partitions on my X-5000, so the cache is no bigger than necessary.

Many years ago, when I spent a lot of time optimising performance, I played with cluster sizes, number of cache buffers, etc. The FS was optimised (at the time) for overall speed *of my test suite*, not for the speed of individual transfers.

I have a test suite that runs all sorts of different tests and takes about 12 minutes to complete. The optimisation work was performed on a Sam 460 with a mechanical hard drive (the mid-range machine at the time). The 32-block cluster that limits read/write transfer sizes gave the best *overall* performance at the time.

Now that I have Geennaam's driver working, I can revisit the speed optimisations and check to see if there is anything to be gained by changing the settings. I doubt that any great increase can be achieved.

PS. Naturally, the test results I published were taken using the current version of NGFS. It would be unfair to publish the results of tests performed on other file systems. The partition size in this case was about 120 GiB.

cheers
tony
Re: NVMe device driver
@tonyw

Small, single-command transfers are the Achilles' heel of NVMe. Small sizes are fine as long as you overload the drive with them (the more IOs, the better). Large transfers, on the other hand, are fine because they are broken down into multiple smaller transfers (the size depends on the NVMe controller) and fed to the submission queue.

Currently, my driver is optimised for large transfers. A future release will include independent submission and retire queues in order to create a truly pipelined flow. But this will also require a file system which is capable of sending multiple IOs.
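To illustrate what "sending multiple IOs" buys on NVMe, here is a rough, hypothetical sketch of keeping several read commands in flight on one submission/completion queue pair. None of the names come from geennaam's driver, and the submit/reap helpers are assumed.

#include <stdint.h>

#define QUEUE_DEPTH 32          /* illustrative; real NVMe queues can be much deeper */

/* Hypothetical driver hooks: build a read command in the submission queue and
 * ring its doorbell, and reap one entry from the completion queue (see the
 * polling sketch earlier in the thread).                                      */
extern void nvme_submit_read(uint16_t cid, uint64_t lba, uint32_t blocks, void *buf);
extern int  nvme_reap_one(uint16_t *cid_out);

/* Pipelined read: keep up to QUEUE_DEPTH commands outstanding instead of
 * submitting one, waiting, and submitting the next. The benefit only appears
 * when the caller (ultimately the file system) can issue multiple IOs.        */
static void read_pipelined(uint64_t start_lba, uint32_t blocks_per_cmd,
                           unsigned ncmds, void **buffers)
{
    unsigned submitted = 0, completed = 0;
    uint16_t cid;

    while (completed < ncmds) {
        /* Fill the submission queue as far as the depth allows. */
        while (submitted < ncmds && submitted - completed < QUEUE_DEPTH) {
            nvme_submit_read((uint16_t)submitted,
                             start_lba + (uint64_t)submitted * blocks_per_cmd,
                             blocks_per_cmd, buffers[submitted]);
            submitted++;
        }
        /* Retire whatever has finished; completion order is not guaranteed. */
        if (nvme_reap_one(&cid) == 0)
            completed++;
    }
}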

Re: NVMe device driver
@tonyw
Quote:
The transfer size limit is set by the size of the disk cache, the read-ahead cache and the number of available "buffers". Since NGFS has a write-through cache, all Reads and Writes go through the cache.
Read-ahead and copy-back caches only help for small transfers, not for large ones (which become slower than with no cache at all), and caching everything doesn't make sense either.
For meta-data blocks SFS has "buffers", which are something completely different from the diskcache.library caches (or the SFS-internal ones if diskcache.library isn't used).
For transfers larger than the cache line size, IIRC 64 KB in diskcache.library, I just invalidate the cached copies of the transferred range, in case some of the sectors were in the cache and their contents change, and do a single device read or write of the size the file system got from the application, provided its start address and size are multiples of the block size. If it's not block aligned, only the first and/or last part(s) smaller than a block are done through the read-ahead/copy-back cache, but the largest part bypasses the cache.
The disk cache used in the AmigaOS port of NTFS, and probably in all FUSE/FileSysBox file systems, does the same as I do in diskcache.library: only small transfers use the cache, large ones don't.
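As a rough illustration of the dispatch described above (my own sketch, not diskcache.library source): small or unaligned pieces go through the cache, while the large block-aligned middle goes straight to the device after invalidating any overlapping cached sectors.

#include <stdint.h>
#include <stddef.h>

#define CACHE_LINE_BYTES (64 * 1024)   /* the 64 KB cache line size mentioned above */

/* Hypothetical back-ends: a cached read path for small pieces, a raw device
 * read for the large aligned middle, and an invalidate for overlapping lines. */
extern int  cached_read(uint64_t offset, void *buf, size_t len);
extern int  device_read(uint64_t offset, void *buf, size_t len);
extern void cache_invalidate_range(uint64_t offset, size_t len);

static int fs_read(uint64_t offset, uint8_t *buf, size_t len, size_t block_size)
{
    if (len <= CACHE_LINE_BYTES)
        return cached_read(offset, buf, len);      /* small: serve from the cache */

    /* Head: anything before the first block boundary goes through the cache. */
    size_t head = (size_t)((block_size - offset % block_size) % block_size);
    if (head) {
        if (cached_read(offset, buf, head)) return -1;
        offset += head; buf += head; len -= head;
    }

    /* Tail: the trailing partial block also goes through the cache.          */
    size_t tail = len % block_size;
    size_t middle = len - tail;

    /* Middle: drop any stale cached copies, then read straight into the
     * caller's buffer in one large device transfer.                          */
    cache_invalidate_range(offset, middle);
    if (device_read(offset, buf, middle)) return -1;

    if (tail)
        return cached_read(offset + middle, buf + middle, tail);
    return 0;
}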

Quote:
Also, since it is a journalling file system, all Writes to disk (of meta data) take three Write operations, not just one.
It's the same in SFS: (at least) 3 writes and a CMD_UPDATE, but delayed by the flush timeout.

Quote:
For a partition of 100 GB+, 4096-byte blocks are used, which requires 16 MB of cache for each such partition. I have 23 such partitions on my X-5000, so the cache is no bigger than necessary.
The diskcache.library cache is much larger, some percent of the installed RAM, but it's a single cache shared by all partitions using diskcache.library.

Re: NVMe device driver
@joerg

Thanks for that discussion. I think I tried (years ago) bypassing the cache for large transfers, but it did not benefit the overall speed of the test suite, so I removed the extra code (I don't like special cases). Of course, the test suite does not use a lot of huge transfer sizes such as we are testing with Geennaam's driver.

I'll try re-enabling the bypass-cache code and see if it improves DiskSpeed's performance.

I keep asking myself: "Why are we striving for maximum benchmark performance if it won't make much difference to real-world operation? What sort of application will benefit from an increase of transfer speed for buffer sizes > 1 MiB?"

I can't help thinking that this whole investigation is a solution looking for a problem.

cheers
tony
Re: NVMe device driver
@tonyw
Quote:
I keep asking myself: "Why are we striving for maximum benchmark performance if it won't make much difference to real-world operation? What sort of application will benefit from an increase of transfer speed for buffer sizes > 1 MiB?"
Some examples:
- Compiling software: 16 MB is too little to keep all the executables (make, gcc, gas, ld, etc.) in the cache, and loading the large executables while bypassing the cache should be faster. Small files like the includes will stay in the cache, and if the large executables aren't cached, many more of them will.
- Playing large audio or video files.
- Editing or converting audio or video files.
- Copying files.

The usual benchmarks are faster if you put everything into the cache (but only if the benchmark uses files <= the cache size you are using), whereas real-world software is usually faster if you bypass the cache for large transfers.
Most software using large transfers uses that data only once, and putting it into the cache evicts a lot of other cached data which is accessed more often.

Re: NVMe device driver
@geennaam
Quote:
Currently, my driver is optimised for large transfers. A future release will include independent submission and retire queues in order to create a truly pipelined flow. But this will also require a file system which is capable of sending multiple IOs.
The only AmigaOS file system which might still be able to do that, if it hasn't been ported to the new AmigaOS 4.1 FS API yet, is FFS2, using the ACTION_(READ|WRITE)_RETURN packets for device I/O.
In file systems using the new AmigaOS 4.1 FS API that's not a usable option, and in my AmigaOS 4.x SFS/JXFS implementations, which use neither the old TRIPOS/AmigaOS 0.x-3.9 packet API nor the new AmigaOS 4.1 FS API but a custom one, it's not possible either.

Re: NVMe device driver
@joerg

That's why it's not high on my priority list. Actually, nothing is. I'm in my Amiga summer dip, with temperatures approaching 30 degC at the moment.

Re: NVMe device driver
So I modified NGFS's ReadData() function so that, for a Read request larger than MAX_CACHE_READ, it bypasses the cache and reads from the device directly into the caller's buffer. I haven't made any changes to Write yet.

The result is surprising: read speeds fall by a factor of 4 or 5. I then tried breaking up the long Read into several shorter Reads, but the overall speed doesn't change much with different sub-read sizes.

I think what is happening is this:

In the current version, everything goes through the cache. So the first read is slow, then all later reads are much faster, leading to an average that is pretty good. But when you ignore the cache and read directly from the disk each time, it's going to be much slower than reading from the memory-resident cache.

The code in DiskSpeed measures the overall time to Read() and Seek() to the beginning again (repeated many times). The actual times of the first disk Read() and the subsequent cache Reads are all averaged, so the difference between them is not visible. Bypass the cache and you see only slow transfers.

In the case of Writes, they all write into the memory-resident cache, which is written to disk some time later, so short Write() operations appear fast. They only slow down when the Write() length exceeds the cache size. A 1 MiB test size operates at full Write speed, although the reported speed is going to be slower than the maximum because of the included Seek() times.

I will add some longer test transfers to DiskSpeed and see what happens.

cheers
tony
Re: NVMe device driver
@tonyw
What you got might be true for SATA, maybe even for X5000 SATA2, but did you test it with NVMe as well?
Please ignore any results you get from benchmarks, your own benchmark tool as well as foreign ones like DiskSpeed, and only use real-world software tests instead.
I guess that over the nearly 20 years I worked on SFS I did about as many tests with it as you are doing with NGFS, including an always-cache-everything version of diskcache.library (i.e. the same as you are doing), but in my results nearly all real-world software was much faster with SFS if large transfers weren't cached.

The DiskSpeed results you get with a file/buffer size smaller than the cache size aren't disk-related speed results at all, but just IExec->CopyMemQuick() benchmarks copying data from/to the disk cache memory.


Edited by joerg on 2023/6/9 12:09:41