I have spent the last week with an experimental version of NGFS that can bypass the cache when reading large transfers. Depending on the criterion for reading directly rather than through the cache, I can get speed increases for different transfer sizes. Some speed increases (for particular Read transfer sizes) can be as much as two to four times.
I have not attempted to allow Writes to bypass the cache because of the extra work involved to avoid clashes.
Overall speeds of (say) compiling the file system are not improved to any noticeable extent, and overall speeds of my test suite are not improved either. I have not tried profiling movie playback.
The show-stopper showed up today in the form of intermittent Read errors in one of my tests. If a caller writes a small update to an existing file, that change will be cached and will remain in the cache (not on disk) until the cache contents are flushed to disk. If, before the flush, a large Read comes in, that Read will get the old contents of the file directly from the disk, not seeing the update.
Since the fast-read path shows speed improvements only for a narrow range of benchmark tests, and introduces problems that can only be fixed by adding code for particular conditions, the result is not worth the effort. I have abandoned the experimental version. Maybe one day a flash of inspiration will hit me, but in the meantime I am happy to leave the FS as it is.
@tonyw Of course it's much more complex to bypass the cache for large transfers, for example handling a small write before a large read, and the error handling you mentioned... But it's worth the effort: with SFS and JXFS and diskcache.library it was about 2-3 times faster on average on an A1XE and a Sam440ep with SATA, and on an X5000 with NVMe the difference should be even higher.
Another, maybe easier, way to speed up large reads would be to do the same as IIRC the FFS2 cache does, or did 20 years ago. Instead of using very small cache reads (max. 128 KB in your case) and then CopyMemQuick()ing the cache contents to the application buffer, do it the other way round:

1. Check whether any blocks of the transfer are in the cache and modified, but not written to disk yet. If there are, write and flush the cached blocks of the transfer range first. My IDiskCache->Read() function simply calls Self->Flush() with the same arguments first, before doing a large transfer which bypasses the cache.
2. Do a direct device I/O transfer with the size and buffer you get from the application, which is much faster for large transfers.
3. CopyMemQuick() the data from the application buffer to your disk cache, if there was no I/O error.
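Just to make the shape of that read path concrete, here is a minimal sketch in OS4-style C. The CacheFlushRange()/CacheStoreRange() helpers are hypothetical stand-ins for whatever NGFS and diskcache.library actually do internally, and the IOStdReq is assumed to be already set up for the underlying device:

```c
#include <exec/types.h>
#include <exec/io.h>
#include <proto/exec.h>

/* Hypothetical cache helpers, standing in for the file system's own cache code. */
extern void CacheFlushRange(uint64 offset, uint32 length);
extern void CacheStoreRange(uint64 offset, CONST_APTR data, uint32 length);

/* Bypass-read: flush dirty cached blocks in the range, DMA straight into the
 * caller's buffer, then refresh the cache from that buffer. */
static int32 BypassRead(struct IOStdReq *io, uint64 offset,
                        APTR appBuffer, uint32 length)
{
    /* 1. Dirty cached blocks inside [offset, offset+length) must hit the disk
     *    first, otherwise the direct read below returns stale data. */
    CacheFlushRange(offset, length);

    /* 2. One large transfer directly into the application buffer. */
    io->io_Command = CMD_READ;   /* a real driver path would use the 64-bit/NSD commands for large offsets */
    io->io_Data    = appBuffer;
    io->io_Length  = length;
    io->io_Offset  = (uint32)offset;
    if (IExec->DoIO((struct IORequest *)io) != 0)
        return (int32)io->io_Error;

    /* 3. Keep the cache warm: copy the freshly read data back into the cache
     *    (internally this could be IExec->CopyMemQuick() per cache line). */
    CacheStoreRange(offset, appBuffer, length);
    return 0;
}
```

Step 1 is exactly what avoids the stale-read problem described above.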
Something similar can be done for writes as well: instead of flushing the cached parts before the large, non-cached I/O transfer, you just invalidate the involved cache lines of the transfer (or remove them from the cache if you don't have an "unused" flag).
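Continuing the sketch above (same includes and assumptions, with a hypothetical CacheInvalidateRange() helper):

```c
/* Hypothetical helper that marks the cache lines unused or evicts them. */
extern void CacheInvalidateRange(uint64 offset, uint32 length);

/* Bypass-write: drop any cached copies of the range so later cached reads
 * cannot return stale data, then write straight from the application buffer. */
static int32 BypassWrite(struct IOStdReq *io, uint64 offset,
                         CONST_APTR appBuffer, uint32 length)
{
    CacheInvalidateRange(offset, length);

    io->io_Command = CMD_WRITE;
    io->io_Data    = (APTR)appBuffer;
    io->io_Length  = length;
    io->io_Offset  = (uint32)offset;
    return (IExec->DoIO((struct IORequest *)io) != 0) ? (int32)io->io_Error : 0;
}
```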
With my previous methods there was a considerable drop in performance, especially starting at a transfer size of 16 kB. This has been solved using new methods. Performance of the new method drops again when approaching the maximum transfer size of my Samsung 970 EVO, so a combination of old and new methods will probably maximize performance.
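Purely as an illustration of that combination (the 16 kB figure is from the measurements above, while the function names and the upper threshold are placeholders), the dispatch could be as simple as a size check:

```c
#include <exec/types.h>

#define OLD_PATH_LIMIT  (16 * 1024)      /* below this the old methods performed fine */
#define NEW_PATH_LIMIT  (1024 * 1024)    /* placeholder: where the new method starts to drop off */

/* Hypothetical transfer routines representing the two code paths. */
extern int32 TransferOldMethod(APTR buf, uint32 length, uint64 lba);
extern int32 TransferNewMethod(APTR buf, uint32 length, uint64 lba);

/* Pick whichever path measured faster for the given transfer size. */
static int32 DoTransfer(APTR buf, uint32 length, uint64 lba)
{
    if (length >= OLD_PATH_LIMIT && length < NEW_PATH_LIMIT)
        return TransferNewMethod(buf, length, lba);
    return TransferOldMethod(buf, length, lba);
}
```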
This is the latest result compared to the result in #209
Yes, the larger the transfer size, the lower the relative OS4 overhead. So it's basically benchmarking PCIe-to-DDR3 memory DMA at this point, and that seems to be no bottleneck for the X5000.
For real-world speed, only the area up to 128 kB is interesting because that's the transfer limit of NGFS, and that's where I've concentrated my optimization efforts.
SFS2 does benefit from higher transfer sizes, but this filesystem hits a brick wall around 375 MB/s due to its own overhead. So 128 MByte blocks are transferred at 2 GB/s, but then the drive idles until the next filesystem command arrives.
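Back-of-the-envelope, taking those figures at face value (2 GB/s ≈ 2048 MB/s):

- raw transfer time per 128 MByte block: 128 / 2048 s ≈ 62.5 ms
- effective time per block at 375 MB/s: 128 / 375 s ≈ 341 ms
- so roughly 280 ms per block, around 80% of the wall-clock time, would be filesystem overhead during which the drive sits idle.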
Usually not recommended, but please try whether using an SFS\0 or SFS\2 partition with 32768 bytes/block makes any difference. If it does, I might be able to improve the speed for 512 bytes/block SFS partitions with nvme.device as well.
The SSD reports that 71 GByte are occupied, but in reality only 6 GByte are. So the drive will gradually fill up until everything is occupied, and then it slows down. This is why we need TRIM support.
@geennaam ATA TRIM is not possible, unless there is an HD_ATACmd now, but using UNMAP (3.54) with HD_SCSICmd would be. But even if I added support for it in SFS, it would only work if diskcache.library is disabled/removed in the Kicklayout...
@geennaam It doesn't make much difference whether you implement a private nvme.device command for TRIM or convert HD_SCSICmd UNMAP to ATA TRIM (which would be the better solution IMHO), but SFS without diskcache.library is way too slow to be usable. Of course the file system itself knows which blocks/sectors are freed, but diskcache.library is a file system independent cache system and doesn't. Mixing the APIs in SFS (IExec->DoIO() if diskcache.library isn't installed, IDiskCache->Read()/Write()/etc. if it is) isn't possible either. The easiest way may be implementing a separate TRIM/UNMAP tool for SFS (or adding it to PartitionWizard), which stops the file system, reads the bitmap, uses TRIM/UNMAP on all unused sectors and restarts the file system. That would work both with and without diskcache.library.
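For what it's worth, a rough sketch of what such a tool could send per free extent, assuming the device (nvme.device here, purely as an example) accepts a SCSI UNMAP via HD_SCSICmd at all, which is exactly the open question above. Device opening and the bitmap walk are omitted, and the extent values are placeholders:

```c
#include <exec/types.h>
#include <exec/io.h>
#include <devices/scsidisk.h>
#include <proto/exec.h>
#include <string.h>

/* Send one SCSI UNMAP (opcode 0x42) for a single extent of free blocks.
 * 'io' must be an IOStdReq that is already opened on the target device/unit. */
static int32 UnmapExtent(struct IOStdReq *io, uint64 startLBA, uint32 numBlocks)
{
    UBYTE cdb[10];
    UBYTE param[8 + 16];        /* 8-byte header + one 16-byte block descriptor */
    UBYTE sense[32];
    struct SCSICmd scsi;
    int i;

    memset(cdb,   0, sizeof(cdb));
    memset(param, 0, sizeof(param));
    memset(&scsi, 0, sizeof(scsi));

    /* UNMAP parameter list header */
    param[1] = sizeof(param) - 2;       /* UNMAP data length */
    param[3] = 16;                      /* block descriptor data length */

    /* One block descriptor: 8-byte start LBA + 4-byte block count, big-endian */
    for (i = 0; i < 8; i++)
        param[8 + i]  = (UBYTE)(startLBA  >> (8 * (7 - i)));
    for (i = 0; i < 4; i++)
        param[16 + i] = (UBYTE)(numBlocks >> (8 * (3 - i)));

    /* UNMAP CDB: opcode plus parameter list length */
    cdb[0] = 0x42;
    cdb[8] = sizeof(param);

    scsi.scsi_Data        = (UWORD *)param;
    scsi.scsi_Length      = sizeof(param);
    scsi.scsi_Command     = cdb;
    scsi.scsi_CmdLength   = sizeof(cdb);
    scsi.scsi_Flags       = SCSIF_WRITE | SCSIF_AUTOSENSE;
    scsi.scsi_SenseData   = sense;
    scsi.scsi_SenseLength = sizeof(sense);

    io->io_Command = HD_SCSICmd;
    io->io_Data    = &scsi;
    io->io_Length  = sizeof(struct SCSICmd);

    /* Whether the driver translates this into an NVMe Deallocate is up to the driver. */
    return IExec->DoIO((struct IORequest *)io) ? (int32)io->io_Error : 0;
}
```

The tool would then loop over the free extents from the SFS bitmap (while the file system is stopped) and could coalesce adjacent free blocks into fewer, larger descriptors to keep the number of commands down.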
I must be misunderstanding your statement here. I thought only SFS used diskcache.library. It can be (or is) used by other filesystems?
It could be used by any file system, but the only other one which did was AFAIK my failed attempt to implement a better file system for AmigaOS 4.x (JXFS, should only have been available to OS4 beta testers). IIRC I even added support for it to FFS2, but olsen didn't like it and rejected the changes. IMHO FFS2's own cache system (fs_plugin_cache) is the worst possible way to implement a file system/disk cache...
> but SFS without diskcache.library is way too slow to be usable.
Sorry, I probably miss something, but I read in some other place that you said diskcache.library should be removed from the Kicklayout to get better speed. But now you say the opposite. Can you explain it a bit (again, sorry)? Thanks!
@kas1e Not because of speed, but because diskcache.library is optimized for maximum speed on HDDs. The result is that it (re)writes many more sectors than required. On HDDs that's no problem, but on flash-based storage like SSDs and NVMe drives, which tolerate far fewer overwrites before the hardware dies, or at least gets extremely slow, that's bad.
Do you think it's worth trying an NVMe device via a PCI-to-PCIe bridge on the Pegasos 2? It could be pretty good to compare with IDE, even with CF2IDE or SATA2IDE adapters. I don't know, though, whether it will be faster than a SiI3114 with a SATA disk. After all, it's a Pegasos 2, with only SFS(2) as the fastest option.
I think SATA2IDE is the same speed; you will only get the max PCI speed minus overheads.
133 MB/s on a 66 MHz bus, 66 MB/s on a 33 MHz bus.
Under emulation it’s a different story, I guess.
Our old real PowerPC CPUs have pretty slow PCIe compared to newer x86/ARM CPUs, so under emulation it's simply a question of how fast the host can emulate a PowerPC. In particular, DMA transfers will be a killer on the faster bus.
I feel we need a proper benchmark tool to show the strengths and weaknesses of a system, and to visualize it so that results can be compared.
But I’m thinking more about the benchmarks on Hans' webpage: we are bombarded with 16-bit benchmarks from emulation, while most of the real hardware uses 32-bit modes in the benchmarks. I feel it needs to be split into different categories.
32-bit will always get a lower score than 16-bit on the same hardware because 16-bit moves half the data; if there are no major byte-swap or GFX issues, that is.
Also, I’m not so interested in thousands of QEMU scores; I only need to see the unique ones. Without knowing the host CPU as well, those benchmarks become kind of meaningless.