Re: NVMe device driver
Quite a regular
I have spent the last week with an experimental version of NGFS that can bypass the cache when reading large transfers. Depending on the criterion for reading directly rather than through the cache, I can get speed increases for different transfer sizes. Some speed increases (for particular Read transfer sizes) can be as much as two to four times.

I have not attempted to allow Writes to bypass the cache because of the extra work involved to avoid clashes.

Overall speeds of (say) compiling the file system are not improved to any extent. Overall speeds of my test suite are not improved either. I have not tried profiling movie playback.

The show-stopper showed up today in the form of intermittent Read errors in one of my tests. If a caller writes a small update to an existing file, that change will be cached and will remain in the cache (not on disk) until the cache contents are flushed to disk. If, before the flush, a large Read comes in, that Read will get the old contents of the file directly from the disk, not seeing the update.

Since the Fast read shows speed improvements only for a narrow range of benchmark tests, and introduces problems that can only be fixed by adding code for particular conditions, the result is not worth the effort. I have abandoned the experimental version. Maybe one day a flash of inspiration will hit me, but in the meanwhile, I am happy to leave the FS as it is.

cheers
tony
Re: NVMe device driver
Just can't stay away
@tonyw
Of course it's much more complex to bypass the cache for large transfers, for example a small write before a large read, and error handling, as you mentioned... But it's worth the effort: with SFS and JXFS and diskcache.library it was about 2-3 times faster on average on an A1XE and a Sam440ep with SATA, and on an X5000 with NVMe the difference should be even bigger.

Another, maybe easier, way to speed up large reads would be to do the same as, IIRC, the FFS2 cache does (or did 20 years ago). Instead of using very small (max. 128 KB in your case) cache reads and then CopyMemQuick()ing the cache contents to the application buffer, do it the other way round:
1. Check if some of the blocks of the transfer are in the cache and modified, but not written to disk yet. If there are any, write and flush the cached blocks of the transfer range first.
My IDiskCache->Read() function simply calls Self->Flush() with the same arguments first, before doing a large transfer that bypasses the cache.
2. Do a direct device I/O transfer with the size and buffer you get from the application, which is much faster for large transfers.
3. CopyMemQuick() the data from the application buffer into your disk cache, if there was no I/O error.

Something similar can be done for writes as well: instead of flushing the cached parts before the large, non-cached I/O transfer, you just invalidate the involved cache lines of the transfer (or remove them from the cache if you don't have an "unused" flag).
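To make the recipe concrete, here is a minimal sketch of such a read path in C. All helper names (cached_read, cache_flush_range, cache_store, device_read) and the 128 kB threshold are placeholders for whatever NGFS and IDiskCache actually use, not real API calls, and error handling is reduced to the essentials:

#include <exec/types.h>

/* Placeholders, not the real NGFS/IDiskCache API: */
extern LONG cached_read(uint64 offset, void *buffer, uint32 length);      /* existing cached path */
extern LONG cache_flush_range(uint64 offset, uint32 length);              /* write dirty cached blocks in range to disk */
extern void cache_store(uint64 offset, const void *data, uint32 length);  /* CopyMemQuick() fresh data into the cache */
extern LONG device_read(uint64 offset, void *buffer, uint32 length);      /* direct, uncached device I/O */

#define BYPASS_THRESHOLD (128 * 1024)   /* below this size keep using the normal cached path */

LONG fs_read(uint64 offset, void *app_buffer, uint32 length)
{
    LONG err;

    if (length < BYPASS_THRESHOLD)
        return cached_read(offset, app_buffer, length);

    /* 1. Flush any modified blocks of this range that only exist in the cache,
          otherwise the direct read below would return stale data from disk. */
    err = cache_flush_range(offset, length);
    if (err != 0)
        return err;

    /* 2. One large direct transfer straight into the application's buffer. */
    err = device_read(offset, app_buffer, length);
    if (err != 0)
        return err;

    /* 3. Populate the cache from the application buffer, only on success. */
    cache_store(offset, app_buffer, length);
    return 0;
}

For a large write, step 1 would instead invalidate (or drop) the affected cache lines before the direct transfer, as described above.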

Re: NVMe device driver
Quite a regular
Just a little teaser before holidays.

Found some additional speed.

With my previous method there was a considerable drop in performance, especially starting at a transfer size of 16 kB. This has been solved using a new method. Performance of the new method drops again when approaching the maximum transfer size of my Samsung 970 EVO, so a combination of the old and new methods will probably maximize performance.

These are the latest results compared to the results in post #209:

SSDBenchmark V0.3

Device: nvme.device

--------------------------------------
Read size       512 bytes:    11 Mbyte/s   (+10%)
Read size      1024 bytes:    22 Mbyte/s   (+10%)
Read size      2048 bytes:    42 Mbyte/s   (+35%)
Read size      4096 bytes:    81 Mbyte/s   (+19%)
Read size      8192 bytes:   142 Mbyte/s    (+8%)
Read size     16384 bytes:   222 Mbyte/s  (+311%)
Read size     32768 bytes:   274 Mbyte/s  (+188%)
Read size     65536 bytes:   305 Mbyte/s   (+99%)
Read size    131072 bytes:   450 Mbyte/s   (+60%)
Read size    262144 bytes:   609 Mbyte/s   (+34%)
Read size    524288 bytes:   670 Mbyte/s    (+7%)
Read size   1048576 bytes:   718 Mbyte/s    (-7%)
Read size   2097152 bytes:   794 Mbyte/s    (-7%)
Read size   4194304 bytes:   829 Mbyte/s    (-4%)
Read size   8388608 bytes:  1379 Mbyte/s   (+57%)
Read size  16777216 bytes:  1658 Mbyte/s   (+50%)
Read size  33554432 bytes:  1851 Mbyte/s   (+44%)
Read size  67108864 bytes:  1964 Mbyte/s   (+41%)
Read size 134217728 bytes:  2034 Mbyte/s   (+37%)
--------------------------------------
DONE!

Re: NVMe device driver
Home away from home
@geennaam

So we are hitting the max PCIe x4 v2.0 speed.

https://en.wikipedia.org/wiki/PCI_Express
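For reference, the theoretical ceiling works out roughly like this (the raw rate and 8b/10b encoding are the standard PCIe 2.0 figures; the plateau at the largest transfer sizes above sits right around it, before protocol overhead):

  PCIe 2.0: 5 GT/s per lane x 8/10 (8b/10b encoding) = 4 Gbit/s = 500 MByte/s per lane
  x4 link:  4 x 500 MByte/s = 2000 MByte/s theoretical maximum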

(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.
Re: NVMe device driver
Not too shy to talk
@geennaam
Great !!!

AmigaOS3: Amiga 1200
AmigaOS4: Micro A1-C, AmigaOne XE, Pegasos II, Sam440ep, Sam440ep-flex, AmigaOne X1000
MorphOS: Efika 5200b, Pegasos I, Pegasos II, Powerbook, Mac Mini, iMac, Powermac Quad
Re: NVMe device driver
Quite a regular
@LiveForIt

Yes, the larger the transfer size, the lower the OS4 overhead. So it's basically benchmarking PCIe to DDR3 memory DMA at this point. And that seems to be no bottleneck for the X5000.

For real-world speed, only the area up to 128 kB is interesting, because that's the transfer limit of NGFS. And that's where I've concentrated my optimization efforts.

SFS2 does benefit from the higher speeds at larger transfer sizes, but this filesystem hits a brick wall around 375 MB/s due to its own overhead. So 128 MByte blocks are transferred at 2 GB/s, but then the drive idles until the next filesystem command arrives.
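A rough back-of-the-envelope with the figures above (assuming 375 MB/s and 2 GB/s are sustained averages) shows how much of the time the drive idles:

  transfer time for a 128 MByte block at 2 GByte/s:  128 / 2000 ≈ 0.064 s
  time per 128 MByte at an effective 375 MB/s:       128 / 375  ≈ 0.341 s
  => the drive is busy ~0.064 s and idle ~0.277 s per block,
     i.e. idle roughly 80% of the time while the filesystem prepares the next command.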

Re: NVMe device driver
Just can't stay away
@geennaam
Quote:
SFS2 does benefit from the higher speeds at larger transfer sizes, but this filesystem hits a brick wall around 375 MB/s due to its own overhead. So 128 MByte blocks are transferred at 2 GB/s, but then the drive idles until the next filesystem command arrives.
Usually not recommended, but please try whether using an SFS\0 or SFS\2 partition with 32768 bytes/block makes any difference.
If it does, I might be able to improve the speed for 512 bytes/block SFS partitions with nvme.device as well.

Re: NVMe device driver
Quite a regular
@joerg

There's a small difference but nowhere near NVMe speeds.

Re: NVMe device driver
Quite a regular
[Screenshot: drive usage as reported by the SSD]


The SSD reports that 71 GByte are occupied, but in reality only 6 GByte are in use. So gradually the drive will fill up until everything is marked as occupied, and then the drive slows down. This is why we need TRIM support.

Re: NVMe device driver
Just can't stay away
@geennaam
ATA TRIM is not possible unless there is an HD_ATACmd now, but using UNMAP (3.54) with HD_SCSICmd would be.
But even if I added support for it in SFS, it would only work if diskcache.library is disabled/removed from the Kicklayout...
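For illustration only, this is roughly what sending a single-range SCSI UNMAP through HD_SCSICmd could look like. It is a sketch, not nvme.device's actual interface: it assumes the device translates UNMAP internally, that "io" is an already opened IOStdReq for the unit, and it leaves out most error handling:

/* Sketch: one UNMAP range via HD_SCSICMD */
#include <string.h>
#include <exec/types.h>
#include <exec/io.h>
#include <devices/scsidisk.h>
#include <proto/exec.h>

LONG unmap_range(struct IOStdReq *io, uint64 lba, uint32 blocks)
{
    UBYTE cdb[10];                 /* UNMAP CDB */
    UBYTE param[24];               /* 8-byte header + one 16-byte block descriptor */
    UBYTE sense[32];
    struct SCSICmd cmd;
    int i;

    memset(cdb, 0, sizeof(cdb));
    memset(param, 0, sizeof(param));
    memset(&cmd, 0, sizeof(cmd));

    cdb[0] = 0x42;                 /* UNMAP opcode */
    cdb[8] = sizeof(param);        /* parameter list length (LSB) */

    param[1] = 22;                 /* bytes following the UNMAP DATA LENGTH field */
    param[3] = 16;                 /* block descriptor data length */
    for (i = 0; i < 8; i++)        /* 64-bit starting LBA, big endian */
        param[8 + i] = (UBYTE)(lba >> (8 * (7 - i)));
    for (i = 0; i < 4; i++)        /* number of logical blocks, big endian */
        param[16 + i] = (UBYTE)(blocks >> (8 * (3 - i)));

    cmd.scsi_Data        = (UWORD *)param;
    cmd.scsi_Length      = sizeof(param);
    cmd.scsi_Command     = cdb;
    cmd.scsi_CmdLength   = sizeof(cdb);
    cmd.scsi_Flags       = SCSIF_WRITE | SCSIF_AUTOSENSE;
    cmd.scsi_SenseData   = sense;
    cmd.scsi_SenseLength = sizeof(sense);

    io->io_Command = HD_SCSICMD;
    io->io_Data    = &cmd;
    io->io_Length  = sizeof(cmd);

    return IExec->DoIO((struct IORequest *)io);
}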

Re: NVMe device driver
Quite a regular
@joerg

I can implement my own trim functionality. All I need to know from the filesystem is which LBAs have been deleted.

Re: NVMe device driver
Just can't stay away
@geennaam
It doesn't make much difference whether you implement a private nvme.device command for TRIM or convert HD_SCSICmd UNMAP to ATA TRIM (the latter would be the better solution IMHO), but SFS without diskcache.library is way too slow to be usable.
Of course the file system itself knows which blocks/sectors are freed, but diskcache.library is a file system independent cache system and doesn't. Mixing the APIs in SFS (IExec->DoIO() if diskcache.library isn't installed, IDiskCache->Read()/Write()/etc. if it is installed) isn't possible either.
The easiest way may be implementing a separate TRIM/UNMAP tool for SFS (or adding it to PartitionWizard), which stops the file system, reads the bitmap, uses TRIM/UNMAP on all unused sectors, and restarts the file system. That would work with and without diskcache.library.
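As an illustration only, the tool's main loop could look something like this. Inhibit() is the standard dos.library call for locking out a file system; next_free_extent() is a purely hypothetical helper that would have to walk the SFS free-space bitmap, and unmap_range() is the HD_SCSICmd sketch from a few posts up:

#include <exec/types.h>
#include <exec/io.h>
#include <dos/dos.h>
#include <proto/dos.h>

/* from the HD_SCSICMD sketch above */
extern LONG unmap_range(struct IOStdReq *io, uint64 lba, uint32 blocks);
/* hypothetical: yields one free extent of the partition per call, FALSE when done */
extern BOOL next_free_extent(uint64 *lba, uint32 *blocks);

void trim_partition(CONST_STRPTR dosdev, struct IOStdReq *io)
{
    uint64 lba;
    uint32 blocks;

    if (IDOS->Inhibit(dosdev, DOSTRUE))       /* stop the file system, e.g. "DH0:" */
    {
        while (next_free_extent(&lba, &blocks))
            unmap_range(io, lba, blocks);     /* UNMAP every unused extent */

        IDOS->Inhibit(dosdev, DOSFALSE);      /* restart the file system */
    }
}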

Re: NVMe device driver
Not too shy to talk
@joerg

Quote:
Of course the file system itself knows which blocks/sectors are freed, but diskcache.library is a file system independent cache system and doesn't.

I must be misunderstanding your statement here. I thought only SFS used diskcache.library. Can it be (or is it) used by other filesystems?

-- eliyahu

"Physical reality is consistent with universal laws. When the laws do not operate, there is no reality. All of this is unreal."
Re: NVMe device driver
Just can't stay away
@eliyahu
Quote:
I must be misunderstanding your statement here. I thought only SFS used diskcache.library. It can be (or is) used by other filesystems?
It could be used by any file system, but AFAIK the only other one that did was my failed attempt to implement a better file system for AmigaOS 4.x (JXFS, which should only have been available to OS4 beta testers).
IIRC I even added support for it to FFS2, but olsen didn't like it and rejected the changes. IMHO FFS2's own cache system (fs_plugin_cache) is the worst possible way to implement a file system/disk cache...


Re: NVMe device driver
Home away from home
@joerg
Quote:

but SFS without diskcache.library is way too slow to be usable.


Sorry, I'm probably missing something, but I read somewhere else that you said "diskcache.library" should be removed from the Kicklayout to get better speed. But now you say the opposite. Can you explain it a bit (again, sorry)? Thanks!

Join us to improve dopus5!
AmigaOS4 on youtube
Re: NVMe device driver
Just can't stay away
@kas1e
Not because of speed, but diskcache.library is optimized for maximum speed on HDDs. The result is that it (re)writes many more sectors than required.
On HDDs that's no problem, but on flash-based storage like SSDs and NVMe drives, which allow far fewer overwrites before the hardware dies, or at least gets extremely slow, that's bad.

Re: NVMe device driver
Home away from home
@joerg
Thanks!

@geennaam (and all)

Do you think it's worth trying an NVMe device via a PCI-to-PCIe bridge on the Pegasos2? It could be pretty good to compare with IDE, even with CF2IDE or SATA2IDE adapters. I don't know, though, whether it will be faster than a SiI3114 with a SATA disk... In the end it's a Pegasos2, with SFS(2) being the fastest option available.

Join us to improve dopus5!
AmigaOS4 on youtube
Re: NVMe device driver
Home away from home
@kas1e

I think SATA2IDE will be about the same speed; you will only get the max PCI speed minus overheads.

133 MB/s on a 66 MHz bus.
66 MB/s on a 33 MHz bus.

Under emulation it’s a different story, I guess.

Our old real PowerPC CPUs have pretty slow PCIe compared to newer x86/ARM CPUs, where it's simply a question of how fast the host can emulate a PowerPC. In particular, DMA transfers will be a killer on a faster bus.

I feel we need a proper benchmark tool to show the strengths and weaknesses of a system, and to visualize it so the results can be compared.

(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.
Re: NVMe device driver
Just can't stay away
@LiveForIt
Quote:
I feel we need a proper benchmark tool to show the strengths and weaknesses of a system, and to visualize it so the results can be compared.
I guess the results in the QEMU Emulation vs Hardware CPU Benchmarks topic are usable; they just need some visual representation.

Short summary: ARM CPUs are quite good at emulating PPC CPUs with QEmu, AMD64 CPUs suck at it (only exception: derfs' Ryzen 5600).

Re: NVMe device driver
Home away from home
@joerg

Yes.


But I'm thinking more about the benchmarks on Hans' webpage: we are bombarded with 16-bit benchmarks from emulation, while most real hardware uses 32-bit modes in the benchmarks. I feel it needs to be split into different categories.

32-bit will always get a lower score than 16-bit on the same hardware because 16-bit is half the data, provided there are no major byte-swap or GFX issues, that is.

Also, I'm not so interested in 1000s of QEMU scores; we only need to see the unique ones. Without knowing the host CPU as well, the benchmarks become kind of meaningless.

(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.
