PS5 - a patent dive into what might be the tech behind Sony's SSD customisations


http://www.freepatentsonline.com/y2017/0097897.html

 

Copy pasta

 

 

It talks about the limitations of simply using an SSD 'as is' in a games system, and a set of hardware and software stack changes to improve performance.

Basically, 'as is', an OS uses a virtual file system, designed to virtualise a host of different I/O devices with different characteristics. Various tasks of this file system typically run on the CPU - e.g. traversing file metadata, data tamper checks, data decryption, data decompression. This processing, and interruptions on the CPU, can become a bottleneck to data transfer rates from an SSD, particularly in certain contexts e.g. opening a large number of small files.

At a lower level, SSDs typically employ a data block size aimed at generic use. They distribute blocks of data around the NAND memory to spread wear. In order to find a file, the memory controller in the SSD has to translate a request to the physical addresses of the data blocks using a look-up table. In a regular SSD, the typical data block size might require a look-up table 1GB in size for a 1TB SSD. An SSD might typically use DRAM to cache that look-up table - so the memory controller consults DRAM before being able to retrieve the data. The patent describes this as another potential bottleneck.
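To see where a figure like 1GB per 1TB can come from, here is a rough sketch of the arithmetic. The 4 KiB block size and 4-byte table entries are my assumptions for illustration (the post above only gives the 1GB/1TB ratio), and `lookup_table_bytes` is a made-up helper name.

```python
# Rough sizing of a flat logical-to-physical address table.
# ASSUMPTIONS (not from the patent summary): 4 KiB mapping blocks, 4-byte entries.

KIB = 1 << 10
GIB = 1 << 30
TIB = 1 << 40


def lookup_table_bytes(capacity_bytes: int, block_bytes: int, entry_bytes: int = 4) -> int:
    """Return the table size when every block of the drive gets one entry."""
    blocks = capacity_bytes // block_bytes
    return blocks * entry_bytes


table = lookup_table_bytes(TIB, 4 * KIB)
print(table // GIB, "GiB of table for 1 TiB of flash")  # prints "1 GiB of table for 1 TiB of flash"
```

Under those assumptions the table grows linearly with capacity, which is why it ends up in DRAM rather than on-chip memory.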

Here are the hardware changes the patent proposes vs a 'typical' SSD system:

- SRAM instead of DRAM inside the SSD for lower latency and higher throughput access between the flash memory controller and the address lookup data. The patent proposes using a coarser granularity of data access for data that is written once, and not re-written - e.g. game install data. This larger block size can allow for address lookup tables as small as 32KB, instead of 1GB. Data read by the memory controller can also be buffered in SRAM for ECC checks instead of DRAM (because of changes made further up the stack, described later). The patent also notes that by ditching DRAM, reduced complexity and cost may be possible, and cost will scale better with larger SSDs that would otherwise need e.g. 2GB of DRAM for 2TB of storage, and so on.

- The SSD's read unit is 'expanded and unified' for efficient read operations.

- A secondary CPU, a DMAC, and a hardware accelerator for decoding, tamper checking and decompression.

- The main CPU, the secondary CPU, the system memory controller and the IO bus are connected by a coherent bus. The patent notes that the secondary CPU can be different in instruction set etc. from the main CPU, as long as they use the same page size and are connected by a coherent bus.

- The hardware accelerator and the IO controller are connected to the IO bus.
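As a sanity check on the 32KB figure in the first bullet, this sketch works backwards from a table size to the mapping granularity it implies. The 4-byte entry size is an assumption on my part, and `block_size_for_table` is just an illustrative name, not anything from the patent.

```python
# Work backwards: how coarse must the mapping be for the table to fit in a given size?
# ASSUMPTION: one 4-byte entry per mapped block (not stated in the post).

KIB = 1 << 10
MIB = 1 << 20
TIB = 1 << 40


def block_size_for_table(capacity_bytes: int, table_bytes: int, entry_bytes: int = 4) -> int:
    """Return the block size at which the whole drive is covered by the table."""
    entries = table_bytes // entry_bytes
    return capacity_bytes // entries


granularity = block_size_for_table(TIB, 32 * KIB)
print(granularity // MIB, "MiB per table entry")  # prints "128 MiB per table entry"
```

With these assumptions a 32 KiB table on a 1 TiB drive means each entry covers 128 MiB, which only makes sense for large, write-once data such as game installs.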

An illustrative diagram of the system:

 

uS6bo2P.png

 

 

 

At a software level, the system adds a new file system, the 'File Archive API', designed primarily for write-once data like game installs. Unlike a more generic virtual file system, it's optimised for NAND data access. It sits at the interface between the application and the NAND drivers, and the hardware accelerator drivers.

The secondary CPU arbitrates priority of access to the SSD. When read requests are made through the File Archive API, all other read and write requests can be prohibited to maximise read throughput.

When a read request is made by the main CPU, it sends it to the secondary CPU, which splits the request into a larger number of small data accesses. It does this for two reasons - to maximise parallel use of the NAND devices and channels (the 'expanded read unit'), and to make blocks small enough to be buffered and checked inside the SSD SRAM. The metadata the secondary CPU needs to traverse is much simpler (and thus faster to process) than under a typical virtual file system.
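A minimal sketch of that splitting step, just to make the idea concrete. The chunk size, the number of channels and the round-robin mapping of chunks to channels are all assumptions for illustration; the post does not say how the patent actually assigns sub-reads to NAND channels.

```python
# Split one large read into chunk-aligned sub-reads spread across NAND channels.
# ASSUMPTIONS: fixed chunk size and round-robin chunk-to-channel striping.
from typing import List, NamedTuple


class SubRead(NamedTuple):
    channel: int  # hypothetical NAND channel index
    offset: int   # byte offset of this piece
    length: int   # byte length of this piece


def split_read(offset: int, length: int, chunk: int, channels: int) -> List[SubRead]:
    """Cut [offset, offset + length) at chunk boundaries, striping chunks over channels."""
    subs = []
    pos, end = offset, offset + length
    while pos < end:
        boundary = (pos // chunk + 1) * chunk  # next chunk boundary after pos
        n = min(boundary, end) - pos
        subs.append(SubRead((pos // chunk) % channels, pos, n))
        pos += n
    return subs


for sub in split_read(0, 256 * 1024, 64 * 1024, channels=4):
    print(sub)
```

Each sub-read is small enough to sit in a modest buffer, and consecutive sub-reads land on different channels so they can be serviced in parallel.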

The NAND memory controller can be flexible about what granularity of data it uses - for data requests sent through the File Archive API, it uses granularities that allow the address lookup table to be stored entirely in SRAM for minimal bottlenecking. Other granularities can be used for data that needs to be rewritten more often - user save data for example. In these cases, the SRAM partially caches the lookup tables.

When the SSD has checked its retrieved data, it's sent from SSD SRAM to kernel memory in the system RAM. The hardware accelerator then uses a DMAC to read that data, do its processing, and then write it back to user memory in system RAM. The coordination of this happens with signals between the components, and not involving the main CPU. The main CPU is then finally signalled when data is ready, but is uninvolved until that point.
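Here's a toy software model of what the accelerator stage does to a chunk sitting in kernel memory: tamper check, decrypt, decompress, then hand the result over as the "user memory" copy. Everything specific here is an assumption: the order of the steps, SHA-256 as the tamper check, zlib as the compressor, and a single-byte XOR standing in for real decryption. The patent summary above names none of these.

```python
# Toy model of the accelerator stage: verify, decrypt, decompress.
# ASSUMPTIONS: SHA-256 digest, zlib compression, XOR as a stand-in cipher, and this step order.
import hashlib
import zlib
from typing import Tuple

KEY = 0x5A  # hypothetical single-byte key, illustration only


def xor_cipher(data: bytes, key: int = KEY) -> bytes:
    """Stand-in for real decryption; XOR is its own inverse."""
    return bytes(b ^ key for b in data)


def pack(plain: bytes) -> Tuple[bytes, bytes]:
    """Build a stored chunk (compressed then 'encrypted') plus its digest."""
    stored = xor_cipher(zlib.compress(plain))
    return stored, hashlib.sha256(stored).digest()


def accelerator_process(kernel_buf: bytes, digest: bytes) -> bytes:
    """Check the chunk read into kernel memory, then return the user-memory payload."""
    if hashlib.sha256(kernel_buf).digest() != digest:
        raise ValueError("tamper check failed")
    return zlib.decompress(xor_cipher(kernel_buf))


stored, digest = pack(b"game data " * 100)
user_buf = accelerator_process(stored, digest)
print(len(stored), "bytes stored,", len(user_buf), "bytes delivered")
```

In the patent's design this work happens in dedicated hardware fed by the DMAC, so the main CPU only sees the final signal that the data is ready.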

A diagram illustrating data flow:

 

 

CYH6AMw.png

 

Though I wouldn't read too much into this, in most examples it talks about what you would need to support an end-to-end transfer rate of 10GB/s.

The patent is also silent on what exactly the IO bus would be - that would obviously be a key bottleneck itself on transfer rates out of the NAND devices. Until we know what that is, it's hard to know what the upper end on the transfer rates could be, but it seems a host of customisations are possible to try to maximise whatever that bus will support.

Once again, this is one described embodiment, not necessarily exactly what the PS5 solution will look like. But it is an idea of what Sony's been researching in how to customise an SSD and software stack for faster read throughput for installed game data.

This guy is claiming that when PS4 games are properly patched, load times can be improved significantly (e.g. Spider-Man); however, if not patched the games still load really fast... around 32% faster than an NVMe PCIe SSD (specifically Samsung PM1725a). Assuming this is real (bag of salt), this is pretty damn sick.

Just now, lynux3 said:
This guy is claiming that when PS4 games are properly patched, load times can be improved significantly (e.g. Spider-Man); however, if not patched the games still load really fast... around 32% faster than an NVMe PCIe SSD (specifically Samsung PM1725a). Assuming this is real (bag of salt), this is pretty damn sick.

It sounds pretty awesome.  Mind you, PCIe 4 is right around the corner and PCIe 5 is supposed to be coming in 2021.  Still, they can't design games around a small fraction of the user base that have NVMe drives and PCIe 4+... so it will be interesting to see how it goes.  I can see high end games on PCs which don't meet the transfer rate requirements requiring more RAM and longer initial load times to reduce streaming pressure on the I/O.

 

NVMe drives are getting cheaper all the time and by 2020/21 adoption will be really high.

 

The best thing about new console launches for me is that it brings the baseline up.  And these consoles are basically going to raise the bar way past the average PC.  Forcing all the stragglers weighing us down to either upgrade or get lost :tom: 

 

PS5 is gonna be a beast. :whew: 


There won't exist a game in the next 5 years that will require anything faster than, say, SATA III's 6Gb/s transfer rate. Big difference between something being vital to operation and a big boost to loading like Sony's implementation here. 

 

Which is nice of course. Always figured it would be some kind of custom bus that the storage was directly tied to. But the goal is to get content from the storage to the RAM. Anyone who is playing on Windows going forward is going to have a minimum of 16GB and many gamers are going for 32. 


By 2020/21 a large number of the PC gamers who are interested in playing next gen games will already have 32/64GB of RAM.

 

I think we'll simply see games on PC require more RAM to offset the storage I/O.  Currently, they don't design games in that way because there's no real need to.  They want to keep RAM requirements as low as possible to reach the maximum number of users (8GB of RAM).  But with next gen games that are designed to really take advantage of Sony's storage tech, they'll have to utilize one of PC's strengths, which is memory capacity.  If you're zooming around in a game super fast and it requires tons of I/O bandwidth to stream it in, then short of requiring PCIe 4.0 (which they won't), you're going to have to have a longer initial load and store more of the level/world in memory, so that it doesn't have to thrash the I/O bandwidth and can stream data into RAM at a somewhat slower rate.

 

 

22 hours ago, lynux3 said:

This guy is claiming that when PS4 games are properly patched, load times can be improved significantly (e.g. Spider-Man); however, if not patched the games still load really fast... around 32% faster than an NVMe PCIe SSD (specifically Samsung PM1725a). Assuming this is real (bag of salt), this is pretty damn sick.

So the guy who posted this had an avatar. If you look close enough there's a link below it:

 

K740fAa.png

 

The link reveals the supposed specifications of the devkit:

 

Quote

pfnbeKW.png

48 CUs

1849MHz GPU Clock

715MHz MemClock

4096bit bus

18GB HBM2

~11.3TF
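For what it's worth, the listed numbers are at least internally consistent. A quick check, assuming 64 shader lanes per CU and 2 FLOPs per lane per clock (one fused multiply-add), plus double data rate on the HBM2 bus. Those multipliers are my assumptions about the architecture and are not in the screenshot, and the bandwidth figure is derived here, not listed there.

```python
# Check the leaked figures against the usual peak-throughput formulas.
# ASSUMPTIONS: 64 lanes per CU, 2 FLOPs per lane per clock (FMA), 2 transfers per memory clock.


def gpu_tflops(cus: int, clock_mhz: float, lanes_per_cu: int = 64, flops_per_lane: int = 2) -> float:
    """Peak single-precision TFLOPS."""
    return cus * lanes_per_cu * flops_per_lane * clock_mhz * 1e6 / 1e12


def memory_bandwidth_gbs(bus_bits: int, mem_clock_mhz: float, transfers_per_clock: int = 2) -> float:
    """Peak memory bandwidth in GB/s."""
    return bus_bits / 8 * mem_clock_mhz * 1e6 * transfers_per_clock / 1e9


print(f"{gpu_tflops(48, 1849):.2f} TFLOPS")                  # prints "11.36 TFLOPS"
print(f"{memory_bandwidth_gbs(4096, 715):.0f} GB/s")         # prints "732 GB/s"
```

So 48 CUs at 1849MHz does land on the quoted ~11.3TF, which says the list adds up, not that it's real.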

Edited by lynux3
3 minutes ago, lynux3 said:

So the guy who posted this had an avatar. If you look close enough there's a link below it:

 

K740fAa.png

 

The link reveals the supposed specifications of the devkit:

 

48 CUs

1849MHz GPU Clock

715MHz MemClock

4096bit bus

18GB HBM2

~11.3TF

18 GB of HBM2

 

Make sure that Zima is cold :aitch:

1 minute ago, DynamiteCop! said:

18 GB of HBM2

 

Make sure that Zima is cold :aitch:

I definitely will. :whew: I'll hit you up on PSN in party chat and we'll drink one together if this shit is true. :D

13 minutes ago, lynux3 said:

So the guy who posted this had an avatar. If you look close enough there's a link below it:

 

K740fAa.png

 

The link reveals the supposed specifications of the devkit:

 

48 CUs

1849MHz GPU Clock

715MHz MemClock

4096bit bus

18GB HBM2

~11.3TF

He was already outed as fake :tom: 

2 minutes ago, JONBpc said:

Next gen consoles showing how much of a scam PC gaming is LOL

Just wait 1.5 more years.... for them to begin to release... another year after that for games to start really taking advantage of them :drake: 

 

PCs will move far beyond that shit :smoke: 

1 minute ago, Remij_ said:

Just wait 1.5 more years.... for them to begin to release... another year after that for games to start really taking advantage of them :drake: 

 

PCs will move far beyond that shit :smoke: 

PCs are locked to what consoles are regardless. If consoles are doing 4K60, there's no point to PC.  Oh, slightly more draw distance. Rofl 

1 minute ago, JONBpc said:

PCs are locked to what consoles are regardless. If consoles are doing 4K60, there's no point to PC.  Oh, slightly more draw distance. Rofl 

There is a point.  The point is not to own a bunch of fucking stupid consoles :tom: 
