IBM Power Ideas Portal


This portal is for opening public enhancement requests against IBM Power Systems products, including IBM i. To view all of your ideas submitted to IBM, create and manage groups of ideas, or create an idea explicitly set to be either visible to all (public) or visible only to you and IBM (private), use the IBM Unified Ideas Portal (https://ideas.ibm.com).


Shape the future of IBM!

We invite you to shape the future of IBM, including product roadmaps, by submitting ideas that matter to you the most. Here's how it works:

Search existing ideas

Start by searching and reviewing ideas and requests to enhance a product or service. Take a look at ideas others have posted, and add a comment, vote, or subscribe to updates on them if they matter to you. If you can't find what you are looking for, post a new idea.

Post your ideas
  1. Post an idea.

  2. Get feedback from the IBM team and other customers to refine your idea.

  3. Follow the idea through the IBM Ideas process.


Specific links you will want to bookmark for future use

Welcome to the IBM Ideas Portal (https://www.ibm.com/ideas) - Use this site to find additional information and details about the IBM Ideas process and statuses.

IBM Unified Ideas Portal (https://ideas.ibm.com) - Use this site to view all of your ideas, create new ideas for any IBM product, or search for ideas across all of IBM.

ideasibm@us.ibm.com - Use this email to suggest enhancements to the Ideas process or request help from IBM for submitting your Ideas.

Status: Not under consideration
Created by: Guest
Created on: Sep 8, 2022

Finally leverage "the mainframe lesson" and fully exploit NUMA in Power Systems

Why run VIO servers on the same cores, DIMMs, buses, etc. as all your real workloads? Instead of a 2-socket / 2-NUMA-domain system, sell us an asymmetric 4-socket / 4-NUMA-domain system where two of the sockets / domains are much smaller and each contains exactly half of the I/O (NVMe / PCIe, CAPI, etc.). Run a VIO server in each one, and almost zero hardware interrupts would ever have to hit the "main" processing hardware. (The low-low-low end might only support one VIO module.)

If the fabric (for lack of a more precise term) between the NUMA domains scales well enough, you could add a ton of much smarter I/O modules and scale things further than just adding semi-smart drawers. If it scales really well, put "main memory" behind a similar module and leave the "main processors" to do almost all CPU-bound workloads. The OS and PowerVM would need a bunch of work for that last step, but might need near-zero work for the VIOS-to-I/O-module design.
Idea priority: Low
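For context on what NUMA domains look like from the software side today, here is a minimal sketch (not part of the idea above, and assuming a Linux partition that exposes the standard sysfs node directories) that enumerates each NUMA node with its CPUs and memory:

    # Minimal sketch: list NUMA nodes, their CPUs, and memory on a Linux LPAR.
    # Assumes the standard sysfs layout under /sys/devices/system/node.
    import glob
    import os
    import re

    def numa_topology():
        nodes = {}
        for node_dir in sorted(glob.glob("/sys/devices/system/node/node[0-9]*")):
            node_id = int(re.search(r"node(\d+)$", node_dir).group(1))
            # CPUs assigned to this node, as a cpulist string like "0-7,16-23"
            with open(os.path.join(node_dir, "cpulist")) as f:
                cpus = f.read().strip()
            # Total memory attached to this node, from the per-node meminfo
            mem_kb = 0
            with open(os.path.join(node_dir, "meminfo")) as f:
                for line in f:
                    if "MemTotal" in line:
                        mem_kb = int(line.split()[-2])
            nodes[node_id] = {"cpus": cpus, "mem_gib": mem_kb / (1024 * 1024)}
        return nodes

    if __name__ == "__main__":
        for node_id, info in sorted(numa_topology().items()):
            print(f"node {node_id}: cpus {info['cpus']}, {info['mem_gib']:.1f} GiB")

In the layout proposed above, two of these nodes would be small, I/O-heavy domains hosting only the VIO servers, while the remaining nodes would hold the compute and memory for the real workloads.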
  • Admin
    Maria del Carmen Ruvalcaba Cevallos
    Apr 10, 2023

    Thanks for submitting this idea. In general, it is definitely a good thing to offload work from the CPUs where possible, and there are a number of things that we do already. Starting with Power7+ we added separate data encryption engines and data compression engines to offload this work from software and allow the CPUs to do other work. Not only does this save CPU compute resources, but the offload engines are much faster. Regarding your specific suggestion to offload the I/O handling and virtualization to separate, customized CPUs, there are several considerations:

    1. Having dedicated CPU modules will be a challenge, since the number of cores needed for this is relatively low while the number of cores per socket is increasing significantly. Dedicating an entire socket to I/O and virtualization will be hard to optimize.

    2. There is a trade-off to be made here: today's implementation allows a highly affinitized configuration in which the I/O and memory can be local to the CPU socket or node, providing a low-latency, high-bandwidth partition (see the sketch after this thread). If the I/O were in a separate node, all of the I/O traffic would have to cross the processor fabric, which would be susceptible to bottleneck and contention issues.

    3. Lastly, instead of using CPUs for I/O offload, the industry is heading toward DPUs to offload network and storage tasks from the CPU. Where this makes sense, we will look to support DPUs embedded into PCIe cards, similar to how we leveraged SR-IOV virtualization on PCIe cards.

  • Admin
    Maria del Carmen Ruvalcaba Cevallos
    Mar 6, 2023

    Can you please describe "the mainframe lesson" and how it differs from using dedicated cores for VIOS?

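To make the affinity trade-off in the Apr 10, 2023 reply concrete, here is a minimal sketch (an illustration only, assuming a Linux partition with the standard sysfs paths; the PCI address is a hypothetical placeholder) of pinning a process to the CPUs of the NUMA node that is local to a given PCIe device, i.e. the "affinitized" placement that keeps I/O traffic off the inter-node fabric:

    # Minimal sketch: pin the current process to the CPUs of the NUMA node
    # local to a PCIe device, so its I/O handling stays off the inter-node fabric.
    # The PCI address used below is a hypothetical placeholder.
    import os

    def cpus_of_node(node_id):
        # Expand a sysfs cpulist string such as "0-7,16-23" into a set of CPU ids.
        with open(f"/sys/devices/system/node/node{node_id}/cpulist") as f:
            cpulist = f.read().strip()
        cpus = set()
        for part in cpulist.split(","):
            if "-" in part:
                lo, hi = part.split("-")
                cpus.update(range(int(lo), int(hi) + 1))
            else:
                cpus.add(int(part))
        return cpus

    def pin_near_device(pci_addr):
        # The numa_node attribute reads -1 when the platform does not report
        # device locality; in that case, leave the affinity mask unchanged.
        with open(f"/sys/bus/pci/devices/{pci_addr}/numa_node") as f:
            node = int(f.read().strip())
        if node >= 0:
            os.sched_setaffinity(0, cpus_of_node(node))

    if __name__ == "__main__":
        pin_near_device("0000:01:00.0")  # placeholder PCI address

Under the idea proposed here, this kind of locality would instead be achieved by parking the VIO servers, and the I/O they service, in their own small NUMA domains.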