This portal is to open public enhancement requests against IBM Power Systems products, including IBM i. To view all of your ideas submitted to IBM, create and manage groups of Ideas, or create an idea explicitly set to be either visible by all (public) or visible only to you and IBM (private), use the IBM Unified Ideas Portal (https://ideas.ibm.com).
vnicserver device needs to signal to phyp when it's shutting down to avoid network outage
Environment: Power9 server with a VIOS 220.127.116.11 pair that hosts vnicserver devices with backing SR-IOV adapters for the vNIC client devices in a failover configuration.
Problem: During shutdown or reboot of the VIOS that hosts the active vnicserver device for a client LPAR, the client LPAR experiences a network outage of about 10 seconds. When a manual vNIC failover is performed from the HMC, the network outage is less than 3 seconds.
Analysis of the phyp support team: Since it sounds like this is especially prevalent during VIOS shutdown, I'm wondering how events are sequenced in the VIOS during a shutdown. During normal operation, the VIOS reports failover status down to the hypervisor at regular intervals to indicate that the device is still operational. If the hypervisor does not receive status within a timeout period, the adapter is deemed unresponsive and a failover occurs. I would hope that the VIOS reports a non-operational status if they know they are shutting down, as that would cause the hypervisor to failover immediately. If the VIOS instead simply stops sending us events, it can take some time for the hypervisor to determine that the VIOS is unresponsive. From recent traces it looks like that timeout period is about 3 seconds - that's about how long it would take for us to initiate the failover if the VIOS suddenly becomes unresponsive.
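The difference between the two cases the phyp team describes (the VIOS going silent versus the VIOS reporting a non-operational status) can be illustrated with a small simulation. This is only a toy sketch, not PHYP or VIOS code: the class and function names are hypothetical, the 3-second timeout is taken from the traces mentioned above, and the 1-second reporting interval is an assumption, since the actual interval is not stated.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailoverMonitor:
    """Toy model of the hypervisor-side status check (hypothetical, times in ms)."""
    timeout_ms: int = 3_000               # "about 3 seconds" per the traces
    last_status_at: int = 0               # time of the most recent status report
    failed_over_at: Optional[int] = None  # set once a failover has been triggered

    def report(self, now: int, operational: bool) -> None:
        """Status report from the VIOS; a non-operational report fails over at once."""
        if self.failed_over_at is not None:
            return
        self.last_status_at = now
        if not operational:
            self.failed_over_at = now

    def tick(self, now: int) -> None:
        """Periodic check: fail over once no status has arrived for timeout_ms."""
        if self.failed_over_at is None and now - self.last_status_at >= self.timeout_ms:
            self.failed_over_at = now


def detection_delay_ms(signal_on_shutdown: bool,
                       shutdown_at_ms: int = 10_000,
                       interval_ms: int = 1_000,   # assumed reporting interval
                       step_ms: int = 100) -> int:
    """Return how long after the VIOS shutdown the monitor triggers a failover."""
    monitor = FailoverMonitor()
    now = 0
    while monitor.failed_over_at is None:
        if now < shutdown_at_ms:
            # Normal operation: periodic "still operational" reports.
            if now % interval_ms == 0:
                monitor.report(now, operational=True)
        elif signal_on_shutdown:
            # Requested behaviour: the VIOS announces that it is going down.
            monitor.report(now, operational=False)
        # Otherwise the VIOS has simply stopped sending reports.
        monitor.tick(now)
        now += step_ms
    return monitor.failed_over_at - shutdown_at_ms


if __name__ == "__main__":
    print("silent shutdown, detection delay (ms):", detection_delay_ms(False))
    print("signalled shutdown, detection delay (ms):", detection_delay_ms(True))
```

Under these assumptions, the silent shutdown is detected only once the timeout since the last report has expired, while the signalled shutdown is detected in the same step in which the non-operational report arrives. The model covers detection time only; it does not account for the remainder of the roughly 10-second outage reported above.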
Analysis of the vnicserver device driver team: vNIC failover is initiated by the hypervisor, and it looks like the HMC has a mechanism to tell the hypervisor to trigger a vNIC failover. The VIOS has no such mechanism; instead, the hypervisor fails over when it detects that the VIOS is down. This detection takes a little extra time, which means a longer network outage for the vNIC client while the VIOS is going down. To reduce the network outage during a VIOS reboot, the vNICServer team and the hypervisor team should collaborate on a mechanism, perhaps similar to the one already established between the HMC and the hypervisor. This is a development effort and hence needs an RFE.