Improvements for site Priority Override Location handling.
In a multi-site cluster where the resource group "Inter-Site
Management Policy" is set to "Prefer Primary Site", the site
"Priority Override Location" (POL) changes the cluster startup
and manual RG move behaviour.
PowerHA sets the site POL if you move a resource group to the
backup site with smit "Move Resource Groups to Another Site".
You can see the current site Priority Override Location
status with the clRGinfo -p command. If a "Site setting"
paragraph is present, then the site POL is set.
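A minimal sketch of automating this check in Python. It assumes only what is stated above: that the clRGinfo -p output contains a "Site setting" paragraph when a site POL is set. The function names are hypothetical, and the exact layout of the output is not assumed, so the check is a plain case-insensitive text search.

```python
import subprocess

# Paragraph title named in the text above; the exact capitalization in real
# output is an assumption, so the comparison below ignores case.
SITE_POL_MARKER = "Site setting"


def site_pol_is_set(clrginfo_output: str) -> bool:
    """Return True if the given clRGinfo -p output mentions a site setting."""
    return SITE_POL_MARKER.lower() in clrginfo_output.lower()


def check_cluster() -> bool:
    """Run clRGinfo -p on a cluster node and test its output.

    Assumes clRGinfo is on PATH and that the caller may run it.
    """
    result = subprocess.run(
        ["clRGinfo", "-p"], capture_output=True, text=True, check=True
    )
    return site_pol_is_set(result.stdout)
```

site_pol_is_set() takes the text as an argument so it can be tried on saved output without access to a cluster.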
In TS002306973 we agreed with the DCT team on these
actions:
I have no problem adding messages or changing smit
panels to make the user interface and behavior of
the product more understandable.
But as with everything, the question is not only
how to implement such changes, but what is the
right thing to implement in the first place.
Item 4 here is a perfect example: several years
ago I reworked the “move resource groups” smit
path so that “site enabled” groups could only be
moved through the “move to site” path and not the
“move to node” path.
I forget now if I was trying to fix a specific bug
or not, but I do recall thinking that there was no
good indication at the time to let the user know
they were dealing with site enabled groups and that
a “move to any random node” might actually involve
a site move. Or it may have been that we don't
actually allow it, e.g. you can only move a site
enabled group to a node of our choosing in the
other site.
Now Victor is requesting the “old” behavior – being
able to move to any random node regardless of site
– which was specifically undone.
So while I agree that we can make improvements in
this area, a good, comprehensive solution will take
some time and effort to develop, which is obviously
not something we can take on in response to a PMR.
1. clverify should print out a warning message if
site POL is set
Clverify should print a message if _any_ POL is set
or any other detectable change which may result in
resource groups moving. The challenge here is that
the POL information is in the clstrmgr and there is
no API to retrieve it (that I know of off the top of
my head). And detecting other changes that may cause
a move, like changing a dependency, may be even
more difficult.
But for a minimal solution for the APAR we could
probably figure out how to expose POLs (say through
lssrc output) and add a message in clverify.
2. smit "Move Resource Groups to Another Site": if
the RG site relationship is "prefer primary site"
and we move the RG to the backup site, it should
print out a message: "This operation sets the backup
site as the temporary highest priority site. Please
check clRGinfo -p output".
Agreed, and easy enough to implement.
3. smit "Move Resource Groups to Another Site":
if the RG site relationship is "prefer primary site"
and site POL is set and we move the RG to the primary
site, it should print out a message: "This operation
sets the primary site as highest priority site.
Please check clRGinfo -p output".
I agree in principle, but this faces the same challenges
as (1); if we solve (1), this one may be easy to do.
4. smit "Move Resource Groups to Another Node"
should list all available nodes regardless of
the site POL. If site POL is set, then maybe we can
highlight the nodes from the highest priority site.
Broadly, we need the user interface to reinforce to
the user that they are dealing with site enabled
groups and the possible consequences of moving to
any node. I tried to do this by massaging the smit
path, but there may be a better solution.
So while I agree in principle, this would require
some effort.
5. smit "Move Resource Groups to Another Node":
if the RG site relationship is "prefer primary site"
and we move the RG to the backup site, then smit
should have one more option: "Set the backup site as
the temporary highest priority site? true / false".
I am not sure we can support this: the API for sending
the request to clstrmgr may not support this (and
internally the clstrmgr may not even support a
“move without setting POL”).
So while I agree this would be a good improvement,
the implementation may be non-trivial.
6. smit "Move Resource Groups to Another Node": if
the RG site relationship is "prefer primary site",
site POL is set and we move the RG to the primary
site, then print out a message: "This operation sets
the primary site as highest priority site. Please
check clRGinfo -p output."
This assumes item (4) is implemented as well. We may
be able to add messages in some of the existing
panels and operations, but I am not sure we can
support this specifically.
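The message rules requested in items 2, 3 and 6 can be sketched as a small decision function. This is only an illustration in Python, not PowerHA code: the function and constant names are hypothetical, and the POL state is passed in as a parameter because, as noted under item 1, how to retrieve it is still an open question.

```python
from typing import Optional

PREFER_PRIMARY = "prefer primary site"

# Message texts taken from items 2, 3 and 6 above.
MSG_TO_BACKUP = (
    "This operation sets the backup site as the temporary "
    "highest priority site. Please check clRGinfo -p output."
)
MSG_TO_PRIMARY = (
    "This operation sets the primary site as highest "
    "priority site. Please check clRGinfo -p output."
)


def move_message(
    site_policy: str, target_is_primary: bool, site_pol_set: bool
) -> Optional[str]:
    """Return the message to print for an RG move, or None if none applies."""
    if site_policy.strip().lower() != PREFER_PRIMARY:
        return None  # the items only cover "prefer primary site" groups
    if not target_is_primary:
        return MSG_TO_BACKUP  # item 2: move to the backup site
    if site_pol_set:
        return MSG_TO_PRIMARY  # items 3 and 6: move back while POL is set
    return None  # move to the primary site with no POL set
```

For example, move_message("Prefer Primary Site", False, False) returns the item 2 message, while a move to the primary site with no POL set returns None.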