Moving a Nutanix Hyper-V Cluster between Domains
So you have a shiny 4-node Server 2012 R2 Hyper-V Failover Cluster running on Nutanix, humming along with no problems. Sadly, you only have a single virtual domain controller, hosted somewhere else, that owns the AD for your servers and failover cluster. Crap, someone deleted the DC! Well, you wanted to move this cluster to your primary domain anyway, so I guess now you have an excuse. This is probably a very rare scenario, but should you be unlucky enough to have it happen to you, here is how to deal with it. All of this took place on my 4-node Nutanix cluster, which is honestly inconsequential but goes to show that the Nutanix storage cluster itself was entirely unaffected and frankly unconcerned by what I was doing. The same basic steps in this post should apply as long as your VMs are on stable, well-connected, shared storage.
In this scenario, there are really 2 ways to go:
Option A: start over, rebuild your hosts
Some might opt for this and as long as you don’t have data that needs preserving, go for it. In my case I have CVMs and VMs on every host, not interested.
Option B: change domains, rebuild the Failover Cluster, migrate VMs
This might seem messy, but since I have data I need to save, this is the route I’ll be going. This scenario assumes that networking was functioning prior to the migration and that it will remain the same. Adding IP, vSwitch or other network changes to this could really complicate things and is not recommended.
Core or GUI
If you’re a PowerShell master then staying in Core mode may not be an issue for you. If you’re not, you might want to convert your Server Core instance to full GUI mode first, if it isn’t there already. I wrote about how to do that here. While you’re at it, make sure all nodes are at exactly the same Windows patch level.
Out with the old
I’ll be transitioning 4 active nodes from a dead and gone domain named test1.com to my active domain dvs.com. First, power off all VMs and remove their association from the old cluster. We’re not touching the storage here so there will be no data loss.
Migration of each node will occur one at a time by first evicting the node to be converted from the old cluster. Important: do this BEFORE you change domains!
If, and only if, this is the last and final node you will be migrating, you can now safely destroy the old failover cluster. Do this only on the very last node!
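The teardown steps above could be scripted roughly as follows. This is a minimal sketch, not the method used in the post (which went through the GUI): the node name `HV-NODE1` is hypothetical, and it assumes the Nutanix CVMs are not cluster roles, so only clustered VMs get shut down. With the domain controller gone, calls that reach other nodes may fail with access denied, in which case run the VM shutdown locally on each host.

```powershell
# Run on a node that is still a member of the OLD cluster, BEFORE changing domains.
Import-Module FailoverClusters
Import-Module Hyper-V

# Hypothetical name of the node being migrated in this pass.
$nodeToEvict = 'HV-NODE1'

# Clustered VM roles only. @() guarantees an empty array (not $null) when none remain,
# so nothing is piped to Get-VM on later passes.
$vmGroups = @(Get-ClusterGroup | Where-Object { $_.GroupType -eq 'VirtualMachine' })

# Shut down the clustered VMs, then remove their roles from the old cluster.
# Removing the role does not delete the VM or its files.
$vmGroups | Get-VM | Where-Object { $_.State -ne 'Off' } | Stop-VM
$vmGroups | Remove-ClusterGroup -RemoveResources -Force

if (@(Get-ClusterNode).Count -gt 1) {
    # Other nodes remain: evict only the node being migrated.
    Remove-ClusterNode -Name $nodeToEvict -Force
} else {
    # Very last node: destroy the old cluster instead of evicting.
    Remove-Cluster -Force
}
```

The node-count check is one way to enforce the "destroy only on the last node" rule; doing that step by hand is just as valid.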
Once a node is ready to make the switch, change the host’s DNS entries to point to the DCs of the domain you will be migrating to, then join the new domain.
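As a rough sketch, the DNS change and domain join might look like this in PowerShell. The adapter alias and DNS server addresses are hypothetical placeholders; only the domain name dvs.com comes from this post. Because the old domain's DC no longer exists, the unjoin side of the move cannot be validated, so treat the exact behavior of `Add-Computer` here as an assumption and be prepared to drop the host into a workgroup first if the join is refused.

```powershell
# Hypothetical values: adjust the adapter alias and DNS servers for your environment.
$adapterAlias  = 'vEthernet (ExternalSwitch)'
$newDnsServers = '10.0.0.10', '10.0.0.11'

# Point the host at the DNS servers (DCs) of the target domain.
Set-DnsClientServerAddress -InterfaceAlias $adapterAlias -ServerAddresses $newDnsServers

# Confirm the new domain resolves before attempting the join.
Resolve-DnsName -Name 'dvs.com'

# Join the new domain with an account allowed to add computers, then reboot.
Add-Computer -DomainName 'dvs.com' -Credential (Get-Credential) -Restart
```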
In with the new
Once your first node is back up, create a new failover cluster. I’m reusing the same IP that was assigned to the old cluster to keep things simple. Since this is Nutanix, which manages its own storage, there are no disks to be managed by the failover cluster, so don’t import anything. Nothing bad will happen if you do, but Hyper-V doesn’t manage these disks so there’s no point. Also, if you run the cluster suitability checks they will fail on the shared storage piece.
Repeat this process for each node to be migrated, adding each to the new cluster. Next, import your pre-existing VMs into the new cluster and configure them for high availability.
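The rebuild could be sketched in PowerShell along these lines. The cluster name, IP address and node names are hypothetical, and the `NTNX-*-CVM` name filter is an assumption about how the Nutanix CVMs are named on your hosts; check it before running anything. The sketch also assumes the VMs are still registered in Hyper-V on each host, so making them highly available only requires adding them as cluster roles.

```powershell
Import-Module FailoverClusters
Import-Module Hyper-V

# Hypothetical values; the post reuses the old cluster's IP address.
$clusterName = 'HVCLUSTER'
$clusterIp   = '10.0.0.50'

# First migrated node only: create the new cluster without claiming any disks.
New-Cluster -Name $clusterName -Node 'HV-NODE1' -StaticAddress $clusterIp -NoStorage

# Each subsequent node, after it has joined the new domain and rebooted.
Add-ClusterNode -Cluster $clusterName -Name 'HV-NODE2' -NoStorage

# Run on each host: make the local VMs highly available, skipping the CVM
# (assumed name pattern), which must stay local to its host.
Get-VM |
    Where-Object { $_.Name -notlike 'NTNX-*-CVM' } |
    ForEach-Object { Add-ClusterVirtualMachineRole -Cluster $clusterName -VMName $_.Name }
```

`-NoStorage` mirrors the advice above not to hand any disks to the failover cluster.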
In Prism, just for good measure, update the FQDN in cluster details.
Let the Distributed Storage Fabric settle. Once the CVMs are happy, upgrade the NOS if desired.
Pretty easy if you do things in the right order. Here is a view of Prism with my 4 hosts converted and only CVMs running. Definitely not a busy cluster at the moment but it is a happy cluster ready for tougher tasks!
If things go badly
Maybe you pulled the trigger on the domain switch before you evicted nodes and destroyed the cluster? If so, any commands directed at the old cluster in the old domain will likely fail with access denied. You will be prevented from removing the node or destroying the old cluster.
If this happens you’ll need to manually remove and restore the cluster services on that node. Since none of the Cmdlets are working, it’s time to turn to the registry, where you’ll see entries reflecting the old cluster and old configuration. Find the “ClusDisk” and “ClusSVC” keys within the following path and delete them both:
HKLM\System\CurrentControlSet\Services\
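If you would rather not click through regedit, the same deletion can be sketched in PowerShell. The export to a backup folder is an extra precaution that is not part of the original procedure, and the `C:\Temp\ClusterRegBackup` path is hypothetical. Run this from an elevated prompt on the affected node only.

```powershell
# Registry location of the cluster service keys (same path as above, via the HKLM: drive).
$servicesPath = 'HKLM:\SYSTEM\CurrentControlSet\Services'

# Hypothetical backup location; exporting first is an added safety step.
$backupDir = 'C:\Temp\ClusterRegBackup'
New-Item -ItemType Directory -Path $backupDir -Force | Out-Null

foreach ($key in 'ClusDisk', 'ClusSvc') {
    $keyPath = Join-Path $servicesPath $key
    if (Test-Path $keyPath) {
        # Export the key to a .reg file, then delete it with all of its subkeys.
        reg.exe export "HKLM\SYSTEM\CurrentControlSet\Services\$key" (Join-Path $backupDir "$key.reg") /y
        Remove-Item -Path $keyPath -Recurse -Force
    }
}
```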
Now you can remove the Failover Clustering feature using the Remove Roles and Features wizard.
Reboot the host and install the Failover Clustering feature again. This will set the host back to square one from a clustering perspective, so you can now create a new cluster or join an existing one.
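For hosts still in Core mode, the wizard steps have a PowerShell equivalent. A minimal sketch, assuming an elevated session on the affected node; the two halves have to be run separately because the host reboots in between.

```powershell
# Part 1: remove the Failover Clustering feature and reboot if required.
Uninstall-WindowsFeature -Name Failover-Clustering -Restart

# Part 2 (after the reboot): reinstall the feature along with its management tools.
Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools

# Confirm the feature reports as installed before creating or joining a cluster.
Get-WindowsFeature -Name Failover-Clustering
```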
For more information...
Forcibly removing clustering features
Dell XC Web-scale Appliance Architectures for VDI
Dell XC Web-Scale Converged Appliances
nutanix.com