Did some maintenance last night as we have had building works, so I was cleaning the servers etc.
All the virtual servers on the cluster are fine. We have two physical servers that host the virtual servers and are clustered. However, something I am not too sure about: when I come to look at vrt02 I can't see my virtual servers, but I can on vrt03. In my event log I have this message about quorum. I looked at the article for Event ID 1573 and it doesn't really say what is actually wrong, it just states that you need to make sure the quorum is set to Node and Disk Majority (Cluster Disk 1).
Does it matter if my witness disk is 99.1% free?
And would this relate to some of the servers randomly restarting?
If you have 2 hosts then you need your quorum set to Node and Disk Majority, otherwise you can potentially have a stand-off if both servers think they should do different things; the disk gets the deciding vote (in basic terms).
That bit is easy to fix: right-click the cluster in Failover Cluster Manager, choose More Actions, then Configure Cluster Quorum Settings, and change your quorum type. (There is a scripted way to check this sketched a little further down.)
As for whether this would cause random reboots, I wouldn't have thought so, but it might.
Your witness disk being 99.1% free isn't a problem.
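If it is any use, here is a rough way to check the quorum setup from a script rather than the GUI. This is only a sketch: it assumes Python is available on one of the hosts and that the FailoverClusters PowerShell module is installed (it comes with the failover clustering feature on 2008 R2 and later), and the resource name "Cluster Disk 1" is just taken from your event log text.

import subprocess

def run_ps(command):
    # Run a PowerShell command from Python and return its text output.
    result = subprocess.run(
        ["powershell.exe", "-NoProfile", "-Command", command],
        capture_output=True, text=True,
    )
    return result.stdout.strip()

# Show the current quorum mode and which resource (if any) is the witness.
print(run_ps("Import-Module FailoverClusters; Get-ClusterQuorum | Format-List *"))

# Changing to Node and Disk Majority with your witness disk would be the
# one-liner below, which is the scripted equivalent of the GUI steps above.
# It is left commented out so running this sketch changes nothing.
#   Set-ClusterQuorum -NodeAndDiskMajority "Cluster Disk 1"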
OK, well if I change this, what impact is it going to have on my NAS and physical servers, and more importantly the virtual servers?
Changing the quorum setup? There should be no actual change you will notice (except you will now be running a supported configuration).
I have changed mine before, multiple times in fact; for example, when one host has failed I have needed to change the quorum mode, then change it back again once the host is fixed. I never suffered from random restarts though.
Are they "clean" restarts you are getting?
Strange. Well, I did a live migration of our WSUS server and it went over fine. I checked Failover Cluster Manager and it showed that server, but it is still not showing the other servers. However, Virtual Machine Manager tells me it is now hosted (being managed) by the new server. Strange.
You'll have to clarify exactly what is happening and on which servers; we can call them vrt02 and vrt03.
On vrt02, you cannot see the virtual servers (except the one you have recently moved?) when looking in Failover Cluster Manager.
On vrt03, you can see all the servers in Failover Cluster Manager? Those on both vrt02 and vrt03?
In Virtual Machine Manager, you can see all the servers on both vrt02 and vrt03?
If that is the case, then if you open Hyper-V Manager on vrt02 does it show any servers at all?
Which method did you use to migrate the server, through Virtual Machine Manager or through Failover Cluster Manager?
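To answer the "what is showing where" questions quickly, something like this sketch will list every clustered group (including the virtual machine roles) and the node that currently owns it. Same assumptions as before: Python plus the FailoverClusters module on the node you run it from.

import subprocess

# List each clustered group with its current owner node and state, which is
# effectively what Failover Cluster Manager shows per node.
ps = ("Import-Module FailoverClusters; "
      "Get-ClusterGroup | Select-Object Name, OwnerNode, State | Format-Table -AutoSize")
print(subprocess.run(["powershell.exe", "-NoProfile", "-Command", ps],
                     capture_output=True, text=True).stdout)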
This morning I came in and found that the server I moved last night (WSUS) from vrt03 to vrt02 is now back on vrt03.
I can see both servers in VMM.
When I did the move I used Failover Cluster Manager.
In Hyper-V, yes, I can see all the servers.
Has vrt02 restarted overnight? There would probably be messages in Failover Cluster Manager saying whether it had or not.
Did the servers reboot? Other event logs will show that.
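A quick way to answer the reboot question is to compare the last boot time of each host. The sketch below assumes the host names vrt02 and vrt03 resolve from wherever you run it and that remote WMI access is allowed; adjust the names to the real ones.

import subprocess

# Hypothetical host names taken from this thread; change to the real ones.
HOSTS = ["vrt02", "vrt03"]

for host in HOSTS:
    # LastBootUpTime shows whether the physical host itself rebooted,
    # independently of what the cluster service or the VMs did.
    ps = ("$os = Get-WmiObject Win32_OperatingSystem -ComputerName " + host + "; "
          "$os.ConvertToDateTime($os.LastBootUpTime)")
    out = subprocess.run(["powershell.exe", "-NoProfile", "-Command", ps],
                         capture_output=True, text=True).stdout.strip()
    print(host, "last booted:", out)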
It appears that your issue is to do with the networking; dropped packets maybe. What appears to happen is that communication is lost (for an unknown reason), so the cluster service fails, and if vrt02 fails first, even by seconds, the VMs on it move to the other node. Then the virtual machines are either all loaded onto one node because that one stays active (seems unlikely though), or once communication is restored the VMs just start on the node they were last on.
Are there any events at about 1 AM on the servers that give any more information, maybe network errors?
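If it helps to gather them in one go, a sketch like this pulls the warning and error entries from the System log for a window around 1 AM, which should include the FailoverClustering and network adapter messages together. The 00:30 to 02:00 window is just an assumption based on the times mentioned; run it on each host.

import subprocess

# Pull warning/error System log entries between 00:30 and 02:00 today so the
# cluster and network messages around the failure can be read side by side.
ps = (
    "$start = (Get-Date).Date.AddHours(0.5); "
    "$end = (Get-Date).Date.AddHours(2); "
    "Get-WinEvent -FilterHashtable @{LogName='System'; Level=1,2,3; "
    "StartTime=$start; EndTime=$end} | "
    "Select-Object TimeCreated, ProviderName, Id, Message | Format-List"
)
print(subprocess.run(["powershell.exe", "-NoProfile", "-Command", ps],
                     capture_output=True, text=True).stdout)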
Looks like they have. I have just looked at the uptime and they are all on about 6 hours, but that is on vrt03. I checked the storage box and it does look OK; the only time we took the VMs offline was the other day on the 10th, but we had to because we had to clean the servers. It still doesn't explain why everything was OK last night and then the server moved itself from vrt02 to vrt03.
I have just checked on vrt02: at about 1:06:18 there is a critical FailoverClustering error, Event ID 1135:
Cluster node 'WPVRT03' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
Cluster resource 'Cluster Disk 1' in clustered service or application 'Cluster Group' failed. @ 1:06:25
Cluster resource 'Cluster Disk 1' in clustered service or application 'Cluster Group' failed. @ 1:06:32
The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk.
Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges. @ 1:06:39 Quorum Manager WPVRT02
Which of the hosts is controlling Cluster Disk 1 at the moment? I assume it will be VRT03.
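You can check that without clicking around; this sketch just reads the owner of the witness disk (again assuming Python, the FailoverClusters module, and the resource name "Cluster Disk 1" from your logs).

import subprocess

# Show which node currently owns 'Cluster Disk 1' and whether it is online.
ps = ("Import-Module FailoverClusters; "
      "Get-ClusterResource 'Cluster Disk 1' | "
      "Select-Object Name, OwnerNode, OwnerGroup, State | Format-List")
print(subprocess.run(["powershell.exe", "-NoProfile", "-Command", ps],
                     capture_output=True, text=True).stdout)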
OK, so it would appear both servers are losing their connection, hence all the VMs are starting again, since you can't see any evidence of a server reboot (the other logs don't show it). Have you looked at any logs on the switch they go through to see if that rebooted overnight?
Also what type of storage are you connecting to, and how are you connecting?