So I learnt something about APC UPSes today!
OK, after 2 hours I've figured out what happened to my system on Saturday morning! It would seem that a power supply in one of my 2 hosts failed. Being helpful, it sent a spike back through the APC UPS and tripped out the whole cabinet.
Of course, the VMs on that host didn't shut down nicely :( Unfortunately, it seems that the UPS that powers the switches failed as well (duff battery, which it hadn't told me about prior to this event). That was just bad luck, but it killed any chance of the first UPS telling the other servers that they were all running on battery! (So, to be clear, an unrelated UPS also failed when it got the spike!)
This was an HP G6 server PSU, so how on earth did it fail so badly? Why didn't the first UPS (APC 2200) stop the spike going back to the rest of the cabinet?