One of the nodes in our VoIP cluster (voip-3) entered a complex failure state overnight, which resulted in a number of incomplete calls.
We have not yet identified the full cause of this failure, although a reboot did resolve the issue on the affected node. We will therefore be carrying out emergency maintenance this evening to investigate further and apply any necessary updates to the cluster.
The impact of this maintenance should be minimal. However, we want to notify customers that there will be an “at risk” period this evening from 21:00 to 04:00 while we work on the cluster.
In the meantime, we are monitoring the service closely for any further signs of problems. The node in question had been online for more than a year without any previous issue being identified, and no changes had been made to the system for several weeks prior to the failure.