Re: [bitfolk] 2021-06-09 ~23:33Z - 2021-06-10 ~00:16Z: Emerg…

Author: Samuel Bächler
Date:  
To: users
Subject: Re: [bitfolk] 2021-06-09 ~23:33Z - 2021-06-10 ~00:16Z: Emergency reboot of server "clockwork"
Thank you for keeping us informed so transparently. LGS

On 10 June 2021 02:57:33 CEST, Andy Smith <andy@???> wrote:
>Hi,
>
>At around 00:33 BST (23:33Z) we started to receive alerts regarding
>services on host "clockwork". Upon investigation it was showing all
>the signs of being the intermittent "frozen I/O" problem we've been
>having:
>
>https://lists.bitfolk.com/lurker/message/20210425.071102.9d9a1cc5.en.html
>
>As mentioned in that earlier email, I'd decided that the next step
>would be to prepare new hypervisor packages, and I did that the next
>day.
>
>As this issue seems to happen only every few months and on different
>servers we do not yet know if the new packages fix the problem.
>They've been in use on other servers since late April without
>incident, but that isn't yet proof enough given the long periods
>between occurrences.
>
>Anyway, after "clockwork" was power cycled the new packages were
>installed there and then all VMs were started again. This was
>completed by about 01:16 BST (00:16Z).
>
>There are still many of our servers where we know this is going to
>happen again at some point. I don't feel comfortable scheduling
>maintenance to upgrade them when I don't know if the upgrade will be
>effective. If we can go a significant period of time on the newer
>version without incident then we will schedule a maintenance window
>to get the remaining servers on those versions too. It is also
>possible that there will be a security patch that forces a
>maintenance, in which case we'll upgrade the hypervisor packages to
>the newer version at the same time.
>
>There are also some servers still left to be emptied so that their
>operating systems can be upgraded. Those are "hen" and "paradox".
>Once they are emptied and upgraded they will of course be put on the
>newer version of the hypervisor. It is expected that customer
>services we move from these servers will be put on servers that
>already have the newer hypervisor version.
>
>Thank you for your patience and I apologise for the disruption. I'm
>doing all that I can to try to find a solution.
>
>Thanks,
>Andy
>
>--
>https://bitfolk.com/ -- No-nonsense VPS hosting


--
Sent from my Android device with K-9 Mail. Please excuse my brevity.