[bitfolk] Upcoming maintenance work / how to handle notifica…

Author: Andy Smith
Date:  
To: users
Subject: [bitfolk] Upcoming maintenance work / how to handle notifications

gpg: Signature made Thu Mar 18 12:34:31 2021 UTC
gpg: using DSA key 2099B64CBF15490B
gpg: Good signature from "Andy Smith <andy@strugglers.net>" [unknown]
gpg: aka "Andrew James Smith <andy@strugglers.net>" [unknown]
gpg: aka "Andy Smith (UKUUG) <andy.smith@ukuug.org>" [unknown]
gpg: aka "Andy Smith (BitFolk Ltd.) <andy@bitfolk.com>" [unknown]
gpg: aka "Andy Smith (Linux User Groups UK) <andy@lug.org.uk>" [unknown]
gpg: aka "Andy Smith (Cernio Technology Cooperative) <andy.smith@cernio.com>" [unknown]
Hi,

There's some major and unavoidable upcoming maintenance work being
undertaken by our colo provider. The effect on BitFolk is that 7 of
our servers need to be physically removed from one rack and moved to
a different rack in a different room in Telehouse.

This is part of the colo provider's need to rebuild the entire rack,
so it's not in our control, it's unavoidable, and we have limited say
over the schedule as it also affects all their other customers in
that rack. It's a big piece of work, though not complicated (for
us).

I'm writing to you because I'm not sure of the best way to notify
customers about this work and would like you to give me some
feedback on how you'd prefer the communication to work.

We haven't yet agreed firm dates/times but it's going to be no
sooner than a month from now and probably no later than six weeks
from now. We're going to be upgrading the servers to 10GE networking
so we're going to move one of them first, wait a week to be sure the
hardware is stable in that configuration and then do the remaining
six the following week. The first one may happen evening UK time,
like 9pm or something, but the rest will likely happen a bit later,
around midnight into early hours. So if affected, assume you're in
that latter group, and expect half an hour or so of being powered
off.

An additional complication is that this comes while we are right in
the middle of doing rolling upgrades of our fleet of servers. That
work was started before I was made aware that the server move would
be necessary, otherwise I might have postponed it.

Some of you will have already been through that rolling upgrade
process or be going through it now. As that's completely within our
control we've been moving customers between servers one by one, at
times convenient to you and individual to you, and then upgrading
the server once it's empty.

The consequence of that is, we don't know which customers will be on
which servers come the date of the move. We could send out
personalised notices as soon as we know the date, but some of those
people won't be on the affected servers when the time comes, and
some people who didn't get the notice WILL be affected when the time
comes.

As part of the rolling upgrades I'm trying to avoid moving anyone
from a server that's not going to be moved onto one that will be
moved. It is however unavoidable that some people will be moved
between two servers that are going to be moved, so some will get one
very short outage for the move of their VM and then the later longer
outage when the whole server is relocated.

So is there any value in sending out personalised advance notice of
maintenance more than a month ahead? Would it be better just to send
notice to the announce@ address giving the list of affected servers
and the time it's going to happen and then a refresher a week ahead
and again a day ahead or something?

If the prospect of being powered off for half an hour or so a month
from now is not acceptable to anyone, then we can most likely move
their VM to another server ahead of time - one that we know won't be
involved in the move - by semi-live migration if necessary. That's
the only per-customer thing we can do and it's extra work so we will
only do it if people ask for it. There isn't enough spare capacity
to do it for everyone so doing migration as default for everyone is
not an option. But happy to do it on a case by case basis.

Your thoughts?

I just want to reiterate that these server moves (part of larger
work involving our colo provider and their other customers) cannot be
negotiated individually with the several hundred BitFolk customers
that will be affected. I'm asking you only about broad arrangements
for notice that can be applied to the whole customer base. Once the
date/time is decided, the only per-customer choice to be made is
whether you require us to do extra work to migrate your VM
ahead of time, and you don't need to tell me that now as there will
be an announcement when the date is known (soon, possibly even
today).

Cheers,
Andy

--
https://bitfolk.com/ -- No-nonsense VPS hosting