Re: [bitfolk] Upcoming maintenance work / how to handle noti…

Author: Tim Bannister
Date:  
To: Andy Smith, users
Subject: Re: [bitfolk] Upcoming maintenance work / how to handle notifications

On 18 March 2021 12:34:54 Andy Smith <andy@???> wrote:

> Hi,
>
> There's some major and unavoidable upcoming maintenance work being
> undertaken by our colo provider. The effect on BitFolk is that 7 of
> our servers need to be physically removed from one rack and moved to
> a different rack in a different room in Telehouse.
>
> This is part of the colo provider's need to rebuild the entire rack,
> so it's not in our control, is unavoidable and we have limited say
> over the schedule as it also affects all their other customers in
> that rack as well. It's a big piece of work, though not complicated
> (for us).
>
> I'm writing to you because I'm not sure of the best way to notify
> customers about this work and would like you to give me some
> feedback on how you'd prefer the communication to work.
>
> We haven't yet agreed firm dates/times but it's going to be no
> sooner than a month from now and probably no later than six weeks
> from now. We're going to be upgrading the servers to 10GE networking
> so we're going to move one of them first, wait a week to be sure the
> hardware is stable in that configuration and then do the remaining
> six the following week. The first one may happen evening UK time,
> like 9pm or something, but the rest will likely happen a bit later,
> around midnight into early hours. So if affected, assume you're in
> that latter group, and expect half an hour or so of being powered
> off.
>
> An additional complication is that this comes while we are right in
> the middle of doing rolling upgrades of our fleet of servers. That
> work was started before I was made aware that the server move would
> be necessary, otherwise I might have postponed it.
>
> Some of you will have already been through that rolling upgrade
> process or be going through it now. As that's completely within our
> control we've been moving customers between servers one by one, at
> times convenient to you and individual to you, and then upgrading
> the server once it's empty.
>
> The consequence of that is, we don't know which customers will be on
> which servers come the date of the move. We could send out
> personalised notices as soon as we know the date, but some of those
> people won't be on the affected servers when the time comes, and
> some people who didn't get the notice WILL be affected when the time
> comes.
>
> As part of the rolling upgrades I'm trying to avoid moving anyone
> from a server that's not going to be moved onto one that will be
> moved. It is however unavoidable that some people will be moved
> between two servers that are going to be moved, so some will get one
> very short outage for the move of their VM and then the later longer
> outage when the whole server is relocated.
>
> So is there any value in sending out personalised advance notice of
> maintenance more than a month ahead? Would it be better just to send
> notice to the announce@ address giving the list of affected servers
> and the time it's going to happen and then a refresher a week ahead
> and again a day ahead or something?
>
> If the prospect of being powered off for half an hour or so a month
> from now is not acceptable to anyone, then we can most likely move
> their VM to another server ahead of time - one that we know won't be
> involved in the move. By semi-live migration if necessary. That's
> the only per-customer thing we can do and it's extra work so we will
> only do it if people ask for it. There isn't enough spare capacity
> to do it for everyone so doing migration as default for everyone is
> not an option. But happy to do it on a case by case basis.
>
> Your thoughts?
>
> I just want to reiterate that these server moves, part of larger
> work involving our colo provider and their other customers, cannot be
> negotiated individually with the several hundred BitFolk customers
> that will be affected. I'm asking you only about broad arrangements
> for notice that can be applied to the whole customer base. Once the
> date/time is decided the only per-customer action that can take
> place is whether you require us to do extra work to migrate your VM
> ahead of time, and you don't need to tell me that now as there will
> be an announcement when the date is known (soon, possibly even
> today).
>
> Cheers,
> Andy

If my VM had access to metadata about itself, that it could fetch via HTTP,
I think that'd be useful. The metadata could include upcoming maintenance.

But if there's no such metadata and the service overall is pretty
available, I don't mind a bit.
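
To make the metadata idea a bit more concrete, here is a rough sketch of what
a client inside the VM might look like. Everything about the service is an
assumption for illustration only: the URL, the JSON layout and the field
names ("maintenance", "start", "end", "summary") are hypothetical, and no
such endpoint is known to exist.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Hypothetical endpoint, invented for this sketch; not a real service.
METADATA_URL = "http://metadata.example.invalid/v1/maintenance"


def fetch_metadata(url=METADATA_URL, timeout=5):
    """Fetch and decode the (assumed) JSON metadata document over HTTP."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def upcoming_maintenance(metadata, now=None):
    """Return (start, end, summary) tuples for windows that have not ended.

    Assumes each entry carries ISO 8601 timestamps with an explicit UTC
    offset, e.g. "2030-01-01T21:00:00+00:00". Results are sorted by start.
    """
    now = now or datetime.now(timezone.utc)
    windows = []
    for item in metadata.get("maintenance", []):
        start = datetime.fromisoformat(item["start"])
        end = datetime.fromisoformat(item["end"])
        if end > now:  # skip windows that are already over
            windows.append((start, end, item.get("summary", "")))
    return sorted(windows)
```

A cron job could call upcoming_maintenance(fetch_metadata()) once a day and
send mail or post to a chat channel when the list is not empty, so the notice
follows the VM to whichever server it ends up on.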

Tim