[bitfolk] Tonight's outage

Author: Andy Smith
Date:  
To: users
Subject: [bitfolk] Tonight's outage

gpg: Signature made Wed May 7 03:06:04 2008 UTC using DSA key ID BF15490B
gpg: Good signature from "Andy Smith <andy@strugglers.net>"
gpg: aka "Andrew James Smith <andy@strugglers.net>"
gpg: aka "Andy Smith (UKUUG) <andy.smith@ukuug.org>"
gpg: aka "Andy Smith (BitFolk Ltd.) <andy@bitfolk.com>"
gpg: aka "Andy Smith (Linux User Groups UK) <andy@lug.org.uk>"
gpg: aka "Andy Smith (Cernio Technology Cooperative) <andy.smith@cernio.com>"
Hi folks,

Tonight at approximately 21.40 GMT the UK network became unreachable.
It soon became apparent that a large number of my colo provider's
other customers were also unreachable, and after checking that Jump
were aware, all I could do was keep checking for more information.

It has now been determined that a 32A commando socket & plug arced
over and burnt out, cutting power to the rack in TFM4. The circuit
was not overloaded (no more than 20A) so it's a suspected faulty
part which Telehouse ops needed to replace. Exact details are not
known at this stage and more information will hopefully be available
later today.

At approximately 01.40 GMT service was restored. All customers on
curacao and islay suffered a hard power cycle, but all VPSes should
have been up since this time.

VPSes on servers in TFM8 (corona and kwak) did not lose power or
network. However, both resolvers (212.13.194.71 and 212.13.194.96) are
hosted on VPSes on curacao and islay respectively. With the majority of
customers set up to use these resolvers, even those unaffected by
the power outage in TFM4 would have been without DNS resolution for
the duration.

I will have more to say once I know more details, but given what is
known at present I don't think there was much that either BitFolk or
Jump could have done to avoid this problem or expedite the
resolution. Certainly there should be resolvers in both suites,
which is something I will attend to promptly, but that is about it.
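
To illustrate the resolver point, here is a minimal Python sketch. The
suite assignments for the two existing resolvers come from this message;
the PROPOSED layout and the name "resolver-in-tfm8.example" are
hypothetical placeholders, not a decided address or plan. It only reasons
about placement and does not send any DNS queries.

```python
# Sketch only: checks whether losing one suite would take out every resolver.

# Layout during tonight's outage, as described in this message.
TONIGHT = {
    "212.13.194.71": "TFM4",  # VPS on curacao
    "212.13.194.96": "TFM4",  # VPS on islay
}

# Hypothetical layout with one resolver in each suite (placeholder name).
PROPOSED = {
    "212.13.194.71": "TFM4",
    "resolver-in-tfm8.example": "TFM8",  # assumption: address not decided
}


def surviving_resolvers(layout, failed_suite):
    """Return the resolvers that are not hosted in the failed suite."""
    return sorted(addr for addr, suite in layout.items() if suite != failed_suite)


def single_points_of_failure(layout):
    """Return the suites whose loss would leave no resolver reachable."""
    suites = set(layout.values())
    return sorted(s for s in suites if not surviving_resolvers(layout, s))


if __name__ == "__main__":
    print(single_points_of_failure(TONIGHT))   # ['TFM4']
    print(single_points_of_failure(PROPOSED))  # []
```

With both resolvers in TFM4, that suite is a single point of failure for
DNS; with one resolver per suite, no single suite is.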

It's difficult to say if dual-powered servers would have avoided
this (more info needed on exactly which feeds died and why) but this
would involve very different hardware at higher expense.

During the outage, this mailing list and the BitFolk web site were
also unavailable. As usual, more up-to-date information was
available in the IRC channel, and while that is an unofficial support
method, I would recommend dropping in during situations like this.
I've also decided to use the #BitFolk wiki page at
http://wiki.blitzed.org/Channel:bitfolk/Outages to copy information
about outages, as it is not hosted at BitFolk.

Thanks,
Andy

-- 
http://bitfolk.com/ -- No-nonsense VPS hosting
Encrypted mail welcome - keyid 0x604DE5DB
