The Register® — Biting the hand that feeds IT

Webhost in five day server FAIL

HostV's virtual private server node goes very private

Update: This story has been updated with additional facts from Cirtex CEO John Xie.

New York-based webhost HostV - a division of Cirtex - is five days into a server node outage that has left customer websites completely inaccessible.

London-based Register reader Alan Ayoub says the outage has brought down 10 of his sites, and many others are complaining of downed sites in the HostV forums.

Ayoub's sites have been inaccessible since Thursday. "People's business and livelihoods are going down the toilet," he says.

Cirtex CEO John Xie tells us that the outage has affected thirty to forty customers. Cirtex's HostV division offers virtual private server (VPS) as well as dedicated server hosting. On February 2, with a Twitter post to a feed that provides server status updates, the company indicated that its VPS infrastructure was under attack. "We are experiencing some serious issues," the post read. "It seems like some kind of attack on our servers. Several nodes are down at the moment."

Over the next 21 hours, regular Tweets alerted customers to failures and repairs of various server nodes. Then, at midday on Thursday, the feed went silent. According to the last three Tweets, one server node was still down, and it seems the failure was related to RAID problems rather than some sort of server attack.

"NODE-16 is still offline at the moment restoring from backup, we apologize for the inconvenience and are doing our best to restore service," the feed said, before going silent.

A day later, in a post to the HostV forums, a company representative said Node 16 was still down but that the staff was in the process of restoring data.

"The node16 is under maintenance due to hardware issues," the post read. "We are currently restoring data on node16. All Os ad [sic] related files have been restored. Now we are restoring vps's data. We hope everything will be setup and fine very soon. We appreciate your patience in this. Currently 28% has been restored."

But four days later, the node is still down.

The company's last public post came this Tuesday morning. "As for an ETA on the restore, we don't have one," a company representative said. "However, it does look like its going to take a fair amount of time. More than 24 hours. Possibly more than 48 hours."

In this post, the company offered to set up affected customers on another node in the meantime, but this would not provide access to data. "That would be pretty much useless," Ayoub tells The Reg. "We still won't have our websites."

Ayoub - like other customers - complains that the company has been far too slow to provide updates on the situation, with as much as 24 hours passing between notices. And he's worried the data restore won't be successful. ®

Update

Cirtex CEO John Xie says that the node in question had a hardware RAID failure after it was rebooted for a server patch. "It caused so much corruption from the single RAID Card failure that we had to restore from backup," he says.

But the company is also having problems with its backup system, from R1Soft. "The restore process is still going on, and the main issue is the speed of restore from encrypted and protected files," he continues. "We're closely working with R1Soft developers...to bring this one particular node back online."

"There are no excuses for this but we have already offered a refund and migration for all clients on this server after this restore has been completed."

Update 2

Xie adds that HostV's data on the node in question "has some corruption in the backup that is causing glitches for baremetal restore, so we're pursuing manual restore through R1Soft."

Latest Comments

Ah, backups

Most people don't start making backups (including local copies of remotely hosted content) until they've suffered catastrophic data loss.

Then they don't start testing the integrity of their backups until they've suffered further catastrophic data loss.
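
The kind of integrity test this comment has in mind can be sketched in a few lines of Python. This is only an illustration, assuming the backup is a plain file-for-file copy of a source directory; it says nothing about how R1Soft or HostV actually verify backups. The function names `sha256_of` and `verify_backup` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return relative paths that are missing from, or differ in, the backup.

    Assumption: the backup mirrors the source tree file for file.
    """
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        copy = backup_dir / rel
        # A file counts as bad if it is absent or its checksum differs.
        if not copy.is_file() or sha256_of(src) != sha256_of(copy):
            problems.append(rel.as_posix())
    return problems
```

An empty list means every source file has a byte-identical copy in the backup. Anything else is worth investigating before the backup is actually needed.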


RAID fail?

Maybe if they're doing business-class hosting, they should be using some clustering technologies underneath so that a single server doesn't cause these types of problems. N+1 or N+2 are pretty standard scenarios for business-class services.

If you're hosting in a virtualized environment, you should be using the technology correctly, including shared storage and high availability. A single server shouldn't take out your hosting environment.


webfusion are the worst

Webfusion recently migrated to a new datacentre in Nov - entire nodes of VPS (including my client sites) were offline for a week. Then there were still issues about it being in the wrong container (eg running Win2003 SP1 but being placed in a Win2003 set) as there were functionality issues on the machine.

This was raised for a support ticket (24/7 support my arse) and they "investigated" meaning they then knocked the server (completely inaccessible) offline for 27 days. Managed to get it online after a second migration and it's completely wiped. No configuration, no data, nothing.

As of yet, no apologies, no explanation and no compensation.

