Webhost in five-day server FAIL

HostV's virtual private server node goes very private

Updated: This story has been updated with additional facts from Cirtex CEO John Xie.

New York-based webhost HostV - a division of Cirtex - is five days into a server node outage that has left customer websites completely inaccessible.

London-based Register reader Alan Ayoub says the outage has brought down 10 of his sites, and many others are complaining of downed sites in the HostV forums.

Ayoub's sites have been inaccessible since Thursday. "People's business and livelihoods are going down the toilet," he says.

Cirtex CEO John Xie tells us that the outage has affected thirty to forty customers. Cirtex's HostV division offers virtual private server (VPS) as well as dedicated server hosting. On February 2, with a Twitter post to a feed that provides server status updates, the company indicated that its VPS infrastructure was under attack. "We are experiencing some serious issues," the post read. "It seems like some kind of attack on our servers. Several nodes are down at the moment."

Over the next 21 hours, regular Tweets alerted customers to failures and repairs of various server nodes. Then, at midday on Thursday, the feed went silent. According to the last three Tweets, one server node was still down, and it seems the failure was related to RAID problems rather than some sort of server attack.

"NODE-16 is still offline at the moment restoring from backup, we apologize for the inconvenience and are doing our best to restore service," the feed said, before going silent.

A day later, in a post to the HostV forums, a company representative said Node 16 was still down but that the staff was in the process of restoring data.

"The node16 is under maintenance due to hardware issues," the post read. "We are currently restoring data on node16. All Os ad [sic] related files have been restored. Now we are restoring vps's data. We hope everything will be setup and fine very soon. We appreciate your patience in this. Currently 28% has been restored."

But four days later, the node is still down.

The company's last public post came this Tuesday morning. "As for an ETA on the restore, we don't have one," a company representative said. "However, it does look like its going to take a fair amount of time. More than 24 hours. Possibly more than 48 hours."

In this post, the company offered to set up affected customers on another node in the meantime, but this would not provide access to data. "That would be pretty much useless," Ayoub tells The Reg. "We still won't have our websites."

Ayoub - like other customers - complains that the company has been far too slow to provide updates on the situation, with as much as 24 hours passing between notices. And he's worried the data restore won't be successful. ®

Update

Cirtex CEO John Xie says that the node in question had a hardware RAID failure after it was rebooted for a server patch. "It caused so much corruption from the single RAID Card failure that we had to restore from backup," he says.

But the company is also having problems with its backup system, from R1Soft. "The restore process is still going on, and the main issue is the speed of restore from encrypted and protected files," he says. "We're closely working with R1soft developers...to bring this one particular node back online," he continues.

"There are no excuses for this but we have already offered a refund and migration for all clients on this server after this restore has been completed."

Update 2

Xie adds that HostV's data on the node in question "has some corruption in the backup that is causing glitches for baremetal restore, so we're pursuing manual restore through R1Soft."
