The best and worst of GitHub: Repos wiped without notice, quickly restored – but why?
That feel when the 'beating heart' of your project returns a 404
Updated
Game designer Jason Rohrer has had a bad week, discovering that his 23 code repositories, representing 15 years of development and community contributions, had been wiped from GitHub.
"I can't believe how easy it apparently is to have someone's life work taken down from Github," he said on his forum, fortunately hosted elsewhere. "And GitHub never emailed me about this. Not a whisper. Just poof, account gone. When I logged in, I could see my account, and see that it had been 'flagged', with no other explanation given."
Rohrer turned to the last and sometimes sole refuge of tech giant victims – social media. "If you're thinking about using @github for your life's work, FYI, they may remove it without any warning or notice," he protested. 373 retweets and a few hours later, and everything was restored.
GitHub CEO Nat Friedman noted that support restored the account even before he noticed the incident on Twitter, and apologised personally to Rohrer.
"Thank you!" Rohrer tweeted on June 5, 2019.
The incident raises questions for developers about the safety of their GitHub code. It is not quite as bad as it sounds, since Git is a distributed system and every developer who has cloned a repository holds a full copy of it, history included, on their own machine. But it is still pretty bad. "GitHub is the beating heart of my operation. I'm running 30 linodes as game servers, download servers and other types of servers, and they all update automatically using git pull," said Rohrer.
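One way to soften that single point of failure is to have each server try a second remote when the first is unreachable. The following is only a sketch of the idea, not a description of Rohrer's setup: the remote names `origin` and `backup`, the branch name and the injectable `run` parameter are all assumptions made for illustration.

```python
import subprocess
from typing import Callable, List, Optional, Sequence


def _run_git(cmd: List[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(cmd).returncode


def pull_with_fallback(
    remotes: Sequence[str],
    branch: str = "master",
    run: Callable[[List[str]], int] = _run_git,
) -> Optional[str]:
    """Try a fast-forward-only pull from each remote in turn.

    Returns the name of the first remote whose pull succeeded,
    or None if every remote failed.
    """
    for remote in remotes:
        # --ff-only refuses to create a merge commit on an unattended server
        if run(["git", "pull", "--ff-only", remote, branch]) == 0:
            return remote
    return None


# Hypothetical usage, with "backup" being a mirror hosted somewhere else:
#   pull_with_fallback(["origin", "backup"])
```

The second remote only helps if something keeps it up to date, for example by pushing to both hosts on every release.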
Another problem is that discussions and bug tracking on GitHub are not part of the Git repository and are not downloaded locally, so real work can be lost in this type of incident.
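Issues can at least be copied out through GitHub's REST API. As a rough sketch, the function below pages through the issues endpoint for one repository and collects the results. The owner and repository names in the usage comment are placeholders, the `fetch` parameter exists only so the paging logic can be exercised without network access, and unauthenticated requests are subject to rate limits.

```python
import json
import urllib.request
from typing import Callable, List

ISSUES_URL = (
    "https://api.github.com/repos/{owner}/{repo}/issues"
    "?state=all&per_page=100&page={page}"
)


def fetch_json(url: str) -> list:
    """GET a URL and decode the JSON body."""
    req = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def export_issues(
    owner: str, repo: str, fetch: Callable[[str], list] = fetch_json
) -> List[dict]:
    """Collect every issue, open and closed, one page at a time."""
    issues: List[dict] = []
    page = 1
    while True:
        batch = fetch(ISSUES_URL.format(owner=owner, repo=repo, page=page))
        if not batch:  # an empty page means there is nothing left
            break
        issues.extend(batch)
        page += 1
    return issues


# Hypothetical usage; "example-owner" and "example-repo" are placeholders:
#   with open("issues.json", "w") as f:
#       json.dump(export_issues("example-owner", "example-repo"), f, indent=2)
```

This endpoint also returns pull requests, which carry a `pull_request` key, and it does not include comment threads; those live behind a separate comments endpoint and would need their own pass.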
Why did it happen? Friedman notes that GitHub is "investigating", and the company refused to give The Reg any comment on the matter. Rohrer thought at first that an ill-wisher had reported his account for something like inappropriate content. Then he speculated that by posting a series of issues with links to his forum, an algorithm had identified his repo as a source of spam.
Any free service on the internet (GitHub has both free and paid accounts) attracts plenty of malicious or undesirable registrations, and coping with these is a major headache. Such accounts are often created by bots, and human approval systems do not scale, which means that companies rely on algorithms and AI to automate account flagging and clean-up.
But you'd think that such algorithms would be able to distinguish between accounts that have been active for many years without incident and those created on the fly with obvious bad intent.
Account hijacking is another issue, but where this is suspected, deleting years of work without notice does not seem the right approach.
Just as well we have Twitter, eh?®
Updated to add 17:03, 7 June
A GitHub spokesperson told The Register: "This account was mistakenly flagged by our spam algorithm, and we restored it promptly after learning of the mistake. We work hard to make GitHub a safe and inclusive place to host developer content and are constantly working on improvements to our spam filtering."