Photobox ditches Amazon's Redshift, cuddles up to Snowflake
AWS and Google data warehousing stuff considered, then ignored
Online photo print and gift service Photobox is quitting Amazon's Redshift data warehouse to hitch its wagon to competing cloud-native systems from Snowflake.
The migration to bring its brands onto a single platform, with the aim of boosting analytics, was completed prior to Christmas.
The decision comes after Photobox switched photo editing and shipping management onto Snowflake's platform, and survived the seasonal peak in demand, which can push site activity to as much as four times its summer monthly level.
Chris Astall, Photobox veep of architecture and data, told us: "We had quite a lot of technical debt, so in the first quarter of last year we started to look at how to build a new platform that can ingest data in real-time, process data much faster and provide that data back to users in a much more consumable way."
The company's systems have long been hosted on AWS, but its data warehouse and data management functions were spread across different systems.
In 2015, Photobox Group was bought by Exponent Private Equity for £400m and since then it has been working to bring its brands – including Hofmann, posterXXL and Greetz – onto one ecommerce and photo-editing platform hosted on AWS.
Moonpig.com, also owned by Photobox, is remaining on a separate ecommerce and data platform.
Photobox said it flirted with the idea of migrating data to Google's BigQuery, but the idea of moving huge volumes between Google Cloud Platform and AWS was considered a non-starter.
"We decided against it before a detailed analysis of BigQuery," Astall said. "We did look at a number of Amazon technologies, but they were not really at the same sort of level as Snowflake. After a relatively short proof of concept, we decided to move forward with that as one of the key pillars of our platform."
The ability to decouple compute and storage – vital if data managers want to optimise performance and control costs in the cloud – was part of Snowflake's appeal, he said. But rival systems offer the same feature, and it was not the only reason for going with Snowflake.
"Being able to have multiple warehouses running from the same data set and being able to scale those based on business need, whether that is an analyst running some horrible big queries, or having users look at dashboards, or a piece of AI code running. They can all have separate warehouses. Being able to do that – and turn them off and not pay for them – was a massive differentiator for us," Astall said.
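For readers who want to picture the setup Astall describes, here is a minimal sketch of one warehouse per workload, each set to suspend when idle so it stops accruing compute charges. The warehouse names, sizes and 60-second timeout are hypothetical, not Photobox's configuration; the Python only builds standard Snowflake `CREATE WAREHOUSE` statements and does not connect to anything.

```python
# Hypothetical workloads and sizes; none of these names come from Photobox.
WORKLOADS = {
    "ANALYST_WH": "LARGE",     # heavy ad hoc queries
    "DASHBOARD_WH": "SMALL",   # users looking at dashboards
    "ML_WH": "MEDIUM",         # AI/ML jobs
}


def create_warehouse_sql(name: str, size: str, auto_suspend_s: int = 60) -> str:
    """Build a Snowflake CREATE WAREHOUSE statement that suspends when idle."""
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} "
        f"WITH WAREHOUSE_SIZE = '{size}' "
        f"AUTO_SUSPEND = {auto_suspend_s} "  # seconds of inactivity before suspending
        "AUTO_RESUME = TRUE "                # wake up when the next query arrives
        "INITIALLY_SUSPENDED = TRUE"         # do not start billing at creation
    )


statements = [create_warehouse_sql(n, s) for n, s in WORKLOADS.items()]
for stmt in statements:
    print(stmt + ";")
```

All three warehouses would read the same stored data, so resizing or suspending one does not affect the others.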
The technology stack uses AWS S3 for real-time data ingestion, Snowflake for data processing, Airflow for data pipelines and dbt for data transformation. Data is consumed through the dashboard tool Looker.
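As a rough illustration of how those pieces could fit together, the sketch below models the stack as a dependency graph and derives a run order. The stage names and their ordering are assumptions made for illustration; the article does not describe Photobox's actual pipeline, and in practice Airflow would do the scheduling rather than Python's standard-library `graphlib` used here.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on (assumed ordering).
PIPELINE = {
    "s3_ingest": set(),                      # events land in S3
    "snowflake_load": {"s3_ingest"},         # data is loaded into Snowflake
    "dbt_transform": {"snowflake_load"},     # dbt builds reporting models
    "looker_dashboards": {"dbt_transform"},  # Looker reads the models
}


def run_order(pipeline: dict[str, set[str]]) -> list[str]:
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(pipeline).static_order())


order = run_order(PIPELINE)
print(" -> ".join(order))
```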
But Andy Ruckley, Photobox director of data, BI and analytics, said the organisation was looking for more than technical benefits.
"There is also a business change going along with this. We didn't want a pool of report analysts waiting for requests to come back. I want to get everyone in the business to have access to the information they need to do their job better."
Photobox said Snowflake's data warehouse ingested 1,200 real-time events per second into the platform. The next stage is to switch core reporting for business and marketing to the system.
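To put the ingestion figure in perspective, a back-of-the-envelope calculation follows. It assumes the 1,200 events per second is a sustained rate over a full day, which the company did not specify.

```python
EVENTS_PER_SECOND = 1_200          # figure quoted by Photobox
SECONDS_PER_DAY = 24 * 60 * 60     # 86,400

# Assumes the quoted rate holds around the clock.
events_per_day = EVENTS_PER_SECOND * SECONDS_PER_DAY
print(f"{events_per_day:,} events per day")
```

On that assumption, the platform would take in a little over 100 million events a day.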
Ruckley told us the resident techies resisted the temptation to "lift and shift" their old architecture to a new platform.
"Building this new capability in Snowflake allows us to correct some of the things that we may have done wrong or improve where we cut corners. Maybe some of the data architecture was OK 10 years ago, but it is definitely not OK now." ®