EnterpriseDB cranks up Postgres database scalability
Ditch Ellison for the elephant
SaaS data loss: The problem you didn’t know you had
EnterpriseDB, the commercial entity that is helping to push development of the open source PostgreSQL relational database – and profit from it – is trying to keep the heat on database rival Oracle with its latest 9.2 release and a matching 3.0 update of its database management console.
The PostgreSQL project put out its beta of the raw 9.2 code back in May, and EnterpriseDB grabbed the code and ran it through its kiln to harden it for a beta release as Postgres Plus Advanced Server 9.2 in early October.
The biggest change is that the database now scales a lot further. PostgreSQL 9.1 started thrashing once a machine had more than 24 processor cores and was processing somewhere around 75,000 queries per second in which writes were part of the queries. But with the 9.2 update of the raw code, PostgreSQL can now span as many as 64 cores in a single system image and drive more than 350,000 queries per second.
This performance boost is not just enabled by spanning more cores, but by speeding up both read and write times in the database on top of letting the database span more iron.
This is a big jump in performance, and one that PostgreSQL users have been clamoring for. What it means is that PostgreSQL – and therefore the commercially supported variant from EnterpriseDB that is now available – can scale to all but the largest x86 and Itanium servers and actually make use of the iron.
What we don't know is how this performance stacks up against the open source MySQL alternative as well as the closed source Oracle 11g and IBM DB2 databases.
The database also supports cascading replication. In normal replication, if you have a database server and you want to replicate to three more machines, you have to do them all in a series. But with cascading replication, you can pass data asynchronously over an IP link to one remote server, and then have that server replicate to the two others.
As is the case with any database update, Postgres Plus Advanced Server 9.2 has a lot of nips and tucks to boost performance that only database propellerheads understand.
In-memory sorts run 20 per cent faster than with the 9.1 release. And with Postgres 9.2, you can now do index-only scans, like other databases have been able to do for years. If a portion of a table is cached in memory and you need that data, you used to have to hit the index and then the full table, but now you can hit the index and whip right into memory for that bit of the table, thus speeding up access to that data by a factor of ten.
The database also has support for JSON data types (for hybrid document-relational databases) and support for range data types. Both of these are clever, and you can read all about it in this presentation (PDF).
The open source PostgreSQL database has binaries for Linux, OS X, and Windows, and EnterpriseDB takes this code and certifies it to also run on HP-UX on Itanium servers as well as on x86 iron running Oracle's Solaris Unix variant.
The database also has skinning software that allows it to look and smell like an Oracle 1og or 11g database to applications; IBM has licensed this technology and embedded it in its own DB2 database so it can offer an Oracle compatibility mode.
Postgres Enterprise Manager is revved to a 3.0 release concurrent with the release of the updated database, and it includes a new web-based client that allows it to run from any browser on any device so long as it has an internet connection. The PEM 3.0 tool also supports SSH tunneling so it's easier to access databases behind the corporate firewall. For admins, not hackers. (Well, that's the theory, right?)
A support contract for Postgres Plus Advanced Server 9.2 with 24x7 coverage costs $4,945 per system per year, and a subscription to the PEM 3.0 add-on normally costs $1,500 for the first two monitored database servers and $500 per database server after that. But through the end of December, EnterpriseDB will sell you PEM 3.0 at 50 per cent off.
EnterpriseDB says that it has over 5,800 customers using PEM, and that in the past year there have been over 5 million downloads of Postgres databases and various add-ons. ®
COMMENTS
Re: 350,000 queries per second until...
You might very well be right. And there is nothing stopping you from proposing a fix and submitting a patch. Try that with Oracle.
Re: 350,000 queries per second until...
Most of your gripes sound like they're down to bad database design / configuration / management to me. Postgres doesn't have hints, it's been a long running argument and the general consensus is that they encourage bad habits and scale worse than the planner alone in all but very exceptional cases. There has been some recent discussion about implementing hints and having an extra guc parameter to obey or ignore them, I'm not sure where that's going.
Your raid issues? Well that’s a hardware problem, but you can easily reconfigure cost variables to smooth things out. Index issues? Again sounds like bad design to me.
Re: 350,000 queries per second until...
> PostgreSQL relies heavily on statistics to determine the order in which conditions should be applied to queries
like all modern DBs I believe. Stats-based optimisers were described as the 2nd biggest step forward in database technology (can't remember what the 1st was. Poss dynamic programming based on the stats).
> On the other hand, there's that rare case where it goes horribly wrong
like all modern DBs, no question about it.
You check your stuff is well designed (as others here have clearly said), and when the inevitable happens, you deal with it. A good start is to simplify the query. This may involve revisiting the DB design and it may take breaking a query down into 2 with a materialised table in between (not optimal). Or you could look here if it helps <http://www.postgresql.org/docs/current/static/runtime-config-query.html>.
It's can be very hard to debug a large nonperforming query so your chance of fixing one even if you could specify the query plan is small. On a really complex query you sometimes can't tell it's significantly suboptimal at all without serious time consuming digging.
Back in the article...
> now you can hit the index and whip right into memory for that bit of the table, thus speeding up access to that data by a factor of ten.
That is a factor plucked out of thin air. Could be X1 or X1000 depending on the fatness of the index and whether it fits totally into ram.

IT infrastructure monitoring strategies
Agentless Backup is Not a Myth
Top 10 SIEM implementer’s checklist
Steps to Take Before Choosing a Business Continuity Partner
Enabling efficient data center monitoring