Amazon SimpleDB: a database server for the internet
Disorganized for a chaotic world
Amazon has announced SimpleDB, the latest addition to what is becoming an extensive suite of web services aimed at developers. It is now in beta.
Perhaps the main reason to use it is scalability. If demand spikes, Amazon handles the load. Second, SimpleDB is universally accessible, whereas your MySQL may well be configured for local access on the web server only. If you want an online database to use from a desktop application, this could be suitable. It should work well with Adobe AIR once someone writes an ActionScript library. That said, MySQL and the like work fine for most web applications, my blog being one example. SimpleDB meets different needs.
This is utility computing, and prices look relatively modest to me, though you pay for three separate things:
- Machine utilization: $0.14 per Amazon SimpleDB Machine Hour consumed.
- Data transfer: $0.10 per GB for all data transfer in; from $0.18 per GB for data transfer out.
- Structured data storage: $1.50 per GB-month.
In other words, a processing time fee, a data transfer fee, and a data storage fee. That's reasonable, since each of these incurs a cost. The great thing about Amazon's services is that there are no minimum costs or standing fees. I get billed pennies for my own usage of Amazon S3, which is for online backup.
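To get a feel for how the three fees add up, here is a rough sketch in Python using the beta prices quoted above. The function name and the sample usage figures are my own inventions, and the outbound transfer rate is treated as a flat $0.18 per GB even though Amazon quotes it as "from $0.18", so treat the result as an approximation only.

```python
# Beta prices as quoted in the article (US dollars).
MACHINE_HOUR_USD = 0.14
TRANSFER_IN_USD_PER_GB = 0.10
TRANSFER_OUT_USD_PER_GB = 0.18   # assumption: flat rate; the article says "from $0.18"
STORAGE_USD_PER_GB_MONTH = 1.50


def estimate_monthly_cost(machine_hours, gb_in, gb_out, gb_stored):
    """Sum the three separately billed items (hypothetical helper)."""
    machine = machine_hours * MACHINE_HOUR_USD
    transfer = gb_in * TRANSFER_IN_USD_PER_GB + gb_out * TRANSFER_OUT_USD_PER_GB
    storage = gb_stored * STORAGE_USD_PER_GB_MONTH
    return machine + transfer + storage


# Made-up usage: 10 machine hours, 1 GB in, 2 GB out, 0.5 GB stored.
cost = estimate_monthly_cost(10, 1, 2, 0.5)
print(f"${cost:.2f}")  # prints "$2.61"
```

Since there is no minimum charge, a month with no usage and nothing stored costs nothing under this model.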
Unlike MySQL, Oracle, DB2 or SQL Server, SimpleDB is not a relational database server. It is based on the concept of items and attributes. Two things distinguish it from most relational database managers:
- Attributes can have more than one value.
- Each item can have different attributes.
While this may sound disorganized, it actually maps well to the real world. One of the use cases Amazon seems to have in mind is stock for an online store. Maybe every item has a price and a quantity. Garments have a "size" attribute, but CDs do not. The "category" attribute could have multiple values, for example "clothing" and "gifts".
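The item-and-attribute model can be pictured with a small in-memory stand-in. This is a toy sketch, not Amazon's API: the `Domain` class, its method names, and the item names are all hypothetical, chosen only to show multi-valued attributes and items with differing attribute sets.

```python
class Domain:
    """Toy in-memory model of a domain of items (hypothetical, not the real service)."""

    def __init__(self):
        # item name -> attribute name -> set of string values
        self._items = {}

    def put_attributes(self, item_name, pairs):
        """Add (attribute, value) pairs; repeating an attribute name adds another value."""
        attributes = self._items.setdefault(item_name, {})
        for name, value in pairs:
            attributes.setdefault(name, set()).add(str(value))

    def get_attributes(self, item_name):
        """Return a dict of attribute name -> sorted list of values."""
        attributes = self._items.get(item_name, {})
        return {name: sorted(values) for name, values in attributes.items()}


store = Domain()
store.put_attributes("shirt-001", [
    ("price", "19.99"), ("quantity", "12"), ("size", "M"),
    ("category", "clothing"), ("category", "gifts"),   # one attribute, two values
])
store.put_attributes("cd-042", [
    ("price", "9.99"), ("quantity", "3"), ("category", "music"),  # no "size" at all
])

print(store.get_attributes("shirt-001")["category"])
print("size" in store.get_attributes("cd-042"))
```

No schema is declared anywhere; the shape of each item is simply whatever was put into it.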
You can do such things relationally, but it requires multiple tables. Some relational database managers do support multiple values for a field (FileMaker, for example), but the feature is not SQL-friendly.
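For comparison, here is one way the same store might look relationally, sketched with Python's built-in sqlite3 module. The table and column names are assumptions of mine; the point is only that the optional "size" and the multi-valued "category" each end up in a table of their own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        price TEXT NOT NULL,
        quantity INTEGER NOT NULL
    );
    -- Only garments have a size, so it lives in a side table.
    CREATE TABLE garment (
        item_id INTEGER PRIMARY KEY REFERENCES item(id),
        size TEXT NOT NULL
    );
    -- One row per (item, category) pair models the multi-valued attribute.
    CREATE TABLE item_category (
        item_id INTEGER NOT NULL REFERENCES item(id),
        category TEXT NOT NULL,
        PRIMARY KEY (item_id, category)
    );
    INSERT INTO item VALUES (1, 'shirt', '19.99', 12), (2, 'cd', '9.99', 3);
    INSERT INTO garment VALUES (1, 'M');
    INSERT INTO item_category VALUES (1, 'clothing'), (1, 'gifts'), (2, 'music');
""")


def categories_for(item_name):
    """Fetch every category of an item; this needs a join across two tables."""
    rows = conn.execute(
        "SELECT c.category FROM item_category AS c "
        "JOIN item AS i ON i.id = c.item_id "
        "WHERE i.name = ? ORDER BY c.category",
        (item_name,),
    ).fetchall()
    return [row[0] for row in rows]


print(categories_for("shirt"))
```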
This kind of semi-structured database is user-friendly for developers. You don't have to plan a schema in advance. Just start adding items.
A disadvantage is that it is inherently undisciplined. There is nothing to stop you having an attribute called "color", another called "hue", and another called "shade", but it will probably complicate your queries later if you do.
All SimpleDB attribute values are strings. That highlights another disadvantage of SimpleDB - no server-side validation. If a glitch in your system gives an item a price of "red", SimpleDB will happily store the value.
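Since the server will not check values for you, any validation has to happen in your own code before the write. A minimal sketch of that idea, assuming prices arrive as strings; `validate_price` is a hypothetical helper of mine, not part of any SimpleDB library.

```python
from decimal import Decimal, InvalidOperation


def validate_price(raw):
    """Return the price as a string if it is a non-negative number, else raise ValueError."""
    try:
        price = Decimal(raw)
    except InvalidOperation:
        raise ValueError(f"not a number: {raw!r}")
    # Reject NaN and infinities first, then negative amounts.
    if not price.is_finite() or price < 0:
        raise ValueError(f"not a valid price: {raw!r}")
    return str(price)


print(validate_price("19.99"))

try:
    validate_price("red")
except ValueError as error:
    print("rejected:", error)
```

Only a value that passes the check would then be handed to whatever call stores the attribute.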
Not transactional or consistent
SimpleDB has a feature called "eventual consistency". It is described thus:
Amazon SimpleDB keeps multiple copies of each domain. When data is written or updated (using PutAttributes, DeleteAttributes, CreateDomain or DeleteDomain) and Success is returned, all copies of the data are updated. However, it takes time for the update to propagate to all storage locations. The data will eventually be consistent, but an immediate read might not show the change.
Right, so if you have one item in stock you might sell it twice to two different customers (though the docs say consistency is usually achieved in seconds). There is also no concept of transactions as far as I can see. A transaction is what you use when you want a sequence of actions to succeed or fail as a block. Well, it is called SimpleDB.
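The double-sale problem is easy to reproduce with a toy model. The sketch below is purely illustrative: the `EventuallyConsistentStore` class, its two replicas, and the explicit `propagate()` step are my own simplification of the behaviour the documentation describes, not how Amazon implements it.

```python
class EventuallyConsistentStore:
    """Toy store: writes land on replica 0 and reach the others only on propagate()."""

    def __init__(self, replicas=2):
        self.replicas = [{} for _ in range(replicas)]
        self.pending = []

    def write(self, key, value):
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def read(self, key, replica):
        return self.replicas[replica].get(key)

    def propagate(self):
        """Copy all pending writes, in order, to the remaining replicas."""
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()


def try_to_sell(store, item, replica):
    """Read the stock from one replica, then decrement it if any appears to be left."""
    stock = int(store.read(item, replica))
    if stock < 1:
        return False
    store.write(item, str(stock - 1))
    return True


store = EventuallyConsistentStore()
store.write("last-shirt", "1")
store.propagate()                                   # both replicas now show one in stock

first_sale = try_to_sell(store, "last-shirt", 0)    # sees "1", writes "0"
second_sale = try_to_sell(store, "last-shirt", 1)   # stale replica still shows "1"
store.propagate()
third_sale = try_to_sell(store, "last-shirt", 1)    # now sees "0"

print(first_sale, second_sale, third_sale)  # prints "True True False"
```

With a transactional database, the read and the decrement would be wrapped in one transaction so that the second sale fails; here there is nothing to wrap them in.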
This doesn't make SimpleDB useless. It does limit the number of applications for which it is suitable. In most web applications, read operations are more common than write operations. SimpleDB is fine for reading. Just don't expect your online bank to be adopting SimpleDB any time soon.
This article originally appeared in ITWriting.
Copyright (c) 2007, ITWriting.
A freelance journalist since 1992, Tim Anderson specializes in programming and internet development topics. He has columns in Personal Computer World and IT Week, and also contributes regularly to The Register. He writes from time to time for other periodicals including Developer Network Journal Online, and Hardcopy.