Back to basics for SQL Server 2008
Project Watch: Microsoft 2008
When I asked: "How do we convert more than 12,000 location items - by hand?" we had almost completed the process as part of our move to Microsoft's upcoming SQL Server 2008. The question was, in fact, rhetorical. Nevertheless, we received a lot of advice and suggestions from Reg Dev readers. This, for example, from AlanGriffiths:
"Almost (sic) address validation software should get you most of the way, the postal address file the rest. (And there are plenty of companies that will do the job for you at a reasonable price....)"
This was a complex problem. The bulk of the original data was easy to convert by an automated process, so that is what we did. Post-coded data, within reason, is straightforward. The problem lay, as it usually does, with the exceptions. Location data included:
- Streatley Hill, Berkshire
- Clayhithe, Cambridgeshire
- Yalta, Crimea
- Causey Pike Gill, Cumberland
There was simply no list of these place names with their co-ordinates, so we ultimately solved the problem by throwing human intelligence at it. One of our intelligent human converters explains how she worked:
"Online gazetteers were very helpful, especially the Ordnance Survey one, as were those for Welsh and Scottish place names. Where places remained elusive, searching the web often provided clues in such varied places as a list of repairs to railway bridges, hill walkers' blogs and even an illustrated mythical story. With the location pinned down, the latitude and longitude could be read from MapPoint or Google Earth, both of which use the all-important WGS84 datum. At the time of writing, some still elude us - like Old Park Pool on Anglesey and Pleasby Wood in Nottinghamshire."
If anyone is familiar with either Old Park Pool or Pleasby Wood, please do let me know.
Happily this is a one-time conversion, and subsequent data is likely to be algorithmically convertible.
Readers also picked up last time on the fact that, during our move to SQL Server 2008, we were using text indicators (N, S, E, W) rather than signed decimals for the spatial data.
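For readers who have not met the distinction, the difference between the two representations can be sketched in a few lines of Python. This is an illustration only, not the code used in our conversion: the function name `to_signed` and the sample values are hypothetical. It assumes the usual convention that north and east are positive and south and west are negative.

```python
def to_signed(value: float, hemisphere: str) -> float:
    """Convert an unsigned coordinate plus an N/S/E/W indicator
    into a signed decimal (north and east positive, south and west negative).

    Hypothetical helper for illustration; not taken from the project.
    """
    hemi = hemisphere.strip().upper()
    if hemi not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere indicator: {hemisphere!r}")
    if value < 0:
        raise ValueError("value must be unsigned when an indicator is supplied")

    # Latitudes (N/S) run from 0 to 90, longitudes (E/W) from 0 to 180.
    limit = 90.0 if hemi in ("N", "S") else 180.0
    if value > limit:
        raise ValueError(f"{value} is out of range for indicator {hemi}")

    return -value if hemi in ("S", "W") else value


# Illustrative values only: a point at 51.5 N, 1.15 W
latitude = to_signed(51.5, "N")
longitude = to_signed(1.15, "W")
print(latitude, longitude)
```

The signed form puts the hemisphere and the magnitude in one numeric column, so a row cannot carry a value and an indicator that disagree.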