When I worked at Logica I was told about a new project that I would be starting. The project was a Geographical Information System (GIS) for Anglian Water. My part was to be on the team that would write the software to convert the (geographical) data from the old system to the new one. As it was described to me I thought "that's going to take days to run, maybe even a whole week". When it came to running it for real it took far more than a few days; in the end it took several months.
The original data was organised in 1km squares, and we had several thousand of them. I can't remember how long it took to do a square, but I think it was somewhere between several minutes and a couple of hours. At one point I suggested loading the whole database into memory, so we spent a large amount of money to get a massive 128MB. They did it while I was on holiday and said it ran slower, but I wasn't convinced it had been done properly. The conversion process was as automated as I could make it, but it still took most of my time, about three days a week, to "feed" the two machines doing the processing with the next squares and to handle the results. (I can't quite remember why I couldn't just queue them all up and let them run.)
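As a rough back-of-envelope, those vague memories do add up to months. The figures below are assumptions picked for illustration, not numbers from my notes:

```python
# Back-of-envelope estimate of the serial conversion time.
# All three figures are assumptions, chosen only to illustrate the scale.
squares = 3000              # "several thousand" 1km squares
minutes_per_square = 45     # somewhere between several minutes and a couple of hours
machines = 2                # the two machines doing the processing

total_minutes = squares * minutes_per_square / machines
print(f"~{total_minutes / 60 / 24:.0f} days of continuous processing")  # roughly 47 days
```

Add in the fact that the machines weren't fed around the clock and it is easy to see how that stretched to several months.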
If only we had had the massive distributed computing power that is available these days, we could have farmed the processing out to some servers in the cloud (for a price, of course) and perhaps got it done in a few days.
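Just as a sketch of what that farming-out might look like today: the snippet below fans the per-square work out across a pool of workers. `convert_square` and the grid-reference IDs are hypothetical stand-ins for the real conversion code, which is long gone.

```python
# A minimal sketch of farming the per-square conversion out to many workers.
# convert_square() and the square IDs are hypothetical placeholders.
from concurrent.futures import ProcessPoolExecutor

def convert_square(square_id: str) -> str:
    # Stand-in for the real per-square conversion, which took anywhere
    # from several minutes to a couple of hours on the hardware of the day.
    return f"{square_id}: converted"

def main() -> None:
    # Hypothetical list of 1km square identifiers.
    squares = [f"TL{e:02d}{n:02d}" for e in range(10) for n in range(10)]
    # With N workers (local cores or cloud instances), wall-clock time is
    # roughly the serial time divided by N: days rather than months.
    with ProcessPoolExecutor(max_workers=8) as pool:
        for result in pool.map(convert_square, squares):
            print(result)

if __name__ == "__main__":
    main()
```

The same pattern scales from one multi-core machine to a fleet of cloud servers; only the executor changes, not the per-square code.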