Tuesday, January 23, 2007

Pyramidal tiling data organization

So we do not want to load the complete gigapixel image in memory. What can we do about it ?

The first idea is to use a scheme that is called tiling. The image is internally organized as an array of rows and columns. This organization makes possible to retrieve a part of the image without loading the complete image. Requesting a part of the image is now requesting only the tiles that are intersecting this part.

Imagine you have a 10000x10000 pixels images, divided in 256x256 pixels tiles. If you want to display only the top-left part on your 1280x1024 screen, then you only need to load 5x4 = 20 tiles, that is 196 608 bytes x 20 = 3 932 160 bytes, instead of 300 000 000 bytes to load the entire image.


This has three immediate consequences :

  1. Image loading is very fast, because you only load what is visible on the screen
  2. Image panning can also be made very fast, because as you pan around, tiles not visible can be discarded from memory and visible ones are loaded (this is called on-demand loading)
  3. Memory requirement is almost constant is about the memory needed for one visible screen of data, regardless of image size, so that you can now load your image on any PC, even PCs with as less as 128 megabytes of memory

Now that we are able to freely pan the image, we may want to be able to zoom out, up to the point where the entire image is visible. Tiling will not help here, because in order to display the complete image, we need to access all pixels, that means loading the whole image to compute a reduced version of this image. This is where pyramidal comes in. The idea is to generate images that are reduced version of the complete one, each beeing 2 times smaller than the previous one. (Un)Zooming is now only a matter of switching between these resolution. Of course these subimages are themselves tiles to allow arbitray access.

Let’s take a small example with an original image that is 10000x8000 pixels. We would generate a pyramid of images:

  • 10000x8000 (level 0)
  • 5000x4000 (level 1)
  • 2500x2000 (level 2)
  • 1250x1000 (level 3)

We can see that at level 3, the complete image fits into our 1280x1024 screen. If we are at level 3 and want to zoom in, then we switch to level 2, and so on.

We now have an organization of data that allow us to zoom and pan freely in our large image, with memory requirements limited to our physical screen size !

Of course you may say this comes at a cost in storage, because we have to store all these additional levels. What cost exactly ?

If the image size at level 0 is 1 unit, then level 1 is 1/4 this unit (0.25), level 2 is 1/16 and so on.

This is a very well known mathematical suite whose limit is 1.33333333.., so the overhead is not very large given the benefits.

This pyramidal tiling is used in at least two very widely known applications :

  1. Google Earth
  2. Google Maps

No comments: