Today was the first time I clicked one of those buttons on Amazon that let you look at a few pages of a book. It is some Ajax application, which of course first warned me, with a fancy schmancy Ajax popup (fading the window etc.), that they only support Firefox, IE etc., ignoring the fact that Epiphany's rendering engine is, well, the same as Firefox's. I actually had to click the OK button twice before it finally closed…

Anyway, what I want to share with you is a screenshot of some malfunction:

Amazon look-inside dysfunction

So apparently they scanned the page word by word, often clipping the lower part (notice the missing parts of the letters p, g and y in most words). Some words have gone missing altogether (before “life”). However, the page itself appears to be a JPEG image; you can see some compression artifacts around the words that are not my fault (you might need to zoom into the image).

So it seems like they disassemble the pages when scanning, then reassemble them to present them to the user, sometimes messing up the background color, like in this example… interesting. Is there much storage to be saved by merging identical word images and storing them only once? Are the pages reassembled on demand?
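
To make the "store each word image only once" idea a bit more concrete, here is a minimal Python sketch of how such a scheme could work. This is pure guesswork on my part, not Amazon's actual method: the `WordImageStore` class, the `store_page` and `reassemble` helpers, and the byte strings standing in for pixel data are all made up for illustration. It also assumes that repeated words produce byte-identical images, which real scans almost never do, so a real system would need some kind of approximate matching instead of an exact hash.

```python
import hashlib


class WordImageStore:
    """Hypothetical store that keeps each distinct word image only once."""

    def __init__(self):
        self.images = {}  # content hash -> image bytes

    def add(self, image_bytes):
        # Identical bytes give the same digest, so duplicates are stored once.
        digest = hashlib.sha256(image_bytes).hexdigest()
        self.images.setdefault(digest, image_bytes)
        return digest


def store_page(store, placed_words):
    """Turn a list of (x, y, image_bytes) into a layout of (x, y, digest)."""
    return [(x, y, store.add(img)) for x, y, img in placed_words]


def reassemble(store, layout):
    """Rebuild the list of (x, y, image_bytes) on demand from the layout."""
    return [(x, y, store.images[digest]) for x, y, digest in layout]


# Placeholder byte strings stand in for real pixel data (an assumption).
page = [
    (0, 0, b"<pixels of 'the'>"),
    (40, 0, b"<pixels of 'life'>"),
    (90, 0, b"<pixels of 'the'>"),
]

store = WordImageStore()
layout = store_page(store, page)

raw_bytes = sum(len(img) for _, _, img in page)
stored_bytes = sum(len(img) for img in store.images.values())
print(f"{len(page)} words on the page, {len(store.images)} images stored")
print(f"{raw_bytes} bytes raw, {stored_bytes} bytes after merging")
```

In this toy version the saving comes entirely from repeated words, and the per-page layout (positions plus hashes) is extra data that has to be stored too, so whether it pays off would depend on how often word images really repeat.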