Instead of using the image's bit depth, use the bit depth of the main device. This should speed up drawing, because the pixels would no longer need to be up- or downsampled when they are blitted to the device buffer.
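
A rough sketch of the idea, in portable C so it does not depend on any Carbon or Quartz headers. `BufferLayout` and `layout_for_device_depth` are hypothetical names, and the 16-byte row padding and the restriction to 16 and 32 bits per pixel are assumptions made only for illustration. The point is that the offscreen buffer takes its pixel size from the device rather than from the image, so the final blit can be a plain copy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of an offscreen buffer; not a Quartz type. */
typedef struct {
    size_t bits_per_pixel;      /* taken from the device, not the image */
    size_t bits_per_component;  /* 5 for 16 bpp, 8 for 32 bpp */
    size_t bytes_per_row;       /* row size including padding */
} BufferLayout;

/* Choose a buffer layout that matches the device depth.
   Assumption: only 16 and 32 bpp devices are handled.
   Assumption: rows are padded to a multiple of 16 bytes. */
BufferLayout layout_for_device_depth(size_t device_bpp, size_t width)
{
    BufferLayout layout;
    size_t raw_row_bytes;

    assert(device_bpp == 16 || device_bpp == 32);

    layout.bits_per_pixel = device_bpp;
    layout.bits_per_component = (device_bpp == 16) ? 5 : 8;

    /* Unpadded row size, then round up to the next multiple of 16. */
    raw_row_bytes = width * (device_bpp / 8);
    layout.bytes_per_row = (raw_row_bytes + 15) & ~(size_t)15;

    return layout;
}
```

In real code the depth would come from the main device and the resulting values would be used when creating the bitmap context; how exactly that is wired up is left open here.
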
Flip the context before rendering the image. Control classes could then call CGContextDrawImage instead of HIViewDrawCGImage, which would hopefully speed things up a little.
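
The flip itself is a small affine transform. In Quartz it is usually written as CGContextTranslateCTM(ctx, 0, height) followed by CGContextScaleCTM(ctx, 1.0, -1.0) before the call to CGContextDrawImage. The sketch below only models the arithmetic in portable C so the mapping can be checked without the framework. `Affine`, `Point2D`, `flip_vertical`, and `affine_apply` are hypothetical names, not part of any Apple API.

```c
#include <assert.h>

/* Hypothetical 2D affine transform with the usual six coefficients:
   x' = a*x + c*y + tx
   y' = b*x + d*y + ty */
typedef struct { double a, b, c, d, tx, ty; } Affine;

typedef struct { double x, y; } Point2D;

/* Transform that flips the y axis within a drawing area of the given
   height: translate by (0, height), then scale by (1, -1). */
Affine flip_vertical(double height)
{
    Affine t = { 1.0, 0.0, 0.0, -1.0, 0.0, height };
    assert(height >= 0.0);
    return t;
}

/* Apply the transform to a point. */
Point2D affine_apply(Affine t, Point2D p)
{
    Point2D q;
    q.x = t.a * p.x + t.c * p.y + t.tx;
    q.y = t.b * p.x + t.d * p.y + t.ty;
    return q;
}
```

With a height of 100, a point at y = 30 ends up at y = 70, and the top and bottom edges swap places. That is the mapping needed to keep an image upright when the view's context has a top-left origin and the drawing call expects a bottom-left one.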