Optimizations:
1) Keep a rectangle updated per-screen rather than regenerate each time
2) Strip palette info when putting pixels into rectangles rather than
   during scaling
3) Tighten up the screen locks a bit
4) Don't require a full resend of both screens on an update request
5) Only force a redraw for cursor movement when the cursor is visible
   (and force it whenever the cursor changes)
6) Avoid doubles in interpolation
7) Heavily optimize interpolate_height(). interpolate_width() likely
   doesn't need it because it's generally not used, and it also reads
   from the next pixel in memory, making the prefetcher's job easier.
8) Fix some memory-leak-on-error issues
9) For ARGB8 XImages, manipulate the data directly rather than through
   XPutPixel()

At this point, the scaling and X11 output time is heavily dominated by
cache misses. The only really effective way to reduce this hit is to
spread the work across all the L3 caches in the system or move it into
the GPU. With the latest updates, at the S...