So I have a new toy, an i7-4950hq processor. This little beast is one of the special Intel chips sporting an Iris Pro 5200, better known as Haswell GT3e. That GPU has 40 execution units and 128MiB of eDRAM to serve as a fourth-level cache for both the CPU and GPU.
Enough spiel, just how fast is it?
For context, here are some results comparing it with my old Sandybridge laptop (with an i5-2520m).
Comparing the processor using the single-threaded cairo-image:
and again comparing the GPUs, using SNA and cairo-xlib:
On the whole, we see a two-fold increase in both single-threaded CPU performance and GPU performance (for 2D graphics using cairo) in the jump from a Sandybridge i5-2520m to a Haswell i7-4950hq. In most cases SNA is limited by how fast the application can feed it commands, and so the performance increase is mostly due to the same improvement in CPU speed. (This increase is above and beyond the expected improvements due to IPC, so it is more likely down to the ability of the Haswell chip to turbo higher and longer thanks to improved thermals and cooling.)
And we can compare the relative merits of using OpenGL and a specialised 2D driver by comparing the various rendering backends available for the DDX. The results are normalized to the cairo-image results, and we have:
- none – a multithreaded CPU renderer inside the DDX
- blt – disable the render acceleration, but allow the DDX to use the BLT engine to move data about, i.e. copies and fills
- sna – SNA render acceleration, default in xf86-video-intel-3.0
- uxa – UXA render acceleration, current default
- glamor – Glamor render acceleration, uses OpenGL to offload rendering operations onto the GPU
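To make "normalized to the cairo-image results" concrete, here is a minimal sketch of one way such a normalization could be computed. It assumes each backend has a table of per-trace elapsed times, and it summarises with a geometric mean, which is a common choice for averaging ratios; whether the charts here were built exactly this way is an assumption. The function names and the trace names in the usage note are hypothetical.

```python
from math import exp, log


def relative_speedup(baseline_times, backend_times):
    """Per-trace speedup of a backend relative to a baseline.

    Both arguments map a trace name to elapsed seconds. A result above
    1.0 means the backend finished that trace faster than the baseline
    (here the baseline would be cairo-image).
    """
    # Only compare traces that were run on both backends.
    common = baseline_times.keys() & backend_times.keys()
    return {
        name: baseline_times[name] / backend_times[name]
        for name in sorted(common)
    }


def geometric_mean(values):
    """Geometric mean of positive ratios (assumed summary statistic)."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    return exp(sum(log(v) for v in values) / len(values))
```

For example, with placeholder timings (illustrative only, not measurements from these machines) `image = {"trace-a": 2.0, "trace-b": 8.0}` and `backend = {"trace-a": 1.0, "trace-b": 16.0}`, `relative_speedup(image, backend)` gives one ratio per trace, and `geometric_mean` of those ratios gives a single figure in which a 2x win on one trace and a 2x loss on another cancel out.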
The summary here is that Glamor offers a meagre improvement over UXA. However, both are still much slower on average than cairo-image, i.e. the performance attainable by using a single CPU core. It takes multiple threads inside the DDX to match the performance of cairo-image – this is due to the inherent inefficiencies of the current Render protocol. However, if we then utilize the render acceleration on the GPU (using SNA) we can indeed outperform cairo-image: on average SNA is about 2x faster than cairo-image and about 4x faster than UXA and Glamor. Thus SNA does deliver hardware acceleration that succeeds in offloading work onto the GPU (letting the CPU get on with other tasks) and performs faster than rendering everything with the CPU.