This time comparing different backends using the ‘fast’ wip/stroke-to-path branch:

                              image    xlib     drm      gl
epiphany-20090810               0.0  -106.5   -19.8  -235.8
evolution-20090607              0.0  -100.3  -130.6  -440.6
evolution-20090618              0.0   -61.9   -69.4  -380.3
firefox-20090601                0.0  -103.5
firefox-periodic-table          0.0   -92.3    20.6  -228.6
firefox-talos-gfx-20090702      0.0     8.3   381.6  -207.1
firefox-world-map               0.0  -186.2    27.0   -51.1
gnome-terminal-20090601         0.0   -29.7   207.9  -287.3
gnome-terminal-20090728         0.0    60.0   406.7   -70.5
poppler-20090811                0.0   -70.2   132.4  -344.2
poppler-bug-12266               0.0  -101.6    59.0
swfdec-fill-rate                0.0  -112.9    36.3    30.1
swfdec-fill-rate-2xaa           0.0   -69.3   345.1    45.7
swfdec-fill-rate-4xaa           0.0  -244.4     0.6    -3.2
swfdec-giant-steps              0.0   -41.7   -66.6  -260.8
swfdec-youtube                  0.0    12.1   192.9     7.6

[image] 1.9.2-505-g2e9cad3.tiny
[xlib] 1.9.2-505-g2e9cad3.xlib.tiny
[drm] 1.9.2-525-g8c7de80.drm.tiny

As always there is more work to do.

  1. What’s the difference between drm and gl?

    • Ah, yes, the cryptic labels refer to the names given to the cairo backends. image is a cairo_image_surface_t and solely uses pixman (i.e. software) for its compositing. xlib attempts to convert the incoming operation into RENDER primitives, but may fall back to using pixman on a local image surface (thus incurring the cost of XGetImage/XPutImage) if the X server cannot handle the request. (On this machine, we have to fall back for the extended repeat modes, since we are waiting on a new release before we enable them, to be sure that the driver support is ready.)

      gl is the most recent attempt at an OpenGL surface – it’s still very rough around the edges and has not been tuned yet. (However, the implementation looks fairly clean, so frankly I think it is the Mesa stack that is the limiting factor here. cairo-gl is reasonably fast in micro-benchmarks, benefitting greatly from direct rendering, but that performance does not carry across into ‘real-world’ complexity.) I’ll benchmark cairo-glitz at some point, because that highlights the limitation of solely implementing the RENDER extension using OpenGL (i.e. cairo-gl is faster than cairo-glitz on the hardware I have).

      cairo-drm is my experimental backend that issues rendering commands directly to the i915 hardware. (Thanks to the beauty of the GEM interface, the kernel handles all memory management; all I have to do is pack batch buffers with rendering commands.) The purpose of such a custom backend is to explore the performance limitations of the GPU and to see how fast we can accelerate cairo on a given chipset. (Note that even the choices I have made so far, such as sending all geometry as scan-line RECTLISTs as opposed to using pixman to compute a trapezoidal mask (cf. intel-gfx), are occasionally a net performance loss.) The goal is to make cairo-gl (and even cairo-xlib!) as fast. [And then to make cairo-drm faster...]

      When looking at these graphs, one crucial factor to remember is that these are only measurements of throughput. For a responsive display, we also need tight bounds on latency. Measuring that in a meaningful way is a task for the future. We also need a handle on measuring the performance impact on interactive applications.
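To illustrate the throughput-versus-latency point above, here is a minimal Python sketch. It is not part of cairo or its benchmark tooling; the function names `summarize` and `measure` are hypothetical and chosen only for this example. It shows how a run can report a healthy average rate while a single slow operation still dominates the worst-case latency.

```python
import statistics
import time


def summarize(samples):
    """Summarize per-operation timings (in seconds).

    Returns throughput (operations per second) alongside latency figures.
    Throughput is an average over the whole run, so it hides outliers;
    the p99 and worst-case values expose them.
    """
    ordered = sorted(samples)
    # Integer arithmetic avoids floating-point surprises in the index.
    p99_index = min(len(ordered) - 1, (99 * len(ordered)) // 100)
    return {
        "throughput": len(samples) / sum(samples),
        "median": statistics.median(ordered),
        "p99": ordered[p99_index],
        "worst": ordered[-1],
    }


def measure(op, iterations=200):
    """Time a hypothetical rendering operation `op` once per iteration."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        op()
        samples.append(time.perf_counter() - start)
    return summarize(samples)


if __name__ == "__main__":
    # 99 fast operations and one stall: the average still looks fine,
    # but the user would notice the one-second hitch.
    print(summarize([0.01] * 99 + [1.0]))
```

In this made-up data set the median stays at 10 ms while the worst case is a full second, which is the kind of stall a throughput-only benchmark cannot show.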

  2. It might be worth setting up a speed center for cairo, see:

    “I would also like to note, that if other performance-oriented opensource projects are interested, I would be willing to see if we can set-up such a Speed Center for them. There are already people interested in contributing to make it into a framework to be plugged into buildbots, software forges and the like. Stay tuned!”

    …Having these as nightly stats could help raise interest too

  3. Weird, so basically both the xlib and gl backends which attempt to accelerate things, instead in most cases cause a performance loss versus doing the entire thing in software?

    • Weird? No, just the usual stumbling blocks that a driver faces when trying to handle a rendering pattern that is not necessarily optimal for GPUs. Cairo is an immediate mode renderer that renders to lots of surfaces and combines them sequentially. This is lots of very short operations, generating lots of state changes, anathema to GPU performance. Though with a bit of time and care, the drivers can be fixed to be faster than the CPU even for Cairo, with the reduction in overhead improving the drivers for all.
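The cost of state changes described above can be shown with a toy model. This is a rough Python sketch under stated assumptions, not how cairo or any driver works: each operation is a hypothetical `(state, rect)` pair, where `state` stands in for whatever the GPU must reconfigure (target surface, source pattern, operator), and reordering is assumed to be safe, which holds only when the operations do not overlap or depend on one another.

```python
def count_state_changes(ops):
    """Count how many times consecutive operations need a different state.

    The first operation counts as one change (the initial setup).
    """
    changes = 0
    previous = None
    for state, _rect in ops:
        if state != previous:
            changes += 1
            previous = state
    return changes


def batch_by_state(ops):
    """Group operations that share a state, keeping first-seen state order.

    ASSUMPTION: the operations are independent, so reordering them does not
    change the rendered result. An immediate-mode API cannot assume this.
    """
    batches = {}
    for state, rect in ops:
        batches.setdefault(state, []).append(rect)
    return [(state, rect) for state, rects in batches.items() for rect in rects]


if __name__ == "__main__":
    # Alternating between two hypothetical states, as sequential
    # compositing of many small surfaces tends to do.
    ops = [("A", 1), ("B", 2), ("A", 3), ("B", 4)]
    print(count_state_changes(ops))                  # submitted in order
    print(count_state_changes(batch_by_state(ops)))  # grouped by state
```

Submitting the four operations in order needs four state setups, while the grouped order needs two. A real driver has far less freedom to reorder, which is part of why short, interleaved operations are hard to accelerate.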
