GSoC 2014 – The clock is again ticking!


The Google Summer of Code 2014 coding period starts tomorrow. This year, my project is to expose NVIDIA’s GPU graphics counters to userspace through Mesa. The idea follows on from my previous Google Summer of Code, which mainly focused on reverse engineering NVIDIA’s performance counters.

The main goal of this project is to help Linux developers identify the performance bottlenecks of OpenGL applications. By the end of this GSoC, almost all of NVIDIA’s GPU graphics counters for GeForce 8, 9 and 2XX (nv50/tesla) will be exposed through Nouveau. Some counters won’t be available until compute support (i.e. the ability to launch kernels) is implemented for nv50.

During the past weeks, I continued reverse engineering NVIDIA’s graphics counters for nv50. The documentation is now almost complete (except for aa, ac and af, because I don’t have them), and I recently started the same process for nvc0 cards. This documentation hasn’t been pushed to envytools yet; for now it is only available in my personal repository.

To check the reverse-engineered configuration of the performance counters, I developed a modified version of OGLPerfHarness (the OpenGL sample code of NVPerfKit). This OpenGL sample automatically monitors and exports the values of performance counters using NVPerfSDK on Windows. The figure below shows an example.


This tool is called (from a bash script) for every available counter, and it produces the following output (for the shader_busy signal in this example):


All stats produced by the OpenGL sample are available in my repo. I haven’t published the code because I don’t have the right to redistribute it, but I can send a patch if anyone is interested.

To check the configuration of these performance counters on Nouveau, I ported my tool to Linux. I was then able to compare its values with those exported from Windows, using nv_perfmon to monitor the counters.
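As a rough illustration of that comparison step, here is a minimal Python sketch. It is not the code of my tool: the function name, the dictionary layout, the 5% tolerance, the second signal name and all the numbers are placeholders invented for this example. It only assumes that each platform yields one value per signal.

```python
from math import isclose


def compare_counters(reference, candidate, rel_tol=0.05):
    """Map each signal name to (reference value, candidate value, match flag).

    `reference` and `candidate` are plain dicts of signal name -> value.
    This layout is an assumption made for the sketch, not a real export format.
    """
    report = {}
    for name, ref_value in reference.items():
        cand_value = candidate.get(name)
        if cand_value is None:
            # Signal missing on the candidate side: flag it as a mismatch.
            report[name] = (ref_value, None, False)
            continue
        ok = isclose(ref_value, cand_value, rel_tol=rel_tol)
        report[name] = (ref_value, cand_value, ok)
    return report


# Placeholder numbers, not real measurements.
windows_values = {"shader_busy": 1000, "hypothetical_signal": 500}
linux_values = {"shader_busy": 1020, "hypothetical_signal": 900}

report = compare_counters(windows_values, linux_values)
for name, (ref, cand, ok) in sorted(report.items()):
    status = "OK" if ok else "MISMATCH"
    print(f"{name}: windows={ref} linux={cand} {status}")
```

A relative tolerance is used rather than strict equality because two runs of the same scene are unlikely to produce exactly the same counts; the right threshold would have to be chosen per signal.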

Now, the plan for the next few weeks is to work on the kernel ioctl interface.

See you later!


2 thoughts on “GSoC 2014 – The clock is again ticking!”

  1. For the global counters I think a perf-based interface makes a lot more sense than rolling yet another driver private ioctl. At least that’s the direction we’re heading towards with i915. Of course things are different if you need to set sample-points within the cmd stream submitted to the gpu. But for truly global events that just roll in the background I think perf is a perfect fit. Chris Wilson has a bunch of patches floating around in case you’re interested.
