“Premature optimization is the root of all evil” is probably the most respected rule in Krita development. But now that we have made three major releases, and even though the development version is under heavy refactoring, most of the framework is in good shape, so it is time to start optimizing. I started working on this actively in September, after the feature freeze for 1.6, and some of the progress will be included in the upcoming 1.6.1 (mostly in the convolution code, painting, and gradients).
Optimization is divided into two tasks. The first is to find where your code is slow (and most of the time, the function causing the problem is not the one you expected to be slow); the second is to find how to remove the slowness. Each task needs the proper tool. For the second, it is your brain. For the first, there are a lot of possible solutions, which fall into three categories:
- Tools which require a special compilation option, mainly gprof. It is annoying to use, as you must recompile your code with the “-pg” option, which slows down your program. On the other hand, the results are accurate and reproducible.
- valgrind is a virtual machine and is therefore very slow to execute. Its main tool for optimization is callgrind/calltree, which counts executed instructions (and, optionally, simulated cache misses) rather than real CPU cycles, so while it gives an account of where your code spends its time, it is not an accurate measurement of CPU cycles. On the other hand, unlike gprof, you don’t need to rebuild your application just for profiling, and unlike oprofile and sysprof, the results are reproducible, which means that you can take the output for two different implementations and know which one is faster.
- Kernel modules which measure CPU cycle consumption. The two I know of are oprofile and sysprof. Both work by inserting a module in the kernel which measures how many CPU cycles are spent in a given function. The main advantage is that your application runs at its normal speed. The problem is that this is an experimental analysis, so the results are not reproducible: they depend on how many measurements the kernel module was able to take. This means you can’t use them to compare two implementations, and small, fast functions which are called a lot, and are therefore expensive, might go unnoticed by the module. The difference between the two is that sysprof offers very basic functionality but is very easy to use (everything is accessible through a GUI, as shown in the screenshot below), while oprofile is a very powerful tool but difficult to master.
If you have heard of another profiler for Linux with which you had a good experience, I will be more than happy to hear about it.
Until recently I had been a long-time user of valgrind. My main issue is the speed: with Krita especially, it can take ten minutes to load when all plug-ins are installed. Because of that, I am now mostly using sysprof, unless I need to compare two implementations, in which case I fall back to valgrind.
In a future blog entry, I will walk through a practical example of using sysprof.