Is there a profiler? I'd like to get a better understanding of which parts of my patches are expensive, and where there might be optimization opportunities.
Cortex code profiler?
I don't think there is. Especially since axoloti isn't one big binary.
Your best bet is to look at the assembly code that is generated by the compiler. As the relevant code is executed at fixed intervals and in most cases doesn't branch a lot, it is possible to directly correlate the number of instructions/cycles to the CPU time required. S-rate code is executed 16 times as often as k-rate code, so take that into account.
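To make that arithmetic concrete, here is a rough sketch of turning cycle counts read off the assembly listing into a CPU load estimate. The 168 MHz core clock and 48 kHz sample rate are my assumptions, not something stated above, and estimate_load is just a made-up helper name; only the factor of 16 comes from this thread.

```c
#include <stdint.h>

/* Assumed figures, adjust for your board and firmware. */
#define CPU_HZ       168000000.0 /* assumed core clock in Hz */
#define SAMPLE_RATE  48000.0     /* assumed audio sample rate in Hz */
#define BLOCK_SIZE   16          /* s-rate code runs 16x per k-rate pass */

/*
 * Estimated fraction of the CPU (0.0 .. 1.0) used by one object.
 * krate_cycles:            cycles spent once per k-rate pass
 * srate_cycles_per_sample: cycles spent inside the per-sample loop body
 * Both numbers are counted by hand from the compiler's assembly output.
 */
double estimate_load(uint32_t krate_cycles, uint32_t srate_cycles_per_sample)
{
    double cycles_per_block =
        (double)krate_cycles + (double)BLOCK_SIZE * srate_cycles_per_sample;
    double blocks_per_second = SAMPLE_RATE / BLOCK_SIZE;
    return cycles_per_block * blocks_per_second / CPU_HZ;
}
```

With these assumed numbers there is a budget of 56000 cycles per k-rate pass, so an object with 40 k-rate cycles and 12 cycles per sample comes out well under 1%. Treat the result as a ballpark only: cache, flash wait states and branches will shift it.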
If that is too technical and tedious, you can always simply remove some objects and see the effect on CPU usage.
Also note that with the current execution scheme, you can shave off some CPU usage by making sure that connections between objects always go down and to the right, never up or left. The latter will add a 16 sample delay in the processing chain and that adds CPU time (and memory footprint) as well.
In some of my polyphonic patches I was able to shave off 3% or more, just by dragging objects around to make sure the connections go down and to the right.
(developer talk, assuming familiarity with embedded development tools)
It may be possible to do unobtrusive statistical real-time profiling without a debugger, by programming a timer to fire at random intervals, peeking at the interrupted program counter on the stack, and building a histogram of relevant program counter values, say, in SDRAM. Associating the program counter values with the C++ code may give a distorted view due to compiler optimizations, but I can imagine that this approach would be useful.
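A minimal sketch of the bookkeeping part of that idea, untested on hardware. The names pc_hist_t, pc_hist_record and next_interval are hypothetical, and the timer setup plus the code that obtains a pointer to the stacked exception frame (MSP or PSP, depending on what was interrupted) are hardware-specific and left out. The one fixed fact used here is the Cortex-M exception frame layout: r0, r1, r2, r3, r12, lr, pc, xpsr, so the interrupted PC is word 6.

```c
#include <stddef.h>
#include <stdint.h>

/* Word index of the stacked PC in a Cortex-M exception frame. */
enum { FRAME_PC = 6 };

typedef struct {
    uint32_t  base;    /* lowest code address of interest (hypothetical) */
    uint32_t  shift;   /* log2 of the bin size in bytes */
    size_t    nbins;   /* number of histogram bins */
    uint32_t *bins;    /* histogram storage, e.g. a buffer in SDRAM */
    uint32_t  outside; /* samples that fell outside the range */
} pc_hist_t;

/* Call from the timer interrupt with a pointer to the stacked frame. */
void pc_hist_record(pc_hist_t *h, const uint32_t *frame)
{
    uint32_t pc = frame[FRAME_PC];
    if (pc < h->base) {
        h->outside++;
        return;
    }
    size_t bin = (size_t)((pc - h->base) >> h->shift);
    if (bin < h->nbins) {
        h->bins[bin]++;
    } else {
        h->outside++;
    }
}

/*
 * Pseudo-random next timer period in [min, min + span), using xorshift32.
 * *state must be non-zero and span must be greater than zero.
 * Randomizing the period avoids sampling in lockstep with the fixed-rate
 * audio processing.
 */
uint32_t next_interval(uint32_t *state, uint32_t min, uint32_t span)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return min + x % span;
}
```

Afterwards the bins can be read out and matched against the linker map or a disassembly to see where the samples cluster.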
Instrumenting functions for function call counting would hurt performance a lot, and the results would be boring for the dsp code.
Here is a tutorial about profiling, but I have not done this.
On the RTOS side, ChibiOS can be configured to keep per-thread statistics, and after firmware recompilation and flashing, those could be read out using an st-link debugger, or self-reported by an object similar to the jt/debug_threads object.