Call for patches that push Core to its limits, expose issues

urklang · 2019-10-08 19:11:23 UTC

We should be able to unlock the 64k SRAM table limitation as well. It's currently in a 128k region, but we can probably go up from there a bit. We've got about 1M to work with overall, up from 256k on the original Core.

jaffasplaffa · 2019-10-08 19:18:09 UTC

Sounds great about SRAM too. So excited here

Not really sure what more we can test here.

The SRAM + SDRAM problems are pretty much the only limits I encounter all the time. Besides that I don't find a lot of issues.

I just look forward to seeing what this upgrade will lead to.

Higher quality filters, oscillators etc.? It sure will be an option

Pygmy · 2019-10-08 22:04:45 UTC

Bring on the convolution reverbs
If I could just get a reverb to sound as nice as my old Behringer Rev2496 V-Verb I'd be a very happy man

jaffasplaffa · 2019-10-09 06:59:10 UTC

Yeah I heard good things about that one too.

Also being able to actually have, like, kick, bass, hihats, pad, lead... AND on top of that have a reverb and maybe a delay loaded, is going to be soooooo sweet. Already saving money for the new Axo, hehe

urklang · 2019-10-09 19:46:34 UTC

I'm a huge reverb guy too. This is something that I definitely want to explore, just getting a bunch of the known reverb implementations ported over and usable.

Tangent: I have one of the Zoom CDR pedals. It has some pretty neat clones of Eventide algorithms among other things. If you have links out to interesting reverb code or existing implementations, post and I'll look at getting them ported.

jaffasplaffa · 2019-10-09 19:48:49 UTC

@urklang

Should we do it here or maybe use this thread for reverb talk?

jotelcalifornia · 2019-10-10 10:28:33 UTC

hah, since my last post in that thread pops up: I don't know if this is relevant to you, but the patch you see there (8ap-loop-plus_myedit) pretty much maxes out SRAM. When I was writing it, my DSP load was around 60%, so I thought I could still add lots of options. I definitely wanted to implement some form of pre-delay (right now it's implemented pretty crudely, as just a single delay), but I can't really add anything without getting the overflow error.

jaffasplaffa · 2019-10-10 11:32:32 UTC

Yeah that is the problem. I think I never had the DSP meter go to 100%, because the Sram is pretty much always overloaded before you get that far

SRAM is by far the biggest issue on Axoloti. But with the new one, we get a lot more of that too. I really look forward to this being up for sale

urklang · 2019-10-10 13:28:04 UTC

This is really valuable feedback guys. Thanks. I'll keep as much SRAM headroom available as I possibly can in the new implementation. Memory-related bottlenecks are always going to be a barrier even with the significantly increased CPU clock. There will likely be other opportunities for savings though as I get more into optimizing other parts of the system for the H7.

jaffasplaffa · 2019-10-12 08:31:33 UTC

Yes that would be really great.

SRAM and SDRAM are the biggest issue. I never managed to max out the DSP in any way, but the SRAM limit is reached in pretty much every patch, and often the SDRAM limit too.

I like to use Axoloti as a full production suite, not only as an instrument, like a bass or a Karplus-Strong algo. I like to use it as a full suite where everything is playing from the device, like kick, bass, hihat, snare, a pad and a lead, and then maybe a reverb and a delay effect. You pretty quickly max out the old Axo's SRAM and SDRAM by wanting that. With this new one I am very hopeful that this will be a lot more fun, with more headroom for both RAM types.

So yeah, the more headroom for these two guys, the better

Captain_Burek · 2019-10-13 11:46:40 UTC

Just on the off chance no-one's thought of this: An easy way to get to the DSP as well as S(D)RAM limits for me has always been to take a polyphonic patch and up the number of voices.

From me, too, more power to you for taking this on!

jotelcalifornia · 2019-10-13 22:01:29 UTC

one small update to this: I tried using the Axoloti with this patch (8ap-loop-etc.) as a MIDI thru (on top of the reverb) and found that it won't deliver a steady clock with this patch (SRAM nearly maxed out). When I add the thru objects to an empty patch it works just fine. It has to be an issue with the SRAM, because as I said, my DSP load is at about 60%, so there should be some headroom.

Still don't really get how the SRAM works, are there any threads explaining it? I mean, I get that for delays, allpasses, etc. you need some RAM to store the audio, but it seems almost all objects, regardless of whether they use a buffer, eat some SRAM away.
E.g. my patch was overflowing and I deleted a normal map b object, and then it worked fine!

jaffasplaffa · 2019-10-13 22:06:13 UTC

For delays and tables and so on, it is recommended to use the SDRAM version. Those objects should have "SdRam" at the end of the name. That saves SRAM. But yes, I think you are right, pretty much any object uses some SRAM, some more than others. And the really memory-consuming ones, again, like delays etc., have their own SDRAM versions.
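
A minimal sketch of what moving a delay line out of SRAM can look like in C. This is an illustration, not the code of any factory object: the `IN_SDRAM` macro, the `USE_SDRAM_SECTION` switch and the `.sdram` section name are assumptions that depend on the firmware's linker script, and `delay_process` is a hypothetical helper. On a host build the macro expands to nothing, so the same code still compiles and runs.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the target linker script defines a section that maps to
   external SDRAM. On a host build the attribute is simply dropped. */
#if defined(__arm__) && defined(USE_SDRAM_SECTION)
#define IN_SDRAM __attribute__((section(".sdram")))
#else
#define IN_SDRAM
#endif

#define DELAY_LENGTH 65536u /* power of two, so indices wrap with a mask */

/* 65536 samples * 4 bytes = 256 KiB: the kind of buffer that belongs in
   SDRAM rather than in the much smaller on-chip SRAM. */
static int32_t delay_buffer[DELAY_LENGTH] IN_SDRAM;
static uint32_t write_pos;

/* Hypothetical helper: push one sample in, read the sample written
   delay_samples calls earlier (expected range 1..DELAY_LENGTH). */
int32_t delay_process(int32_t in, uint32_t delay_samples) {
    uint32_t read_pos = (write_pos - delay_samples) & (DELAY_LENGTH - 1u);
    int32_t out = delay_buffer[read_pos];
    delay_buffer[write_pos] = in;
    write_pos = (write_pos + 1u) & (DELAY_LENGTH - 1u);
    return out;
}
```

Only the buffer moves: the object's code and its small bookkeeping variables (here `write_pos`) still live in SRAM, which matches the observation that every object costs some SRAM.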

lokki · 2019-10-14 06:46:23 UTC

all objects use sram simply because you usually save variables etc. there.

so if you use an arpeggiator or a lowest note priority keys object those will eat 128 bytes each just for storing all the possibly held down keys...
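
To make the 128-byte figure concrete, here is a rough sketch of the kind of per-instance state such an object might keep: one byte per MIDI note number. The struct and function names are hypothetical and are not taken from the factory library.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-instance state for a lowest-note-priority keys object:
   one flag byte per MIDI note (0..127), i.e. 128 bytes of SRAM. */
typedef struct {
    uint8_t held[128]; /* 1 while that note is held down */
} lowest_note_state_t;

void lowest_note_init(lowest_note_state_t *s) {
    memset(s, 0, sizeof *s);
}

void lowest_note_on(lowest_note_state_t *s, uint8_t note) {
    s->held[note & 0x7F] = 1;
}

void lowest_note_off(lowest_note_state_t *s, uint8_t note) {
    s->held[note & 0x7F] = 0;
}

/* Returns the lowest held note, or -1 if no key is down. */
int lowest_note_get(const lowest_note_state_t *s) {
    for (int n = 0; n < 128; n++) {
        if (s->held[n]) {
            return n;
        }
    }
    return -1;
}
```

Every instance of an object like this carries its own copy of the array, so a few arpeggiators and key trackers in one patch add up even though none of them holds an audio buffer.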

urklang · 2019-10-14 17:43:27 UTC

Actual machine instructions for each patch object are also saved in SRAM, not just their data storage.
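
A back-of-the-envelope sketch of why removing one small object can make an overflowing patch fit: both the code and the data of every object count against the same SRAM region. All names and byte counts used with it are made up for illustration; they are not measurements from a real patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cost record for one patch object. */
typedef struct {
    const char *name;
    size_t code_bytes; /* compiled instructions loaded into SRAM */
    size_t data_bytes; /* variables and buffers kept in SRAM */
} object_cost_t;

/* Total SRAM the patch needs: code plus data for every object. */
size_t patch_sram_total(const object_cost_t *objs, size_t count) {
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += objs[i].code_bytes + objs[i].data_bytes;
    }
    return total;
}

/* Returns 1 if the patch fits in a region of region_bytes, else 0. */
int patch_fits(const object_cost_t *objs, size_t count, size_t region_bytes) {
    return patch_sram_total(objs, count) <= region_bytes;
}
```

With a patch that sits just over the limit, dropping an object with almost no data can still tip it back under, because its instructions alone take up space.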

One approach I want to try eventually is to just put all of the factory objects in Flash. I'm of the opinion that this is probably how it should have worked from the beginning. The vast majority of objects never change, yet we're forcing ourselves to recompile them all the time.

With this approach the "patch" becomes lighter weight; most of the time we're referring to the factory objects that are already available in Flash. Objects that are being dynamically edited can still be loaded into SRAM as edits are taking place.

In other words, I think there are bottlenecks in the current system coming from the software architecture, not from the hardware itself.
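
A rough sketch of the shape this could take, under stated assumptions: the factory objects are compiled once into a `const` table (which a typical Cortex-M linker script places in Flash along with the functions it points to), and a patch is only a list of references into that table plus per-instance state in SRAM. None of these types or names exist in the current firmware; they are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for a precompiled factory object. */
typedef struct {
    const char *name;
    size_t state_bytes; /* SRAM needed per instance */
    int32_t (*process)(void *state, int32_t in);
} factory_object_t;

typedef struct { int32_t gain; } gain_state_t;
typedef struct { int32_t prev; } onepole_state_t;

static int32_t gain_process(void *state, int32_t in) {
    return in * ((gain_state_t *)state)->gain;
}

static int32_t onepole_process(void *state, int32_t in) {
    onepole_state_t *s = (onepole_state_t *)state;
    s->prev = (s->prev + in) / 2; /* crude averaging filter */
    return s->prev;
}

/* const table: read-only, so it can live in Flash and never be recompiled. */
const factory_object_t factory_objects[] = {
    { "gain", sizeof(gain_state_t), gain_process },
    { "onepole", sizeof(onepole_state_t), onepole_process },
};

/* A patch node only references a factory object and owns its state. */
typedef struct {
    const factory_object_t *object;
    void *state;
} patch_node_t;

/* Run one sample through the chain of nodes. */
int32_t patch_run(const patch_node_t *nodes, size_t count, int32_t in) {
    int32_t x = in;
    for (size_t i = 0; i < count; i++) {
        x = nodes[i].object->process(nodes[i].state, x);
    }
    return x;
}

/* SRAM used by the patch: instance state only, no code. */
size_t patch_sram_bytes(const patch_node_t *nodes, size_t count) {
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += nodes[i].object->state_bytes;
    }
    return total;
}
```

In this layout, rewiring a patch only changes the node list, so no compile step is needed unless someone edits an object's implementation.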

tele_player · 2019-10-14 19:11:12 UTC

Urklang, I agree that it might be nice for unchanging objects to be in flash, but I don’t think the fact that they get recompiled every time merits much concern. On even an old core2duo MacBook Pro, patches compile and load quickly, in my opinion. On my newer i7, it’s even better.

And I’m certain that treating all objects the same is simpler to implement.

Is there any performance difference between code running from flash vs. from SRAM?

urklang · 2019-10-14 20:10:24 UTC

From the literature and from anecdotal evidence there shouldn't be any advantage to running instructions from SRAM. People seem to observe that SRAM can actually be slower due to caching and CPU pipeline structure (One example: https://community.st.com/s/question/0D50X00009Xkh2VSAR/the-cpu-performance-difference-between-running-in-the-flash-and-running-in-the-ram-for-stm32f407).

The fundamental issue is that we're taking the special case, i.e. dynamically editing an object implementation, and applying it universally. It's over-engineered; there's no technical reason for it. Yes, it might be tolerable to deal with a compile cycle because your host machine is fast. That doesn't make it correct or desirable. Compilation should only occur when someone is changing an object implementation. This approach also has the effect of making everything harder to debug. If all of that object implementation code is compiled ahead of time, it's much easier for a debugger to be aware of all of those symbols.

The actual object implementation can remain largely the same whether in Flash or SRAM. You would simply call into object implementations in Flash as needed. I would argue that working from Flash would be easier overall because there could be less concern about the exact positions of things in memory. The patcher right now actually has awareness of explicit memory addresses on the target device, again, for no real technical benefit. It's a very brittle design.

The patch load process could just as easily target Flash actually, but then we'd get into wear-leveling concerns, etc.

The bottom line is that the vast majority of patching could easily be done completely live without any compile cycle at all. It has the added bonus of actually being simpler to work with and to debug.

Ha, sorry if this all comes across as hyper-critical and doom and gloom! I'm just trying to make this thing the best it can possibly be. It bothers me that people are dealing with what I see as self-imposed technical limitations.

tele_player · 2019-10-14 20:36:10 UTC

I don't see it as hyper-critical or gloom and doom, I'm just not convinced it's desirable enough to merit much work.
The last guy to dig deep into rewriting how the system works has been mostly missing in action for a few years. We don't want that to happen before you ship some upgraded hardware.

urklang · 2019-10-14 20:49:37 UTC

Great! I agree that this is all stuff that isn't happening until the new hardware is out in the wild. Definitely enhancements, not in scope just to get hardware shipped.

Zaphod · 2019-10-16 16:13:55 UTC

The patch that I've attached is my first patch on Axoloti! It is an alpha version of a 12-band vocoder. I intended to make a 31-band third-octave vocoder, but soon found out that this would demand too many resources on Axoloti, so I scaled it down to 12 bands. Depending on the positioning of the elements of the patch in the GUI, it either compiles or I get an error message telling me that there is an overflow. I hope this can be of some help in the further development of Axoloti. And if anyone uses it to make some music, please send me a link, I'd be very curious to know what people use it for. To use it, put a carrier signal on the right input and a modulator signal on the left input. The 12 resulting bands are panned a bit, so any signal should result in some stereo output signal. The vca + const/i elements are meant to scale the signals. Depending on your input signals you may have to adjust these.

vocode-o-matic_emulatie_12x12_banden_linear_03.axp (73.2 KB)