Best Practices for I2C Objects/Drivers


#1

@SmashedTransistors wrote:

Maybe it is a good idea to start a new thread about "tips and good practices for I2C objects".

Here's my take on writing I2C drivers for Axoloti. Feel free to comment/contribute.

Introduction

I2C is a shared bus technology.
i2c wikipedia

Lots of interesting chips have an I2C interface and the Axoloti provides an I2C bus on the IO pins.
If you have a new I2C device you want to hook up, you'll need to write an I2C driver/object for it.

I2C is a shared bus.

The Axoloti is the bus master. Each device is a bus slave. There can be multiple slaves on a single bus (each with a unique 7-bit I2C address). The Axoloti initiates a bus transaction (read/write) and the slave device responds. Each device is controlled by its own driver, which will typically be a patch object.

Here are some best practices to ensure different device drivers work well on the same bus.

1. Provide a way for the user to specify a unique device address.

When the master (Axoloti) starts a transaction on the bus, it specifies the device it wants to talk to by using a 7-bit I2C address. The master talks to one device at a time, so each device needs a unique I2C address. The device datasheet will tell you which addresses a specific device can use. The device will often have some pins to customize (with hardware jumpers) some bits of the address. This allows the user to have multiple instances of the same device on the bus, each with a unique address. The driver writer needs to provide a way to specify which device address a driver should be using.

The device address needs to be specified at compile time. On Axoloti this can be handled by an attribute in the object:

<attribs>
<combo name="adr">
<MenuEntries>
<string>0x1d</string>
<string>0x53</string>
</MenuEntries>
<CEntries>
<string>0x1d</string>
<string>0x53</string>
</CEntries>
</combo>
</attribs>
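To make the 7-bit address concrete, here is a minimal sketch of how it maps to the address byte on the wire. The helper names are hypothetical (not part of Axoloti or ChibiOS). Datasheets sometimes quote the shifted 8-bit read/write values instead of the 7-bit address, so it is worth checking which form you have before putting it in the combo.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if adr is a usable 7-bit I2C address.
 * 0x00-0x07 and 0x78-0x7F are reserved by the I2C specification. */
static bool i2c_adr_is_valid(uint8_t adr) {
  return adr >= 0x08 && adr <= 0x77;
}

/* Address byte sent on the wire for a write (R/W bit = 0). */
static uint8_t i2c_write_byte(uint8_t adr) {
  return (uint8_t)(adr << 1);
}

/* Address byte sent on the wire for a read (R/W bit = 1). */
static uint8_t i2c_read_byte(uint8_t adr) {
  return (uint8_t)((adr << 1) | 1u);
}
```

For the two addresses in the combo above, 0x1d corresponds to 0x3a/0x3b (write/read) and 0x53 to 0xa6/0xa7.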

2. Initialise the I2C bus only once.

The I2C bus needs to be initialized in the patch. A given patch might control multiple I2C devices, but the I2C bus should only be initialized once. The bus is initialized by a call to the i2cStart() function. This should not be called within the driver because with multiple devices each driver would make a call to initialize the bus. It should be called through a single instance of the i2c config object.

axoloti-factory/objects/gpio/i2c/config.axo

This object does not connect to anything in the patch, it just handles the IO setup and shutdown for the I2C bus.

3. Allocate I2C Transaction Buffers out of the SRAM2 segment.

The buffers used to tx/rx bytes from the I2C device need to be in the SRAM2 segment. If we look at the linker script we see:

SRAM2 : org = 0x2001E000, len = 0x00002000 /* second half (8kB) of SRAM2 for DMA*/

The bytes are moved between memory and the I2C controller by DMA, so the buffers need to be in memory that can be DMA'ed. In the driver you can specify the section in which the linker will allocate the storage by using the GCC __attribute__ ((section(...))) specifier (strictly an attribute, not a pragma).

Example:

static uint8_t rxbuf[32] __attribute__ ((section(".sram2")));

This is more of a requirement than a best practice. If you don't do it the I2C bus transactions won't work.
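If you want to double check where a buffer ended up, a range test against the linker script values is enough. This is only a sketch: in_sram2 is a hypothetical helper, and the constants are copied from the linker script line quoted above, so they would need updating if the memory map changes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the SRAM2 line of the linker script. */
#define SRAM2_ORG 0x2001E000u
#define SRAM2_LEN 0x00002000u

/* True if the len bytes starting at address adr lie entirely inside SRAM2. */
static bool in_sram2(uint32_t adr, size_t len) {
  return adr >= SRAM2_ORG &&
         len <= SRAM2_LEN &&
         (adr - SRAM2_ORG) <= SRAM2_LEN - len;
}
```

On the target you could call it once in the init function, e.g. in_sram2((uint32_t)(uintptr_t)rxbuf, sizeof(rxbuf)), and log a message if it returns false.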

4. Device IO operations should be in their own thread.

Axoloti uses ChibiOS which is a multi-threaded RTOS. The DSP thread calls the krate/srate functions of the patch and computes the audio. The DSP thread needs to run as quickly as possible so it can feed the codec.
The DSP thread should not be waiting for external IO to take place. The I2C bus operations should run in their own thread where they can run asynchronously without slowing down the DSP thread.

Example:

In the init function we create the IO thread:

// create the polling thread
s->thd = chThdCreateStatic(s->thd_wa, sizeof(s->thd_wa), NORMALPRIO, adxl345_thread, (void *)s);

And in the dispose function we terminate the thread:

// stop thread
chThdTerminate(s->thd);
chThdWait(s->thd);

5. Lock/Unlock the I2C bus around bus operations.

With each device driver running in its own thread we have multiple threads each trying to access the same I2C bus. The drivers will access the bus asynchronously, so we have to make sure access is properly shared. ChibiOS provides functions to lock/unlock the bus: i2cAcquireBus() and i2cReleaseBus(). A given thread needs to get the lock before it uses the bus and release the lock after it has finished with the bus.

Example:

i2cAcquireBus(s->dev);
msg_t rc = i2cMasterReceiveTimeout(s->dev, s->adr, s->rx, 2, TTP229_I2C_TIMEOUT);
i2cReleaseBus(s->dev);

It's up to the driver writer to minimize lock time and not hog the bus. Many devices will be periodically polled/updated. Be a good neighbor. Take a look at the device datasheet and work out how slowly you can poll the device.
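A rough way to see how much of the bus a polled device uses is to estimate the transfer time and compare it with the polling period. The sketch below is back-of-envelope arithmetic only: the function names are hypothetical, the bus clock is passed in as a parameter (it is not taken from the Axoloti configuration), and start/stop conditions, register-address writes and clock stretching are ignored.

```c
#include <stdint.h>

/* Approximate time in microseconds that one transfer of n data bytes
 * occupies the bus at clock rate hz (hz must be non-zero).
 * Every byte, including the address byte, takes 9 clock periods
 * (8 bits plus the ACK bit). The result is rounded up. */
static uint32_t i2c_xfer_us(uint32_t n, uint32_t hz) {
  uint64_t clocks = 9u * ((uint64_t)n + 1u);
  return (uint32_t)((clocks * 1000000u + hz - 1u) / hz);
}

/* Bus load in percent (rounded down) when that transfer is repeated
 * every period_us microseconds (period_us must be non-zero). */
static uint32_t i2c_bus_load_pct(uint32_t n, uint32_t hz, uint32_t period_us) {
  return (uint32_t)(((uint64_t)i2c_xfer_us(n, hz) * 100u) / period_us);
}
```

For example, a 2-byte read at 100 kHz comes out at about 270 us, so polling it every 10 ms uses roughly 2% of the bus.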

6. Lock shared memory access between DSP and IO threads.

The device driver IO thread and DSP thread need to communicate. This is typically done with memory shared between the threads. When one thread is writing the data, the other thread should not be reading it, and vice versa. Each thread has to get exclusive access to the shared memory before it reads or writes it.

Example:

In the IO Thread:

// write to shared variables
chSysLock();
s->x = x;
s->y = y;
s->z = z;
chSysUnlock();

In the DSP Thread:

// read from shared variables
chSysLock();
x = s->x;
y = s->y;
z = s->z;
chSysUnlock();

In this example if you don't lock the shared variable access, you will potentially have x,y,z values being mixed up between samples.

Note: Using chSysLock/chSysUnlock is an interrupt disable/enable. This gets the job done, but should only be used for very short lock periods because it stops all threads, not just the one that is contending for the variable access. ChibiOS provides other, more sophisticated, methods for inter thread communication (not all of which are enabled in the standard Axoloti firmware...).
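One way to keep the locked region as short as possible is to group the shared values in a struct and do nothing but a struct copy inside the lock. The sketch below assumes hypothetical DRV_LOCK/DRV_UNLOCK wrappers: on the target they would map to chSysLock()/chSysUnlock(), and here they are no-ops so the code builds on a host. The type and function names are also made up for illustration.

```c
#include <stdint.h>

/* Hypothetical wrappers. On the target, define these as chSysLock() and
 * chSysUnlock() before this point; the defaults below do nothing. */
#ifndef DRV_LOCK
#define DRV_LOCK()   ((void)0)
#define DRV_UNLOCK() ((void)0)
#endif

typedef struct {
  int16_t x, y, z;
} xyz_t;

typedef struct {
  xyz_t shared; /* written by the IO thread, read by the DSP thread */
} drv_state_t;

/* IO thread: publish a complete sample in one short critical section. */
static void drv_publish(drv_state_t *s, const xyz_t *v) {
  DRV_LOCK();
  s->shared = *v;
  DRV_UNLOCK();
}

/* DSP thread: take a consistent copy, then work on the copy unlocked. */
static xyz_t drv_snapshot(drv_state_t *s) {
  xyz_t v;
  DRV_LOCK();
  v = s->shared;
  DRV_UNLOCK();
  return v;
}
```

All scaling, filtering and so on then happens on the local copy, outside the lock.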

Driver Examples

Here are some I2C object/drivers that use the aforementioned practices:

mpr121 driver

ttp229 driver

Note:
I have an aversion to writing C code in XML files, so the bulk of the driver code for these examples is in the *.h include file. This makes the driver code a bit different from the "all in the axo" form but for either form the best practices still apply.


#2

cool stuff, and a great write up!


won't the chSysLock() potentially block the DSP (for a very short while) ?!

what id be tempted to do is use a lock free ring buffer, this is what I did for the (firmware) midi implementation. doesn't have to be a complex implementation if you have a good understanding of the timing of the IO on the I2C - the buffer only has to be big enough to allow for 'some latency/jitter'

yeah, I do the same for anything that's non-trivial


#3

@thetechnobear says:

won't the chSysLock() potentially block the DSP

Yes - it turns into a global interrupt disable, so the thread can't be preempted until it has called the unlock.

For something like a touch sensor driver you only need a single uint32 as the result of the sensor read so I think a quick lock/unlock around the shared memory reference is probably about as fast as you can manage.

Someone might say: "You should be using a mutex!"

Check it out:

void chMtxLock(Mutex *mp) {
  chSysLock();
  chMtxLockS(mp);
  chSysUnlock();
}

Still doing a quick global interrupt disable/enable. If you only have to share a single uint32 then the interrupt disable/enable will be faster. It's also worth noting that the DSP thread can't spend its time waiting on any sort of lock. Whatever you do it needs to be quick.

Of course for some drivers (e.g. a keyboard scanner) you may need something more sophisticated like a circular buffer, in which case the code needs to be more complicated. If you've got the chops to write lock-less code that works properly, more power to you :slight_smile:

BTW - ChibiOS has a nice feature called mailboxes.
RT mailboxes
It would be nice to use in drivers, but the feature is not enabled in the standard firmware build. Could we get it enabled for the next release?


#4

I didn’t say use a mutex, i said use a lock free strategy.


#5

Would this kind of strategy lead to use of double or triple buffering ?


#6

@thetechnobear

use a lock free ring buffer, this is what I did for the (firmware) midi implementation.

Can you point me to a code reference?
I've been looking for telltale strex/ldrex instructions, but I don't see anything in either master or experimental...

Thanks.


#7

Is this related to lock-free atomic read-modify-write operations? Is it useful on a single core processor?
As I have not had to care about multi-threading on the Axoloti yet, I'm not aware of all the mechanisms available.
That's very interesting.


#8

@SmashedTransistors

Yes - lock free techniques are generally useful on both single core and multi-core devices. ChibiOS has the usual set of semaphores, mutexes, mailboxes, etc. used for data sharing between threads. All of them disable interrupts to lock out other threads and enforce the atomic sharing of data. Lock-less techniques don't lock out other threads, they can still run. The Cortex-M4 ISA provides load exclusive and store exclusive instructions used to implement lock-less data sharing.

Now: I wouldn't consider myself an expert at concurrent programming, and certainly not at lock-less sharing, but from what I understand getting lock-less right is tough. The conservative approach is probably to use the conventional locking primitives and see if that performs well enough. If yes - declare success. If no - then maybe consider lock-less as an optimization.


#9

yeah doing a proper lock free implementation is complex - but there are lots of freely available versions to use, usually in the form of a ring buffer. (as they also need to not allocate memory)

Ive used them on other limited platforms (rPI/Organelle) to good effect, though ive not reviewed how much 'extra' memory they take , which may or may not be an issue with axoloti.


what I did for the midi ring buffer was simply to do it without locks (*), with detection for the read/write pointers crossing. but I was quite careful how I did the implementation, to ensure any theoretical thread contentions are minimised e.g. only writing in one thread - and making sure comparisons were very limited.
(so your implementation has to be a bit more considered, than just taking out the locks - I know exactly where the possible thread contention are, how likely (minimal) , and the consequences )

theoretically I could have used 'test and set' atomic operations, but I didn't because it doesn't really matter that much in this use case.
if the ring-buffer is overflowing there's little you can do about it... it's more than likely that the producer is still going to keep producing at a higher rate than the consumer, so knowing 1 byte before or after doesn't really change the outcome :wink:

basically its an optimistic strategy, where due to the ring buffer size, really it should not fail, and if it does its not the end of the world. ideally we would allow the user to optionally configure the ring buffer size, based on their needs.
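
For readers who want something concrete, here is a minimal single-producer/single-consumer ring buffer sketch. It is not the firmware MIDI code mentioned above, just a generic illustration; all names are hypothetical. It assumes a single core, exactly one thread calling rb_put and exactly one thread calling rb_get, and a buffer size that is a power of two. Each index is only ever written by one side, and the data is stored before the index is published.

```c
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 16u /* must be a power of two */

typedef struct {
  volatile uint8_t buf[RB_SIZE];
  volatile uint32_t head; /* written only by the producer (IO thread) */
  volatile uint32_t tail; /* written only by the consumer (DSP thread) */
} ringbuf_t;

/* Producer side: returns false and drops the byte when the buffer is full. */
static bool rb_put(ringbuf_t *rb, uint8_t v) {
  uint32_t head = rb->head;
  if (head - rb->tail >= RB_SIZE) {
    return false; /* full: the indices are free-running, so this is the fill level */
  }
  rb->buf[head & (RB_SIZE - 1u)] = v;
  rb->head = head + 1u; /* publish only after the data has been stored */
  return true;
}

/* Consumer side: returns false when the buffer is empty. */
static bool rb_get(ringbuf_t *rb, uint8_t *v) {
  uint32_t tail = rb->tail;
  if (tail == rb->head) {
    return false; /* empty */
  }
  *v = rb->buf[tail & (RB_SIZE - 1u)];
  rb->tail = tail + 1u; /* free the slot only after the data has been read */
  return true;
}
```

The DSP thread would typically drain the buffer once per k-rate cycle with a loop like while (rb_get(&rb, &v)) { ... }, and the IO thread decides what to do when rb_put reports an overflow.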


in this example id suspect its similar, you can have a ring buffer which should never overrun because you know the timing involved on the i2c and you know the timing of the event loop (k-rate).
so under normal circumstances you know the max number of i2c events you expect in a k-rate cycle - add a bit more space for headroom and you're done.

in some ways I guess it's a bit like setting the sample rate buffer size on a computer - you set it as low as you can go without experiencing issues

anyway, it's just an idea - my personal preference is to avoid locks in the audio thread at all costs, as they invariably lead to little glitches when load increases, which are really hard to track down later.