Tuesday, 8 December 2015

Communicating between two (or more) Arduino / PIC / AVR chips in a noisy environment

Every now and again, in the middle of something else, you have to take a break and work on something completely unrelated. Sometimes, someone asks for some advice on the very thing you're currently working on. That's exactly what happened when Mike got in touch about a project he was working on that involves talking to up to 30 AVR chips (actually, Arduino-style ATMega328s) from a master controller. While initially it seemed quite straightforward, Mike's project involves each chip being on a motor controller board - so lots of noisy EMF and back EMF as the motors start, stop and reverse.

To complicate matters, each chip lives in a separate, repeatable, stand-alone module, which may or may not be rotating, depending on the state of the motor it's controlling. So connections between the modules are being made using slip-rings.

So as well as the electro-magnetic noise generated by the motors, there's also the possibility of noise from the (data) lines passing through the slip-rings too. This is going to call for a well-thought-out communications protocol, and plenty of "smoothing" capacitors!

Of course, we're going to be putting plenty of de-coupling or smoothing capacitors on the power supplies of all our chips. In fact, in a simple test, driving the motor without decoupling caps on the supply to the AVR chip made the chip go haywire - intermittently resetting, locking up, and failing to exit looping routines.

But we're also going to have to think about smoothing our data/comms cables too.
Rather than a single-wire, timing-based system (like UART/serial) we're going to use a two-wire clock-and-data system, more like I2C. This should prove more robust than a timing-based protocol, in the event of a "crackle" in the line just at the point that we're reading a logical one or a zero from the wire.

We're going to assume that our data lines are (electrically) very noisy. This is always the safest position to work from. Let's quickly review how I2C works:

The master device pulses a clock line up and down. The slave device monitors the clock line for a change in state (for example, a low-to-high transition, or a high-to-low, whichever you prefer). When this transition occurs, the slave device looks at the data line to see if it is high (logical one) or low (logical zero).
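To make the idea concrete, here's a minimal sketch of the slave side, written as plain C++ so it can be run on a desktop rather than on the chip. It is not Mike's firmware: the names `LineSample`, `decodeOnRisingEdge` and `encodeByte` are made up for illustration, and it assumes a low-to-high clock transition, a clock that idles low, and bits sent most-significant-bit first (none of which is fixed by anything above).

```cpp
#include <cstdint>
#include <vector>

// One sample of the two wires, as the slave might see them on each poll.
struct LineSample {
    bool sck;  // clock line level
    bool sda;  // data line level
};

// Decode bytes by watching for a low-to-high transition on the clock line
// and reading the data line at that moment.
// Assumption: bits arrive most-significant bit first.
std::vector<uint8_t> decodeOnRisingEdge(const std::vector<LineSample>& trace) {
    std::vector<uint8_t> bytes;
    uint8_t current = 0;
    int bitCount = 0;
    bool lastSck = false;  // assumption: the clock idles low

    for (const LineSample& s : trace) {
        if (s.sck && !lastSck) {  // rising edge detected
            current = static_cast<uint8_t>((current << 1) | (s.sda ? 1 : 0));
            if (++bitCount == 8) {  // a full byte has been clocked in
                bytes.push_back(current);
                current = 0;
                bitCount = 0;
            }
        }
        lastSck = s.sck;
    }
    return bytes;
}

// Build the trace a master would produce for one byte: settle the data line
// while the clock is low, then raise the clock. Only used to exercise the decoder.
std::vector<LineSample> encodeByte(uint8_t value) {
    std::vector<LineSample> trace;
    for (int bit = 7; bit >= 0; --bit) {
        bool level = ((value >> bit) & 1) != 0;
        trace.push_back({false, level});  // clock low, data settled
        trace.push_back({true, level});   // clock high: slave samples here
    }
    trace.push_back({false, false});      // return both lines low
    return trace;
}
```

Feeding `encodeByte(0xA5)` into `decodeOnRisingEdge` should hand back a single byte, 0xA5. Notice that the decoder only cares about the order of events on the two wires, not how long anything takes, which is the whole appeal of a clocked protocol on a crackly line.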

This is all well and good, in a perfect environment. But what if we add some extraneous noise to the line? What if our clock line was particularly noisy?

And even if the clock line were perfectly "clean", a noisy data line would result in the wrong bit values being transmitted.

So we need to be sure that our SCK and SDA lines are free from noise. The easiest way to achieve this is a simple capacitor from the pin to ground. The capacitor will "filter out" any noise (on either line) so where there is "crackle" (a very rapid low-to-high or high-to-low rise/fall in the signal) the capacitor will "fill in the gaps" and remove the noise.

The downside to this simplistic approach is that we also lose the sharp falling edges of our clock and data lines. Instead of abruptly falling to zero, the clock line, for example, will decay to zero over time:

How a clock line might look, with a smoothing capacitor to ground on the SCK pin, while the SDA pin has no smoothing applied.

While the decay has been exaggerated for illustrative purposes, it should already be clear that when we introduce capacitors on the clock (and data) lines, we need to increase the delay between clock pulses, to allow enough time for the smoothing capacitor(s) to fully discharge. Just as the clock line decays to zero, if we add smoothing capacitors to the data line also, we can expect those to decay to zero too. This means that we also need to make sure that the delay between clock pulses is long enough to allow both capacitors to discharge fully, to ensure accurate values are read back.

As a result, our data transmission rate is going to be much slower than it might be with plain, direct wires for the SDA and SCK lines.

How much decay (and therefore how wide the gap between pulses needs to be) is determined by the size of the capacitor. A large capacitor would mean a very long decay time (and very long pulse widths would be required). A very small capacitor would mean very short decay times, so we can use shorter pulse widths.
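For a rough feel of the numbers, the decay can be estimated with the standard RC discharge formula, t = R x C x ln(Vstart / Vthreshold). This is only a sketch under a big assumption: that the capacitor discharges through a single known resistance R (a pull-down or series resistor, say). Nothing above specifies that resistance, so the values used with these hypothetical helper functions are purely illustrative.

```cpp
#include <cmath>

// Time constant (in seconds) of a simple RC filter.
// Assumption: one resistance dominates the discharge path.
double timeConstant(double ohms, double farads) {
    return ohms * farads;
}

// Time (in seconds) for the line to decay from vStart down to vThreshold:
//   t = R * C * ln(vStart / vThreshold)
double decayTime(double ohms, double farads, double vStart, double vThreshold) {
    return ohms * farads * std::log(vStart / vThreshold);
}
```

As an example, an assumed 10k resistance with a 100nF capacitor gives a time constant of 1ms, and a 5V line would take about 1.2ms to fall below an assumed logic-low threshold of 1.5V. Double the capacitor and both figures double, which is why the gap between pulses has to grow with the capacitor size.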

As with most things, when applying electronics to the real world, there's a trade-off to be made. We want our pulse widths to be short, to allow faster communications. But we also want to ensure that the capacitor values are not so small (and the decay times so tiny) that they fail to "fill in the gaps" should there be any crackle or noise in the lines. The best way to find the ideal capacitor values? Hook up some I2C comms lines and try some different values!

The capacitor values need to be very small, so as to smooth but not muddy the signal. But - as with most non-digital/analogue electronics - it's a fine balancing act: too small and they will discharge so quickly that their effect is barely noticeable. While that may be fine on a fairly clean line, it doesn't offer much protection against a really noisy one, which may have lots of large gaps in the signal, all very close together.

By fiddling about, trying different capacitors and different clock times, we settled on 50nF/0.05uF and a clock time of 2ms on, 2ms off.

This means each clock period is 4ms, so we can send 250 bits (roughly 31 bytes) per second. This is much slower than many other data protocols, but comes with the security of knowing that any noise in the line (either through electro-magnetic noise from the motors, or because of the connections between slip-rings) isn't going to disrupt our data signals.
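As a sanity check on the timing, the arithmetic can be sketched like this. It assumes exactly one bit per clock period and no start, stop or addressing overhead (none has been described here), and the function names are just for illustration.

```cpp
// Bits per second for a clock that is high for highMs and low for lowMs,
// assuming one bit is transferred per full clock period.
double bitsPerSecond(double highMs, double lowMs) {
    return 1000.0 / (highMs + lowMs);
}

// Time in milliseconds to clock out a message of byteCount bytes.
double messageTimeMs(int byteCount, double highMs, double lowMs) {
    return byteCount * 8 * (highMs + lowMs);
}
```

With 2ms on and 2ms off, a three-byte message (24 bits at 4ms each) takes 96ms to send, so polling 30 modules one after another would take a few seconds per round.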

For good measure, we're also going to implement a simple checksum at the end of each message. This will be the usual XOR-sum as the last byte in the data packet. So if we're sending two bytes of data, we XOR them together and send the result as the third byte. On the receiving end, after receiving two bytes of data, we XOR them together and compare this result to the third byte. If there is a match, we treat the message as valid. If not, we assume the message is corrupt, and ignore it.
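The checksum is simple enough to sketch in a few lines. This is an illustration rather than the actual firmware: the function names are made up, and the packet layout (payload bytes followed by a single checksum byte) is assumed from the description above.

```cpp
#include <cstddef>
#include <cstdint>

// XOR all the bytes together to produce a one-byte checksum.
uint8_t xorChecksum(const uint8_t* data, std::size_t length) {
    uint8_t sum = 0;
    for (std::size_t i = 0; i < length; ++i) {
        sum ^= data[i];
    }
    return sum;
}

// Assumed packet layout: payload bytes followed by one checksum byte.
// Returns true when the last byte matches the XOR of everything before it.
bool packetIsValid(const uint8_t* packet, std::size_t length) {
    if (length < 2) {
        return false;  // need at least one payload byte plus the checksum
    }
    return xorChecksum(packet, length - 1) == packet[length - 1];
}
```

So for data bytes 0x12 and 0x34 the sender would append 0x26, and the receiver would accept {0x12, 0x34, 0x26} but reject {0x12, 0x35, 0x26}. Bear in mind that an XOR checksum catches any single flipped bit, but it can be fooled if the same bit position gets flipped in two different bytes.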

How we handle "corrupt" or ignored messages is for another day - at least we've tried to ensure that our signal lines are as "clean" as they can be, given that we already know we're working in a very "noisy" environment.