Sunday, 7 October 2012

Alarm clock project - playing WAV files with a PIC microcontroller

An interesting project has come up as a fund-raiser for BuildBrighton members - to "hack" some alarm clocks and modify their behaviour so that instead of playing a single sample when the alarm sounds, one of up to seven different sounds can be played.


The clock hardware looks simple enough - three AA batteries are used; one single battery keeps the mechanical clock mechanism moving, and the other two are wired in series and connected to a sound playing device whenever either the alarm time is reached, or the user presses a button on the clock casing:



We're looking to reuse as much of the hardware as possible and this looks perfectly possible - with the exception, perhaps, of the simple PCB with black epoxy blob hiding what's going on! We're going to have to guess at how the clock alarm works, but it's not unreasonable to suspect that there's a microcontroller on-board, powered by the 2xAA batteries, which is waiting for a low signal on an input pin. When this input signal is received, the controller wakes from sleep, plays a sound then goes back into sleep/suspend mode.
The simple on/off slide switch breaks continuity between the low signal from the clock mechanism when in the off position (so you can turn the alarm off at weekends, for example) and the push button simply connects in parallel to the alarm mechanism - so the user can activate the sound just as if the alarm time had been reached on the clock.

With these assumptions in place, we're looking to replace the sound playing device with one of our own. Because a number of these clocks will be required, cost is an important consideration - as is the complexity of the circuit/software, since this will be a group project.

With all this in mind, we're proposing a PIC based solution (of course) using a DAC (digital-to-analogue converter) with the different sound samples stored on a regular SD card (read back via SPI).
There are loads of mcu-based sound players on the 'net, but many require expensive "wave shields" or third-party code libraries; we're after a solution that we can fully control ourselves, is not subject to restrictive licensing, is easy to understand and cheap to implement. That means building something from scratch...

To kick off with, we need to understand how we're going to play a sound.
Sound is generated when a speaker cone moves backwards and forwards by varying amounts. Sounds can be represented by a "sound wave" and it is these "waves" that we're looking to capture and digitize so we can play them back on demand.

To digitize a sound, we need to approximate the sound wave as a number of numerical values:


There are two important values when converting sound to digital values - bit depth (often loosely called "bitrate") and sampling frequency. These are terms that get thrown about quite a lot, but few people actually understand what they mean! In our example, we need to divide our sound wave up into a number of (vertical) slices.
This is the sampling frequency - how often we approximate the shape of the graph every second.
Common sampling rates are 44.1kHz (CD quality), 22.05kHz (tape quality), 11.025kHz (radio quality) and 8kHz (telephone quality). (A brief description of sampling can be found here - http://en.wikipedia.org/wiki/Sampling_rate)

Now we've divided our sound wave/graph into vertical slices, we need to assign a value to each "slice". The precision of these values is called the "bit depth" (often loosely called the "bitrate") - i.e. the number of bits used to represent each slice of the graph.

For our sound playback, we're going to use 8-bit samples at 22.05kHz (half the sampling rate of CD-quality sound). This should give a reasonable playback quality, given that we're only using very cheap hardware and a transistor and cheap speaker to actually play the sound(s). We decided on 8-bit sound samples to keep things nice and simple in the microcontroller (the PIC microcontroller ranges we like to use - 16F and 18F - are 8-bit microcontrollers).
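To put some numbers on those choices, here's a rough sketch of the arithmetic. The function names are our own inventions for illustration, and we're assuming the usual convention for 8-bit audio where samples are unsigned and 128 represents silence (the middle of the waveform).

```c
#include <stdint.h>

/* Bytes of raw audio needed for one second of playback. */
static uint32_t bytes_per_second(uint32_t sample_rate_hz,
                                 uint32_t bits_per_sample,
                                 uint32_t channels)
{
    return sample_rate_hz * (bits_per_sample / 8u) * channels;
}

/* Turn an amplitude in the range -1.0..+1.0 into an unsigned 8-bit
 * sample, assuming 128 is the mid-point (silence). */
static uint8_t quantize_u8(double amplitude)
{
    if (amplitude > 1.0)  amplitude = 1.0;   /* clip anything too loud */
    if (amplitude < -1.0) amplitude = -1.0;
    double scaled = amplitude * 127.0 + 128.0;  /* now 1.0..255.0 */
    return (uint8_t)(scaled + 0.5);             /* round to nearest */
}
```

So 8-bit mono at 22050 samples per second works out at 22050 bytes for every second of sound, whereas 16-bit stereo at 44100 would need 176400 bytes per second - eight times as much data to pull off the card.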

The sounds will be converted into 8-bit, 22.05kHz format - using some software like Audacity (http://audacity.sourceforge.net/) - and the raw sound data downloaded onto an SD card. The card will be connected to our PIC so that when we want to play back a sound, we give the SD card a starting address and simply stream the data back off the card in a single SPI stream, one byte at a time (most SD cards support SPI as a legacy/fallback format).
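As a rough idea of what "giving the SD card a starting point" could look like, here's a sketch that builds the 6-byte command frame used in SD SPI mode. We're assuming we'd use CMD18 (read multiple blocks) to start the stream; the helper names are hypothetical, and whether the argument is a byte address or a block number depends on the card type (SD vs SDHC), so treat this as a starting point rather than working driver code.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC7 (polynomial x^7 + x^3 + 1) as used on SD command frames. */
static uint8_t sd_crc7(const uint8_t *data, size_t len)
{
    uint8_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t d = data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc <<= 1;
            if ((d ^ crc) & 0x80u) {
                crc ^= 0x09u;
            }
            d <<= 1;
        }
    }
    return crc & 0x7Fu;
}

/* Fill frame[] with: start bits + command index, 32-bit argument
 * (most significant byte first), then CRC7 and the stop bit. */
static void sd_build_command(uint8_t index, uint32_t argument,
                             uint8_t frame[6])
{
    frame[0] = (uint8_t)(0x40u | (index & 0x3Fu));
    frame[1] = (uint8_t)(argument >> 24);
    frame[2] = (uint8_t)(argument >> 16);
    frame[3] = (uint8_t)(argument >> 8);
    frame[4] = (uint8_t)(argument);
    frame[5] = (uint8_t)((sd_crc7(frame, 5) << 1) | 0x01u);
}
```

Calling sd_build_command(18, start, frame) gives the six bytes to clock out over SPI; the card would still need to be initialised first, which is a whole other story.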

If we're playing an 8-bit, 22.05kHz sound file, this means that 22050 times per second, we need to read a single byte value from the SD card, convert this value into a variable voltage, and send it to the speaker (e.g. where the byte from the sound file is half value - 128 or 0x80 - this represents half the waveform amplitude so we want to set the output voltage to half of maximum). This all sounds fairly straightforward so far. Where it gets complicated is getting this digital value out as an (analogue) voltage:

Audio out using PWM:

A microcontroller pin can take one of two states - high or low. This equates to full voltage and no voltage. There's no easy way of providing "roughly half voltage". One way to fake it is to use "pulse width modulation". This basically means flicking an output pin on and off really quickly so that the average output voltage can be increased and reduced over a specific time period.


If we connect our speaker to an output pin, add a smoothing capacitor then flick the output pin on and off really quickly, we should hear a sound. By changing the ratio between on and off, we can change the overall voltage level coming out of the pin (helped by the smoothing capacitor) and therefore change the sound played by the speaker.
PWM is a cheap way of creating a variable output voltage from a single output pin.
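Here's a minimal sketch of the idea, written so it can be run on a PC rather than a PIC. We're assuming a 256-step PWM period (to match our 8-bit samples) and an ideal smoothing capacitor that gives a perfect average; both function names are made up for illustration.

```c
#include <stdint.h>

/* Pin state for one tick of a 256-tick PWM period: the pin is high
 * for the first `duty` ticks and low for the rest. */
static int pwm_pin_state(uint8_t duty, uint8_t tick)
{
    return tick < duty;
}

/* Average voltage (in millivolts) over one full PWM period, i.e.
 * what an ideal smoothing capacitor would settle at. */
static uint32_t pwm_average_millivolts(uint8_t duty, uint32_t supply_mv)
{
    uint32_t high_ticks = 0;
    for (uint32_t tick = 0; tick < 256u; tick++) {
        if (pwm_pin_state(duty, (uint8_t)tick)) {
            high_ticks++;
        }
    }
    return supply_mv * high_ticks / 256u;
}
```

With a 5000mV supply, a duty value of 128 averages out at 2500mV and a duty of 64 at 1250mV. On real hardware the result would be rather less tidy, with some ripple left over from the switching.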


Audio out using a DAC (digital-to-analogue converter)

Another option would be to send our waveform value to a dedicated DAC chip. Simply put, you send your 8-bit value to a DAC and a variable voltage is presented on the DAC_Out pin. So if you send the value 128 (0x80 in hex), which is half the max value, you get 2.5V (or whatever half the supply voltage is) on the DAC_Out pin. Similarly, the value 64 (0x40) gives a quarter of maximum (1.25V) on DAC_Out and the value 192 (0xC0) three-quarters of maximum (3.75V).

Most DAC chips consist of a simple resistor ladder which causes the output voltage to change, depending on which pins in the value-to-convert are high or low:
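In an idealised model of such a ladder, each bit contributes half as much voltage as the bit above it. The sketch below adds up those contributions; it's a simplification we've written for illustration (the function name is ours, and real DACs have offset and linearity errors that this ignores).

```c
#include <stdint.h>

/* Ideal binary-weighted DAC output in microvolts: bit 7 adds Vref/2,
 * bit 6 adds Vref/4, and so on down to bit 0 adding Vref/256. */
static uint32_t dac_output_microvolts(uint8_t code, uint32_t vref_uv)
{
    uint32_t out = 0;
    for (int bit = 7; bit >= 0; bit--) {
        if (code & (1u << bit)) {
            out += vref_uv >> (8 - bit);
        }
    }
    return out;
}
```

With a 5V reference (5000000 microvolts), 0x80 gives 2.5V, 0x40 gives 1.25V and 0xC0 gives 3.75V - the same figures as in the examples above.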


A lot of DAC chips support different interface methods - some are parallel (you connect your mcu outputs to each individual bit of the value to be represented) but most support some form of serial interface - either SPI or I2C. Although we can generate variable voltages through our PIC output pin using PWM, offloading our digital-to-analogue conversion to an external chip does provide some important benefits, even if it means introducing extra hardware.

The most important benefit is that we're saving clock cycles (as well as reducing code size).
Generating PWM means our PIC is doing a lot of work, looking after when the output pin needs to go high, when it needs to go low, and how often this should repeat in order to get an average voltage which matches the value in our waveform "slice". As we increase the sampling frequency (the number of vertical slices per second), the PIC has more work to do keeping track of all this PWM data.

Let's say we want to convert the value 128 (0x80) to a variable voltage, and we're reading sound samples recorded at 22.05kHz. This means that we're changing the PWM duty cycle 22050 times every second. But we can't just send a single pulse high and a single pulse low for equal time to generate our PWM value of 128 (50% on, 50% off) once every 1/22050th of a second. We need to send this value loads of times within our 1/22050th second window. All this means that our PIC is doing a lot of work just generating PWM signals - we still have to leave enough time to read data from our SD card in-between generating on/off output signals. Even with something as big as a 20MHz crystal, the PIC would struggle to play anything above 32kHz samples.
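To get a feel for how tight the timing is, here's a quick back-of-envelope calculation. It relies on the fact that PIC16/18 parts execute one instruction every four oscillator clocks; the function name is our own, and we're ignoring the two-cycle branch instructions and any interrupt overhead.

```c
#include <stdint.h>

/* Instruction cycles available per audio sample, given that the PIC
 * runs one instruction for every four oscillator clocks (Fosc/4). */
static uint32_t instruction_cycles_per_sample(uint32_t fosc_hz,
                                              uint32_t sample_rate_hz)
{
    return (fosc_hz / 4u) / sample_rate_hz;
}
```

At 20MHz and 22050 samples per second that's only about 226 instruction cycles per sample. A bit-banged PWM period with 256 steps doesn't even fit into that once, never mind "loads of times", and the SD card still has to be read in whatever time is left over.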

By offloading the analogue voltage to a DAC, the PIC is freed up to get on with other things.
The playing of a wav file is reduced to "read a byte, send it to the DAC" and repeating this once every 1/22050th of a second (in fact, we'll probably read a few bytes into a buffer but the theory is the same). Because of this, we're not only simplifying the code required, we can even run it on a low-end PIC (something like a 16F628A) from its internal RC oscillator and do away with the need for a high-speed crystal. This in turn reduces current consumption and extends battery life since the PIC is running at a lower speed. It may even be possible to play higher-quality 44.1kHz samples with such a simple code routine.
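The "few bytes into a buffer" idea might look something like the small ring buffer below. This is only a sketch under our own assumptions: the names and the 32-byte size are made up, the main loop would call fifo_put() with bytes read from the SD card, and a timer interrupt firing 22050 times a second would call fifo_get() and pass the result to the DAC.

```c
#include <stdint.h>

#define FIFO_SIZE 32u   /* assumed size; must be a power of two */

typedef struct {
    uint8_t data[FIFO_SIZE];
    uint8_t head;       /* next slot to write (main loop) */
    uint8_t tail;       /* next slot to read (timer interrupt) */
} sample_fifo;          /* on a real PIC, head/tail would be volatile */

/* Add one sample; returns 0 if the buffer is full. */
static int fifo_put(sample_fifo *f, uint8_t sample)
{
    uint8_t next = (uint8_t)((f->head + 1u) & (FIFO_SIZE - 1u));
    if (next == f->tail) {
        return 0;       /* full: one slot is always left empty */
    }
    f->data[f->head] = sample;
    f->head = next;
    return 1;
}

/* Remove the oldest sample; returns 0 if the buffer is empty. */
static int fifo_get(sample_fifo *f, uint8_t *sample)
{
    if (f->head == f->tail) {
        return 0;       /* empty: nothing to play */
    }
    *sample = f->data[f->tail];
    f->tail = (uint8_t)((f->tail + 1u) & (FIFO_SIZE - 1u));
    return 1;
}
```

The buffer gives the SD card reads a bit of slack: if one read takes slightly longer than a sample period, there are still bytes queued up for the DAC.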

For this little project we'll be using:

Microchip's MCP4902 8-bit SPI DAC
PIC 16LF628A (low voltage)
Cheap SD or SDHC card
Audacity sound editing software
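For the curious, here's how we expect a sample to be packed into the 16-bit word the MCP4902 takes over SPI. This is our reading of the datasheet and hasn't been tried on hardware yet, so check the bit positions against the datasheet before relying on it; the function name is our own.

```c
#include <stdint.h>

/* Build a 16-bit MCP4902 write command (our reading of the datasheet):
 *   bit 15     channel select (0 = DAC A, 1 = DAC B)
 *   bit 14     Vref input buffer (0 = unbuffered)
 *   bit 13     gain select (1 = 1x, 0 = 2x)
 *   bit 12     shutdown control (1 = output active)
 *   bits 11-4  the 8-bit sample
 *   bits 3-0   ignored on the 8-bit part */
static uint16_t mcp4902_command(uint8_t channel_b, uint8_t sample)
{
    uint16_t word = 0x3000u;                 /* 1x gain, output active */
    if (channel_b) {
        word |= 0x8000u;                     /* select DAC B */
    }
    word |= (uint16_t)((uint16_t)sample << 4);
    return word;
}
```

Sending the half-scale sample 0x80 to channel A would then mean clocking out 0x3800, high byte first, with chip select held low for the whole 16 bits.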

The software's downloaded and installed, the wav files edited and everything is on order. All we're waiting for now is the postie to bring everything and we can get cracking!

EDIT - November 2012: We got this working using a PIC16F1825 and have written up how it works in a series of blog posts, starting here