Friday, 17 October 2014

Getting an ESP8266 wifi module to work with Arduino

Last night was another BuildBrighton nerd-meet-up and, luckily, we had a couple of these newfangled ESP8266 wifi modules to try out. Unless you've been living in a cave with a tin can tied to the end of a piece of string as an internet connection, you'll probably know that these are the ultra-cheap wifi modules currently flooding out of Chinese factories and onto "hacker benches" all over the world.

The reason they've created such a stir is that a) they're cheap and b) the firmware can be (relatively) easily upgraded. So hackers and makers all over the world have been busy trying to access the onboard microcontroller and utilise the extra GPIO pins, to create a single-chip, all-in-one wifi controller.

Our interests are less adventurous - the modules are sold as UART-to-wifi plugin devices, and that's exactly how we're going to use them.

As kindly pointed out by keen-eyed readers, in the image above, Vcc is the pin at the bottom left - and should not exceed 3.3V - the pin top right (labelled Vcc 3.3V) is actually the ground pin. D'oh!

Supposedly you need just four wires to get your microcontroller-based project talking to a home network over wifi. In theory, it's dead simple to bung this into a project and talk to the world. In practice, we found a few gotchas which took up most of the evening to resolve - but we got there in the end.

The first thing to do is to power up the device and get some AT commands flowing.
This is where the first gotcha is waiting, for the unwary. It's documented all over the 'net that these little things can get quite power hungry - they can demand upwards of 280mA when sending/receiving data. After 30 minutes or so of operation, they also run slightly warm to the touch. Nothing that could cause skin burns, but noticeably warm.

There's always the old debate about powering devices from a USB port. Some people claim to run 500mA devices straight off a USB port. Some of us have experience of Windows ejecting USB devices that demand more than their allocated 100mA, without specifically requesting it. Who is right depends pretty much on how your USB ports are configured. If you try powering these modules off a USB-to-serial converter, they can still work, but they also reset sporadically and without warning.

  • So gotcha number 1: use a dedicated power supply (3.3V)

We did this, connected up the UART lines and ran a terminal emulator at 57,600 bps as per the datasheet. There was nothing on the screen. The red light was on, on the module, but no data was coming from it. Touching the pins on the top of the board caused the onboard blue LED to flicker, and a few characters of junk appeared on the screen - nothing useable, but something!

Some sources on the internet say you can leave the unused four pins floating. Some people say they need to be pulled high (through anything between 1k and 10k resistors). Some sites say you can simply connect these to Vcc. We tried all of these and still got no response.

To actually get the device to respond, we found we had to pull the RESET and CH_PD lines high but leave the GPIO lines low/floating. Pulling all four pins high stopped the module from sending data.

  • Gotcha number 2: pull the RESET + CH_PD lines high, but not the GPIO lines.

We toggled the reset line at this point and suddenly the screen was full of gunk! A lot of sources say this is normal, and after a whole load of junk, you can expect to see the word "ready". In practice, we found that the data was garbled with no "ready" signal. Changing the baud rate to 115,200 bps fixed this.

  • Gotcha number 3: some modules use 57,600 bps, some use 115,200 bps. Try both.
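Trying both rates by hand gets tedious, so here's a minimal sketch of the idea in Python. It's only an illustration: `read_boot_output` is a hypothetical helper (not something from our setup) that should open the serial port at the given rate, toggle reset and hand back whatever bytes arrive. Treating the word "ready" as the sign of a good rate is based purely on what we saw on our modules.

```python
from typing import Callable, Iterable, Optional

# The two rates we came across; other firmware builds may use something else.
CANDIDATE_BAUDS = (57600, 115200)


def looks_sane(raw: bytes) -> bool:
    """True if the boot output contains the "ready" banner."""
    return b"ready" in raw


def find_baud(read_boot_output: Callable[[int], bytes],
              candidates: Iterable[int] = CANDIDATE_BAUDS) -> Optional[int]:
    """Return the first baud rate whose boot output looks sane, else None.

    read_boot_output(baud) is a hypothetical helper: it should open the port
    at that rate, reset the module and return the bytes it received.
    """
    for baud in candidates:
        if looks_sane(read_boot_output(baud)):
            return baud
    return None
```

Passing the port access in as a function keeps the sketch independent of any particular serial library.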

Eventually we started seeing something meaningful in the terminal window.

Now we did spend quite some time banging AT commands in and getting any number of peculiar responses, ranging from an empty string, to "Error", the bizarre "link is not", to the rather more cryptic "no this fun".

Once you have a module responding to AT commands, the fun begins. It took a lot of Googling around, and reading lots of blog posts from people who tried, gave up, tried, fried-the-board-and-bought-another-to-find-it-worked-differently and a fair bit of guesswork to get our modules to work the way we wanted them to, but eventually we did manage to establish a connection between our wifi module and a PC, over a home network. This may not be your preferred setup, but this is what we were aiming for, and how we got there:

We wanted to connect the wifi module onto our home network, and have it report back the IP address it had been assigned. We would then create an app (to run on a smartphone or tablet) into which we could enter this IP address, and send and receive commands over the home network, between the wifi module and the tablet.

This means setting up the wifi module as a "station" (rather than an "access point") and also as a "server" - i.e. to accept incoming connections. As you can imagine, there can be any amount of confusion when some people refer to a "station" as a "client", but then set up the network connection mode as "server". Server!=client. So we're sticking with the terminology from the Chinglish datasheet.

The first thing to do is set the device as a station:


1=station, 2=AP, 3=station + AP (some kind of weird hybrid mode).
Now, query the local access points with:


And the response (sometimes) looks something like this:

Occasionally this fails. Sometimes the module locks up entirely. There are a number of guesses at why this might be - some people favour the theory that hidden APs screw things up, some that too many APs basically fill the buffer(s) and cause it to lock up, and some just that the firmware is crapping out for some unrelated reason. We've had mixed success using the CWLAP command - sometimes we do get a list of access points. And sometimes it locks up so that it needs to be power-cycled to become responsive again.
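When the list does come back, picking it apart in software is straightforward. The sketch below assumes each access point arrives on its own line shaped like `+CWLAP:(<encryption>,"<ssid>",<signal>...)`. That layout is an assumption on our part, since the exact fields seem to vary between firmware versions, so anything after the signal strength is simply ignored.

```python
import re
from typing import List, Tuple

# Assumed line shape: +CWLAP:(3,"my network",-60) possibly with extra fields.
_AP_LINE = re.compile(r'\+CWLAP:\((\d+),"([^"]*)",(-?\d+)')


def parse_cwlap(response: str) -> List[Tuple[int, str, int]]:
    """Return (encryption, ssid, signal) for every access point line found.

    An empty list means nothing recognisable came back, which includes
    the case where the module answered with an error or locked up.
    """
    return [(int(ecn), ssid, int(rssi))
            for ecn, ssid, rssi in _AP_LINE.findall(response)]
```

Since the module sometimes never answers at all, whatever reads the serial port for this should use a timeout rather than waiting forever.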

Once you have the SSID of the access point you're looking for, connect using


And wait for the OK response.
Interestingly, you get OK even if you put an invalid SSID and/or password. OK doesn't mean "connected to the access point ok". It just means "OK, I heard you". The way to test if you're actually connected to the access point is to give it a few seconds, for the negotiations to complete, then query the IP address, using:


The response will either be ERROR (no ip address assigned) or a string with the ip address in it. Confusingly, sometimes the module requires =? to query a property, sometimes just ? (no equals sign) and sometimes neither/none. So far, we've avoided using the ? to query anything, but that's coming....
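Because OK doesn't tell you whether the join worked, any code driving the module has to poll for an ip address afterwards. Here's a rough sketch of that in Python. The command names (`AT+CWMODE=1`, `AT+CWJAP`, `AT+CIFSR`) are taken from the stock AT command set as we understand it, so treat them as assumptions if your firmware differs. `send` is a hypothetical helper that writes one command and returns the module's reply as text. Treating `0.0.0.0` as "not connected" is also a guess.

```python
import re
import time
from typing import Callable, Optional

STATION_MODE = "AT+CWMODE=1\r\n"  # assumed: 1=station, 2=AP, 3=both
_IP = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3})\b")


def join_command(ssid: str, password: str) -> str:
    """Build the join command. No escaping is done for quotes in the SSID."""
    return 'AT+CWJAP="{}","{}"\r\n'.format(ssid, password)


def extract_ip(response: str) -> Optional[str]:
    """Pull a dotted-quad address out of a reply, or None if there isn't one."""
    if "ERROR" in response:
        return None
    match = _IP.search(response)
    return match.group(1) if match else None


def wait_for_ip(send: Callable[[str], str],
                attempts: int = 5, delay: float = 1.0) -> Optional[str]:
    """Poll AT+CIFSR until an address shows up or we run out of attempts."""
    for _ in range(attempts):
        ip = extract_ip(send("AT+CIFSR\r\n"))
        if ip and ip != "0.0.0.0":
            return ip
        time.sleep(delay)  # give the negotiations a moment to complete
    return None
```

A typical run would send `STATION_MODE`, then `join_command(...)`, then call `wait_for_ip(send)` and only carry on if it returns an address.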

If you've got an ip address, your device is connected on the network and you should be able to see it in the list of connected devices on your router/access point admin page.

Interestingly, the wifi module keeps its connection data in local eeprom, so that it can silently reconnect to a network, if the connection drops. This is very useful (since we don't have to worry about storing SSID and password data in our microcontroller eeprom and attempting to reconnect on power-up: just give the wifi module power and it'll try to connect to the previous access point, if it's still available) but can also be a problem if the device is to be used for anything important.

The wifi module can be subject to the Google Chromecast attack and we're still not sure whether this is desirable behaviour. Here's how the Chromecast attack works:

When the wifi module powers on, it looks for the previous access point (SSID) and connects using the stored password, if possible. If the original access point is not available, but a spoof or clone access point exists, with the same SSID and accepts incoming connections (is open, or uses the same password) the wifi module will happily connect to the spoof access point, without reporting any error.

We proved that this was possible with the wifi modules too, by each setting up a wifi hotspot/access point on a Samsung Galaxy S3, using the SSID "test", and activating them in turn. In between activating the access points, we power-cycled the wifi module and it firstly connected to Chris' phone, then Steve's, then Jake's phone - each time completely silently, and without reporting that the access point had changed: it basically found an access point with the name "test" and connected using the same credentials as last time.

This does add some resilience to the wifi module - if the connection to the AP drops, the wifi module will silently reconnect, and resume the same ip address as before, and you'd never be any the wiser. But it does mean that your wifi-based project is subject to spoof attacks (should someone be able to reset your home router, and set up their own device as an access point with the same name!)

By this point, we'd managed to get our wifi module onto a home network (it was actually one of the many APs at the hackspace, but the principle is the same!) and to see it appear in the list of connected devices. Now we wanted to get some data exchange going:

AT+CIPSERVER supposedly sets up the device as a server, ready to accept incoming connections, but every time we tried this command, we just got "Error" as a response. A lot of Googling and we still had no answer - except some people simply said that this command doesn't work, and others saying that the firmware needed to be updated.

We found the solution to be a little simpler.
Before the module can accept (multiple) incoming connections, it needs to be put into "accept multiple connections" mode, using the command:


Now, the command AT+CIPSERVER=1,4040 gave the exciting response OK.
It looked very much - just as we approached the midnight-deadline-for-getting-things-done - like we had a wifi-connected server, awaiting connections on port 4040.
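The order matters here, so it's worth capturing in code. The following is a sketch only: it assumes the multiple-connections command is `AT+CIPMUX=1`, as in the stock AT command set, and `send` is a hypothetical helper that writes one command and returns the reply as text.

```python
from typing import Callable, List


def server_setup_commands(port: int) -> List[str]:
    """Commands to start a server, in the order the module wants them."""
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: {}".format(port))
    return [
        "AT+CIPMUX=1\r\n",                    # assumed: allow multiple connections
        "AT+CIPSERVER=1,{}\r\n".format(port),  # then start listening
    ]


def start_server(send: Callable[[str], str], port: int) -> bool:
    """Send each setup command and stop at the first one that isn't OK'd."""
    for command in server_setup_commands(port):
        if "OK" not in send(command):
            return False
    return True
```

With this, `start_server(send, 4040)` mirrors what we typed by hand and returns False if either step comes back with "Error".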

After a few false starts with putty (starting with trying to connect over serial in SSH mode, instead of "raw" mode) Other Chris suggested just entering the ip address of the wifi module into a browser address bar. Not expecting anything to happen, and given the late hour, just about ready to give up, we tried it, in desperation: 4040

And, amazingly, the terminal window showed the response:

We had made first contact!
The +IPD message shows us that the response contained 355 bytes. More importantly, it first shows us the "channel number" (or client ID) of the incoming connection. This is important, and this is how we can send the correct data to the appropriate connected client. Following the id number and the length of the message is the main body of the message received.
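To act on these messages in code, you need the channel number and the body separated out. Here's a small sketch, assuming the header looks like `+IPD,<channel>,<length>:` followed directly by the data, which matches the order described above. The exact punctuation is our assumption.

```python
import re
from typing import Optional, Tuple

# Assumed header shape: +IPD,<channel>,<length>:<data>
_IPD = re.compile(rb"\+IPD,(\d+),(\d+):")


def parse_ipd(raw: bytes) -> Optional[Tuple[int, bytes]]:
    """Return (channel, body) for the first +IPD message in raw.

    Returns None if there is no +IPD header, or if fewer bytes have
    arrived so far than the header promised.
    """
    match = _IPD.search(raw)
    if match is None:
        return None
    channel = int(match.group(1))
    length = int(match.group(2))
    body = raw[match.end():match.end() + length]
    if len(body) < length:
        return None  # message not complete yet
    return channel, body
```

Using the stated length, rather than reading up to the next line ending, matters because the body can itself contain line breaks (an HTTP request from a browser certainly does).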

To send a response, we used the AT command:


which says "send 5 bytes to channel zero".
After committing the command (i.e. hitting the cr+lf combination) the prompt changed to a > symbol to indicate that we were now entering data, not commands.
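The same two-step exchange can be sketched in Python. This assumes the send command has the form `AT+CIPSEND=<channel>,<length>`, which fits "send 5 bytes to channel zero". `write` and `wait_for` are hypothetical helpers: the first pushes bytes to the serial port, the second returns True once the given bytes have been seen coming back.

```python
from typing import Callable


def cipsend_command(channel: int, payload: bytes) -> bytes:
    """Build the line announcing how many bytes are about to be sent."""
    return "AT+CIPSEND={},{}\r\n".format(channel, len(payload)).encode("ascii")


def send_to_client(write: Callable[[bytes], None],
                   wait_for: Callable[[bytes], bool],
                   channel: int, payload: bytes) -> bool:
    """Announce the payload, wait for the > prompt, then send the data."""
    write(cipsend_command(channel, payload))
    if not wait_for(b">"):
        return False  # no prompt, so the module isn't expecting data
    write(payload)
    return True
```

Working the length out from the payload avoids the easy mistake of announcing one byte count and then typing another.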

After entering "Hello" and hitting enter....... nothing happened.
The "loading" icon was still spinning on the web page we'd opened, to connect to the wifi module. It took about 90 seconds to time out. But when it did, the web page had changed:

It looked like we actually had some kind of two-way communication!
So we fired up Putty, only this time, taking care to use the "raw" rather than SSH mode...

... and started typing. This time, we had data!

So we sent some data back

And Putty duly showed the data immediately.

Data appeared on each end immediately. There was no noticeable lag - as soon as we hit enter on one terminal window, the data appeared in the other. By which time, after high fives all round, it was time to go home.

More investigations will continue over the weekend...