Make Your Own Z-Wave Device

Have you always wanted your very own Z-Wave widget-thing-a-ma-bob-doohickey? Silicon Labs recently released the Thunderboard Z-Wave (TBZ) which is an ideal platform for building your own Z-Wave device. Officially known as the ZGM230-DK2603A, the TBZ has sensors galore, expansion headers to connect even more stuff, comes with a built-in debugger via USB-C and can be powered with a single coin cell. Totally cool! I am working on a github repo for the TBZ but right now there are three simple sample apps in Simplicity Studio to get started.

ThunderBoard Z-Wave

Thunderboard Z-Wave


  1. ZGM230 Z-Wave Long Range Module – +14dBm radio – 1mi LOS RF range
    1. ARM Cortex-M33, 512/64K FLASH/RAM, UARTs, I2C, SPI, Timers, DAC/ADC and more
  2. Built-in Segger J-Link debugger
  3. USB-C connectivity for SerialAPI and/or debugging
  4. RGB LED, 2 yellow LEDs, 2 pushbuttons
  5. Temperature/Humidity sensor
  6. Hall Effect sensor
  7. Ambient Light sensor
  8. 6-Axis Inertial sensor
  9. Metal sensor
  10. 1Mbyte SPI FLASH
  11. Qwiic I2C connector
  12. Break-out holes
  13. SMA connector for antenna
  14. Coin cell, USB or external power
  15. Firmware development support via Simplicity Studio

Sample Applications

There are three sample applications in Simplicity Studio at the time of this writing (Aug 2022 – SDK 7.18.1);

  1. SerialAPI,
  2. SwitchOnOff
  3. SensorMultilevel

The TBZ ships with the SerialAPI pre-programmed into it so you can use it as a Z-Wave controller right out of the box. Connect the TBZ to a Raspberry Pi or other computer to build a Z-Wave network. Use the Unify SDK to get a host controller up and running quickly or use the PC-Controller tool within Simplicity Studio for development and testing. The SwitchOnOff sample app as the name implies simply turns an LED on/off on the board via Z-Wave. This is the best application to get started as the ZGM230 chip is always awake and is easy to debug and try out. The SensorMultilevel sounds like a great app as it returns a temperature and humidity but at the moment it does not use the sensor on the TBZ and simply always returns a fixed value. SensorMultilevel shows how to develop a coin-cell powered device. Additional sample apps are expected to be available in future SDK releases but I am working on a github repo with a lot of sensor support.

Naturally a single Z-Wave Node doesn’t do much without a network. You’ll need some sort of a hub to connect to. Most of the common hubs (SmartThings, Hubitat, Home Assistant, etc) will at least let you join your widget to the network and do some basic control or status reporting. You need either a pair of TBZs or perhaps purchase the even cheaper UZB7 for the controller side and then the TBZ for the end-device. Then you have a network and can build your doohickey and talk to it over the Z-Wave radio.

Getting Started

Plug in the TBZ to your computer and open Simplicity Studio which will give you a list of applicable documents including the TBZ User Guide. Writing code for the TBZ definitely requires strong C programming skills. This is not a kit for an average Z-Wave user without strong programming skills. There is a steep learning curve to learn how to use the Z-Wave Application Firmware (ZAF) so only experienced programmers should take this on. I would recommend watching the Unboxing the 800 series video on the silabs web site to get started using Simplicity Studio. I hope to make a new video on the TBZ and publish the github repo so stay tuned.

Have you created a Thing-a-ma-bob using the TBZ? Let me know in the comments below!

Z-Wave Watchdog Timer Best Practices

WatchDogVirtually all embedded systems must run 24 x 7 x 365 x many many years without ever being rebooted. Since there is no one there to “press the reset button” if the device fails, the watchdog timer is there to do just that. The 500 series Z-Wave chips from Silicon Labs have a watchdog timer and the example code provides a very minimal use of the watchdog timer. However, the minimal use in the example code is not sufficient to provide a robust watchdog for embedded Z-Wave devices. This post explains some rules and methods to code a robust watchdog timer.

Long time embedded expert Jack Ganssle has a great article on Watchdog timers. He describes the use of a watchdog timer on the Clementine spacecraft where a fault in the system caused the spacecraft to dump virtually of its fuel resulting in the loss of the mission. The lead software engineer had wanted a watchdog but the designers decided not to include it. Jacks example shows how important it is to spend at least some time coding a robust watchdog for our IoT devices. While our devices aren’t controlling multi-million dollar spacecraft, we are coding light switches that are hardwired into the wall and cannot be easily rebooted. Try telling the customer to go into the basement and toggle the power to his entire house to reboot the light switches!

What is a Watchdog?

A watchdog timer is a timer that runs constantly. Typically a complex combination of events resets (or “kicks”) the watchdog timer every now and then, usually every few milliseconds. If the combination of events ever gets stuck, the timer will continue to run. If the watchdog timer “times out”, the system is reset – basically the reset button is pushed! Your embedded system reboots and keeps on running. Generally no one even realizes it has rebooted (I’ll discuss that problem in more detail shortly).

WatchdogTimerThis diagram shows the Watchdog timers value which is constantly counting up. Every time the Watchdog is “kicked”, the counter is reset to zero. Somewhere in your code the ZW_WatchDogKick() routine is called which resets the watchdog timer. Sometimes this reset condition happens on a nice regular basis, sometimes it happens at varying times as shown by the level of the timer. The key is the timeout threshold has to be longer than any normal operating condition. If a fault condition occurs, the timer keeps on counting up until the threshold is reached and then the system is reset. When the watchdog timer fires, the Z-Wave chip goes thru a full reset just as if power had been removed and reapplied. Your embedded system is back up and running as if nothing had happened.

SiLabs Sample Code = Minimal Watchdog

The SiLabs sample code has the following implementation of the watchdog:

BYTE ApplicationInitSW(ZW_NVM_STATUS nvmStatus) {

void ApplicationPoll(void){

The sample code has the good implementation practice of putting the Watchdog code inside #defines so it can be easily enabled/disabled. Unfortunately it blindly kicks the dog every ApplicationPoll without checking any other conditions. ApplicationPoll is called roughly every few hundred microseconds and a lot of fault conditions can exist and ApplicationPoll will still be called. With this implementation the only way the watchdog is going to fire is if there is a catastrophic failure and ApplicationPoll is no longer being called. While this implementation is better than nothing, it won’t reset the system in many cases where the device has become unresponsive. This is where you come in, you have to add more code to the watchdog algorithm. It may be easy to just use what SiLabs provides, but for a robust product you really need to spend some time adding your own conditions to the watchdog algorithm.

A Better Watch Dog Example

Writing good watchdog code requires some significant thought and testing. The possible sources of failure need to be discussed with members of the team and with other Z-Wave developers who are fighting the same fight (thus the need for this blog). I can provide a few guidelines to include in your analysis but this is not a complete solution. Only you know all the possible failure modes of your product and that requires some serious thought and analysis.

Mutex Gets Stuck

The most common failure I have seen is the fact that the SiLabs provided Application Framework (AF) mutex can get stuck. When the mutex is stuck, it most often results in the device still able to receive Z-Wave traffic but often can’t respond. If the device is power cycled, then it returns to full operation. So often this failure goes unnoticed both in testing and in actual use.

What is the mutex you ask? The mutex is a simple flag in the AF that prevents the code from overwriting the Send Buffer while a message is currently being sent over the radio. When a GET command comes in, the AF will call a command class handler to handle the GET and build a REPORT frame in memory. When ready to send the frame, the AF will call pTxBuf=GetResponseBuffer() to get a buffer for the radio to send. There is only one buffer so if the buffer is already in use, you get a NULL pointer back and will have to wait and send the frame later.  This in general works fine as long as frames don’t come in too fast. But in a large network with lots of repeated and re-routed frames you will occasionally get a bunch of GETs quickly and it is possible for the REPORTs to get cross wired and end up locking up the mutex for a frame that will never be sent. If the code then doesn’t properly release the buffer, the mutex is stuck. The Application Framework code is known to lock the mutex occasionally so you must code around this problem. The easiest solution to this rare event is to ensure the watchdog is watching the mutex and simply reboot if it gets stuck for too long.

My solution is to have a counter that counts up once per second in ApplicationPoll anytime ActiveJobs() is true (in SDK 6.81.xx its now called ZAF_mutex_isActive()). ActiveJobs is true anytime a buffer is in use and false when all the buffers are free. There are actually two buffers, one for response frames (REPORTs sent as a result of a GET) and a second buffer for request frames (unsolicited notifications).

Application Specific Reasons

Beyond the mutex you must think long and hard about application specific failure conditions. The most obvious is that the device has not received or sent a frame in 25 hours. Most hubs will poll a device at least a couple of times per day to make sure it is still alive. So if there has been no traffic in a day, maybe something is stuck and a reboot is in order. Plus if nothing has happened in a day then probably no one will notice the reboot (which only takes 1.5 seconds). You do have to be careful that some other part of the application isn’t impacted as a result of the reboot. For example, if you are a light switch and by default you turn the light off on a reboot, then people will be really annoyed if the light randomly turns off because your hub hasn’t polled it in day. There are lots of potential checks you can make here but every application will have different requirements so you will have to think hard about all the possible conditions for your specific case.

Sample good watchdog:

E_APPLICATION_STATE ApplicationPoll( E_PROTOCOL_STATE bProtocolState ) {
if (ActiveJobs()) {              // Mutex buffer is busy
    if (OneSecondTimer) ActiveJobsCounter++;  // Once/sec increment
} else {
    ActiveJobsCounter=0;         // When buffer is free clear counter
if ((ActiveJobsCounter<30) &&       // Mutex isn't stuck 
    (LastCommsHours<25) &&          // Got a frame in the last 24 hrs
    ApplicationSpecificReasons) {   // Other reasons
    ZW_WatchDogKick();              // Everything is OK so reset WDOG

In the example code above we do have a major issue in that if the counters stop counting for some reason, the watchdog will never fire! But that’s easy to check for in ApplicationPoll and if ApplicationPoll itself isn’t running then the WatchDog is no longer being kicked so it will reset.

Doesn’t Work If Not Tested

The old coding adage (proven totally true by me many many times) goes “If the code hasn’t been tested, it doesn’t work”. Same thing applies to your Watchdog code. So how do you test the watchdog? The first thing to do is to log the number of times the watchdog has triggered. This has to be stored in NVM since RAM will be lost when you reboot. Fortunately ApplicationInitHW is called with the bWakeupReason parameter which lets you know the watchdog fired when equal to ZW_WAKEUP_WATCHDOG. Note that usually ApplicationInitHW just stores the bWakeupReason and later in ApplicationInitSW we check it as the NVM isn’t available in InitHW.

ApplicationInitSW(...) {
if (wakeupReason==ZW_WAKEUP_WATCHDOG) { // Increment WDOG counter with max 255
    if (i<255) MemoryPutByte((WORD)&EEOFFSET_NumberWatchDogResets_far, i+1);

Use a Configuration Command Class parameter to read or update this value for testing purposes. I also like to put in a small block of code wrapped in #ifdef WATCHDOG_TESTING_ENABLED that upon receiving a BASIC_SET with a value of 0xDE (not a valid value) calls GetResponseBuffer() which locks up the mutex and in 30 seconds the chip should reboot. If not, then you have a bug in the watchdog code! You can test all the branches in your watchdog code with various values of a BASIC_SET.

When to Enable Watchdog

Perhaps a better question is when NOT to enable the watchdog since ALL production builds absolutely must have the watchdog enabled! My recommendation is to disable the watchdog during development. You want the chip to lock up if you have a bug. The watchdog is really good at masking major bugs since things just keep on working. If the device locks up, then you know something is wrong and you need to chase it down. If you power cycle and the device is fine again, IT IS NOT FINE! You have a bug in your code! During production testing I usually turn the watchdog back on but I also have the testing scripts check the watchdog counter and if it increments then the test fails.

Watchdog Best Practices for Z-Wave Developers

  1. Disable Watchdog during development using #defines
  2. Only kick the watchdog when everything is idle
    1. Kicking every ApplicationPoll is INSUFFICIENT
    2. Check the ActiveJobs() being stuck (aka Mutex)
    3. Check other conditions within your product
  3. Check that the RF has received something every X minutes or hours
  4. Have a way to test the Watchdog during development
  5. Store the number of Watchdog resets in NVM and retrieve them via a configuration parameter


Z-Wave Challenges in MDUs and How to Resolve Them

Deploying a robust Z-Wave network in MDUs (like apartment buildings or hotels) can be challenging unless you follow a few basic rules.
The most common problem in MDU deployment is that many installers fail to take advantage of Z-Wave’s number one technical advantage – the mesh network. Every always-on (wall powered) Z-Wave device adds a node to the mesh. But battery powered devices like door locks, sensors and many thermostats do NOT add nodes to the mesh – they merely benefit from other devices on the mesh network. A system where there is one Z-Wave hub and a door lock in each dwelling unit will result in a poorly performing system because there is no mesh! To build a reliable mesh, every device in the network needs at least two routes between the hub and every device on the network. This means you need at least one Z-Wave repeater or lamp module in every network.
An MDU can easily have dozens or even hundreds of units all within Z-Wave range of each other. If each unit has just a single Z-Wave hub and a door lock, then each unit causes interference with every other unit resulting in a cacophony of Z-Wave traffic. A better solution is to have one hub serve 5 or even 10 units with each unit having at least one always-on device within it to provide a good “mesh” node to access the battery powered devices. Always-on devices in adjacent units help provide routing pathways to improve the robustness of the network. The installer needs the proper tools to evaluate the best location for these always-on devices to ensure a high-quality mesh network with plenty of alternate routes.
Another challenge in MDUs is that things are always changing. An owner might install a mirror (which is a metal plate on glass) or a metal appliance that significantly alters the Z-Wave quality within the unit. Even though the mirror or appliance is not in between the hub and the door lock does not mean that it won’t cause connectivity problems. The solution to this issue lies again with the mesh network and having alternate routes. Since things are always changing, the hub needs to have a policy to “heal” the network occasionally to adjust to the changes in the environment.
If some door locks seem to have short battery life then you might be suffering from limitations in older, pre-500 series Z-Wave devices. Early generations of Z-Wave would wake up battery powered devices like door locks using only their NodeID to request which node to wake up. This works fine in single family homes since every node on the network has a different NodeID, but in an MDU with multiple adjacent Z-Wave networks, if the door lock in each unit is NodeID=2, then every hub will wake up every door lock in the building any time a unit needs to check on the battery level of any door lock. The solution is to ensure each adjacent installation has a different NodeID for door locks or battery powered nodes. Thus, apartment 101 will have the door lock as NodeID=02, apartment 102 will have the door lock as NodeID=03, and so on. The latest generation of Z-Wave solves this problem so as these newer locks come on the market this issue will disappear.

A few quick rules for deploying Z-Wave in MDUs:

  1. Always build a Z-Wave mesh
  2. Install fewer hubs
  3. Use tools like IMA to validate mesh networks
  4. Don’t build the same network in every unit
  5. Network must be flexible due to changing environments

Seven Habits of Highly Effective Z-Wave Networks for Consumers

You have a Smart Home using Z-Wave as a wireless technology for all these Internet of Things (IoT) devices to communicate with each other. But maybe things are not working quite as well as you expect. You press a button on your phone and 1… 2… 3… and then finally a light comes on or maybe it doesn’t come on at all! Another common problem is when a battery powered sensor was updating the temperature last week and this week it just doesn’t seem to be sending updates anymore or at best sporadically. As a Z-Wave expert I’ve built and rebuilt hundreds of Z-Wave networks and have come up with a few habits to make Z-Wave networks more reliable.

1. Minimize Polling

This is probably THE number one mistake new users of Z-Wave make. They figure Z-Wave is a high speed network so they can just poll a light switch every 3 seconds and then react to any change in the switch. Z-Wave and most other wireless networks work best when the network is highly available. If the network is busy, every device that needs to send a message has to wait its turn and then compete (and often collide) with all that polling traffic. Collisions slow everything down just like rubber-necking on the highway.

Polling used to be the only way to get around a patent that fortunately expired in February 2016. The patent forced many light switch manufacturers to not send a message when you flipped the switch. Several manufacturers found ways to get around this or they licensed the patent. But now that the patent has expired, you can get light switches that do send a report immediately when their state has changed.

So the primary way to minimize polling is to replace the few devices in your Smart Home that trigger an event  (or SmartApp or Magic or whatever your hub calls it) with one that will instantly send an update. If you have some older switches but they’re not that important to instantly know their state has changed, you can still poll them but no more than once every few minutes. Remember that if you have 60 Z-Wave devices and you poll each one once/min then you are polling once/second and the network is hammered! So only poll a couple of nodes!

2. Have enough devices to create a mesh

I can’t tell you how many people I’ve worked with that had a door lock and a hub and nothing else, maybe a battery powered thermostat. And they wondered why the connection to the lock was unreliable when the hub was at the far end of the building! Z-Wave relies on Always-On (110VAC powered) nodes to build a “mesh” network. The mesh is the key to Z-Wave reliability. Every Always-On node acts as a repeater in the mesh and is able to forward a message from one node to another in the mesh. But only the Always-On nodes can forward a message. Battery powered devices like door locks and battery powered thermostats cannot forward messages. Only the Always-On nodes can.

Solution: If some devices are not reliable, add more Always-On devices. Add a Z-Wave repeater or any device like a lamp dimmer. Even if you don’t use the lamp dimmer it will act as a repeater and improve the network. I have a few lamp switches I use for my Christmas lights which I leave plugged in year round because they help the Z-Wave network since these nodes are at the periphery of my home.

Distance between nodes is not always the criteria for adding more nodes in a network. The Z-Wave radio signals may bounce off metal objects like mirrors or appliances and cause two nodes that are only a few feet apart be completely unable to talk to each other due to reflections of the radio signals. Adding more nodes in the mesh provide alternate routes to nodes that otherwise might be in a dead zone due to these reflections cancelling out the radio signals.

3. Place the hub in a central location

Putting the hub in a corner of the basement might be convenient, but its a terrible idea for Z-Wave. The hub is the most important node in the network and should have the best location possible. While Z-Wave is a mesh network and can route or hop thru other nodes in the mesh, each hop is a significant delay and chokes up the network with more traffic. Ideally the hub should reach 90% of the nodes in your Smart Home without relying on routing. If the hub has Wifi then putting it in a central location is easy, you just need a wall outlet to plug it in. I have my hub hung off the back of a TV cabinet in roughly the middle of the first floor of my home.

4. Heal the Network

Once a Z-Wave network is built, it has to be “healed” so every node can use all the other nodes in the network to route messages. This healing process can take many minutes to even hours depending on the size of the network. When you first build a Z-Wave network, the first node added only knows that the hub is in the network. When you add a second node, the hub knows that both the nodes are in the network but the first node you added has no idea that node 2 is there – unless you heal the network. So any time you add a node, you need to heal at least a few nodes in the network if not the entire network. Be cautious with the healing process – it uses 100% of the Z-Wave bandwidth during the process and every node will wake up every FliR node (door locks) at least once which will drain the batteries of the FLiR node. Generally only heal when nodes have been added or removed or if there seems to be a problem in the network.

Z-Wave is able to self-heal automatically. Z-Wave nodes will try various routes to get their message thru if at first it doesn’t succeed.  The node will remember the Last Working Route and try that one first for the next message. But if the nodes have no idea there are other nodes in the network they have no way of knowing what routes to try so at least one full heal of the network is required.


homeseerhealHomeSeer has several platforms so the precise method might be slightly different than shown here. From the web interface home page select the menu Plug-Ins->Z-Wave->Controller Management then select the Action “Fully Optimize a Network”. The network wide heal will take some time depending on the size of the network.


SmartThings Expert Z-Wave Eric Ryherd DrZwaveSmartThings  user interface is thru their app which makes finding the network heal a bit of a challenge. Start from the dashboard and click on the three lines in the upper left corner. Your Hub should be the first choice in the menu that slides out, click on your hub. A new menu comes up, click on the last choice “Z-Wave Utilities”. The last choice on the next menu that slides in is “Repair Z-Wave Network” so click on it and then click on “Start Z-Wave Network Repair”. The repair will take from minutes to over an hour depending on the size of your network.


verahealVera has several versions of their UI but each of them has a similar menu structure so these instructions should work on any version. The Vera version shown here is UI7. Use a PC to log into and select your hub. From the Dashboard, select Settings->Z-Wave Settings and then click on the advanced tab. At the bottom of the advanced tab is the GO button to run the “Update Node Neighbors”. Depending on the size of the Z-Wave network this process will take several minutes to over an hour.

5. If a device doesn’t pair, first exclude it, then include it

You’ve taken the brand new Z-Wave IoT widget out of the box and you’ve tried to pair it (the Z-Wave term is “inclusion”) but it just won’t include! Arrrghhh! The first thing to try is to exclude the node first and then try including it. Any hub can “reset” or exclude a Z-Wave device even if that device was previously connected to another network. Some manufacturers occasionally fail to exclude the device during testing so the device may already be connected to their test network. Z-Wave Expert IoT WirelessOr you may have inadvertently included the device but the inclusion process failed somehow and the hub is confused. Excluding the node should reset it to the factory fresh state. Newer Z-Wave Plus devices (which have this logo on them) are required to have a way to reset them to factory defaults using just the device itself. Every device is different so you’ll have to refer to the device manual to perform a factory reset but if all else fails this should make the device ready to pair. Naturally having the hub physically close to the device being paired will also help though most devices can be paired from a distance.

Secure devices like door locks are particularly challenging to pair. First the secure device has to join the Z-Wave network, then the AES-128 encryption keys have to be exchanged and if that process fails (which it does on occasion), then you have to exclude and try the inclusion process all over again. Secure devices definitely want to be within a few feet of the hub during inclusion to ensure reliable and speedy Z-Wave communication.

6. Battery life and how to maximize it

When a battery powered Z-Wave device wakes up and turns on its radio, it uses 10,000 times more battery power than when it’s asleep. So the entire trick to making batteries last is to minimize the amount of time the device is awake. Some devices naturally have other battery draining activities mostly involving motors to throw a deadbolt or raise a window shade. Obviously any motor will use a lot more battery power than the Z-Wave radio but the radio will play a significant role in battery life.

When a battery powered device is added to a Z-Wave network the hub should do two things:

  1. Assign the Association Group 1 NodeID to the hub
    1. Association Group 1 is the “LifeLine” in Z-Wave and devices use this lifeline to send all sensor data and alerts to this node
    2. All hubs are required to assign Group 1 but double check this assignment
  2. Set the Wake Up Interval to no more than once per hour and ideally only a few times per day
    1. Every hub assigns the WakeUpInterval differently and largely handles it behind the scenes so this may be difficult to verify or change
    2. If the device is waking up every few minutes and sends a sensor reading then its battery life isn’t going to be more than a few weeks
    3. The battery level of the device is usually reported at the WakeUpInterval  rate

Many sensors have other Association Groups or Configuration Parameters that will let you specify the frequency of sensor readings. Realize that the more often the sensors report in, the shorter the battery life.

7.  Dead nodes in your controller

One of the big problems in Z-Wave network maintenance is eliminating “dead” nodes. When a device fails or for whatever reason is no longer in use, then it needs to be removed from the controller. If it remains in the controller then the controller will try to route thru this dead node on occasion resulting in delays in delivering messages. Eventually the self-healing aspects of Z-Wave will make this less likely but various devices will on occasion attempt to route thru it. Since the node is dead, that wastes valuable Z-Wave bandwidth and potentially battery power of sleeping devices. Occasionally running a Heal on the network will remove the node from the routing tables but it will remain in the controllers routing tables. It is best to completely remove this dead node. Each hub has a different method for removing dead nodes and usually requires going into an advanced Z-Wave menu.

Following these guidelines will help your Z-Wave experience be more robust. If you have more questions please feel feel to reach out via email to drzwave at