Fast GPIO Wake Up in Z-Wave 700 Series

The Silicon Labs EFR32 family of IoT microcontrollers is very flexible and can do a ton of cool stuff. However, along with all that flexibility comes a lot of complexity. That complexity comes with default settings that work fine for many applications, but in some cases you want to dig into the details to come up with an optimal solution. In this post I’ll show how to speed up the wake-up time of the Z-Wave ZGM130S chip from a GPIO.

But first – a caveat: This post applies to Z-Wave SDK 7.13.x. Future releases of the SDK may have different methods for sleep/wake and thus may require a different solution.

The Problem

Frequently Listening Routing Slaves (FLiRS) devices like door locks and many thermostats spend most of their time in Energy Mode 2 (EM2) to conserve battery power. Once per second they wake up briefly and listen for a beam from an always-on device. If there is a beam, the FLiRS device wakes up fully and receives the Z-Wave command. This allows battery-powered devices to use very little power but still respond to a Z-Wave command within one second. FLiRS devices use more battery power than fully sleeping devices, like most sensors, which use Hibernate sleep mode (EM4). To wake every second, the ZGM130 has to wake quickly and go right back to sleep to minimize power. The problem with EM4 is that it takes a few tens of milliseconds to wake up, because the CPU and RAM were powered down and have to be re-initialized. For a FLiRS device, it’s more efficient to keep RAM powered in a low-power state and resume quickly, then go right back to sleep if there is no beam. Typically the ZGM130 can wake up from EM2 in about 500 microseconds. But in many cases this is still too long to stay awake if there are other interrupt sources such as UARTs or sensors.

The scope shot above shows the processing that takes place by default on the ZGM130S. In this case I am using a WSTK to drive the SPI pins of another WSTK running the DoorLockKeyPad sample application. The chip is in EM2 at the start of the trace. When the SPISEL signal goes low, the chip wakes up. At this point it is running on the HFRCO oscillator, which is not accurate enough to run the radio but is stable and usable within a few microseconds. Thus, the SPI clock and data are captured in the USART using this clock. However, by default the Interrupt Service Routine is blocked waiting for the HFXO to stabilize. The 39MHz HFXO crystal oscillator has the accuracy required for the radio.

The question is what’s going on during this 500usec? The answer is the CPU is just waiting for the HFXO to stabilize. Can we use this time to do some other work? Fortunately, the answer is YES! The challenge is that it takes some understanding and some code which I’ll describe below.

The Solution

There are three functions that do the majority of the sleep processing. These are provided in source code so you can read the code but you should not change it. Instead you’ll provide a callback function to do your processing while the chip is waking up.

Simplified Sleep Processing Code:

  1. SLEEP_Sleep in sleep.c: The main function called to enter sleep
    1. CORE_ENTER_CRITICAL – PRIMASK=1 mask interrupts
    2. DO-WHILE loop
      1. Call enterEMx() – this is where the chip sleeps
      2. Call restoreCallback (return 0 to wake, 1 to sleep)
    3. Call EMU_Restore – waits for HFXO to be ready ~500us
    4. CORE_EXIT_CRITICAL – ISRs will now run
  2. enterEMx() in sleep.c:
    1. sleepCallback called
    2. Call EMU_EnterEM[1-4]
    3. wakeupCallback after returning from EMU_EnterEMx
  3. EMU_EnterEM2 in em_emu.c:
    1. Scales voltage down
    2. Call EMU_EM23PresleepHook()
    3. __WFI – Wait-For-Interrupt instruction – ZGM130 sleeps here
    4. Call EMU_EM23PostsleepHook() ~ 17usec after wakeup
    5. Voltage Scale restored which takes ~20us
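
The control flow in the outline above can be modeled on a PC to see how the restoreCallback return value steers the loop. This is only a minimal sketch and not the SDK code: model_sleep(), model_enter_em2(), model_restore() and the counters are hypothetical names I made up for illustration, and the 0 = wake, 1 = sleep convention is taken from the outline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side model of the sleep loop outlined above.
 * None of these names come from the SDK. */
typedef uint32_t (*restore_cb_t)(void);

static unsigned em2_entries;   /* how many times the model "slept"          */
static unsigned hfxo_waits;    /* how many times the ~500us wait was paid   */

static void model_enter_em2(void) {
    em2_entries++;             /* stands in for enterEMx() and __WFI        */
}

static void model_restore(void) {
    hfxo_waits++;              /* stands in for EMU_Restore()               */
}

/* Mirrors the outline: sleep, ask the callback, repeat while it returns 1. */
static void model_sleep(restore_cb_t restore_cb) {
    assert(em2_entries < 1000u);   /* guard against a runaway model         */
    /* interrupts would be masked here (CORE_ENTER_CRITICAL) */
    do {
        model_enter_em2();
    } while (restore_cb != NULL && restore_cb() == 1u);
    model_restore();
    /* interrupts would be unmasked here (CORE_EXIT_CRITICAL) */
}

/* Example callback: report "idle" for two wakeups, then wake up fully. */
static unsigned idle_wakeups_left = 2;

static uint32_t example_restore_cb(void) {
    if (idle_wakeups_left > 0u) {
        idle_wakeups_left--;
        return 1u;             /* 1 = go straight back to sleep             */
    }
    return 0u;                 /* 0 = wake up and wait for the HFXO         */
}
```

Calling model_sleep(example_restore_cb) once enters the modeled EM2 three times but pays the HFXO wait only once, which is the whole point of the restoreCallback.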

The code is in sleep.c in the SDK which has a lot more detail but at a high level this is what you need to know. The important part to understand here is where the “hooks” are and how to use them.

  • Use SLEEP_InitEx() to assign:
    • sleepCallback – called just before sleeping
    • restoreCallback – Return 0 to wake, 1 to sleep
    • wakeupCallback – called after waking
    • SLEEP_InitEx() input is a pointer to a structure with the three callbacks; set a callback to NULL if it is not used
  • Define the functions:
    • EMU_EM23PresleepHook()
    • EMU_EM23PostsleepHook()
    • These are both WEAK functions with nothing in them, so if you define them the linker will use your versions instead

The two EMU_EM23* weak functions are run immediately before/after the Wait-For-Interrupt (WFI) instruction which is where the CPU sleeps. These are very low level functions and while you can use them I recommend using the callbacks from SLEEP_InitEx().

The SLEEP_InitEx() function is the one we want to use, and in particular the restoreCallback. The comments around the restoreCallback function talk about restoring the clocks, but if the function returns 0 the chip will wake up and if it returns 1 it will immediately go back to sleep, which is what we want! You can use the other two hooks if you want, but the restoreCallback is the key one since it will immediately put the chip back to sleep if everything is idle.

The key to using ANY of these functions is that you CANNOT call ANY FreeRTOS functions! You cannot send any Z-Wave frames or call any Z-Wave function as they all require the RTOS. At this point in the wakeup processing the RTOS is not running! All you can do in these routines is capture data and quickly decide whether everything is idle and the chip can go back to sleep. If more processing is needed, return 0, wait for the event in the RTOS and process the data there. You also don’t want to spend too much time in these routines as it may interfere with the timing of the RTOS. A hundred microseconds is probably fine, but for anything longer you should wait for the HFXO.
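
One way to follow the capture-now, process-later rule is a small buffer plus a flag that an RTOS task checks later. The sketch below is only an illustration under my own assumptions: capture_in_callback() and process_in_task() are hypothetical names, the 8 byte buffer size is arbitrary, and on a real target the task would be notified through whatever mechanism your application already uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CAPTURE_MAX 8u

/* Hypothetical buffer shared between the wakeup callback and a task. */
static volatile uint8_t capture_buf[CAPTURE_MAX];
static volatile size_t  capture_len;
static volatile bool    capture_pending;

/* Runs in the wakeup path: copy the bytes, decide, return. No RTOS calls.
 * Returns 1 to go back to sleep, 0 to wake up fully. */
static uint32_t capture_in_callback(const uint8_t *bytes, size_t n, bool idle) {
    assert(bytes != NULL);
    if (n > CAPTURE_MAX) {
        n = CAPTURE_MAX;               /* never overrun the buffer */
    }
    for (size_t i = 0; i < n; i++) {
        capture_buf[i] = bytes[i];
    }
    capture_len = n;
    if (idle) {
        return 1u;                     /* nothing to do: back to sleep */
    }
    capture_pending = true;            /* leave the real work for the task */
    return 0u;
}

/* Runs later from an RTOS task, once the chip is fully awake.
 * Returns the number of bytes copied to out, or 0 if nothing is pending. */
static size_t process_in_task(uint8_t *out, size_t out_size) {
    if (!capture_pending) {
        return 0;
    }
    size_t n = (capture_len < out_size) ? capture_len : out_size;
    for (size_t i = 0; i < n; i++) {
        out[i] = capture_buf[i];
    }
    capture_pending = false;
    return n;
}
```

The callback stays short and bounded, and everything that needs FreeRTOS or the Z-Wave stack happens in process_in_task() after the HFXO is stable.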

In ApplicationInit() you will call SLEEP_InitEx() like this:

uint32_t CheckSPI(SLEEP_EnergyMode_t emode); // prototype so it can be used in the initializer
// the third member is the restoreCallback; the other two callbacks are unused
const SLEEP_Init_t sleepinit = {NULL, NULL, CheckSPI};
...
ZW_APPLICATION_STATUS ApplicationInit(EResetReason_t eResetReason) {
...
	SLEEP_InitEx(&sleepinit); // call CheckSPI() upon wakeup from EM2
...
}
...
uint32_t CheckSPI(SLEEP_EnergyMode_t emode) {
	(void)emode; // energy mode is not needed for this check
	uint32_t retval = 0; // wake up by default
	if (GPIO_IntGetEnabled() & 0x0000AAAA) { // any odd-numbered GPIO interrupt pending? (SPI)
		GPIO_ODD_IRQHandler(); // service the GPIO interrupt
		// wait for all the bytes to come in and compute checksum
		NVIC->ICPR[0] = NVIC->ICPR[0]; // clear NVIC pending interrupts
		if (!SPIDataError && !IsWakeupCausedByRtccTimeout()) {
			retval = 1; // go back to sleep!
		}
	}
	return retval; // 0=wakeup, 1=sleep
}

Recall that every second the FLiRS device has to check for a Z-Wave beam, which is triggered by the RTCC timer. Thus the check for IsWakeupCausedByRtccTimeout() ensures that the beaming still works.
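
The 0x0000AAAA constant in CheckSPI() is simply a mask with one bit set for each odd-numbered GPIO interrupt (1, 3, 5 ... 15), which are the ones serviced by GPIO_ODD_IRQHandler(). Here is a small sketch that builds the same mask; odd_pin_mask() and has_odd_pin_interrupt() are hypothetical helper names, and which odd pin your chip select actually lands on depends on your board.

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask with one bit set for every odd-numbered pin below pin_count. */
static uint32_t odd_pin_mask(unsigned pin_count) {
    assert(pin_count <= 32u);
    uint32_t mask = 0;
    for (unsigned pin = 1; pin < pin_count; pin += 2) {
        mask |= (uint32_t)1u << pin;
    }
    return mask;
}

/* True if any odd-numbered interrupt bit (0-15) is set in flags.
 * On target, flags would be the value returned by GPIO_IntGetEnabled(). */
static int has_odd_pin_interrupt(uint32_t flags) {
    return (flags & odd_pin_mask(16)) != 0u;
}
```

If you know exactly which pin carries the chip select, you can tighten the test to that single bit so that other odd-numbered interrupts don't run the SPI path.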

This scope shot shows the wake up processing of the ZGM130S:

  1. SPISEL_N SPI chip select signal goes low triggering a GPIO_ODD interrupt
    1. The chip wakes up, the HFRCO begins oscillating
  2. HFRCO begins oscillating in a few microseconds
    1. Once HFRCO is running, the peripherals are functional
    2. SPI data can begin shifting once the HFRCO is running
    3. The default HFRCO frequency is 19MHz but can be increased
    4. Higher frequencies for HFRCO also may need more wait states for the CPU and will use more power
  3. The WFI instruction that put the CPU to sleep is exited here
    1. EMU_EM23PostsleepHook function is called if defined
    2. After returning from EMU_EM23PostsleepHook, the VSCALE is returned to full power which takes about 10usec
    3. It is best to wait for the voltage to be powered up to ensure all logic is running at optimal speeds
  4. EMU_EnterEM2 is exited and restoreCallback is called if initialized
    1. This is the function where the ISR should be called to process data
    2. If the data shows everything is idle and the chip can go back to sleep, return 1
    3. If more analysis is needed, then return 0
    4. Carefully clear the interrupt bits
      1. First clear the peripheral Interrupt Flags
      2. Then clear the NVIC Interrupt pending register
        1. NVIC->ICPR[n]=NVIC->ICPR[n] where n is 0-1 depending on your interrupt
    5. Make sure there aren’t other reasons to wake up fully
      1. !IsWakeupCausedByRtccTimeout() is the 1s FLiRS interrupt
      2. There may be other reasons to wake up which is application dependent
  5. In this example the SPI data is being fetched from the USART at each toggle of the GPIO
    1. The final toggle shows that the checksum was computed and the data is idle so go back to sleep
  6. The chip returns back to sleep in a few more microseconds
    1. Total processing time of this interrupt is less than 200usec which is a fraction of the time just waiting for the HFXO to stabilize
    2. Much of that time is receiving and processing the SPI data
    3. It is possible to sleep in under 50usec if the check for idle is quicker

If your peripheral processing will take significantly less than 500usec, then it may be more efficient to process the data using the HFRCO and not wait for the HFXO to power up. But if your application needs more processing, then you are probably better off waiting. Each application must make its own calculations to determine the most efficient path.
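
A first pass at that calculation is just the total awake time per second for each approach. The sketch below assumes a made-up rate of 10 GPIO wakeups per second and uses the roughly 200usec and 500usec figures from this post. The function names are hypothetical, and the sketch ignores the current drawn in each state, which you need to measure on your own hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Total time spent awake in one second, in microseconds. */
static uint32_t awake_us_per_second(uint32_t wakeups_per_second,
                                    uint32_t awake_us_per_wakeup) {
    uint64_t total = (uint64_t)wakeups_per_second * awake_us_per_wakeup;
    assert(total <= 1000000u);   /* cannot be awake for more than a second */
    return (uint32_t)total;
}

/* Awake time saved per second by going back to sleep early. */
static uint32_t awake_us_saved(uint32_t wakeups_per_second,
                               uint32_t slow_us, uint32_t fast_us) {
    assert(fast_us <= slow_us);
    return awake_us_per_second(wakeups_per_second, slow_us)
         - awake_us_per_second(wakeups_per_second, fast_us);
}
```

With the assumed 10 wakeups per second, awake_us_saved(10, 500, 200) works out to 3000usec of awake time saved every second. Multiply by your measured awake current to turn that into battery life.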

What About Sleeping Devices?

Fully sleeping devices (Routing Sleeping Slaves or RSS, which sleep in EM4) have entirely different wake/sleep processing. For sleeping slaves the processor and RAM have to be re-initialized and the chip essentially boots out of reset. All that initialization takes quite a bit of time – a few tens of milliseconds. If your device needs to check a sensor frequently, then it might make more sense to force it to stay in EM2 by setting a Power Lock to PM_TYPE_PERIPHERAL. For more details on power locks see INS14259 section 7.6. Deciding which way to go is application specific so you have to make the calculations or measurements to find the right balance for your project.

This is a complex posting but I hope I’ve made it clear enough to enable you to optimize your application firmware. Let me know what you think by leaving a comment below.

How to Upgrade Your 700 Series Project from SDK 7.12 to 7.13

This is a very specific posting for Z-Wave developers and specifically for those developing with the new 700 series chips. If you’re not a 700 series developer you can probably stop reading…

I have posted details on upgrading from the 7.12 to the 7.13 Software Developers Kit in this Knowledge Base article on the Silicon Labs web site: https://www.silabs.com/community/wireless/z-wave/knowledge-base.entry.html/2020/03/30/upgrading_700_seriesprojectfrom7122to7133-VZrM

Z-Wave SDK 7.13.3 was released last week with a number of important stability improvements – you want to upgrade your 700 series project to this release!

  • Several stability improvements to prevent lockups in certain corner cases
  • RSSI reporting corrections (both 500 and 700)
  • Improved timing for routed acks and fixed sticky Last Working Routes
  • OTA Firmware Activate support, which delays rebooting into the new firmware until all units have been downloaded
  • Details are found in SRN14629.pdf which is included in the Simplicity Studio release: SDK Documentation->End Device->SRN14629 Z-Wave 700 SDK 7.13.x

ZGM130 GPIO Decoder Ring

One of the challenges with using the Z-Wave developers kit is trying to figure out which pin is connected to what. Every pin of the ZGM130S can perform multiple functions but some pins are best used for certain things. This guide provides general recommendations, but these are not hard-and-fast rules. The ZGM130S provides a great deal of flexibility so feel free to explore the many options each pin and peripheral has to offer. I’m hoping I can guide you with some initial suggestions but feel free to delve into the details to find your ideal solution.

This guide is a compilation of a number of documents and while I’ve been thorough, I could easily have made a typo here or there. Please use the online documentation as the official reference and send me an email pointing out my error and I’ll fix it. I couldn’t possibly include every feature for every pin but the details are available in the reference documents below. The Decoder Ring below is an overview of the most commonly used features. The table below contains a lot of information, so you’ll need to view it on a computer with a big screen.

Reference Documents

ZGM130S        ZGM130S Datasheet – See section 6 for the pin definitions
EFR32XG13    Gecko Family Reference Manual – Each peripheral is detailed here
UG381             ZGM130S Zen Gecko DevKit Users Guide – Section 3 describes connectors

ZGM130S GPIO Decoder Table

Description of the columns in the table below:

ZGM130S | GPIO port as described in the ZGM130S data sheet
Pin # | GPIO pin number of the ZGM130S SIP package
BRD4202A | Zen Gecko Developers kit board for the ZGM130S
BRD8029A | Button/LED board plugged into the WSTK EXP header for the sample apps
WSTK EXP | Pin number of the main WSTK board expansion header
WSTK | Main developers kit board with USB, LCD and Segger J-link debugger. The Pxx pins are the holes across the long side of the board. The Fxx pins are secondary functions that typically connect to the isolators
MiniDBG | Small 20 pin ribbon cable used to connect WSTK to a target DUT
ALT FUNCS | Alternate functions the GPIO can perform. All GPIOs can be simple digital 1/0s. Some pins have special analog or peripheral functions.
Comments | Brief comments describing special functions of the GPIO

General Purpose IOs – Use these for your application first

ZGM130S | Pin # | BRD4202A | BRD8029A | WSTK EXP | WSTK | MiniDBG | ALT FUNCs | Comments
PD10 | 27 | LED_R P201-35 | | | P32 | | | LED ON=low, Pullup. Ideal for PWM
PD11 | 28 | LED_G P201-37 | | | P34 | | | LED ON=low, Pullup
PD12 | 29 | LED_B P200-3 | | | P36 | | | LED ON=low, Pullup
PC6 | 58 | P201-4 P200-29 | | EXP4 | P1 F16 | | | Easy to use with WSTK since they are on EXP
PC7 | 59 | P201-6 | | EXP6 | P3 | | |
PF3 | 5 | P201-13 P200-16 | LED1 | EXP13 | P10 | | | JTAG_TDI
PF4 | 6 | P201-11 P200-23 | LED0 | EXP11 | P8 | | |
PF6 | 8 | P201-7 P200-25 | BUTN0 | EXP7 | P4 F12 | | |
PC8 | 60 | P201-8 P200-28 | | EXP8 | P5 F15 | | |
PC9 | 61 | P201-10 | SW | EXP10 | P7 | | |
PB14 | 46 | P201-30 | | | P27 | | | General IOs
PB15 | 47 | P201-32 | | | P29 | | |
PD9 | 26 | P201-33 | | | P30 | | |
PF5 | 7 | P201-28 P200-24 | | | P25 F11 | | |
PD13 | 30 | P201-34 P200-31 | | | P31 F18 | | OPA1P | Use for Analog functions
PD15 | 34 | P201-38 | | | P35 | | OPA1N |
PA2 | 37 | P201-3 P200-21 | LED2 | EXP3 | P0 F8 | | OPA0P |
PA4 | 39 | P201-17 | | | P14 | | OPA0N |
PA5 | 40 | P201-19 P200-18 | | | P16 F5 | | | VCOM_ENB. Must be 1 for WSTK to pass USART0 to USB

Special Function Pins – Use these for their alternate function

ZGM130S | Pin # | BRD4202A | BRD8029A | WSTK EXP | WSTK | MiniDBG | ALT FUNCs | Comments
PC10 | 62 | P201-15 | BUTN2 | EXP15 | P12 | | EM4WU12 I2C1_SDA | EM4WU pins are the ONLY pins that will wakeup from EM4
PC11 | 63 | P201-16 | BUTN3 | EXP16 | P13 | | I2C1_SCL | Use for I2C, will NOT wakeup from EM4
PD14 | 31 | P201-36 P200-30 | | | P33 F17 | | OPA1 EM4WU4 |
PF7 | 9 | P201-9 P200-26 | BUTN1 | EXP9 | P6 F13 | | EM4WU1 |
PA3 | 38 | P201-5 P200-22 | LED3 | EXP5 | P2 F9 | | VDAC0 OPA0 EM4WU8 |
PB11 | 41 | P201-21 P200-34 | | | P18 F21 | | OPA2P | PTI_CLK
PB12 | 42 | P201-23 P200-33 | | | P20 F20 | 10 | OPA2 | PTI_DATA. Future Packet Trace debug
PB13 | 45 | P201-25 P200-32 | | | P22 F19 | 9 | OPA2N EM4WU9 | PTI_SYNC

Fixed Function Pins – Reserved for debug

ZGM130S | Pin # | BRD4202A | BRD8029A | WSTK EXP | WSTK | MiniDBG | ALT FUNCs | Comments
PA0 | 35 | P201-12 P200-19 | | EXP12 | P9 F6 | 5 | USART0 | VCOM_TX WSTK->ZGM130
PA1 | 36 | P201-14 P200-20 | | EXP14 | P11 F7 | 4 | USART0 | VCOM_RX ZGM130->WSTK
PF2 | 4 | P201-31 P200-15 | | | P28 F2 | 6 | EM4WU0 | SWO JTAG_TDO
PF0 | 2 | P201-27 P200-14 | | | P24 F1 | 8 | BOOTTX | SWCLK JTAG_TCK
PF1 | 3 | P201-29 P200-13 | | | P26 F0 | 7 | BOOTRX | SWDIO JTAG_TMS
RST_N | 15 | P200-17 | | | F4 SW102 | 3 | | RESET button next to EXP header
ANT | 23 | SMA or PCB | | | | | | Radio Antenna. Move cap R2 for PCB

Power Pins

ZGM130S | Pin # | BRD4202A | BRD8029A | WSTK EXP | WSTK | MiniDBG | ALT FUNCs | Comments
VSS GND | 1,10,11,12,13,14,16,17,18,19,20,21,22,24,25,32,33,43,48,49,51,53,54,55,64 | | | | | 2 | | Ground 0 Volts
AVDD | 44 | | | | | | | Analog Power 1.8-3.8V
1V8 | 50 | | | | | | | Leave unconnected
VREGVDD | 52 | | | | | | | DCDC Reg input. Tie to AVDD. 2.4V min for DCDC
DECOUPLE | 56 | | | | | | | No-Connect
IOVDD | 57 | | | | | | | Digital IO Power 1.62V-VREGVDD. AVDD>=IOVDD
 | | | | | VAEM | 1 | | DUT Power

EXP Port on the WSTK

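P100 connects to the Button/LED BRD8029A:

BRD8029A | WSTK | Pin # | Pin # | WSTK | BRD8029A
 | GND | 1 | 2 | VMCU | VMCU
LED2 | P0 (PA2) | 3 | 4 | P1 (PC6) |
LED3 | P2 (PA3) | 5 | 6 | P3 (PC7) |
BUTN0 | P4 (PF6) | 7 | 8 | P5 (PC8) |
BUTN1 | P6 (PF7) | 9 | 10 | P7 (PC9) | SW Pull up/dn
LED0 | P8 (PF4) | 11 | 12 | P9 (PA0) |
LED1 | P10 (PF3) | 13 | 14 | P11 (PA1) |
BUTN2 | P12 (PC10) | 15 | 16 | P13 (PC11) | BUTN3
 | BoardID_SCL | 17 | 18 | +5V | NC
 | BoardID_SDA | 19 | 20 | 3V3 | BRDID PWR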

Legend:
WSTK = hole name (ZGM130 GPIO)
EX: WSTK P100 pin 16 is hole P13, ZGM130S GPIO PC11, Button 3 on the BRD8029A
BRD8029A is the Button/LED board. Functionality is:
LEDs are ON when driven high
BUTN pins are pulled up and go low when the button is pressed
The SW can pull the pin either up or down depending on the slide setting
left/on is 1, right/off is 0 thru 1MOhm resistors
The VMCU powers the LEDs/Buttons but the board ID is powered via pin 20
Schematic is in Simplicity Studio
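
If you find yourself looking up the same pins over and over, the EXP header table above drops neatly into a small lookup table in your code. This is only a sketch built from the table in this post. The exp_pin_t type and exp_lookup() function are hypothetical names, and only the GPIO pins (EXP 3 through 16) are included.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of the EXP header table: header pin, WSTK hole, GPIO, BRD8029A use. */
typedef struct {
    int         exp_pin;
    const char *wstk_hole;
    const char *gpio;
    const char *brd8029a;   /* empty string if the Button/LED board does not use it */
} exp_pin_t;

static const exp_pin_t exp_map[] = {
    {  3, "P0",  "PA2",  "LED2"  }, {  4, "P1",  "PC6",  ""      },
    {  5, "P2",  "PA3",  "LED3"  }, {  6, "P3",  "PC7",  ""      },
    {  7, "P4",  "PF6",  "BUTN0" }, {  8, "P5",  "PC8",  ""      },
    {  9, "P6",  "PF7",  "BUTN1" }, { 10, "P7",  "PC9",  "SW"    },
    { 11, "P8",  "PF4",  "LED0"  }, { 12, "P9",  "PA0",  ""      },
    { 13, "P10", "PF3",  "LED1"  }, { 14, "P11", "PA1",  ""      },
    { 15, "P12", "PC10", "BUTN2" }, { 16, "P13", "PC11", "BUTN3" },
};

/* Return the table row for an EXP header pin, or NULL for power/ID pins. */
static const exp_pin_t *exp_lookup(int exp_pin) {
    assert(exp_pin >= 1 && exp_pin <= 20);   /* the header has 20 pins */
    for (size_t i = 0; i < sizeof exp_map / sizeof exp_map[0]; i++) {
        if (exp_map[i].exp_pin == exp_pin) {
            return &exp_map[i];
        }
    }
    return NULL;
}
```

For example, exp_lookup(16) returns the row for hole P13, GPIO PC11, Button 3, which matches the example in the legend.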

Mini-Simplicity Debug header: Top View

Name | Pin # | Pin # | Name
VAEM (3V3) | 1 | 2 | GND
RESET_N | 3 | 4 | VCOM_RX (PA1)
VCOM_TX (PA0) | 5 | 6 | SWO (PF2)
SWDIO (PF1) | 7 | 8 | SWCLK (PF0)
PTI_SYNC (PB13) | 9 | 10 | PTI_DATA (PB12)

On the DUT board with the ZGM130S on it, connect VCOM_RX to PA1 and VCOM_TX to PA0. The header is a 0.05” pitch 10 pin header. A Samtec FTSH-110 is typically used, but to make the footprint even smaller you can use just the pads for a thru-hole connector. See my post “700 series Debug Header” from October 2019 for more details on the debug header.

Recommendations

Remember that the GPIOs in all the Wireless Gecko chips from Silicon Labs are very flexible and most peripheral functions can be routed to almost any IO pin. That said, there are some pins that have fixed functions and some pins are best at certain other functions. Let’s start with the pins you can’t use.

Power Pins

Obviously, the power pins have to be connected to the proper voltage. The ZGM130 has a lot of ground pins so make sure every one of them is connected to a solid ground. The voltages for the power pins have a few rules but in most cases you’ll tie AVDD and VREGVDD to the same power source of 3.3V. The on-chip DC to DC regulator allows a wide voltage range on VREGVDD so there is some additional flexibility which requires careful consideration. But for most applications, tie these two power pins directly to the battery. Most of the time IOVDD is also tied to the battery though depending on other chips in your system you might need an LDO to keep the IOs at the proper voltage.

Debug Pins

The next set of pins you should not use for GPIOs are the debug pins. All ten IO pins on the Mini-Simplicity header should be wired up as shown in the table above. The UART pins are very important to print out debug information while the chip is running. The advantage here is that you can see things happening in real-time without using a scope or logic analyzer. And you can easily enable/disable blocks of debug statements as needed. The sample apps print all sorts of debug information so you’ll want to do the same in your code. The UART defaults to only 115200 baud but it’s easy to increase it up to roughly 1Mbaud to reduce the time impact of printing debug messages. Also, keep the messages short or else they may impact the timing so much that your code will fail. If you’re really in a pinch for one more IO, you could use the SWO or VCOM_RX pins as you probably don’t need them.
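
To get a feel for how much a debug message costs, here is a quick back-of-the-envelope calculation. It assumes 8N1 framing (10 bits on the wire per character), which is my assumption and not something stated above, and uart_tx_time_us() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Time in microseconds to shift out `chars` characters at `baud`,
 * assuming 8N1 framing: 1 start bit + 8 data bits + 1 stop bit. */
static uint32_t uart_tx_time_us(size_t chars, uint32_t baud) {
    assert(baud > 0u);
    uint64_t bits = (uint64_t)chars * 10u;
    return (uint32_t)((bits * 1000000u) / baud);
}
```

A 40 character message takes about 3472usec at 115200 baud but only 400usec at 1Mbaud, so the faster rate makes a big difference when the code around the print is timing sensitive.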

Special Function Pins

A number of pins have specific functions, especially for certain analog functions. The ADC/DAC only have their primary and alternate IOs, though you can route them thru the ABUS. Only a few of these pins can wake the chip from sleeping in EM4. Plan carefully and reserve these pins for buttons or sensors that need to wake the chip up.

General Purpose IOs  

The rest of the pins are available for any purpose. Most of the pins can be any digital IO function. Some pins are handy because of their connection on the DevKit. There are three IOs connected to an RGB LED on the 4202 board. These are really handy to visually observe PWM waveforms based on the color of the LED. The pins that are routed to the EXP header on the WSTK are easy to connect to a proto board. Be careful to avoid using PA5 as that is connected to the VCOM enable pin on the WSTK. PA5 has to be high to route the debug UART data to the USB bus. But this is a restriction only when using the DevKit 4202 board. If connecting to your own PCB via the Mini-Simplicity header you can freely use PA5.
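
As an illustration of the PA5 restriction, the sketch below drives PA5 high only when building for the DevKit. On the ZGM130S, GPIO_PinModeSet(), gpioPortA and gpioModePushPull come from emlib's em_gpio.h. Here they are replaced by host-side stand-ins (with placeholder enum values) so the sketch compiles off-target, and vcom_enable() and the on_devkit flag are hypothetical names of mine.

```c
#include <assert.h>
#include <stdbool.h>

/* Host-side stand-ins. On target, delete these and include em_gpio.h. */
typedef enum { gpioPortA = 0 } GPIO_Port_TypeDef;
typedef enum { gpioModePushPull = 1 } GPIO_Mode_TypeDef;   /* placeholder value */

static unsigned pa5_level = 0;   /* records what the stand-in drove onto PA5 */

static void GPIO_PinModeSet(GPIO_Port_TypeDef port, unsigned int pin,
                            GPIO_Mode_TypeDef mode, unsigned int out) {
    assert(pin < 16u);
    (void)mode;
    if (port == gpioPortA && pin == 5u) {
        pa5_level = out;
    }
}

/* Drive VCOM_ENB (PA5) high so the WSTK passes USART0 through to USB.
 * Only needed on the BRD4202A DevKit; on your own PCB PA5 is free. */
static void vcom_enable(bool on_devkit) {
    if (on_devkit) {
        GPIO_PinModeSet(gpioPortA, 5, gpioModePushPull, 1);
    }
}
```

Keeping the DevKit-only setup behind a single flag makes it easy to reclaim PA5 when you move to your own board.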

Conclusion

The ZGM130S is a flexible chip with plenty of compute power, RAM and Flash. The GPIOs can largely be routed from any peripheral to any IO though there are some restrictions. The secret decoder ring given here maps out which pin can do what and which ones should be used for your application. This is only a guide and the chip has many more features than I can describe here. Refer to the Reference Documents for more in depth knowledge.

Z-Wave is No Longer Proprietary

One of the impediments to the adoption of Z-Wave by some developers was that Z-Wave was considered “proprietary”. But that impediment has now been removed with yesterday’s announcement of the open availability of Z-Wave. Read more about it here. The Z-Wave RF protocol has been an ITU standard for many years: ITU-T G.9959. But that wasn’t enough to get a product to market because of all the software you would have to develop, and even then it was uncertain whether the product could be Z-Wave Certified.

I just wanted to make a quick note here about this very important announcement.

700 Series Debug Header

We’re all trying to make the Smart Home products smaller and less visible. Using coin cells instead of bulky cylindrical batteries significantly reduces the size of many products. The challenge with making products smaller is that the area available on the PCB for a debug header is in short supply.

With the 500 series I usually used a 0.1″ spacing 12 pin header from the ZDP03A programmer to the target board. The header was normally not installed in the final product but for debug purposes the solid thru-hole connector meant I would reliably program a device the first time and every time.

However, many customers I’ve worked with want to use less PCB real estate which means they come up with a custom set of test points. Typically a jig with spring loaded pins is used to make contact with the PCB, or more often wires are soldered to the PCB. The problem with this solution is that the jig is large, expensive and fragile. Soldering a cable to a board often results in a fragile connection where a wire can easily break off a pin without it being immediately obvious. I’ve spent far too much time trying to figure out why I could program the part a minute ago but now I can’t, only to realize the cable has a loose wire.

Nasty unreliable hand-soldered fragile Z-Wave debugging cable
Unreliable Z-Wave programming cable

500 Series Header

My recommendation for the 500 series is to use a full size 12 pin 0.1″ spacing connector for programming and debug. Either SMT or thru-hole is fine but either way you have a solid, reliable, portable connection. While this worked OK with the 500 series, which typically used large cylindrical batteries, the 700 series often uses coin cells and those designs don’t have the real estate for a full size connector.

Reliable 500 series header = 15x5mm

700 Series Debug Header

Fortunately Silicon Labs has an even better solution for the 700 series – use a 0.05″ spacing 10 pin SMT header. The Mini Simplicity Debug connector is described in AN958. If you have a little room then use the standard SMT header which is 6x6mm. If you are very tight on real estate then put down the pads for the thru-hole version of the connector but hand-solder the thru-hole header to the pads. Using just the pads results in a header only 3x6mm. You can’t tell me you can’t come up with 18sqmm to make the PCB debug reliable!

Either solution requires only a small amount of space on a single side of the PCB. Usually the header pads can be under a coin cell since during debug a power supply is used instead of the battery. This same header can be used for production programming using a jig to contact to the pads. Having a standard and reliable connection to the PCB will save you time during debug and on the production floor.

Reliable 700 series header = 6x6mm

Conclusion

No matter how tempted you might be to come up with your own cable/connector/test points, DON’T DO IT! Use the standard Mini Simplicity connector to save yourself so many headaches during debug. A solid, reliable debug connection is an absolute must, otherwise you risk spinning your wheels chasing ghosts that are caused by a flaky connector. Take it from me, I’ve tried to debug way too many of these over the years and it is not fun.

Z-Wave Summit 2019

If you missed the spring Z-Wave summit in Amsterdam, you won’t want to miss the fall summit in Austin, Texas! The Z-Wave summit is a great place to meet other Z-Wave developers and hear about the latest technology and marketing advances Z-Wave has to offer. The summit is a three day event that is packed with valuable information and learning for both developers and marketing people. The first day is an evening networking get-together, and the second day is mostly roadmap presentations and information about how Z-Wave is doing and where it is going. The final day splits into two tracks, with developers learning about details of the Z-Wave technology and marketing folks learning how to leverage the Alliance resources and all the events coming up.

Join us October 2-4, 2019 for the Z-Wave Fall Summit 2019 in Austin. The hot topic will be the 700 series! The 3-day event features a keynote & reception on the evening of the 2nd, followed by two days of informative and insightful sessions in a Developer’s Forum Technical Track and a Business / Marketing Track, plus a member networking evening event on the 3rd. All members are invited to attend!

 Date: 10/2/2019 to 10/4/2019
When:

10/2/19 – 5pm-9pm – Opening Reception

10/3/2019 – 9am-5pm – Developers/Marketing forums and Evening Reception

10/4/2019 – 9am-4pm – Keynote & forums – UnPlugFest

Where: W Hotel
200 Lavaca Street
Austin, Texas  78701
United States

Notes from EU Summit in May

The EU summit was right on the heels of the 700 series general release so it’s been a very busy time for everyone. The Z-Wave roadmap has major releases every 6 months and minor releases between them. Silicon Labs plans on a long life and improving volume shipments for Z-Wave and is investing in the technology to realize these gains. Attendance at the summit was up again with attendees from around the world.

Certification costs recently went up but adding frequencies (countries/regions) is now just paperwork and free. The all important Certification Test Tool (CTT) was recently updated to version 2.8.4. If you are heading to certification with your new whiz-bang product you’ll want to test it with the latest version of all the scripts. The CTT is actively being improved so future releases will be even more powerful.

All 700 series Certifications must be Z-Wave Plus V2. The logo remains the same but V2 raises the bar on several fronts. The main upgrade for V2 is that devices largely advertise their capabilities without the need for Hubs to write custom drivers for each and every device. Specifically, Configuration Command Class now requires the name/default/min/max values as well as some text describing the function of the parameter to be returned via a GET command. 500 series devices are not required to support V2, but it is recommended if you have the code space for it. For Hubs the bar has been raised significantly with Security S2, SmartStart and many interoperability improvements that will require some resources to achieve certification.

Expect the RF range minimums to go up from the current 40m (132′)  with the improved radio of the 700 series. The range requirement hasn’t changed yet but there are lots of discussions on the topic. Most of the devices I’ve worked on (over 3 dozen) have consistently achieved over 100m of line-of-sight RF range so I expect the minimum to probably double. This does make testing a bit more difficult as finding a location with 100m of open space can be a bit of a challenge, especially in winter!

Presentations

The presentations at the EU summit (and most of the previous ones) can be viewed from the Alliance web site. You need to be a member to gain access to them. My presentation (along with Axel Brugger) was a deep dive on using the 700 series and getting familiar with Simplicity Studio for developing end devices. Hans Kroner gave a similar presentation on developing gateways using ZIPGW/ZWare. While you can view the presentation online I highly recommend attending in person so you can ask questions and get straight answers right there on the spot.

Fun and Games

The members night was at The Beach which is an indoor beach games facility with lots of crazy stuff. Great food, Great music, Great games made for an enjoyable evening after a long day of stuffing information in your brain!

The US Summit was held in Austin Texas Oct 2-4

The US summit was held at the Silicon Labs office and at the nearby W Hotel in Austin Texas. Another excellent turnout with lots of informative sessions from technical training to marketing by leveraging the Z-Wave Alliance.

This is just a fraction of the turnout at the US Summit in Oct 2019

Interoperables are BACK

Mitch Klein received a Lifetime Achievement award from CEDIA a few months back. With Mitch leading the Interoperables band, we had quite the show all to ourselves along with plenty of food, drink and excellent conversation. By far the most beneficial part of the Summit is the chance to talk to your fellow Z-Wave developers, marketers, executives and enthusiasts.

The Z-Wave band “Interoperables” are back with a wide repertoire of classic rock

Zniffer File HowTo

Z-Wave developers have a handy tool for debugging firmware and Z-Wave network issues called the Zniffer. The Zniffer consists of two parts, the first is a USB dongle with special firmware and the second is the Windows program. You can’t buy just a Zniffer USB dongle (they come as part of some of the developers kits) but you can make one out of a standard UZB. You can even make a SuperZniffer as described in my previous blog posting. The Zniffer program is included in the Simplicity Studio IDE tools for developing Z-Wave products.

Zniffer traces are INVALUABLE when submitting a support case to the Silicon Labs Z-Wave support web site. I am a Field Applications Engineer so I often review Zniffer traces captured by developers who have questions or are reporting bugs. The problem is that many times I get a support case that says “Zniffer trace attached – what is problem?” and the Zniffer trace is several hundred megabytes with dozens of Z-Wave networks and maybe one hundred Z-Wave nodes captured across days of time. Talk about the proverbial needle in a haystack! So I am asking everyone to follow a few rules BEFORE attaching a Zniffer trace to a support case.

Zniffer File Rules

Before attaching a Zniffer file for Z-Wave support to review, include the following:

  1. The HomeID of the network with the problem
  2. The NodeID of the Z-Wave node that demonstrates the problem
  3. The line number or the date/time of where the problem occurred (or a range)
  4. The Security Keys of the Z-Wave network
  5. A clear and concise description of the problem, what should have happened, what didn’t happen, what you believe is wrong

HomeID

The HomeID of a Z-Wave network is a 4 byte, eight digit hexadecimal number that uniquely identifies a single Z-Wave network. Only devices with the same HomeID can talk to each other. In a development environment there are often dozens or even hundreds of Z-Wave networks in range. Remember the Zniffer captures every network in the air. Please do not filter the HomeID when saving out the Zniffer file as there may be critical interactions with other networks or even noise that will be filtered out if you save only the matching HomeID. We can always filter by HomeID when displaying the network on our PC but we can’t see the data if it’s not in the file.
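
If you script any of your trace handling, it helps to validate the HomeID before putting it in a support case. The sketch below checks that a string is exactly eight hex digits and converts it to and from a 32-bit value. The homeid_parse() and homeid_format() names are hypothetical and not part of any Z-Wave tool.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Parse an eight digit hexadecimal HomeID such as "FFE5B5C9".
 * Returns false if the text is not exactly eight hex digits. */
static bool homeid_parse(const char *text, uint32_t *home_id) {
    assert(text != NULL && home_id != NULL);
    if (strlen(text) != 8u) {
        return false;
    }
    uint32_t value = 0;
    for (size_t i = 0; i < 8u; i++) {
        unsigned char c = (unsigned char)text[i];
        if (!isxdigit(c)) {
            return false;
        }
        unsigned digit = isdigit(c) ? (unsigned)(c - '0')
                                    : (unsigned)(toupper(c) - 'A') + 10u;
        value = (value << 4) | digit;
    }
    *home_id = value;
    return true;
}

/* Format a HomeID as eight uppercase hex digits, zero padded. */
static void homeid_format(uint32_t home_id, char out[9]) {
    snprintf(out, 9, "%08lX", (unsigned long)home_id);
}
```

Zero padding matters here: a HomeID of 0x0012ABCD should be reported as 0012ABCD, not 12ABCD, so that it matches what the Zniffer displays.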

NodeID

The NodeID of the node that is displaying the issue has to be identified. You might have dozens of nodes in the network that are all talking at once so we need to know which one is the one with the problem. Please include details of the device as well such as what type it is (binary switch, thermostat, sensor, battery powered, etc.). Ideally, capture the inclusion of the device in the Zniffer file, as that will tell us just about everything we need to know: the NIF will be exchanged and the interview will take place.

Date/Time

Each transaction in the Zniffer trace is identified by a line number on the left side or the date/time. Indicating the line number or date/time or a range of these will help us navigate the potentially huge Zniffer file and quickly zoom in on the problem. Wading thru days of Zniffer data to finally find the interesting bit is just wasting our time and yours.

Security Keys

If you are working with Secure devices you MUST include the security keys. Without the security keys the data is encrypted and it is all just meaningless ones and zeroes and we can’t help you. Now that all devices are required to be secure, the key file is critical. The Zniffer trace has to include the SPAN table update as without the SPAN table we again cannot decrypt the message. The easiest way to be sure the SPAN is included is to add the device-under-test (DUT) to the network while capturing the Zniffer trace. The other option is to power cycle the DUT which will usually cause the DUT and the controller to exchange Nonces to resynchronize the SPAN table and we can once again decrypt the messages in the Zniffer.

To extract the security keys, join the PC Controller to the Z-Wave network. Be sure to enable all levels of security by providing the S2 DSK of the PC Controller. Once joined to the network, the keys can be saved to a file using the procedure below:

ZnifferHowTo2

The file is named after the HomeID (HomeID.txt), which in the case above is FFE5B5C9.txt, and contains:

9F;C592557B5F99DDC9BDD12D0D926BAFE5;1
9F;31944FE8F8DE2330E79741313A949190;1
9F;18FD847446AFD7E410B2BCF8912BC632;1
98;C65D55C44FB2156635CA07A48D362AD3;1

To decrypt the messages in the Zniffer, just click on Load Keys and enter the directory for the file. Then all the messages are decrypted and we can help you solve your problem.
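If you want to sanity check a key file before attaching it, each line shown above has three semicolon-separated fields: a two digit hex value, 32 hex digits (a 128-bit key), and a number. The C sketch below parses one such line. The struct and function names are hypothetical, and the field meanings are my reading of the sample rather than a documented format: the first field looks like a command class identifier (0x9F is Security 2, 0x98 is Security 0), and the meaning of the last field is not described here.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    uint8_t  command_class; /* first field, e.g. 0x9F or 0x98 */
    uint8_t  key[16];       /* second field: 128-bit key */
    unsigned flag;          /* third field; meaning not documented here */
} zniffer_key_entry;

/* Parse one line of the form "CC;<32 hex digits>;N".
 * Returns 1 on success, 0 if the line does not match that shape. */
int parse_key_line(const char *line, zniffer_key_entry *out)
{
    unsigned cc, flag;
    char hex[33];

    if (sscanf(line, "%2x;%32[0-9A-Fa-f];%u", &cc, hex, &flag) != 3)
        return 0;
    if (strlen(hex) != 32)          /* key must be exactly 16 bytes */
        return 0;

    for (int i = 0; i < 16; i++) {
        unsigned byte;
        if (sscanf(hex + 2 * i, "%2x", &byte) != 1)
            return 0;
        out->key[i] = (uint8_t)byte;
    }
    out->command_class = (uint8_t)cc;
    out->flag = flag;
    return 1;
}
```

A line that fails to parse (wrong key length, missing field) is a good hint that the wrong file was saved.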

Pilgrimage to the Z-Wave Homeland

I finally made the trip to the home of Z-Wave, Copenhagen, Denmark. Z-Wave began as Zensys back in 2001. My journey with Z-Wave began shortly after that in 2003 when I was disgusted with my highly unreliable X10 home automation experiments. I just couldn’t get that X10 junk to work! It was cheap, but it wasn’t worth my time and frustration so I was looking around for other technologies that would be reliable. I experimented with several custom baked wireless solutions but quickly realized that wireless is really hard and complicated. Z-Wave caught my eye because it was a real mesh network and actually worked. From there I have continued to be impressed by the technology improvements always with full backward compatibility and wide choice of fully interoperable products from many manufacturers.

My purpose is to meet the engineering team and to learn in much greater depth the details of Z-Wave and especially the new 700 series. We have an intense group of smart engineers working diligently on the many aspects of a wireless system as complex as Z-Wave. One team is busy with the gateway specific parts of the protocol and the Z/IP Gateway and Z-Ware code. Another team is working on the protocol and solving very complex issues that we find are happening in the real world. The support team (of which I am part) helps customers get their products to market quickly by answering their questions and providing training. And of course there are the marketing and sales folks who make sure you all know about the benefits of Z-Wave.

The Z-Wave team in Copenhagen resides in this modest building. Danes love to bicycle to work or take the excellent train/bus system. Only a few travel via car unlike those of us in the US who just love sitting in traffic for hours. The food in Copenhagen is wonderful with plenty of international choices as well as the Danish favorites. My hotel room is more of a spaceship pod than a boring room with the Danish penchant for efficient minimalism. The view of the windmills in the distance and the quaint classic European architecture are beautiful.

Z-Wave EU Summit

The other purpose of my visit across the pond is to attend and speak at the Z-Wave EU Summit. If you are a Z-Wave developer, I highly recommend attending as you’ll learn about the latest Z-Wave technology and best practices to build robust IoT products. We have two technical tracks in addition to the marketing track. Meeting with your fellow Z-Wave developers to share your experiences and learn from theirs is the main value of the summit. The only way to get that experience is to attend in person. See my other postings about last year’s EU summit and the US summits to get a feel of what goes on.

The summit is next week in Amsterdam. Click HERE for more details on the summit.

Z-Wave Watchdog Timer Best Practices

Virtually all embedded systems must run 24 x 7 x 365 x many many years without ever being rebooted. Since there is no one there to “press the reset button” if the device fails, the watchdog timer is there to do just that. The 500 series Z-Wave chips from Silicon Labs have a watchdog timer and the example code provides a very minimal use of the watchdog timer. However, the minimal use in the example code is not sufficient to provide a robust watchdog for embedded Z-Wave devices. This post explains some rules and methods to code a robust watchdog timer.

Longtime embedded expert Jack Ganssle has a great article on watchdog timers. He describes the Clementine spacecraft, where a fault in the system caused the spacecraft to dump virtually all of its fuel, resulting in the loss of the mission. The lead software engineer had wanted a watchdog but the designers decided not to include it. Jack’s example shows how important it is to spend at least some time coding a robust watchdog for our IoT devices. While our devices aren’t controlling multi-million dollar spacecraft, we are coding light switches that are hardwired into the wall and cannot be easily rebooted. Try telling the customer to go into the basement and toggle the power to his entire house to reboot the light switches!

What is a Watchdog?

A watchdog timer is a timer that runs constantly. Typically a complex combination of events resets (or “kicks”) the watchdog timer every now and then, usually every few milliseconds. If the combination of events ever gets stuck, the timer will continue to run. If the watchdog timer “times out”, the system is reset – basically the reset button is pushed! Your embedded system reboots and keeps on running. Generally no one even realizes it has rebooted (I’ll discuss that problem in more detail shortly).

This diagram shows the watchdog timer’s value which is constantly counting up. Every time the watchdog is “kicked”, the counter is reset to zero. Somewhere in your code the ZW_WatchDogKick() routine is called which resets the watchdog timer. Sometimes this reset condition happens on a nice regular basis, sometimes it happens at varying times as shown by the level of the timer. The key is the timeout threshold has to be longer than any normal operating condition. If a fault condition occurs, the timer keeps on counting up until the threshold is reached and then the system is reset. When the watchdog timer fires, the Z-Wave chip goes thru a full reset just as if power had been removed and reapplied. Your embedded system is back up and running as if nothing had happened.
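The counting behavior is easy to model on a PC. The sketch below is a toy simulation of a watchdog counter, not the Z-Wave chip's hardware or API; all of the names (sim_watchdog, sim_wdog_tick and so on) are made up for illustration.

```c
#include <stdint.h>

/* Toy model of a watchdog: a counter that runs up until kicked. */
typedef struct {
    uint32_t count;     /* ticks since the last kick */
    uint32_t threshold; /* timeout, in ticks */
    uint32_t resets;    /* how many times the "system" was reset */
} sim_watchdog;

void sim_wdog_init(sim_watchdog *w, uint32_t threshold)
{
    w->count = 0;
    w->threshold = threshold;
    w->resets = 0;
}

/* Kicking the dog sets the counter back to zero. */
void sim_wdog_kick(sim_watchdog *w)
{
    w->count = 0;
}

/* Advance time by one tick. Returns 1 if the threshold was reached,
 * which in real hardware would reset the chip. */
int sim_wdog_tick(sim_watchdog *w)
{
    w->count++;
    if (w->count >= w->threshold) {
        w->resets++;
        w->count = 0;   /* the counter starts over after the reset */
        return 1;
    }
    return 0;
}
```

With a threshold of 5, kicking at least once every four ticks keeps the model alive forever; stop kicking and the fifth tick triggers the reset.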

SiLabs Sample Code = Minimal Watchdog

The SiLabs sample code has the following implementation of the watchdog:

BYTE ApplicationInitSW(ZW_NVM_STATUS nvmStatus) {
...
#ifdef WATCHDOG_ENABLED
 ZW_WatchDogEnable();
#endif
} 

void ApplicationPoll(void){
#ifdef WATCHDOG_ENABLED
 ZW_WatchDogKick();
#endif
}

The sample code has the good implementation practice of putting the watchdog code inside #ifdef blocks so it can be easily enabled/disabled. Unfortunately it blindly kicks the dog every ApplicationPoll without checking any other conditions. ApplicationPoll is called roughly every few hundred microseconds and a lot of fault conditions can exist and ApplicationPoll will still be called. With this implementation the only way the watchdog is going to fire is if there is a catastrophic failure and ApplicationPoll is no longer being called. While this implementation is better than nothing, it won’t reset the system in many cases where the device has become unresponsive. This is where you come in: you have to add more code to the watchdog algorithm. It may be easy to just use what SiLabs provides, but for a robust product you really need to spend some time adding your own conditions to the watchdog algorithm.

A Better Watchdog Example

Writing good watchdog code requires some significant thought and testing. The possible sources of failure need to be discussed with members of the team and with other Z-Wave developers who are fighting the same fight (thus the need for this blog). I can provide a few guidelines to include in your analysis but this is not a complete solution. Only you know all the possible failure modes of your product and that requires some serious thought and analysis.

Mutex Gets Stuck

The most common failure I have seen is that the SiLabs provided Application Framework (AF) mutex can get stuck. When the mutex is stuck, the device is most often still able to receive Z-Wave traffic but often can’t respond. If the device is power cycled, then it returns to full operation. So often this failure goes unnoticed both in testing and in actual use.

What is the mutex you ask? The mutex is a simple flag in the AF that prevents the code from overwriting the Send Buffer while a message is currently being sent over the radio. When a GET command comes in, the AF will call a command class handler to handle the GET and build a REPORT frame in memory. When ready to send the frame, the AF will call pTxBuf=GetResponseBuffer() to get a buffer for the radio to send. There is only one buffer so if the buffer is already in use, you get a NULL pointer back and will have to wait and send the frame later.  This in general works fine as long as frames don’t come in too fast. But in a large network with lots of repeated and re-routed frames you will occasionally get a bunch of GETs quickly and it is possible for the REPORTs to get cross wired and end up locking up the mutex for a frame that will never be sent. If the code then doesn’t properly release the buffer, the mutex is stuck. The Application Framework code is known to lock the mutex occasionally so you must code around this problem. The easiest solution to this rare event is to ensure the watchdog is watching the mutex and simply reboot if it gets stuck for too long.

My solution is to have a counter that counts up once per second in ApplicationPoll anytime ActiveJobs() is true (in SDK 6.81.xx it’s now called ZAF_mutex_isActive()). ActiveJobs is true anytime a buffer is in use and false when all the buffers are free. There are actually two buffers, one for response frames (REPORTs sent as a result of a GET) and a second buffer for request frames (unsolicited notifications).

Application Specific Reasons

Beyond the mutex you must think long and hard about application specific failure conditions. The most obvious is that the device has not received or sent a frame in 25 hours. Most hubs will poll a device at least a couple of times per day to make sure it is still alive. So if there has been no traffic in a day, maybe something is stuck and a reboot is in order. Plus if nothing has happened in a day then probably no one will notice the reboot (which only takes 1.5 seconds). You do have to be careful that some other part of the application isn’t impacted as a result of the reboot. For example, if you are a light switch and by default you turn the light off on a reboot, then people will be really annoyed if the light randomly turns off because your hub hasn’t polled it in a day. There are lots of potential checks you can make here but every application will have different requirements so you will have to think hard about all the possible conditions for your specific case.

Sample good watchdog:

E_APPLICATION_STATE ApplicationPoll( E_PROTOCOL_STATE bProtocolState ) {
...
if (ActiveJobs()) {              // Mutex buffer is busy
    if (OneSecondTimer) ActiveJobsCounter++;  // Once/sec increment
} else {
    ActiveJobsCounter=0;         // When buffer is free clear counter
}
...
if ((ActiveJobsCounter<30) &&       // Mutex isn't stuck 
    (LastCommsHours<25) &&          // Got a frame in the last 24 hrs
    ApplicationSpecificReasons) {   // Other reasons
    ZW_WatchDogKick();              // Everything is OK so reset WDOG
}

In the example code above we do have a major issue in that if the counters stop counting for some reason, the watchdog will never fire! But that’s easy to check for in ApplicationPoll and if ApplicationPoll itself isn’t running then the WatchDog is no longer being kicked so it will reset.
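One possible way to catch a stalled one-second timer is to count how many polls have gone by since the last tick and refuse to kick if that number gets unreasonably large. The sketch below pulls the kick decision into a plain C function so it can be exercised on a PC. Everything here is hypothetical: the struct, the function name, and especially MAX_POLLS_PER_TICK, which is a made-up bound you would have to tune for your own poll rate. The caller is assumed to maintain last_comms_hours elsewhere.

```c
#include <stdint.h>

#define MUTEX_STUCK_SECONDS  30u     /* same limit as the example above */
#define COMMS_TIMEOUT_HOURS  25u     /* same limit as the example above */
#define MAX_POLLS_PER_TICK   100000u /* hypothetical bound; tune for your poll rate */

typedef struct {
    uint8_t  active_jobs_seconds; /* seconds the mutex has been busy */
    uint8_t  last_comms_hours;    /* hours since last frame; updated by caller */
    uint32_t polls_since_tick;    /* polls seen since the last one-second tick */
} wdog_health;

/* Call once per ApplicationPoll. one_second_tick is nonzero on the poll
 * where the one-second timer elapsed. Returns 1 if it is OK to kick. */
int wdog_should_kick(wdog_health *h, int jobs_active, int one_second_tick)
{
    if (one_second_tick) {
        h->polls_since_tick = 0;
        if (jobs_active && h->active_jobs_seconds < 255)
            h->active_jobs_seconds++;
    } else if (h->polls_since_tick < UINT32_MAX) {
        h->polls_since_tick++;    /* no tick seen on this poll */
    }
    if (!jobs_active)
        h->active_jobs_seconds = 0;

    return h->active_jobs_seconds < MUTEX_STUCK_SECONDS  /* mutex not stuck */
        && h->last_comms_hours < COMMS_TIMEOUT_HOURS     /* recent traffic */
        && h->polls_since_tick < MAX_POLLS_PER_TICK;     /* timer still alive */
}
```

In ApplicationPoll you would then call the watchdog kick only when this function returns 1. Keeping the decision in a pure function also makes it straightforward to unit test each branch.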

Doesn’t Work If Not Tested

The old coding adage (proven totally true by me many many times) goes “If the code hasn’t been tested, it doesn’t work”. Same thing applies to your Watchdog code. So how do you test the watchdog? The first thing to do is to log the number of times the watchdog has triggered. This has to be stored in NVM since RAM will be lost when you reboot. Fortunately ApplicationInitHW is called with the bWakeupReason parameter which lets you know the watchdog fired when equal to ZW_WAKEUP_WATCHDOG. Note that usually ApplicationInitHW just stores the bWakeupReason and later in ApplicationInitSW we check it as the NVM isn’t available in InitHW.

ApplicationInitSW(...) {
...
if (wakeupReason==ZW_WAKEUP_WATCHDOG) { // Increment WDOG counter with max 255
    i=MemoryGetByte((WORD)&EEOFFSET_NumberWatchDogResets_far);
    if (i<255) MemoryPutByte((WORD)&EEOFFSET_NumberWatchDogResets_far, i+1);
}

Use a Configuration Command Class parameter to read or update this value for testing purposes. I also like to put in a small block of code wrapped in #ifdef WATCHDOG_TESTING_ENABLED that upon receiving a BASIC_SET with a value of 0xDE (not a valid value) calls GetResponseBuffer() which locks up the mutex and in 30 seconds the chip should reboot. If not, then you have a bug in the watchdog code! You can test all the branches in your watchdog code with various values of a BASIC_SET.
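Here is a rough sketch of what such a test hook could look like. It is written against a stand-in buffer so it compiles on a PC: stub_GetResponseBuffer() only imitates the single-buffer behavior described earlier and is not the SDK function, and handle_basic_set() is a hypothetical name for wherever your BASIC_SET handling lives.

```c
#include <stdint.h>
#include <stddef.h>

#define WATCHDOG_TESTING_ENABLED      /* remove for production builds */
#define WDOG_TEST_LOCK_MUTEX 0xDE     /* test value from the text above */

/* Stand-in for the framework's single response buffer and its mutex. */
static uint8_t response_buffer[64];
static int     response_buffer_busy;

static uint8_t *stub_GetResponseBuffer(void)
{
    if (response_buffer_busy)
        return NULL;                  /* buffer already in use */
    response_buffer_busy = 1;
    return response_buffer;
}

/* Returns 1 if the value was consumed as a watchdog test command,
 * 0 if normal BASIC_SET handling should proceed. */
int handle_basic_set(uint8_t value)
{
#ifdef WATCHDOG_TESTING_ENABLED
    if (value == WDOG_TEST_LOCK_MUTEX) {
        /* Take the buffer and deliberately never release it, so the
         * mutex looks stuck and the watchdog should eventually fire. */
        (void)stub_GetResponseBuffer();
        return 1;
    }
#endif
    (void)value;                      /* normal handling would go here */
    return 0;
}
```

Other out-of-range values could be mapped to the other branches of your watchdog logic in the same way.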

When to Enable Watchdog

Perhaps a better question is when NOT to enable the watchdog since ALL production builds absolutely must have the watchdog enabled! My recommendation is to disable the watchdog during development. You want the chip to lock up if you have a bug. The watchdog is really good at masking major bugs since things just keep on working. If the device locks up, then you know something is wrong and you need to chase it down. If you power cycle and the device is fine again, IT IS NOT FINE! You have a bug in your code! During production testing I usually turn the watchdog back on but I also have the testing scripts check the watchdog counter and if it increments then the test fails.

Watchdog Best Practices for Z-Wave Developers

  1. Disable Watchdog during development using #defines
  2. Only kick the watchdog when everything is idle
    1. Kicking every ApplicationPoll is INSUFFICIENT
    2. Check for ActiveJobs() being stuck (aka the mutex)
    3. Check other conditions within your product
  3. Check that the RF has received something every X minutes or hours
  4. Have a way to test the Watchdog during development
  5. Store the number of Watchdog resets in NVM and retrieve it via a configuration parameter

Z-Wave 700 Series Announcement

The long awaited 32-bit ARM based Z-Wave transceiver chip has finally been officially announced. The 700 series announcement is on the home page of the Silicon Labs web site so this is a big deal for SiLabs and Z-Wave. The companies joined forces just eight months ago and we already have a major advance in Z-Wave technology.

What’s New?

The 700 series is a major improvement to Z-Wave for both consumers and developers. For consumers, the lower power and longer radio range means more reliable communication and longer battery life. For developers the main advantage is we’ve finally moved beyond the 1980s 8051 8-bit CPU with very limited debug capability into a modern 32-bit ARM CPU with full serial wire debugging capabilities. We can FINALLY single step thru code instead of having to use PRINTF!

700 Series Features:

  • Longer RF range
    • 150% in the US and 200% in the EU due to improved RF sensitivity and increased transmit power in the EU
  • Lower Power = Longer Battery Life
    • Improved semiconductor technology and a faster CPU yields significant battery life improvement and 10 year coin cell operation
  • Lower Cost – Worldwide Support
    • Improved RF blocking means the country specific SAW filter is not needed saving cost and making a single SKU for worldwide operation
    • No external serial memory is required and OTA firmware update is now mandatory
  • Easier Product Development
    • Integrated Development Environment (IDE) with full ARM debug, single step, trace, and energy profiler speeds product development
  • 100% Interoperable and Backwards Compatible
    • The 700 series is fully interoperable with all mesh-networked Z-Wave devices all the way back to the pre-100 series Z-Wave devices

When Can I Get One?

If you already signed up for a free Beta devkit, then one should be on its way in the next few weeks. Devkits have begun shipping but quantities are limited and will take until the end of January before all the Beta samples are shipped. The Beta signup closed back on October 1st so if you missed the deadline you’ll have to wait until later in Q1 to request one from your Silicon Labs salesperson. The official “general availability” (GA) release is the end of Q1 at which time the datasheets, chips, devkits and software will be released at the 7.xx full release version. Datasheets require an NDA until the GA release.

You can get started today using the Simplicity Studio IDE and begin developing code and explore the SDK. The software is free and can be downloaded from here.

My Initial Thoughts

I’ve had a 700 series Beta DevKit for a few weeks now working with our Alpha release partners to get some early feedback. We’ve had some hiccups and the firmware needs more work but the silicon is solid. I have joined my 700 series devkit to my home network and it communicates fine with my very early pre-100 series Z-Wave light switches, attesting to the ongoing commitment Z-Wave has to be fully interoperable and backwards compatible.