Posts: 1
Registered: ‎08-01-2017

assert issue in packet-buffer.c, when sending reports to a non-responding device

I am developing a switch that acts as a router in a Zigbee network. The switch reboots if I power off the coordinator. The reboot is caused by an assert in packet-buffer.c, which is part of the stack as packaged by Silicon Labs. The issue is easy to reproduce: after the switch has joined the network, power off the coordinator so that the switch sends reports to a non-responding coordinator. Our switch needs to work properly whether or not the coordinator is present. How can I solve this problem, or how can I work around it?
Hero YK
Posts: 183
Registered: ‎02-13-2017

Re: assert issue in packet-buffer.c, when sending reports to a non-responding device

When I tested light/switch on 5.9 and 5.10, I did not see this issue. Which EmberZNet version are you using?

YK Chen
Posts: 75
Registered: ‎04-26-2016

Re: assert issue in packet-buffer.c, when sending reports to a non-responding device

Would you please let us know the details of the issue?

 

1) What's the version of the stack?

2) Please attach the crash log from the console.

3) Please attach the *.map file from your project (for the crashed device).

4) It would be better if you could give us the detailed steps to reproduce the issue.

 

Best Regards,

Lei

 

Posts: 4
Registered: ‎03-09-2017

Re: assert issue in packet-buffer.c, when sending reports to a non-responding device

I wanted to note that I have this exact same issue very frequently.  Here's the log I see:

 

ERROR: tx 66, Profile: HA (0x0104), Cluster: 0x0403, 8 bytes, ZCL Global Cmd ID: 10
ERROR: tx 66, Profile: HA (0x0104), Cluster: 0x0403, 8 bytes, ZCL Global Cmd ID: 10

[ASSERT:packet-buffer.cEnter app

, MSP = 30433435, PSP = 200054C0
PC = 00005140, xPSR = 41000000, MSP used = 000004F0, PSP used = 00000000
CSTACK bottom = 20004B60, ICSR = 00000806, SHCSR = 00070008, INT_ACTIVE0 = 00000000
INT_ACTIVE1 = 00000000, CFSR = 00010000, HFSR = 00000000, DFSR = 00000000
MMAR/BFAR = E000ED34, AFSR = 00000000, Ret0 = 00006B43, Ret1 = 0000516B
Ret2 = 0000B14F, Ret3 = 0000B187, Ret4 = 0001A3CD, Ret5 = 0001A471
Dat0 = 0000B448, Dat1 = 000001E3init pass
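
One note for reading the dump above. Assuming the Cortex-M4 core in the EFR32MG1 follows the standard ARMv7-M register layout, CFSR packs three fault status fields into one word, and the value 00010000 has only bit 16 set, which is the UNDEFINSTR flag of the UsageFault field. Whether the assert deliberately traps that way is not confirmed here. The sketch below only shows how the word splits up; the macro and function names are illustrative and do not come from the EmberZNet HAL.

```c
#include <stdint.h>

/* Fields of the ARMv7-M Configurable Fault Status Register (CFSR). */
#define CFSR_MMFSR(cfsr) ((uint8_t)((cfsr) & 0xFFu))             /* MemManage,  bits 7:0   */
#define CFSR_BFSR(cfsr)  ((uint8_t)(((cfsr) >> 8) & 0xFFu))      /* BusFault,   bits 15:8  */
#define CFSR_UFSR(cfsr)  ((uint16_t)(((cfsr) >> 16) & 0xFFFFu))  /* UsageFault, bits 31:16 */

/* UsageFault flags, numbered relative to the UFSR field. */
#define UFSR_UNDEFINSTR 0x0001u
#define UFSR_INVSTATE   0x0002u
#define UFSR_INVPC      0x0004u
#define UFSR_NOCP       0x0008u
#define UFSR_UNALIGNED  0x0100u
#define UFSR_DIVBYZERO  0x0200u

/* Illustrative helper: name of the lowest UsageFault flag set in a CFSR value. */
const char *cfsr_usage_fault_name(uint32_t cfsr)
{
    uint16_t ufsr = CFSR_UFSR(cfsr);

    if (ufsr & UFSR_UNDEFINSTR) return "UNDEFINSTR";
    if (ufsr & UFSR_INVSTATE)   return "INVSTATE";
    if (ufsr & UFSR_INVPC)      return "INVPC";
    if (ufsr & UFSR_NOCP)       return "NOCP";
    if (ufsr & UFSR_UNALIGNED)  return "UNALIGNED";
    if (ufsr & UFSR_DIVBYZERO)  return "DIVBYZERO";
    return "none";
}
```

Calling cfsr_usage_fault_name(0x00010000u) with the CFSR value from the dump yields "UNDEFINSTR", and both the MemManage and BusFault fields are zero.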

 

The "ERROR: tx 66" lines are the key.  This device is an end device (EFR32MG1B732F256GM32) running EmberZNet 5.9.2.0 and sending very frequent (~2s) reports to a Gateway device, but the Gateway device is not responding.  The Gateway is another EFR32MG1B732F256GM32 operating as an NCP attached to a Pi running the Linux Gateway reference project (basically like the Gateway reference examples, but with an EFR32 instead of the CEL stick).

 

My assumption is that we are overflowing the packet buffer because we are pumping in new reports faster than the stack is clearing out the ones which are not getting a response.
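
If that assumption holds, one possible application-side workaround is to stop handing new reports to the stack while earlier ones are still unresolved, and to back off after delivery failures. The following is only a sketch of that idea: every name in it (report_throttle_t, throttle_try_send, throttle_on_result, the 1000 ms base delay) is hypothetical and not part of EmberZNet, and it has not been tested against the stack.

```c
#include <stdbool.h>
#include <stdint.h>

#define THROTTLE_BASE_BACKOFF_MS 1000u  /* assumed first pause after a failure */

/* Hypothetical limiter owned by the application, not by the stack. */
typedef struct {
    uint8_t  in_flight;        /* reports submitted with no send result yet */
    uint8_t  max_in_flight;    /* cap on unresolved reports */
    uint32_t backoff_ms;       /* current pause; 0 means not backing off */
    uint32_t max_backoff_ms;   /* upper bound for the pause */
    uint32_t last_failure_ms;  /* time of the most recent failed delivery */
} report_throttle_t;

void throttle_init(report_throttle_t *t, uint8_t max_in_flight,
                   uint32_t max_backoff_ms)
{
    t->in_flight = 0;
    t->max_in_flight = max_in_flight;
    t->backoff_ms = 0;
    t->max_backoff_ms = max_backoff_ms;
    t->last_failure_ms = 0;
}

/* Call before submitting a report. Returns false if it should be skipped. */
bool throttle_try_send(report_throttle_t *t, uint32_t now_ms)
{
    if (t->in_flight >= t->max_in_flight) {
        return false;  /* too many reports still waiting for a result */
    }
    /* Unsigned subtraction keeps the comparison valid across tick wrap. */
    if (t->backoff_ms != 0 &&
        (uint32_t)(now_ms - t->last_failure_ms) < t->backoff_ms) {
        return false;  /* still inside the pause after a failure */
    }
    t->in_flight++;
    return true;
}

/* Call wherever the application learns the result of a submitted report. */
void throttle_on_result(report_throttle_t *t, bool delivered, uint32_t now_ms)
{
    if (t->in_flight > 0) {
        t->in_flight--;
    }
    if (delivered) {
        t->backoff_ms = 0;  /* peer is reachable again, resume normal rate */
        return;
    }
    /* Double the pause on each failure, clamped to the configured maximum. */
    if (t->backoff_ms == 0) {
        t->backoff_ms = THROTTLE_BASE_BACKOFF_MS;
    } else if (t->backoff_ms > t->max_backoff_ms / 2u) {
        t->backoff_ms = t->max_backoff_ms;
    } else {
        t->backoff_ms *= 2u;
    }
    if (t->backoff_ms > t->max_backoff_ms) {
        t->backoff_ms = t->max_backoff_ms;
    }
    t->last_failure_ms = now_ms;
}
```

The intent is that throttle_try_send gates each report and throttle_on_result runs when the send result (success or a failure such as the 66 status in the log) comes back. Skipped reports are simply dropped, which seems acceptable for periodic attribute reports because the next one carries fresher data.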

 

In summary, I think the following steps would reproduce the issue:

 

1. Device A requests very frequent reporting of some attribute or attributes from Device B.  In our case it's several 48-bit integers.

2. Device A goes offline or otherwise quits responding.

3. Device B gets endless 66 errors and eventually crashes with a packet-buffer.c assert.  

 

I could provide a .map file in a support ticket.

 

As a side note, any ideas why the Gateway would quit responding?  All I did was stop and restart the siliconlabsgateway service.  I can tell from our debug that the Gateway is receiving the reports sent by the end device, and tries to send acknowledgements, but the acknowledgements are not received by the end device (I end up seeing 66 errors on both sides).  

 

Ironically, the packet-buffer assert actually "fixes" the problem...as soon as the end device crashes, reboots and rejoins the problem goes away.  I think there must be something off about how the Gateway's network is set up when the gateway service starts that prevents it from reaching the end device it is receiving reports from, but I don't know what it could be.

 

-Zac