send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Discussions about RaZberry - Z-Wave board for Raspberry computer
Mirar
Posts: 113
Joined: 19 Oct 2014 16:54
Location: Stockholm

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by Mirar »

Any news on this?

edit: since the answer is "probably not", I made a shell script that runs this loop ever 20 seconds:

Code: Select all

 wget -q -O - http://127.0.0.1:8083/ZWaveAPI/InspectQueue | grep -q '0\.2000.*0\.2000' && ( echo `date` "queue hanged"; /usr/sbin/service z-way-server restart)
But I really feel I shouldn't have to do that.
Mirar
Posts: 113
Joined: 19 Oct 2014 16:54
Location: Stockholm

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by Mirar »

And now it seems to have died more completely:

Code: Select all

[2015-01-11 20:24:59.426] [I] [core] ZWay Automation started
[2015-01-11 20:24:59.452] [I] [core] Terminating Z-Wave binding
[2015-01-11 20:25:04.459] [I] [core] Restarting Z-Wave binding
[2015-01-11 20:25:04.490] [I] [core] Notification: critical (z-wave): Can not start Z-Wave binding: Error: Binding with name 'zway' already exists
[2015-01-11 20:25:10.649] [E] [zway] SaveData will not save data since it wasn't loaded. This is to prevent data loss.
After a reboot:

Code: Select all

[2015-01-11 20:29:45.525] [I] [core] ZWay Automation started
[2015-01-11 20:29:45.594] [I] [core] Terminating Z-Wave binding
[2015-01-11 20:29:50.642] [I] [core] Restarting Z-Wave binding
[2015-01-11 20:29:50.700] [I] [core] Notification: critical (z-wave): Can not start Z-Wave binding: Error: Binding with name 'zway' already exists
[2015-01-11 20:29:57.112] [E] [zway] SaveData will not save data since it wasn't loaded. This is to prevent data loss.
UI just says "error handling data from server".

More detailed end of log:

Code: Select all

[2015-01-11 21:00:58.048] [D] [zway]   [49] "WatchDogStart"
[2015-01-11 21:00:58.048] [D] [zway]   [50] "WatchDogStop"
[2015-01-11 21:00:58.048] [D] [zway]   [51] "ZMEFreqChange"
[2015-01-11 21:00:58.048] [D] [zway]   [52] "ZMERestore"
[2015-01-11 21:00:58.048] [D] [zway]   [53] "ZMEBootloaderFlash"
[2015-01-11 21:00:58.048] [D] [zway]   [54] "ZMECapabilities"
[2015-01-11 21:00:58.049] [D] [zway] Opened device: /dev/ttyAMA0
[2015-01-11 21:00:58.050] [D] [zway] Worker thread successfully created
[2015-01-11 21:00:58.061] [D] [zway] Worker thread entry point
[2015-01-11 21:00:58.066] [I] [zway] Adding job: Get controller info and supported function classes
[2015-01-11 21:00:58.072] [D] [zway] SENDING: ( 01 03 00 07 FB )
[2015-01-11 21:00:58.073] [D] [zway] RECEIVED CAN
[2015-01-11 21:00:58.083] [D] [zway] SENDING: ( 01 03 00 07 FB )
[2015-01-11 21:00:58.085] [D] [zway] RECEIVED CAN
[2015-01-11 21:00:58.095] [D] [zway] SENDING: ( 01 03 00 07 FB )
[2015-01-11 21:00:58.096] [D] [zway] RECEIVED CAN
[2015-01-11 21:00:58.107] [D] [zway] SENDING: ( 01 03 00 07 FB )
[2015-01-11 21:00:58.108] [D] [zway] RECEIVED CAN
[2015-01-11 21:00:58.118] [D] [zway] SENDING: ( 01 03 00 07 FB )
[2015-01-11 21:00:58.120] [D] [zway] RECEIVED CAN
[2015-01-11 21:00:58.130] [D] [zway] SENDING: ( 01 03 00 07 FB )
[2015-01-11 21:00:58.131] [D] [zway] RECEIVED CAN
[2015-01-11 21:00:58.137] [D] [zway] RECEIVED: ( 01 0C 00 04 00 02 06 31 05 03 0A 00 04 CA )
[2015-01-11 21:00:58.138] [D] [zway] SENT ACK
[2015-01-11 21:00:58.138] [I] [zway] Unhandled request for function class 0x04
[2015-01-11 21:00:58.138] [D] [zway] SENDING: ( 01 03 00 07 FB )
[2015-01-11 21:00:58.140] [D] [zway] RECEIVED ACK
[2015-01-11 21:00:58.143] [D] [zway] RECEIVED: ( 01 2B 01 07 01 61 06 06 06 06 06 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 B4 )
[2015-01-11 21:00:58.143] [D] [zway] SENT ACK
[2015-01-11 21:00:58.144] [D] [zway] SETDATA controller.data.APIVersion = "01.97"
[2015-01-11 21:00:58.144] [D] [zway] SETDATA controller.data.manufacturerId = 1542 (0x00000606)
[2015-01-11 21:00:58.144] [D] [zway] SETDATA controller.data.manufacturerProductType = 1542 (0x00000606)
[2015-01-11 21:00:58.144] [D] [zway] SETDATA controller.data.manufacturerProductId = 1536 (0x00000600)
[2015-01-11 21:00:58.150] [D] [zway] SETDATA controller.data.vendor = ""
[2015-01-11 21:00:58.150] [D] [zway] Supported Function Classes:
[2015-01-11 21:00:58.150] [D] [zway] ---------------------------
[2015-01-11 21:00:58.150] [D] [zway] SETDATA controller.data.capabilities = int[0]
[2015-01-11 21:00:58.150] [D] [zway] ---------------------------
[2015-01-11 21:00:58.150] [D] [zway] Initialization done
[2015-01-11 21:00:58.151] [D] [zway] Job 0x07 (Get controller info and supported function classes): success
[2015-01-11 21:00:58.151] [C] [zway] Get Serial API Capabilities returned zero.
[2015-01-11 21:00:58.151] [I] [zway] Removing job: Get controller info and supported function classes
[2015-01-11 21:00:58.161] [D] [zway] Worker thread exit point
[2015-01-11 21:00:58.162] [D] [zway] Worker thread successfully finished
[2015-01-11 21:00:58.216] [I] [core] Notification: critical (z-wave): Can not start Z-Wave binding: Error: Bad data - communication failed
[2015-01-11 21:00:58.218] [I] [core] Loading module ScheduledScene from modules/ScheduledScene
[2015-01-11 21:00:58.221] [I] [core] Executing script: /*** ScheduledScene Z-Way HA module ******************************************* ...
[2015-01-11 21:00:58.224] [I] [core] Loading module DelayedScene from modules/DelayedScene
[2015-01-11 21:00:58.227] [I] [core] Executing script: /*** DelayedScene Z-Way HA module ******************************************* ...
[2015-01-11 21:00:58.230] [I] [core] Loading module HomeKitGate from modules/HomeKitGate
[2015-01-11 21:00:58.235] [I] [core] Executing script: /* ...
[2015-01-11 21:00:58.237] [I] [core] Loading module BindDevices from modules/BindDevices
[2015-01-11 21:00:58.241] [I] [core] Executing script: /*** BindDevices Z-Way HA module ******************************************* ...
[2015-01-11 21:00:58.244] [I] [core] Loading module SensorsPollingLogging from modules/SensorsPollingLogging
[2015-01-11 21:00:58.247] [I] [core] Executing script: /*** SensorsPollingLogging Z-Way HA module ******************************************* ...
[2015-01-11 21:00:58.250] [I] [core] Loading module ImportRemoteHA from modules/ImportRemoteHA
[2015-01-11 21:00:58.256] [I] [core] Executing script: /*** ImportRemoteHA Z-Way HA module ******************************************* ...
[2015-01-11 21:00:58.259] [I] [core] Loading module LogicalRules from modules/LogicalRules
[2015-01-11 21:00:58.266] [I] [core] Executing script: /*** LogicalRules Z-Way HA module ******************************************* ...
[2015-01-11 21:00:58.269] [I] [core] Loading module OpenRemoteHelpers from modules/OpenRemoteHelpers
[2015-01-11 21:00:58.274] [I] [core] Executing script: /* RaZOR ...
[2015-01-11 21:00:58.277] [I] [core] Starting automation...
[2015-01-11 21:00:58.372] [I] [core] Starting storage...
[2015-01-11 21:00:58.376] [I] [core] ZWay Automation started
[2015-01-11 21:00:58.390] [I] [core] Terminating Z-Wave binding
[2015-01-11 21:01:03.410] [I] [core] Restarting Z-Wave binding
[2015-01-11 21:01:03.431] [I] [core] Notification: critical (z-wave): Can not start Z-Wave binding: Error: Binding with name 'zway' already exists
[2015-01-11 21:01:09.589] [E] [zway] SaveData will not save data since it wasn't loaded. This is to prevent data loss.
Downgrading to 1.7.1 causes z-way-server to exit silently; 2.0.0 is no change in behaviour. I presume

Code: Select all

[2015-01-11 21:00:58.130] [D] [zway] SENDING: ( 01 03 00 07 FB )
[2015-01-11 21:00:58.131] [D] [zway] RECEIVED CAN
[2015-01-11 21:00:58.137] [D] [zway] RECEIVED: ( 01 0C 00 04 00 02 06 31 05 03 0A 00 04 CA )
[2015-01-11 21:00:58.138] [D] [zway] SENT ACK
means the hardware is alive, but

Code: Select all

[2015-01-11 21:00:58.216] [I] [core] Notification: critical (z-wave): Can not start Z-Wave binding: Error: Bad data - communication failed
isn't good.

edit: /dev/ttyAMA0 works fine.

Is there any documentation available on the protocol used on /dev/ttyAMA0?

edit: magic echo commmands did not help.
I cannot do anything with the power of the pi, my arms are not 2000km long.
pofs
Posts: 688
Joined: 25 Mar 2011 19:03

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by pofs »

You're probably lucky that magic echo commands didn't work, because the first one is resetting the controller to factory defaults (with all devices lost).

You need only the second one:

Code: Select all

echo -ne "\x01\x03\x00\x08\xf4" > /dev/ttyAMA0
You should completely stop z-way-server before executing it. Then you'll probably need to execute this command for a few times, because your devices are flooding controller with reports which require ACK, and you're sending command instead of ACKing the incoming packet.

You may even try an advanced version of the command (which sends ACK to the possible incoming packet before restarting chip):

Code: Select all

echo -ne "\x06\x01\x03\x00\x08\xf4" > /dev/ttyAMA0
Mirar wrote:Is there any documentation available on the protocol used on /dev/ttyAMA0?
This is under NDA. But there's also OpenZWave, and they probably have some kind of description available.
Mirar
Posts: 113
Joined: 19 Oct 2014 16:54
Location: Stockholm

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by Mirar »

pofs wrote:You may even try an advanced version of the command (which sends ACK to the possible incoming packet before restarting chip):

Code: Select all

echo -ne "\x06\x01\x03\x00\x08\xf4" > /dev/ttyAMA0
Thanks. That worked. But yes, the network is now reset... Restoring from backup doesn't seem to do much. Is it supposed to work? (It listed all devices, but they are now "dead".)
pofs wrote:This is under NDA. But there's also OpenZWave, and they probably have some kind of description available.
Ah, right.

(edit:)

What is the easiest way to get my network up and running again now? Presumably I have to do that when I come back home in a few days. Complete reset and exclude/include on all devices? Can I just include them again? Should backup restoration actually work?
pofs
Posts: 688
Joined: 25 Mar 2011 19:03

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by pofs »

If all the devices appeared after restore (with same identifiers and homeId), then restore has actually worked.
Not sure why they appear as dead. Try to refresh their status (there's either a button in expert UI for that, or just run

Code: Select all

http://ip:8083/ZWaveAPI/Run/IsFailedNode(x)
where x is device id)
Mirar
Posts: 113
Joined: 19 Oct 2014 16:54
Location: Stockholm

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by Mirar »

pofs wrote:If all the devices appeared after restore (with same identifiers and homeId), then restore has actually worked.
Not sure why they appear as dead. Try to refresh their status (there's either a button in expert UI for that, or just run

Code: Select all

http://ip:8083/ZWaveAPI/Run/IsFailedNode(x)
where x is device id)
The names of the devices were not restored and interview fails.

It seems like homeId wasn't restored from the backup:

From the backup file:
<data name="homeId" invalidateTime="1416423472" updateTime="1416423474" type="int" value="-387486128"/>
From UI:
Home Id: 0xfbde48d8

Shouldn't it be 0xe8e76e50? How do I set this?

edit: After another full reset in the API followed by backup restoration the network is alive. Yay! I have no idea why it did a half restore before?

(Backup restoration doesn't seem to work in my Firefox. It never asks for a file. It works in Chrome, though.)
Mirar
Posts: 113
Joined: 19 Oct 2014 16:54
Location: Stockholm

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by Mirar »

Ok, new phenomena again:

Code: Select all

View Job Queue
Legend

n = Send count, W = Wait wakeup, S = Wait security, E = Encapsulated, D = Done, U = Urgent
n	U	W	S	E	D 	Ack	Resp	Cbk	Timeout 	NodeId	Description	Progress	Buffer
0		W				-		-	0.20	13	Request node neighbours update		d
1	U	W				-	-	-	0.20	13	NoOperation	Not delivered to recipient	d 1 0 5
0		W				-	-	-	0.20	13	SensorMultilevel V1-4 Get		d 6 56 1 31 4 1d a2 5
3					D	+	+	-	0.02	16	SwitchBinary Set	Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Removing job due to too much retransmitions	10 3 25 1 ff 25
3					D	+	+	-	0.05	16	SwitchBinary Get	Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Removing job due to too much retransmitions	10 2 25 2 25
3					D	+	+	-	9.00	6	Meter Get (v2)	Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Removing job due to too much retransmitions	6 3 32 1 0 25
3					D	+	+	-	9.02	6	Meter Get (v2)	Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Removing job due to too much retransmitions	6 3 32 1 10 25
3					D	+	+	-	13.90	11	SwitchBinary Get	Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Removing job due to too much retransmitions	b 2 25 2 25
3					D	+	+	-	14.76	16	SwitchBinary Set	Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Removing job due to too much retransmitions	10 3 25 1 ff 25
3					D	+	+	-	14.79	16	SwitchBinary Get	Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Removing job due to too much retransmitions	10 2 25 2 25
3					D	+	+	-	15.67	6	Meter Get (v2)	Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Removing job due to too much retransmitions	6 3 32 1 0 25
3					D	+	+	-	15.70	6	Meter Get (v2)	Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Removing job due to too much retransmitions	6 3 32 1 10 25
3					D	+	+	-	18.65	6	Meter Get (v2)	Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Removing job due to too much retransmitions	6 3 32 1 0 25
3					D	+	+	-	18.68	6	Meter Get (v2)	Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Could not be delivered to Z-Wave stack
Removing job due to too much retransmitions	6 3 32 1 10 25
Queue length: 14
Is the hardware dying?

(Fixed, again, with the reset string while z-way-server was down.)
Last edited by Mirar on 12 Jan 2015 19:32, edited 1 time in total.
pz1
Posts: 2053
Joined: 08 Apr 2012 13:44

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by pz1 »

Maybe you have been sending to many requests in a short time to the controller? Check the routing table in the Expert User interface.
Since 29-12-2016 I am no longer a moderator for this forum
Mirar
Posts: 113
Joined: 19 Oct 2014 16:54
Location: Stockholm

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by Mirar »

How does the routing table show me any info related to this? It looks perfectly normal.

A few dozen packets a second shouldn't be a problem for the sigma controller. Was there a firmware update recently? All these problems came when I upgraded to 2.0.0.

The controller is ZW0301. Is the new one better?
User avatar
PoltoS
Posts: 7649
Joined: 26 Jan 2011 19:36

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by PoltoS »

This looks more like a dying h/w.... ;(

Or your network is too overloaded with unsolicited reports. First try to tun it switching all most of your sensors.
Post Reply