send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Discussions about RaZberry - Z-Wave board for Raspberry computer
Mirar
Posts: 113
Joined: 19 Oct 2014 16:54
Location: Stockholm

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by Mirar »

pofs wrote: Are you polling, or have you configured your devices to send unsolicited reports?

Polling can become an issue when it is done on a timer. You shouldn't make a new request before you have received the previous answer.

I'm using both.

I'm not sure why this is an issue, can you explain?

I poll the switches every minute and the sensor every 10 minutes, plus a random 0 to 10 seconds to spread out the requests.
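
For illustration, here is a minimal Python sketch of that schedule: a fixed base interval plus a random 0 to 10 second offset. The function name `next_poll_delay` and its parameters are hypothetical and not part of z-way-server or its API.

```python
import random

def next_poll_delay(base_seconds, max_jitter_seconds=10.0, rng=random):
    """Return the delay before the next poll: base interval plus random jitter.

    The jitter spreads requests out so that devices sharing an interval
    are not all polled at the same instant.
    """
    return base_seconds + rng.uniform(0.0, max_jitter_seconds)

# Intervals taken from the post: 60 s for switches, 600 s for the sensor.
switch_delay = next_poll_delay(60)
sensor_delay = next_poll_delay(600)
```

The jitter only spreads the requests out. It does not check whether the previous request has been answered, which is the concern raised in the quoted post.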

That should be enough time for a request either to time out of the queue (dead node) or to get a reply. But if this is a problem, how do I check through the JSON API whether my request is still in the queue? (And why doesn't z-way-server check this itself, if it's a problem? I thought that was what the 'removed duplicate' log lines were about.)

If the problem is that I can't re-request data from a dead or sleeping node that never replied, I'm even more confused about how I can test for this, or why it's a problem with the queue.

Any explanation on the "0.20" number?
pofs
Posts: 688
Joined: 25 Mar 2011 19:03

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by pofs »

Ah, polling once a minute is not an issue. I thought you were polling much more often, so that you had 10 responses a second.

0.20 sec is the default timeout for receiving an acknowledgement from the Z-Wave chip after a packet is sent. But the packet is never sent, so the countdown never starts.

You should try decreasing the number of reports per second (make devices report every minute or so) and see if the problem persists.
Mirar

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by Mirar »

I removed all my timed polling. No change, it still hangs:

Code:

Legend

n = Send count, W = Wait wakeup, S = Wait security, E = Encapsulated, D = Done, U = Urgent
n	U	W	S	E	D 	Ack	Resp	Cbk	Timeout 	NodeId	Description	Progress	Buffer
0						-	-	-	0.20	5	SwitchBinary Set		5 3 25 1 ff 25
0						-	-	-	0.20	4	SwitchBinary Set		4 3 25 1 ff 25
0						-	-	-	0.20	5	SwitchBinary Set		5 3 25 1 0 25
0						-	-	-	0.20	5	SwitchBinary Get		5 2 25 2 25
0						-	-	-	0.20	4	SwitchBinary Set		4 3 25 1 0 25
0						-	-	-	0.20	4	SwitchBinary Get		4 2 25 2 25
0						-	-	-	0.20	10	SensorBinary Get		a 2 30 2 25
0						-	-	-	0.20	7	SwitchBinary Set		7 3 25 1 ff 25
0						-	-	-	0.20	7	SwitchBinary Get		7 2 25 2 25
0						-	-	-	0.20	11	SwitchBinary Set		b 3 25 1 0 25
0						-	-	-	0.20	11	SwitchBinary Get		b 2 25 2 25
0						-	-	-	0.20	8	SwitchBinary Set		8 3 25 1 ff 25
0						-	-	-	0.20	8	SwitchBinary Get		8 2 25 2 25
0		W				-	-	-	0.20	14	Wakeup Sleep		e 2 84 8 5
0						-	-	-	0.20	6	Meter Get (v2)		6 3 32 1 0 25
0						-	-	-	0.20	6	Meter Get (v2)		6 3 32 1 10 25
0						-	-	-	0.20	16	SwitchBinary Set		10 3 25 1 ff 25
0						-	-	-	0.20	16	SwitchBinary Get		10 2 25 2 25
Queue length: 18

Considering how buggy the Aeon Labs devices are when I try to configure them, I don't think I can try that easily. I would unplug them, but I'm 200 km away from home at the moment.

As I said, the system (with this amount of data) worked perfectly before I upgraded to 2.0.0. No amount of chip or chip-API resetting helps. Data is coming in fine, and other transmitters (switches) work as well. All that's needed is a restart of z-way-server; no hardware reset is required.

So I still suspect it's a bug somewhere in z-way-server. I'd like to help debug it, but I don't know where to start. Maybe I should start recompiling it with debug on, to run gdb with more useful results?

edit: I don't see how to recompile it, there's no Makefile or anything and mostly .a libraries... so I guess I'd need an install bundle with debug.

edit 2: "Soft Reset" goes through the queue and vanishes after 20 seconds, but doesn't change anything:

Code:

1	U				D	+			10.59	0	Soft reset	
pofs

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by pofs »

Can you give us remote access through gdbserver (or ssh, it doesn't matter) when you're back home, so we can connect to your box and try to debug it?
Mirar

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by Mirar »

Yes, I can do that remotely too. Right now it isn't hung.

Edit: It's set up. Password in PM.
Mirar

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by Mirar »

Did you do anything with this? Now it's hung again:

Code:

n	U	W	S	E	D 	Ack	Resp	Cbk	Timeout 	NodeId	Description	Progress	Buffer
0						-	-	-	0.20	7	SwitchBinary Set		7 3 25 1 0 25
0						-	-	-	0.20	7	SwitchBinary Get		7 2 25 2 25
0						-	-	-	0.20	8	SwitchBinary Set		8 3 25 1 ff 25
0						-	-	-	0.20	8	SwitchBinary Get		8 2 25 2 25
0						-	-	-	0.20	5	SwitchBinary Set		5 3 25 1 ff 25
0						-	-	-	0.20	5	SwitchBinary Get		5 2 25 2 25
0		W				-	-	-	0.20	14	Wakeup Sleep		e 2 84 8 5
0						-	-	-	0.20	16	SwitchBinary Set		10 3 25 1 ff 25
0						-	-	-	0.20	16	SwitchBinary Get		10 2 25 2 25
0						-	-	-	0.20	6	Meter Get (v2)		6 3 32 1 0 25
0						-	-	-	0.20	6	Meter Get (v2)		6 3 32 1 10 25
Queue length: 11

I'll leave it like this for 24h or until I get a reply.
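
The queue dumps in this thread are tab-separated, so a pasted dump can be checked mechanically. The Python sketch below groups the jobs that were never sent (send count 0, no Ack) by node id and skips jobs that are waiting for a wakeup. The column names come from the header row in the dump. The function names and the definition of "never sent" are assumptions made for illustration, not anything z-way-server defines.

```python
from collections import defaultdict

# Column order as shown in the header row of the queue dump.
COLUMNS = ["n", "U", "W", "S", "E", "D", "Ack", "Resp", "Cbk",
           "Timeout", "NodeId", "Description", "Progress", "Buffer"]

def parse_queue_dump(text):
    """Parse tab-separated queue rows into dicts, skipping non-job lines."""
    jobs = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("\t")]
        # Job rows have all 14 columns and start with a numeric send count;
        # the header, legend and "Queue length" lines fail this check.
        if len(fields) != len(COLUMNS) or not fields[0].isdigit():
            continue
        jobs.append(dict(zip(COLUMNS, fields)))
    return jobs

def unsent_by_node(jobs):
    """Group never-sent jobs by node id, ignoring jobs waiting for wakeup."""
    stuck = defaultdict(list)
    for job in jobs:
        if job["n"] == "0" and job["Ack"] == "-" and job["W"] != "W":
            stuck[int(job["NodeId"])].append(job["Description"])
    return dict(stuck)
```

This only works if the tabs survive the copy and paste. If they have been converted to spaces, the rows can no longer be split reliably.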
pofs

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by pofs »

I just logged in to your RPi, and I don't see the queue hanging.
It is very much alive: new jobs are being added and old ones removed.
Mirar

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by Mirar »

Yes, I just saw that. I guess it will hang again eventually. I didn't do anything to make it go again, though, so I'm a bit confused.

(As I said, I'm 200 km away at the moment, on vacation with sparse internet, so I don't check that often. But the RaZberry is happily working remotely as well. :) )

I just saw something I've never seen before: one (!) hung node (7). The other nodes I tried get their commands sent. How did this happen?

Code:

n = Send count, W = Wait wakeup, S = Wait security, E = Encapsulated, D = Done, U = Urgent
n	U	W	S	E	D 	Ack	Resp	Cbk	Timeout 	NodeId	Description	Progress	Buffer
0						-	-	-	0.20	7	SwitchBinary Set		7 3 25 1 ff 25
0						-	-	-	0.20	7	SwitchBinary Get		7 2 25 2 25
4					D	+	+	+	3.94	16	SwitchBinary Set	Delivered	10 3 25 1 ff 25
1					D	+	+	+	4.02	16	SwitchBinary Get	Delivered	10 2 25 2 25
1					D	+	+	+	5.10	6	Meter Get (v2)	Delivered	6 3 32 1 0 25
1					D	+	+	+	5.21	6	Meter Get (v2)	Delivered	6 3 32 1 10 25
1					D	+	+	+	14.69	6	Meter Get (v2)	Delivered	6 3 32 1 0 25
1					D	+	+	+	14.78	6	Meter Get (v2)	Delivered	6 3 32 1 10 25
1					D	+	+	+	18.64	16	SwitchBinary Set	Delivered	10 3 25 1 ff 25
1					D	+	+	+	18.74	16	SwitchBinary Get	Delivered	10 2 25 2 25
Queue length: 10

How do I request the queue over http/ZWaveAPI? Then I can build an alarm script.
pofs

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by pofs »

Mirar wrote: How do I request the queue over http/ZWaveAPI? Then I can build an alarm script.

Code:

curl http://127.0.0.1:8083/ZWaveAPI/InspectQueue

We also found and fixed one possible issue with the queue that occurs when local time is adjusted backwards. We will release the fix soon to see if it resolves your issue.
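
As a starting point for such an alarm script, here is a rough Python sketch around the InspectQueue URL above. It assumes, without confirmation in this thread, that the response body is a JSON array with one element per queued job, so only the array length is used. The function names and the "stuck" heuristic are hypothetical.

```python
import json
import urllib.request

# URL taken from the curl command above.
QUEUE_URL = "http://127.0.0.1:8083/ZWaveAPI/InspectQueue"

def fetch_queue_length(url=QUEUE_URL, timeout=5.0):
    """Fetch the queue and return the number of jobs in it.

    ASSUMPTION: the response is a JSON array with one element per job.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return len(json.loads(resp.read().decode("utf-8")))

def queue_looks_stuck(lengths, min_samples=5):
    """Heuristic: the last `min_samples` readings are non-zero and never shrink."""
    recent = lengths[-min_samples:]
    if len(recent) < min_samples or min(recent) == 0:
        return False
    # Every reading is at least as large as the one before it.
    return all(b >= a for a, b in zip(recent, recent[1:]))
```

One way to use it is to call `fetch_queue_length()` once a minute, for example from cron or a simple loop, append each result to a list, and raise the alarm when `queue_looks_stuck()` returns true. A busy but healthy queue could trigger this too, so the sample count probably needs tuning.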
Mirar

Re: send queue hanging (z-way-server v2.0.0, v2.0.1rc7)

Post by Mirar »

The thing I saw before continues. Now two nodes are hung:

Code:

Legend

n = Send count, W = Wait wakeup, S = Wait security, E = Encapsulated, D = Done, U = Urgent
n	U	W	S	E	D 	Ack	Resp	Cbk	Timeout 	NodeId	Description	Progress	Buffer
0						-	-	-	0.20	16	SwitchBinary Set		10 3 25 1 ff 25
0						-	-	-	0.20	16	SwitchBinary Get		10 2 25 2 25
0						-	-	-	0.20	6	Meter Get (v2)		6 3 32 1 0 25
0						-	-	-	0.20	6	Meter Get (v2)		6 3 32 1 10 25
Queue length: 4

edit: The queue is definitely growing. There are 2 more entries in it now, SwitchBinary Set/Get for node 6.
Feel free to log in and check. None of the nodes is dangerous to turn on or off (they're just lights).

edit 2: and now it's clear again, without restarting.
pofs wrote:
Mirar wrote: How do I request the queue over http/ZWaveAPI? Then I can build an alarm script.

Code:

curl http://127.0.0.1:8083/ZWaveAPI/InspectQueue

We also found and fixed one possible issue with the queue that occurs when local time is adjusted backwards. We will release the fix soon to see if it resolves your issue.

Thanks. Interesting. I might have that problem: I started ntpd against my local network's NTP master server so that all my computers stay in sync... (And feel free to alpha-test it on my RaZberry if you want.)
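
The thread doesn't say how z-way-server computes its timeouts, so the following Python sketch only illustrates the general kind of bug a backwards time adjustment can cause. It is not the server's code. If a deadline is stored as a wall-clock time and the clock is then stepped backwards, the deadline ends up far in the future and the job appears never to time out. A monotonic clock such as `time.monotonic()` avoids this. `Deadline` and `FakeWallClock` are hypothetical names.

```python
import time

class Deadline:
    """A timeout that expires `seconds` after creation, per the given clock."""

    def __init__(self, seconds, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + seconds

    def expired(self):
        return self._clock() >= self._expires_at

class FakeWallClock:
    """Simulated wall clock that can be stepped backwards, as a time sync may do."""

    def __init__(self, now):
        self.now = now

    def __call__(self):
        return self.now

# A 0.2 s deadline measured against a (simulated) wall clock.
wall = FakeWallClock(1000.0)
deadline = Deadline(0.2, clock=wall)

wall.now = 400.0   # clock stepped back by 10 minutes
wall.now += 1.0    # one real second passes
stuck = not deadline.expired()  # still not expired, although 1 s > 0.2 s
```

With the default `clock=time.monotonic`, the same `Deadline` is unaffected by wall-clock adjustments, because the monotonic clock never goes backwards.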