Z-way-server occasionally goes down

GokMasE
Posts: 59
Joined: 13 Mar 2016 01:04
Location: Sweden

Re: Z-way-server occasionally goes down

Post by GokMasE »

I *think* that if you have outgoing HTTP requests, seeing new threads created and exited repeatedly is to be expected.

For quite some time I have been logging with gdb due to suspected instabilities when using the HTTPGet module to forward status/level changes of devices. The gdb output you are seeing looks very much like mine did.
JohannesF
Posts: 36
Joined: 04 Jan 2021 13:20

Re: Z-way-server occasionally goes down

Post by JohannesF »

Thanks for sharing your experience! In fact, I do have quite a number of Tasmota devices. Mostly plugs and temp sensors.

Did you have a chance to look at the extract of the log-file captured around a recent crash? I had posted it in the initial question.
You can easily navigate to the moment of the crash (03:30) as I inserted a blank line.

I was puzzled to see so many error messages but couldn't identify a specific root cause of the crash.

THX and regards!
lanbrown
Posts: 279
Joined: 01 Jun 2021 08:06

Re: Z-way-server occasionally goes down

Post by lanbrown »

I can accelerate the issue by changing the polling interval for the remote Z-Way system from 5 seconds to the "recommended" default setting of 1 second. It is the GUI that becomes inaccessible, but it does appear that other parts of the z-way-server process are still running. The remote Z-Way system application is from Z-Wave.me. So we have at least three known people whom this appears to affect.

It would be nice if the remote Z-Way system application used WebSockets rather than polling. It would reduce the load on the HTTP side since it wouldn't be polling for changes, it would just wait to send or receive them as they occur. It is doubtful that there are sensor or other Z-Wave device state changes every second or even every five seconds. WebSockets is already implemented for the Home Assistant integration, so changing the remote Z-Way server app to use it wouldn't be a total rewrite.
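To put rough numbers on the polling load, here is a minimal Python sketch. It only does the arithmetic for the two intervals mentioned above (1 second and 5 seconds). The figure of 500 state changes per day is a made-up illustration, not a measurement, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def polls_per_day(interval_seconds):
    """Number of HTTP polls a client issues per day at a fixed interval."""
    return SECONDS_PER_DAY // interval_seconds


def empty_polls_per_day(interval_seconds, changes_per_day):
    """Polls that find nothing new, assuming each change is picked up by one poll."""
    return max(polls_per_day(interval_seconds) - changes_per_day, 0)


# Hypothetical workload: 500 device state changes per day (assumption).
ASSUMED_CHANGES_PER_DAY = 500

for interval in (1, 5):
    total = polls_per_day(interval)
    empty = empty_polls_per_day(interval, ASSUMED_CHANGES_PER_DAY)
    print(f"{interval}s interval: {total} polls/day, {empty} with nothing new")
```

With a push-based transport such as WebSockets, the number of messages would instead track the number of actual changes, which is the point made above.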
JohannesF, if you press Ctrl-C in the terminal where gdb is running, you will get back to the gdb prompt. If you then issue the "bt" command, it will print a backtrace. I would bet that your output is similar to mine.
lanbrown
Posts: 279
Joined: 01 Jun 2021 08:06

Re: Z-way-server occasionally goes down

Post by lanbrown »

I still haven't heard back from PoltoS. I have emailed support as well. The title of the email I sent was the same as this thread and even included a link. It might make sense for you to do the same.
JohannesF
Posts: 36
Joined: 04 Jan 2021 13:20

Re: Z-way-server occasionally goes down

Post by JohannesF »

Good idea - done ;-)
lanbrown
Posts: 279
Joined: 01 Jun 2021 08:06

Re: Z-way-server occasionally goes down

Post by lanbrown »

Thanks. I don't know how much PoltoS does on the development side, so going directly to the support email might help get this issue resolved once and for all. It could be as simple as some config file changes for the HTTP server. I believe that installing Z-Way also installs Mongoose, a lightweight HTTP server. I've used it as a standalone app, and maybe some config settings can fix the issue. That would be nice, since they could just provide the updated file or say what to change, and we would not have to wait for a release.

At this point I'm wondering if a nightly restart for z-way-server via cron wouldn't be an interim solution. The drawback I can see is a short duration where either feeds in or out don't work. You have your system talking via HTTP to other devices. So in your case there would be a disruption. In my case it would be Home Assistant not being able to communicate with it. A restart shouldn't take too long but there is always that chance of loss of communication right when something needs to be done.
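As a variation on the nightly restart, a small watchdog run from cron could restart the service only when the web UI stops answering, which would avoid a restart on nights when nothing is wrong. The following is a minimal Python sketch, not a tested recipe. The URL (port 8083 on localhost) and the restart command are assumptions about a typical installation and need to be adapted, and the function names are hypothetical.

```python
import http.client
import subprocess
import urllib.error
import urllib.request

# Assumptions, adjust for your own system:
ZWAY_UI_URL = "http://127.0.0.1:8083/"                  # assumed UI address
RESTART_CMD = ["systemctl", "restart", "z-way-server"]  # assumed service name


def ui_is_responsive(url, timeout=10.0):
    """Return True if the HTTP server answers at all within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied with an error status, so it is still alive.
        return True
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, reset, or a malformed reply.
        return False


def check_and_restart(url, restart_cmd, run=subprocess.run):
    """Restart the service if the UI does not answer. Returns True if restarted."""
    if ui_is_responsive(url):
        return False
    run(restart_cmd, check=True)
    return True
```

Calling check_and_restart(ZWAY_UI_URL, RESTART_CMD) from a script scheduled by cron every few minutes would be one way to use it. The same short interruption described above still applies whenever a restart actually happens.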
PoltoS
Posts: 7565
Joined: 26 Jan 2011 19:36

Re: Z-way-server occasionally goes down

Post by PoltoS »

Hi! Thank you all for the investigations you did, especially GokMasE, who has been feeding us a lot of debug info for a year. Finally, it looks like his last observation led us to the root of the bug. We have reproduced it in our lab and fixed it (we hope).

TL;DR: we have made a new nightly build to test whether our fix solves the problem. Please install z-way-4.0.2-4-ge4433cb0-lws16_armhf.deb and let it run for a while in the most stressful environment.

I'll try to explain the fix. In short, the HTTP module (modhttp.so) is a cURL wrapper for the Google V8 JS engine that Z-Way uses. The problem was that the thread releasing pointers to JS callback functions did so without holding a JS lock. This was harmless until HTTP responses started arriving with a significant delay, which led to many objects being allocated and then freed. V8 has an internal garbage collector (GC). When an object is released immediately, the GC is not involved, but when objects are allocated en masse and then freed, the GC runs regularly. So the GC could be running while the HTTP thread was releasing pointers to callbacks. In other words, a single missing lock.

The key to reproducing it reliably was many concurrent requests (generated by HTTPGet, Tasmota or Z-Way-to-Z-Way binding) AND a slow response from the remote side.
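For readers who want to see the general pattern, here is a loose Python analogy of the fix described above. It is not the modhttp.so code and does not use the V8 API. It is a toy registry where an "HTTP worker" thread releases callbacks while a "collector" thread walks the same table, and both take the same lock. All class and function names are hypothetical.

```python
import threading


class CallbackRegistry:
    """Toy stand-in for a table of callbacks shared between two threads."""

    def __init__(self):
        self._lock = threading.Lock()  # plays the role of the JS lock
        self._callbacks = {}
        self._next_id = 0

    def register(self, fn):
        with self._lock:
            handle = self._next_id
            self._next_id += 1
            self._callbacks[handle] = fn
            return handle

    def release(self, handle):
        # HTTP worker side: drop the callback once the response has arrived.
        with self._lock:
            self._callbacks.pop(handle, None)

    def sweep(self):
        # Collector side: walk every live entry. Without the shared lock,
        # a concurrent release() could modify the table mid-walk.
        with self._lock:
            return sum(1 for _ in self._callbacks.values())

    def __len__(self):
        with self._lock:
            return len(self._callbacks)


def run_demo(n_requests=2000):
    registry = CallbackRegistry()
    handles = [registry.register(lambda: None) for _ in range(n_requests)]
    errors = []

    def http_worker():
        try:
            for handle in handles:
                registry.release(handle)
        except Exception as exc:
            errors.append(exc)

    def collector():
        try:
            while len(registry) > 0:
                registry.sweep()
        except Exception as exc:
            errors.append(exc)

    threads = [threading.Thread(target=http_worker),
               threading.Thread(target=collector)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(registry), errors
```

Because release() and sweep() both hold the same lock, the collector never observes the table while it is being modified, which mirrors the idea of the missing lock in the explanation above.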
lanbrown
Posts: 279
Joined: 01 Jun 2021 08:06

Re: Z-way-server occasionally goes down

Post by lanbrown »

I'm currently running the nightly build on my system that crashes the most. I'll let it run for a few days and then if it is still running without crashing, install it on my other five systems.

Thank you for tracking this bug down.
lanbrown
Posts: 279
Joined: 01 Jun 2021 08:06

Re: Z-way-server occasionally goes down

Post by lanbrown »

So far so good even though it has only been a few hours.
JohannesF
Posts: 36
Joined: 04 Jan 2021 13:20

Re: Z-way-server occasionally goes down

Post by JohannesF »

Hi,

Some superstitious people here in Germany believe that Friday the 13th brings bad luck. It seems PoltoS just proved them wrong! Thanks so much for running an extra night shift, and for chasing and finally hunting down the issue.

A big Merci also to GokMasE and lanbrown. Without your guidance and patience I could never have gotten this far on my own.

I installed the patch and so far it works like a charm. I'll let you know how my story continues.

Have a good weekend
Johannes