Need frequent server restarts to control CPU load

GokMasE
Posts: 59
Joined: 13 Mar 2016 01:04
Location: Sweden

Re: Need frequent server restarts to control CPU load

Post by GokMasE »

piet66 wrote:
25 Aug 2022 10:37
After 2 days running, I can state that the problem is solved.
Thanks
Great to hear that, another stability and performance issue sorted.

Then it is also safe to say that the problem with HTTP producing crashes is not related to your problem with the average CPU load slowly building up. Before this summer, I tried the fix suggested here, and unfortunately it did not cure my problem.

(I am using HTTPGet module to share sensor data updates, which seems to "reliably" trigger a crash within 0.5-2 days of uptime here. Crash seems easy to reproduce within that time frame, but apparently the reason behind it is very tricky to diagnose.)
seattleneil
Posts: 172
Joined: 02 Mar 2020 22:41

Re: Need frequent server restarts to control CPU load

Post by seattleneil »

Regarding the HTTPGet module (aka app) causing Z-Way to crash...

As written by the author, the HTTPGet module sends an HTTP request whenever there's a change of value in any Z-Wave device. In my Z-Wave network, the module's simplistic logic caused a large number of HTTP requests to be sent. If you have a "busy" Z-Wave network (e.g., lots of devices changing values), the HTTPGet module may be the culprit for Z-Way crashing. Keep in mind, even if you don't have many Z-Wave devices, some Z-Wave devices can be very chatty (e.g., my dimmer switches rapidly send multiple Switch Multilevel Report messages as the brightness changes).

To reduce the number of HTTP requests being sent by the HTTPGet module, I modified the module's index.js file to limit which devices will trigger the HTTP request (to binary and multilevel switches).

In file /opt/z-way-server/automation/userModules/HTTPGet/index.js, I modified the following:
From:
       self.get_url(device, value);
To:
       if (device.get("deviceType") == "switchBinary" || device.get("deviceType") == "switchMultilevel") {
               self.get_url(device, value);
       }

If you only want to send an HTTP request when a specific device changes value, you could modify index.js to check the device.id value (e.g., if (device.id == "[Zway Device ID]") { ... }).
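
The two checks above (by device type and by device ID) can be folded into one helper. This is only a minimal sketch: shouldSendRequest, ALLOWED_TYPES, ALLOWED_IDS and fakeDevice are hypothetical names that are not part of the HTTPGet module, and fakeDevice exists only so the snippet runs outside Z-Way. It assumes nothing about the device beyond device.id and device.get("deviceType"), which are used in the modification above.

```javascript
// Hypothetical allow-lists; adjust to your own network.
var ALLOWED_TYPES = ["switchBinary", "switchMultilevel"];
var ALLOWED_IDS = []; // e.g. ["[Zway Device ID]"]; empty means no per-ID restriction

// Returns true only for devices that should trigger the HTTP request.
function shouldSendRequest(device) {
    var typeOk = ALLOWED_TYPES.indexOf(device.get("deviceType")) !== -1;
    var idOk = ALLOWED_IDS.length === 0 || ALLOWED_IDS.indexOf(device.id) !== -1;
    return typeOk && idOk;
}

// Stand-in for a Z-Way device object, only for trying the helper outside Z-Way.
function fakeDevice(id, deviceType) {
    return {
        id: id,
        get: function (key) {
            return key === "deviceType" ? deviceType : undefined;
        }
    };
}

console.log(shouldSendRequest(fakeDevice("dev_a", "switchBinary")));     // true
console.log(shouldSendRequest(fakeDevice("dev_b", "sensorMultilevel"))); // false
```

Inside index.js, the call would then be wrapped as if (shouldSendRequest(device)) { self.get_url(device, value); }.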

Note that for Z-Way to use the modified index.js, you'll need to either restart z-way-server or reinitialize the module (curl -s -u [USERNAME]:[PASSWD] --globoff '127.0.0.1:8083/ZAutomation/api/v1/modules/reinitialize/HTTPGet').

It will be interesting to learn if modifying index.js prevents Z-Way from crashing on your system. If so, the module's logic could be enhanced to have the user configure which devices (or device types) should trigger the HTTP request.
GokMasE
Posts: 59
Joined: 13 Mar 2016 01:04
Location: Sweden

Re: Need frequent server restarts to control CPU load

Post by GokMasE »

seattleneil wrote:
26 Aug 2022 18:57
Regarding the HTTPGet module (aka app) causing Z-Way to crash...

Yes, you make a good point about certain devices being "chatty", as you say. I have a switch with a built-in power meter, and the latter is pretty dumb when it comes to setting a threshold for status updates. IIRC, the threshold can only be set as a percentage, which means that when the switch is off and the device is drawing a minimal amount of current, even the slightest change in the measured power is very likely to trigger a status update.
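
To make the arithmetic concrete, here is a small sketch of why a percentage threshold gets noisy at low load. The 10% setting and the wattages are made-up numbers for illustration only, not values read from the actual device, and absoluteThreshold is a hypothetical helper.

```javascript
// Converts a percentage threshold into the absolute change (in watts)
// that would trigger a report at a given baseline load.
function absoluteThreshold(baselineWatts, thresholdPercent) {
    return baselineWatts * thresholdPercent / 100;
}

// Illustrative values only: a 10% threshold at standby versus under load.
console.log(absoluteThreshold(0.5, 10)); // standby: a tiny change is enough
console.log(absoluteThreshold(60, 10));  // under load: a much larger change is needed
```

With these assumed numbers, a change of a few hundredths of a watt is enough to trigger a report at standby, while several watts are needed under load.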

So yes, I am under the impression that "bursts" of status updates being transmitted through HTTPGet requests are highly likely to increase the chance of server crashes. PoltoS did suggest I add a couple of virtual devices (JavaScript) that get a random level change once per second, just to get an indication of how this affected the crashing. From the looks of it, putting the pressure on with the virtual devices triggering more traffic did shorten the time it takes for z-way-server to crash.

It always ends the same way: the server goes down with a segmentation fault being reported. ATM I am waiting for the devs to have the time and opportunity to trace the problem further via remote access.

Meanwhile, I'll try to look into your example and see if filtering things out the way you suggest might have an impact. A while back, I had the idea myself that it would perhaps be good to be able to set a limit on how often the HTTPGet module is allowed to update sensors. That way one could perhaps take the edge off the traffic from extremely "chatty" devices. No chance of doing that myself though, just brainstorming.
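
For what it's worth, the rate-limit idea could look roughly like the sketch below. This is not part of the HTTPGet module: createThrottle and the 5-second interval are hypothetical, and timestamps are passed in explicitly so the logic can be tried outside Z-Way.

```javascript
// Hypothetical per-device rate limiter: forwards at most one update
// per device within any minIntervalMs window and drops the rest.
function createThrottle(minIntervalMs) {
    var lastSent = {}; // device id -> timestamp (ms) of the last forwarded update
    return function shouldSend(deviceId, nowMs) {
        var last = lastSent[deviceId];
        if (last !== undefined && nowMs - last < minIntervalMs) {
            return false; // too soon after the previous update: drop it
        }
        lastSent[deviceId] = nowMs;
        return true;
    };
}

var allow = createThrottle(5000);
console.log(allow("meter", 0));    // true  (first update is always sent)
console.log(allow("meter", 1000)); // false (only 1 s after the last one)
console.log(allow("meter", 6000)); // true  (6 s after the last sent update)
console.log(allow("lamp", 1000));  // true  (devices are tracked separately)
```

In index.js this would presumably wrap the existing call, e.g. if (allow(device.id, Date.now())) { self.get_url(device, value); }. One trade-off of this simple approach: dropped updates are lost, so the receiver may miss the final value of a burst until the next change arrives.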
GokMasE
Posts: 59
Joined: 13 Mar 2016 01:04
Location: Sweden

Re: Need frequent server restarts to control CPU load

Post by GokMasE »

seattleneil wrote:
26 Aug 2022 18:57
It will be interesting to learn if modifying index.js prevents Z-Way from crashing on your system. If so, the module's logic could be enhanced to have the user configure which devices (or device types) should trigger the HTTP request.
Well, judging from an ongoing test, modifying index.js in the manner described earlier does (very much as expected, I might add) have an immediate impact on the crashing problem. ATM my system has been up for ~3.5 days and is still going. It would require long-term testing to establish whether the crashes are 100% gone or just much less likely to occur, but the test is a clear indication that increasing the number of outgoing HTTPGet requests also increases the risk of triggering crashes. The more traffic, the sooner the crash.

When PoltoS asked me to add the 3 virtual devices that change their status once per second, it seemed very clear that the time it took for z-way-server to crash shortened noticeably. It didn't even matter whether HTTPGet was pointed at a non-existent LAN address or a working one; the crashes appeared regardless. All in all, the result of limiting the HTTP requests by modifying index.js seems very logical indeed.

So in short, yes the intensity of HTTP requests is a big factor. But the question "what is the exact reason for the crash" is yet to be answered.
timb
Posts: 99
Joined: 03 Jan 2015 00:10

Re: Need frequent server restarts to control CPU load

Post by timb »

I have a similar problem.
Over the course of a week or two, the system becomes progressively slower. When I try to reboot the Pi, I click "shutdown" from the menu and it can take a minute before the dialog box offering shutdown/reboot etc. appears. Like the whole machine is running very slow.

I have two Pis, each running Z-Way (in 2 locations). This only happens to one of them - the one with the most intensive devices and apps.

Any thoughts?

Thanks
PoltoS
Posts: 7565
Joined: 26 Jan 2011 19:36

Re: Need frequent server restarts to control CPU load

Post by PoltoS »

Once it becomes slow, please check which thread is eating the CPU and how much memory is used compared to the normal state. We can try to monitor it over a week to find the issue.
timb
Posts: 99
Joined: 03 Jan 2015 00:10

Re: Need frequent server restarts to control CPU load

Post by timb »

None of the CPUs are loaded, but memory is maxed out:
Mem 700M/921M
Swp 100.0/100.0M
PoltoS
Posts: 7565
Joined: 26 Jan 2011 19:36

Re: Need frequent server restarts to control CPU load

Post by PoltoS »

Is Z-Way eating it?