Need frequent server restarts to control CPU load

xurg
Posts: 52
Joined: 17 Aug 2020 22:38

Need frequent server restarts to control CPU load

Post by xurg »

I frequently observe higher-than-usual CPU load caused by z-way-server. I'm no log-reading expert, but nothing in particular strikes me. I do see a fair number of "Discarding duplicate packet" messages, but they also appear when the load is not high. When the load IS high, the log is just as quiet, so the server is not busily cranking out new errors or warnings. My current workaround is to restart the server, which I now do with a daily cron job. I have quite a hand-rolled setup and I don't expect people to magically diagnose my issue. But perhaps something is already known that can cause this kind of build-up. If not, what would be the best way to diagnose this?
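
As a rough sketch of a gentler alternative to an unconditional daily restart, the check below restarts only when CPU use stays high across several samples. The 10% threshold, the three-sample rule and all function names are illustrative assumptions, not anything Z-Way provides, and the init script path is assumed to be the stock /etc/init.d/z-way-server.

```python
import subprocess

def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s=100):
    """Average CPU use, as a percentage of one core, between two readings of
    a process's cumulative CPU ticks (utime + stime in /proc/<pid>/stat).
    100 ticks per second is the usual Linux value; os.sysconf("SC_CLK_TCK")
    reports the real one."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)

def should_restart(samples, threshold=10.0, needed=3):
    """True only if the last `needed` samples all exceed `threshold` percent.
    Both defaults are hypothetical and should be tuned per installation."""
    recent = samples[-needed:]
    return len(recent) == needed and all(s > threshold for s in recent)

def restart_server():
    """Restart through the init script; needs root (e.g. root's crontab)."""
    subprocess.run(["/etc/init.d/z-way-server", "restart"], check=True)
```

Requiring several consecutive high samples avoids restarting on a short, legitimate burst of activity.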
Raspberry Pi 3 Model B Rev 1.2
Raspbian GNU/Linux 10 (buster)
RaZberry ZW0500 1024/2 SDK: 6.82.01 API: 05.39
Z-Way version v4.1.0
micky1500
Posts: 298
Joined: 07 Feb 2016 16:29
Location: England

Re: Need frequent server restarts to control CPU load

Post by micky1500 »

Me too.
Earlier I noticed it took 3 attempts to turn a light off; I assumed that was normal.
Z-Way is using 14.5% processor on my Pi 4 (14 days since restart).
Restarted with: sudo /etc/init.d/z-way-server restart
30 minutes later: 0.6% processor,
and things work properly.
The cause needs some investigation!
maybe ..
Raspi 4 - (Buster - 32 Bit) Zwave Version 4.1.1, Raz 7 Pro, Serial API Version: 07.38
lanbrown
Posts: 279
Joined: 01 Jun 2021 08:06

Re: Need frequent server restarts to control CPU load

Post by lanbrown »

I don't use many "apps", scenes, rules, etc. in Z-Way and mainly use Home Assistant to control the automation. I do have a few rules on some of my Z-Way instances. I have five Z-Way systems with Razberry Pro 7's and then one Z-Way system that aggregates them all together for Home Assistant to communicate with. The local rules are more or less for mission critical tasks. Early on I did have an issue with Z-Way crashing on the aggregation system but changing the polling from 1 sec to 5 sec resolved that.

With that said, RAM usage is at or under 512MB on all of the systems, and CPU usage for the Z-Way process is also very low. So you might look into any scripts or apps you have installed that could be causing the issue.

You can also look at the Expert UI to see how the Z-Wave network looks and what is in the queue at the time.
PoltoS
Posts: 7562
Joined: 26 Jan 2011 19:36

Re: Need frequent server restarts to control CPU load

Post by PoltoS »

@lanbrown, could you please help us understand what kind of crashes you experienced and which polling interval you reduced? From time to time we hear from customers about HTTP causing crashes, but we can't catch it. Do you know how to reproduce it?

@micky1500 and @xurg, CPU usage might be higher when a lot of rules are working and when there are a lot of commands in the queue. Besides that, the CPU usage should be pretty low.

I would suggest first running "top" and pressing the "H" key to switch to thread mode. You will see zway threads like zway/webserver or zway/ttyAMA. Please let us know which one is eating the CPU.
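
For logging this over days rather than watching top interactively, the same per-thread figures can be read from /proc. This is a minimal sketch assuming a Linux host; the function names are made up for illustration, and the thread name in the comment is simply one of those mentioned above.

```python
import os
from pathlib import Path

def parse_stat(line):
    """Return (thread_name, cpu_ticks) from one /proc/<pid>/task/<tid>/stat line.

    The name sits in parentheses and may itself contain spaces or ')',
    so split on the LAST ')' before reading the numeric fields.
    """
    head, tail = line.rsplit(")", 1)
    name = head.split("(", 1)[1]          # e.g. "zway/webserver"
    fields = tail.split()                 # fields[0] is stat field 3 (state)
    utime, stime = int(fields[11]), int(fields[12])  # stat fields 14 and 15
    return name, utime + stime

def thread_cpu_seconds(pid, proc_root="/proc"):
    """Accumulated CPU seconds per thread name, busiest first."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    totals = {}
    for stat in Path(proc_root, str(pid), "task").glob("*/stat"):
        try:
            name, ticks = parse_stat(stat.read_text())
        except (OSError, ValueError, IndexError):
            continue                      # thread exited while we were reading
        totals[name] = totals.get(name, 0.0) + ticks / ticks_per_s
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```

Calling thread_cpu_seconds() with the PID reported by pidof z-way-server, twice and some minutes apart, shows which thread's total is growing fastest.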
lanbrown
Posts: 279
Joined: 01 Jun 2021 08:06

Re: Need frequent server restarts to control CPU load

Post by lanbrown »

@PoltoS

We discussed this in the past. You needed access while the issue was occurring, but that is not possible because the z-way process had failed, so there would be no way to get in via Z-Way. I could get in via SSH, and a restart of the process resolved it. If I checked the state, it would show as exited.

The value I changed was in the "Link other Z-Way controller" app: the field is "Update period (in milliseconds)", and I changed it from 1000 to 5000.

I ran into it quite often; sometimes twice but usually just once. I have five Z-Way systems and then the sixth system just aggregates the other five. So if you take six Raspberry Pis and use the "Link other Z-Way controller" app, you should be able to reproduce it.
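
To put that change in numbers, here is a back-of-the-envelope sketch. It assumes the app makes one request per linked controller per update period, which is a guess about its behaviour rather than something confirmed here; the function name is illustrative.

```python
def polls_per_minute(linked_controllers, update_period_ms):
    """Requests per minute if every linked controller is polled once per period."""
    return linked_controllers * 60_000 / update_period_ms

# Five linked controllers, as in the setup described in this thread:
before = polls_per_minute(5, 1000)   # 300.0 requests per minute
after = polls_per_minute(5, 5000)    # 60.0 requests per minute
```

Under that assumption the aggregating system handles a fifth of the request rate after the change.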

Another option to fix this bug is about two to three months out. I'm waiting on the Turing Pi 2 board to ship. It can house four Raspberry Pi 4 compute modules. I was going to take that sixth system that is already a compute module and put it into this system. I was also going to move my Home Assistant from a VM to a compute module. I was then going to take the remaining two slots and put a development Z-Way and development Home Assistant instance. I was going to take the development Z-Way system and point it to the same five Razberry 7 Pro Z-Way instances. I don't see why it would not be possible since the remote boards are not configured directly when you use the link to other Z-way controller app. The development Home Assistant was going to point to the development Z-Way. This way I can test the beta Home Assistant releases against my Z-Way deployment without impacting the production. It will have production devices though. So I could on the development Z-Way system set the update period to 1 second (1000 milliseconds) and when it crashes just give you SSH credentials to get into the system directly and get whatever logs you need. So my actual production system would still be humming along.
micky1500
Posts: 298
Joined: 07 Feb 2016 16:29
Location: England

Re: Need frequent server restarts to control CPU load

Post by micky1500 »

We seem to have gone off topic here.
OP said CPU usage increases over time.
No Turing multi-processor boards involved.
I'm watching top in thread mode (H).
Nothing yet to report after 1 day.
After a reboot all seems OK.
I will report back in a week, when z-way consumes 14% processor.
All the same tasks as it's doing today, just slower after 7 days.

Regards
Micky.
lanbrown
Posts: 279
Joined: 01 Jun 2021 08:06

Re: Need frequent server restarts to control CPU load

Post by lanbrown »

I don't have increased CPU usage over time.

$ uptime
18:44:43 up 30 days, 20:21,
Apps, rules and custom scripts all run under the z-way process, so a poorly written app or script can cause issues.

Is that 14% total or 14% of a single core? Across my six systems the CPU load ranges from 2 to 11%. Uptimes range from 4 days to 30 days.
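
The distinction matters because top, in its default mode, reports a process's CPU as a percentage of one core. A small sketch of the conversion, assuming the 14.5% figure quoted earlier was a single-core percentage on a four-core Pi (where that number came from was not stated); the function name is illustrative.

```python
import os

def share_of_total(single_core_percent, cores):
    """Convert a per-core CPU percentage into a share of the whole machine."""
    return single_core_percent / cores

pi_share = share_of_total(14.5, 4)   # 3.625 percent of a four-core Pi

# On the machine itself, os.cpu_count() supplies the real core count
# (falling back to 1 if it cannot be determined).
local_share = share_of_total(14.5, os.cpu_count() or 1)
```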
xurg
Posts: 52
Joined: 17 Aug 2020 22:38

Re: Need frequent server restarts to control CPU load

Post by xurg »

PoltoS wrote:
11 Aug 2022 02:00
@micky1500 and @xurg, CPU usage might be higher when a lot of rules are working and when there are a lot of commands in the queue. Besides that, the CPU usage should be pretty low.

I would suggest first running "top" and pressing the "H" key to switch to thread mode. You will see zway threads like zway/webserver or zway/ttyAMA. Please let us know which one is eating the CPU.
OK, I'll try to get some more information. Note that when I observed this, there was practically nothing going on, i.e. no sensors were firing beyond the idle load. And we're only talking about a handful of door sensors, a motion sensor and a few remote-controllable power plugs. Not many automation rules present either.

The pointer to the command queue is interesting. Could there be some kind of build-up if a sensor's battery runs out, or something similar?
seattleneil
Posts: 172
Joined: 02 Mar 2020 22:41

Re: Need frequent server restarts to control CPU load

Post by seattleneil »

I've observed CPU utilization increases as the number of web UI sessions increases. This is reasonable, but there seems to be a problem...

As best as I can tell, the z-way-server process does not automatically terminate idle web UI sessions. This happens when a user closes a z-way UI browser tab without explicitly logging out. It's easy to create a lot of these "zombie" web UI sessions (and cause high CPU utilization).

You can see the active web UI sessions by running the following command: sudo lsof -n -p `pidof z-way-server` | grep IP

Until this problem is fixed, I've found the easiest workaround is to restart the z-way-server process.
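
Where lsof is not installed, a similar count can be taken from /proc/net/tcp. The sketch below is only an approximation: it counts established TCP connections to a local port, not Z-Way login sessions, and it assumes the web UI listens on port 8083 (check your own setup). It also covers IPv4 only; connections on an IPv6 or dual-stack socket are listed in /proc/net/tcp6 instead. Function names are illustrative.

```python
import socket
import struct
from collections import Counter

ESTABLISHED = "01"  # TCP state code used in /proc/net/tcp

def decode_ipv4(hex_addr):
    """Decode the hex IPv4 address as printed on little-endian hosts (e.g. a Pi)."""
    return socket.inet_ntoa(struct.pack("<I", int(hex_addr, 16)))

def established_peers(proc_net_tcp_text, local_port):
    """Count established connections to `local_port`, grouped by remote IP."""
    peers = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:   # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        if state != ESTABLISHED:
            continue
        if int(local.rsplit(":", 1)[1], 16) != local_port:
            continue
        peers[decode_ipv4(remote.rsplit(":", 1)[0])] += 1
    return peers

def web_ui_peers(port=8083):
    """Read the live table; 8083 is an assumed default, not taken from this thread."""
    with open("/proc/net/tcp") as table:
        return established_peers(table.read(), port)
```

A peer whose count keeps rising between checks would be a candidate for the lingering sessions described above.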
micky1500
Posts: 298
Joined: 07 Feb 2016 16:29
Location: England

Re: Need frequent server restarts to control CPU load

Post by micky1500 »

Thanks Neil in Seattle.

sudo: lsof: command not found

Thought you had the answer there.
Alexa and Arduino requests sometimes wait 30 days to clear their Active Sessions.
I looked at all active sessions; there aren't any rogue ones.
Will see what top in thread mode (H) shows in a week's time.