Need frequent server restarts to control CPU load

Posted: 10 Aug 2022 20:34
by xurg
I frequently observe higher-than-usual CPU load caused by z-way-server. I'm not a log-reading expert, but nothing strikes me in particular. I do see a fair number of "Discarding duplicate packet" messages, but those appear even when the load is not high. When the load IS high, the log is equally quiet, so it is not busily cranking out new errors or warnings. My current workaround is to just restart the server, which I now have done by a daily cronjob. I have quite a hand-rolled setup and I don't expect people to magically diagnose my issue. But perhaps something is already known that can cause this kind of build-up. If not, what would be the best way to diagnose this?

Re: Need frequent server restarts to control CPU load

Posted: 10 Aug 2022 21:35
by micky1500
Me too.
Earlier I noticed it took 3 attempts to turn a light off; I assumed that was normal.
Z-Way was using 14.5% processor on my Pi4 (14 days since restart).
Restarted with: sudo /etc/init.d/z-way-server restart
30 minutes later: 0.6% processor,
and things work properly.
The cause needs some investigation!
Maybe ..

Re: Need frequent server restarts to control CPU load

Posted: 10 Aug 2022 22:03
by lanbrown
I don't use many "apps", scenes, rules, etc. in Z-Way and mainly use Home Assistant to control the automation. I do have a few rules on some of my Z-Way instances. I have five Z-Way systems with Razberry 7 Pros and then one Z-Way system that aggregates them all together for Home Assistant to communicate with. The local rules are more or less for mission-critical tasks. Early on I did have an issue with Z-Way crashing on the aggregation system, but changing the polling from 1 sec to 5 sec resolved that.

With that said, the RAM usage is at or under 512MB on all of the systems. CPU usage is also very low for the Z-Way process. So you might look into any scripts or apps you have installed that could be causing the issue.

You can also look at the Expert UI and see how the Z-Wave looks and what is in the queue at the time.

Re: Need frequent server restarts to control CPU load

Posted: 11 Aug 2022 02:00
by PoltoS
@lanbrown, could you please help us understand what kind of crashes you have experienced and which polling you reduced? From time to time we hear from customers about HTTP causing crashes, but we can't catch it. Do you know how to reproduce it?

@micky1500 and @xurg, CPU usage might be higher when a lot of rules are working and when there are a lot of commands in the queue. Besides that, the CPU usage should be pretty low.

I would suggest first to run "top" and press the "H" key to go into thread mode. You will see zway processes like zway/webserver or zway/ttyAMA. Please let us know which is eating the CPU.
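
For a non-interactive version of the thread check described above, per-thread CPU time can be read from /proc and compared across a short interval. This is only a minimal sketch, assuming a Linux system with /proc mounted and a single z-way-server process; the function names thread_ticks and busiest_threads are made up for illustration and are not part of Z-Way.

```shell
# Print "TID NAME TICKS" for every thread of a process.
# TICKS is utime + stime (cumulative CPU clock ticks) from /proc/PID/task/TID/stat.
thread_ticks() {
    local pid=$1 dir tid name ticks
    [ -d "/proc/$pid/task" ] || return 1
    for dir in /proc/"$pid"/task/*; do
        [ -r "$dir/stat" ] || continue
        tid=${dir##*/}
        name=$(cat "$dir/comm" 2>/dev/null)
        # Strip "tid (comm) " first so a thread name with spaces cannot shift the fields;
        # after that, utime and stime are fields 12 and 13.
        ticks=$(sed 's/^.*) //' "$dir/stat" | awk '{print $12 + $13}')
        printf '%s %s %s\n' "$tid" "$name" "$ticks"
    done
}

# Print "DELTA TID NAME" per thread, busiest first, where DELTA is the number
# of CPU ticks used during the sampling interval (default 10 seconds).
# Threads created during the interval show their total ticks so far.
busiest_threads() {
    local pid=$1 secs=${2:-10} before after
    before=$(thread_ticks "$pid") || return 1
    sleep "$secs"
    after=$(thread_ticks "$pid") || return 1
    { printf '%s\n' "$before"; echo ---; printf '%s\n' "$after"; } |
    awk '$1 == "---" { second = 1; next }
         !second     { start[$1] = $NF; next }
                     { printf "%d %s %s\n", $NF - start[$1], $1, $2 }' |
    sort -rn
}
```

Assumed usage: busiest_threads "$(pidof z-way-server)" 10, run a few times while the load is high. The interactive equivalent is top -H -p "$(pidof z-way-server)".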

Re: Need frequent server restarts to control CPU load

Posted: 11 Aug 2022 05:14
by lanbrown
@PoltoS

We discussed this in the past. You needed access to see the issue, but that was not possible: the z-way process had failed, so there would be no way to get in via Z-Way. I could get in via SSH, and a restart of the process resolved it. If I checked the state, it would show as exited.

The value I changed was in the "Link other Z-Way controller" app: the "Update period (in milliseconds)" field, which I changed from 1000 to 5000.

I ran into this quite often; sometimes twice but usually just once. I have five Z-Way systems, and the sixth system just aggregates the other five. So if you take six Raspberry Pis and use the "Link other Z-Way controller" app, you should be able to reproduce it.

Another option to fix this bug is about two to three months out. I'm waiting on the Turing Pi 2 board to ship. It can house four Raspberry Pi 4 compute modules. I was going to take that sixth system that is already a compute module and put it into this system. I was also going to move my Home Assistant from a VM to a compute module. I was then going to take the remaining two slots and put a development Z-Way and development Home Assistant instance. I was going to take the development Z-Way system and point it to the same five Razberry 7 Pro Z-Way instances. I don't see why it would not be possible since the remote boards are not configured directly when you use the link to other Z-way controller app. The development Home Assistant was going to point to the development Z-Way. This way I can test the beta Home Assistant releases against my Z-Way deployment without impacting the production. It will have production devices though. So I could on the development Z-Way system set the update period to 1 second (1000 milliseconds) and when it crashes just give you SSH credentials to get into the system directly and get whatever logs you need. So my actual production system would still be humming along.

Re: Need frequent server restarts to control CPU load

Posted: 12 Aug 2022 00:25
by micky1500
We seem to have gone off topic here.
OP said CPU usage increases over time.
No Turing multi-processor boards.
I'm looking at top in thread mode (H).
Nothing to report yet after 1 day.
After a reboot all seems OK.
I will report back in a week, when z-way consumes 14% processor.
All the same tasks as it's doing today, just slower after 7 days.

Regards
Micky.
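
To watch a build-up like this over several days, one option is to append a timestamped CPU reading to a log file at regular intervals. The sketch below is an assumption-laden example rather than a Z-Way tool: it reads cumulative CPU ticks from /proc on Linux, and the names cpu_ticks, cpu_percent and log_cpu are hypothetical. The result is a percentage of a single core, so a multi-threaded process can exceed 100.

```shell
# Cumulative CPU ticks (utime + stime) of a process, from /proc/PID/stat.
cpu_ticks() {
    [ -r "/proc/$1/stat" ] || return 1
    # Drop "pid (comm) " so utime and stime become fields 12 and 13.
    sed 's/^.*) //' "/proc/$1/stat" | awk '{print $12 + $13}'
}

# CPU usage of a process over a sampling interval, as a percentage of one core.
cpu_percent() {
    local pid=$1 secs=${2:-10} hz t1 t2
    hz=$(getconf CLK_TCK)
    t1=$(cpu_ticks "$pid") || return 1
    sleep "$secs"
    t2=$(cpu_ticks "$pid") || return 1
    awk -v d="$((t2 - t1))" -v hz="$hz" -v s="$secs" \
        'BEGIN { printf "%.1f\n", 100 * d / (hz * s) }'
}

# Append one "DATE TIME PERCENT" line to a log file.
log_cpu() {
    local pid=$1 logfile=$2 secs=${3:-10} pct
    pct=$(cpu_percent "$pid" "$secs") || return 1
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$pct" >> "$logfile"
}
```

Assumed usage: log_cpu "$(pidof z-way-server)" /tmp/zway-cpu.log, called hourly from cron or a loop; the log path is only an example.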

Re: Need frequent server restarts to control CPU load

Posted: 12 Aug 2022 02:53
by lanbrown
I don't have increased CPU usage over time.

$ uptime
18:44:43 up 30 days, 20:21,

Apps, rules and custom scripts all run under the z-way process, so a poorly written app or script can cause issues.

Is that 14% total or 14% of a single core? Across my six systems the CPU load ranges from 2 to 11%. Uptimes range from 4 days to 30 days.

Re: Need frequent server restarts to control CPU load

Posted: 13 Aug 2022 09:24
by xurg
PoltoS wrote:
11 Aug 2022 02:00
@micky1500 and @xurg, CPU usage might be higher when a lot of rules are working and when there are a lot of commands in the queue. Besides that, the CPU usage should be pretty low.

I would suggest first to run "top" and press the "H" key to go into thread mode. You will see zway processes like zway/webserver or zway/ttyAMA. Please let us know which is eating the CPU.

OK, I'll try to get some more information. Note that when I observed this, there was practically nothing going on, i.e. no sensors were firing any more than the idle load. And we're only talking about a handful of door sensors, a motion sensor and a few remote-controllable power plugs. Not many automation rules present either.

The lead about the command queue is interesting. Could there be some kind of build-up when a sensor's battery runs out or so?

Re: Need frequent server restarts to control CPU load

Posted: 13 Aug 2022 19:17
by seattleneil
I've observed CPU utilization increases as the number of web UI sessions increases. This is reasonable, but there seems to be a problem...

As best as I can tell, the z-way-server process does not automatically terminate idle web UI sessions. This happens when a user closes a z-way UI browser tab without explicitly logging out. It's easy to create a lot of these "zombie" web UI sessions (and cause high CPU utilization).

You can see the active web UI sessions by running the following command: sudo lsof -n -p `pidof z-way-server` | grep IP

Until this problem is fixed, I've found the easiest workaround is to restart the z-way-server process.
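
To see at a glance which clients hold the most connections, the lsof output from the command above can be summarised per remote address. This is a rough sketch that assumes the usual lsof column layout (NODE in column 8, NAME in column 9, with NAME written as local->remote); count_peers is a hypothetical helper name.

```shell
# Read lsof output on stdin and print "COUNT REMOTE_ADDRESS" per peer, busiest first.
# Only TCP lines with a remote end are counted, so LISTEN sockets are ignored.
count_peers() {
    awk '$8 == "TCP" && $9 ~ /->/ {
             split($9, ends, "->")
             peer = ends[2]
             sub(/:[0-9]+$/, "", peer)   # drop the remote port
             count[peer]++
         }
         END { for (p in count) print count[p], p }' |
    sort -rn
}
```

Assumed usage: sudo lsof -n -p "$(pidof z-way-server)" | count_peers. A peer whose count keeps growing between runs would fit the zombie-session theory.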

Re: Need frequent server restarts to control CPU load

Posted: 13 Aug 2022 19:55
by micky1500
Thanks Neil in Seattle.

sudo: lsof: command not found

Thought you had the answer there.
Alexa and Arduino requests sometimes wait 30 days to clear their active sessions.
I looked at all active sessions; there aren't any rogue ones.
Will see what top in thread mode (H) shows in a week's time.
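
Where lsof is not installed, ss from iproute2 can usually list the same TCP connections. The sketch below picks whichever tool is present; it assumes a single server process, and socket_tool and list_sockets are hypothetical names. On Debian-based systems lsof can typically be added with sudo apt install lsof, though that is an assumption about the setup here.

```shell
# Report which socket-listing tool is available: lsof, ss, or none.
socket_tool() {
    if command -v lsof >/dev/null 2>&1; then
        echo lsof
    elif command -v ss >/dev/null 2>&1; then
        echo ss
    else
        echo none
    fi
}

# List the network sockets of the named process with whichever tool exists.
# Run as root so sockets owned by other users are visible.
list_sockets() {
    local name=$1
    case $(socket_tool) in
        lsof) lsof -n -p "$(pidof "$name")" | grep IP ;;
        # ss -p tags each line with users:(("name",pid=...)), so match the quoted name.
        ss)   ss -tnp | grep -F "\"$name\"" ;;
        *)    echo "neither lsof nor ss found" >&2; return 1 ;;
    esac
}
```

Assumed usage: run list_sockets z-way-server in a root shell (for example via sudo -s), since sudo cannot call a shell function directly.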