I find the lsof command to be very useful - it shows all of the files that are open on a system, and which process opened the file. This includes regular files, directories, block special files, character special files, executing text references, libraries, streams and network files (Internet socket, NFS file or UNIX domain socket).
If you're curious about what a running Linux process is doing, the lsof and strace commands are very useful tools (strace traces a running process and shows its system calls and signals). Note that the z-way-server process is multi-threaded, so if you want to see the read/write system calls made by all of its threads, run: sudo strace -f -p `pidof z-way-server` -e trace=read,write
To install lsof, run: sudo apt install lsof
To install strace, run: sudo apt install strace
If you use strace with "-e trace=desc", you should see the z-way-server process has a thread that checks various file descriptors (4, 6, 8 and 9) every 10 milliseconds. You can then use the lsof command to see that these file descriptors are associated with network files (serving TCP port 8083 and private UDP communication between various z-way-server threads). This suggests a linkage between web UI sessions and increased CPU utilization.
Hope this helps.
Need frequent server restarts to control CPU load
Re: Need frequent server restarts to control CPU load
Not totally conclusive yet but zway/timers seems to be starting to eat up more CPU.
Raspberry Pi 3 Model B Rev 1.2
Raspbian GNU/Linux 10 (buster)
RaZberry ZW0500 1024/2 SDK: 6.82.01 API: 05.39
Z-Way version v4.1.0
Re: Need frequent server restarts to control CPU load
Regarding "sudo: lsof: command not found", install it with: sudo apt-get install lsof
@seattleneil I would question your assumption about "lost" HTTP connections. Z-Way closes them when the Linux kernel closes them. You might see some connections stay open for a minute or so after you close the browser, but after that the Linux kernel notices and closes the connection. And that cannot cause any crash until there are thousands of them.
What do you see on your system? Are there many such connections? I suspect you are seeing outgoing HTTP connections (from HTTP devices). Let's check your case; it would be nice if we could chase the issue down.
Re: Need frequent server restarts to control CPU load
On my system the CPU load of z-way-server increases continuously from ~5% after start to >30% after 2 weeks. By then at the latest I need to do a restart.
I closed all logins and stopped all non-essential apps, but that had no effect on the system load.
Here is the printout of different CPU values after 5 days of running:
Code: Select all
pi@raspberrypi:~ $ ./zway_procs.bash
>> top -bn1 -p 383
top - 13:07:53 up 5 days, 3:38, 2 users, load average: 0,21, 0,25, 0,20
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 4,8 us, 1,6 sy, 0,0 ni, 93,7 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
MiB Mem : 923,2 total, 44,8 free, 191,0 used, 687,4 buff/cache
MiB Swap: 100,0 total, 94,5 free, 5,5 used. 633,6 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
383 root 20 0 246788 93776 19912 S 20,0 9,9 889:22.60 z-way-server
>> 10x: top -bn1 -p 383; sleep 1s
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
383 root 20 0 246788 93776 19912 S 13,3 9,9 889:22.67 z-way-server
383 root 20 0 246788 93776 19912 S 13,3 9,9 889:22.86 z-way-server
383 root 20 0 246788 93776 19912 S 20,0 9,9 889:23.06 z-way-server
383 root 20 0 246788 93776 19912 S 13,3 9,9 889:23.25 z-way-server
383 root 20 0 246788 94304 19912 S 20,0 10,0 889:23.49 z-way-server
383 root 20 0 246788 94304 19912 S 13,3 10,0 889:23.68 z-way-server
383 root 20 0 246788 95548 19912 S 18,8 10,1 889:24.20 z-way-server
383 root 20 0 246788 95548 19912 S 18,8 10,1 889:24.39 z-way-server
383 root 20 0 246788 95548 19912 S 12,5 10,1 889:24.58 z-way-server
383 root 20 0 246788 95548 19912 S 18,8 10,1 889:24.78 z-way-server
>> 10x: top -Hbn1 -p 383 -o -PID -i; sleep 1s
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
620 root 20 0 246788 95548 19912 R 13,3 10,1 469:55.17 zway/timers
620 root 20 0 246788 95548 19912 S 12,5 10,1 469:55.32 zway/timers
508 root 20 0 246788 95548 19912 S 6,7 10,1 195:56.41 zway/core
620 root 20 0 246788 95548 19912 S 20,0 10,1 469:55.47 zway/timers
621 root 20 0 246788 95548 19912 S 6,7 10,1 182:50.68 zway/dev/ttyAMA
620 root 20 0 246788 95548 19912 S 6,2 10,1 469:55.61 zway/timers
508 root 20 0 246788 95548 19912 R 93,8 10,1 195:56.79 zway/core
620 root 20 0 246788 95548 19912 S 6,2 10,1 469:55.73 zway/timers
621 root 20 0 246788 95548 19912 S 6,2 10,1 182:50.76 zway/dev/ttyAMA
620 root 20 0 244740 90300 19912 S 13,3 9,6 469:55.84 zway/timers
621 root 20 0 244740 90300 19912 S 6,7 9,6 182:50.80 zway/dev/ttyAMA
620 root 20 0 244740 90300 19912 S 12,5 9,6 469:55.98 zway/timers
621 root 20 0 244740 90300 19912 S 6,2 9,6 182:50.83 zway/dev/ttyAMA
508 root 20 0 244740 90300 19912 S 6,7 9,6 195:57.69 zway/core
620 root 20 0 244740 90300 19912 S 13,3 9,6 469:56.13 zway/timers
620 root 20 0 244740 90300 19912 S 12,5 9,6 469:56.28 zway/timers
620 root 20 0 244740 90300 19912 S 13,3 9,6 469:56.43 zway/timers
>> top -Hbn1 -p 383 -o -PID
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
383 root 20 0 245764 91088 19912 S 0,0 9,6 0:15.95 z-way-server
502 root 20 0 245764 91088 19912 S 0,0 9,6 0:27.46 OptimizingCompi
503 root 20 0 245764 91088 19912 S 0,0 9,6 0:44.77 v8:SweeperThrea
504 root 20 0 245764 91088 19912 S 0,0 9,6 0:30.23 v8:SweeperThrea
505 root 20 0 245764 91088 19912 S 0,0 9,6 0:18.58 v8:SweeperThrea
506 root 20 0 245764 91088 19912 S 0,0 9,6 0:09.05 v8:SweeperThrea
508 root 20 0 245764 91088 19912 S 0,0 9,6 195:57.85 zway/core
619 root 20 0 245764 91088 19912 S 0,0 9,6 36:17.26 zway/webserver
620 root 20 0 245764 91088 19912 S 12,5 9,6 469:56.59 zway/timers
621 root 20 0 245764 91088 19912 S 6,2 9,6 182:50.98 zway/dev/ttyAMA
1390 root 20 0 245764 91088 19912 S 0,0 9,6 0:51.36 zway/sockets
>> ps -C z-way-server
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 383 11.9 9.6 245764 91088 ? Sl Aug17 14:49:28 z-way-server
>> ps uHcp 383
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 383 0.0 9.6 245764 91088 ? Sl Aug17 0:15 z-way-server
root 383 0.0 9.6 245764 91088 ? Sl Aug17 0:27 OptimizingCompi
root 383 0.0 9.6 245764 91088 ? Sl Aug17 0:44 v8:SweeperThrea
root 383 0.0 9.6 245764 91088 ? Sl Aug17 0:30 v8:SweeperThrea
root 383 0.0 9.6 245764 91088 ? Sl Aug17 0:18 v8:SweeperThrea
root 383 0.0 9.6 245764 91088 ? Sl Aug17 0:09 v8:SweeperThrea
root 383 2.6 9.6 245764 91088 ? Sl Aug17 195:57 zway/core
root 383 0.4 9.6 245764 91088 ? Sl Aug17 36:17 zway/webserver
root 383 6.3 9.6 245764 91088 ? Rl Aug17 469:56 zway/timers
root 383 2.4 9.6 245764 91088 ? Rl Aug17 182:50 zway/dev/ttyAMA
root 383 0.0 9.6 245764 91088 ? Sl Aug17 0:51 zway/sockets
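To put the per-thread TIME+ values in proportion, here is a small sketch in JavaScript. It only re-computes figures already shown in the `top -Hbn1 -p 383 -o -PID` listing above; the function names (`timeToSeconds`, `cpuSecondsByThread`) are made up for this illustration, and it assumes TIME+ is always in the minutes:seconds.hundredths form that top prints here (the hours:minutes:seconds form from `ps` is not handled).

```javascript
// Thread lines copied from the `top -Hbn1 -p 383 -o -PID` listing above.
var topLines = [
  "383 root 20 0 245764 91088 19912 S 0,0 9,6 0:15.95 z-way-server",
  "502 root 20 0 245764 91088 19912 S 0,0 9,6 0:27.46 OptimizingCompi",
  "503 root 20 0 245764 91088 19912 S 0,0 9,6 0:44.77 v8:SweeperThrea",
  "504 root 20 0 245764 91088 19912 S 0,0 9,6 0:30.23 v8:SweeperThrea",
  "505 root 20 0 245764 91088 19912 S 0,0 9,6 0:18.58 v8:SweeperThrea",
  "506 root 20 0 245764 91088 19912 S 0,0 9,6 0:09.05 v8:SweeperThrea",
  "508 root 20 0 245764 91088 19912 S 0,0 9,6 195:57.85 zway/core",
  "619 root 20 0 245764 91088 19912 S 0,0 9,6 36:17.26 zway/webserver",
  "620 root 20 0 245764 91088 19912 S 12,5 9,6 469:56.59 zway/timers",
  "621 root 20 0 245764 91088 19912 S 6,2 9,6 182:50.98 zway/dev/ttyAMA",
  "1390 root 20 0 245764 91088 19912 S 0,0 9,6 0:51.36 zway/sockets"
];

// Convert a top TIME+ value ("minutes:seconds.hundredths") to seconds.
function timeToSeconds(text) {
  var m = /^(\d+):(\d+)[.,](\d+)$/.exec(text);
  if (!m) throw new Error("unexpected TIME+ format: " + text);
  return Number(m[1]) * 60 + Number(m[2] + "." + m[3]);
}

// Sum CPU seconds per thread name (the last two columns are TIME+ and COMMAND).
function cpuSecondsByThread(lines) {
  var totals = {};
  lines.forEach(function (line) {
    var cols = line.trim().split(/\s+/);
    var name = cols[cols.length - 1];
    var secs = timeToSeconds(cols[cols.length - 2]);
    totals[name] = (totals[name] || 0) + secs;
  });
  return totals;
}

var totals = cpuSecondsByThread(topLines);
var grandTotal = Object.keys(totals).reduce(function (sum, name) {
  return sum + totals[name];
}, 0);
var timersShare = totals["zway/timers"] / grandTotal;
console.log("zway/timers share of CPU time: " + (timersShare * 100).toFixed(1) + "%");
```

With these numbers zway/timers accounts for a bit over half of the CPU time accumulated across the listed threads, ahead of zway/core and zway/dev/ttyAMA, which matches the impression from the live samples.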
Raspberry Pi 3 Model B Rev 1.2
Raspbian GNU/Linux 10 (buster, 32bit)
RaZberry by Z-Wave.Me ZW0700 7.20.00 07.38/1766938484 1025/257
Z-Way version v4.1.2 from 2023-10-18 03:34:26 +0300
Re: Need frequent server restarts to control CPU load
Oh, I see! The problem is in zway/timers! This is really insane. We will dig into it. This is a very thorough analysis, thank you!
If you can provide us remote access, we will try to locate the error in a much faster way.
Re: Need frequent server restarts to control CPU load
After some investigation it looks like a setInterval is not being stopped somewhere while another one is created. I would suggest adding the following lines to main.js and then sending the whole huge log to us (or checking it yourself).
This will allow us to track the creation and destruction of every timer and make sure they are not growing in number.
Code: Select all
// Wrap the global timer functions so that every creation and removal
// is logged together with a stack trace of the caller.
_setInterval = setInterval;
setInterval = function(f, i) {
    var r = _setInterval(f, i);
    debugPrint(">>> setInterval: " + r);
    debugPrintStack();
    return r;
};
_clearInterval = clearInterval;
clearInterval = function(r) {
    _clearInterval(r);
    debugPrint(">>> clearInterval: " + r);
    debugPrintStack();
};
// Same wrappers for one-shot timers.
_setTimeout = setTimeout;
setTimeout = function(f, i) {
    var r = _setTimeout(f, i);
    debugPrint(">>> setTimeout: " + r);
    debugPrintStack();
    return r;
};
_clearTimeout = clearTimeout;
clearTimeout = function(r) {
    _clearTimeout(r);
    debugPrint(">>> clearTimeout: " + r);
    debugPrintStack();
};
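For anyone who would rather watch a counter than read the whole log, here is a minimal sketch of the kind of leak suspected above and how a live-timer count exposes it. This is not Z-Way code: `createIntervalTracker`, `startPollingLeaky` and `startPollingFixed` are hypothetical names invented for the illustration, and it is written to run under plain Node.js rather than inside z-way-server.

```javascript
// Hypothetical tracker: wraps the global timer functions and remembers
// which interval ids have been created but not yet cleared.
function createIntervalTracker() {
  var live = [];
  return {
    set: function (fn, ms) {
      var id = setInterval(fn, ms);
      live.push(id);
      return id;
    },
    clear: function (id) {
      clearInterval(id);
      var pos = live.indexOf(id);
      if (pos !== -1) live.splice(pos, 1);
    },
    clearAll: function () {
      while (live.length) clearInterval(live.pop());
    },
    count: function () { return live.length; }
  };
}

// Leaky pattern: every (re)start creates a new interval and forgets the old one.
function startPollingLeaky(tracker, state) {
  state.id = tracker.set(function () {}, 1000);
}

// Fixed pattern: stop the previous interval before creating the next one.
function startPollingFixed(tracker, state) {
  if (state.id !== undefined) tracker.clear(state.id);
  state.id = tracker.set(function () {}, 1000);
}

var leaky = createIntervalTracker();
var fixed = createIntervalTracker();
var leakyState = {};
var fixedState = {};
for (var n = 0; n < 3; n++) {
  startPollingLeaky(leaky, leakyState);
  startPollingFixed(fixed, fixedState);
}
var leakyCount = leaky.count(); // grows by one per restart
var fixedCount = fixed.count(); // stays at one

// Stop everything so the demo process can exit.
leaky.clearAll();
fixed.clearAll();
```

If the count keeps climbing while the system is otherwise idle, some caller is creating intervals without clearing them, and the stack traces from the logging wrappers should show which one.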
Re: Need frequent server restarts to control CPU load
I think I have found the cause. Please try this minor fix https://github.com/Z-Wave-Me/home-autom ... 87a9ef8233
We have noticed this on slow hardware, but it never happened on fast RPi4. Anyway, it should fix it and it is already in v4.0.0
Re: Need frequent server restarts to control CPU load
I installed the fix and did a restart. Will observe it.
Raspberry Pi 3 Model B Rev 1.2
Raspbian GNU/Linux 10 (buster, 32bit)
RaZberry by Z-Wave.Me ZW0700 7.20.00 07.38/1766938484 1025/257
Z-Way version v4.1.2 from 2023-10-18 03:34:26 +0300
Re: Need frequent server restarts to control CPU load
PoltoS wrote: ↑23 Aug 2022 02:28
I think I know the clue. Please try this minor fix https://github.com/Z-Wave-Me/home-autom ... 87a9ef8233
We have noticed this on slow hardware, but it never happened on fast RPi4. Anyway, it should fix it and it is already in v4.0.0
Quick note from the OP: thanks for investigating this. I'm currently on the road and will test as soon as I get back.
Raspberry Pi 3 Model B Rev 1.2
Raspbian GNU/Linux 10 (buster)
RaZberry ZW0500 1024/2 SDK: 6.82.01 API: 05.39
Z-Way version v4.1.0
Re: Need frequent server restarts to control CPU load
After 2 days running, I can state that the problem is solved.
Thanks
Raspberry Pi 3 Model B Rev 1.2
Raspbian GNU/Linux 10 (buster, 32bit)
RaZberry by Z-Wave.Me ZW0700 7.20.00 07.38/1766938484 1025/257
Z-Way version v4.1.2 from 2023-10-18 03:34:26 +0300