z-way-server shuts down in the middle of the night

Discussions about RaZberry - Z-Wave board for Raspberry computer
AlphaX2
Posts: 110
Joined: 25 Jul 2015 15:03

z-way-server shuts down in the middle of the night

Post by AlphaX2 » 01 Mar 2017 11:44

Hi,

third day in a row, I'm having a strange behaviour: the z-way-server shutdown, crashed, stopped working, around 3 AM in the night. The server was just not running anymore in the morning (GF noticed the cold radiator), but could be restarted successfully via /etc/init.d/z-way-server start. And it had no problems all day long yesterday, after restart in the morning.

Also the log is fine, these are the last 50 lines:

Code: Select all

2017-03-01 03:02:46.042] [D] [zway] SETDATA devices.18.data.isAwake = True
[2017-03-01 03:02:46.042] [D] [zway] SETDATA devices.18.data.lastSend = 6987801 (0x006aa019)
[2017-03-01 03:02:46.042] [D] [zway] SETDATA devices.18.data.lastNonceGet = 6986801 (0x006a9c31)
[2017-03-01 03:02:46.042] [D] [zway] Running wakeup handler for node 18
[2017-03-01 03:02:46.042] [D] [zway] SETDATA devices.18.instances.0.commandClasses.132.data.lastSleep = 1488333766 (0x58b62bc6)
[2017-03-01 03:02:46.043] [I] [zway] Node 18:0 CC Wakeup: Send node to sleep
[2017-03-01 03:02:46.043] [I] [zway] Adding job: Wakeup Sleep
[2017-03-01 03:02:46.113] [D] [zway] SENDING (cb 0x17): ( 01 09 00 13 12 02 84 08 05 17 6B )
[2017-03-01 03:02:46.115] [D] [zway] RECEIVED ACK
[2017-03-01 03:02:46.119] [D] [zway] RECEIVED: ( 01 04 01 13 01 E8 )
[2017-03-01 03:02:46.119] [D] [zway] SENT ACK
[2017-03-01 03:02:46.119] [D] [zway] Delivered to Z-Wave stack
[2017-03-01 03:02:46.164] [D] [zway] RECEIVED: ( 01 07 00 13 17 00 00 05 F9 )
[2017-03-01 03:02:46.164] [D] [zway] SENT ACK
[2017-03-01 03:02:46.165] [I] [zway] Job 0x13 (Wakeup Sleep): Delivered
[2017-03-01 03:02:46.165] [D] [zway] SendData Response with callback 0x17 received: received by recipient
[2017-03-01 03:02:46.165] [D] [zway] SETDATA devices.18.data.lastSend = 6987813 (0x006aa025)
[2017-03-01 03:02:46.165] [D] [zway] Job 0x13 (Wakeup Sleep): success
[2017-03-01 03:02:46.165] [D] [zway] SETDATA devices.18.data.isAwake = False
[2017-03-01 03:02:46.165] [D] [zway] Sending the queue for node 18 into sleep
[2017-03-01 03:02:46.165] [I] [zway] Removing job: Wakeup Sleep
[2017-03-01 03:03:05.915] [D] [zway] RECEIVED: ( 01 08 00 04 00 06 02 98 40 2F )
[2017-03-01 03:03:05.915] [D] [zway] SENT ACK
[2017-03-01 03:03:05.915] [D] [zway] SETDATA devices.6.data.lastReceived = 0 (0x00000000)
[2017-03-01 03:03:05.917] [I] [zway] Node 6:0 CC Security: sending Nonce Report
[2017-03-01 03:03:05.917] [I] [zway] Adding job: Nonce Report
[2017-03-01 03:03:05.918] [D] [zway] SENDING (cb 0x18): ( 01 11 00 13 06 0A 98 80 15 FC 59 8B 6B 74 4D 06 25 18 BB )
[2017-03-01 03:03:05.925] [D] [zway] RECEIVED ACK
[2017-03-01 03:03:05.926] [D] [zway] RECEIVED: ( 01 04 01 13 01 E8 )
[2017-03-01 03:03:05.926] [D] [zway] SENT ACK
[2017-03-01 03:03:05.926] [D] [zway] Delivered to Z-Wave stack
[2017-03-01 03:03:05.946] [D] [zway] RECEIVED: ( 01 07 00 13 18 00 00 02 F1 )
[2017-03-01 03:03:05.946] [D] [zway] SENT ACK
[2017-03-01 03:03:05.946] [I] [zway] Job 0x13 (Nonce Report): Delivered
[2017-03-01 03:03:05.946] [D] [zway] SendData Response with callback 0x18 received: received by recipient
[2017-03-01 03:03:05.946] [D] [zway] SETDATA devices.6.data.lastSend = 6989736 (0x006aa7a8)
[2017-03-01 03:03:05.946] [D] [zway] Job 0x13 (Nonce Report): success
[2017-03-01 03:03:05.946] [I] [zway] Removing job: Nonce Report
[2017-03-01 03:03:05.975] [D] [zway] RECEIVED: ( 01 28 00 04 00 06 22 98 81 1B 53 93 42 D7 32 B9 67 1C 67 68 B5 69 5C 7D 3F 79 52 D7 B4 EB BC E7 15 0B 3F 41 F0 06 38 B2 25 5C )
[2017-03-01 03:03:05.975] [D] [zway] SENT ACK
[2017-03-01 03:03:05.975] [D] [zway] SETDATA devices.6.data.lastReceived = 0 (0x00000000)
[2017-03-01 03:03:05.976] [I] [zway] Node 6:0 CC Security: Received a secure message
[2017-03-01 03:03:05.982] [D] [zway] SETDATA devices.6.instances.0.commandClasses.152.data.firstPart = **********
[2017-03-01 03:03:05.982] [I] [zway] Node 6:0 CC Security: passing decrypted packet to application level: [ 32 02 21 34 00 00 00 e1 00 00 00 00 00 00 ]
[2017-03-01 03:03:05.983] [D] [zway] SETDATA devices.6.instances.0.commandClasses.50.data.2.val = 22.500000
[2017-03-01 03:03:05.983] [D] [zway] SETDATA devices.6.instances.0.commandClasses.50.data.2.delta = 0 (0x00000000)
[2017-03-01 03:03:05.983] [D] [zway] SETDATA devices.6.instances.0.commandClasses.50.data.2.ratetype = 1 (0x00000001)
[2017-03-01 03:03:05.983] [D] [zway] SETDATA devices.6.instances.0.commandClasses.50.data.2.previous = 0.000000
[2017-03-01 03:03:05.983] [D] [zway] SETDATA devices.6.instances.0.commandClasses.50.data.2 = Empty
[2017-03-01 03:03:06.612] [D] [zway] Job 0x13: deleted from queue
[2017-03-01 03:03:26.349] [D] [zway] Job 0x13: deleted from queue


Also /var/log/messages shows nothing interessting.

EDIT:
Checked back in syslog and found in the last three days the following happend around the moment of z-way-server crash:

Code: Select all

Feb 28 03:03:38 raspberrypi kernel: [1185165.551208] Unhandled prefetch abort: breakpoint debug exception (0x002) at 0x76db5950
Mar  1 03:03:57 raspberrypi kernel: [71474.434813] Unhandled prefetch abort: breakpoint debug exception (0x002) at 0x76d9c950
Mar  2 03:03:28 raspberrypi kernel: [10129.395641] Unhandled prefetch abort: breakpoint debug exception (0x002) at 0x76d9b950
Due to the fact, it's always around 3.03 AM, I would guess it could be a CRON Job or similar thing. But how to identify/find?

Changes I made in the past few days: deleted and reinstalled ThermostatController and installed HABridge to use Alexa with ZWay.

Any suggestions, ideas?

User avatar
PoltoS
Posts: 5053
Joined: 26 Jan 2011 19:36

Re: z-way-server shuts down in the middle of the night

Post by PoltoS » 03 Mar 2017 04:50

run Z-Way via gdb - can give you more info

AlphaX2
Posts: 110
Joined: 25 Jul 2015 15:03

Re: z-way-server shuts down in the middle of the night

Post by AlphaX2 » 06 Mar 2017 10:10

Hi again,

I've run z-way-server via gdb, this is what I got:

info thr

Code: Select all

  Id   Target Id         Frame 
  16   Thread 0x704ff450 (LWP 14965) "zway/core" 0x7641a964 in select ()
    at ../sysdeps/unix/syscall-template.S:81
  15   Thread 0x70cff450 (LWP 14964) "zway/core" 0x76973cb0 in read ()
    at ../sysdeps/unix/syscall-template.S:81
  11   Thread 0x714ff450 (LWP 14958) "zway/sockets" 0x7641a964 in select ()
    at ../sysdeps/unix/syscall-template.S:81
  10   Thread 0x720ff450 (LWP 14957) "zway/timers" 0x763ef360 in nanosleep ()
    at ../sysdeps/unix/syscall-template.S:81
  9    Thread 0x72aff450 (LWP 14954) "zway/core" 0x7641a964 in select ()
    at ../sysdeps/unix/syscall-template.S:81
  8    Thread 0x73aff450 (LWP 14953) "zway/webserver" 0x7641a964 in select ()
    at ../sysdeps/unix/syscall-template.S:81
* 7    Thread 0x74e7f450 (LWP 14951) "zway/core" 0x76e8c950 in v8::internal::OS::DebugBreak() () from ./libs/libv8.so
  6    Thread 0x74e8f450 (LWP 14950) "v8:SweeperThrea" 0x76972a40 in do_futex_wait (isem=isem@entry=0x645dc) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  5    Thread 0x74e9f450 (LWP 14949) "v8:SweeperThrea" 0x76972a40 in do_futex_wait (isem=isem@entry=0x644d4) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  4    Thread 0x74eaf450 (LWP 14948) "v8:SweeperThrea" 0x76972a40 in do_futex_wait (isem=isem@entry=0x643cc) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  3    Thread 0x74ebf450 (LWP 14947) "v8:SweeperThrea" 0x76972a40 in do_futex_wait (isem=isem@entry=0x642c4) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
---Type <return> to continue, or q <return> to quit---
  2    Thread 0x756bf450 (LWP 14946) "OptimizingCompi" 0x76972a40 in do_futex_wait (isem=isem@entry=0x6417c) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  1    Thread 0x76ff4000 (LWP 14941) "z-way-server" 0x763ef360 in nanosleep ()
    at ../sysdeps/unix/syscall-template.S:81
bt

Code: Select all

#0  0x76e8c950 in v8::internal::OS::DebugBreak() () from ./libs/libv8.so
#1  0x76e8c980 in v8::internal::OS::Abort() () from ./libs/libv8.so
#2  0x76b522a8 in v8::Utils::ReportApiFailure(char const*, char const*) ()
   from ./libs/libv8.so
#3  0x76b52514 in v8::internal::V8::FatalProcessOutOfMemory(char const*, bool)
    () from ./libs/libv8.so
So it seems related to running out of memory, but how can it be explained, that it is happening always around the same time point? Two days ago I started z-way-server ~7:30am and it run until 3:03am yesterday. For this GDB session I started the server ~1:20am and it crashed 3:04am. :?:

Checking in htop, RAM was actualy nearly full ~986MB. after quitting the session it's now back down ~160MB.
EDIT: Even my SWAP file was full (allows max. 2GB).

Help would be very appreciated. Thank you very much.

User avatar
PoltoS
Posts: 5053
Joined: 26 Jan 2011 19:36

Re: z-way-server shuts down in the middle of the night

Post by PoltoS » 06 Mar 2017 23:24

We have found a sever memory leak in v2.3.0. This is fixed in v2.3.1-rc3. We hope to release very soon v2.3.1 to solve those stability problems.

AlphaX2
Posts: 110
Joined: 25 Jul 2015 15:03

Re: z-way-server shuts down in the middle of the night

Post by AlphaX2 » 06 Mar 2017 23:31

Okay good to hear that. Is there any specific thing that triggers the leaking? Had no problems the first few days of 2.3.0 usage.

Thank you!

User avatar
PoltoS
Posts: 5053
Joined: 26 Jan 2011 19:36

Re: z-way-server shuts down in the middle of the night

Post by PoltoS » 06 Mar 2017 23:54

every packet lead to a leak of size of that packet. so no way to workaround except for turning off your ZWave devices ;)

AlphaX2
Posts: 110
Joined: 25 Jul 2015 15:03

Re: z-way-server shuts down in the middle of the night

Post by AlphaX2 » 06 Mar 2017 23:58

Sounds like a bad deal. ;)

But still strange, that the crash appears always at the same time?!

User avatar
PoltoS
Posts: 5053
Joined: 26 Jan 2011 19:36

Re: z-way-server shuts down in the middle of the night

Post by PoltoS » 07 Mar 2017 00:05

I think it was a chance ;)

BTW, you can take modzwave.so from the v2.3.1-rc3 build and use with your current release to fix the problem. just substitute it (in modules/ folder)

AlphaX2
Posts: 110
Joined: 25 Jul 2015 15:03

Re: z-way-server shuts down in the middle of the night

Post by AlphaX2 » 07 Mar 2017 00:19

Cool, will try that. So the rc builds not good enough for normal usage atm? Thanks!

User avatar
PoltoS
Posts: 5053
Joined: 26 Jan 2011 19:36

Re: z-way-server shuts down in the middle of the night

Post by PoltoS » 07 Mar 2017 02:13

We have not yet purified the UI part. The JS backend + C are fine

Post Reply