Re: ZWay crashing daily

Posted: 07 Apr 2017 12:23
by sir_ray
I have the same problem with 2.3.1. How can it be resolved? This is not a desirable situation...

Re: ZWay crashing daily

Posted: 07 Apr 2017 12:55
by sir_ray
I downgraded to v2.3.0 and now it works OK again :D

Re: ZWay crashing daily

Posted: 11 Apr 2017 16:42
by 10neWulf
Well.. Mine actually crashed today. Nothing in the log, but it looks like it stopped logging about 30 minutes before it actually crashed, because the log times don't match up with my monitoring tool...

I have put extra load on the server recently by having a touchscreen make API calls quite frequently, but I would still expect the Pi to handle it..

Anyone else still having trouble?

Re: ZWay crashing daily

Posted: 12 Apr 2017 11:36
by fez
I also have stability issues, but I gave up troubleshooting; I don't have time for that.
I just restart it when it crashes. It happens now and then: once a day, once a week, no clear pattern.
I also have a tablet loading the web UI every now and then. This noticeably increased the crash rate.
I am running 2.2.5. I haven't tried 2.3.1 yet because of the stability complaints.

Re: ZWay crashing daily

Posted: 13 Apr 2017 15:14
by ronie
Hi @ all,

these inconsistent z-way-server crashes are an issue we have known about for months, but unfortunately they are hard to reproduce and to debug. I know that's no excuse for your circumstances.

We hoped the latest stable release, v2.3.1, would solve most of them, but that seems not to be the case ...
These inconsistencies also do not appear on every installation. Most installations are running without problems.

At the moment we suspect the access to sockets: the registered sockets seem to get lost or stuck after some time, which leads to errors and finally lets the OS kill the z-way-server process ...
Unfortunately, in most cases z-way-server.log gives us no hint as to why it crashes, only what happened in Z-Way beforehand. With the help of the gdb debugger it is possible to check for the error, but not the cause (that's what we need to find out).

Here is a short How To:

It is best to do all of this inside a single screen session.
  1. start an SSH connection to your box
  2. start a screen session (you may have to install screen first; you can do this with: "$ sudo apt-get install screen")

    Code: Select all

    $ screen
  3. switch to the 'root' user; this avoids erroneous notifications while the server is running

    Code: Select all

    $ sudo su
  4. and then start the gdb procedure ... (see below)
The advantage of this is that the session can stay active in the background.
You can detach from the session by pressing Ctrl+A followed by D; after you reconnect through SSH, "$ screen -r" brings you back to the point where you left off.
Alternatively, you can simply keep that window open.

Code: Select all

$ exit 
Typing this inside the session ends the screen session.

Here is more about that: https://wiki.ubuntuusers.de/Screen/
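As a minimal sketch of the screen workflow described above: the helper below opens a named session, which makes it easier to find again after reconnecting. The function name and the session name zway-gdb are only illustrations, not part of any Z-Way tooling.

```shell
# Hypothetical helper: open a named screen session for the debugging run.
# The default session name "zway-gdb" is an assumption; any name works.
start_debug_session() {
  name="${1:-zway-gdb}"
  if ! command -v screen >/dev/null 2>&1; then
    echo "screen not found; install it with: sudo apt-get install screen" >&2
    return 1
  fi
  # -S gives the session a name so it can be reattached by name later.
  screen -S "$name"
}

# Inside the session: press Ctrl+A, then D, to detach and leave it running.
# After reconnecting over SSH:
#   screen -ls            # list existing sessions
#   screen -r zway-gdb    # reattach to the named session
```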

GDB:
  • start ssh connection to your box
  • stop current z-way-server:

    Code: Select all

    $ sudo /etc/init.d/z-way-server stop
  • switch to the z-way-server directory:

    Code: Select all

    $ cd /opt/z-way-server
  • start z-way-server with the gdb debugger (should already be installed):

    Code: Select all

    $ LD_LIBRARY_PATH=./libs gdb ./z-way-server
  • at the first (gdb) prompt, type ‘r’ (run) to start
    [attachment: gdb_c1.png, "continue first break"]
  • confirm once with ‘c’ (continue)
    [attachment: gdb_r.png, "start gdb"]
  • z-way-server is starting…
    [attachment: gdb_running.png, no longer available]
When the server crashes, a message similar to one of the following appears:

Code: Select all

Program received signal SIGPIPE, Broken pipe. 
[Switching to Thread 0x743ff450 (LWP 22061)] 
0x769882f4 in send () at ../sysdeps/unix/syscall-template.S:81 
81 ../sysdeps/unix/syscall-template.S: No such file or directory. 
(gdb)

Code: Select all

Program received signal SIGSEGV, Segmentation fault. 
[Switching to Thread 0x71fff450 (LWP 21411)] 
0x75728588 in zwjs::SocketConnection::IsConfigured() const () 
from ./modules/modsockets.so 
(gdb)

Code: Select all

Program received signal SIGHUP, Hangup. 
0x76403360 in nanosleep () at ../sysdeps/unix/syscall-template.S:81 
81 ../sysdeps/unix/syscall-template.S: No such file or directory. 
(gdb)
If you see other messages, please let us know.


If such a message appears, please enter 'info thread' and 'bt' to get more information about the error, e.g.:

Code: Select all

Program received signal SIGSEGV, Segmentation fault. 
[Switching to Thread 0x71fff450 (LWP 21411)] 
0x75728588 in zwjs::SocketConnection::IsConfigured() const () 
from ./modules/modsockets.so 
(gdb) info thread 
Id Target Id Frame 
11 Thread 0x71eff450 (LWP 20033) "zway/sockets" 0x7642e964 in select () 
at ../sysdeps/unix/syscall-template.S:81 
10 Thread 0x726ff450 (LWP 20032) "zway/timers" 0x76403360 in nanosleep () 
at ../sysdeps/unix/syscall-template.S:81 
9 Thread 0x738ff450 (LWP 20031) "zway/core" 0x7642e964 in select () 
at ../sysdeps/unix/syscall-template.S:81 
8 Thread 0x742ff450 (LWP 20030) "zway/webserver" 0x7642e964 in select () 
at ../sysdeps/unix/syscall-template.S:81 
7 Thread 0x74e7f450 (LWP 20029) "zway/core" 0x76403360 in nanosleep () 
at ../sysdeps/unix/syscall-template.S:81 
6 Thread 0x74e8f450 (LWP 20028) "v8:SweeperThrea" 0x76986a40 in do_futex_wait (isem=isem@entry=0x64764) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48 
5 Thread 0x74e9f450 (LWP 20027) "v8:SweeperThrea" 0x76986a40 in do_futex_wait (isem=isem@entry=0x6465c) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48 
4 Thread 0x74eaf450 (LWP 20026) "v8:SweeperThrea" 0x76986a40 in do_futex_wait (isem=isem@entry=0x64554) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48 
3 Thread 0x74ebf450 (LWP 20025) "v8:SweeperThrea" 0x76986a40 in do_futex_wait (isem=isem@entry=0x6444c) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48 
2 Thread 0x756bf450 (LWP 20024) "OptimizingCompi" 0x76986a40 in do_futex_wait (isem=isem@entry=0x64304) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48 
* 1 Thread 0x7634b000 (LWP 20021) "z-way-server" 0x76403360 in nanosleep () 
at ../sysdeps/unix/syscall-template.S:81 
(gdb) bt 
#0 0x76403360 in nanosleep () at ../sysdeps/unix/syscall-template.S:81 
#1 0x76403098 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:137 
#2 0x0000aeb4 in main () 
(gdb)
This gives us some information about what happened and why it crashed, but unfortunately not the reason(s) that led to it ...
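The interactive steps above could also be scripted. The following is only a sketch, not an official Z-Way procedure: it writes a gdb command file that mirrors the manual sequence (run, one continue, then the thread list and backtraces). Note that a plain 'bt' only prints the currently selected thread, so the sketch uses gdb's 'thread apply all bt' to capture every thread. The helper name and the file paths are assumptions for illustration.

```shell
# Hypothetical helper: write a gdb command file mirroring the manual steps.
# Assumes the first stop after "run" is the harmless one that the how-to
# continues past with 'c'; the next stop is then treated as the crash.
write_gdb_commands() {
  cat > "$1" <<'EOF'
set pagination off
run
continue
info threads
thread apply all bt
quit
EOF
}

# Usage, from /opt/z-way-server as root (output file name is an assumption):
#   write_gdb_commands /tmp/zway.gdb
#   LD_LIBRARY_PATH=./libs gdb -batch -x /tmp/zway.gdb ./z-way-server 2>&1 \
#     | tee "/tmp/zway-gdb-$(date +%Y%m%d-%H%M%S).log"
```

Piping through tee keeps the output visible in the screen session while also saving a timestamped copy that can be attached to a report.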

After this, you can exit the debugger with ‘q’ (quit) and restart the server with:

Code: Select all

$ sudo /etc/init.d/z-way-server restart
If port 8083 is still occupied, please stop the Z-Way server manually with:

Code: Select all

$ sudo killall -9 z-way-server
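Before resorting to killall -9, it may help to confirm that something really is still listening on port 8083. A small sketch, assuming the ss tool from iproute2 is available on the box; port_listening is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if `ss -ltn` output on stdin shows a
# listener on port $1. Column 4 of that output is "Local Address:Port".
port_listening() {
  awk -v port="$1" '
    NR > 1 { n = split($4, a, ":"); if (a[n] == port) found = 1 }
    END    { exit !found }
  '
}

# Usage:
#   ss -ltn | port_listening 8083 && echo "port 8083 is still occupied"
```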
Please collect these stack traces over some time (maybe 3-4 days) and report them to us.
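If the output of each gdb session is saved to a file, the collected traces can be summarised before reporting. A sketch under that assumption (the helper name and the log file names are hypothetical): it counts how often each signal appears in the "Program received signal" lines shown earlier.

```shell
# Hypothetical helper: count the signals recorded in saved gdb output files.
# Prints one "SIGNAL COUNT" line per signal name found.
summarize_signals() {
  grep -h '^Program received signal' "$@" \
    | sed 's/^Program received signal \([A-Z0-9]*\),.*/\1/' \
    | sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage (file names are an assumption):
#   summarize_signals /tmp/zway-gdb-*.log
```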

We also need information about your system (http://YOUR_BOX_IP:8083/expert/#/network/controller):
  • installed Z-Way version (Software Information > Version Number)
  • type of controller (RaZ / RaZ 2 / UZB) + f/w (Firmware > Serial API Version)
  • list of your running apps (Sonos, MQTT, Global Caché and Fibaro API are especially interesting because they use sockets)
  • is your system plain and dedicated to z-way-server? That is, are no other libs or services running in addition to z-way-server and the standard OS installation? If there are others, which ones (and do they use sockets)?
  • snippet of your z-way-server.log (the 200 lines before the crash)
  • OPTIONAL: backups of your system can also help us reproduce your issues (local backups from Smarthome/Expert UI)
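The log snippet requested in the list above can be cut with tail. A minimal sketch, assuming the log lives at /var/log/z-way-server.log (an assumption; adjust it to your installation) and that it is run soon after the crash, before a restart appends new lines. The helper name is hypothetical.

```shell
# Hypothetical helper: save the last N lines (default 200) of a log file.
save_log_tail() {
  src="$1"
  dest="$2"
  lines="${3:-200}"
  tail -n "$lines" "$src" > "$dest"
}

# Usage (both paths are assumptions):
#   save_log_tail /var/log/z-way-server.log /tmp/z-way-server-crash.log
```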
If you don't want to share your private data with all other users, please send it bundled to support@zwaveeurope.com
We'll handle it with care and can also send issue news or requests directly to you.
Otherwise feel free to support us with information, statistics and debugging. Of course we'll share our new findings with you in this thread.

Hopefully all of this will bring us a big step closer to solving this issue asap :)

PS:
It seems the attached screenshots are a bit mixed up ...
  • gdb_r.png
  • gdb_c1.png
  • gdb_running.png
is the correct order.

Re: ZWay crashing daily

Posted: 14 Apr 2017 10:35
by stellavision
Backed up everything, reinstalled from scratch, restored configuration and everything back to normal.

Re: ZWay crashing daily

Posted: 17 Apr 2017 15:29
by 10neWulf
@ronie I've started the process, and will let you know when it crashes.

The sockets issue could be why this has started happening now, as I have more devices polling the API?

Anyway, it's crashing roughly twice per day now, so I'll let you know after a few crashes.

Re: ZWay crashing daily

Posted: 19 Apr 2017 05:41
by 10neWulf
My server just crashed again, with the following message:

*** glibc detected *** /opt/z-way-server/z-way-server: free(): invalid next size (fast): 0x737d3ab0 ***

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x6c7ff460 (LWP 25850)]
0x763cf8dc in raise () from /lib/arm-linux-gnueabihf/libc.so.6

I've sent an email to the email address above with all the information - hopefully this helps :)

Re: ZWay crashing daily

Posted: 20 Apr 2017 22:38
by Benny
After my system broke, I did a fresh installation on my Pi 3 with RaZ 2 and 80 devices. Newest Jessie, Z-Way v2.3.1. Low degree of automation, just a few lamps. It worked fine for two weeks. Then I got two issues.

The first:
There was some kind of routing problem; it happened twice. The first time lamp 1 should have turned on, but lamp 2 did; the second time lamp 5 should have turned on, but blind 1 went down. I haven't seen this issue again.

The second:
After the two weeks Z-Way stopped working/logging. No automated lights, but I was able to log in. The log was empty, no temperature or anything else was transmitted, and no app was working after the "crash". I just had to switch a device in the Smart Home UI to bring it back to life. A restart of Z-Way wasn't necessary. This has happened three times in the last 4 weeks. Every time it's the same procedure: I just have to log in and switch the state of one device to get Z-Way working again.

Re: ZWay crashing daily

Posted: 21 Apr 2017 22:49
by pimth
I am testing 2.3.4 ....