Razberry 7 Pro problem

AlesKO
Posts: 84
Joined: 24 Nov 2016 09:58

Razberry 7 Pro problem

Post by AlesKO »

Yesterday my Razberry 7 Pro was not accessible from the local network, although the remote connection worked just fine. This is quite a common situation with Z-Way, so as usual I restarted my hub. Now the hub is not accessible even remotely. Great. And automations are not working.

I opened the Raspberry Pi case and neither LED (D1, D2) is lit. Is this OK?

Following the advice of the Z-Way experts on this forum, I no longer use graphical Debian, just terminal mode. With the graphical desktop I would simply open the Debian browser in such a situation, but in terminal mode I am completely lost. ...

Is there any manual, idea, or hint on how to find where the problem is?
AlesKO
Posts: 84
Joined: 24 Nov 2016 09:58

Re: Razberry 7 Pro problem

Post by AlesKO »

I asked ChatGPT how to check whether the Z-Way server is running. It gave me the command 'sudo systemctl status z-way-server'.
The server is active (running), but I still cannot connect to the hub.
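
A running service does not by itself prove that the web interface is reachable. As a rough sketch of the next checks from the terminal, assuming the Z-Way web UI listens on its default port 8083 and that the `ss` tool from iproute2 is installed (the helper name `port_is_listening` is made up for illustration):

```shell
#!/bin/sh
# Sketch: check whether anything is listening on a TCP port.
# Assumption: the Z-Way web UI uses its default port 8083.

# port_is_listening PORT
# Reads `ss -tln` output on stdin and succeeds if some socket
# is listening on PORT.
port_is_listening() {
    # Column 4 of `ss -tln` is "Local Address:Port",
    # e.g. 0.0.0.0:8083 or [::]:8083; the port is the last ":" field.
    awk -v port="$1" '
        $1 == "LISTEN" {
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Typical use on the Pi (not run automatically here):
#   systemctl is-active z-way-server   # prints "active" if the service is up
#   ss -tln | port_is_listening 8083 && echo "Z-Way UI port is open"
#   hostname -I                        # local IP address(es) of the Pi
```

If the port is open but `hostname -I` prints nothing, the problem is more likely in the network setup than in Z-Way itself.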

Mobile: Cloud service not available; Hub not available.
PC: Local IP address not available. Remote access is also not accessible.

I just swapped the RaZberry 7 for my old RaZberry 2 with its old SD card. This setup works fine: the local IP is shown and accessible.

Could it be that the Debian SD card for my RaZberry 7 is corrupted?
seattleneil
Posts: 172
Joined: 02 Mar 2020 22:41

Re: Razberry 7 Pro problem

Post by seattleneil »

Based on the symptoms you described, it sounds like your SD card is corrupted. You can test the SD card from a working Pi if you have a microSD<->USB adapter. Simply insert the suspect SD card into the adapter and then plug the adapter into an unused USB connector on the Pi. Since USB is plug-and-play, leave your Pi powered on to test the SD card.

As root, run this command after you insert the adapter into the Pi (with the suspect SD card installed):
# dmesg | tail -30

This is what appears on my Pi when I run the command:

root@raspberrypi:~# dmesg | tail -30
[    3.833435] i2c /dev entries driver
[    6.318606] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
[    6.895688] systemd-journald[76]: Received request to flush runtime journal from PID 1
[    9.585196] snd_bcm2835: module is from the staging directory, the quality is unknown, you have been warned.
[    9.624154] bcm2835_alsa bcm2835_alsa: card created with 8 channels
[   17.263639] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[   18.787855] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0xC5E1
[   21.406104] Adding 102396k swap on /var/swap.  Priority:-2 extents:1 across:102396k SSFS
[   27.622973] random: crng init done
[   27.623001] random: 7 urandom warning(s) missed due to ratelimiting
[   30.803656] uart-pl011 20201000.serial: no DMA platform data
[  355.355263] usb 1-1.5: new high-speed USB device number 4 using dwc_otg
[  355.491823] usb 1-1.5: New USB device found, idVendor=05e3, idProduct=0715
[  355.491848] usb 1-1.5: New USB device strings: Mfr=3, Product=4, SerialNumber=2
[  355.491859] usb 1-1.5: Product: USB Reader
[  355.491868] usb 1-1.5: Manufacturer: Genesys
[  355.491878] usb 1-1.5: SerialNumber: 000000009407
[  355.506536] usb-storage 1-1.5:1.0: USB Mass Storage device detected
[  355.516109] scsi host0: usb-storage 1-1.5:1.0
[  355.640243] usbcore: registered new interface driver uas
[  356.568496] scsi 0:0:0:0: Direct-Access     Generic  STORAGE DEVICE   9407 PQ: 0 ANSI: 0
[  356.630211] sd 0:0:0:0: Attached scsi generic sg0 type 0
[  356.833826] sd 0:0:0:0: [sda] 62333952 512-byte logical blocks: (31.9 GB/29.7 GiB)
[  356.835015] sd 0:0:0:0: [sda] Write Protect is off
[  356.835037] sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
[  356.841587] sd 0:0:0:0: [sda] No Caching mode page found
[  356.841657] sd 0:0:0:0: [sda] Assuming drive cache: write through
[  356.858876]  sda: sda1 sda2
[  356.864923] sd 0:0:0:0: [sda] Attached SCSI removable disk
[  479.106917]  sda: sda1 sda2
root@raspberrypi:~#

If you don't see "sda: sda1 sda2", then the partition table on the SD card is corrupt, which will make data recovery difficult or impossible.
If you do see "sda: sda1 sda2", the next step is to test whether the Linux filesystem is healthy by running this command:
# e2fsck /dev/sda2
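
Before running e2fsck, the check for the partition line in the dmesg output can also be scripted. A minimal sketch, assuming the adapter shows up as sda as in the listing above (the helper name `partitions_detected` is made up for illustration):

```shell
#!/bin/sh
# Sketch: look for the kernel's partition summary line for a disk.

# partitions_detected DISK
# Reads dmesg output on stdin; succeeds if a line such as
# "[  356.858876]  sda: sda1 sda2" lists partitions 1 and 2 of DISK.
partitions_detected() {
    pattern="[[:space:]]$1: ${1}1 ${1}2([[:space:]]|\$)"
    grep -Eq "$pattern"
}

# Typical use on the Pi (not run automatically here):
#   dmesg | partitions_detected sda && echo "partition table looks readable"
```

`lsblk /dev/sda` shows the same partition layout in a more readable form.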

If the Linux partition is healthy, the e2fsck output should look like the following:

root@raspberrypi:~# e2fsck /dev/sda2
e2fsck 1.43.4 (31-Jan-2017)
rootfs: clean, 48034/1872896 files, 477091/7725184 blocks
root@raspberrypi:~# 
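
When the output is less clear than this, the exit status of e2fsck helps. It is a bit mask documented in the e2fsck man page. The sketch below only translates that status into words; the function name `describe_fsck_status` is made up for illustration. On a card that may be failing, running `e2fsck -n` first is the cautious choice, since `-n` opens the filesystem read-only and answers "no" to every repair question.

```shell
#!/bin/sh
# Sketch: translate an e2fsck exit status (a bit mask) into plain words.

# describe_fsck_status STATUS
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    out=""
    # Each entry is "bit:meaning", following the e2fsck man page.
    for pair in "1:errors corrected" \
                "2:errors corrected, reboot advised" \
                "4:errors left uncorrected" \
                "8:operational error" \
                "16:usage or syntax error" \
                "32:cancelled by user" \
                "128:shared library error"; do
        bit=${pair%%:*}
        text=${pair#*:}
        if [ $((status & bit)) -ne 0 ]; then
            out="${out:+$out; }$text"
        fi
    done
    echo "$out"
}

# Typical use on the Pi (not run automatically here):
#   e2fsck -n /dev/sda2; describe_fsck_status $?
```
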
If the Linux filesystem is healthy, you should feel relieved since there's a good chance all of the files are intact. Before you proceed with additional testing, you should back up the filesystem to your working SD card. Assuming your "good" SD card has sufficient space, the following command is one way to back up the sda2 partition on the "bad" SD card:
# dd if=/dev/sda2 status=progress | gzip -9 > /root/badcard_sda2.img.gz

Be patient - this command could take an hour or more and will create a compressed image of the sda2 partition. You can use this image to recover your data.
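
Once the backup finishes, it is worth confirming that the compressed image is readable before stressing the card any further. A small sketch using gzip's built-in integrity test; the helper name `verify_image` is made up for illustration, and the file name is the one from the dd command above:

```shell
#!/bin/sh
# Sketch: confirm that a gzip-compressed partition image is readable.

# verify_image FILE
# Succeeds only if FILE exists, is non-empty and passes `gzip -t`.
verify_image() {
    [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}

# Typical use on the Pi (not run automatically here):
#   verify_image /root/badcard_sda2.img.gz && echo "image OK"
#
# To browse the files later, decompress and loop-mount read-only.
# The decompressed image needs as much free space as the partition size.
#   gunzip -c /root/badcard_sda2.img.gz > /root/badcard_sda2.img
#   mount -o loop,ro /root/badcard_sda2.img /mnt
```

`gzip -t` only proves that the archive is intact, not that every file inside the image is undamaged.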

The next step will take several minutes and tests the bad SD card by performing a non-destructive read-write test of the sda1 partition. If the SD card is corrupted, errors are expected since this test writes to the SD card. Be aware that if the SD card has only partially failed, this test could cause the SD card to fully fail which is why a backup of the sda2 partition should be done before you run this command.
# badblocks -n -v -s /dev/sda1

If the SD card is healthy, the output should look like the following:

root@raspberrypi:~# badblocks -n -v -s /dev/sda1
Checking for bad blocks in non-destructive read-write mode
From block 0 to 262143
Checking for bad blocks (non-destructive read-write test)
Testing with random pattern: done
Pass completed, 0 bad blocks found. (0/0/0 errors)
root@raspberrypi:~#
There are more tests that can be run (such as: badblocks -n -v -s /dev/sda2), but I'm guessing the tests listed above will be conclusive.
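
If you run several badblocks passes, the summary line can be pulled out with a short filter. This is only a sketch: the helper name `bad_block_count` is made up, and it assumes the summary has the same wording as in the output above. As far as I know badblocks writes its messages to stderr, hence the `2>&1` in the usage comment.

```shell
#!/bin/sh
# Sketch: extract the bad block count from badblocks output.

# bad_block_count
# Reads badblocks output on stdin and prints the number from the
# "Pass completed, N bad blocks found." summary line.
bad_block_count() {
    sed -n 's/.*Pass completed, \([0-9][0-9]*\) bad blocks found.*/\1/p'
}

# Typical use on the Pi (not run automatically here):
#   badblocks -n -v /dev/sda1 2>&1 | bad_block_count
```

Any result other than 0 is a good reason to retire the card.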

Although Z-Way has gotten much better at reducing SD card wear-and-tear, if you have a Pi 4 that supports USB3, you may want to consider replacing the SD card with a real SSD as it will improve speed and reliability. For $21 USD, I'm very happy with the Kingston 240GB A400 SSD (see: https://www.amazon.com/gp/product/B01N5IB20Q). You'll also need a USB3<->SATA adapter - for $11 USD, I'm very happy with the StarTech USB3S2SAT3CB (see: https://www.amazon.com/gp/product/B00HJZJI84).

With a real SSD instead of a microSD card, you can install smartmontools and configure smartd to automatically monitor SSD health and send you an e-mail when there's a problem (see: https://linuxconfig.org/how-to-configur ... -via-email). Here's what appears for my Kingston SSD when I check SMART status from the command line:

# smartctl -a -d sat /dev/sda
smartctl 6.6 2017-11-05 r4594 [armv7l-linux-5.10.103-v7l+] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Phison Driven OEM SSDs
Device Model:     SATA SSD
Serial Number:    AB47071407DE00022048
LU WWN Device Id: 0 000000 000000000
Firmware Version: SBFMKB.3
User Capacity:    240,057,409,536 bytes [240 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      < 1.8 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-4 (minor revision not indicated)
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Jun  3 09:07:26 2023 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (65535) seconds.
Offline data collection
capabilities:                    (0x79) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  30) minutes.
Conveyance self-test routine
recommended polling time:        (   6) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       12877
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       58
168 SATA_Phy_Error_Count    0x0012   100   100   000    Old_age   Always       -       0
170 Bad_Blk_Ct_Erl/Lat      0x0003   092   092   010    Pre-fail  Always       -       0/30
173 MaxAvgErase_Ct          0x0012   100   100   000    Old_age   Always       -       7 (Average 5)
192 Unsafe_Shutdown_Count   0x0012   100   100   000    Old_age   Always       -       55
194 Temperature_Celsius     0x0023   067   067   000    Pre-fail  Always       -       33 (Min/Max 33/33)
218 CRC_Error_Count         0x000b   100   100   050    Pre-fail  Always       -       0
231 SSD_Life_Left           0x0013   100   100   000    Pre-fail  Always       -       99
241 Lifetime_Writes_GiB     0x0012   100   100   000    Old_age   Always       -       245

SMART Error Log Version: 1
No Errors Logged
In my opinion, not having to deal with SD card corruption is worth $32 USD.
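
For a quick manual check between smartd e-mails, single values can be read from the attribute table shown above. A sketch under two assumptions: the attribute names are those reported by this particular Kingston SSD (other drives use different names), and the helper names `smart_raw_value` and `life_left_ok` are made up for illustration.

```shell
#!/bin/sh
# Sketch: read raw values from the `smartctl -A` attribute table.

# smart_raw_value NAME
# Reads the attribute table on stdin and prints the first raw-value
# field (column 10) of the attribute called NAME.
smart_raw_value() {
    awk -v name="$1" '$2 == name { print $10; exit }'
}

# life_left_ok MIN
# Succeeds if SSD_Life_Left (read from stdin) is at least MIN percent.
life_left_ok() {
    left=$(smart_raw_value SSD_Life_Left)
    [ -n "$left" ] && [ "$left" -ge "$1" ]
}

# Typical use on the Pi (not run automatically here):
#   smartctl -A -d sat /dev/sda | smart_raw_value Power_On_Hours
#   smartctl -A -d sat /dev/sda | life_left_ok 20 || echo "plan a replacement"
```
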
AlesKO
Posts: 84
Joined: 24 Nov 2016 09:58

Re: Razberry 7 Pro problem

Post by AlesKO »

Solved by reinstalling the software on the SD card. I also upgraded to v4.1.0.

Another strange situation appeared:
[Attached screenshot: 2023-06-10 210842.jpg]
Every second day the server stops responding to automations and to manual commands.

I need to restart the server (z-way restart); then it works again.
Where can I look for the cause?
seattleneil
Posts: 172
Joined: 02 Mar 2020 22:41

Re: Razberry 7 Pro problem

Post by seattleneil »

Now that your Pi is working again, it looks like you've solved one problem and have encountered a new problem. Actually, it doesn't seem to be a completely new problem as several other forum users have reported huge job queues in Z-Way version 4.1. Other than configuring a cron job to restart Z-Way every day as a temporary workaround, I don't have a suggested solution - I'm running version 4.1 and haven't seen unusual job queues. Perhaps looking at the job queue and z-way log file will show you if there's a specific device that's causing the job queue problem or you'll see a clue so that @PoltoS can isolate the problem.
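
The cron workaround could look roughly like this. It is a sketch, not a fix: the service name z-way-server is the one used with systemctl earlier in this thread, while the helper name `restart_cron_line` and the file name /etc/cron.d/zway-restart are made up for illustration.

```shell
#!/bin/sh
# Sketch: build a cron entry that restarts Z-Way once a day.

# restart_cron_line HOUR MINUTE
# Prints one line in /etc/cron.d format (which includes a user field).
restart_cron_line() {
    hour=$1
    minute=$2
    if [ "$hour" -lt 0 ] || [ "$hour" -gt 23 ] ||
       [ "$minute" -lt 0 ] || [ "$minute" -gt 59 ]; then
        echo "usage: restart_cron_line HOUR(0-23) MINUTE(0-59)" >&2
        return 1
    fi
    printf '%s %s * * * root systemctl restart z-way-server\n' "$minute" "$hour"
}

# Typical use as root (not run automatically here):
#   restart_cron_line 4 30 > /etc/cron.d/zway-restart
```

Pick a time when no automations are expected to run, since devices are unreachable while the server restarts.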
PoltoS
Posts: 7565
Joined: 26 Jan 2011 19:36

Re: Razberry 7 Pro problem

Post by PoltoS »

@AlesKO, we are trying to chase the "queue" issue. If you can provide a log file from when the queue starts filling up (not from when it is already full), that would be nice.
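
One way to have such a log at hand is to save a snapshot as soon as commands start to lag, before restarting the server. A sketch, assuming the log lives at /var/log/z-way-server.log (check your installation); the helper name `snapshot_log` and the output file naming are made up for illustration.

```shell
#!/bin/sh
# Sketch: save the tail of a log file as a timestamped .gz snapshot.

# snapshot_log LOGFILE OUTDIR [LINES]
# Saves the last LINES lines (default 5000) of LOGFILE as a gzip file
# in OUTDIR and prints the name of the new file.
snapshot_log() {
    lines=${3:-5000}
    out="$2/zway-log-$(date +%Y%m%d-%H%M%S).gz"
    tail -n "$lines" "$1" | gzip > "$out" && echo "$out"
}

# Typical use on the Pi (not run automatically here):
#   snapshot_log /var/log/z-way-server.log /root
```

Taking the snapshot before the restart matters, because the interesting entries are the ones written while the queue is growing.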
dvd2000
Posts: 31
Joined: 22 Sep 2017 01:11

Re: Razberry 7 Pro problem

Post by dvd2000 »

I have the same problem. With fw 4.1 I have to restart Z-Way twice a day because it stops responding to manual commands (binary switch or multilevel switch). Sensors continue updating values.
I tried a fresh install and a backup restore, with the same result.

I have reverted to 4.0.2 and everything is stable again.