Corrupt SD card...again!

PoltoS
Posts: 7562
Joined: 26 Jan 2011 19:36

Re: Corrupt SD card...again!

Post by PoltoS »

micky1500 wrote:
14 Feb 2021 01:09
/opt/z-way-server/config/zddx/e4918c29-DevicesData.xml
This one is written only during an interview or once every hour, so it should be OK.
micky1500 wrote:
14 Feb 2021 01:09
/opt/z-way-server/automation/storage/zwayparsedPacketsjson-f1aed46af7fae1368e78db7768e4d9e5.json
/opt/z-way-server/automation/storage/zwayoriginPacketsjson-2ad2998bc2cb543d29ac586cec215398.json
Those already write only once every 100 packets, which is on average once per 3 to 30 minutes. On a busy network it might be more often, in which case we would indeed need to reduce it.
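The "one write per 100 packets" batching can be pictured with a small sketch. This is not Z-Way's code: the class name `BatchedPacketLog`, the `flush_every` parameter and the JSON layout are all made up for illustration. It keeps packets in memory and rewrites the file only once per 100 additions.

```python
import json
import os
import tempfile


class BatchedPacketLog:
    """Hypothetical illustration: persist packets once per `flush_every` additions."""

    def __init__(self, path, flush_every=100):
        self.path = path
        self.flush_every = flush_every
        self.packets = []     # everything seen so far, kept in RAM
        self.pending = 0      # packets added since the last disk write
        self.disk_writes = 0  # how many times the file was rewritten

    def add(self, packet):
        self.packets.append(packet)
        self.pending += 1
        if self.pending >= self.flush_every:
            self.flush()

    def flush(self):
        # Write to a temporary file first, then rename it over the target,
        # so an interrupted write does not leave a half-written file behind.
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w") as fh:
            json.dump(self.packets, fh)
        os.replace(tmp_path, self.path)
        self.pending = 0
        self.disk_writes += 1


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "packets.json")
    log = BatchedPacketLog(target)
    for i in range(250):
        log.add({"seq": i})
    print(log.disk_writes)  # 2 disk writes for 250 packets; 50 are still pending
```

With 100 packets per write, one write every 3 to 30 minutes corresponds to roughly 3 to 33 packets per minute of traffic.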
PoltoS
Posts: 7562
Joined: 26 Jan 2011 19:36

Post by PoltoS »

I wonder what is writing so fast. You can attach strace to see writes:

sudo strace -fp 344 -e trace=write,open
Then you can aggregate the data and check which files are being written. I would hope the rate is really low, only a few times per minute. Maybe you have some very heavy network traffic or a lot of logging?
insiorc
Posts: 43
Joined: 01 Apr 2019 21:55

Post by insiorc »

Thanks PoltoS, I tried running what you posted but I just get what I pasted below. I tried installing strace but it was already installed and up to date.

sudo strace -fp 344 -e trace=write,open
strace: attach: ptrace(PTRACE_SEIZE, 344): No such process

Today I spent quite a bit of time initially pausing virtually everything, then re-enabling sections one at a time and logging manually. I've still got more checking to do, but so far I've noticed big jumps when I enable Philips Hue (24 lights and 6 switches): 3GB total in 30 minutes vs 0.3GB when Hue is paused. Logical rules also cause a jump.

I've also been noticing messages saying 'Can't register duplicate module instance: 147'; I got 5 of these, with different numbers, at the same time. I've checked through some apps and everything appears OK, but could this be related to the high write rate?
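The post does not say how the write totals were logged. One possible way to automate that kind of logging on Linux is to sample the `write_bytes` counter in `/proc/<pid>/io`. The sketch below assumes that approach; the function names and the 60-second interval are illustrative, and reading another user's process usually needs root.

```python
import sys
import time


def parse_write_bytes(io_text):
    """Return the write_bytes counter from the text of /proc/<pid>/io."""
    for line in io_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "write_bytes":
            return int(value)
    raise ValueError("write_bytes not found")


def read_write_bytes(pid):
    with open(f"/proc/{pid}/io") as fh:
        return parse_write_bytes(fh.read())


def monitor(pid, interval=60, samples=10):
    """Print how many MB the process caused to be written in each interval."""
    previous = read_write_bytes(pid)
    for _ in range(samples):
        time.sleep(interval)
        current = read_write_bytes(pid)
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} {(current - previous) / 1e6:.2f} MB in the last {interval}s")
        previous = current


if __name__ == "__main__" and len(sys.argv) > 1:
    monitor(int(sys.argv[1]))
```

Run it with the PID of the z-way-server process as the only argument. Pausing one app at a time while it runs shows which one changes the per-minute figure.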
Rpi4 2Gb, Z-Way 3.2.2, UZB 5.39
insiorc
Posts: 43
Joined: 01 Apr 2019 21:55

Post by insiorc »

So overnight, with Philips Hue off, the other apps on and a full reboot, 6 hrs 19 min of run time produced 1.5GB of disk writes, a huge improvement. 1 hr 22 min later, once movement in the house had started, it was up to 2.7GB.
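To compare the two periods, it helps to turn the totals into average rates. A minimal sketch, assuming the figures are gigabytes and taking 1 GB as 1000 MB (the helper name `mb_per_minute` is made up):

```python
def mb_per_minute(gigabytes, hours=0, minutes=0):
    """Average write rate in MB/min, taking 1 GB as 1000 MB."""
    total_minutes = hours * 60 + minutes
    return gigabytes * 1000 / total_minutes


# 1.5 GB written during the first 6 h 19 min (quiet house, Hue off)
overnight = mb_per_minute(1.5, hours=6, minutes=19)
# A further 2.7 - 1.5 = 1.2 GB during the next 1 h 22 min (movement started)
morning = mb_per_minute(2.7 - 1.5, hours=1, minutes=22)

print(f"overnight: {overnight:.1f} MB/min")  # overnight: 4.0 MB/min
print(f"morning:   {morning:.1f} MB/min")    # morning:   14.6 MB/min
```

So the write rate is roughly 3.7 times higher once sensors start reporting, which fits the idea that device reports drive a large share of the writes.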

Other than Philips Hue, I'm wondering if this is an issue to do with too many devices, all reporting several values (PIR, temp, humidity, lux etc.)? Since disabling Philips Hue I've noticed a vast improvement in the reaction speed of lights turning on, so I'm going to phase it out for Z-Wave.

To try and reduce things, is it enough to go into Z-Way devices and disable events that are not required, or would I need to do this in the sensor configuration?
Attachment: NEO disable.png
Rpi4 2Gb, Z-Way 3.2.2, UZB 5.39
michap
Posts: 437
Joined: 26 Mar 2013 10:35

Post by michap »

insiorc wrote:
17 Feb 2021 01:01
sudo strace -fp 344 -e trace=write,open
strace: attach: ptrace(PTRACE_SEIZE, 344): No such process
The "344" is the PID of the z-way process on that system; yours will differ. You can find it with, for example:

ps -x | grep z-way

If you use

strace -fp [PID] -e trace=write,open -s100 -y

you will also get the filenames, for example:

[pid  3403] write(11</opt/z-way-server/automation/storage/configjson-06b2d3b23dce96e1619d2b53d6c947ec.json.tmp>, "{\"controller\":{\"first_start_up\":\"2020-01-11T11:23:47.526Z\",\"count_of_reconnects\":102,\"emailInitialAc"..., 106496) = 106496
[pid  3403] write(11</opt/z-way-server/automation/storage/configjson-06b2d3b23dce96e1619d2b53d6c947ec.json.tmp>, "://192.168.178.92\",\"database\":\"energy\",\"excludeTags\":[],\"interval\":10},\"id\":19,\"creationTime\":161131"..., 2614) = 2614
[pid  3403] write(3</var/log/z-way-server.log>, "[2021-02-17 17:55:47.156] [I] [core] Request was successful\n", 60) = 60
Use the -o [filename] option to write the output to a file.
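Once the output is in a file, the aggregation PoltoS suggested can be done with a short script. The following is a sketch, not a polished tool: it assumes the line format shown in the sample above (strace with `-y`, an optional `[pid N]` or bare PID prefix, no timestamps), and the function name `bytes_per_file` is made up.

```python
import re
import sys
from collections import defaultdict

# Optional "[pid 123]" prefix (terminal output) or bare "123" prefix.
PID = r"^(?:\[pid\s+(?P<pid>\d+)\]|(?P<pid2>\d+))?\s*"
COMPLETE = re.compile(PID + r"write\(\d+<(?P<path>[^>]*)>.*\)\s*=\s*(?P<ret>-?\d+)")
UNFINISHED = re.compile(PID + r"write\(\d+<(?P<path>[^>]*)>.*<unfinished \.\.\.>")
RESUMED = re.compile(PID + r"<\.\.\. write resumed>.*=\s*(?P<ret>-?\d+)")


def pid_of(match):
    return match.group("pid") or match.group("pid2") or "-"


def bytes_per_file(lines):
    """Sum the bytes returned by write() per file path."""
    totals = defaultdict(int)  # path -> bytes written
    calls = defaultdict(int)   # path -> number of successful write() calls
    in_flight = {}             # pid -> path of a write split across two lines
    for line in lines:
        m = UNFINISHED.search(line)
        if m:
            in_flight[pid_of(m)] = m.group("path")
            continue
        m = RESUMED.search(line)
        if m:
            path = in_flight.pop(pid_of(m), None)
        else:
            m = COMPLETE.search(line)
            path = m.group("path") if m else None
        if m and path is not None and int(m.group("ret")) > 0:
            totals[path] += int(m.group("ret"))
            calls[path] += 1
    return totals, calls


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as fh:
        totals, calls = bytes_per_file(fh)
    for path, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{total:>12} bytes  {calls[path]:>6} writes  {path}")
```

Pass the file created with `-o` as the only argument. The files at the top of the list are the ones worth looking at first.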
insiorc
Posts: 43
Joined: 01 Apr 2019 21:55

Post by insiorc »

Well, I think I'm making progress: now only 6GB in 10 hours, and I'd guess a lot of that was due to the number of changes I've been making. I disabled anything I wasn't using; I had loads of Fibaro Dimmer 2 heat alarms and the like, so these are all now disabled. I've replaced a few Philips Hue lamps with more Z-Wave stuff. The remaining Philips Hue can be phased out in time, but what's left is confined to bedroom lighting, which had minimal automation anyway, so it's staying off the Z-Way system. Everything appears to be a lot faster acting now.

michap, thanks for your explanation. I'm a bit out of my depth but trying to figure it out. I may have picked this up wrong, but 413 was listed in top as the z-way server process, so using that number I got a continually running list of writes for about 15 minutes. I'm not sure if it timed out or if I inadvertently stopped it, but I managed to copy the last 5 minutes of text, which came to 1865 lines (is there a command to stop it, or should I let it run until it stops?). Reading through them, it just looks like a load of sends and receives from sensors, but a number that appears often is 46; is that a device ID? If so, I do not have a device 46. EDIT: yes I do, no idea why I didn't see it. I tried attaching the saved file, but it won't let me attach a .txt file.

[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.769] [D] [zway] SENT ACK\n", 46) = 46
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.770] [D] [zway] SETDATA devices.37.data.lastReceived = 0 (0x00000000)\n", 91) = 91
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.771] [zway] Node 37:0 CC Security: sending Nonce Report\n", 81) = 81
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.772] [zway] Adding job: Nonce Report\n", 62) = 62
[pid 540] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.776] [core] [BaseModule-16] Set lastLevel to 100 for ZWayVDev_zway_30-0-49-"..., 111) = 111
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.777] [W] [zway] Received SOF, while awaiting ACK\n", 70) = 70
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.778] [D] [zway] RECEIVED: ( 01 0B 00 04 00 25 02 98 40 C1 00 00 CE )\n", 90) = 90
[pid 568] write(7</dev/ttyACM0>, "\6", 1) = 1
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.779] [D] [zway] SENT ACK\n", 46) = 46
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.781] [D] [zway] SETDATA devices.37.data.lastReceived = 0 (0x00000000)\n", 91) = 91
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.782] [zway] Node 37:0 CC Security: sending Nonce Report\n", 81) = 81
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.783] [zway] Adding job: Nonce Report\n", 62) = 62
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.787] [W] [zway] Received SOF, while awaiting ACK\n", 70) = 70
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.788] [D] [zway] RECEIVED: ( 01 0B 00 04 00 25 02 98 40 C1 00 00 CE )\n", 90) = 90
[pid 568] write(7</dev/ttyACM0>, "\6", 1) = 1
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.789] [D] [zway] SENT ACK\n", 46) = 46
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.791] [D] [zway] SETDATA devices.37.data.lastReceived = 0 (0x00000000)\n", 91) = 91
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.793] [zway] Node 37:0 CC Security: sending Nonce Report\n", 81) = 81
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.794] [zway] Adding job: Nonce Report\n", 62) = 62
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.797] [D] [zway] RECEIVED ACK\n", 50) = 50
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.798] [D] [zway] RECEIVED: ( 01 04 01 13 01 E8 )\n", 69) = 69
[pid 568] write(7</dev/ttyACM0>, "\6", 1) = 1
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.800] [D] [zway] SENT ACK\n", 46) = 46
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.800] [D] [zway] Delivered to Z-Wave stack\n", 63) = 63
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.807] [D] [zway] RECEIVED: ( 01 18 00 13 6D 00 00 02 00 BF 7F 7F 7F 7F 00 00 03 "..., 129) = 129
[pid 568] write(7</dev/ttyACM0>, "\6", 1) = 1
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.808] [D] [zway] SENT ACK\n", 46) = 46
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.808] [zway] Job 0x13 (Nonce Report): Delivered\n", 72) = 72
[pid 540] write(64</opt/z-way-server/automation/storage/configjson-06b2d3b23dce96e1619d2b53d6c947ec.json.tmp>, "{\"controller\":{\"first_start_up\":\"2019-01-04T11:21:55.023Z\",\"count_of_reconnects\":189,\"firstaccess\":f"..., 409600 <unfinished ...>
[pid 568] write(3</var/log/z-way-server.log>, "[2021-02-17 20:13:59.811] [D] [zway] SendData Response with callback 0x6d received: received by reci"..., 106 <unfinished ...>
[pid 540] <... write resumed> ) = 409600
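A trace like the one above is easier to read once identical log messages are grouped. The sketch below assumes the same strace `-y` line format and strips the timestamp so repeats collapse into one entry; the function name `repeated_messages` is made up.

```python
import re
import sys
from collections import Counter

# A write() to z-way-server.log; the quoted text is the log line itself.
LOG_WRITE = re.compile(
    r'write\(\d+<[^>]*z-way-server\.log>, "(?P<text>.*)"(?:\.\.\.)?, \d+'
)
TIMESTAMP = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\] ")


def repeated_messages(lines):
    """Count log messages written to z-way-server.log, ignoring timestamps."""
    counts = Counter()
    for line in lines:
        m = LOG_WRITE.search(line)
        if not m:
            continue
        text = TIMESTAMP.sub("", m.group("text"))
        if text.endswith("\\n"):  # strace prints the newline as backslash-n
            text = text[:-2]
        counts[text] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as fh:
        for text, n in repeated_messages(fh).most_common(20):
            print(f"{n:>6}  {text}")
```

Note that the number after each write, such as the `46) = 46` on the "SENT ACK" lines, is the byte count passed to and returned by write(), i.e. the length of that log line. It is not a device ID. Node IDs appear inside the message text, for example "Node 37:0" and "devices.37".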
Rpi4 2Gb, Z-Way 3.2.2, UZB 5.39
michap
Posts: 437
Joined: 26 Mar 2013 10:35

Post by michap »

If there are many writes to
/opt/z-way-server/automation/storage/configjson-xxxx.json
check the device settings: some device may be reporting at a short interval.
(I found an Aeon switch with a default report interval of 10 seconds.)

In that case, every change of a parameter value causes a write to this file and to the Z-Way log file (and to the device history, if installed).

My log files (including z-way-server.log) are all on a small RAM disk, with a log rotated once its size exceeds 30MB; the same goes for syslog etc.
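The post does not say which tool does the rotation; on most systems logrotate with its `size` and `copytruncate` directives would be the usual choice. Purely to illustrate the idea, here is a sketch of size-based copy-and-truncate rotation. The function name and the single `.1` backup are assumptions, and whether copy-and-truncate suits z-way-server depends on how it opens its log, which the thread does not state.

```python
import os
import shutil


def rotate_if_large(path, limit_bytes=30 * 1024 * 1024):
    """Copy `path` to `path + ".1"` and empty it once it exceeds `limit_bytes`.

    Copy-then-truncate leaves the original file in place, so a program that
    holds it open can keep writing to it. Anything written between the copy
    and the truncate is lost.
    """
    if os.path.getsize(path) <= limit_bytes:
        return False
    shutil.copyfile(path, path + ".1")
    with open(path, "r+b") as fh:
        fh.truncate(0)
    return True
```

On a RAM disk the limit also caps how much memory the logs can take: with one backup kept, at most about twice the limit per log file.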
vinisz
Posts: 151
Joined: 23 Nov 2019 23:23

Post by vinisz »

I have not read all of this thread, but isn't switching to USB a better idea for the RPi?
Booting from it is fully supported by now.
I made the transition from SD to USB on the RPi 2 months ago and am very happy with it. It is also easy to clone from SD to USB.
insiorc
Posts: 43
Joined: 01 Apr 2019 21:55

Post by insiorc »

vinisz, yes, I've now got the RPi running on a USB drive. From what I gather, though, it is no more tolerant of write damage than an SD card, although yes, it is a bit easier and faster to work with for images etc. I've now got 2 identical USB drives so I can clone from one to the other.

From the Z-Way logs I've got the error below. I've uninstalled and reinstalled the BaseModule; I'm not really sure what more I should do or whether that will sort it. I should see at 21:40, as the error seems to appear every hour. It's a slow process for me, but I am enjoying the learning.

[2021-02-18 20:40:06.278] [core] [AutoOffInactive_162] Error: Device not found DummyDevice_159
at AutoOffInactive.BaseModule.error (automation/userModules/BaseModule/index.js:122:17)
at automation/userModules/BaseModule/index.js:183:18
at Function._.each._.forEach (automation/lib/underscore.js:153:9)
at AutoOffInactive.BaseModule.processDeviceList (automation/userModules/BaseModule/index.js:174:7)
at AutoOffInactive.checkInactivity (automation/userModules/AutoOffInactive/index.js:71:14)
[2021-02-18 20:40:06.746] [zway] Adding job: Get background noise level
Rpi4 2Gb, Z-Way 3.2.2, UZB 5.39
micky1500
Posts: 298
Joined: 07 Feb 2016 16:29
Location: England

Post by micky1500 »

Cool, but still confusing.
Writing to a RAM drive has got to be 10x faster, and it doesn't wear out SD cards or SSD drives.

2 months is too short to know; the SD cards I use last about 1 year before failing.
The memory chips in SD cards are the same as in USB SSD units.
Unless the USB device firmware has SSD wear levelling, it will fail at the same rate as an SD card.
Wear levelling moves the next write to a different block, so the same block is not continuously rewritten.
I have a year-old Integral 240GB SSD drive that is showing signs of write failures. That is not related to Z-Way, just CCTV snapshots.
https://cpc.farnell.com/integral/inssd2 ... EML007-005
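The effect of wear levelling described above can be shown with a toy simulation. It is a deliberately simplified model: real controllers use more elaborate policies, and the round-robin rule, the block count and the function name here are assumptions chosen for illustration.

```python
def simulate(writes, blocks, wear_levelling):
    """Return per-block erase counts after `writes` rewrites of one logical block."""
    erase_counts = [0] * blocks
    physical = 0  # physical block currently holding the logical block
    for _ in range(writes):
        if wear_levelling:
            # Simplest possible policy: move to the next physical block each time.
            physical = (physical + 1) % blocks
        erase_counts[physical] += 1
    return erase_counts


without = simulate(10_000, blocks=100, wear_levelling=False)
with_wl = simulate(10_000, blocks=100, wear_levelling=True)
print(max(without))  # 10000: one block takes every rewrite
print(max(with_wl))  # 100: the same rewrites are spread over 100 blocks
```

The total number of writes is identical in both cases; only the worst-case wear on a single block changes.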
Raspi 4 - (Buster - 32 Bit) Zwave Version 4.1.1, Raz 7 Pro, Serial API Version: 07.38