drumstyx

Members
  • Posts: 99

drumstyx's Achievements

Apprentice (3/14) · 5 Reputation

  1. Hooray! You did it! What a weird problem, lol. Now my first-world problem is that my internet is actually faster than my SATA drives (not to mention the disk shelf they're in is a DS4246, so even turbo write peaks at about 85MB/s), so I can't rely on the mover to keep up with data intake. Maybe I can throttle nzbget at a certain cache usage threshold (a sketch of that idea follows after this list)...
  2. Fair enough -- I've attached diags. I'm not quite sure where to look in them for this issue. As far as I can tell, there have been no dropouts or errors on the cache other than a failure to write due to insufficient space. datnas-diagnostics-20221230-1334.zip
  3. Right, but why is there a filesystem issue when it's "full"? Isn't the global reserve supposed to prevent this? I guess I didn't have minimum free space configured on the cache, so that's my bad, but filling a filesystem shouldn't leave it stuck in an unrecoverable read-only state. If I accept the risk of some corrupted data in whatever was last written, is there any way to just say "to hell with integrity, delete this file"?
  4. Long story short, I upgraded to 3Gbps fiber internet and accidentally filled my 2TB cache pool (dual 2TB NVMe drives) with the automated download processes. The filesystem went read-only, so I rebooted, and of course it came back read-only again. On top of this, a disk in my array happened to fail a couple of days ago and I didn't notice, so I'm running a rebuild onto a warm spare I had in there to replace it.

     Problem is, now I can't run mover, and even after copying data off manually I can't delete anything to free up space. The global reserve is at 512MiB and says 0 is used, but it's still not happy. The rebuild means I can't go into maintenance mode right now. I'll wait it out if I have to, but it'd be awesome if I could just figure out how to delete even a single file from the cache to free it up a bit. Scrub doesn't work (read-only), balancing doesn't work (read-only)... seems like I'm stuck? (See the btrfs recovery sketch after this list.)

     It's frustrating because I went with a dual-drive btrfs cache pool specifically to prevent issues like this -- yet the suggestions I see on here boil down to "best bet is to back up the cache, reformat and rebuild it".
  5. Does this happen to have tools that used to be included in the old devpack plugin? Awesome to have an equivalent for nerdpack, but devpack was also indispensable for some uses! Seems to be a VERY similar codebase, so likely not a difficult port, if you're up for it!
  6. config/stop runs very early in the shutdown process -- not sure if you meant to quote me where I said "To run things LATE in the shutdown process is hard" there, but it holds true. It works great for various full-stack cleanup or external notifications/hooks, but it's technically not the best place for commands that affect the status of the server hardware (like my much earlier posts about turning the AC power off at the plug).
  7. The stop script is just about the earliest you could want anything to run. Shutdown is initiated with /sbin/init 0, which sets the runlevel to 0, at which point the absolute earliest hook is the first line of /etc/rc.d/rc.0 (after "#!/bin/bash", of course). The number after "rc." in the script name is the runlevel: runlevel 0 is shutdown (officially "halt") and runlevel 6 is reboot. In Unraid, rc.0 is just an alias for rc.6, and one script handles both, checking which runlevel it was called as.

     The only scripts that run before /boot/config/stop are those in /etc/rc.d/rc0.d/ or /etc/rc.d/rc6.d/, depending on runlevel, which at this point contains only the flash backup of the My Servers plugin, if applicable.

     All that to say: running things EARLY in the shutdown process is easy -- /boot/config/stop. Running things LATE in the shutdown process is hard, and would involve either modifying the base Unraid image on flash, or adding something to the go script that modifies rc.6 to your liking on boot (either copying in a complete replacement file, which would have to be updated manually with each OS update, or some awk/sed/bash magic to modify the last if statement in rc.6 -- a sketch of the latter follows after this list).
  8. I've experienced various hardware failures over the years, and they've been a pain in the arse! Not the fault of Unraid, just because I didn't take all the necessary precautions -- failed cache, failed drives, failed USB. Thing is, I can mitigate almost everything by adding some redundancy -- dual parity, cache pool, etc. -- which I've now done. I've got this thing almost bulletproof, but for one thing: a USB drive failure.

     Is it possible, or are there any plans to make it possible, to have a boot pool or some redundancy for the boot drive? I know that, in theory, the boot drive is only needed for power on and power off (actually, I'm not even sure it's needed for power off?), but I do write a rotating syslog to my boot drive because it's very useful for debugging array and system bugs. I could write to an unassigned device, or the cache, but the boot drive is guaranteed to exist as long as the system is running, and if any of those other drives drop because of a system-wide failure, I don't want to lose the logs.

     I'm willing to sacrifice a boot drive every few years, but I'd REALLY like to not have that cause an outage that I have to be physically present to fix.
  9. Long story short, I had a cache drive failure (an ADATA NVMe SSD, quite a surprise), and after recovery I'm looking to improve reliability. I've picked up a 2TB Samsung NVMe drive to complement the ADATA warranty replacement when it arrives (also 2TB). Thing is, I know the process when an array drive fails, but I have no idea what happens when a BTRFS drive fails in a pool. I've read some horror stories about a failed drive causing an unrecoverable situation. Of course I back up important stuff to the array (and important stuff on the array is backed up offsite), but recovering from that is still a pain in the butt. So, can anyone walk me through what a BTRFS cache pool drive failure looks like? (There's a rough sketch of the underlying btrfs mechanics after this list.)
  10. Does Unraid work fine running on cores other than 0 these days? I remember when I first started playing around with pinning/isolation years ago, Unraid didn't play nice if I pinned/isolated core 0, even if I left, say, core 3 (the last core on a quad-core CPU) entirely untouched. So I've always just let Unraid have the first core, which is no problem with homogeneous CPUs. If I went 12th gen, though, I'd really hate to give up a P core if I could hand Unraid an E core for its minor management work instead. (An example isolcpus line follows after this list.)
  11. Got it -- runlevel is 0 for shutdown, 6 for reboot, 3 for normal operation. For posterity, here's my hacked-together script for it:

        #!/bin/bash
        # Fires at the very start of shutdown/reboot, before anything is unloaded
        currentRunLevel=$(runlevel | awk '{print $NF}')
        echo "*********** THE CURRENT RUN LEVEL IS ***********"
        echo "$currentRunLevel"
        # Only act on a real shutdown (runlevel 0), not a reboot (runlevel 6)
        if [[ "$currentRunLevel" == "0" ]]; then
          # Trigger the delay-configured outlets for the disk shelves on the APC PDU
          snmpset -v 1 -c private 192.168.42.20 1.3.6.1.4.1.318.1.1.12.3.3.1.1.4.1 integer 5
          snmpset -v 1 -c private 192.168.42.20 1.3.6.1.4.1.318.1.1.12.3.3.1.1.4.2 integer 5
          snmpset -v 1 -c private 192.168.42.20 1.3.6.1.4.1.318.1.1.12.3.3.1.1.4.5 integer 5
        fi

      The echos at the top were just for me to verify that the runlevel has already changed by the time the script executes, and indeed it seems that can be relied on. To add: this script runs pretty much immediately when a shutdown is triggered (at least from the GUI), BEFORE anything is unloaded. Technically that means it's not an ideal spot for it, but since Unraid force-kills things after 90 seconds (at least, that's what I assume when it says it's allowing 90 seconds for graceful shutdown), my timers getting triggered over SNMP are fine at 180 seconds.

      The only downsides to the timeouts are that coming back up after a power loss is a rather long process (I've got a Raspberry Pi that listens for the UPS OFFBATTERY state and brings things back, but it needs to wait longer than the timeout to make sure the server isn't currently trying to shut down), and that power is drawn from the UPS for longer than is really necessary, but only by a minute or two. You just can't make the timings too tight, otherwise if there are any hangups during shutdown, it could kill the power too soon.

      EDIT: This isn't a thread for SNMP specifically, but I thought I'd note what the heck those long strings are: OIDs for the outlets on my PDU. APC provides MIB files to use in a MIB browser, and with some digging you can figure out how to control various things. It's a pain, not to mention archaic, but it's neat once it's set up. I'm positive I'll hate myself if I ever move outlets around and forget I set this up, but for now it's a blast!
  12. Come to think of it, this will probably run on both reboot and shutdown -- if the outlet timeout is long enough, and the scripts robust enough, it shouldn't be a problem for functionality, but it could slow down reboots considerably. Any ideas as to how I might ensure it only runs on shutdown?
  13. Ah, this could be perfect -- do you know if it runs before or after networking is unloaded, or can you point me to what script calls the stop script? The most ideal place to put this script is RIGHT before the poweroff call, just like a UPS killpower command, but just like the UPS killpower command, this won't work over SNMP because networking is already unloaded by then. Annoyingly, filesystems appear to be unmounted after stopping networking, so the only real option is to have a long poweroff delay on the outlets and send the request at the beginning of the shutdown. Probably fair to assume it's before the rc.0 script is called at all, so I'll give it a shot!
  14. Looking to run a script that shuts down my disk shelves on Unraid shutdown. I've configured power-off delays on my switched PDU, and I'm sending SNMP commands to the PDU to make it work. Problem is, I'm using /etc/rc.d/rc.0 as the script location, and that lives in the RAM disk, so it doesn't persist across reboots. There used to be a powerdown script, but that's apparently been deprecated, and rc.0 just calls /sbin/poweroff, which is a binary. Basically, I'm looking for the shutdown equivalent of the go script.
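
A sketch for the nzbget throttling idea in post 1, assuming the cache pool is mounted at /mnt/cache and that nzbget's JSON-RPC interface is reachable with its default control credentials; the thresholds, URL and paths are placeholders to adjust, and this illustrates the approach rather than anything worked out in the thread:

    #!/bin/bash
    # Pause nzbget when the cache pool gets too full, resume once the mover catches up.
    # ASSUMED values -- adjust the mount point, thresholds and nzbget URL/credentials.
    CACHE_MOUNT="/mnt/cache"
    PAUSE_AT=90      # pause downloads at this % used
    RESUME_AT=70     # resume once usage drops back under this %
    NZBGET="http://nzbget:tegbzn6789@127.0.0.1:6789/jsonrpc"

    # Current usage of the cache filesystem as a bare number (e.g. "87")
    usage=$(df --output=pcent "$CACHE_MOUNT" | tail -1 | tr -dc '0-9')

    if [ "$usage" -ge "$PAUSE_AT" ]; then
      curl -s "$NZBGET/pausedownload" > /dev/null
    elif [ "$usage" -le "$RESUME_AT" ]; then
      curl -s "$NZBGET/resumedownload" > /dev/null
    fi

Run it every few minutes from the User Scripts plugin (or cron); the gap between the two thresholds keeps it from flapping while the mover is still working.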
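
For the full, read-only cache pool in posts 3 and 4, here's the generic btrfs first-aid sequence rather than anything Unraid-specific -- a sketch assuming the pool is mounted at /mnt/cache and that the kernel will let you remount it read-write (if it won't, it generally takes a fresh mount or maintenance mode first):

    # See where the space actually went: data vs metadata vs the global reserve
    btrfs filesystem usage /mnt/cache

    # If the filesystem was forced read-only, try flipping it back first
    mount -o remount,rw /mnt/cache

    # On a completely full btrfs, plain deletes can still fail with ENOSPC;
    # a commonly suggested workaround is truncating an expendable file first
    # (the path below is only an example)
    truncate -s 0 /mnt/cache/downloads/some-big-expendable-file
    rm /mnt/cache/downloads/some-big-expendable-file

    # With a little space freed, reclaim empty and nearly-empty data chunks
    btrfs balance start -dusage=0 /mnt/cache
    btrfs balance start -dusage=10 /mnt/cache

Scrub and balance refusing to run while the pool is read-only (as in post 4) is expected; everything above hinges on getting it writable again first.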
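
Post 7 mentions using the go script to patch rc.6 as the only way to run something late in the shutdown sequence. A rough sketch of that idea, assuming a hook script at /boot/config/late-stop (a made-up name) and relying on post 14's note that the halt path ends in a call to /sbin/poweroff; the sed anchor would need checking against the rc.6 your Unraid version actually ships:

    # Lines appended to /boot/config/go -- go runs once per boot, and rc.6 lives in the
    # RAM disk, so the patch has to be re-applied on every boot.
    if ! grep -q '/boot/config/late-stop' /etc/rc.d/rc.6; then
      # Insert the hook before the line(s) that call /sbin/poweroff, i.e. as late as
      # possible on the shutdown path (the reboot branch is untouched).
      sed -i '\#/sbin/poweroff#i [ -x /boot/config/late-stop ] && /boot/config/late-stop' /etc/rc.d/rc.6
    fi

Anything /boot/config/late-stop does at that point has to survive without the array, the cache or the network, which is exactly why the SNMP approach in post 11 fires at the start of shutdown and leans on the PDU's power-off delay instead.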
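
On post 9's question about what a BTRFS pool drive failure actually looks like: on Unraid the usual route is to stop the array and swap the pool member in the GUI, so the commands below are only a sketch of the underlying btrfs mechanics for a raid1 pool, with placeholder device names and /mnt/cache as the assumed mount point:

    # Per-device error counters -- a dying pool member usually shows up here first
    btrfs device stats /mnt/cache

    # Which devices make up the pool and how data is spread across them
    btrfs filesystem show /mnt/cache
    btrfs filesystem usage /mnt/cache

    # If one member is gone outright, a raid1 pool can still be mounted degraded
    # from the surviving device (device name is a placeholder)
    mount -o degraded /dev/nvme0n1p1 /mnt/cache

    # Replace the missing device (identified by its devid from 'filesystem show',
    # here assumed to be 2) with the new drive and let btrfs rebuild the mirror
    btrfs replace start -B 2 /dev/nvme1n1p1 /mnt/cache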
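
For the pinning/isolation question in post 10, the kernel-side half is just the isolcpus parameter on the append line of /boot/syslinux/syslinux.cfg, if your version's CPU Pinning settings page doesn't already manage isolation for you. The core numbers below are an example for a 4-core/8-thread CPU -- isolating cores 2-3 plus their hyperthread siblings while leaving core 0 to Unraid, matching the habit described in the post:

    # /boot/syslinux/syslinux.cfg (excerpt) -- core numbers are an example only
    label Unraid OS
      menu default
      kernel /bzimage
      append isolcpus=2,3,6,7 initrd=/bzroot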