Linguafoeda Posted January 10

Dumb question, but I'm trying to connect my agent to Netdata Cloud. I logged in via GitHub and it then asks me to run the command below, which I tried in the Netdata container's own console. Nothing happens; it says "sh: 1: sudo: not found". I cannot find any sort of netdata folder in either /etc/ or /mnt/cache/appdata/. Where is the file location that I can use to enable this Netdata Cloud connection?

$ sudo cat /var/lib/netdata/netdata_random_session_id

Tip: Run the command and paste here the key it will give you. If the command doesn't work out of the box, locate the /var/lib/netdata/netdata_random_session_id file, open it in your favorite text editor, and copy it to your clipboard.

Also, separately: what steps do I need to follow to enable alerts for x% memory usage that can be sent to my Discord server? I clicked the "Alerts" button up top -> Alerts Configuration -> System Memory Utilization, then clicked the single line item twice until it showed up as a sidebar, but I don't see anywhere to enable, configure, or change it. I am a pretty big noob when it comes to some Unraid stuff, so any dummy guide to enabling this would be much appreciated. (The reason I want memory usage alerts is so I can track down what is causing my Unraid machine to hit 100% RAM usage and crash daily.)
Nuke Posted January 12

This Docker container creates massive writes to the cache drive. How do I solve it?
Ether Wrangler Posted January 13

On 1/9/2024 at 7:16 PM, Linguafoeda said: Dumb question but I'm trying to connect my agent to Netdata [...] what steps do i need to follow to enable alerts for x% memory usage that can be sent to my discord server? [...]

The sudo command isn't working because that package is not in the image. Also, on my install I have not been able to find the netdata_random_session_id file, and I don't see this file in a quick Google search either. Can you post a link to what you are trying to do?

For the second part, it looks like you have to change the alert settings via the CLI. You'll have to edit two files from inside the Netdata container's CLI: one to change the alert's config, another to give Netdata the settings it needs to send the Discord notifications. You'll also have to create a new webhook/bot in Discord that will be used in that second file.

Let me know if this is something you want to look at and I can try to help you, but I saw in your other thread that you think you have the issue narrowed down to Plex.
Ether Wrangler Posted January 13

8 hours ago, Nuke said: This docker creates massive writes to cache drive. How to solve it?

I do not see anything like this in the syslog on Unraid or in the Docker log for Netdata on my system. Which log file are these in?
Linguafoeda Posted January 14

On 1/12/2024 at 11:08 PM, Ether Wrangler said: The sudo command isn't working because that package is not in the image. [...] Let me know if this is something you want to look at and I can try to help you [...]

I would love to get Netdata properly set up so I can monitor alerts in future. Right now I can't find a netdata folder anywhere, so I can't even begin to edit an alert file. What is step #1 to create the netdata appdata folder? Is there a package I need to install besides the https://hub.docker.com/r/netdata/netdata Docker container I have installed? How do I get "Netdata Cloud" installed and working?
Ether Wrangler Posted January 16

On 1/14/2024 at 5:07 PM, Linguafoeda said: I would love to get netdata properly setup so i can monitor alerts in future. [...] How do i get "netdata cloud" installed and working?

If you have the latest version of the template (updated at the end of October), you'll have all you need in the template for Cloud. To see where the container is creating the appdata, click "Show more settings" at the bottom of the Netdata Docker settings page. There you'll see three paths: NetData_Config, NetData_Lib, and NetData_Cache. Make sure all three are mapped, or you'll have issues with Netdata Cloud persisting data and with setting up the config for alerts.

Go to https://www.netdata.cloud/ and sign up for an account. You'll reach a point where you're given instructions to install Netdata on something; those instructions should include a Claim Token, a Claim URL, and a Claim Room. Paste all three into the three matching fields in the template, and Unraid should start showing up in Netdata Cloud.

To set up the Discord portion, this should walk you through it: https://learn.netdata.cloud/docs/alerting/notifications/agent-dispatched-notifications/discord

When you get to:

sudo ./edit-config health_alarm_notify.conf

ignore the sudo part. If you do the Test Notification section, I don't think you'll need to do the following:

# become user netdata
sudo su -s /bin/bash netdata

To change the RAM alert settings, run this command after you are done editing the config above:

./edit-config health.d/ram.conf

In there you can change the settings however you like. For more info on that one, here's the documentation on setting those up: https://learn.netdata.cloud/docs/alerting/health-configuration-reference
Linguafoeda Posted January 16 (edited)

57 minutes ago, Ether Wrangler said: If you have the latest version of the template (updated at the end of October) you'll have all you need in the template for Cloud. [...] To change the ram alert settings, run this command after you are done editing the config above: ./edit-config health.d/ram.conf [...]

So I uninstalled and reinstalled the container, as it was entirely missing all those template variables, and I got Netdata Cloud up and running.

I got to line 2 (the sudo one) and it failed. If I remove the word sudo, it opens a text-editor-like window, but then I don't know how to save the changes once I've made them. Or can I edit this conf file in something like Krusader?

cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
./edit-config health_alarm_notify.conf

Is this the webhook info I'm supposed to paste?

#------------------------------------------------------------------------------
# discord (discordapp.com) global notification options
SEND_DISCORD="YES"
DISCORD_WEBHOOK_URL="https://discord.com/api/webhooks/XXXXXXXXXXXXX/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
DEFAULT_RECIPIENT_DISCORD="alerts"
Linguafoeda Posted January 16

I now have three "health_alarm_notify.conf" files in my config folder with .swo, .swn and .swp extensions. Not sure if I did something wrong...
Ether Wrangler Posted January 20

On 1/15/2024 at 10:02 PM, Linguafoeda said: [...] if i remove sudo word, it opens up a text editor-like window but then i don't know how to save the changes once I've made them? or can i edit this conf file in something like krusader? [...]

On 1/15/2024 at 10:33 PM, Linguafoeda said: I now have three "health_alarm_notify.conf" files in my config folder with .swo, .swn and .swp extensions. not sure if i did something wrong...

Sorry, it's been a busy week at work.

If you don't change the file permissions, Krusader probably won't have the correct ones to save the file. I'm not sure what those three files you have highlighted are, but it does look like the correct file got created (the one circled in red in your screenshot).

If the text editor in your terminal shows a row of shortcuts at the bottom, that's an editor called "nano", if you ever want to watch a video to learn more about it. Those are cheat codes for the common shortcuts, and ^ is the "modifier key". The default modifier in nano is Ctrl, so to save and exit you can press Ctrl-O (Write Out) and then Ctrl-X (Exit). Or you can just press Ctrl-X; it will ask whether you want to save the buffer, and you can answer y/n.
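A note on the three extra files: names ending in .swp, .swo and .swn match the pattern Vim uses for swap files, which it leaves behind when an editing session is closed without a clean exit. That fits the later finding in this thread that the container's default editor is vim-tiny. The sketch below is one way to list them; the function name `list_swap_files` is hypothetical, and the default path /etc/netdata is an assumption taken from the `cd` command quoted earlier.

```shell
# List leftover Vim swap files (.swp, .swo, .swn) in a directory.
# Assumption: the Netdata config directory inside the container is
# /etc/netdata, as in the `cd` command quoted in this thread.
list_swap_files() {
  dir=${1:-/etc/netdata}
  # Swap files are usually hidden (leading dot), e.g. .health_alarm_notify.conf.swp
  find "$dir" -maxdepth 1 -type f -name '*.sw[nop]' | sort
}
```

Run `list_swap_files` from the container console. Once no editor is still open on the config file, the listed files should be safe to delete with `rm`; Vim would otherwise offer to recover from them the next time the file is opened.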
Linguafoeda Posted January 20 (edited)

3 hours ago, Ether Wrangler said: [...] So to save and exit you can do ctrl-o (Write Out) then ctrl-x (Exit). Or you can just do ctrl-x and it will ask if you want to save the buffer and you can answer y/n.

I don't have that option; it just says "insert" at the bottom. I right-clicked on the Netdata container -> Console -> typed those two commands. I tried a bunch of Ctrl + key commands, and nothing seems to do anything along the lines of saving.
Ether Wrangler Posted January 22

On 1/19/2024 at 11:31 PM, Linguafoeda said: I don't have that option - it just says insert at the bottom. [...]

That's vi. Here's a link on how to use it: https://www.howtogeek.com/102468/a-beginners-guide-to-editing-text-files-with-vi/

If you'd rather use nano (at least learn the vi exit sequence, though), then run:

apt update && apt install nano
select-editor

When you run select-editor you should get an option to change from vim-tiny to nano. My only guess is that I installed nano in the past, set it as the default editor, and forgot about it.
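For reference, the vi exit sequence mentioned above is: press Esc, then type :wq and press Enter to save and quit, or :q! and Enter to quit without saving. As an alternative to select-editor, the sketch below picks nano when it is installed and falls back to vi. It assumes that Netdata's edit-config script honors the EDITOR environment variable, which should be verified against the script in your image; `pick_editor` is a hypothetical helper name.

```shell
# Print the editor to use: nano when it is installed, otherwise vi.
pick_editor() {
  if command -v nano >/dev/null 2>&1; then
    echo nano
  else
    echo vi
  fi
}

# Assumption: edit-config reads $EDITOR. From the Netdata config directory
# inside the container, one could then run:
#   EDITOR="$(pick_editor)" ./edit-config health.d/ram.conf
```

Setting EDITOR for a single command like this affects only that invocation, so nothing in the container's configuration is changed permanently.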
Linguafoeda Posted January 22 (edited)

2 hours ago, Ether Wrangler said: That's vi. [...] When you run select-editor you should get an option to change from vim-tiny to nano. [...]

Thank you, nano seemed to work. Is there a way to globally set all text editing on my Unraid to use nano? I tried to run that command in the main Unraid terminal and it didn't work, but it did work in the Netdata container's console.

The test worked for the webhook. Then I ran the first two commands. All I did was change the warn and crit ranges (i.e. alert at 70% instead of 80% for warn), then I saved and exited and ran the reload-health command, but nothing shows up in my Netdata web console indicating that an alert is set up.

To edit the ram.conf file:

cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
./edit-config health.d/ram.conf

ram.conf file:

# you can disable an alarm notification by setting the 'to' line to: silent

alarm: ram_in_use
on: system.ram
class: Utilization
type: System
component: Memory
os: linux
hosts: *
calc: $used * 100 / ($used + $cached + $free + $buffers)
units: %
every: 10s
warn: $this > (($status >= $WARNING) ? (70) : (90))
crit: $this > (($status == $CRITICAL) ? (90) : (98))
delay: down 15m multiplier 1.5 max 1h
summary: System memory utilization
info: System memory utilization
to: sysadmin

alarm: ram_available
on: mem.available
class: Utilization
type: System
component: Memory
os: linux
hosts: *
calc: $avail * 100 / ($system.ram.used + $system.ram.cached + $system.ram.free + $system.ram.buffers)
units: %
every: 10s
warn: $this < (($status >= $WARNING) ? (15) : (10))
delay: down 15m multiplier 1.5 max 1h
summary: System available memory
info: Percentage of estimated amount of RAM available for userspace processes, without causing swapping
to: silent

alarm: oom_kill
on: mem.oom_kill
os: linux
hosts: *
lookup: sum -30m unaligned
units: kills
every: 5m
warn: $this > 0
delay: down 10m
summary: System OOM kills
info: Number of out of memory kills in the last 30 minutes
to: silent

## FreeBSD

alarm: ram_in_use
on: system.ram
class: Utilization
type: System
component: Memory
os: freebsd
hosts: *
calc: ($active + $wired + $laundry + $buffers) * 100 / ($active + $wired + $laundry + $buffers + $cache + $free + $inactive)
units: %
every: 10s
warn: $this > (($status >= $WARNING) ? (70) : (90))
crit: $this > (($status == $CRITICAL) ? (90) : (98))
delay: down 15m multiplier 1.5 max 1h
summary: System memory utilization
info: System memory utilization
to: sysadmin

alarm: ram_available
on: mem.available
class: Utilization
type: System
component: Memory
os: freebsd
hosts: *
calc: $avail * 100 / ($system.ram.free + $system.ram.active + $system.ram.inactive + $system.ram.wired + $system.ram.cache + $system.ram.laundry + $sy>
units: %
every: 10s
warn: $this < (($status >= $WARNING) ? (15) : (10))
delay: down 15m multiplier 1.5 max 1h
summary: System available memory
info: Percentage of estimated amount of RAM available for userspace processes, without causing swapping
to: silent

After editing ram.conf, I ran:

netdatacli reload-health
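One detail in the posted ram.conf is worth spelling out. The `warn` line is a two-threshold (hysteresis) expression: when the alert is not yet in WARNING it compares against the second number (90), and once it is in WARNING it compares against the first (70). Changing 80 to 70 therefore appears to lower the level at which an existing warning clears, not the level at which a new one is raised. The sketch below mimics that logic with a hypothetical shell helper, `ram_warn_state`; it is an illustration of the expression as posted, not Netdata's implementation.

```shell
# Mimics: warn: $this > (($status >= $WARNING) ? (70) : (90))
#   $1 = 1 if the alert is already in WARNING (or worse), otherwise 0
#   $2 = current RAM usage as a whole-number percentage
ram_warn_state() {
  already_warning=$1
  used_pct=$2
  if [ "$already_warning" -eq 1 ]; then
    threshold=70   # an active warning stays raised while usage is above 70%
  else
    threshold=90   # a new warning is raised only once usage exceeds 90%
  fi
  if [ "$used_pct" -gt "$threshold" ]; then
    echo WARNING
  else
    echo CLEAR
  fi
}

ram_warn_state 0 85   # prints CLEAR: not yet warning, and 85 is not above 90
ram_warn_state 0 91   # prints WARNING
ram_warn_state 1 75   # prints WARNING: already warning, and 75 is above 70
```

Under this reading, getting a warning at 70% would mean lowering the second number as well, for example `(60) : (70)`; the health configuration reference linked earlier in the thread describes the `$status` conditional in more detail.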
Linguafoeda Posted January 25

I'm getting a weird error message on my Discord from Netdata. Does anyone know what this means?

SERVER needs attention, app.dhcp_fds_open_limit, App group dhcp file descriptors utilization = 500%
App group dhcp file descriptors utilization = 500%
Open files percentage against the processes limits, among all PIDs in application group
app.dhcp_fds_open_limit
Ether Wrangler Posted January 26

On 1/22/2024 at 12:53 AM, Linguafoeda said: thank you - nano seemed to work. is there a way to globally set on my unraid all text editing to use nano? [...] ran the reload-health command but nothing shows up in my netdata web console indicating an alert is setup? [...]

On 1/25/2024 at 12:23 PM, Linguafoeda said: I'm getting a weird error message on my discord from netdata, does anyone know what this means? SERVER needs attention, app.dhcp_fds_open_limit, App group dhcp file descriptors utilization = 500% [...]

I don't think there is a way to change the editor globally, because each container will have its own configuration. I also don't see any way in the web console to verify whether the alert is configured, but based on the second post, you are at least getting Discord alerts for some of the built-in alerts.

As to what that alert means, I've never seen it, but I have a static IP address on my server. If no one else chimes in here with an idea, you might want to take that message to the General Support forum, since I think whatever is causing it is going to come from Unraid itself and not this Docker container.
dlparisi Posted February 16 (edited)

On 1/25/2024 at 12:23 PM, Linguafoeda said: I'm getting a weird error message on my discord from netdata, does anyone know what this means? SERVER needs attention, app.dhcp_fds_open_limit, App group dhcp file descriptors utilization = 500% [...]

I'm getting the same warning alert in my Netdata Cloud console, even down to the "500%". I don't recall seeing this before, but I'm not exactly sure when it started showing up either. Everything seems to be running fine on the server, though. The alert even survives a restart and goes immediately back up to 500% (there's just a gap during the restart). Any idea if this is normal, fixable, or even something to worry about?

I should add that I've got no other Docker containers or VMs running. Is there any way to tell what might be causing this?
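One way to cross-check what the alert reports is to compare a process's open file descriptors with its soft "Max open files" limit straight from /proc. The sketch below assumes a Linux host and enough privileges to read the target's /proc entries; `fd_usage_pct` is a hypothetical helper, and which PIDs Netdata groups under "dhcp" is not established in this thread, so the PID has to be supplied by hand.

```shell
# Print open file descriptors as a percentage of the soft "Max open files"
# limit for one PID, using /proc (Linux only).
fd_usage_pct() {
  pid=$1
  open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
  soft=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits" 2>/dev/null)
  case "$soft" in
    ''|*[!0-9]*) echo unknown; return 1 ;;   # missing or non-numeric limit
  esac
  [ "$soft" -gt 0 ] || { echo unknown; return 1; }
  echo $(( open * 100 / soft ))
}

# Example: usage of the current shell itself
#   fd_usage_pct "$$"
```

A result far above 100 for one of the DHCP-related processes would suggest an unusually low limit on that process rather than a runaway number of open files, but that interpretation is a guess until it is checked on an affected server.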
dlparisi Posted February 19 (edited)

I was able to avoid the warning by reverting to v1.44.3 (stable) instead of the nightly, but checking the app.dhcp_fds_open_limit value still shows it at 500%. I also tried booting into safe mode, with no change. Now I'm even more confused.
Bmalone Posted April 10

I recently updated my Netdata Docker containers on my production and backup servers, which have an almost identical configuration (or as close as I can keep them). After the update, my production instance doesn't work, while my backup server is fine. The app starts fine but won't render any visualizations of the data, and I'm getting the errors below. I've tried deleting the container and reinstalling, but the errors are the same. Any idea what could be causing this?

time=2024-04-10T10:13:00.425-05:00 comm=apps.plugin source=collector level=error errno="3, No such process" tid=718 thread=apps.plugin msg="Cannot process /host/proc/12131/cmdline (command 'z_wr_iss')"
time=2024-04-10T10:13:27.413-05:00 comm=apps.plugin source=collector level=error errno="3, No such process" tid=718 thread=apps.plugin msg="Cannot process /host/proc/36316/status (command 'zfs')"
time=2024-04-10T10:19:58.407-05:00 comm=apps.plugin source=collector level=error errno="3, No such process" tid=718 thread=apps.plugin msg="Cannot process /host/proc/29752/limits (command 'zpool')"
Bmalone Posted April 11

On 4/10/2024 at 10:24 AM, Bmalone said: I recently updated my Netdata Docker containers on my production and backup server [...] Any idea what could be causing this? [...]

Today I tried installing a new container, deleting all the appdata from the previous install, and rolling back to older versions, and I'm seeing the errors below consistently. Any idea why this stopped working all of a sudden?

time=2024-04-11T16:27:59.961-05:00 comm=netdata source=health level=warning tid=709 thread=HEALTH msg_id=9ce0cb58ab8b44df82c4bf1ad9ee22de node=Goathead instance=app.dhcp_fds_open_limit context=app.fds_open_limit code=0 alert_id=1712871056 alert_unique_id=1712871195 alert_event_id=2 alert_transition_id=94e16a2fc9464250ad2e77411cf2c4bf alert_config=27ec43a6f27349f2808a959928f59920 alert=apps_group_file_descriptors_utilization alert_class=Utilization alert_component=Process alert_type=System alert_exec=/usr/libexec/netdata/plugins.d/alarm-notify.sh alert_recipient=sysadmin alert_duration=1 alert_value=500 alert_value_old=null alert_status=WARNING alert_value_old=UNINITIALIZED alert_units=% alert_summary="App group dhcp file descriptors utilization" alert_info="Open files percentage against the processes limits, among all PIDs in application group" alert_notification_timestamp=2024-04-11T16:27:59-05:00 msg="ALERT 'apps_group_file_descriptors_utilization' of instance 'app.dhcp_fds_open_limit' on node 'Goathead', transitioned from UNINITIALIZED to WARNING"
time=2024-04-11T16:43:25.329-05:00 comm=cgroup-network source=collector level=error tid=1548 thread=cgroup-network msg="child pid 1549 exited with code 1."
time=2024-04-11T16:43:25.329-05:00 comm=cgroup-network source=collector level=error tid=1548 thread=cgroup-network msg="Cannot find a cgroup PID from cgroup '/host/sys/fs/cgroup/docker/43c4a2d485238a93adc627e2338ecf01733bb284dc08a13e0bfe9cc7033a11b9'"
time=2024-04-11T16:43:25.330-05:00 comm=netdata source=daemon level=error tid=761 thread=P[cgroups] msg="child pid 1548 exited with code 1."
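When triaging log excerpts like these, it can help to see which component produces the most error lines before reading individual messages. The sketch below counts `level=error` lines grouped by their `comm=` field, based only on the key=value layout visible in the excerpts; `count_errors_by_comm` is a hypothetical helper, and the container name `netdata` in the usage comment is an assumption.

```shell
# Count log lines at level=error, grouped by the comm= field.
# Reads key=value log lines (as in the excerpts above) from standard input.
count_errors_by_comm() {
  grep 'level=error' \
    | sed -n 's/.* comm=\([^ ]*\) .*/\1/p' \
    | sort | uniq -c | sort -rn
}

# Usage, assuming the container is named "netdata":
#   docker logs netdata 2>&1 | count_errors_by_comm
```

For what it's worth, the apps.plugin "No such process" lines typically just mean a short-lived process exited between being listed and being read, and the level=warning line is the same app.dhcp_fds_open_limit alert discussed earlier in the thread, so neither is obviously the cause of the missing charts.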