[Plugin] CA Fix Common Problems



Hi

I run the latest Unraid on my Gen8 MicroServer with a Xeon and 16 GB of RAM. It worked perfectly until last week; now I get this error from your excellent plugin:

Your server has detected hardware errors. You should install mcelog via the NerdPack plugin, post your diagnostics and ask for assistance on the unRaid forums. The output of mcelog (if installed) has been logged

I have done this but don't really understand the log, which is attached in Word format. Can you please help?

 

Log file.docx

Link to comment
9 minutes ago, keymaster said:

Hi

I run the latest Unraid on my Gen8 MicroServer with a Xeon and 16 GB of RAM. It worked perfectly until last week; now I get this error from your excellent plugin:

Your server has detected hardware errors. You should install mcelog via the NerdPack plugin, post your diagnostics and ask for assistance on the unRaid forums. The output of mcelog (if installed) has been logged

I have done this but don't really understand the log, which is attached in Word format. Can you please help?

 

Log file.docx

 

Always post the diagnostics zip instead.

Your CPU is overheating:

Jul  5 17:32:57 Tower kernel: CPU7: Package temperature above threshold, cpu clock throttled (total events = 1)
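For anyone who wants to check their own syslog for the same symptom, here is a minimal sketch that picks out throttling messages. The pattern is modeled only on the line quoted above, so other kernel versions may word the message differently, and the function name `find_throttle_events` is made up for this example.

```python
import re

# Pattern modeled on the kernel message quoted in this post (an assumption:
# other kernel versions may phrase the message differently).
THROTTLE_RE = re.compile(
    r"kernel: CPU(?P<cpu>\d+): (?P<scope>Package|Core) temperature above threshold, "
    r"cpu clock throttled \(total events = (?P<events>\d+)\)"
)


def find_throttle_events(lines):
    """Return (cpu, scope, total_events) for each throttling message found."""
    events = []
    for line in lines:
        match = THROTTLE_RE.search(line)
        if match:
            events.append((int(match["cpu"]), match["scope"], int(match["events"])))
    return events


sample = [
    "Jul  5 17:32:57 Tower kernel: CPU7: Package temperature above threshold, "
    "cpu clock throttled (total events = 1)",
    "Jul  5 17:32:58 Tower kernel: some unrelated message",
]
print(find_throttle_events(sample))
```

The same function can be fed the lines of the syslog file from the diagnostics zip; a growing "total events" count would suggest the throttling is ongoing rather than a one-off.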

Link to comment
  • 2 weeks later...

This is likely an issue for the Docker container template owners, but my containers started alerting about the template URL not matching.

 

However, two of the docker container errors are as follows:
"The template URL the author specified is Array. The template can be updated automatically with the correct URL."

 

Seems odd that the solution is "Array" and FCP can't seem to actually resolve the issue.

 

Edit: The containers in question are Davos and Letsencrypt

 

Thoughts?

Edited by wreave
Link to comment
5 hours ago, wreave said:

This is likely an issue for the Docker container template owners, but my containers started alerting about the template URL not matching.

 

However, two of the docker container errors are as follows:
"The template URL the author specified is Array. The template can be updated automatically with the correct URL."

 

Seems odd that the solution is "Array" and FCP can't seem to actually resolve the issue.

 

Edit: The containers in question are Davos and Letsencrypt

 

Thoughts?

 

Add DuckDNS to the list of "Array"

Link to comment
5 hours ago, wreave said:

Seems odd that the solution is "Array" and FCP can't seem to actually resolve the issue.

 

Edit: The containers in question are Davos and Letsencrypt

The problem is actually a template error on the part of lsio.  I already fixed ubooquity for them to let them know what the problem was, and dropped @CHBMB a line to let him know about 8 other templates (those 2 were part of them) that suffered the same issue.

Link to comment
2 minutes ago, CHBMB said:

Did you? I must have forgotten, although I did have a look in CA the other day and thought all our templates were ok...

 

Look at my PR on ubooquity that @sparklyballs already accepted.  It listed everything with the mistake.  CA doesn't make use of that tag, but FCP shows a URL of "array" if the mistake is there and the container is installed.

Link to comment

Is this a problem, or should I ignore it? I thought I read that I can ignore it, but I'm not sure.

Template URL for docker application LetsEncrypt is missing.

Template URL for docker application ombi is missing.

Template URL for docker application plexpy is missing.

Template URL for docker application radarr is missing.

Link to comment
29 minutes ago, squirrellydw said:

Is this a problem, or should I ignore it? I thought I read that I can ignore it, but I'm not sure.

Template URL for docker application LetsEncrypt is missing.

Template URL for docker application ombi is missing.

Template URL for docker application plexpy is missing.

Template URL for docker application radarr is missing.

Fix it. Once it's fixed, lsio will be able to push out updates to the templates (e.g. added ports), and in most cases you will pick up those changes automatically.

Link to comment

Fix Common Problems keeps telling me that my CrashPlan and Plex Docker containers are configured wrong because they have the wrong network type.

They actually are not. Since the new Unraid version it is possible to give Docker containers their own IP address, and this is what I have deliberately chosen to do.

Maybe the check could be changed so that, when the other requirements for a container IP address are met, the network type is no longer flagged as an error?

 

This is what it shows now:

 

Jul 29 17:52:19 Tower root: Fix Common Problems: Error: Docker Application CrashPlan is currently set up to run in br0 mode ** Ignored
Jul 29 17:52:19 Tower root: Fix Common Problems: Error: Docker Application plex is currently set up to run in br0 mode ** Ignored

This is because the "Network type" is set to br0. That is not wrong, however, when you need the container to have its own IP address; in that case the "fixed IP address" field is also filled in.

So if br0 is set for these containers and the fixed IP address field is empty, it should trigger an error.

If br0 is set and the fixed IP address field is filled in, there should be no error.
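The rule proposed here can be sketched in a few lines. This only illustrates the suggestion as worded in this post, not how FCP actually performs its check; the function name `br0_check` and its parameters are hypothetical.

```python
def br0_check(network_type, fixed_ip):
    """Apply the rule proposed in this post (hypothetical, not FCP's real code).

    Returns "error" when the network type is br0 and no fixed IP is set,
    and None in every other case.
    """
    if network_type.strip().lower() != "br0":
        return None  # the proposal says nothing about other network types
    if fixed_ip and fixed_ip.strip():
        return None  # br0 with a fixed IP: a deliberate choice, no error
    return "error"  # br0 with an empty fixed IP field


# The address below is a made-up example value.
print(br0_check("br0", ""))
print(br0_check("br0", "192.168.1.50"))
```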

 

 

Link to comment
1 hour ago, Helmonder said:

Fix Common Problems keeps telling me that my CrashPlan and Plex Docker containers are configured wrong because they have the wrong network type.

They actually are not. Since the new Unraid version it is possible to give Docker containers their own IP address, and this is what I have deliberately chosen to do.

Maybe the check could be changed so that, when the other requirements for a container IP address are met, the network type is no longer flagged as an error?

 

This is what it shows now:

 


Jul 29 17:52:19 Tower root: Fix Common Problems: Error: Docker Application CrashPlan is currently set up to run in br0 mode ** Ignored
Jul 29 17:52:19 Tower root: Fix Common Problems: Error: Docker Application plex is currently set up to run in br0 mode ** Ignored

This is because the "Network type" is set to br0. That is not wrong, however, when you need the container to have its own IP address; in that case the "fixed IP address" field is also filled in.

So if br0 is set for these containers and the fixed IP address field is empty, it should trigger an error.

If br0 is set and the fixed IP address field is filled in, there should be no error.

 

 

I thought that would eventually happen. Surprised it took so long, though. I'll check it out.

Link to comment
2 hours ago, Helmonder said:

So if BR0 is set for these containers and the fixed ip address field is empty, then it should trigger an error

 

When the fixed IP address field is empty, the container gets an auto-assigned IP address. The range for these assignments is set under general Docker settings.

Link to comment
  • 2 weeks later...

I am having an issue with Fix Common Problems.

 

The system this is running on:

Intel S2600CP4

2x E5-2640

2x 16GB DDR3 ECC

SAS Key for onboard ports

SAS Expander

16 2.5" bays (9 drives hooked up)

 

With an HBA there is no issue. When I moved over to the onboard SAS ports, I could not get the system to come up fully. Unraid would boot, but Docker containers, VMs, shares, etc. were down. I traced it to Fix Common Problems. Here is the log:



Aug 12 22:06:14 Tower root: Fix Common Problems Version 2017.07.28
Aug 12 22:06:16 Tower kernel: BUG: unable to handle kernel NULL pointer dereference at           (null)
Aug 12 22:06:16 Tower kernel: IP: isci_task_abort_task+0x1c/0x36b [isci]
Aug 12 22:06:16 Tower kernel: PGD 0
Aug 12 22:06:16 Tower kernel: P4D 0
Aug 12 22:06:16 Tower kernel:
Aug 12 22:06:16 Tower kernel: Oops: 0000 [#1] PREEMPT SMP
Aug 12 22:06:16 Tower kernel: Modules linked in: md_mod bonding igb ptp pps_core i2c_algo_bit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate intel_uncore isci libsas intel_rapl_perf i2c_i801 i2c_core ahci scsi_transport_sas wmi ipmi_si libahci button [last unloaded: pps_core]
Aug 12 22:06:16 Tower kernel: CPU: 0 PID: 2258 Comm: kworker/u288:8 Not tainted 4.12.3-unRAID #1
Aug 12 22:06:16 Tower kernel: Hardware name: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.02.06.0006.032420170950 03/24/2017
Aug 12 22:06:16 Tower kernel: Workqueue: scsi_tmf_8 scmd_eh_abort_handler
Aug 12 22:06:16 Tower kernel: task: ffff88081829b500 task.stack: ffffc90004334000
Aug 12 22:06:16 Tower kernel: RIP: 0010:isci_task_abort_task+0x1c/0x36b [isci]
Aug 12 22:06:16 Tower kernel: RSP: 0018:ffffc90004337c98 EFLAGS: 00010292
Aug 12 22:06:16 Tower kernel: RAX: ffffffffa0193329 RBX: ffff8807dcecfda8 RCX: 0000000000000000
Aug 12 22:06:16 Tower kernel: RDX: ffff88081cc20420 RSI: 0000000000002400 RDI: 0000000000000000
Aug 12 22:06:16 Tower kernel: RBP: ffffc90004337e28 R08: 00000000000000d7 R09: 0000001918aa7c00
Aug 12 22:06:16 Tower kernel: R10: 000000000000003a R11: 071c71c71c71c71c R12: 0000000000000000
Aug 12 22:06:16 Tower kernel: R13: ffff8807f2a69000 R14: 0000000000000000 R15: 0000000000000008
Aug 12 22:06:16 Tower kernel: FS:  0000000000000000(0000) GS:ffff88081d200000(0000) knlGS:0000000000000000
Aug 12 22:06:16 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 12 22:06:16 Tower kernel: CR2: 0000000000000000 CR3: 000000000180a000 CR4: 00000000000406f0
Aug 12 22:06:16 Tower kernel: Call Trace:
Aug 12 22:06:16 Tower kernel: ? freed_request+0x36/0x51
Aug 12 22:06:16 Tower kernel: ? __blk_put_request+0xfb/0x13c
Aug 12 22:06:16 Tower kernel: ? blk_put_request+0x4a/0x51
Aug 12 22:06:16 Tower kernel: ? scsi_execute+0x17a/0x18a
Aug 12 22:06:16 Tower kernel: ? cpuacct_charge+0x6d/0x74
Aug 12 22:06:16 Tower kernel: ? dequeue_entity+0x4d2/0x4ec
Aug 12 22:06:16 Tower kernel: ? put_prev_entity+0x26/0x2f8
Aug 12 22:06:16 Tower kernel: sas_eh_abort_handler+0x2e/0x49 [libsas]
Aug 12 22:06:16 Tower kernel: scmd_eh_abort_handler+0x3a/0x93
Aug 12 22:06:16 Tower kernel: process_one_work+0x147/0x222
Aug 12 22:06:16 Tower kernel: worker_thread+0x1da/0x2a5
Aug 12 22:06:16 Tower kernel: ? rescuer_thread+0x258/0x258
Aug 12 22:06:16 Tower kernel: kthread+0x11c/0x124
Aug 12 22:06:16 Tower kernel: ? kthread_create_on_node+0x3a/0x3a
Aug 12 22:06:16 Tower kernel: ret_from_fork+0x25/0x30
Aug 12 22:06:16 Tower kernel: Code: 89 e5 5d c3 55 b8 05 00 00 00 48 89 e5 5d c3 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 4d 8d 7c 24 08 48 81 ec 68 01 00 00 <48> 8b 07 c7 85 80 fe ff ff 00 00 00 00 c7 85 88 fe ff ff 00 00
Aug 12 22:06:16 Tower kernel: RIP: isci_task_abort_task+0x1c/0x36b [isci] RSP: ffffc90004337c98
Aug 12 22:06:16 Tower kernel: CR2: 0000000000000000
Aug 12 22:06:16 Tower kernel: ---[ end trace 03433dd934a37e08 ]---

Any ideas? With the plugin removed, everything works fine.
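When reading a trace like the one above, the RIP line is usually the quickest pointer to where the kernel faulted. Here it names `isci_task_abort_task` in the `isci` module, which as far as I know is the kernel driver for the Intel onboard SAS controller. A minimal sketch for pulling that out of a log; the regex is modeled only on the lines in this post, and `faulting_symbol` is a made-up helper name.

```python
import re

# Matches both RIP formats seen in the log above:
#   "RIP: 0010:func+0x1c/0x36b [module]" and "RIP: func+0x1c/0x36b [module] RSP: ..."
RIP_RE = re.compile(
    r"RIP: (?:[0-9a-f]{4}:)?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)


def faulting_symbol(lines):
    """Return (function, module) from the first RIP line, or None if absent."""
    for line in lines:
        match = RIP_RE.search(line)
        if match:
            return match["func"], match["module"]
    return None


sample = [
    "Aug 12 22:06:16 Tower kernel: Oops: 0000 [#1] PREEMPT SMP",
    "Aug 12 22:06:16 Tower kernel: RIP: 0010:isci_task_abort_task+0x1c/0x36b [isci]",
]
print(faulting_symbol(sample))
```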

Link to comment
On 7/31/2017 at 10:48 PM, Helmonder said:

What, then, is the correct way to determine whether a Docker container is set up with a different IP address?

 

docker inspect <container-name>

Shows the IP address assigned to the container.

 

Or when creating/editing a container from the GUI, click on "Show deployed IP addresses" which opens a list of all IP addresses used by the different containers.
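Since `docker inspect` prints JSON, the address can also be extracted with a short script. This is a minimal sketch, assuming the usual layout where each network appears under `NetworkSettings.Networks` with an `IPAddress` field; the helper names and the sample data are made up for illustration.

```python
import json
import subprocess


def container_ips(inspect_output):
    """Map (container, network) to IP address from `docker inspect` JSON text."""
    ips = {}
    for container in json.loads(inspect_output):
        name = container.get("Name", "").lstrip("/")
        networks = container.get("NetworkSettings", {}).get("Networks") or {}
        for network, settings in networks.items():
            ips[(name, network)] = settings.get("IPAddress", "")
    return ips


def inspect_container(name):
    """Run `docker inspect <name>` and return its raw JSON (needs Docker installed)."""
    result = subprocess.run(
        ["docker", "inspect", name], capture_output=True, text=True, check=True
    )
    return result.stdout


# Made-up sample, trimmed to the fields used here.
sample = """
[{"Name": "/plex",
  "NetworkSettings": {"Networks": {"br0": {"IPAddress": "192.168.1.50"}}}}]
"""
print(container_ips(sample))
```

On a real server, `container_ips(inspect_container("plex"))` would do the same against the live container, with "plex" replaced by your own container name.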

 

Link to comment
26 minutes ago, TType85 said:

Dang.  I really liked the plugin :/

Put it this way: the only way I can see FCP being involved in the problem is when it issues an hdparm (standard Linux) command to figure out whether there's an HPA (Host Protected Area) on the drives. I'm not sure at the moment what can be done about it.

Link to comment
  • 2 weeks later...

Okay, I just had a power outage for about an hour. The UPS took over and shut down my server with 5 minutes of runtime left, just as it should. On reboot, however, FCP reports an unclean shutdown. Is this a known issue, or is there something wrong with my server? No parity check was initiated on reboot, and there is nothing else to suggest that the shutdown was unclean.

Edited by NeoDude
Link to comment
1 hour ago, NeoDude said:

Okay, I just had a power outage for about an hour. The UPS took over and shut down my server with 5 minutes of runtime left, just as it should. On reboot, however, FCP reports an unclean shutdown. Is this a known issue, or is there something wrong with my server? No parity check was initiated on reboot, and there is nothing else to suggest that the shutdown was unclean.

If there's no parity check happening, then something in my detection went wrong somewhere.  Just acknowledge the error.

Link to comment
