Anybody planning a Ryzen build?



35 minutes ago, eschultz said:

 

Try 6.4.0-rc7a (just released) and see if it's still an issue.

 

Wow, rc7 has been a rollercoaster so far...

 

After a brief heart attack thinking the upgrade killed the server (turns out the new 4.12 kernel finally includes the drivers for my primary NIC, so I had to move my network cable), I'm now running on rc7 with C-states enabled (fingers crossed).

 

LOL, didn't think the cache drive situation could get worse.  I was wrong.

 

Now I can't even make the cache drive assignment.  I can see the drive under UNASSIGNED DEVICES, and the pretty name is back:

Samsung_SSD_960_EVO_1TB_<longstringnottypingit> - 1 TB (nvme0n1)

 

I can see the same drive listed as a selectable option for the cache drive assignment.  But I've been trying to select it for 5 minutes, using 3 different browsers.  I simply can't select it, it always goes back to "no drive".

 

Seems like something got touched in rc7 related to this, since the behavior is definitely different.  But different bad, not different good.

 

Hmmm, more weirdness:  Just noticed that the disk.cfg file got updated with the cacheID of my new drive, but with the array running the drive is still showing as unassigned on the Main tab, and I have no cache drive.

 

Decided to reboot and see if that made a difference, and when I Stopped the array I noticed that now the drive is showing as the assigned cache drive.  Instead of rebooting, I Started the array again, and now I have a cache drive.

 

Tried to run Mover, and got the dreaded "root: mover: cache not enabled, exit" log entry.

 

Stopped the array and rebooted (forgot to grab diagnostics first, sorry), and for the first time in 4 months the cache drive booted up assigned!!!

 

Then tried Mover again, and it is still not working.

 

Attaching diagnostics showing Mover not working.

 

Paul

tower-diagnostics-20170727-2147.zip

Link to comment
56 minutes ago, Pauven said:

Tried to run Mover, and got the dreaded "root: mover: cache not enabled, exit" log entry.

 

Actually everything looks ok in your syslog, cache is mounted fine.  But looking in config/share.cfg there is supposed to be a line:

 

shareCacheEnabled="yes"

 

or

 

shareCacheEnabled="no"

 

This var is completely missing from your share.cfg file.  This should not be possible with a "stock" install.

 

I suggest you boot in 'safe mode' to eliminate possibility of one of your plugins being the culprit.

 

Meanwhile you can go to Settings/Global share settings and set that thing back to 'Yes'.
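
For anyone wanting to check their own share.cfg for this, here is a minimal Python sketch. It assumes the file consists only of KEY="value" lines plus # comments, as in the examples in this thread; the function names are made up for illustration and are not part of unRAID.

```python
import re

# Matches lines of the form KEY="value" (the layout assumed for share.cfg).
_SETTING = re.compile(r'^\s*([A-Za-z_][A-Za-z0-9_]*)="(.*)"\s*$')


def parse_share_cfg(text):
    """Return a dict of KEY -> value for every KEY="value" line in text."""
    settings = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comment lines such as "# Generated settings:"
        match = _SETTING.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def cache_setting(text):
    """Return the shareCacheEnabled value, or None if the line is missing."""
    return parse_share_cfg(text).get("shareCacheEnabled")


# Illustrative input only, not a real config file.
sample = '# Generated settings:\nshareSMBEnabled="yes"\n'
if cache_setting(sample) is None:
    print("shareCacheEnabled is missing")
```

To use it on a real file, read the contents of config/share.cfg from the flash drive and pass the text to cache_setting(); a result of None corresponds to the "completely missing" case described above.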

Link to comment
37 minutes ago, limetech said:

Also make sure you're running -rc7a (-rc7 had a cache-related bug though I don't see evidence of that in your syslog).

 

Thanks for chiming in Tom.  Yes, 7a.

 

39 minutes ago, limetech said:

 

Actually everything looks ok in your syslog, cache is mounted fine.  But looking in config/share.cfg there is supposed to be a line:

 

shareCacheEnabled="yes"

 

or

 

shareCacheEnabled="no"

 

This var is completely missing from your share.cfg file.  This should not be possible with a "stock" install.

 

I suggest you boot in 'safe mode' to eliminate possibility of one of your plugins being the culprit.

 

Meanwhile you can go to Settings/Global share settings and set that thing back to 'Yes'.

 

Okay, so I booted up in Safe Mode.  With array stopped, first I check the Settings/Global Shares and see that Cache Enabled is showing 'Yes'.  I don't touch anything here for the moment.

 

I then go start the array.  Whole GUI immediately becomes unresponsive, web pages returning "Server not found" errors.  Telnet is unresponsive too.  Can't find server by name or IP.  This is the second time server has become unresponsive tonight, once on rc6, and now again on rc7a, both times in safe mode.  Seems very odd that I have more problems in safe mode than in frivolous mode.  There is a chance the server has hung from the C-state being re-enabled on my Ryzen, not sure at this time and don't want to rule anything out yet.

 

One last thought:  Mover worked great until I upgraded from 6.3.latest to 6.4.0-rc6, at which point it appears to have stopped working.  Just thinking that might be a clue as to how the shareCacheEnabled var got whacked.

 

That's enough for one night, gonna hit the sack.

Link to comment

Okay, multiple findings.

 

First, when I checked on my server this morning, I found a Kernel Panic on the console screen, and the system was fully hung.  Here's a pic:

 

[Image: photo of the kernel panic on the console screen]

 

I restarted in Safe Mode again, started the array, and checked the share.cfg file.  shareCacheEnabled was still missing.

 

I stopped the array and went to the Settings/Global Shares panel.  I couldn't directly apply "Yes" to "Use cache disk:", as it was already on "Yes" and wouldn't let me Apply it.  I set it to "No", Applied, then set back to "Yes" and Applied.

 

Now the share.cfg file got updated with the shareCacheEnabled="yes" line, plus what appears to be several additional lines that must have also been missing.  Here's the new file contents:

 

# Generated settings:
shareDisk="e"
shareUser="e"
shareUserInclude=""
shareUserExclude=""
shareSMBEnabled="yes"
shareNFSEnabled="no"
shareNFSFsid="100"
shareAFPEnabled="no"
shareInitialOwner="Administrator"
shareInitialGroup="Domain Users"
shareCacheEnabled="yes"
shareCacheFloor="2000000"
shareMoverSchedule="40 3 * * *"
shareMoverLogging="yes"
fuse_remember="330"
fuse_directio="auto"
shareAvahiEnabled="yes"
shareAvahiSMBName="%h"
shareAvahiSMBModel="Xserve"
shareAvahiAFPName="%h-AFP"
shareAvahiAFPModel="Xserve"
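
As a side note on the shareMoverSchedule line: assuming it uses the standard five-field cron layout (minute, hour, day of month, month, day of week), "40 3 * * *" would mean 03:40 every day. A small sketch that decodes only that simple once-a-day case (the function name is hypothetical):

```python
def describe_daily_cron(expr):
    """Return 'HH:MM' for a five-field cron expression that runs once a day.

    Only the simple case is handled: numeric minute and hour, with '*' in
    the day-of-month, month, and day-of-week fields. Anything else raises
    ValueError rather than guessing.
    """
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected five cron fields")
    minute, hour, day_of_month, month, day_of_week = fields
    if (day_of_month, month, day_of_week) != ("*", "*", "*"):
        raise ValueError("not a plain once-a-day schedule")
    if not (minute.isdigit() and hour.isdigit()):
        raise ValueError("minute and hour must be plain numbers")
    m, h = int(minute), int(hour)
    if not (0 <= m <= 59 and 0 <= h <= 23):
        raise ValueError("minute or hour out of range")
    return f"{h:02d}:{m:02d}"


print(describe_daily_cron("40 3 * * *"))  # prints 03:40
```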

 

Expecting Mover to now work again, I restarted into normal mode, as I wanted my temperature and fan plugins to keep my drives cool while the Mover got busy.

 

On reboot, I confirmed that shareCacheEnabled="yes" was still in the share.cfg file.  I then manually started Mover.

 

This time the logged message was "root: mover: started", and I can see disk activity so it appears that Mover really is working.

 

So it appears that my Cache drive and Mover troubles are finally over - thank you Tom and all who helped.

 

That said, assuming Lime-Tech is already here reading this, I'd like to take a moment to recount my experiences with the -rc6/-rc7a releases:

  • Experienced 1 Kernel Panic while in Safe Mode on -rc7a (above), and possibly another in Safe Mode on -rc6 (speculation)
  • The upgrade from 6.3.latest to 6.4.0-rc6 coincided with whacking some cache related configuration file parameters (can't rule out plug-ins as a contributing factor)
  • Could not assign the cache drive under -rc6, though -rc7a fixed this
  • Several -rc7a anomalies (cache drive showing unassigned even though it was assigned, multiple Stop/Starts/Restarts required to get system synced up & behaving correctly)
  • Currently, Mover is working but only at about 36 MB/s peak.  Never paid attention before, because Mover is normally running in the middle of the night, but this seems rather slow.  Possibly because data is being written to a drive that is 96% full, so this may be nothing.
  • Odd caching in new GUI under -rc7a (didn't notice on -rc6), in which I sometimes have to Shift-F5/force a refresh to get current data presented:
      • An easy example is the UPS Summary on the Dashboard, which kept reporting 157 watts for 30 minutes after I spun the drives down.  I finally forced a screen refresh, and the status updated to 84 watts.
      • Another example (plugin related) is the Dynamix System Temps ticker at the bottom of the screen, which doesn't seem to be updating.  I've got both Firefox and IE open on the Main screen, and the ticker has been frozen on both for 10+ minutes, and they don't match each other.  If I click around the menus, sometimes the ticker updates, and sometimes it just disappears.  The behavior seems worse on IE than Firefox.
  • On the plus side, the new 4.12 kernel includes some drivers that were missing, so that's pretty nice.  It will take a while to determine if the C-state issue is resolved.

 

Thanks,

Paul

 

 

Link to comment

Paul, have you considered wiping your boot stick and starting again fresh?  

 

I had some weirdness a while back with stuff not updating, drives going missing, etc, and in the end I wiped my boot stick and started again, and most of the problems went away.  The original stick was one I'd been using probably from 6beta days, so who knows what crap was lingering in the background.  

 

Also, with your machine doing all those lockups and KPs, it seems to me there might be some corruption creeping in.

 

 

Link to comment
13 minutes ago, HellDiverUK said:

Paul, have you considered wiping your boot stick and starting again fresh?  

 

I had some weirdness a while back with stuff not updating, drives going missing, etc, and in the end I wiped my boot stick and started again, and most of the problems went away.  The original stick was one I'd been using probably from 6beta days, so who knows what crap was lingering in the background.  

 

Also, with your machine doing all those lockups and KPs, it seems to me there might be some corruption creeping in.

 

 

 

That's not a bad idea.  I've been using the same USB stick for 8+ years, since the beginning when I started with 4.5 beta4 (with its brand new 20-disk limit).  How's that for a flashback.

 

Though I've certainly wiped it on occasion over the years.  Most recently I think for the 6.1 branch.

 

Now that I've finally got things settled, I'm gonna let it chill as-is.  If more problems crop up, this will be high on my trouble-shooting list.

 

As far as dealing with the potential corruption, I might just have to start from scratch and rebuild my configuration if I wipe the drive.  Otherwise, I'm simply restoring potentially corrupted files.

 

Thanks.

 

Paul

Link to comment
2 hours ago, Pauven said:

 

That's not a bad idea.  I've been using the same USB stick for 8+ years, since the beginning when I started with 4.5 beta4 (with its brand new 20-disk limit).  How's that for a flashback.

 

 

I think you'd be best getting a new stick and transferring your license.  8 years is a long time for a USB stick.

 

I'm using a Kingston DataTraveler 3.0 16GB, which is USB3 and really quite quick; unRAID boots much faster than it did on my old SanDisk Fit.  The Kingston was pretty cheap too.

Link to comment
3 hours ago, Greygoose said:

Pauven, I can not offer any assistance. Except to say thank you for continuing to get ryzen rolling sweet with Unraid. 

 

 

 

Thanks Greygoose!

 

I just reached 50+ hours uptime on 6.4.0-rc7a with C-states enabled.  Looks like Lime-Tech may have solved the stability issue, good job guys!

 

With C-states enabled, idle wattage has dropped 10+ watts.  My UPS only reports in 10.5 W increments (which is 1% of the 1050 W power rating), so actual savings are likely somewhere between 10.5 W and 21 W.  From earlier testing with a more accurate Kill-A-Watt, the actual delta between C-states enabled & disabled was between 12 W and 18 W.
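
The arithmetic behind those UPS numbers can be sketched as below. The one assumption is that each UPS reading is the true load snapped to a grid of 1% of the rated power (whether truncated or rounded); the function names are hypothetical. Under that assumption, a one-step drop on the display only brackets the true saving between 0 and 21 W, which is a little wider than the range quoted above but still consistent with the 12-18 W Kill-A-Watt measurement.

```python
def reporting_step(rated_watts, percent_resolution=1):
    """Size in watts of one display step, e.g. 1% of a 1050 W rating."""
    return rated_watts * percent_resolution / 100


def true_drop_bounds(rated_watts, steps_dropped, percent_resolution=1):
    """Open interval (low, high) in watts for the real drop in load.

    Assumes both the before and after readings are the true load snapped
    to the same grid, so each reading can be off by up to one step.
    """
    step = reporting_step(rated_watts, percent_resolution)
    low = max(0.0, (steps_dropped - 1) * step)
    high = (steps_dropped + 1) * step
    return low, high


print(reporting_step(1050))       # 10.5
print(true_drop_bounds(1050, 1))  # (0.0, 21.0)
```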

 

Idle temps have dropped 2-3 degrees C on both CPU (41C) and System (36C).  Not as much as I had hoped, but I think my expectations were off.  I did a lot of initial testing with the case cover off, and temps have unsurprisingly increased simply from closing the case, as case fans are on lowest speed (three 120mm fans, 1000 RPM @ 35% PWM), and they have to suck air past the HD's, so very little airflow at idle.  The CPU fan speed profile is set to 'Standard' in the BIOS.

 

At max case fan speeds (2750 RPM), idle CPU temp easily drops to 35C and System to 30C, but the higher fan speeds consume an extra 10+ watts and make lots of noise.

 

As a compromise, I just changed my minimum case fan speed to 1400 RPM @ 50% PWM, which is much quieter and more energy efficient than full blast, but still improves my idle temps a couple degrees over the slowest fan speeds:  39C CPU, 34C System.  I'll probably change the CPU fan profile from Standard to Performance in the BIOS to see if that drops the 5C delta over ambient a bit, but other than that I think I'm done.

 

I'm happy to have idle temps back in the 30's, at reasonable fan speeds/noise, and with idle watts back to a more reasonable level.

 

Paul

Link to comment

Just came here to share my positive Ryzen experience. Running a 1700X and an Asus Prime X370-Pro mobo, I was able to get things up and running stable back in April. The key for me was disabling EPU, which is what my motherboard calls the power-saving functions. If you're running an ASRock, it's apparently called C-State Control.

I JUST got some temp monitoring by manually enabling the it87 kernel driver, and using force_id=0x8628. Read more about it here!
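
Once a sensor driver such as it87 is loaded (for example with "modprobe it87 force_id=0x8628", per the post above), its readings normally show up under the kernel's hwmon sysfs tree. The sketch below lists them; it assumes the standard layout where each hwmon device has a "name" file and "temp*_input" files in millidegrees Celsius. The function name is made up, and which chip name and channels appear on a given board is not something this sketch can promise.

```python
from pathlib import Path


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return a list of (device, chip_name, {channel: degrees_C}) tuples."""
    base = Path(root)
    if not base.is_dir():
        return []  # no hwmon tree here (e.g. not Linux)
    results = []
    for dev in sorted(base.glob("hwmon*")):
        try:
            chip = (dev / "name").read_text().strip()
        except OSError:
            continue  # device without a readable name file
        temps = {}
        for temp_file in sorted(dev.glob("temp*_input")):
            try:
                millidegrees = int(temp_file.read_text().strip())
            except (OSError, ValueError):
                continue  # channel not readable right now
            channel = temp_file.name[: -len("_input")]
            temps[channel] = millidegrees / 1000.0
        results.append((dev.name, chip, temps))
    return results


# Example use on the server itself:
# for device, chip, temps in read_hwmon_temps():
#     print(device, chip, temps)
```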

Link to comment
4 hours ago, willzone1 said:

Just came here to share my positive Ryzen experience. Running a 1700X and an Asus Prime X370-Pro mobo, I was able to get things up and running stable back in April. The key for me was disabling EPU, which is what my motherboard calls the power-saving functions. If you're running an ASRock, it's apparently called C-State Control.

I JUST got some temp monitoring by manually enabling the it87 kernel driver, and using force_id=0x8628. Read more about it here!

 

I also use the same MB, but I previously ran with EPU on and C-states off. This MB also has a C-State control.

Anyway, both are on now and stable.

Link to comment
1 minute ago, methanoid said:

Ryzen owners, how are your VMs running? I saw Level1Techs had some issues with VMs and passthru on Ryzen and I'd like some feedback before I drop a pile of cash on a 16 core Threadripper setup to run multiple VMs with passthrough

Wait until it's confirmed that the NPT issue that's plaguing Ryzen owners is not present in Threadripper. I have no idea if it's being looked at tbh, and like you I'm very keen on TR.

Link to comment
1 hour ago, mikeyosm said:

Wait until it's confirmed that the NPT issue that's plaguing Ryzen owners is not present in Threadripper. I have no idea if it's being looked at tbh, and like you I'm very keen on TR.

 

Probably just as well to wait... I've got a 14C Xeon so it's not like I NEED TR.. and the Mrs would probably have a seizure if she saw the cost :(

Link to comment
3 hours ago, methanoid said:

Ryzen owners, how are your VMs running? I saw Level1Techs had some issues with VMs and passthru on Ryzen and I'd like some feedback before I drop a pile of cash on a 16 core Threadripper setup to run multiple VMs with passthrough

 

I kinda feel like this has been answered in this thread many times already.

Link to comment
On 4.8.2017 at 2:21 PM, Tuftuf said:

 

I kinda feel like this has been answered in this thread many times already.

Our feelings may deceive us here. 

I'll admit I doubt that anybody here has decent, Intel-typical, near-bare-metal VM performance with hardware passthrough (especially when the performance metric includes GPU and CPU) on their Ryzen build.

Edited by unrateable
Link to comment

 

2 minutes ago, unrateable said:

Our feelings may deceive us here. 

I'll admit I doubt that anybody here has decent, Intel-typical, near-bare-metal VM performance with hardware passthrough (especially when the performance metric includes GPU and CPU) on their Ryzen build.

 

Which would indicate you have not read the thread since I have already made clear that performance is not great, and your comment seems to assume I'm suggesting the performance is near native.

Link to comment
17 hours ago, Tuftuf said:

 

 

Which would indicate you have not read the thread since I have already made clear that performance is not great, and your comment seems to assume I'm suggesting the performance is near native.

I just misinterpreted what you meant to say. My bad.

Edited by unrateable
Link to comment

I just had a mover issue and did a search for the error and found this thread. I just bought a new AMD system, mobo, and ram yesterday. Today is day 2 with the trial! :)

 

Aug 6 13:14:22 Tower root: Starting libvirtd...
Aug 6 13:14:22 Tower kernel: kvm: disabled by bios
Aug 6 13:14:22 Tower root: modprobe: ERROR: could not insert 'kvm_amd': Operation not supported
Aug 6 13:14:22 Tower kernel: tun: Universal TUN/TAP device driver, 1.6
Aug 6 13:14:22 Tower emhttpd: nothing to sync
Aug 6 13:14:22 Tower emhttpd: 
Aug 6 13:14:22 Tower kernel: virbr0: port 1(virbr0-nic) entered blocking state
Aug 6 13:14:22 Tower kernel: virbr0: port 1(virbr0-nic) entered disabled state
Aug 6 13:14:22 Tower kernel: device virbr0-nic entered promiscuous mode
Aug 6 13:14:22 Tower dhcpcd[1758]: virbr0: new hardware address: ea:aa:44:db:8d:60
Aug 6 13:14:23 Tower avahi-daemon[3829]: Joining mDNS multicast group on interface virbr0.IPv4 with address 192.168.122.1.
Aug 6 13:14:23 Tower avahi-daemon[3829]: New relevant interface virbr0.IPv4 for mDNS.
Aug 6 13:14:23 Tower avahi-daemon[3829]: Registering new address record for 192.168.122.1 on virbr0.IPv4.
Aug 6 13:14:23 Tower kernel: virbr0: port 1(virbr0-nic) entered blocking state
Aug 6 13:14:23 Tower kernel: virbr0: port 1(virbr0-nic) entered listening state
Aug 6 13:14:23 Tower dnsmasq[30308]: started, version 2.77 cachesize 150
Aug 6 13:14:23 Tower dnsmasq[30308]: compile time options: IPv6 GNU-getopt no-DBus i18n IDN2 DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
Aug 6 13:14:23 Tower dnsmasq-dhcp[30308]: DHCP, IP range 192.168.122.2 -- 192.168.122.254, lease time 1h
Aug 6 13:14:23 Tower dnsmasq-dhcp[30308]: DHCP, sockets bound exclusively to interface virbr0
Aug 6 13:14:23 Tower dnsmasq[30308]: reading /etc/resolv.conf
Aug 6 13:14:23 Tower dnsmasq[30308]: using nameserver 192.168.1.1#53
Aug 6 13:14:23 Tower dnsmasq[30308]: read /etc/hosts - 2 addresses
Aug 6 13:14:23 Tower dnsmasq[30308]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Aug 6 13:14:23 Tower dnsmasq-dhcp[30308]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Aug 6 13:14:23 Tower kernel: virbr0: port 1(virbr0-nic) entered disabled state
Aug 6 13:14:48 Tower emhttpd: req (66): cmdStartMover=Move+now&csrf_token=62192835A19A6461
Aug 6 13:14:48 Tower emhttpd: shcmd (1065): /usr/local/sbin/mover |& logger &
Aug 6 13:14:48 Tower emhttpd: 
Aug 6 13:14:48 Tower root: mover: started

 

When the mover stopped, I shut down the array, changed the cache settings and then restarted. Seems like it got the mover to work again. I downloaded my logs before and couldn't find exactly where the cache setting was to verify if it was missing.

 

Thanks for your detailed replies Pauven

 

 

EDIT: Just an interesting side note. Before I did the reset of the cache settings I could not see the VMs under the VMs tab, I just saw an error about hardware not supporting it. Now that I did the reset, I can see the list again.
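
For anyone else searching their logs for the same symptoms, here is a rough Python sketch that flags the messages quoted in this thread. The marker strings are copied from the log lines posted here; the short explanations attached to them are my own reading, and the function name is hypothetical.

```python
# Substrings copied from syslog lines quoted in this thread, each paired
# with a plain-language reading (the readings are an interpretation).
MARKERS = {
    "kvm: disabled by bios": "virtualization appears to be switched off in the BIOS",
    "could not insert 'kvm_amd'": "the kvm_amd module failed to load",
    "mover: cache not enabled": "mover exited because the cache is not enabled",
}


def scan_syslog(lines):
    """Return (meaning, line) pairs for every line containing a marker."""
    findings = []
    for line in lines:
        for marker, meaning in MARKERS.items():
            if marker in line:
                findings.append((meaning, line.rstrip("\n")))
    return findings


sample = [
    "Aug 6 13:14:22 Tower kernel: kvm: disabled by bios\n",
    "Aug 6 13:14:48 Tower root: mover: started\n",
]
for meaning, line in scan_syslog(sample):
    print(f"{meaning}: {line}")
```

To run it against a real log, open the syslog from a diagnostics zip and pass the file object (or its lines) to scan_syslog().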

Edited by RonUSMC
Link to comment

Just switched from an i7-4790 to a Ryzen 7 1700. At the moment it runs fine; it's only consuming 20+ watts more than the i7!

 

Found this message in the "Fix Common Problems" plugin:

 
 
CPU possibly will not throttle down frequency at idle
Your CPU is running constantly at 100% and will not throttle down when it's idle (to save heat / power). This is because there is currently no CPU Scaling Driver Installed. Seek assistance on the unRaid forums with this issue
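
One way to see whether a scaling driver is loaded is to look at the cpufreq entry in sysfs. The sketch below reads /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver, which is the standard Linux location; whether the plugin performs the same check is an assumption on my part, and the function name is made up.

```python
from pathlib import Path


def scaling_driver(cpu=0, root="/sys/devices/system/cpu"):
    """Return the cpufreq scaling driver name for a CPU, or None if absent."""
    path = Path(root) / f"cpu{cpu}" / "cpufreq" / "scaling_driver"
    try:
        name = path.read_text().strip()
    except OSError:
        return None  # no cpufreq directory usually means no scaling driver
    return name or None


driver = scaling_driver()
if driver is None:
    print("No CPU scaling driver found for cpu0")
else:
    print(f"cpu0 scaling driver: {driver}")
```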

 

 

I am running version 6.4.0-rc7a with the latest bios update.

 

 

Link to comment
5 minutes ago, eschultz said:

Just purchased an AMD Threadripper 1950X CPU and MSI X399 Gaming Pro Carbon motherboard.  Parts will arrive mid next week and will hopefully get some unRAID testing in by that weekend.

 

Cool. Let us know your 3DMark bench scores, particularly the Fire Strike physics and combined scores, since those are severely impacted by the NPT issue on Ryzen. Where did you purchase from, if you don't mind me asking?

Link to comment
