UnRAID on VMWare ESXi with Raw Device Mapping



Didn't nojstevens get it working out of the box with his Supermicro board (X8ST3-F), with the LSI card?

 

The different workstreams are hard to follow in this thread.

Here's what I understood so far:

 

What user SK is addressing with his fix is having unRAID properly recognise the virtual disks that are passed to it via RDM from ESXi. In that scenario the controller and disks are physically managed by ESXi, and unRAID accesses them through the virtualized drivers.

 

What user nojstevens did was use ESXi's PCIe passthrough mode for the controller, which gives unRAID direct access to that piece of hardware and lets it handle the real disks.
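
For readers new to RDM: a physical (pass-through) RDM is just a mapping file created on the ESXi host against the raw disk and then attached to the unRAID VM, typically on the virtual LSI SAS (or paravirtual) controller. A rough sketch from the ESXi console - the device ID and datastore path here are placeholders, not from this thread:

# Create a physical-compatibility RDM mapping file for a local disk
# (replace the vml/naa device ID and the datastore path with your own)
vmkfstools -z /vmfs/devices/disks/vml.0100000000XXXXXXXXXX \
           /vmfs/volumes/datastore1/unraid/disk1-rdm.vmdk
# The resulting disk1-rdm.vmdk is then added to the unRAID VM as an
# existing disk on its virtual SCSI controller.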

Link to comment

Getting some errors in the syslog. (Attaching full syslog)

 

Nov 23 11:47:07 Tower ata_id[2061]: HDIO_GET_IDENTITY failed for '/dev/block/8:16'
Nov 23 11:47:07 Tower kernel: sd 1:0:0:0: [sdb] Device not ready
Nov 23 11:47:07 Tower kernel: sd 1:0:0:0: [sdb] Result: hostbyte=0x00 driverbyte=0x08
Nov 23 11:47:07 Tower kernel: sd 1:0:0:0: [sdb] Sense Key : 0x2 [current]
Nov 23 11:47:07 Tower kernel: sd 1:0:0:0: [sdb] ASC=0x4 ASCQ=0x2
Nov 23 11:47:07 Tower kernel: sd 1:0:0:0: [sdb] CDB: cdb[0]=0x28: 28 00 00 00 00 00 00 00 20 00
Nov 23 11:47:07 Tower kernel: end_request: I/O error, dev sdb, sector 0
Nov 23 11:47:07 Tower kernel: Buffer I/O error on device sdb, logical block 0
Nov 23 11:47:07 Tower kernel: Buffer I/O error on device sdb, logical block 1
Nov 23 11:47:07 Tower kernel: Buffer I/O error on device sdb, logical block 2
Nov 23 11:47:07 Tower kernel: Buffer I/O error on device sdb, logical block 3
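
(For reference: sense key 0x2 with ASC/ASCQ 0x04/0x02 is the standard SCSI "logical unit not ready, initializing command (start unit) required" condition - the drive is reporting that it needs an explicit start before it will accept media-access commands.)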

 

Could this be a problem with the drive itself?

 

Some information on the test system:

Asus P5LD2-VM

Intel E6600

2GB DDR2

IBM ServeRAID BR10i PCIe x8 SAS HBA (LSI SAS3082E-R card with LSI 1068E chipset).

120GB WD IDE

40GB Toshiba IDE

60GB Seagate SATA (connected to the Br10i)

 

After commanding a spin down I don't get the flashing green dot for the SATA drive. However, when I physically check the drive, it does seem to have stopped rotating.

 

I've moved the SATA drive between different ports on the Br10i card; each time, after a reboot, I reassigned it back to its slot and the array started up fine.

syslog-2010-11-23.zip

Link to comment


 

It seems there are several usage profiles where the patch I made may help with issues people are experiencing:

1. running unRAID under VMware ESX/ESXi with SATA drives presented as physical RDMs to the unRAID VM through the virtual LSI SAS SCSI (or paravirtualized) controller

2. running unRAID under VMware ESX/ESXi with a PCI disk controller (LSI cards?) in passthrough mode

3. running unRAID on a physical server with LSI cards that unRAID has issues with

While I'm really focused on #1, since that is my home configuration, it's good to know that the #2 and #3 issues may get resolved as well.

 

 

 

Link to comment


 

I finally got a chance to look at the standards specs - the "device not ready" error is due to the drive spin-down command previously issued by unRAID/emhttp. That command actually puts the drive into the stopped state, where it refuses media-access commands (such as reads) until it is started again. I need to change it to use the standby mode instead, so that media access will spin the drive up automatically - I'll make the change in a day or two.
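
(For illustration, assuming sg3_utils is installed, the two behaviours can be requested by hand like this - /dev/sdX is a placeholder:)

# Stopped state: media-access commands are rejected with "not ready,
# initializing command required" until an explicit start is issued
sg_start -v --stop /dev/sdX

# Standby power condition: the drive spins down but spins back up
# automatically on the next media access
sg_start -v --stop --pc 3 /dev/sdX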

 

Interestingly enough, I don't see this error under VMware - perhaps the virtual layer masks it from the physical one.

 

 

 

 

Link to comment

I have this ASUS P5N7A-VM motherboard lying around and wondered whether it would work with VMware ESXi?

 

 

OK, I was able to install ESXi 4.1 onto a 2 GB USB drive; it boots just fine, I can access it from the vSphere Client, and I can also SSH into it.

I do have a problem, though: ESXi does not seem to see any of the SATA drives I have hooked up to the motherboard, even though the motherboard BIOS shows the drives at boot. I had planned on getting the ESXi server working and then later attempting to install unRAID in a VM to test with.

 

 

Link to comment

Sounds like your mobo SATA controller isn't supported. Here's a good place to start: http://www.vm-help.com/esx40i/esx40_whitebox_HCL.php

 

 


 

 

Link to comment

FYI - I was able to use the patched ISO with RDM drives via the LSI SAS controller so that unRAID can access the drives. Performance is lacking, but I'm still tinkering.

 

Hope this gets officially supported. I think more and more people will go this route.

 

SK you rock!

 

The great thing about going the ESXi route is the choice to isolate the unRAID VM to do just storage and have other VMs (with other OSs/Linux distros beyond the rather limited Slackware) doing other things. I have 4 to 5 VMs running all the time on a single box with 4GB of memory and may add a couple more if needed (surprisingly, I have not run out of memory yet). IMHO, having the needed services running on usable Linux distros such as Ubuntu/CentOS is priceless :) Also, I have not experienced significant performance issues to date with my setup.

 

It would be good to have official support for sure, but I doubt it will come soon, as it requires quite a few changes in code - in both the driver and the closed management piece.

 

What I recently found in my ESXi configuration (with physical RDMs) is that controlling drive power (switching to standby and back to active mode) does not work - the SATL (SCSI-to-ATA translation layer) does not implement the specific ATA pass-through used by hdparm, nor SCSI START/STOP with a power condition. I believe this is a general VMware issue (or feature?). The question is whether this works on a physical box with a real LSI controller - if so, it may be worth implementing in the driver. So if someone with a physical box/LSI card can install sg3_utils, run the following, and post a reply, it would be really helpful.

 

# sg_start -v --stop --pc 3 /dev/sdN

# sg_start -v --start --pc 1 /dev/sdN

 

What does work in ESXi is the SCSI START/STOP command without a power condition (like the following commands), and VMware does a good job of spinning the drive up when access is required (which, by the way, differs from what the SAT-2 standard dictates).

 

# sg_start -v --stop /dev/sdN

# sg_start -v --start /dev/sdN

 

So I need to figure out what logic to put into the spin-up/spin-down piece of the driver to handle both the ESXi and real hardware cases (if this is doable, of course).

 

I also had some progress with VMware Tools, which has an installer that is quite incompatible with the unRAID Slackware distro; with a workaround I have it installed in my Slackware dev VM, and now I need to get it working in the production unRAID VM.

 

Answering bcbgboy13/jamerson9 - I know the LSI 1068E is not that expensive, but I really don't feel the need for it in my little home config any time soon.

 

 

 

 

 

 

 

 

Link to comment

I ran the following commands.

 

# sg_start -v --stop --pc 3 /dev/sdN

# sg_start -v --start --pc 1 /dev/sdN

root@Tower:/boot/boot/sg3# sg_start -v --stop --pc 3 /dev/sdb
    Start stop unit command: 1b 00 00 00 30 00
root@Tower:/boot/boot/sg3# sg_start -v --start --pc 1 /dev/sdb
    Start stop unit command: 1b 00 00 00 10 00
root@Tower:/boot/boot/sg3#

 

# sg_start -v --stop /dev/sdN

# sg_start -v --start /dev/sdN

root@Tower:/boot/boot/sg3# sg_start -v --stop /dev/sdb
    Start stop unit command: 1b 00 00 00 00 00
root@Tower:/boot/boot/sg3# sg_start -v --start /dev/sdb
    Start stop unit command: 1b 00 00 00 01 00
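
(For reference, decoding byte 4 of the START STOP UNIT CDBs above: 0x30 = POWER CONDITION 3 (standby), 0x10 = POWER CONDITION 1 (active), 0x00 = stop and 0x01 = start with no power condition.)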

 

Again thanks for all your work.

Link to comment

It doesn't seem the ISO includes the source of the patches you made. The /usr/src/linux/drivers/md files seem identical to stock unRAID 4.5.6. Did I overlook them somewhere?

 

Can you please provide them? This way those of us interested in them can patch them into the 5.0 beta 2 on our test box(es), and you can be compliant with the GPL.

 

Thanks!

 

If you do happen to patch this into the 5.0bX's, please post it, as I would like to give it a shot with the latest beta.

 

Also SK, if you could set this up with the newest 4.6, it'd be great..

 

Link to comment

It would be good to have official support for sure, but I doubt it will come soon, as it requires quite a few changes in code - in both the driver and the closed management piece.

 

 

 

That's troubling. I'm running the free version and would pay for the full if it supported ESXi.

 

As for VMware not emulating all the driver commands, I've seen them do this before: implement what's mostly needed but not the finer details.

 

Link to comment

Are you sure this is how the pass-through works? I was under the impression that you could assign specific devices to specific VMs no matter what bus the devices were on. For example, on my system I'd like to pass two HVR-2250s to a Win7 VM, and possibly one or two LSI 8-port RAID cards over to an unRAID VM, with all four of those cards being PCIe. Or does that bus-sharing limitation only apply to regular PCI devices? (As far as I know right now, I'll have nothing else other than a PCI video card, which won't really be shared.)

 

I was originally going to go with a Slackware + unRAID + SageTV Linux install, non-VM, but found out my tuner cards currently only work with digital signals, meaning I'd need at least two more tuner cards for the analog channels. So I went back to my idea of running it all in ESXi VMs, but if you can't share another device on the same bus with a separate VM, it kind of defeats the purpose.

 

 

I'm a late arrival to this thread, but it has definitely piqued my interest, as I have been considering these same integration issues myself (storage, DVR, security cameras, ESXi, etc.). I've been assuming that I would have to run a minimum of two boxes, one for ESXi and one for storage, but I would prefer to build a single box to handle it all.

 

heffe, have you considered an HDHomeRun? I've got the original one (which is now called the Dual), although the Tech one looks like it could be a fun toy. The HDHR doesn't do analog, although I thought the 2250 did... Anyway, I attached my HDHR to the antenna in my attic, hooked a network cable to it, and haven't thought about it since; it works great.

Link to comment

I had an HDHR once upon a time (I probably got it within the first month that it supported SageTV), but when I changed over to DirecTV I sold the box. I've since switched BACK to cable, and I'm using two HVR-2250 dual-tuner cards (they'll do 4 analog or 4 digital, or any combination of those between the two cards), with no cable boxes (cable injected directly into the cards). I don't plan on going strictly digital (my house gets pretty much zero OTA HD; I'm on the wrong side of a hill to hit the towers), so I'd be limited to about 8 or 9 channels tops using strictly the HDHR.

 

Once I can get a stable unRAID running under ESXi, I plan on trying to install Win7 with the two tuners on PCI passthrough, and hopefully get everything running on this one box. If not, I'll try to go minimal on the unRAID box (going from an 11-drive system down to a 5-drive system with a low-power CPU).

 

Right now, I'm running 2 very power-hungry boxes, I really should check them both with my Kill-a-watt meter just to see how bad it is, lol.

 

Link to comment


 

Thanks, jamerson9, for the quick response.

 

It seems the LSI does support the power condition (and doesn't complain the way VMware does). I have almost finished the driver modifications, which should work both in the virtual ESXi environment and with a physical LSI (or pass-through) setup.

 

The logic is simple - try unRAID's original ATA spin-down/spin-up commands; if those are not supported, try SCSI START/STOP with the power condition modifier (for happy LSI owners); and if that's not supported either, do it without the power condition modifier (for happy ESXi users).
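
In other words (a rough sketch of that fallback order expressed with ordinary userspace tools, not SK's actual md-driver code - the device name is a placeholder):

#!/bin/sh
DEV=/dev/sdX
# 1) original unRAID approach: ATA STANDBY IMMEDIATE (spin down) via hdparm
hdparm -y "$DEV" ||
# 2) if the ATA command is rejected: SCSI START STOP UNIT with the standby
#    power condition (accepted by a real or passed-through LSI controller)
sg_start --stop --pc 3 "$DEV" ||
# 3) last resort: plain SCSI stop with no power condition (what the ESXi
#    virtual controller accepts; ESXi spins the drive back up on access)
sg_start --stop "$DEV"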

 

Link to comment


 

 

Well, as soon as there is a final version of 4.6, which seems to be very soon, I will patch it.

For 5.0bX I need to check how emhttp has been modified, to make sure there are no issues.

 

 

 

 

Link to comment

 

An update (with better handling of drive spin-up/spin-down) can be found at http://www.mediafire.com/?2710vppr8ne43

 

Getting disk errors with this version.

Disk1 is the SATA (sdc) drive on the LSI controller.

Dec 1 14:27:14 Tower kernel: sd 1:0:1:0: [sdc] Device not ready
Dec 1 14:27:14 Tower kernel: sd 1:0:1:0: [sdc] Result: hostbyte=0x00 driverbyte=0x08
Dec 1 14:27:14 Tower kernel: sd 1:0:1:0: [sdc] Sense Key : 0x2 [current]
Dec 1 14:27:14 Tower kernel: sd 1:0:1:0: [sdc] ASC=0x4 ASCQ=0x2
Dec 1 14:27:14 Tower kernel: sd 1:0:1:0: [sdc] CDB: cdb[0]=0x28: 28 00 00 00 00 4f 00 00 08 00
Dec 1 14:27:14 Tower kernel: end_request: I/O error, dev sdc, sector 79
Dec 1 14:27:14 Tower kernel: md: disk1 read error
Dec 1 14:27:14 Tower kernel: handle_stripe read error: 16/1, count: 1
Dec 1 14:27:14 Tower kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Dec 1 14:27:14 Tower kernel: REISERFS (device md2): found reiserfs format "3.6" with standard journal
Dec 1 14:27:14 Tower kernel: REISERFS (device md2): using ordered data mode
Dec 1 14:27:14 Tower kernel: sd 1:0:1:0: [sdc] Device not ready
Dec 1 14:27:14 Tower kernel: sd 1:0:1:0: [sdc] Result: hostbyte=0x00 driverbyte=0x08
Dec 1 14:27:14 Tower kernel: sd 1:0:1:0: [sdc] Sense Key : 0x2 [current]
Dec 1 14:27:14 Tower kernel: sd 1:0:1:0: [sdc] ASC=0x4 ASCQ=0x2
Dec 1 14:27:14 Tower kernel: sd 1:0:1:0: [sdc] CDB: cdb[0]=0x2a: 2a 00 00 00 00 4f 00 00 08 00
Dec 1 14:27:14 Tower kernel: end_request: I/O error, dev sdc, sector 79
Dec 1 14:27:14 Tower kernel: md: disk1 write error
Dec 1 14:27:14 Tower kernel: handle_stripe write error: 16/1, count: 1
Dec 1 14:27:14 Tower kernel: sd 1:0:1:0: [sdc] Device not ready
Dec 1 14:27:14 Tower kernel: sd 1:0:1:0: [sdc] Result: hostbyte=0x00 driverbyte=0x08
Dec 1 14:27:14 Tower kernel: sd 1:0:1:0: [sdc] Sense Key : 0x2 [current]
Dec 1 14:27:14 Tower kernel: sd 1:0:1:0: [sdc] ASC=0x4 ASCQ=0x2
Dec 1 14:27:14 Tower kernel: sd 1:0:1:0: [sdc] CDB: cdb[0]=0x28: 28 00 00 00 00 bf 00 00 08 00
Dec 1 14:27:14 Tower kernel: end_request: I/O error, dev sdc, sector 191
Dec 1 14:27:14 Tower kernel: md: disk1 read error
Dec 1 14:27:14 Tower kernel: md: recovery thread woken up ...
Dec 1 14:27:14 Tower kernel: handle_stripe read error: 128/1, count: 1

 

Went back to the previous version and the errors went away.

Attaching full syslog.

syslog-2010-12-0.1.02.zip

Link to comment

 


 

jamerson9 - check your private messages (for a debug version that will let me understand the issue on a physical box with the LSI card).

 

EDIT: Under ESXi with physical RDM drives I have no such problems..

 

 

Link to comment

SK, you're currently using the ESXi virtual LSI controller with RDM (as the topic of this post says), correct? I'm guessing I could always use that 'soft' controller and not do PCIe passthrough for the LSI card I'm actually using right now, especially if spin-up/spin-down works correctly through the VM with it.

 

I've really got to get a decent NIC for my ESXi box; the Realtek 8111C that's on the mobo now drops its connection frequently under ESXi, and I'm thinking at this point that it's doing the same thing under unRAID without a VM as well (I can't seem to get it to complete an rsync from my old box; the new machine hangs at a random point in the transfer). If RDM works correctly with this version, that may be the last piece of the puzzle for me to actually get started on my migration.

 

Link to comment

My ESXi box uses a motherboard with a Realtek NIC. It started losing connectivity. I eventually discovered I had a wifi router that was accidentally set to provide DHCP, and since I turned that off my ESXi box hasn't had any connectivity issues. What was happening was that the ESXi Realtek NIC was intermittently getting a different IP from my main router and the wifi router.

 

Now I have four NICs in my server: the onboard Realtek, a single-port PCI Intel Pro/1000 MT, and a dual-port PCI-X Intel Pro/1000 MT in a PCI slot.

 

I bought two of the dual adapters for $20 apiece. They look new, in sealed anti-static bags. One of my 32-bit PCI slots has space behind it (no heatsinks in the way) for the PCI-X card.

 

I paid $14 for the single-port PCI Intel Pro/1000 MT, and I think it was new at that price.

 

Link to comment
