To Cache drive or not to Cache drive?



I have a 500 GB 7200 RPM IDE drive sitting around that seems like it might be useful as a cache drive. I am running over Gigabit Ethernet; would I see a performance hit writing to the server using an IDE drive, or is the drive still likely to be faster than the network?

Link to comment

Jason & Others;

 

unRAID will automatically look after your files as they go to the cache disk and then get transferred to the array disks. You set each share to use or not use the cache disk. You set the schedule. Then, data copied to the shares will first go to the cache and be moved to the array disks when the mover runs. You do not have to manually copy to the cache disk, and you do not have to manually handle the transfers from the cache disk to the array disks. If the cache disk is full, then data goes directly to the array until the mover runs and empties it again.
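
Roughly, the write path works like this (a minimal Python sketch of the behaviour described above, not unRAID's actual code; the mount points, share name, and free-space check are placeholders):

```python
# Sketch only: illustrates "write to cache first, fall back to the array when
# the cache is full, move later on a schedule". Not unRAID's real implementation.
import shutil
from pathlib import Path

CACHE = Path("/mnt/cache")          # hypothetical mount points
ARRAY = Path("/mnt/disk1")

def write_target(share: str, filename: str, size_bytes: int) -> Path:
    """Pick where an incoming file for a cache-enabled share should land."""
    free = shutil.disk_usage(CACHE).free
    if free > size_bytes:            # cache has room -> it goes there first
        return CACHE / share / filename
    return ARRAY / share / filename  # cache full -> straight to the array

def run_mover(share: str) -> None:
    """Later, on the schedule you set, cached files get moved to an array disk."""
    src = CACHE / share
    for f in src.rglob("*"):
        if f.is_file():
            dest = ARRAY / share / f.relative_to(src)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(f), str(dest))
```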

 

Too many people are over-thinking the use of the cache disk and trying to manually manage it. You shouldn't even have to pay any attention to it or look at the files on it.

 

Peter

 

Link to comment

I'm using the cache disk only for Transmission.

If I want to put files on the server, I always choose disk shares to copy straight to the selected drive.

Using the cache was the easiest option for a drive outside the array.

 

I would write to the share which would utilise the cache for a couple of reasons:

 

1) It might prevent additional disk/s from spinning up (unless the disk/s you're writing to are also being seeded from)

2) It prevents some fragmentation of downloaded files as the full contiguous files are written to your disks each night.

 

But to each their own!

Link to comment


I'm not sure I get what you mean.

All torrents are on the cache drive in a folder that starts with a ".", so it is invisible to the mover script.

All files remain on the cache until they are manually moved to the disk of choice, so there is no fragmentation when the files are moved.
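
Something like this, purely as an illustration of that "hidden from the mover" behaviour (not the real mover script; the ".torrents" folder name is just an example):

```python
# Illustration only: a mover-style walk that leaves dot-prefixed top-level
# folders (e.g. /mnt/cache/.torrents) untouched, as described above.
from pathlib import Path

CACHE = Path("/mnt/cache")   # hypothetical cache mount

def files_the_mover_would_touch(cache_root: Path):
    for top in cache_root.iterdir():
        if top.name.startswith("."):   # hidden folder -> skipped entirely
            continue
        yield from (p for p in top.rglob("*") if p.is_file())

# Everything under /mnt/cache/.torrents stays put; everything else is eligible.
```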

 

The only disk that is active 24/7 is the cache drive.

Personally, I see no point in using a drive inside the protected array for torrents.

Torrents already have protection -> verify  ;D

 

 

Link to comment


This is a completely different kind of protection. The array does nothing like "verify". It protects only from a total disk failure. There is no individual file protection or "verify". This can be added with the md5deep package.

 

If the cache drive fails, everything on it is lost, but you can just download it again. If a file within unRAID becomes corrupt, there is no way to fix it. It will fail when accessed.

Link to comment

That is very true.

For critical data, it's always best to have (offsite) backups.

Apart from letting you know that a file is corrupted, MD5 doesn't really protect, does it? Creating parity (.par) files can, to a certain extent (it just seems like a lot of work).

Having the entire array active to seed some of the files is, for various reasons, not an option for me.

As you said, if the drive fails, most files can be downloaded again.

 

 

 

Link to comment


You're right. I use md5deep in case I ever get a parity error. Then I can determine if the error is in parity or on disk. If none of my data files have errors then I update parity. If I have corrupt data files then I can rebuild that disk.

Link to comment


How is this process configured? Do you have a cron job to scan all of your data drives and calculate the MD5 of each file (if it's been changed or wasn't already calculated)? I'd like to do something like this, too.

Link to comment

A parity error could be the result of a corrupt data disk or corrupt parity. It's most likely parity but this is not certain.

 

I have just done it manually and saved copies of the hashes for all data disks in a .hashes directory at the top level of each drive. For my full drives it never has to change. I occasionally update the hashes for a drive as it fills. A cron job is a good idea, though. It doesn't take very long to compute for an entire 2 TB drive, so a nightly job for all non-full data drives should work.
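
For anyone wanting to script it, here is a rough Python stand-in for the job md5deep is doing here: build a hash manifest per data disk and re-check it later from cron. The .hashes/manifest.md5 name and layout are just assumptions for the example, not anything unRAID or md5deep mandates:

```python
# Sketch: per-disk MD5 manifests, suitable for a nightly cron job.
import hashlib
from pathlib import Path

def md5_of(path: Path, chunk: int = 1 << 20) -> str:
    h = hashlib.md5()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

def build_manifest(disk_root: Path) -> None:
    """Write 'hash  relative/path' lines for every file on the disk."""
    manifest = disk_root / ".hashes" / "manifest.md5"
    manifest.parent.mkdir(exist_ok=True)
    with manifest.open("w") as out:
        for f in sorted(disk_root.rglob("*")):
            if f.is_file() and ".hashes" not in f.parts:
                out.write(f"{md5_of(f)}  {f.relative_to(disk_root)}\n")

def verify_manifest(disk_root: Path) -> list:
    """Return files whose current hash no longer matches the manifest."""
    bad = []
    for line in (disk_root / ".hashes" / "manifest.md5").read_text().splitlines():
        digest, rel = line.split("  ", 1)
        if md5_of(disk_root / rel) != digest:
            bad.append(rel)
    return bad

# e.g. build_manifest(Path("/mnt/disk1")) now, then verify_manifest(Path("/mnt/disk1"))
# after a parity error to see whether the data or the parity is at fault.
```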

 

Since I'm filling my media drives one at a time I only have a single drive to compute. I don't worry about disks that hold backups because if I have a parity error and my media drives are ok then I can just delete the backups and recompute parity. Backups are easy to replace.

Link to comment

True, or neither, if the bit was corrupted in memory. It is one chance out of N (where N = the total number of drives in your system + the number of other hardware items involved). It could be ANY disk, or any part of the I/O hardware, from memory to motherboard chipset.

Link to comment

So how does the md5deep package work? Does it create one file with a hash in it for each file? Or does it use a hash => value scheme, where there is one file with each line containing a file name and its MD5 value? Just want to figure out the best way to do this and make it as automated as possible...

Link to comment

I'm confused about the different drives that have been talked about. I'm currently using two WD 1 TB Greens for data and a 1 TB WD Blue for parity. These are all SATA drives and run at 3 Gbps. I now have the Pro license, so I'm thinking of adding a cache drive. The WD Blue is capable of 6 Gbps, same as the WD Black, BUT only if it's connected to the new SATA 3 ports. Since these were not available until now, how can a WD Black be any faster than my Greens or Blue, which all run at 3 Gbps on SATA 2 ports? I do understand that the Black has a 64 MB cache, the Greens have 32, and I think the Blue has 16.

Link to comment

Don't get sucked in by the marketing. Today's SATA disks, regardless of who they are made by, can basically attain a sustained max read speed of between 120 and 150 MB/s. (Multiply by 8 to get bps.)

 

150 * 8 = 1200 Mbps, or 1.2Gbps. 
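
To put some rough numbers side by side (rounded figures, just to illustrate the point):

```python
# Back-of-the-envelope comparison: sustained disk throughput vs. link speeds.
sustained_disk_mb_s = 150                      # ~max sustained read of a typical spinner
disk_gbps = sustained_disk_mb_s * 8 / 1000     # 150 MB/s -> 1.2 Gbps

link_speeds_gbps = {"SATA-1": 1.5, "SATA-2": 3.0, "SATA-3": 6.0, "Gigabit LAN": 1.0}
for name, gbps in link_speeds_gbps.items():
    bottleneck = "disk" if gbps > disk_gbps else "link"
    print(f"{name}: {gbps} Gbps link vs ~{disk_gbps} Gbps from the disk -> {bottleneck} is the bottleneck")
```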

 

It does not make a bit of difference if the SATA link to it can theoretically transfer bits faster; it is not going to happen. As has been frequently said, a spinning disk can barely saturate a SATA-1 link.

 

The BIGGEST factor for any disk is the areal density of the bits on the platters and the rotational speed of the platters. The cache on the disk is nearly useless when playing music or movies. (When was the last time you watched a movie that was less than 64 MB in size?)

 

Joe L.

Link to comment

Well, you should see a performance increase, as I believe your other drives are 5900 RPM, correct? Why do you think you'd see a performance "hit"?

 

I have a 500 GB 7200 RPM Hitachi cache drive, with my array drives all being greens--of mixed Hitachi and WD sizes. I never did a benchmark comparison between the two, but, according to those specs, performance should be increased. Plus, I use the cache drive for running YAMJ.

Link to comment

Depends on usage. For sequential read or write access, a slower but denser surface can yield better performance than a higher RPM disk with lower density.

 

But for random access, the higher RPM drive can perform better due to faster access time.
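
A quick worked example of the access-time point: average rotational latency is half a revolution, so it falls as RPM rises.

```python
# Average rotational latency = half a revolution.
for rpm in (5400, 5900, 7200):
    ms_per_rev = 60_000 / rpm
    print(f"{rpm} RPM: ~{ms_per_rev / 2:.2f} ms average rotational latency")
# 5400 -> ~5.56 ms, 5900 -> ~5.08 ms, 7200 -> ~4.17 ms
```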

 

Due to the way unRAID writes to the array, a higher RPM drive will provide better performance than a higher density, slower RPM drive for array disks.

Link to comment


Correct, my data drives are 5900 RPM. My concern about a performance hit is that the 500 GB drive I am thinking about using as a cache drive has an IDE interface, not SATA.

Link to comment


Do you expect to put more than 500 GB worth of data on your server per day? I don't see why that'd be an issue--actually, a smaller cache drive would be better. Use a cache drive that's as small as the daily amount of data you'd transfer, plus the size of whatever applications will permanently reside on your cache drive...
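
As a toy sizing estimate (the numbers are made up; plug in your own):

```python
# Rule of thumb from above: daily transfer + resident apps, plus a little slack.
daily_transfer_gb = 60    # hypothetical: the most data you'd copy in one day
resident_apps_gb = 20     # hypothetical: apps/appdata living on the cache
headroom = 1.25           # slack so the cache doesn't fill mid-copy

print(f"Suggested minimum cache size: ~{(daily_transfer_gb + resident_apps_gb) * headroom:.0f} GB")
```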

 

And yes, you should get a SATA interface cache drive...

Link to comment
