Go for the Gold with your Disk I/O!


Optimize fstab to Improve R/W Performance of Hard-drives

Wow, what a difference! If you have read a couple of my other articles, you may know about my adventures with file sharing on a home network.

Taking a homebrew approach at setting all this up has proven… interesting.

Not the least of which was when I figured out why syncing was so slow to the RAID on my file server. In one sentence: I needed to change sync to async in fstab. There! Article finished. Just kidding; please keep reading to learn more.

For those who may not know, fstab is where Linux goes to get mounting options for disks upon start-up. Think of it like a cheat-sheet for those really tough problems on a final exam. With the proper settings your disks will perform well, but as I found out, it only takes one parameter to make everything go sideways!

The async option in fstab is the opposite of sync; async is the default, and sync is rarely used.

The sync option means all changes to the filesystem are immediately flushed to disk. For mechanical drives this leads to a huge slowdown, as the system has to move the disk heads into position and wait for each operation to complete.

With async the system buffers the write operation and optimizes the actual writes; instead of being blocked, the process continues to run.

See this post for details.
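
To make this concrete, here is roughly what the change looks like in /etc/fstab; the UUID and mount point below are placeholders rather than my actual RAID entry:

# before: every write is flushed to disk immediately
UUID=xxxx-xxxx  /mnt/raid  ext4  defaults,sync  0  2

# after: writes are buffered (async is the default, so plain defaults also works)
UUID=xxxx-xxxx  /mnt/raid  ext4  defaults,async  0  2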

Wow, what a difference that made!

I found it very useful to open both VNC and SSH sessions to debug the remote machine during the below steps.

Grep to Find a Process

Useful tools for troubleshooting here:

ps fauxww | grep -A 1 '[r]sync'

I used the above grep command to see whether there are any rsync processes running. Yes, at this point it is worth mentioning that I use rsync to automate copying and related operations between my machines. Basic copying from one machine to another will also work during troubleshooting, but I find it easier to debug when a separate process is handling the copying. Think of it like trying to run a race while figuring out what is wrong with your gait: I do not know about the rest of you, but it is probably easier to watch someone else run and critique their form than to watch and critique yourself.

In short, if everything is going well with your automated copying and syncing operations, there should be multiple rsync processes, as per this post.
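
For context, my automated copies look something like the following; the paths and host name here are placeholders rather than my actual layout:

rsync -avh --delete /home/<user>/Documents/ <user>@fileserver:/mnt/raid/Documents/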

Disk Monitoring

Another useful tool when figuring out read/write transfer speeds:

iotop

The iostat and iotop commands (run as SuperUser) help monitor system input/output device loading by observing the time the devices are active in relation to their average transfer rates. They are sometimes used to evaluate the balance of activity between disks as per this post.
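
A couple of invocations I find handy (iostat ships with the sysstat package on most distributions). Show only the processes that are currently doing I/O:

sudo iotop -o

Report extended per-device statistics every five seconds:

iostat -dx 5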

Finding a Folder

Okay, so this is not strictly related to the problem at hand; call it a close cousin to all of the other troubleshooting bits and pieces. Say what you will, but I find it helpful to be able to quickly locate and confirm items in sub-folders, particularly since my rsync copy operations run with a delete option each time. To find a folder named Documents in your home directory, run:

find $HOME -type d -name 'Documents'

OR

find ~ -type d -name 'Documents'

OR

find /home/<user>/ -type d -name 'Documents'

See here for details.

Kill Process

There are a number of ways to kill a process if you know its name. Again, not strictly tied to the problem at hand, but I did find it helpful while going through the other troubleshooting steps. Here are a couple of different ways to accomplish this. We are going to assume that the process we are trying to kill is named irssi:

kill $(pgrep irssi)

killall -v irssi

pkill irssi

kill `ps -ef | grep irssi | grep -v grep | awk '{print $2}'`

Refer here for details.

Kernel Tracing

Back to something a tad more relevant to our problem at hand: the utilities for kernel tracing are very useful when hunting down processes and services that are consuming CPU and I/O!

In my quest to find what was causing such slow read/write speeds, I posed a number of questions as a process of elimination.

Was it the stripe size for the RAID array that was causing the slow-down? Nope; this did not have an effect.

What about the commit time? The default should be five seconds before jbd2 commits the journal.

Still a lot of excessive flushing after specifying commit=10 (10 seconds) in fstab!
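
For reference, the commit interval is just another mount option in fstab; this is roughly what that attempt looked like (placeholder UUID and mount point again):

UUID=xxxx-xxxx  /mnt/raid  ext4  defaults,commit=10  0  2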

This was the post that ultimately got me pointed in the right direction!

In order to capture events you need to be logged in as SuperUser; see here for details:

sudo -s

Set events for all services beginning with jbd2:

echo jbd2:* > /sys/kernel/debug/tracing/set_event

Cancel all event tracing:

echo > /sys/kernel/debug/tracing/set_event

Set ext4_sync_file_enter event:

echo 1 > /sys/kernel/debug/tracing/events/ext4/ext4_sync_file_enter/enable

Send trace output to a non-SuperUser log file:

cat /sys/kernel/debug/tracing/trace > /home/<user>/out.txt

or

cat /sys/kernel/debug/tracing/trace_pipe > /home/<user>/out.txt
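
Once a capture exists, a quick count of the sync events gives a rough feel for how much flushing is going on (using the hypothetical output file from above):

grep -c ext4_sync_file_enter /home/<user>/out.txt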

Taking a closer look at what was captured with the ext4_sync_file_enter event, I determined that there was an absolute crap-ton of flushing occurring on a regular basis. Bingo! This was my culprit. The excessive flushing was due to the sync setting for my RAID in the fstab file. While possibly good in some situations, it was preferable (at least for me) to change it to async. From my perspective, any downside of choosing this option was far outweighed by the significant improvement in disk I/O. All that being said, you will want to do a bit more reading before deciding if this option is right for you, too.
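
As a side note, you do not need to reboot to try the new setting; remounting the filesystem applies it immediately (assuming a mount point of /mnt/raid here):

sudo mount -o remount,async /mnt/raid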

Lots of troubleshooting with different Linux services and utilities, but it was worth it; rsync transfers from one machine to another went from a couple of kB/s to clocking in at 60 MB/s!

Postscript

In my search for possible causes I pondered disk fragmentation. It turns out the ext4 filesystem should not need to be defragmented; see this post for details. This whole article was admittedly a bit fragmented itself, as there were a number of unrelated services and utilities brought to bear to get to the root of my disk I/O problem. What experiences have you had in making your disks read and write faster?
