High I/O Wait
Posted by alisaqi, 08-21-2007, 05:42 AM |
The I/O wait on my server keeps increasing. Sometimes it stays under 30%, and sometimes it goes up to 90%.
Please advise what to do.
top - 04:40:44 up 5:30, 2 users, load average: 4.57, 2.64, 3.24
Tasks: 171 total, 2 running, 169 sleeping, 0 stopped, 0 zombie
Cpu(s): 18.3% us, 4.3% sy, 0.0% ni, 0.0% id, 77.0% wa, 0.3% hi, 0.0% si
Mem: 1027556k total, 1021944k used, 5612k free, 85460k buffers
Swap: 2040244k total, 2068k used, 2038176k free, 147752k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16754 apache 15 0 96140 58m 5948 S 8.3 5.8 0:04.95 httpd
19760 psaadm 15 0 36936 3628 1548 S 4.7 0.4 0:00.14 httpsd
5417 apache 15 0 103m 67m 5984 S 2.0 6.7 0:13.00 httpd
2738 mysql 15 0 394m 79m 4660 S 0.7 7.9 1:25.08 mysqld
28294 apache 15 0 104m 68m 6076 S 0.7 6.9 0:25.33 httpd
39 root 16 0 0 0 0 S 0.3 0.0 0:06.16 kswapd0
2857 root 16 0 4788 752 628 S 0.3 0.1 0:03.09 couriertcpd
3022 qmails 15 0 2304 484 372 S 0.3 0.0 0:09.28 qmail-send
32688 apache 15 0 96112 58m 6000 S 0.3 5.8 0:10.76 httpd
5632 popuser 15 0 35588 29m 2344 D 0.3 2.9 0:30.44 spamd
9850 apache 16 0 100m 65m 5996 S 0.3 6.5 0:09.85 httpd
12418 apache 16 0 99796 62m 6296 S 0.3 6.2 0:08.56 httpd
19578 qmaild 15 0 4008 840 692 S 0.3 0.1 0:00.01 qmail-smtpd
1 root 16 0 1652 552 472 S 0.0 0.1 0:00.70 init
2 root 34 19 0 0 0 S 0.0 0.0 0:00.29 ksoftirqd/0
3 root 5 -10 0 0 0 S 0.0 0.0 0:00.00 events/0
4 root 11 -10 0 0 0 S 0.0 0.0 0:00.00 khelper
5 root 15 -10 0 0 0 S 0.0 0.0 0:00.00 kacpid
19 root 5 -10 0 0 0 S 0.0 0.0 0:00.00 kblockd/0
37 root 20 0 0 0 0 S 0.0 0.0 0:00.00 pdflush
38 root 15 0 0 0 0 S 0.0 0.0 0:01.07 pdflush
40 root 11 -10 0 0 0 S 0.0 0.0 0:00.00 aio/0
20 root 15 0 0 0 0 S 0.0 0.0 0:00.00 khubd
186 root 25 0 0 0 0 S 0.0 0.0 0:00.00 kseriod
295 root 15 0 0 0 0 D 0.0 0.0 0:06.68 kjournald
1373 root 6 -10 1904 456 380 S 0.0 0.0 0:00.01 udevd
1606 root 6 -10 0 0 0 S 0.0 0.0 0:00.00 kauditd
1648 root 6 -10 0 0 0 S 0.0 0.0 0:00.00 kmirrord
1672 root 19 0 0 0 0 S 0.0 0.0 0:00.00 kjournald
1673 root 19 0 0 0 0 S 0.0 0.0 0:00.00 kjournald
2360 root 15 0 2160 552 456 S 0.0 0.1 0:07.27 syslogd
2364 root 16 0 2620 384 316 S 0.0 0.0 0:00.00 klogd
2391 rpc 15 0 1952 548 452 S 0.0 0.1 0:00.00 portmap
2410 rpcuser 18 0 2400 724 620 S 0.0 0.1 0:00.00 rpc.statd
2436 root 16 0 4628 340 172 S 0.0 0.0 0:00.00 rpc.idmapd
2507 root 17 0 3060 508 300 S 0.0 0.0 0:00.00 smartd
|
Posted by SagoKyle, 08-21-2007, 09:52 AM |
Hmmm....
I've seen this often on web servers running poorly written PHP apps that do improper MySQL joins and similar nonsense. Basically, it causes temporary tables to be written to disk, which causes I/O wait and extra load.
Can you connect to MySQL (or use phpMyAdmin) and get a list of the running MySQL processes when this is happening?
Also, just out of curiosity, can you run "hdparm -tT /dev/hda"? (Replace /dev/hda with whatever your hard drive device is.)
|
Posted by david510, 08-21-2007, 10:06 AM |
Run the following command to see whether a particular database is causing it:
mysqladmin -i3 processlist
|
Posted by anatolijd, 08-21-2007, 10:07 AM |
What is the apache full status output?
It looks like this may be an Apache process, or some PHP script (if mod_php is used).
# ~apache/bin/apachectl fullstatus
|
Posted by Slidey, 08-21-2007, 11:51 AM |
If you're running multiple disks, you could use iostat (iostat -xn 3) to work out which filesystems are causing you trouble.
|
Posted by alisaqi, 08-21-2007, 12:05 PM |
It gives me the following; there seems to be no result:
+--------+------+-----------+----+---------+------+-------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+--------+------+-----------+----+---------+------+-------+------------------+
| 121549 | root | localhost | | Query | 0 | | show processlist |
+--------+------+-----------+----+---------+------+-------+------------------+
+--------+------+-----------+----+---------+------+-------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+--------+------+-----------+----+---------+------+-------+------------------+
| 121549 | root | localhost | | Query | 0 | | show processlist |
+--------+------+-----------+----+---------+------+-------+------------------+
+--------+------+-----------+----+---------+------+-------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+--------+------+-----------+----+---------+------+-------+------------------+
| 121549 | root | localhost | | Query | 0 | | show processlist |
+--------+------+-----------+----+---------+------+-------+------------------+
+--------+------+-----------+----+---------+------+-------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+--------+------+-----------+----+---------+------+-------+------------------+
| 121549 | root | localhost | | Query | 0 | | show processlist |
+--------+------+-----------+----+---------+------+-------+------------------+
|
Posted by alisaqi, 08-21-2007, 12:10 PM |
This one is not working:
[root@sml101 bin]# ~apache/bin/apachectl fullstatus
-bash: /var/www/bin/apachectl: No such file or directory
[root@sml101 ]# whereis apache
apache:
[root@sml101 ]# whereis apachectl
apachectl:
|
Posted by anatolijd, 08-22-2007, 09:27 AM |
locate apachectl
?
|
Posted by YYamagishi, 08-22-2007, 10:48 AM |
Make sure you run updatedb first.
It's weird that apachectl can't be found.
|
Posted by david510, 08-22-2007, 11:42 AM |
Run the following command; it should show the path to the Apache binary. Use that to see the status.
ps ax | grep http
Normally it should be like /usr/sbin/httpd or /usr/local/apache/bin/httpd
Use the following command to see status.
/usr/sbin/httpd fullstatus
|
Posted by alisaqi, 08-23-2007, 03:21 AM |
[root@sml101 ]# ps ax | grep http
2991 ? Ss 0:07 /usr/sbin/httpd
3537 ? Ss 0:00 /usr/local/psa/admin/bin/httpsd
24164 ? S 0:10 /usr/local/psa/admin/bin/httpsd
24208 ? S 0:09 /usr/local/psa/admin/bin/httpsd
16806 ? S 0:00 /usr/sbin/httpd
30647 ? S 0:16 /usr/sbin/httpd
31126 ? S 0:18 /usr/sbin/httpd
31647 ? S 0:18 /usr/sbin/httpd
31651 ? S 0:17 /usr/sbin/httpd
31652 ? S 0:15 /usr/sbin/httpd
5142 ? S 0:17 /usr/sbin/httpd
9329 ? S 0:07 /usr/sbin/httpd
13635 ? S 0:03 /usr/sbin/httpd
14140 ? S 0:02 /usr/sbin/httpd
14203 ? S 0:01 /usr/sbin/httpd
14209 ? S 0:01 /usr/sbin/httpd
14253 ? S 0:02 /usr/sbin/httpd
14397 ? S 0:02 /usr/sbin/httpd
14411 ? S 0:01 /usr/sbin/httpd
16373 pts/0 R+ 0:00 grep http
[root@sml101 ]#
[root@sml101 ]# /usr/local/apache/bin/httpd fullstatus
-bash: /usr/local/apache/bin/httpd: No such file or directory
[root@sml101 ]# /usr/sbin/httpd fullstatus
Usage: /usr/sbin/httpd [-D name] [-d directory] [-f file]
[-C "directive"] [-c "directive"]
[-k start|restart|graceful|stop]
[-v] [-V] [-h] [-l] [-L] [-t] [-S]
Options:
-D name : define a name for use in directives
-d directory : specify an alternate initial ServerRoot
-f file : specify an alternate ServerConfigFile
-C "directive" : process directive before reading config files
-c "directive" : process directive after reading config files
-e level : show startup errors of level (see LogLevel)
-E file : log startup errors to file
-v : show version number
-V : show compile settings
-h : list available command line options (this page)
-l : list compiled in modules
-L : list available configuration directives
-t -D DUMP_VHOSTS : show parsed settings (currently only vhost settings)
-S : a synonym for -t -D DUMP_VHOSTS
-t : run syntax check for config files
[root@sml101 ]#
|
Posted by alisaqi, 08-23-2007, 04:19 AM |
[root@sml101 bin]# locate apachectl
/usr/share/zsh/4.2.0/functions/_apachectl
/usr/share/man/fr/man8/apachectl.8.gz
/usr/share/man/man8/apachectl.8.gz
/usr/sbin/apachectl
/var/www/manual/programs/apachectl.html
/var/www/manual/programs/apachectl.html.en
/var/www/manual/programs/apachectl.html.ko.euc-kr
[root@sml101 bin]# /usr/sbin/apachectl fullstatus
Not Found
The requested URL /server-status was not found on this server.
---------------------------------------------------------------------------
Apache/2.0.52 (Red Hat) Server at localhost Port 80
[root@sml101 bin]#
|
Posted by macker, 08-23-2007, 04:29 AM |
There are a few easy things you can do.
Run iostat to see which filesystem the problem is on.
Run vmstat to monitor paging activity; or just try turning swap off for a few minutes (swapoff -a), see if the iowait goes down, then turn it back on (swapon -a).
Check MySQL for problems (see a MySQL tuning guide); run 'mysqladmin -u root -p extended-status' (-p prompts for a password; if there is none, omit it). High counts for Opened_tables and Created_tmp_disk_tables can both indicate potentially unnecessary disk activity, while Slow_queries and Table_locks_waited can be symptoms that MySQL is being affected.
If this is an IDE-based system (/dev/hda), try 'hdparm -c 1 -d 1 /dev/hda'.
Without knowing what changed, you're stuck shooting in the dark, but iowait generally means that your disks aren't keeping up, either on reads or on writes. Adding more RAM can help with reads, but improving writes usually means faster disks or fewer writes. Depending on what's writing, you may be able to improve this.
A failing disk can also degrade performance and lead to high iowait.
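If it helps, here is a rough sketch of checking the temporary-table counters in one pass. It assumes the table-style output of 'mysqladmin extended-status' has been captured as text; the function names parse_status and tmp_disk_ratio are made up for illustration, and Created_tmp_tables (the total number of temporary tables) is used as the denominator, which is my own choice rather than anything from this thread.

```python
def parse_status(text):
    """Parse the '| Name | Value |' rows printed by mysqladmin extended-status."""
    status = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        # A data row splits into ['', name, value, '']; '+---+' borders do not.
        if len(parts) == 4 and parts[1] and parts[1] != "Variable_name":
            status[parts[1]] = parts[2]
    return status


def tmp_disk_ratio(status):
    """Fraction of temporary tables that went to disk (0.0 if none were created)."""
    total = int(status.get("Created_tmp_tables", 0))
    on_disk = int(status.get("Created_tmp_disk_tables", 0))
    return on_disk / total if total else 0.0
```

A ratio that stays high while iowait is high would point toward the temporary-table theory; what counts as "high" depends on the workload, so treat any cutoff as a judgment call.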
|
Posted by alisaqi, 08-23-2007, 05:10 AM |
OK, please advise on this:
[root@sml101 ~]# iostat
Linux 2.6.9-42.0.10.EL (sml101) 08/23/2007
avg-cpu: %user %nice %sys %iowait %idle
12.13 0.09 2.49 21.92 63.36
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
hda 39.76 270.78 590.25 51592254 112460144
hda1 0.00 0.01 0.00 1746 48
hda2 0.06 0.17 0.29 32626 55352
hda3 94.24 270.60 589.96 51556906 112404744
hdb 0.00 0.02 0.00 2906 248
hdb1 0.00 0.01 0.00 1466 248
[root@sml101 ~]#
[root@sml101 ~]# vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 2 9492 4052 127812 114280 0 0 136 295 14 107 12 2 63 22
[root@sml101 ~]#
Please explain what happened with the following command. I am confused, and I don't want to mess anything up.
[root@sml101 ~]# hdparm -c 1 -d 1 /dev/hda
/dev/hda:
setting 32-bit IO_support flag to 1
setting using_dma to 1 (on)
IO_support = 1 (32-bit)
using_dma = 1 (on)
[root@sml101 ~]#
|
Posted by macker, 08-23-2007, 05:35 AM |
iostat is not easy to explain, but you should run it as 'iostat -x 3'. (I'm not sure about Slidey's -n syntax; I don't think it will work cleanly as written, at least not on Linux.)
It will show the stats every 3 seconds for all the partitions. Look for the one(s) with a high %util and/or a high await. Note that one or more may show a very high await that never changes; if there's no activity, await doesn't get updated. Look at the rkB/s and wkB/s columns; if you see them peaking regularly for one device, especially in conjunction with a high %util, that helps to narrow down where on your disk the I/O is happening.
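As a rough illustration of "look for high %util", the sketch below scans one captured 'iostat -x' report and lists the busy devices. The function name busy_devices and the 80% cutoff are my own assumptions, not anything iostat defines. Columns are looked up by header name because the exact column set varies between sysstat versions; if there is no plain "await" column, it is reported as None.

```python
def busy_devices(report, util_threshold=80.0):
    """Return (device, %util, await) for rows of an `iostat -x` report at or above the threshold."""
    columns = None
    busy = []
    for line in report.splitlines():
        fields = line.split()
        if not fields:
            columns = None  # a blank line ends the device table
            continue
        if fields[0].startswith("Device"):
            columns = fields  # remember the header so columns are found by name
            continue
        if columns is None or len(fields) != len(columns):
            continue  # not a device row (banner, avg-cpu lines, shell prompt)
        row = dict(zip(columns[1:], fields[1:]))
        if "%util" not in row:
            continue  # plain iostat output without -x has no %util column
        util = float(row["%util"])
        if util >= util_threshold:
            await_ms = float(row["await"]) if "await" in row else None
            busy.append((fields[0], util, await_ms))
    return busy
```

Feeding it successive reports (one per 3-second interval) and watching which device keeps showing up is the same thing as eyeballing the terminal, just easier to log.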
vmstat is similar: you need to run it in an ongoing mode, e.g. 'vmstat 3'. This will show ongoing activity. The si and so columns are of interest, to see whether the system is using swap a lot; heavy swap use is bad.
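A minimal sketch of pulling the si and so columns out of captured vmstat output follows. The function names swap_activity and is_swapping are hypothetical. Keep in mind that the first line vmstat prints is an average since boot, so only the later samples reflect current activity.

```python
def swap_activity(vmstat_output):
    """Return a list of (si, so) pairs, one per sample line of vmstat output."""
    columns = None
    samples = []
    for line in vmstat_output.splitlines():
        fields = line.split()
        if "si" in fields and "so" in fields:
            columns = fields  # the header row that names each column
        elif columns and len(fields) == len(columns) and fields[0].isdigit():
            row = dict(zip(columns, fields))
            samples.append((int(row["si"]), int(row["so"])))
    return samples


def is_swapping(samples):
    """True if any sample shows pages moving to or from swap."""
    return any(si > 0 or so > 0 for si, so in samples)
```

On the single vmstat sample posted earlier in this thread, both si and so are 0, so swapping does not look like the cause at that moment.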
-c 1 turns on 32-bit disk access. -d 1 turns on DMA mode. Both of these are performance options which _may_ be disabled by default, just in case you have an old crappy system that has bugs with them; this is more a relic from the 486 days. They are probably turned on by default, but we're double-checking here. With these options turned off, the disk is accessed in slower ways.
As always, check dmesg ('dmesg | tail') for errors about your disk; e.g. things which say "hda" or "hdb", especially errors like "attempted to read beyond end of device" or "timeout".
It looks like you have a device named hdb; this may be a CD-ROM drive, but it's not in use. hda3 looks to be the most active partition, and is currently write-heavy. Are lots of logs being generated, maybe?
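For what it's worth, the "most active partition" reading can be reproduced from plain iostat output with something like the sketch below. The names partition_io and busiest_partition are made up, and treating "device name ends in a digit" as "is a partition" is an assumption that fits the hdX/sdX naming in this thread but not every system.

```python
def partition_io(report):
    """Map device name -> (Blk_read/s, Blk_wrtn/s) from plain `iostat` output."""
    columns = None
    rates = {}
    for line in report.splitlines():
        fields = line.split()
        if not fields:
            columns = None  # a blank line ends the device table
        elif fields[0].startswith("Device"):
            columns = fields
        elif columns and len(fields) == len(columns):
            row = dict(zip(columns[1:], fields[1:]))
            if "Blk_read/s" in row and "Blk_wrtn/s" in row:
                rates[fields[0]] = (float(row["Blk_read/s"]), float(row["Blk_wrtn/s"]))
    return rates


def busiest_partition(rates):
    """Name of the partition moving the most blocks per second, or None if there are none."""
    partitions = {name: rw for name, rw in rates.items() if name[-1].isdigit()}
    if not partitions:
        return None
    return max(partitions, key=lambda name: sum(partitions[name]))
```

On the iostat output posted earlier in this thread, this picks hda3, where writes (589.96 blocks/s) are roughly double the reads (270.60 blocks/s).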
If all else fails, and you just want a magic bullet, run: 'mount -o remount,noatime /dev/hda3'
This turns off atime (a relatively useless stat) for the filesystem. atime logs the last time a file was accessed. Not the last time you opened the file, but the last time you did _anything_ with the file. Doing 'ls /test' updates the atime. Web servers pound the disk with atime updates, with no benefit; turning it off helps reduce unnecessary writes.
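To confirm the remount took effect, you can look at the options column in /proc/mounts. A small sketch, with hypothetical function names mount_options and has_noatime, and assuming for illustration that the filesystem in question is mounted at "/" (the thread doesn't say where hda3 is mounted):

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style text, or None."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, comma-separated options, dump, pass.
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")  # a later entry overrides an earlier one
    return options


def has_noatime(mounts_text, mount_point="/"):
    """True if mount_point is listed and its options include noatime."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "noatime" in options
```

Typical use would be has_noatime(open("/proc/mounts").read(), "/") before and after the remount.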
But ultimately, if your iowait just started going up, you should figure out why; what changed? Or, maybe it's been increasing for a while, and you just now noticed.
These are just some suggestions to go on. hdparm and the noatime option should both be safe, and a reboot will reset both to their original settings. iostat and vmstat are read-only and don't change anything.
|