Gatt Posted May 5, 2010 Running this on a dc5800 with 4 GB RAM and a C2D CPU, but as the pic shows, my load average is constantly maxed out!! Anyone got any ideas why?
CyberNerd Posted May 5, 2010 Log in as root (ssh, port 222, or system > shell) and then run top to see which process is maxing out.
Gatt (Author) Posted May 5, 2010 Hope this makes sense to someone..

[root@proxy root]# top
top - 14:51:11 up 2:09, 1 user, load average: 21.23, 22.24, 24.85
Tasks: 427 total, 5 running, 421 sleeping, 0 stopped, 1 zombie
Cpu(s): 19.5%us, 4.3%sy, 0.0%ni, 0.0%id, 51.7%wa, 12.3%hi, 12.2%si, 0.0%st
Mem: 3615832k total, 3500540k used, 115292k free, 95984k buffers
Swap: 3622648k total, 2996k used, 3619652k free, 2593636k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
19947 postgres 16 0 394m 46m 44m D 16 1.3 0:01.33 postgres
3313 squid 15 0 50308 44m 1724 R 8 1.3 17:09.89 squid
7709 postgres 18 0 394m 362m 360m D 6 10.3 2:45.98 postgres
6891 postgres 18 0 394m 362m 360m D 6 10.3 5:16.74 postgres
10794 postgres 16 0 395m 361m 359m D 3 10.2 1:47.89 postgres
20129 postgres 18 0 393m 25m 24m D 3 0.7 0:00.65 postgres
3595 guardian 16 0 176m 163m 1136 S 3 4.6 2:15.82 dansguardian
15465 guardian 16 0 176m 164m 1704 S 3 4.6 0:00.57 dansguardian
19556 guardian 16 0 176m 164m 1796 R 2 4.7 0:02.64 dansguardian
17048 guardian 15 0 176m 164m 1812 S 2 4.7 0:03.41 dansguardian
17831 auth 15 0 14944 5140 2704 S 2 0.1 0:00.06 smoothauthd
17047 guardian 15 0 176m 164m 1856 S 1 4.7 0:03.35 dansguardian
19482 guardian 15 0 176m 164m 1812 S 1 4.7 0:04.12 dansguardian
19396 guardian 16 0 176m 164m 1808 S 1 4.7 0:00.92 dansguardian
19483 guardian 15 0 176m 164m 1820 S 1 4.7 0:02.90 dansguardian
19552 guardian 16 0 176m 164m 1808 S 1 4.7 0:05.15 dansguardian
19687 guardian 15 0 176m 164m 1752 S 1 4.6 0:00.42 dansguardian
2503 root 16 0 6088 4236 1048 S 1 0.1 1:22.52 trafficlogger
3596 guardian 16 0 176m 163m 880 S 1 4.6 0:16.95 dansguardian
12994 guardian 16 0 176m 164m 1812 S 1 4.7 0:02.82 dansguardian
13004 guardian 16 0 176m 164m 1852 S 1 4.7 0:03.77 dansguardian
13005 guardian 16 0 176m 164m 1856 S 1 4.7 0:01.64 dansguardian
15470 guardian 16 0 176m 164m 1736 S 1 4.6 0:00.56 dansguardian
19530 root 16 0 6160 1860 1540 S 1 0.1 0:00.33 sshd
19551 guardian 16 0 176m 164m 1780 S 1 4.6 0:01.03 dansguardian
19553 guardian 16 0 176m 164m 1812 S 1 4.6 0:01.60 dansguardian
19555 guardian 15 0 176m 164m 1816 S 1 4.7 0:02.30 dansguardian
19689 guardian 16 0 176m 164m 1812 S 1 4.6 0:01.71 dansguardian
19886 guardian 16 0 176m 163m 1704 S 1 4.6 0:00.17 dansguardian
6 root 10 -5 0 0 0 S 0 0.0 0:35.90 events/0
87 root 15 0 0 0 0 D 0 0.0 0:17.54 kswapd0
3319 squid 15 0 2048 776 464 S 0 0.0 0:00.80 fakeauth_auth
3320 squid 15 0 2012 736 464 S 0 0.0 0:00.26 fakeauth_auth
3365 squid 15 0 104m 1500 1104 S 0 0.0 0:16.44 iwf-squid-threa
3452 auth 16 0 9652 1816 1324 S 0 0.1 0:21.49 smoothauthd
14900 nobody 16 0 4760 2368 1436 S 0 0.1 0:02.19 httpd
15273 postgres 17 0 395m 280m 277m D 0 7.9 0:30.11 postgres
15458 guardian 16 0 176m 164m 1828 S 0 4.6 0:02.53 dansguardian
15463 guardian 15 0 176m 164m 1780 S 0 4.6 0:00.24 dansguardian
15468 guardian 15 0 176m 164m 1752 S 0 4.6 0:00.30 dansguardian
19398 guardian 15 0 176m 164m 1824 S 0 4.6 0:02.17 dansguardian
19480 guardian 15 0 176m 164m 1828 S 0 4.7 0:02.99 dansguardian
19485 guardian 15 0 176m 164m 1752 S 0 4.6 0:01.01 dansguardian
19688 guardian 16 0 176m 164m 1744 R 0 4.6 0:01.41 dansguardian
[root@proxy root]#

Just before I stopped it, squid was at the top of that list with %CPU at 45...
tom_newton Posted May 5, 2010 That is pretty huge. You got yesterday's updates on? I will take a guess it is something database vs. disk to get the load average that high. Might be worth poking support with something sharp. Edit: guess was right - hadn't seen your latest post. Postgres is having your disk for dinner. Support will definitely be able to help on that score.
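For anyone reading the top output the same way: on Linux the load average counts tasks in uninterruptible sleep (state D, usually waiting on disk) as well as runnable ones, which is why a box with 51.7%wa can show a load of 20+ without the CPU doing much useful work. A minimal Python sketch, assuming top's default column layout (state letter in the 8th column), that tallies process states from pasted rows. The function name `count_states` is just illustrative, and the sample rows are copied from the output above.

```python
from collections import Counter


def count_states(top_rows):
    """Tally the state letter (column 'S') from top-style process rows.

    Assumes top's default layout:
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    """
    counts = Counter()
    for row in top_rows:
        fields = row.split()
        # Skip headers, prompts and anything that is not a process row.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        counts[fields[7]] += 1
    return counts


# A few rows taken from the output posted above.
sample = [
    "PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND",
    "19947 postgres 16 0 394m 46m 44m D 16 1.3 0:01.33 postgres",
    "3313 squid 15 0 50308 44m 1724 R 8 1.3 17:09.89 squid",
    "7709 postgres 18 0 394m 362m 360m D 6 10.3 2:45.98 postgres",
    "3595 guardian 16 0 176m 163m 1136 S 3 4.6 2:15.82 dansguardian",
]

states = count_states(sample)
print("D (waiting on I/O):", states["D"])
print("R (running):", states["R"])
print("S (sleeping):", states["S"])
```

A pile of D-state postgres processes alongside high %wa points at disk contention rather than a shortage of CPU.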
Gatt (Author) Posted May 5, 2010 @tom - yep, installed them over lunch but it's not had any effect...
tom_newton Posted May 5, 2010 Has it settled down yet? If not, please raise a ticket and we'll get it looked into.
Gatt (Author) Posted May 5, 2010 Looks like it is.. I know it's only 4 something but there's no one in school!
kmount Posted May 5, 2010 Got yourself a zombie process there too. What happened just before 12:44 today? That looks like around system boot time; was this after installing the updates? Looks like something (probably postgres) is battering your disks + CPU, as Tom said there. Sure support will nip it (with a kick) in the bud.
Gatt (Author) Posted May 5, 2010 The reboot at 12:44 was just after the installation of the updates.. Will call support tomorrow then.
Gatt (Author) Posted May 6, 2010 OK, raised a support ticket this morning as I came in at 7am to find it complaining with no users on the system. Now it's spewing out IP LIMIT EXCEEDED errors, yet when I check we have only used 219 out of our allotted 320 licences! Still waiting on it rebooting - been stuck at "System going down for Reboot NOW!" for the past five minutes.
tom_newton Posted May 6, 2010 Urk. I wonder if someone's DoSing the box by mistake? Have asked for your ticket to be actioned ASAP.
Gatt (Author) Posted May 6, 2010 No, because at the time it wasn't really affecting QoS - just slow on the Web UI and slower browsing. But if it's gonna keep doing this then I'm gonna be getting turned very slowly over hot coals!! How do I get the priority changed?
Gatt (Author) Posted May 6, 2010 Also, can I check to see if there are any DoS attempts?
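One rough way to look for a single machine hammering the proxy is to count requests per client in the proxy access log. The Python sketch below assumes the log is in Squid's native access.log format, where the third whitespace-separated field is the client address; whether this box logs in that format, and where the file lives, is not stated in the thread. The function name `top_clients` and the sample lines are made up for illustration.

```python
from collections import Counter


def top_clients(log_lines, n=3):
    """Return the n busiest client addresses as (address, request_count) pairs.

    Assumes Squid's native access.log layout:
    time elapsed client action/code bytes method URL ident hierarchy/peer type
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or truncated lines
        counts[fields[2]] += 1  # third field is the client address
    return counts.most_common(n)


# Hypothetical log lines, not taken from the system in this thread.
sample_log = [
    "1273130000.000 120 10.0.0.5 TCP_MISS/200 1024 GET http://example.com/ - DIRECT/192.0.2.1 text/html",
    "1273130001.000 80 10.0.0.5 TCP_MISS/200 2048 GET http://example.com/a - DIRECT/192.0.2.1 text/html",
    "1273130002.000 95 10.0.0.9 TCP_HIT/200 512 GET http://example.com/b - NONE/- text/html",
    "1273130003.000 60 10.0.0.5 TCP_MISS/200 4096 GET http://example.com/c - DIRECT/192.0.2.1 text/html",
]

for address, requests in top_clients(sample_log):
    print(address, requests)
```

On a real log you would pass the open file object instead of `sample_log`. One address with a request count far above the rest is worth a closer look, though a high count alone does not prove a DoS.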
Gatt (Author) Posted May 6, 2010 Just off the phone after speaking to a very nice chap by the name of Lyndon. Looks like our DB was corrupted (@DMcCoy - they are kept for 6 months..). He's cleared it and is waiting on it to rebuild. He's also advised reducing our web cache down to the minimum, as it appears the SATA drive in the server is being referenced as a legacy IDE!! Fingers crossed this sorts it!!
DMcCoy Posted May 6, 2010 How many requests do you get a day? I found that we simply produce too much data, and have the database retention set to a month to reduce system load.
tom_newton Posted May 6, 2010 He's a good chap is our Lyndon. You can usually change in the BIOS how SATA drives are presented to the OS (is this a Dell?) - but you will possibly need to reinstall, as it changes how the kernel letters the drives... if you felt up to the job you could probably hand-hack it tho.
Gatt (Author) Posted May 6, 2010 Not sure off the top of my head, but we do regularly have over 300 PCs on at a time....
tom_newton Posted May 6, 2010 DMcCoy said: "How many requests do you get a day? I found that we simply produce too much data, and have the database retention set to a month to reduce system load." Is this post db updates? I know you were one of the earlier adopters...
Gatt (Author) Posted May 6, 2010 Yeah it was, tom.. Lyndon's cleared the DB and forced it to rebuild the schema.

Email from Support:

Hi,

Just to confirm our conversation: we found issues with the database and decided to rebuild it with the new schema. We also adjusted the prune settings and free space settings so that it's not freeing up space quite as soon, and to stop the database growing.

We also found the hard disk to be quite slow, causing the load average to increase. This looked to be caused by the server being configured in a "legacy ide" mode, and we would recommend at some stage seeing if the BIOS can be configured to a SATA or RAID setting instead, though this would unfortunately need a re-install.

However, the Smoothwall has been running for some months and only recently had issues, so I suggested, as you have a fast internet connection, reducing the web proxy cache from 1500 to 50M in order to reduce the disk access.

Any questions or queries please let us know,
DMcCoy Posted May 6, 2010 I'm running a new install on a physical box so it's difficult to compare. My old virtualisation hosts didn't have the VT extensions, so I took a heavy hit on my VM Smoothwall box. This was much improved on new servers, although the DB updates did help on the old box.