Mastering Unix Shell Scripting, Part 4

OS=$(uname)

case $OS in
AIX|HP-UX)   SWITCH='-t'
             F1=3
             F2=4
             F3=5
             F4=6
             echo "\nThe Operating System is $OS\n"
             ;;
Linux|SunOS) SWITCH='-c'
             F1=1
             F2=2
             F3=3
             F4=4
             echo "\nThe Operating System is $OS\n"
             ;;
*)           echo "\nERROR: $OS is not a supported operating system\n"
             echo "\n\t EXITING \n"
             exit 1
             ;;
esac

Listing 7.2 Case statement for the iostat fields of data.

Notice in Listing 7.2 that we use a single case statement to set up the environment for the shell script to run the correct iostat command for each of the four Unix flavors. If the Unix flavor is not in the list, the user receives an error message before the script exits with a return code of 1. Later we will cover the entire shell script.

Syntax for sar

The name sar stands for system activity report. Using the sar command we can take samples at a fixed interval for a specific number of intervals. For example, we can take 4 samples that are 10 seconds each, and the sar command automatically averages the results for us. Let's look at the output of the sar command for each of our Unix flavors: AIX, HP-UX, Linux, and Solaris.

AIX

# sar 10 4

AIX yogi 1 5 000125604800    07/26/02

17:44:54    %usr    %sys    %wio   %idle
17:45:04      25      75       0       0
17:45:14      25      75       0       0
17:45:24      26      74       0       0
17:45:34      25      75       0       0

Average       25      75       0       0

Now let's look at the average of the samples directly.

# sar 10 4 | grep Average
Average       26      74       0       0

HP-UX

# sar 10 4

HP-UX dino B.10.20 A 9000/715    07/29/102

22:48:10    %usr    %sys    %wio   %idle
22:48:20      40      60       0       0
22:48:30      40      60       0       0
22:48:40      12      19       0      68
22:48:50       0       0       0     100

Average       23      35       0      42

Now let's look at only the average of the samples.

# sar 10 4 | grep Average
Average       25      37       0      38

Linux

# sar 10 4

Linux 2.4.2-2 (bambam)    07/29/2002

10:01:59 PM    CPU    %user    %nice  %system    %idle
10:02:09 PM    all     0.10     0.00     0.00    99.90
10:02:19 PM    all     0.00     0.00     0.10    99.90
10:02:29 PM    all    11.40     0.00     5.00    83.60
10:02:39 PM    all    60.80     0.00    36.30     2.90
Average:       all    18.07     0.00    10.35    71.58

Now let's look at the average of the samples directly.

# sar 10 4 | grep Average
Average:       all    18.07     0.00    10.35    71.58

Solaris

# sar 10 4

SunOS wilma 5.8 Generic i86pc    07/29/02

23:01:55    %usr    %sys    %wio   %idle
23:02:05       1       1       0      98
23:02:15      12      53       0      35
23:02:25      15      67       0      18
23:02:35      21      59       0      21

Average       12      45       0      43

Now let's look at the average of the samples directly.

# sar 10 4 | grep Average
Average       12      45       0      43

What Is the Common Denominator?

With the sar command the only common denominator is that we can always grep on the word "Average." As with the iostat command, the fields vary between some Unix flavors. We can use a similar case statement to extract the correct fields for each Unix flavor, as shown in Listing 7.3.

OS=$(uname)

case $OS in
AIX|HP-UX|SunOS)
       F1=2
       F2=3
       F3=4
       F4=5
       echo "\nThe Operating System is $OS\n"
       ;;
Linux)
       F1=3
       F2=4
       F3=5
       F4=6
       echo "\nThe Operating System is $OS\n"
       ;;
*)     echo "\nERROR: $OS is not a supported operating system\n"
       echo "\n\t EXITING \n"
       exit 1
       ;;
esac

Listing 7.3 Case statement for the sar fields of data.

Notice in Listing 7.3 that a single case statement sets up the environment for the shell script to select the correct fields from the sar command for each of the four Unix flavors. If the Unix flavor is not in the list, the user receives an error message before the script exits with a return code of 1. Later we will cover the entire shell script.

Syntax for vmstat

The name vmstat stands for virtual memory statistics. Using the vmstat command, we can get a lot of data about the system, including memory, paging space, page faults, and CPU statistics. We are concentrating on the CPU statistics in this chapter, so let's stay on track. The vmstat command also allows us to take direct samples over intervals for a specific time period. The vmstat command does not do any averaging for us; however, we are going to stick with two intervals. The first interval is the average of the system load since the last system reboot, like the iostat command. The last line contains the most current sample.

Let's look at the output of the vmstat command for each of our Unix flavors: AIX, HP-UX, Linux, and Solaris.

AIX

[root:yogi]@/scripts# vmstat 30 2

kthr     memory             page              faults        cpu
 r  b   avm   fre  re  pi  po  fr  sr  cy   in    sy    cs  us sy id wa
 0  0 23936   580   0   0   0   0   2   0  103  2715   713   8 25 67  0
 1  0 23938   578   0   0   0   0   0   0  115  9942  2730  24 76  0  0

The last line of output is what we are looking for. This is the average of the CPU load over the length of the interval. We want just the last four columns in the output. The fields that we want to extract for AIX are in positions $14, $15, $16, and $17.

HP-UX

# vmstat 30 2

procs       memory              page                    faults       cpu
 r  b  w   avm  free   re  at  pi  po  fr  de  sr   in    sy   cs  us sy id
 0 39  0  8382   290  122  26   2   0   0   0   3  128  2014  146  14 21 65
 1 40  0  7532   148  345  71   0   0   0   0   0  108  5550  379  29 43 27

The HP-UX vmstat output is a long string of data. Notice that for the CPU data HP-UX supplies only three values: user part, system part, and the CPU idle time. The fields that we want to extract are in positions $16, $17, and $18.

Linux

# vmstat 30 2

procs            memory          swap      io     system      cpu
 r  b  w  swpd  free  buff  cache  si  so  bi  bo   in   cs  us sy id
 2  0  0   244  1088  1676  21008   0   0   1   0  127   72   1  1 99
 3  0  0   244  1132  1676  21008   0   0   0   1  212  530  37 23 40

Like HP-UX, the Linux vmstat output for CPU activity has three fields: user part, system part, and the CPU idle time. The fields that we want to extract are in positions $14, $15, and $16.

Solaris

# vmstat 30 2

procs     memory             page               disk         faults       cpu
 r b w   swap   free   re   mf pi po fr de sr cd f0 s0      in   sy   cs us sy id
 0 0 0 558316  33036   57  433  2  0  0  0  0  0  0  0  0  111  500   77  2  8 90
 0 0 0 556192  29992  387 2928  0  0  0  0  0  1  0  0  0  155 2711  273 14 60 26

As with HP-UX and Linux, the Solaris vmstat output for CPU activity consists of the last three fields: user part, system part, and the CPU idle time.

What Is the Common Denominator?
There are at least two common denominators for the vmstat command output between the Unix flavors. The first is that the CPU data is in the last fields. On AIX the data is in the last four fields, with the added I/O wait state. HP-UX, Linux, and Solaris do not list the wait state. The second common factor is that the data is always on a row that is entirely numeric. Again, we need a case statement to parse the correct fields for the command output. Take a look at Listing 7.4.

OS=$(uname)

case $OS in
AIX)
       F1=14
       F2=15
       F3=16
       F4=17
       echo "\nThe Operating System is $OS\n"
       ;;
HP-UX)
       F1=16
       F2=17
       F3=18
       F4=1   # This "F4=1" is bogus and not used for HP-UX
       echo "\nThe Operating System is $OS\n"
       ;;
Linux)
       F1=14
       F2=15
       F3=16
       F4=1   # This "F4=1" is bogus and not used for Linux
       echo "\nThe Operating System is $OS\n"
       ;;
SunOS)
       F1=20
       F2=21
       F3=22
       F4=1   # This "F4=1" is bogus and not used for SunOS
       echo "\nThe Operating System is $OS\n"
       ;;
*)     echo "\nERROR: $OS is not a supported operating system\n"
       echo "\n\t EXITING \n"
       exit 1
       ;;
esac

Listing 7.4 Case statement for the vmstat fields of data.

Notice in Listing 7.4 that the F4 variable gets a valid assignment only on the AIX match. For HP-UX, Linux, and Solaris, the F4 variable is pointed at the $1 field by the F4=1 variable assignment. This bogus assignment is made so that we do not need a special vmstat command statement for each operating system. You will see how this works in detail in the scripting section.

Scripting the Solutions

Each of the techniques presented is slightly different in execution and output. Some options need to be timed over an interval for a user-defined amount of time, measured in seconds. We can get an immediate load measurement using the uptime command, but the sar, iostat, and vmstat commands require the user to specify a period of time to measure over and the number of intervals to sample the load. If you enter the iostat or vmstat command without any arguments, the statistics presented are an average since the last system reboot; sar without arguments reports previously collected data rather than a current sample. Because we want current statistics, the scripts must supply a period of time to sample. We are always going to initialize the INTERVAL variable to equal 2. For iostat and vmstat, the first line of output is measured since the last system reboot, and the second line is the current data that we are looking for. Let's look at each of these commands in separate shell scripts in the following sections.

Using uptime to Measure the System Load

Using uptime is one of the best indicators of the system load. The last columns of the output represent the average length of the run queue over the last 1, 5, and 15 minutes. A run queue is where jobs wanting CPU time line up for their turn for some processing time in the CPU. The priority of the process, or on some systems a thread, has a direct influence on how long a job has to wait in line before getting more CPU time. On most Unix flavors a lower priority number means more CPU time, and a higher priority number means less CPU time.

The uptime command always reports an average of the length of the run queue. The threshold trigger value that you set will depend on the normal load of your system. My little C-10 AIX box starts getting very slow when the run queue hits 2, but the S-80 at work typically runs with a run queue value over 8 because it is a multiprocessor machine running a terabyte database. With these differences in acceptable run queue levels, you will need to tailor the threshold level for notification on a machine-by-machine basis.
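As a rough sketch of the threshold idea described above, and not the script developed in this chapter, the first load average can be picked out by its "load average:" label and compared with a per-machine threshold as a decimal number. The function names first_load and load_at_or_over are hypothetical, and the sketch assumes an uptime line that ends with the load averages, as in the flavors discussed here.

```shell
#!/bin/sh
# Sketch only: first_load and load_at_or_over are illustrative names,
# not part of any script in this chapter.

# Print the first load average found on an uptime-style line read
# from standard input, whatever the length of the "up" text before it.
first_load ()
{
    sed -e 's/.*load average[s]*: *//' -e 's/,.*//' -e 's/ .*//'
}

# Succeed (exit status 0) when load $1 is greater than or equal to
# threshold $2. awk does the comparison so decimals are not truncated.
load_at_or_over ()
{
    awk -v load="$1" -v max="$2" 'BEGIN { exit !(load + 0 >= max + 0) }'
}
```

A call might look like LOAD=$(uptime | first_load) followed by load_at_or_over "$LOAD" 2.00 && echo "WARNING: System load has reached $LOAD", with the threshold chosen for the machine in question.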
Scripting with the uptime Command

Scripting the uptime solution takes only a short shell script, and the response is immediate. As you remember from the "Syntax" section, we had to follow the floating load statistics as the time since the last reboot moved from minutes, to hours, and even days after the machine was rebooted. The good thing is that the floating fields are consistent across the Unix flavors studied in this book. Let's look at the uptime_loadmon.ksh shell script shown in Listing 7.5.

#!/bin/ksh
#
# SCRIPT: uptime_loadmon.ksh
# AUTHOR: Randy Michael
# DATE: 07/26/2002
# REV: 1.0.P
# PLATFORM: AIX, HP-UX, Linux, and Solaris
#
# PURPOSE: This shell script uses the "uptime" command to
#          extract the most current load average data. There
#          is a special need in this script to determine
#          how long the system has been running since the
#          last reboot. The load average field "floats"
#          during the first 24 hours after a system restart.
#
# set -x # Uncomment to debug this shell script
# set -n # Uncomment to check script syntax without any execution
#
###################################################
############# DEFINE VARIABLES HERE ###############
###################################################

MAXLOAD=2.00
typeset -i INT_MAXLOAD=$MAXLOAD

# Find the correct field to extract based on how long
# the system has been up, or since the last reboot.

if uptime | grep day | grep min >/dev/null
then
     FIELD=11
elif uptime | grep day | grep hrs >/dev/null
then
     FIELD=11
elif uptime | grep day >/dev/null
then
     FIELD=10
elif uptime | grep min >/dev/null
then
     FIELD=9
else
     FIELD=8
fi

###################################################
######## BEGIN GATHERING STATISTICS HERE ##########
###################################################

echo "\nGathering System Load Average using the \"uptime\" command\n"

# This next command statement extracts the latest
# load statistics no matter what the Unix flavor is.

LOAD=$(uptime | sed s/,//g | awk '{print $'$FIELD'}')

# We need an integer representation of the $LOAD
# variable to do the test for the load going over
# the set threshold defined by the $INT_MAXLOAD
# variable

typeset -i INT_LOAD=$LOAD

# If the current load has exceeded the threshold then
# issue a warning message. The next step always shows
# the user what the current load and threshold values
# are set to.

((INT_LOAD >= INT_MAXLOAD)) && echo "\nWARNING: System load has \
reached ${LOAD}\n"

echo "\nSystem load value is currently at ${LOAD}"
echo "The load threshold is set to ${MAXLOAD}\n"

Listing 7.5 uptime_loadmon.ksh shell script listing.

There are two statements that I want to point out in Listing 7.5. First, notice the LOAD= statement. To make the variable assignment we use command substitution, defined by the VAR=$(command statement) notation. In the command statement we execute the uptime command and pipe the output to a sed statement. This sed statement removes all of the commas (,) from the uptime output. We need to take this step because the load statistics are comma separated. Once the commas are removed, the remaining output is piped to the awk statement, which extracts the field that is defined at the top of the shell script by the FIELD variable, based on how long the system has been running.

In this awk statement notice how we find the positional parameter that the $FIELD variable is pointing to. The single quote in front of $FIELD closes the awk program text and the single quote after it reopens the text, so the shell expands $FIELD before awk runs, and awk sees, for example, {print $8}. If you try to use the syntax $$FIELD where the shell expands it, the result is the current process ID ($$) followed by the word FIELD. To get around this little problem of directly accessing what a variable is pointing to, we use the following syntax:

# The $8 field holds the value 34.
FIELD=8

# Wrong usage
echo $$FIELD
3243FIELD

# Correct usage, as written inside the awk statement
awk '{print $'$FIELD'}'
34

Notice that the latter usage is correct, and the actual result is the value of the $8 field, which is currently 34. This is really telling us the value of what a pointer is pointing to. You will see other uses of this technique as we go through this chapter.

The second command statement that I want to point out is the test of the INT_LOAD value against the INT_MAXLOAD value, which are integer values of the LOAD and MAXLOAD variables. If INT_LOAD is equal to, or has exceeded, INT_MAXLOAD, then we use a logical AND (&&) to echo a warning to the user's screen. Using the logical AND saves a little code compared with a full if...then...else statement. You can see the uptime_loadmon.ksh shell script in action in Listings 7.6 and 7.7.

# ./uptime_loadmon.ksh

Gathering System Load Average using the "uptime" command

System load value is currently at 1.86
The load threshold is set to 2.00

Listing 7.6 Script in action under "normal" load.

Listing 7.6 shows the uptime_loadmon.ksh shell script in action on a machine that is under a normal load. Listing 7.7 shows the same machine under an excessive load; at least, it is excessive for this little machine.

# ./uptime_loadmon.ksh

Gathering System Load Average using the "uptime" command

WARNING: System load has reached 2.97

System load value is currently at 2.97
The load threshold is set to 2.00

Listing 7.7 Script in action under "excessive" load.

This is about all there is to using the uptime command. Let's move on to the sar command.

Using sar to Measure the System Load

Most Unix flavors have sar data collection set up by default. This sar data is presented when the sar command is executed without any switches. The data that is displayed is automatically collected at scheduled intervals throughout the day and compiled into a [...]
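The quoting technique in Listing 7.5 splices the shell variable into the awk program text. As an alternative sketch, not taken from the book's script, the field number can be handed to awk with its -v option, which keeps the whole awk program inside one pair of single quotes. The function name nth_field is hypothetical.

```shell
#!/bin/sh
# Sketch only: nth_field is an illustrative name.

# Print field number $1 of every line read from standard input,
# after removing commas in the same way Listing 7.5 does.
nth_field ()
{
    sed 's/,//g' | awk -v n="$1" '{ print $n }'
}
```

With this helper, the LOAD= statement could be written as LOAD=$(uptime | nth_field "$FIELD"). The trade-off is one extra awk variable in exchange for simpler quoting.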
[...] to the correct field in the
# command output for each Unix flavor

case $OS in
AIX|HP-UX|SunOS)
       F1=2
       F2=3
       F3=4
       F4=5
       echo "\nThe Operating System is $OS\n"
       ;;
Linux)
       F1=3
       F2=4
       F3=5
       F4=6
       echo "\nThe Operating System is $OS\n"
       ;;
*)     echo "\nERROR: $OS is not a supported operating system\n"
       echo "\n\t EXITING \n"
       exit 1
       ;;
esac

Listing 7.8 sar_loadmon.ksh shell script listing.

[...] $INTERVAL

AIX yogi 1 5 000125604800    07/31/02

19:24:00    %usr    %sys    %wio   %idle
19:24:30       0       1       1      98
19:25:00       4      15      13      68
19:25:30      26      28      40       6
19:26:00      13      12      11      64
19:26:30      16      44       0      39
19:27:00      27      73       0       0
19:27:30      20      48       2      30
19:28:00       5       6       9      80
19:28:30      11       9       5      75
19:29:00       9      18       0      73

Average       13      26       8      53

The previous output is produced by the first part of the sar command statement. Then, all of this output is piped to the next [...]

[...] shown to the user based
# on the Unix operating system that this shell script is
# executing on. Different Unix flavors have differing
# outputs and the fields vary too.
#
# REV LIST:
#
#
# set -n # Uncomment to check the script syntax without any execution
# set -x # Uncomment to debug this shell script
#
###################################################
[...] Unix flavor

###################################################
##### SETUP THE ENVIRONMENT FOR EACH OS HERE ######
###################################################

# These "F-numbers" point to the correct field in the
# command output for each Unix flavor

case $OS in
AIX|HP-UX)   SWITCH='-t'
             F1=3
             F2=4
             F3=5
             F4=6
             echo "\nThe Operating System is $OS\n"
             ;;
Linux|SunOS) SWITCH='-c'
             F1=1
             F2=2
             F3=3
             F4=4
[...]

Listing 7.10 iostat_loadmon.ksh shell script listing.

[...] following output:

23.15 31.77 0.00 0.00 26.09 21.79 50.76 46.44

This brings us to the next addition to the iostat command statement in the shell script. This is where we add the awk part of the statement using the F1, F2, F3, and F4 variables, as shown here.

iostat $SWITCH $SECS $INTERVAL | egrep -v '[a-zA-Z]|^$' \
     | awk '{print $'$F1', $'$F2', $'$F3', $'$F4'}'

This is the same code that we covered in the last [...]

[...] HP-UX has only three relative columns in the output
       F1=16
       F2=17
       F3=18
       F4=1   # This "F4=1" is bogus and not used for HP-UX
       echo "\nThe Operating System is $OS\n"
       ;;
Linux)
       # Linux has only three relative columns in the output
       F1=14
       F2=15
       F3=16
       F4=1   # This "F4=1" is bogus and not used for Linux
       echo "\nThe Operating System is [...]

Listing 7.12 vmstat_loadmon.ksh shell script listing.

[...] iostat_loadmon.ksh shell script in action.

Notice that the output is in the same format as the sar script output. This is all there is to the iostat shell script. Let's now move on to the vmstat solution.

Using vmstat to Measure the System Load

The vmstat shell script uses the exact same technique as the iostat shell script in the previous section. Only AIX produces four fields of output; the remaining Unix [...] the main purpose of this command anyway. Let's look at the vmstat script.

Scripting with the vmstat Command

When you look at this shell script for vmstat you will think that you just saw this shell script in the last section. Most of these two shell scripts are the same, with only minor exceptions. Let's look at the vmstat_loadmon.ksh shell script in Listing 7.12 and cover the differences in detail at the [...]

[...] AIX, HP-UX, Linux, and Solaris
#
# PURPOSE: This shell script takes multiple samples of the CPU
#          usage using the "sar" command. The average of
#          sample periods is shown to the user based on the
#          Unix operating system that this shell script is
#          executing on. Different Unix flavors have differing
#          outputs and the fields vary too [...]

[...] wait while gathering statistics

User part is 14%
System part is 54%
Idle time is 31%

Listing 7.13 vmstat_loadmon.ksh shell script in action.

Notice that the Solaris output shown in Listing 7.13 does not show the I/O wait state. This information is available only on AIX for the vmstat shell script. The output format is the same as the last few shell scripts. It is up to you how you want to use [...]
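The filter-then-select pattern used by the iostat and vmstat command statements can be sketched in isolation. The function names numeric_rows and last_sample are hypothetical, and the field numbers passed in are assumed to be the ones chosen for each flavor by the case statement for vmstat; this is an illustration under those assumptions, not the full vmstat_loadmon.ksh script.

```shell
#!/bin/sh
# Sketch only: numeric_rows and last_sample are illustrative names.

# Keep only rows that contain no letters and are not empty,
# which drops the header lines of iostat or vmstat output.
numeric_rows ()
{
    grep -v '[a-zA-Z]' | grep -v '^[[:space:]]*$'
}

# Print the fields numbered $1, $2, and $3 from the last row read,
# which is the most current sample.
last_sample ()
{
    awk -v f1="$1" -v f2="$2" -v f3="$3" \
        '{ user = $f1; sys = $f2; idle = $f3 }
         END { print user, sys, idle }'
}
```

On Linux, for example, vmstat 30 2 | numeric_rows | last_sample 14 15 16 would print the user, system, and idle percentages of the second sample, which the caller can then format as "User part is ...%" lines.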

Contents

Chapter 7 Monitoring System Load

Syntax

Syntax for sar

AIX

HP-UX

Linux

Solaris

What Is the Common Denominator?

Syntax for vmstat

AIX

HP-UX

Linux

Solaris

What Is the Common Denominator?

Scripting the Solutions

Using uptime to Measure the System Load

Scripting with the uptime Command

Using sar to Measure the System Load

Scripting with the sar Command

Using iostat to Measure the System Load

Scripting with the iostat Command

Using vmstat to Measure the System Load

Scripting with the vmstat Command

Other Options to Consider

Stop Chasing the Floating uptime Field

Try to Detect Any Possible Problems for the User

Show the User the Top CPU Hogs

Gathering a Large Amount of Data for Plotting

Summary

Chapter 8 Process Monitoring and Enabling Preprocess, Startup, and Postprocess Events

Syntax
