IBM System p 570 with POWER6
* Advanced IBM POWER6™ processor cores for enhanced performance and reliability
* Building block architecture delivers flexible scalability and modular growth
* Advanced virtualization features facilitate highly efficient systems utilization
* Enhanced RAS features enable improved application availability
The IBM POWER6 processor-based System p™ 570 mid-range server delivers outstanding price/performance, mainframe-inspired reliability and availability features, flexible capacity upgrades and innovative virtualization technologies. This powerful 19-inch rack-mount system, which can handle up to 16 POWER6 cores, can be used for database and application serving, as well as server consolidation. The modular p570 is designed to continue the tradition of its predecessor, the IBM POWER5+™ processor-based System p5™ 570 server, for resource optimization, secure and dependable performance and the flexibility to change with business needs. Clients have the ability to upgrade their current p5-570 servers and know that their investment in IBM Power Architecture™ technology has again been rewarded.
The p570 is the first server designed with POWER6 processors, resulting in performance and price/performance advantages while ushering in a new era in the virtualization and availability of UNIX® and Linux® data centers. POWER6 processors can run 64-bit applications, while concurrently supporting 32-bit applications to enhance flexibility. They feature simultaneous multithreading,1 allowing two application “threads” to be run at the same time, which can significantly reduce the time to complete tasks.
The p570 system is more than an evolution of technology wrapped into a familiar package; it is the result of “thinking outside the box.” IBM’s modular symmetric multiprocessor (SMP) architecture means that the system is constructed using 4-core building blocks. This design allows clients to start with what they need and grow by adding additional building blocks, all without disruption to the base system.2 Optional Capacity on Demand features allow the activation of dormant processor power for times as short as one minute. Clients may start small and grow with systems designed for continuous application availability.
Specifically, the System p 570 server provides:
Common features
* 19-inch rack-mount packaging
* 2- to 16-core SMP design with building block architecture
* 64-bit 3.5, 4.2 or 4.7 GHz POWER6 processor cores
* Mainframe-inspired RAS features
* Dynamic LPAR support
* Advanced POWER Virtualization1 (option)
o IBM Micro-Partitioning™ (up to 160 micro-partitions)
o Shared processor pool
o Virtual I/O Server
o Partition Mobility2
* Up to 32 optional I/O drawers
* IBM HACMP™ software support for near continuous operation*
* Supported by AIX 5L (V5.2 or later) and Linux® distributions from Red Hat (RHEL 4 Update 5 or later) and SUSE Linux (SLES 10 SP1 or later) operating systems
Hardware summary
* 4U 19-inch rack-mount packaging
* One to four building blocks
* Two, four, eight, 12 or 16 3.5 GHz, 4.2 GHz or 4.7 GHz 64-bit POWER6 processor cores
* L2 cache: 8 MB to 64 MB (2- to 16-core)
* L3 cache: 32 MB to 256 MB (2- to 16-core)
* 2 GB to 192 GB of 667 MHz buffered DDR2 or 16 GB to 384 GB of 533 MHz buffered DDR2 or 32 GB to 768 GB of 400 MHz buffered DDR2 memory3
* Four hot-plug, blind-swap PCI Express 8x and two hot-plug, blind-swap PCI-X DDR adapter slots per building block
* Six hot-swappable SAS disk bays per building block provide up to 7.2 TB of internal disk storage
* Optional I/O drawers may add up to an additional 188 PCI-X slots and up to 240 disk bays (72 TB additional)4
* One SAS disk controller per building block (internal)
* One integrated dual-port Gigabit Ethernet per building block standard; One quad-port Gigabit Ethernet per building block available as optional upgrade; One dual-port 10 Gigabit Ethernet per building block available as optional upgrade
* Two GX I/O expansion adapter slots
* One dual-port USB per building block
* Two HMC ports (maximum of two), two SPCN ports per building block
* One optional hot-plug media bay per building block
* Redundant service processor for multiple building block systems2
IBM System p5 570
* Up to 16-core scalability with modular architecture and leadership POWER5+ technology
* IBM Advanced POWER™ Virtualization features increase system utilization and reduce the number of overall systems required
* Capacity on Demand features enable quick response to spikes in processing requirements
The IBM System p5 570 mid-range server is a powerful 19-inch rack-mount system that can be used for database and application serving, as well as server consolidation. IBM's modular symmetric multiprocessor (SMP) architecture means you can start with a 2-core system and easily add additional building blocks when needed for more processing power (up to 16 cores), I/O and storage capacity. The p5-570 includes IBM mainframe-inspired reliability, availability and serviceability features.
The System p5 570 server is designed to be a cost-effective, flexible server for the on demand environment. Innovative virtualization technologies and Capacity on Demand (CoD) options help increase the responsiveness of the server to variable computing demands. These features also help increase the utilization of processors and system components, allowing businesses to meet their computing requirements with a smaller system. By combining IBM's most advanced technology for enterprise-class performance with flexible adaptation to changing market conditions, the p5-570 can deliver the key capabilities medium-sized companies need to survive in today's highly competitive world.
Specifically, the System p5 570 server provides:
Common features
* 19-inch rack-mount packaging
* 2- to 16-core SMP design with unique building block architecture
* 64-bit 1.9 or 2.2 GHz POWER5+ processor cores
* Mainframe-inspired RAS features
* Dynamic LPAR support
* Advanced POWER Virtualization1 (option)
o IBM Micro-Partitioning™ (up to 160 micro-partitions)
o Shared processor pool
o Virtual I/O Server
o Partition Load Manager (IBM AIX 5L™ only)
* Up to 20 optional I/O drawers
* IBM HACMP™ software support for near continuous operation*
* Supported by AIX 5L (V5.2 or later) and Linux® distributions from Red Hat (RHEL AS 4 or later) and SUSE Linux (SLES 9 or later) operating systems
* System Cluster 1600 support with Cluster Systems Management software*
Hardware summary
* 4U 19-inch rack-mount packaging
* One to four building blocks
* Two, four, eight, 12 or 16 1.9 GHz or 2.2 GHz 64-bit POWER5+ processor cores
* L2 cache: 1.9 MB to 15.2 MB (2- to 16-core)
* L3 cache: 36 MB to 288 MB (2- to 16-core)
* 1.9 GHz systems: 2 GB to 256 GB of 533 MHz DDR2 memory; 2.2 GHz systems: 2 GB to 256 GB of 533 MHz or 32 GB to 512 GB of 400 MHz DDR2 memory
* Six hot-plug PCI-X adapter slots per building block
* Six hot-swappable disk bays per building block provide up to 7.2 TB of internal disk storage
* Optional I/O drawers may add up to an additional 139 PCI-X slots (for a maximum of 163) and 240 disk bays (72 TB additional)
* Dual channel Ultra320 SCSI controller per building block (internal; RAID optional)
* One integrated 2-port 10/100/1000 Ethernet per building block
* Optional 2 Gigabit Fibre Channel, 10 Gigabit Ethernet and 4x GX adapters
* One 2-port USB per building block
* Two HMC, two system ports
* Two hot-plug media bays per building block
AIX commands

List the licensed program products          lslpp -L
List the defined devices                    lsdev -C -H
List the disk drives on the system          lsdev -Cc disk
List the memory on the system               lsdev -Cc memory (MCA)
List the memory on the system               lsattr -El sys0 -a realmem (PCI)
                                            lsattr -El mem0
List system resources                       lsattr -EHl sys0
List the VPD (Vital Product Data)           lscfg -v
Document the tty setup                      lscfg, or smit screen capture (F8)
Document the print queues                   qchk -A
Document disk Physical Volumes (PVs)        lspv
Document Logical Volumes (LVs)              lslv
Document Volume Groups (long list)          lsvg -l vgname
Document Physical Volumes (long list)       lspv -l pvname
Document File Systems                       lsfs fsname
                                            /etc/filesystems
Document disk allocation                    df
Document mounted file systems               mount
Document paging space (70 - 30 rule)        lsps -a
Document paging space activation            /etc/swapspaces
Document users on the system                /etc/passwd
                                            lsuser -a id home ALL
Document users attributes                   /etc/security/user
Document users limits                       /etc/security/limits
Document users environments                 /etc/security/environ
Document login settings (login herald)      /etc/security/login.cfg
Document valid group attributes             /etc/group
                                            lsgroup ALL
Document system wide profile                /etc/profile
Document system wide environment            /etc/environment
Document cron jobs                          /var/spool/cron/crontabs/*
Document skulker changes if used            /usr/sbin/skulker
Document system startup file                /etc/inittab
Document the hostnames                      /etc/hosts
Document network printing                   /etc/hosts.lpd
Document remote login host authority        /etc/hosts.equiv
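For day-to-day use, several of these commands can be combined into a small documentation script (a minimal sketch; the script and output file names are made up):

#!/bin/ksh
# sysdoc.sh - capture a point-in-time snapshot of the system configuration (hypothetical name)
OUT=/tmp/sysdoc.$(hostname).$(date +%Y%m%d)
{
  echo "=== Licensed program products ==="; lslpp -L
  echo "=== Defined devices ===";           lsdev -C -H
  echo "=== Physical volumes ===";          lspv
  echo "=== Volume groups ===";             lsvg
  echo "=== File systems ===";              df -k
  echo "=== Mounted file systems ===";      mount
  echo "=== Paging space ===";              lsps -a
} > $OUT 2>&1
echo "System documentation written to $OUT"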
BASH CHEAT SHEETS
Checking strings:
s1 = s2 Check if s1 equals s2.
s1 != s2 Check if s1 is not equal to s2.
-z s1 Check if s1 has size 0.
-n s1 Check if s1 has nonzero size.
s1 Check if s1 is not the empty string.
Example:
if [ "$myvar" = "hello" ] ; then
echo "We have a match"
fi
Checking numbers:
Note that a shell variable could contain a string that represents a number. If you want to check the numerical value use one of the following:
n1 -eq n2 Check to see if n1 equals n2.
n1 -ne n2 Check to see if n1 is not equal to n2.
n1 -lt n2 Check to see if n1 < n2.
n1 -le n2 Check to see if n1 <= n2.
n1 -gt n2 Check to see if n1 > n2.
n1 -ge n2 Check to see if n1 >= n2.
Example:
if [ $# -gt 1 ]
then
echo "ERROR: should have 0 or 1 command-line parameters"
fi
Boolean operators:
! not
-a and
-o or
Example:
if [ "$num" -lt 10 -o "$num" -gt 100 ]
then
echo "Number $num is out of range"
elif [ ! -w "$filename" ]
then
echo "Cannot write to $filename"
fi
Note that ifs can be nested. For example:
if [ "$myvar" = "y" ]
then
echo "Enter count of number of items"
read num
if [ "$num" -le 0 ]
then
echo "Invalid count of $num was given"
else
#... do whatever ...
fi
fi
The above example also illustrates the use of read to read a string from the keyboard and place it into a shell variable. Also note that most UNIX commands return an exit status of 0 on success and nonzero on failure; the if statement treats 0 as true and nonzero as false. The status of the last command is kept in the shell variable $? and can be checked at the command line with echo $?. In a shell script use something like this:
if grep -q shell bshellref
then
echo "true"
else
echo "false"
fi
Note that -q is the quiet option of grep. It just sets the exit status according to whether the string shell occurs in the file bshellref; it does not print the matching lines like grep would otherwise do.
I/O Redirection:
pgm > file Output of pgm is redirected to file.
pgm < file Program pgm reads its input from file.
pgm >> file Output of pgm is appended to file.
pgm1 | pgm2 Output of pgm1 is piped into pgm2 as the input to pgm2.
n > file Output from stream with descriptor n redirected to file.
n >> file Output from stream with descriptor n appended to file.
n >& m Merge output from stream n with stream m.
n <& m Merge input from stream n with stream m.
Note that file descriptor 0 is normally standard input, 1 is standard output, and 2 is standard error output.
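For example, a very common combination is to send both standard output and standard error of a command to the same file (myprog and run.log are placeholder names):

myprog > run.log 2>&1    # descriptor 2 (stderr) is merged into descriptor 1 (stdout); both end up in run.log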
Shell Built-in Variables:
$0 Name of this shell script itself.
$1 Value of first command line parameter (similarly $2, $3, etc)
$# In a shell script, the number of command line parameters.
$* All of the command line parameters.
$- Options given to the shell.
$? Return the exit status of the last command.
$$ Process id of script (really id of the shell running the script)

Pattern Matching:
* Matches 0 or more characters.
? Matches 1 character.
[AaBbCc] Example: matches any 1 char from the list.
[^RGB] Example: matches any 1 char not in the list.
[a-g] Example: matches any 1 char from this range.

Quoting:
\c Take character c literally.
`cmd` Run cmd and replace it in the line of code with its output.
"whatever" Take whatever literally, after first interpreting $, `...`, \
'whatever' Take whatever absolutely literally.

Example:
match=`ls *.bak` #Puts names of .bak files into shell variable match.
echo \* #Echoes * to the screen, not all filenames as in: echo *
echo '$1$2hello' #Writes literally $1$2hello on screen.
echo "$1$2hello" #Writes value of parameters 1 and 2 and string hello.Grouping:
Parentheses may be used for grouping, but must be preceded by backslashes
since parentheses normally have a different meaning to the shell (namely
to run a command or commands in a subshell). For example, you might use:

if test \( -r $file1 -a -r $file2 \) -o \( -r $1 -a -r $2 \)
then
#do whatever
fi

Case statement:
Here is an example that looks for a match with one of the characters a, b, c. If $1 fails to match these, it always matches the * case. A case statement can also use more advanced pattern matching.

case "$1" in
a) cmd1 ;;
b) cmd2 ;;
c) cmd3 ;;
*) cmd4 ;;
esac

Shell Arithmetic:
In the original Bourne shell arithmetic is done using the expr command as in:

result=`expr $1 + 2`
result2=`expr $2 + $1 / 2`
result=`expr $2 \* 5` #note the \ on the * symbol

With bash, an expression is normally enclosed using [ ] and can use the following operators, in order of precedence:
* / % (times, divide, remainder)
+ - (add, subtract)
< > <= >= (the obvious comparison operators)
== != (equal to, not equal to)
&& (logical and)
|| (logical or)
= (assignment)

Arithmetic is done using long integers.
Example:

result=$[$1 + 3]

In this example we take the value of the first parameter, add 3, and place the sum into result.
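In current versions of bash the $(( )) form is the preferred way to write the same arithmetic (a minimal equivalent of the example above):

result=$(( $1 + 3 ))    # same result as result=$[$1 + 3]
echo "result is $result"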
Order of Interpretation:
The bash shell carries out its various types of interpretation for each line in the following order:

brace expansion (see a reference book)
~ expansion (for login ids)
parameters (such as $1)
variables (such as $var)
command substitution (Example: match=`grep DNS *` )
arithmetic (from left to right)
word splitting
pathname expansion (using *, ?, and [abc] )

Other Shell Features:
$var Value of shell variable var.
${var}abc Example: value of shell variable var with string abc appended.
# At start of line, indicates a comment.
var=value Assign the string value to shell variable var.
cmd1 && cmd2 Run cmd1, then if cmd1 successful run cmd2, otherwise skip.
cmd1 || cmd2 Run cmd1, then if cmd1 not successful run cmd2, otherwise skip.
cmd1; cmd2 Do cmd1 and then cmd2.
cmd1 & cmd2 Do cmd1, start cmd2 without waiting for cmd1 to finish.
(cmds) Run cmds (commands) in a subshell.
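As a quick illustration (a minimal sketch; the script name, messages and the 1000-line threshold are made up), the following ties several of the features above together: positional parameters, test, case, command substitution and quoting:

#!/bin/bash
# checkfile.sh - classify a file and report its size (hypothetical example)
if [ $# -ne 1 ]
then
    echo "usage: $0 filename"
    exit 1
fi
file="$1"
case "$file" in
    *.log) kind="log file" ;;
    *.sh)  kind="shell script" ;;
    *)     kind="regular file" ;;
esac
lines=`wc -l < "$file"`                    # command substitution
if [ "$lines" -gt 1000 -a -w "$file" ]
then
    echo "$file is a large, writable $kind ($lines lines)"
else
    echo "$file is a $kind ($lines lines)"
fi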
Directories and files to monitor in AIX

/etc/security/failedlogin    Failed logins from users. Use the who command to view the
                             information. Use "cat /dev/null > /etc/security/failedlogin"
                             to empty it.
/var/adm/wtmp                All login accounting activity. Use the who command to view it.
                             Use "cat /dev/null > /var/adm/wtmp" to empty it.
/etc/utmp                    Who has logged in to the system. Use the who command to view it.
                             Use "cat /dev/null > /etc/utmp" to empty it.
/var/spool/lpd/qdir/*        Left over queue requests
/var/spool/qdaemon/*         Temp copy of spooled files
/var/spool/*                 Spooling directory
smit.log                     smit log file of activity. Use more to view it and rm to clean it out.
smit.script                  smit log. Use more to view it and rm to clean it out.
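A minimal housekeeping sketch built from the table above (the 50 MB threshold and the backup naming are example choices, not recommendations):

#!/bin/ksh
# report the size (in KB) of the usual growing files
du -sk /var/adm/wtmp /etc/utmp /etc/security/failedlogin /var/spool 2>/dev/null

# truncate wtmp if it has grown past ~50 MB, keeping a dated copy first
size=`du -sk /var/adm/wtmp | awk '{print $1}'`
if [ "$size" -gt 51200 ]
then
    cp /var/adm/wtmp /var/adm/wtmp.$(date +%Y%m%d)
    cat /dev/null > /var/adm/wtmp
fi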
What is an LVM hot spare?
A hot spare is a disk or group of disks used to replace a failing disk. LVM marks a physical
volume missing due to write failures. It then starts the migration of data to the hot spare
disk.
Minimum hot spare requirements
The following is a list of minimal hot sparing requirements enforced by the operating
system.
- Spares are allocated and used by volume group
- Logical volumes must be mirrored
- All logical partitions on hot spare disks must be unallocated
- Hot spare disks must have at least equal capacity to the smallest disk already
in the volume group. Good practice dictates having enough hot spares to
cover your largest mirrored disk.
Hot spare policy
The chpv and the chvg commands are enhanced with a new -h argument. This allows you
to designate disks as hot spares in a volume group and to specify a policy to be used in the
case of failing disks.
The following four values are valid for the hot spare policy argument (-h):

y (lower case)   Automatically migrates partitions from one failing disk to one spare
                 disk. From the pool of hot spare disks, the smallest one which is big
                 enough to substitute for the failing disk will be used.
Y (upper case)   Automatically migrates partitions from a failing disk, but might use
                 the complete pool of hot spare disks.
n                No automatic migration will take place. This is the default value for a
                 volume group.
r                Removes all disks from the pool of hot spare disks for this volume group.

Synchronization policy
There is a new -s argument for the chvg command that is used to specify synchronization
characteristics.
The following two values are valid for the synchronization argument (-s):

y                Automatically attempts to synchronize stale partitions.
n                Does not automatically synchronize stale partitions. This is the default
                 value for a volume group.

Examples
The following command marks hdisk1 as a hot spare disk:
# chpv -hy hdisk1
The following command sets an automatic migration policy which uses the smallest hot
spare that is large enough to replace the failing disk, and automatically tries to synchronize
stale partitions:
# chvg -hy -sy testvg
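Putting the pieces together, a typical sequence for enabling hot sparing on an existing mirrored volume group might look like this (the disk and volume group names are examples):

extendvg testvg hdisk5    # add the new disk to the mirrored volume group
chpv -hy hdisk5           # mark it as a hot spare
chvg -hy -sy testvg       # migrate to the smallest suitable spare and auto-sync stale partitions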
Mirror Write Consistency
Mirror Write Consistency (MWC) ensures data consistency on logical volumes in case a
system crash occurs during mirrored writes. The active method achieves this by logging
when a write occurs. LVM makes an update to the MWC log that identifies what areas of
the disk are being updated before performing the write of the data. Records of the last 62
distinct logical transfer groups (LTG) written to disk are kept in memory and also written to
a separate checkpoint area on disk (MWC log). This results in a performance degradation
during random writes.
With AIX V5.1 and later, there are now two ways of handling MWC:
• Active, the existing method
• Passive, the new method
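To see which setting a mirrored logical volume is currently using, lslv reports it in the MIRROR WRITE CONSISTENCY field (the logical volume name below is an example); the chlv command can change it, but check the chlv documentation for the exact flag values on your AIX level:

lslv datalv01    # the output includes the MIRROR WRITE CONSISTENCY setting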
Different RAID Levels
The different RAID levels available today:
Raid 0 - Striping. Data is striped across all the disks present in the array, improving read and
write performance. For example, reading a large file from a single disk takes longer than
reading the same file from a RAID 0 array. There is no data redundancy in this case.
Raid 1 - Mirroring. With RAID 0 there is no redundancy, i.e. if one disk fails the data is lost.
RAID 1 overcomes that problem by mirroring the data, so if one disk fails the data is
still accessible through the other disk.
Raid 2 - A RAID level that does not use the "standard" techniques of mirroring, striping and/or
parity. It is implemented by splitting data at the bit level and spreading it across the
data disks and a redundant disk. It uses a special error correction code (ECC) that
accompanies each data block; the codes are checked when the data is read from the disk
to maintain data integrity.
Raid 3 - Data is striped across multiple disks at the byte level, with the parity maintained on a
separate, dedicated disk. The array can survive a single disk failure; losing more than
one disk results in data loss.
Raid 4 - Similar to RAID 3; the only difference is that the data is striped across multiple disks at
the block level.
Raid 5 - Block-level striping with distributed parity. The data and parity are striped across all
disks, increasing the data redundancy. A minimum of three disks is required, and if any
one disk fails the data is still accessible.
Raid 6 - Block-level striping with dual distributed parity. It stripes blocks of data and parity
across all disks in the array, but maintains two sets of parity information for each parcel
of data, further increasing the data redundancy. So if two disks fail the data is still intact.
Raid 7 - Asynchronous, cached striping with dedicated parity. This level is not an open industry
standard. It is based on the concepts of RAID 3 and 4, with a great deal of cache included
across multiple levels and a specialised real-time processor to manage the array
asynchronously.
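As a worked example (the disk size is chosen only for illustration), consider four 300 GB disks: RAID 0 yields 1.2 TB of usable space with no fault tolerance; RAID 1 arranged as two mirrored pairs yields 600 GB and survives one failure per pair; RAID 5 yields 900 GB and survives any single disk failure; RAID 6 yields 600 GB and survives any two disk failures.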
Recovering a Failed VIO Disk
This procedure describes how to recover from a failed physical disk on a VIO
server. It assumes the client partitions have mirrored (virtual) disks. The
recovery involves both the VIO server and its client partitions. However,
it is non disruptive for the client partitions (no downtime), and may be
non disruptive on the VIO server (depending on disk configuration). This
procedure does not apply to Raid5 or SAN disk failures.
The test system had two VIO servers and an AIX client. The AIX client had two
virtual disks (one disk from each VIO server). The two virtual disks
were mirrored in the client using AIX's mirrorvg. (The procedure would be
the same on a single VIO server with two disks.)
The software levels were:
p520 firmware: SF230_145; VIO Server: Version 1.2.0; Client: AIX 5.3 ML3
We had simulated the disk failure by removing the client LV on one VIO server. The
padmin commands to simulate the failure were:
rmdev -dev vtscsi01      # The virtual scsi device for the LV (lsmap -all)
rmlv -f aix_client_lv    # Remove the client LV
This caused "hdisk1" on the AIX client to go "missing" ("lsvg -p rootvg"....The
"lspv" will not show disk failure...only the disk status at the last boot..)
The recovery steps included:
VIO Server
Fix the disk failure, and restore the VIOS operating system (if necessary).
mklv -lv aix_client_lv rootvg 10G             # recreate the client LV
mkvdev -vdev aix_client_lv -vadapter vhost1   # connect the client LV to the appropriate vhost
AIX Client
cfgmgr                             # discover the new virtual hdisk2
replacepv hdisk1 hdisk2            # rebuild the mirror copy on hdisk2
bosboot -ad /dev/hdisk2            # add boot image to hdisk2
bootlist -m normal hdisk0 hdisk2   # add the new disk to the bootlist
rmdev -dl hdisk1                   # remove failed hdisk1
The "replacepv" command assigns hdisk2 to the volume group, rebuilds the mirror, and
then removes hdisk1 from the volume group.
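A quick way to confirm the recovery completed on the client (the disk and volume group names match the example above):

lsvg -p rootvg                # both hdisk0 and hdisk2 should show as "active"
lsvg rootvg | grep -i stale   # STALE PPs should return to 0 once the resync has finished
lsvg -l rootvg                # LV STATE should show "syncd" for all mirrored LVs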
As always, be sure to test this procedure before using in production.
AIX / HMC/VIO Tips Sheet
HMC Commands
lshmc -n (lists dynamic IP addresses served by HMC)
lssyscfg -r sys -F name,ipaddr (lists managed system attributes)
lssysconn -r sys (lists attributes of managed systems)
lssysconn -r all (lists all known managed systems with attributes)
rmsysconn -o remove -ip (removes a managed system from the HMC)
mkvterm -m {msys} -p {lpar} (opens a command line vterm from an ssh session)
rmvterm -m {msys} -p {lpar} (closes an open vterm for a partition)
Activate a partition
chsysstate -m managedsysname -r lpar -o on -n partitionname -f profilename -b normal
chsysstate -m managedsysname -r lpar -o on -n partitionname -f profilename -b sms
Shutdown a partition
chsysstate -m managedsysname -r lpar -o {shutdown/osshutdown} -n partitionname [-immed][-restart]
VIO Server Commands
lsdev -virtual (list all virtual devices on VIO server partitions)
lsmap -all (lists mapping between physical and logical devices)
oem_setup_env (change to OEM [AIX] environment on VIO server)
Create Shared Ethernet Adapter (SEA) on VIO Server
mkvdev -sea {physical adapt} -vadapter {virtual eth adapt} -default {dflt virtual adapt} -defaultid {dflt vlan ID}
SEA Failover
ent0 - GigE adapter
ent1 - Virt Eth VLAN1 (Defined with a priority in the partition profile)
ent2 - Virt Eth VLAN 99 (Control)
mkvdev -sea ent0 -vadapter ent1 -default ent1 -defaultid 1 -attr ha_mode=auto ctl_chan=ent2
(Creates ent3 as the Shared Ethernet Adapter)
Create Virtual Storage Device Mapping
mkvdev -vdev {LV or hdisk} -vadapter {vhost adapt} -dev {virt dev name}
Sharing a Single SAN LUN from Two VIO Servers to a Single VIO Client LPAR
hdisk3 = SAN LUN (on vioa server)
hdisk4 = SAN LUN (on viob, same LUN as vioa)
chdev -dev hdisk3 -attr reserve_policy=no_reserve (from vioa to prevent a reserve on the disk)
chdev -dev hdisk4 -attr reserve_policy=no_reserve (from viob to prevent a reserve on the disk)
mkvdev -vdev hdisk3 -vadapter vhost0 -dev hdisk3_v (from vioa)
mkvdev -vdev hdisk4 -vadapter vhost0 -dev hdisk4_v (from viob)
VIO Client would see a single LUN with two paths.
lspath -l hdiskX (where hdiskX is the newly discovered disk)
This will show two paths, one down vscsi0 and the other down vscsi1.
AIX Performance TidBits and Starter Set of Tuneables
Current starter set of recommended AIX 5.3 Performance Parameters. Please ensure you test these first before implementing in production as your mileage may vary.
Network
no -p -o rfc1323=1
no -p -o sb_max=1310720
no -p -o tcp_sendspace=262144
no -p -o tcp_recvspace=262144
no -p -o udp_sendspace=65536
no -p -o udp_recvspace=655360
nfso -p -o rfc_1323=1
NB Network settings also need to be applied to the adapters
nfso -p -o nfs_socketsize=600000
nfso -p -o nfs_tcp_socketsize=600000
Memory Settings
vmo -p -o minperm%=5
vmo -p -o maxperm%=80
vmo -p -o maxclient%=80
Let strict_maxperm and strict_maxclient default
vmo -p -o minfree=960
vmo -p -o maxfree=1088
vmo -p -o lru_file_repage=0
vmo -p -o lru_poll_interval=10
IO Settings
Let minpgahead and J2_minPageReadAhead default
ioo -p -o j2_maxPageReadAhead=128
ioo -p -o maxpgahead=16
ioo -p -o j2_maxRandomWrite=32
ioo -p -o maxrandwrt=32
ioo -p -o j2_nBufferPerPagerDevice=1024
ioo -p -o pv_min_pbuf=1024
ioo -p -o numfsbufs=2048
If doing lots of raw I/O you may want to change lvm_bufcnt
Default is 9
ioo -p -o lvm_bufcnt=12
Others left to default that you may want to tweak include:
ioo -p -o numclust=1
ioo -p -o j2_nRandomCluster=0
ioo -p -o j2_nPagesPerWriteBehindCluster=32
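Before and after applying any of these, it is worth capturing the current settings (a minimal sketch; on AIX 5.3 the -L flag reports the current, default and boot values of a tunable):

no -o tcp_sendspace           # display a single no tunable
vmo -L minfree                # detailed view of a vmo tunable
vmo -L maxfree
ioo -L j2_maxPageReadAhead    # detailed view of an ioo tunable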
Useful Commands
vmstat -v or -l or -s        lvmo
vmo -o                       iostat (many new flags)
ioo -o                       svmon
schedo -o                    filemon
lvmstat                      fileplace
Useful Links
1. Lparmon – www.alphaworks.ibm.com/tech/lparmon
2. Nmon – www.ibm.com/collaboration/wiki/display/WikiPtype/nmon
3. Nmon Analyser – www-941.ibm.com/collaboration/wiki/display/WikiPtype/nmonanalyser
4. vmo, ioo, vmstat, lvmo and other AIX commands http://publib.boulder.ibm.com/infocenter/pseries/v5r3/topic/com
Introduction to WPAR in AIX 6
Workload Partitioning is a virtualization technology that utilizes
software rather than firmware to isolate users and/or applications.
A Workload Partition (WPAR) is a combination of several core AIX technologies. There are differences of course, but here the emphasis is on the similarities. In this essay I shall describe the characteristics of these technologies and how workload partitions are built upon them.
There are two types of WPAR: system and application. My focus is on system WPARs, as these more closely resemble an LPAR or a separate system. In other words, a system WPAR behaves as a complete installation of AIX. At a later time application workload partitions will be described in terms of how they differ from a system WPAR. For the rest of this document WPAR and system WPAR are to be considered synonymous.
AIX system software has three components: root, user, and shared. The root component consists of all the software and data that are unique to that system or node. The user (or usr) part consists of all the software and data that is common to all AIX systems at that particular AIX software level (e.g., oslevel AIX 5.3 TL06-01, or AIX 5.3 TL06-02, or AIX 6.1). The shared component is software and data that is common to any UNIX or Linux system.
In its default configuration a WPAR inherits its user (/usr) and shared (/usr/share, usually physically included in the /usr filesystem) components from the global system. Additionally, the WPAR inherits the /opt filesystem. The /opt filesystem is the normal installation area in the rootvg volume group for RPM and IHS packaged applications and AIX Linux affinity applications and libraries. Because multiple WPARs are intended to share these file systems (/usr and /opt), they are read-only for WPAR applications and users. This is very similar to how NIM (Network Installation Manager) diskless and dataless systems were configured and installed. Since only the unique rootvg volume group file systems need to be created (/, /tmp, /var, /home), creation of a WPAR is a quick process.
The normal AIX boot process is conducted in three phases:
1) boot IPL, or locating and loading the boot block (hd5);
2) rootvg IPL (varyonvg of rootvg),
3) rc.boot 3 or start of init process reading /etc/inittab
A WPAR activation or "booting" skips step 1. Step 2 is the global (hosting) system mounting the WPAR filesystems - either locally or from remote storage (currently only NFS is officially supported; GPFS is known to work, but is not officially supported at this time (September 2007)). The third phase is starting an init process in the global system. This "init" process does a chroot to the WPAR root filesystem and performs a normal AIX rc.boot 3 phase.
WPAR Management
WPAR management in its simplest form is simply: starting, stopping, and monitoring resource usage. And, not to forget - creating and deleting WPARs.
Creating a WPAR is a very simple process: the one-time prerequisite is the existence of the directory /wpars with mode 700 for root. Obviously, we do not want just anyone wandering in the virtualized rootvgs of the WPARs. And, if the WPAR name you want to create resolves either in /etc/hosts or DNS (and I suspect NIS), all you need to do is enter:
# mkwpar -n <wparname>
If you want to save the output you could also use:
# nohup mkwpar -n & sleep 2; tail -f nohup.out
and watch the show!
This creates all the WPAR filesystems (/, /home, /tmp, /var and /proc)
and read-only entries for /opt and /usr. After these have been made, they are
mounted and "some assembly" is performed, basically installing the root part
of the filesets in /usr. The only "unfortunate" part of the default setup is
that all filesystems are created in rootvg, using generic logical volume
names (fslv00, fslv01, fslv02, fslv03). Fortunately, there is an argument
(-g) that you can use to get the logical volumes created in a different
volume group. There are many options for changing all of these and they
will be covered in my next document when I'll discuss WPAR mobility.
At this point you should just enter:
# startwpar <wparname>
Wait for the prompt, and from "anywhere" you can connect to the running WPAR just
as if it were a separate system. Just do not expect to make any changes in /usr
or /opt (software installation is also a later document).
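For reference, the day-to-day lifecycle of a system WPAR is handled by a small set of commands (the WPAR name is an example):

lswpar             # list defined WPARs and their state
startwpar mywpar   # activate ("boot") the WPAR
clogin mywpar      # log in to the running WPAR from the global system
stopwpar mywpar    # shut the WPAR down
rmwpar mywpar      # remove the WPAR definition (and, by default, its filesystems)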
Summary
WPAR creation is very similar to the process NIM uses for diskless and dataless installations. This method relies on AIX rootvg software consisting of three components: root, user and shared. The normal boot process is emulated by the global system "hosting" the WPAR. Phase 1 is not needed; Phase 2 is the mount of the WPAR filesystem resources; and Phase 3 is a so-called "init" process that is seen as the regular init in the WPAR environment. This is the process that reads and processes /sbin/rc.boot 3 and /etc/inittab just as a normal AIX system would.
Implementation of Partition Load Manager
PLM Software Installation
Install the following filesets:
plm.license
plm.server.rte
plm.sysmgt.websm
Make sure SSL and OpenSSH are also installed
For setup of PLM, create .rhosts files on the server and all clients. After PLM has been set up, you can delete the .rhosts files.
Create SSH Keys
On the server, enter:
# ssh-keygen -t rsa
Copy the HMC’s secure keys to the server:
# scp hscroot@hmchostname:.ssh/authorized_keys2 \
~/.ssh/tmp_authorized_keys2
Append the server’s keys to the temporary key file and copy it back to the HMC:
# cat ~/.ssh/id_rsa.pub >> ~/.ssh/tmp_authorized_keys2
# scp ~/.ssh/tmp_authorized_keys2 \
hscroot@hmchostname:.ssh/authorized_keys2
Test SSH and Enable WebSM
Test SSH to the HMC. You should not be asked for a password.
# ssh hscroot@hmchostname lssyscfg -r sys
On the PLM server, make sure you can run WebSM. Run:
# /usr/websm/bin/wsmserver -enable
Configure PLM Software
On the PLM server, open WebSM and select Partition Load Manager.
Click on Create a Policy File. In the window that opens, on the General tab, enter a policy file name on the first line.
Click on the Globals tab. Enter the fully qualified hostname of your HMC. Enter hscroot (or a user with the Systems Administration role) as the HMC user name. Enter the CEC name, which is the managed system name (not the fully qualified hostname).
Click on the Groups tab. Click the Add button. Type in a group name. Enter the maximum CPU and memory values that you are allowed to use for PLM operations.
Check both CPU and Memory management if you’re going to manage both.
Click on Tunables. These are the defaults for the entire group. If you don’t understand a value, highlight it and select Help for a detailed description.
Click on the Partitions tab. Click the Add button and add all of the running partitions in the group to the partitions list.
On the Partition Definition tab, use the partitions’ fully qualified hostnames and add them to the group you just created.
Click OK to create the policy file.
In the PLM server, view the policy file you created. It will be in /etc/plm/policies.
Perform the PLM setup step using WebSM. You must be root. Once this finishes, you’ll see “Finished: Success” in the WebSM working window.
In the server and a client partition, look at the /var/ct/cfg/ctrmc.acls file to see if these lines are at the bottom of the file:
IBM.LPAR
root@hmchostname * rw
If you need to edit this file, run this command afterward:
# refresh -s ctrmc
Test RMC authentication by running this command from the PLM server, where remote_host is a PLM client:
# CT_CONTACT=remote_host lsrsrc IBM.LPAR
If successful, a lot of LPAR information will be printed out instead of "Could not authenticate user".
Start the PLM server. Look for “Finished:Success” in the WebSM working window.
Enter a configuration name. Enter your policy file name. Enter a new logfile name.
(If you have trouble with the logfile, you may need to touch the file before you can access it.)
If the LPAR details window shows only zeroed-out information, then there’s probably an RMC authentication problem.
If there’s a problem, on the server partition, run:
# /usr/sbin/rsct/bin/ctsvhbal
The output should list one or more identities. Check to see that the server’s fully qualified hostname is in the output.
On each partition, run /usr/sbin/rsct/bin/ctsthl -l. At least one of the identities shown on the remote partition's ctsvhbal output should show up on the other partitions' ctsthl -l output. This is the RMC list of trusted hosts.
If there are any entries in the RMC trusted hosts lists which are not fully qualified hostnames, remove them with the following command:
# /usr/sbin/rsct/bin/ctsthl -d -n identity
where identity is the trusted host list identity
If one partition is missing a hostname, add it as follows:
# /usr/sbin/rsct/bin/ctsthl -a -n identity -m METHOD -p ID_VALUE
Identity is the fully qualified hostname of the other partition
rsa512 is the method
Id_value is obtained by running ctsthl -l on the other partition to determine its own identifier