Processes and Devices—It’s All About the Children
In the AIX (or Linux) world, when a process spawns a child, the child is created as a copy of the parent process. Spawning children is a quick and effective way of handling the many tasks a process might have to undertake. When a parent is terminated or stopped, its children should be terminated in a controlled manner. This is the responsibility of the parent process, but it doesn’t always happen. It’s important to know which children belong to which parent in case you’re faced with processes you need to kill, so you must keep track of both the children and the parent. Here, I’ll demonstrate a couple of ways to do that.
AIX uses the same parent/children method, top down, to represent its devices: a sort of device tree, if you like. Every device, whether physical or logical, has a place in the hierarchy. When making changes to devices, you cannot just jump in and use rmdev. You need to know which devices hang off which slot. For example, if you wanted to remove or change a fibre card device, like fscsi, you must first remove or make unavailable any device hanging off it. I will demonstrate how AIX sees children and their parents among devices using a few commands.
Keep an Eye on Those Children
For ease of identification, I'll run a script called trickle, which spawns two children, trickle1 and trickle2, like so:
# ps -ef -o user,pid,ppid,args|grep trickle
root 5701778 5373994 /bin/sh /home/dxtans/trickle
root 5898334 5701778 /bin/sh /home/dxtans/trickle2
root 8650902 5701778 /bin/sh /home/dxtans/trickle1
Notice that the PID of the main process is 5701778, and the PPID of each child is also 5701778. When a parent spawns children, each child records the parent’s PID as its own PPID. To locate a parent and all of its children, first find the PID of the main process, then use ps with grep on that PID to locate its children. Alternatively, you can use the -T option with the ps command to display the process and its children, if present, like so:
# ps -fT 5701778
UID PID PPID C STIME TTY TIME CMD
root 5701778 5373994 0 12:57:15 - 0:00 /bin/sh /home/dxtans/trickle
root 5898334 5701778 0 12:57:15 - 0:00 |\--/bin/sh /home/dxtans/trickle2
root 7471214 5898334 0 12:57:15 - 0:00 | \--scan 60
root 8650902 5701778 0 12:57:15 - 0:00 \--/bin/sh /home/dxtans/trickle1
root 6684866 8650902 0 12:57:15 - 0:00 \--scan 60
Passing the PID to ps in this way displays the parent, its children, and any other processes the children are executing. From the output above, we can see that each child is running a utility called scan. However, an even better visual for detecting child processes comes from the proctree command. Simply pass it the PID, like so:
# proctree -T -a 5701778
1 \--/etc/init
6815968 \--/usr/sbin/cron
5373994 \--ksh
5701778 \--/bin/sh /home/dxtans/trickle
5898334 | | | |\--/bin/sh /home/dxtans/trickle2
7471214 | | | | \--scan 60
8650902 | | | |\--/bin/sh /home/dxtans/trickle1
6684866 | | | | \--scan 60
The output here indicates the process was started via cron, whose parent is init. Although proctree doesn't print the PPIDs, the tree layout makes the relationships easier to identify, in my opinion.
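The same parent/child relationships can be worked out from nothing more than PID and PPID pairs. Here is a minimal sketch, assuming a POSIX sh and awk; the function name descendants is my own invention for illustration, not an AIX utility. It reads pid/ppid pairs on standard input and prints every descendant of the PID you give it.

```sh
#!/bin/sh
# descendants (hypothetical helper): print every descendant PID of $1, one per line.
# Input on stdin: lines whose first two fields are "pid ppid",
# for example the output of: ps -e -o pid,ppid
descendants() {
    awk -v root="$1" '
        # Map each PPID to the list of its child PIDs.
        # Skip any process that lists itself as its own parent.
        $1 != $2 { kids[$2] = kids[$2] " " $1 }

        END { walk(root) }

        # Print the children of p, then recurse into each child.
        # n, i and list are awk-style local variables.
        function walk(p,    n, i, list) {
            n = split(kids[p], list, " ")
            for (i = 1; i <= n; i++) {
                print list[i]
                walk(list[i])
            }
        }
    '
}
```

Running ps -e -o pid,ppid | descendants 5701778 against the processes above should list the two trickle children along with their scan processes. The header line that ps prints does no harm, because no process has a PID of "PPID".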
As noted, when a parent process with children is terminated, it should terminate its own children first. Otherwise, they’ll be orphaned, and init will become their parent. Let’s see that in action now:
# kill -9 5701778
# ps -ef -o user,pid,ppid,args|grep trickle
root 5898334 1 /bin/sh /home/dxtans/trickle2
root 8650902 1 /bin/sh /home/dxtans/trickle1
A kill -9 (absolute kill) has been executed against the parent, but the children remain as the children of init (1). These remaining processes will need to be killed manually. If you don’t clean up these processes, you’re asking for trouble down the line when restarting the process or service. At best, you’ll have a lot of orphans whose parent is init. At worst, the process, if restarted, might compete for allocated resources, which isn't good. When a parent process is terminated, make sure the children are as well.
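To make that clean-up less error prone, the orphans can be picked out of the ps listing automatically. The following is only a sketch, assuming a POSIX sh and awk; orphans_matching is a hypothetical helper name. It expects the same four columns used above (user, pid, ppid, args) and prints the PIDs of processes whose parent is init and whose command line contains a given string.

```sh
#!/bin/sh
# orphans_matching (hypothetical helper): print the PIDs of processes whose
# PPID is 1 and whose command line contains the string given as $1.
# Input on stdin: "user pid ppid args..." lines, as printed by
#   ps -ef -o user,pid,ppid,args
orphans_matching() {
    awk -v pat="$1" '
        $3 == 1 {
            # Rebuild the command line from field 4 onwards so that the
            # user, pid and ppid columns are never matched by accident.
            args = ""
            for (i = 4; i <= NF; i++) args = args " " $i
            if (index(args, pat)) print $2
        }
    '
}
```

For example, ps -ef -o user,pid,ppid,args | orphans_matching trickle. Plenty of legitimate daemons also have init as their parent, so review the list before passing it to kill, and try a plain kill before resorting to kill -9.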
Top Down and Bottom Up With Devices
The device tree that AIX uses to present its devices is quite straightforward; you can view it top down or bottom up. When making changes to a device, you need to know which device owns which slot, or rather, which device is the parent. Otherwise, you’ll get the dreaded “device busy/in use” message. From the parent, you can then go down the tree to discover all of the children until you reach the one you’re after or no children remain. AIX offers some commands to display devices:
- Parent device: lsdev -l <device> -F parent
- Child devices: lsdev -p <device>
- Possible parent devices: lsparent -C -l <device>
- Parent(s) of the device: lsdev -C -F "name parent" | grep <device>
Imagine you have a SCSI tape unit called rmt0, and you need to know who the parent is, perhaps to move it or change its attributes:
# lsdev -l rmt0 -F parent
scsi0
# lsdev -l scsi0 -F parent
pci11
# lsslot -c slot| grep pci11
U787F.001.DPM0Y7F-P1-C2 Logical I/O Slot pci11 scsi0
You could go further up the tree, like so:
# lsdev -l pci11 -F parent
pci2
But as pci11 is a slot, that’s the place to stop; job done. Of course, you could have found the parent of rmt0 more quickly using:
# lsdev -C -F "name parent" | grep rmt
rmt0 scsi0
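The manual walk up the tree can also be wrapped in a small loop. This is a sketch rather than a finished tool: device_ancestry is a hypothetical function name, and it assumes that lsdev -l <device> -F parent prints an empty value once the top of the tree is reached. A depth limit guards against looping forever if that assumption turns out to be wrong.

```sh
#!/bin/sh
# device_ancestry (hypothetical helper): print device $1 followed by each of
# its ancestors, one per line, walking up with "lsdev -l <device> -F parent".
# The body is wrapped in ( ... ) so its variables stay out of the calling shell.
device_ancestry() (
    dev=$1
    depth=0
    # Stop when lsdev reports no parent (assumed to be an empty value),
    # or after 20 levels as a safety net.
    while [ -n "$dev" ] && [ "$depth" -lt 20 ]; do
        printf '%s\n' "$dev"
        dev=$(lsdev -l "$dev" -F parent 2>/dev/null)
        depth=$((depth + 1))
    done
)
```

Calling device_ancestry rmt0 on the system above should print rmt0, scsi0, pci11 and pci2, followed by whatever sits above pci2.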
To find out which children hang off a device, pci3 in this example, go down the tree using:
# lsdev -p pci3
ide0 Available 03-08 ATA/IDE Controller Device
# lsdev -p ide0
cd0 Available 03-08-00 IDE DVD-ROM Drive
A common task is discovering what hangs off the fibre cards. To do so, start with the fcs0 virtual adapter and drill down:
# lsdev -p fcs0
fscsi0 Available C6-T1-01 FC SCSI I/O Controller Protocol Device
# lsdev -p fscsi0
hdisk0 Available C6-T1-01 IBM MPIO FC 2107
hdisk1 Available C6-T1-01 IBM MPIO FC 2107
hdisk2 Available C6-T1-01 IBM MPIO FC 2107
hdisk3 Available C6-T1-01 IBM MPIO FC 2107
sfwcomm0 Available C6-T1-01-FF Fibre Channel Storage Framework Comm
In this output, you can see hdisk0 through hdisk3. These are the children of the parent device fscsi0. Of course, you could go the other way and find the parent of fcs0 by going right to the top:
# lsdev -l fcs0 -F parent
vio0
# lsdev -l vio0 -F parent
sysplanar0
# lsdev -l sysplanar0 -F parent
sys0
These commands could be the building blocks to write a basic script to present the layout of the devices and their corresponding children for your own needs.
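As a starting point, here is one way such a script might look. Treat it as a sketch under stated assumptions: device_tree is a hypothetical function name, and it relies only on the lsdev -p form shown above, taking the first column of each output line as the child's name.

```sh
#!/bin/sh
# device_tree (hypothetical helper): print device $1 and everything below it,
# indenting two spaces per generation.
# The body is wrapped in ( ... ) so that each level of recursion runs in its
# own subshell and keeps its own copies of dev and indent.
device_tree() (
    dev=$1
    indent=${2:-}
    printf '%s%s\n' "$indent" "$dev"
    # The first column of "lsdev -p <device>" output is the child's name.
    lsdev -p "$dev" 2>/dev/null </dev/null | awk '{ print $1 }' |
    while read -r child; do
        device_tree "$child" "$indent  "
    done
)
```

Calling device_tree fcs0 on the system above should show fscsi0 beneath fcs0, with the hdisks and sfwcomm0 beneath fscsi0. Each level of the tree costs one lsdev call per device, so a large tree will take a moment to print.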