Problems with NFS on an AIX Reboot? Then Go Single

A few weeks ago, I was called in to review an issue where an AIX admin was cloning an AIX image onto a new LPAR. He said the reboot was hanging during NFS startup, while the client tried to contact the NFS server to mount the remote file systems. I told him to wait another five minutes before I came over to investigate. He replied that it had been like that for more than 30 minutes! Even hitting the interrupt key (normally Ctrl-C) on the keyboard didn’t cancel the NFS mount process. My first step was to disable the NFS startup on the client, which seemed the most likely fix, even though the NFS server was up and exporting the correct file systems. (To see what file systems a server exports, use the command: showmount -e.)

The administrator wanted to find his AIX diagnostic DVD to boot from, but I stopped him. He only needed to boot AIX into single user mode. In this mode no network services are started, so you can investigate network-related issues.

Going Single User

Booting into single user is pretty simple. It goes like this:

  1. Boot the LPAR into SMS and select your normal boot disk, but instead of booting in Normal mode, select Service boot mode. See Figure 1.
  2. An informational message screen is then presented, followed by the diagnostic menu. Here select Single User Mode. See Figure 2.
  3. Once selected, the system prompts for the root password before it lets you into single user mode. See Figure 3.

Once in single user mode, you can attack the problem. If you’re cloning from an old image, be sure you know the root password that was set when the clone image was taken. If not, you’ll be sitting at the password prompt for a long time.

Sorting Out NFS

I had a good guess what the issue was. The NFS mounts were probably set up as hard mounts with no intr attribute set. An NFS mount can be either soft or hard. A soft mount keeps trying until the timeo (timeout period) is reached, then gives up, generally after 15 seconds unless otherwise specified. A hard mount keeps trying forever, waiting for a response from the NFS server. If the NFS server is down, you are out of luck and the client is going to hang around for a long time! The only way to be able to cancel a hard mount is to set the intr attribute, which lets you cancel the mount in progress with a keyboard interrupt (normally Ctrl-C). On this occasion, I decided to comment out anything related to NFS. I just needed to get the LPAR up on the network.
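To make the soft/hard/intr distinction concrete, here is a minimal sketch that classifies an option string of the kind passed to mount -o. The function name nfs_hang_risk is hypothetical, and the logic simply follows the description above: anything not marked soft is treated as hard, and a hard mount is only interruptible when intr is present.

```shell
#!/bin/sh
# Classify an NFS option string by how the mount behaves when the server
# does not answer. nfs_hang_risk is a hypothetical helper name.
nfs_hang_risk() {
    opts=",$1,"                 # pad with commas so whole options can be matched
    case "$opts" in
        *,soft,*) echo "soft: gives up and returns an error" ;;
        *,intr,*) echo "hard,intr: retries forever but can be interrupted" ;;
        *)        echo "hard: retries forever, cannot be interrupted" ;;
    esac
}

nfs_hang_risk "ro,soft"
nfs_hang_risk "ro,hard,intr"
nfs_hang_risk "ro,hard"
```

Running a check like this over the options used on a clone source would flag the mounts most likely to stall a reboot.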

So I edited the /etc/inittab file to comment out the NFS startup, which is the entry that runs rc.nfs. Remember that a comment in inittab is a colon (:) and not a hash (#). In /etc/rc.nfs, I put an ‘exit 0’ at the top of the file. At this point, I was quite satisfied that NFS wouldn’t start. I could have just run rmnfs -B, which would have accomplished more or less the same task. I also checked /etc/filesystems for NFS mount entries that needed commenting out; there were none. When I exited single user mode, the LPAR came up within a minute.
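As a rough illustration of the colon-style comment, the sketch below disables the NFS entry in a copy of inittab rather than the live file. The helper name disable_rcnfs is hypothetical, and the two sample lines are illustrative; check the real entry on your system (for example with lsitab rcnfs) before editing anything.

```shell
#!/bin/sh
# Comment out the rcnfs entry by prefixing it with a colon.
# disable_rcnfs is a hypothetical helper; it reads the file named in $1
# and writes the edited text to stdout, leaving the input untouched.
disable_rcnfs() {
    sed 's/^rcnfs:/:rcnfs:/' "$1"
}

# Illustrative sample, not copied from a real system.
sample=$(mktemp)
cat > "$sample" <<'EOF'
rctcpip:23456789:wait:/etc/rc.tcpip > /dev/console 2>&1
rcnfs:23456789:wait:/etc/rc.nfs > /dev/console 2>&1
EOF

disable_rcnfs "$sample"
rm -f "$sample"
```

Writing to stdout first makes it easy to compare the result with the original before copying it over /etc/inittab.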

When the Unexpected Happens

So what can we learn to avoid NFS issues when rebooting a client or doing a clone? Either make the NFS mount a soft mount, or set the intr attribute if it’s a hard mount.

For example, the following command creates a read-only soft mount of the remote file system uk01wrs6040:/opt/software_nfs. The local directory /opt/software_nfs has already been created as the mount point:

mount -o ro,soft uk01wrs6040:/opt/software_nfs /opt/software_nfs

For a hard mount, specify the intr option so the mount can be interrupted from the keyboard if the remote server uk01wrs6040 is not responding. (Note it’s also read-only.)

mount -o ro,hard,intr uk01wrs6040:/opt/software_nfs /opt/software_nfs

Don’t Make It Hard

If you’re using SMIT for your NFS mounts, you can try setting the “mount automatically at system restart” option to no within SMIT and then mount NFS manually once the system is up. You could also set the mounts to run in the background (option: bg), but I like to see what’s going on when I bring up an AIX machine, so all my mounts are in the foreground (option: fg). That’s just me.
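Predefined mounts end up as stanzas in /etc/filesystems, so one way to review them is to list every NFS stanza with its mount and options attributes. The sketch below assumes the usual stanza layout (a name ending in a colon, then attribute = value lines, with comment lines starting with an asterisk); the function name list_nfs_stanzas and the sample file are hypothetical.

```shell
#!/bin/sh
# Print each NFS stanza from a filesystems-style file together with its
# mount and options attributes. list_nfs_stanzas is a hypothetical helper.
list_nfs_stanzas() {
    awk '
        function emit() {
            # vfs values beginning with "nfs" are treated as NFS mounts
            if (name != "" && vfs ~ /^nfs/)
                print name, "mount=" mnt, "options=" opts
            name = ""; vfs = ""; mnt = ""; opts = ""
        }
        /^\*/ { next }                                  # skip comment lines
        /^[^ \t].*:[ \t]*$/ { emit(); name = $1; sub(/:$/, "", name); next }
        $1 == "vfs"     { vfs = $3 }
        $1 == "mount"   { mnt = $3 }
        $1 == "options" { opts = $3 }
        END { emit() }
    ' "$1"
}

# Illustrative sample file, not taken from a real system.
sample=$(mktemp)
cat > "$sample" <<'EOF'
/home:
        dev             = /dev/hd1
        vfs             = jfs2
        mount           = true

/opt/software_nfs:
        dev             = "/opt/software_nfs"
        vfs             = nfs
        nodename        = uk01wrs6040
        mount           = true
        options         = ro,bg,hard,intr
EOF

list_nfs_stanzas "$sample"
rm -f "$sample"
```

Any stanza reported with mount=true and a hard mount lacking intr is a candidate for the kind of boot hang described here.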

As a rule, I don’t bother with SMIT to create predefined NFS mounts. Instead, I explicitly create them in an rc.local script file that I’ve set up myself, which is the last entry to be executed from /etc/inittab. These NFS mounts are hard mounts but have the intr option set. So if the machine hangs during bootup with NFS issues, I just send a keyboard interrupt (normally Ctrl-C) to break out of the NFS mounting. The machine then carries on with the normal boot process. No point in making it hard for yourself.
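A minimal sketch of what such an rc.local mount section might look like follows. The DRY_RUN switch and the mount_nfs function are hypothetical names added for illustration, and the share is the one used in the earlier examples; by default the script only prints the mount command so it can be tried safely.

```shell
#!/bin/sh
# Sketch of an rc.local style NFS mount section.
# DRY_RUN and mount_nfs are hypothetical names, not part of AIX.
DRY_RUN=${DRY_RUN:-1}    # 1 = print the command only, 0 = really mount

mount_nfs() {
    # $1 = server:/remote/dir, $2 = local mount point
    cmd="mount -o ro,hard,intr $1 $2"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"
    else
        echo "Mounting $1 on $2"
        # Foreground hard mount with intr, so it can be interrupted at the console
        $cmd || echo "WARNING: could not mount $1" >&2
    fi
}

mount_nfs uk01wrs6040:/opt/software_nfs /opt/software_nfs
```

To have a script like this run last at boot, an inittab entry can be added with mkitab; the entry identifier and run level you choose are up to you and are not specified in this article.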
