Reviewing AIX Error and Boot Logs

AIX provides comprehensive event logging: some events are errors requiring attention, and others are just notifications. For system administrators tasked with making sure the system runs without major issues, logging alerts them to events as they happen.

AIX offers different logs depending on the action and when it occurred. These logs hold information on the boot-up process, console, hardware and system software events. It’s up to the system admin to take action on these events, because once AIX has published the log, its job is done.

Logs, Logs, Logs

AIX offers not only errpt but also other logs. The alog command lists the available logs, from which you can pick one to view:

# alog -L
boot
bosinst
nim
console
cfg
mdmplog
lvmt
lvmcfg
dumpsymp

When issues arise during the boot-up process, for example, and you’re not at the console, you can review the start-up process messages, particularly the boot and console messages. To view one of the logs listed above, pass its name to:

alog -o -t <logname>

For example, to view the console log:

alog -o -t console
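
To capture every log in one pass, the listing from alog -L can drive a loop over alog -o -t. The following is a minimal sketch; the function name save_alogs and the output directory are illustrative and not part of AIX:

```shell
#!/bin/sh
# Sketch: dump every alog-managed log into its own text file.
# save_alogs and the target directory are hypothetical names.
save_alogs() {
    outdir=$1
    mkdir -p "$outdir" || return 1
    # alog -L prints one log type per line (boot, console, ...)
    for log in $(alog -L); do
        # alog -o -t <type> writes that log's contents to stdout
        alog -o -t "$log" > "$outdir/$log.txt" 2>/dev/null
    done
}

# Example (as root on AIX):
#   save_alogs /tmp/alogs
```

Each log ends up in a file named after its type, which makes it easy to search the boot and console messages with standard text tools.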

Logging Your Own Entries

The standard errpt lists hardware and software events that have occurred in AIX. However, you might want a message generated and inserted into errpt after some user interaction, for instance, when a system admin has made a change. This makes the change notification visible via errpt. Like the logger command, which writes to the system log (messages file), errlogger writes an operator notification entry to the error log. For example, having completed an AIX upgrade, you could post that to the error log so other users can view it, like so:

errlogger "AIX upgrade completed - no errors- test"
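
If several admins post change notes, a small wrapper can keep the entries in a consistent format. This is only a sketch: change_note and log_change are hypothetical helper names, and the "CHANGE by" prefix is an arbitrary convention chosen for illustration.

```shell
#!/bin/sh
# Sketch: build a consistent change note and post it with errlogger.
# change_note and log_change are hypothetical helper names.
change_note() {
    # $1 = who made the change, $2 = free-text summary
    printf 'CHANGE by %s: %s\n' "$1" "$2"
}

log_change() {
    # errlogger writes an operator notification entry to the error log
    errlogger "$(change_note "$(id -un)" "$1")"
}

# Example (as root on AIX):
#   log_change "AIX upgrade completed - no errors"
```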

Working With errpt

The first thing AIX admins should do is set up event notifications via email, so that errors and warnings are emailed as well as posted to the error log. First, create an email alias containing all system admins’ addresses in the /etc/mail/aliases file. Then insert the alias into the notification list using the following smit selections: smit diag, current shell diagnostic, task selection, automatic error log notification. You’ll now receive emails as entries are posted to the error log.

The errpt list has headers in the following format:

identifier, timestamp, type, class, resource name, description.

A typical list entry could be:

A6DF45AA 0410183413 I O RMCdaemon The daemon is started.
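
The timestamp column in the summary listing is packed as MMDDhhmmYY, which is easy to misread. As a small sketch, the helper below (errpt_ts is a hypothetical name, and the century is assumed to be 20xx) rewrites it into a more familiar form:

```shell
#!/bin/sh
# Sketch: convert an errpt summary timestamp (MMDDhhmmYY) to YYYY-MM-DD hh:mm.
# errpt_ts is a hypothetical helper; the century is assumed to be 20xx.
errpt_ts() {
    printf '%s\n' "$1" |
        sed 's/^\(..\)\(..\)\(..\)\(..\)\(..\)$/20\5-\1-\2 \3:\4/'
}

# Timestamp taken from the sample entry above
errpt_ts 0410183413    # prints 2013-04-10 18:34
```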

Some system admins view the errpt summary, list it in full, and then clear the whole error log when done, using the following commands:

errpt
errpt -a
errclear 0

However, you can be more selective. To clear errpt entries older than two days, use:

errclear 2

To clear all software errors by using the error class, try:

errclear -d S 0

To clear all ent0 entries by using the resource name:

errclear -N ent0 0

To clear all SYSPROC entries:

errclear -N SYSPROC 0

To clear by identifier:

errclear -j <identifier> 0

In the last example, identifier is used to locate and clear an entry. It can also be used to view entries:

errpt -j <identifier>

To view the full entries by identifier:

errpt -a -j <identifier>

Retrieving information by identifier works, but sometimes it’s simpler to filter by resource name. To extract all entries relating to, say, hdisk1 from the errpt:

errpt -N hdisk1

To extract all entries relating to ent0, try:

errpt -N ent0

If you want to view entries based on hardware or software, simply supply the class type. To view any hardware-related issues, for instance, use:

errpt -d H 

Similarly for software, which would include core dumps and shutdowns, use:

errpt -d S

For operator, including notice events, file system space issues and services that terminate:

errpt -d O

Another class, called U (undetermined), covers events that don’t fall into any other category.
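
For a quick overview of how many entries fall into each class, the summary listing can be tallied on its fourth column. The following is a sketch only; count_by_class is a hypothetical helper that reads errpt summary output on standard input and assumes the column layout described earlier:

```shell
#!/bin/sh
# Sketch: tally errpt summary lines by class (H, S, O, U).
# count_by_class is a hypothetical helper; it reads errpt output on stdin.
count_by_class() {
    # Skip the header line; the class is the fourth whitespace-separated field
    awk '$1 != "IDENTIFIER" && NF >= 5 { n[$4]++ }
         END { for (c in n) print c, n[c] }' | sort
}

# Example on AIX:
#   errpt | count_by_class
```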

Don’t Report These Errors

There are occasions when the errpt fills with notifications you don’t really care about. You still want AIX to log them, just not report them. This could be due to a rush of notifications that you don’t want reported until a certain issue has been fixed. To view current errpt entries that have been disabled from reporting, use:

errpt -t -F Report=0

To view the current repository list containing the complete list of identifiers, labels, descriptions, etc., try:

errpt -t 

Consider a scenario where you wish to stop the reporting of events for a disk RAID array. The system repeatedly tries to rebuild, but you don’t need AIX to keep telling you. To disable reporting of the RAID rebuild, first obtain the identifier (FE7D0EED in this example) by listing the errpt repository. Then disable reporting for that identifier:

# errupdate <hit return>
=FE7D0EED: <hit return>
Report=false 
<hit CTRL-D>
<hit CTRL-D>
0 entries added.
0 entries deleted.
1 entries updated.
#

In the session above, the “=” sign tells errupdate to modify an existing entry. The text also shows where you should hit return and CTRL-D in the interactive errupdate utility. To confirm that reporting was disabled, use the errpt -t -F Report=0 command. At some point, you’ll want to re-enable this report. To do so:

# errupdate <hit return>
=FE7D0EED: <hit return>
Report=true
<hit CTRL-D>
<hit CTRL-D>
0 entries added.
0 entries deleted.
1 entries updated.
#

Again, review the repository to check identifiers that have been disabled/enabled from reporting.
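
The interactive sessions above can also be scripted, since errupdate reads its input from standard input when no file is given. The following is a minimal sketch; report_stanza and set_report are hypothetical helper names, and the stanza simply mirrors what is typed in the sessions above:

```shell
#!/bin/sh
# Sketch: generate errupdate input that toggles the Report flag of a template.
# report_stanza and set_report are hypothetical helper names.
report_stanza() {
    # $1 = error identifier, $2 = true or false
    printf '=%s:\nReport=%s\n' "$1" "$2"
}

set_report() {
    # errupdate reads template changes from stdin when no file is given
    report_stanza "$1" "$2" | errupdate
}

# Example (as root on AIX):
#   set_report FE7D0EED false
#   errpt -t -F Report=0      # confirm the change
#   set_report FE7D0EED true
```

Keeping the stanza generation separate from the errupdate call makes it possible to review the exact input before applying it.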

If Logging Stops

If your errlog stops logging or reporting events, chances are the log is full or corrupted. A quick fix is to remove the file and start afresh. First, stop the error daemon:

# /usr/lib/errstop

Next, remove the /var/adm/ras/errlog file:

# rm /var/adm/ras/errlog

Restart it:

# /usr/lib/errdemon

You’re good to go. To view attributes relating to the errlog, use:

# /usr/lib/errdemon -l
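
The stop, remove, restart sequence above can be wrapped in a small function. The sketch below is a variation rather than the exact procedure: it renames the old log with a date suffix instead of deleting it, so the file stays available for later inspection. reset_errlog is a hypothetical name, and the paths are the ones used in this article, overridable through environment variables.

```shell
#!/bin/sh
# Sketch: reset the error log but keep a copy of the old file.
# reset_errlog is a hypothetical helper; paths default to those in the article.
ERRSTOP=${ERRSTOP:-/usr/lib/errstop}
ERRDEMON=${ERRDEMON:-/usr/lib/errdemon}
ERRLOG=${ERRLOG:-/var/adm/ras/errlog}

reset_errlog() {
    # Stop the error daemon; give up if that fails
    "$ERRSTOP" || return 1
    if [ -f "$ERRLOG" ]; then
        # Rename instead of rm so the old log is preserved
        mv "$ERRLOG" "$ERRLOG.$(date +%Y%m%d%H%M%S)" || return 1
    fi
    # Restart the error daemon
    "$ERRDEMON"
}

# Example (as root on AIX):
#   reset_errlog && /usr/lib/errdemon -l
```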