Reviewing AIX Error and Boot Logs

AIX logs a wide range of events. Some are errors that need attention; others are simply notifications. For system administrators responsible for keeping the system running without major issues, these logs provide alerts and a record of events as they happen.

AIX offers different logs depending on the action and when it occurred. These logs hold information on the boot-up process, console, hardware and system software events. It’s up to the system admin to take action on these events, because once AIX has published the log, its job is done.

Logs, Logs, Logs

Besides the error log read by errpt, AIX keeps several other logs. The alog command can list them, so you can pick one to view:

# alog -L
boot
bosinst
nim
console
cfg
mdmplog
lvmt
lvmcfg
dumpsymp

When issues arise during the boot-up process, for example, and you're not at the console, you can review the start-up messages afterwards, particularly the boot and console logs. To display a log, pass its name to alog:

alog -o -t <log name>

For example, to view the console log:

alog -o -t console

Logging Your Own Entries

The standard errpt listing shows hardware and software events that have occurred in AIX. However, you might want to insert a message of your own into the error log, for instance after a system admin has made a change, so that the change is visible via errpt. Just as the logger command writes to the system log (messages file), errlogger writes an operator notification entry to the error log. For example, having completed an AIX upgrade, you could post that to the error log so other users can see it, like so:

errlogger "AIX upgrade completed - no errors- test"
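If several admins post change notes, a small wrapper can keep the entries consistent. The following is only a sketch: the function name log_change, the "CHANGE by" prefix and the ERRLOGGER override variable are all assumptions made for illustration, not part of AIX.

```shell
# log_change: post a change note to the AIX error log via errlogger.
# ERRLOGGER is a hypothetical override (e.g. ERRLOGGER=echo) so the
# function can be tried on a machine that has no errlogger command.
log_change() {
    msg=$1
    who=${LOGNAME:-unknown}
    # Prefix the message with the user name so errpt -a shows who posted it.
    "${ERRLOGGER:-errlogger}" "CHANGE by ${who}: ${msg}"
}
```

On an AIX host it could be called as: log_change "AIX upgrade completed - no errors"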

Working With errpt

The first thing AIX admins should do is arrange to receive event notifications by email, so that errors and warnings are emailed as well as posted to the error log. First, create an email alias containing all system admins' addresses in the /etc/mail/aliases file. Then add that alias to the notification list, using the following smit selections: smit diag, current shell diagnostic, task selection, automatic error log notification. From then on you'll get an email as entries are posted to the error log.

The errpt summary listing has the following columns:

identifier, timestamp, type, class, resource name, description.

A typical list entry could be:

A6DF45AA 0410183413 I O RMCdaemon The daemon is started.
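The timestamp column is packed as MMDDhhmmyy, so 0410183413 in the entry above reads as 10 April, 18:34, year 13. As a rough sketch, a helper like the following can unpack it; the name errpt_ts is made up here, and the hard-coded "20" century prefix is an assumption.

```shell
# errpt_ts: turn an errpt summary timestamp (MMDDhhmmyy) into
# "YYYY-MM-DD hh:mm". Assumes the two-digit year belongs to 20xx.
errpt_ts() {
    ts=$1
    # Accept exactly ten digits, nothing else.
    case $ts in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "usage: errpt_ts MMDDhhmmyy" >&2; return 1 ;;
    esac
    mm=$(printf '%s' "$ts" | cut -c1-2)
    dd=$(printf '%s' "$ts" | cut -c3-4)
    hh=$(printf '%s' "$ts" | cut -c5-6)
    mi=$(printf '%s' "$ts" | cut -c7-8)
    yy=$(printf '%s' "$ts" | cut -c9-10)
    printf '20%s-%s-%s %s:%s\n' "$yy" "$mm" "$dd" "$hh" "$mi"
}
```

For the sample entry, errpt_ts 0410183413 prints 2013-04-10 18:34.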

Some system admins view the errpt summary, list the entries in full, then clear the whole error log when done, using the following commands:

errpt
errpt -a
errclear 0

However, you can be more selective. To clear errpt entries older than two days, use:

errclear 2

To clear all software errors by using the error class, try:

errclear -d S 0

To clear all ent0 entries by using the resource name:

errclear -N ent0 0

To clear all SYSPROC entries:

errclear -N SYSPROC 0

To clear by identifier:

errclear -j <identifier> 0

In the last example, identifier is used to locate and clear an entry. It can also be used to view entries:

errpt -j <identifier>

To view the full entries by identifier:

errpt -a -j <identifier>

Of course, it's OK to get information from errpt using the identifier, but sometimes you need to keep it simple. So to extract all entries relating to, say, hdisk1, use the resource name:

errpt -N hdisk1

To extract all entries relating to ent0, try:

errpt -N ent0

If you want to view entries based on hardware or software, simply supply the error class. To view any hardware-related issues, for instance, use:

errpt -d H 

Similarly for software, which would include core dumps and shutdowns, use:

errpt -d S

For operator entries, which include notices, file system space issues and services that terminate:

errpt -d O

Another class, U (undetermined), covers events that don't fall into any other category.
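To see at a glance how the log splits across these classes, the summary listing can be piped through a small filter. This is only a sketch: count_by_class is a made-up name, and it assumes the class is the fourth whitespace-separated column and that the header line begins with IDENTIFIER.

```shell
# count_by_class: read an errpt summary listing on stdin and print
# one line per class (H, S, O, U) with the number of entries.
count_by_class() {
    # Skip the header line; the class letter is the fourth field.
    awk '$1 != "IDENTIFIER" && NF >= 4 { n[$4]++ }
         END { for (c in n) print c, n[c] }' | sort
}
```

On an AIX host it could be used as: errpt | count_by_class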

Don’t Report These Errors

There are occasions when errpt gets filled with notifications you don't really care about. You still want AIX to log them, just not report them. This could be due to a rush of notifications that you don't want reported until a certain issue has been fixed. To view the entries in the errpt template repository that currently have reporting disabled, use:

errpt -t -F Report=0

To view the current repository list containing the complete list of identifiers, labels, descriptions, etc., try:

errpt -t 

Consider a scenario where you wish to stop the reporting of events for a disk RAID array. The system repeatedly tries to rebuild, but you don't need AIX to keep telling you. To disable the reporting of the RAID rebuild, first obtain the identifier, FE7D0EED in this example, by listing the errpt repository. To disable reporting of that identifier:

# errupdate <hit return>
=FE7D0EED: <hit return>
Report=false 
<hit CTRL-D>
<hit CTRL-D>
0 entries added.
0 entries deleted.
1 entries updated.
#

In the output above, the “=” sign indicates that an existing entry is to be modified. The text also shows where you should hit return and CTRL-D in the interactive errupdate utility. To confirm that reporting was disabled, use the errpt -t -F Report=0 command. At some point, you'll want to re-enable this report. To do so:

# errupdate <hit return>
=FE7D0EED: <hit return>
Report=true
<hit CTRL-D>
<hit CTRL-D>
0 entries added.
0 entries deleted.
1 entries updated.
#
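The same stanza can be generated rather than typed. The sketch below only builds the text; report_stanza is a made-up helper name, and feeding its output to errupdate through a pipe is an assumption that errupdate treats piped input the same way as the interactive input shown above.

```shell
# report_stanza: print an errupdate stanza that switches reporting
# on (true) or off (false) for one error identifier.
report_stanza() {
    id=$1
    value=$2
    case $value in
        true|false) ;;
        *) echo "usage: report_stanza <identifier> true|false" >&2; return 1 ;;
    esac
    # "=" marks the stanza as a modification of an existing entry.
    printf '=%s:\nReport=%s\n' "$id" "$value"
}
```

Under that assumption, disabling the report would look like: report_stanza FE7D0EED false | errupdate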

Again, review the repository to check identifiers that have been disabled/enabled from reporting.

If Logging Stops

If your errlog stops logging/reporting events, chances are the log is full or corrupted. A quick fix is to start again with a fresh log file. First, stop the error daemon:

# /usr/lib/errstop

Next, remove the /var/adm/ras/errlog:

# rm /var/adm/ras/errlog

Restart it:

# /usr/lib/errdemon

You’re good to go. To view attributes relating to the errlog, use:

# /usr/lib/errdemon -l
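The stop, remove and restart steps above can be wrapped in one function. This is a sketch with two deliberate differences from the steps in the text: it moves the old log aside with mv instead of deleting it with rm, and it offers a dry-run mode. The names reset_errlog, DRY_RUN and ERRLOG are assumptions made for illustration.

```shell
# _run: execute a command, or just print it when DRY_RUN=1.
_run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# reset_errlog: stop the error daemon, move the log aside, restart.
# ERRLOG is a hypothetical override of the default log path.
reset_errlog() {
    log=${ERRLOG:-/var/adm/ras/errlog}
    _run /usr/lib/errstop &&
    _run mv "$log" "$log.old" &&
    _run /usr/lib/errdemon
}
```

Running it first as DRY_RUN=1 reset_errlog prints the three commands without touching the system.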