AIX Errpt - Diag - Alog

ERROR LOGGING:

The errdemon is started during system initialization and continuously monitors the special file /dev/error for new entries sent by either the kernel or by applications. The label of each new entry is checked against the contents of the Error Record Template Repository, and if a match is found, additional information about the system environment or hardware status is added. The errdemon process allocates a memory buffer, and newly arrived entries are placed in this buffer before they are written to the log, to minimize the chance of losing an entry. The error log is a circular, binary-format file that stores as many entries as fit within its defined size. Its default location is /var/adm/ras/errlog.

The name and size of the error log file and the size of the memory buffer may be viewed with the errdemon command:

# /usr/lib/errdemon -l

Log File                /var/adm/ras/errlog
Log Size                1048576 bytes
Memory Buffer Size      32768 bytes
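
When the log size has to be checked from a script (for example before enlarging the log), the value can be pulled out of this output. A minimal sketch, assuming the three-line output layout shown above; the function name errlog_size is made up for illustration:

```shell
# Hypothetical helper: read "errdemon -l" output on stdin and print the
# log size in bytes (third field of the "Log Size" line).
errlog_size() {
    awk '/^Log Size/ { print $3 }'
}

# Intended use on an AIX host:
#   /usr/lib/errdemon -l | errlog_size
```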

------------------------------

/usr/lib/errdemon               starts (or restarts) the error logging daemon
/usr/lib/errstop                stops the error logging daemon started by the errdemon program
/usr/lib/errdemon -l            shows information about the error log file (path, size)
/usr/lib/errdemon -s 2000000    changes the maximum size of the error log file (in bytes)

errpt                           retrieves the entries in the error log
errpt -a -j AA8AB241            shows detailed info about the error (-a: detailed output, -j: error identifier)
errpt -s 1122164405 -e 1123100405
                                shows the error log for a time period (-s start date, -e end date, format: mmddhhmmyy)
errpt -d H                      shows hardware errors (errpt -d S: software errors)
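
errpt expects the -s and -e values in mmddhhmmyy form (month, day, hour, minute, two-digit year), which is easy to mistype because the year comes last. A small sketch that converts a readable date into that form; the function name to_errpt_stamp and the "YYYY-MM-DD HH:MM" input layout are assumptions chosen for illustration:

```shell
# Hypothetical helper: convert "YYYY-MM-DD HH:MM" to errpt's mmddhhmmyy form.
to_errpt_stamp() {
    echo "$1" | awk '{
        split($1, d, "-")      # d[1]=year, d[2]=month, d[3]=day
        split($2, t, ":")      # t[1]=hour, t[2]=minute
        printf "%s%s%s%s%s\n", d[2], d[3], t[1], t[2], substr(d[1], 3, 2)
    }'
}

# Intended use on an AIX host:
#   errpt -s "$(to_errpt_stamp '2005-11-22 16:44')" -e "$(to_errpt_stamp '2005-11-23 10:04')"
```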

Error Classes:
    H: Hardware
    S: Software
    O: Operator
    U: Undetermined

Error Type:
    P: Permanent - unable to recover from the error condition
       Pending - the device or component may become unavailable soon because of many errors
       Performance - the performance of the device or component has degraded below an acceptable level
    T: Temporary - recovered from the condition after several attempts
    I: Informational
    U: Unknown - the severity of the error cannot be determined
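
In the errpt summary listing, the type and class appear as single letters in the T and C columns (third and fourth fields). As a rough sketch, saved summary output can be filtered on those two columns. The name filter_errpt is made up for illustration. Because the letter P covers Permanent, Pending, and Performance alike, errpt -T PERM -d H is the more precise choice on a live system.

```shell
# Hypothetical helper: read errpt summary output on stdin and keep the header
# plus every line whose type (field 3) and class (field 4) match the arguments.
# $1 = type letter (P, T, I, U), $2 = class letter (H, S, O, U)
filter_errpt() {
    awk -v t="$1" -v c="$2" 'NR == 1 || ($3 == t && $4 == c)'
}

# Intended use on an AIX host:
#   errpt | filter_errpt P H
```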


Types of Disk Errors:
DISK_ERR1: the disk has been used heavily and should be replaced
DISK_ERR2: caused by loss of electrical power
DISK_ERR3: caused by loss of electrical power
DISK_ERR4: indicates bad blocks on the disk (if more than one entry is logged in a week, replace the disk)
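
The "more than one entry in a week" rule for DISK_ERR4 can be checked by counting entries per disk. A sketch assuming the usual errpt summary layout, where the resource name is the fifth field; the function name repeated_resources is hypothetical:

```shell
# Hypothetical helper: read errpt summary output on stdin, skip the header,
# and print "resource count" for every resource that appears more than once.
repeated_resources() {
    awk 'NR > 1 { n[$5]++ }
         END { for (r in n) if (n[r] > 1) print r, n[r] }' | sort
}

# Intended use on an AIX host (start stamp in mmddhhmmyy form, one week back):
#   errpt -J DISK_ERR4 -s <start> | repeated_resources
```

The -J flag of errpt selects entries by error label, so only DISK_ERR4 lines reach the counter.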


errclear                  deletes entries from the error log (smitty errclear)
errclear 7                deletes entries older than 7 days (0 clears all messages)
errclear -j CB4A951F 0    deletes all the messages with the specified ID
errlogger                 logs an operator message to the system error log
                          (errlogger "This is a test message")


------------------------------

Mail notification via errpt and errnotify

AIX has an Error Notification object class (errnotify) in the Object Data Manager (ODM). An errnotify object is a "hook" into the error logging facility that causes a program to be executed whenever a matching error is recorded. A number of errnotify entries are predefined, and each time an error is logged, the error daemon checks whether the new entry matches the criteria of any Error Notification object.

0. make sure mail sending is working correctly from the server
1. create a text file (e.g. /tmp/errnotify.txt), which will be added to ODM


Add the lines below if you want notifications on all kinds of errpt entries:

errnotify:
  en_name = "mail_all_errlog"
  en_persistenceflg = 1
  en_method = "/usr/bin/errpt -a -l $1 | mail -s \"errpt $9 on `hostname`\" aix4adm@gmail.com"
        <--specify the email address here


Add the below lines if you want notifications on permanent hardware entries only:

errnotify:
  en_name = "mail_perm_hw"
  en_class = H
  en_persistenceflg = 1
  en_type = PERM
  en_method = "/usr/bin/errpt -a -l $1 | mail -s \"Permanent hardware errpt $9 on `hostname`\" aix4adm@gmail.com"



2. root@bb_lpar: / # odmadd /tmp/errnotify.txt                                 <--add the content of the text file to ODM
3. root@bb_lpar: / # odmget -q en_name=mail_all_errlog errnotify               <--check if it is added successfully
4. root@bb_lpar: / # errlogger "This is a test message"                        <--check mail notification with a test errpt entry

You can delete the added errnotify object if it is no longer needed:
root@bb_lpar: / # odmdelete -q 'en_name=mail_all_errlog' -o errnotify
0518-307 odmdelete: 1 objects deleted.
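
If the same notification has to be rolled out to several servers, the stanza file can be generated rather than typed by hand. The sketch below only prints the stanza. The function name make_errnotify_stanza and the example address are assumptions, and the output still has to be loaded with odmadd as in step 2.

```shell
# Hypothetical helper: print an errnotify stanza that mails every new
# error log entry to the given address.
# $1 = stanza name (en_name), $2 = mail recipient
make_errnotify_stanza() {
    name=$1
    rcpt=$2
    printf 'errnotify:\n'
    printf '  en_name = "%s"\n' "$name"
    printf '  en_persistenceflg = 1\n'
    # $1 (sequence number) and $9 (error label) stay literal here;
    # they are filled in when the notify method is run.
    printf '  en_method = "/usr/bin/errpt -a -l $1 | mail -s \\"errpt $9 on `hostname`\\" %s"\n' "$rcpt"
}

# Intended use on an AIX host (admin@example.com is a placeholder):
#   make_errnotify_stanza mail_all_errlog admin@example.com > /tmp/errnotify.txt
#   odmadd /tmp/errnotify.txt
```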

(source: http://www.kristijan.org/2012/06/error-report-mail-notifications-with-errnotify/)

--------------------------------------------------------------------------------------------

DIAGRPT: (DIAG logs reporter)

diagrpt                   Displays previous diagnostic results
cd /usr/lpp/diag*/bin
    ./diagrpt -r          Displays the short version of the Diagnostic Event Log
    ./diagrpt -a          Displays the long version of the Diagnostic Event Log



--------------------------------------------------------------------------------------------

ALOG:

/var/adm/ras             this directory contains the master log files (alog command can read these files)
                         e.g. /var/adm/ras/conslog

alog -L                  shows what kinds of logs there are (console, boot, bosinst...); these types can be used with: alog -ot <type>
alog -Lt <type>          shows the attributes of a type (console, boot ...): size, path to logfile...
alog -ot console         lists the messages that were written to the console
alog -ot boot            shows the bootlog
alog -ot lvmcfg          lvm log file, shows what lvm commands were used (alog -ot lvmt: shows lvm commands and libs)


--------------------------------------------------------------------------------------------
