Template Check Procedure
This page lists the checks you should perform as a system administrator on your systems. Suggestions are welcome if I've forgotten anything, as are tips on how to handle these checks. The goal is to write a best-practice procedure. This is not a complete list; it covers the checks I do for my company and did for previous companies, supplemented with the checks my (ex-)colleagues do.
The first part covers the checks themselves: what do you check, and how frequently? The second part is how to turn this into a procedure. That part is especially important if your company is seeking certification or wants to prepare for it.
Checks
What do you check and how frequently?
Weekly Checks
Considerations:
- You should always check the backup. When a backup is not necessary, the backup line should state: “Backup: not necessary.”
- You should always check the UPS communication
- Antivirus (AV) should always be checked on Windows machines, but can be skipped on *nix
Linux
<servername>
- Syslog (/var/log/messages)
- See SYSLOG
- Uptime
- Diskspace % Free
- /
- Data volume
- Backup
- UPS
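The disk space and uptime items above can be automated. The following is a minimal sketch, not a prescribed tool: the function names and the 10% warning threshold are my own assumptions, and the uptime helper assumes a Linux-style /proc/uptime whose first field is the uptime in seconds.

```python
import shutil


def percent_free(path):
    """Return the percentage of free space on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def parse_uptime_days(proc_uptime_text):
    """Convert the text of /proc/uptime (first field = seconds up) to days."""
    seconds = float(proc_uptime_text.split()[0])
    return seconds / 86400.0


def disk_report(paths, warn_below=10.0):
    """Return one report line per path; flag paths under the threshold.

    warn_below is a hypothetical threshold in percent, not from the checklist.
    """
    lines = []
    for path in paths:
        free = percent_free(path)
        status = "WARNING" if free < warn_below else "ok"
        lines.append(f"{path}: {free:.1f}% free [{status}]")
    return lines


if __name__ == "__main__":
    # "/" comes from the checklist; add the data volume's mount point as needed.
    for line in disk_report(["/"]):
        print(line)
```

To include the uptime, read /proc/uptime and pass its contents to parse_uptime_days, then paste both results into the weekly check sheet for the server.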
AIX
<servername>
- Syslog (!not enabled by default!)
- See SYSLOG
- Uptime
- Diskspace % Free
- /
- /data
- Backup
- UPS
NetWare
<servername>
- Server health
- Remote Manager
- Uptime
- monitor.nlm
- Diskspace
- SYS
- DATA1
- Backup
- UPS
Windows
<servername>
- Event Viewer
- Services
- Uptime
- Diskspace
- C:\
- D:\
- Backup
- UPS
Applications and Services
Linux eDirectory
- eDirectory synchronisation
- ./dsrmenu.sh
- ndsstat
NetWare eDirectory
- eDirectory synchronisation
- iMonitor
Groupwise
- GW errors to GW administrator
- GW Check Reports
- POA
- GW log files
- POA
- MTA
- WebAccess
- GWIA
- backup
Oracle
- Check the /var/log/oracle/<version>/<oraclesid>/bdump/alert_<oraclesid>.log
- Errors in this logfile start with “ORA-”
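Scanning the alert log by hand gets tedious, so here is a small sketch of how the search for “ORA-” could be scripted. The function names are hypothetical, and the pattern matches an ORA- code anywhere in a line (slightly broader than "starts with"), which is an assumption on my part.

```python
import re

# An Oracle error code: "ORA-" followed by digits, e.g. ORA-00600.
ORA_PATTERN = re.compile(r"ORA-\d+")


def find_ora_errors(lines):
    """Return (line_number, text) for every line that contains an ORA- code."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if ORA_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_alert_log(path):
    """Open the alert log at `path` and return its ORA- lines."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, errors="replace") as handle:
        return find_ora_errors(handle)
```

Call scan_alert_log with the full path of your own alert_<oraclesid>.log; an empty result means no ORA- lines were found in the file.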
Miscellaneous
Local data
I like to work locally since it's faster, so I shouldn't forget to copy my data to the network:
- Create backup local data
- Copy local data to network
Websites
Websites need special attention as well:
- Create a backup of the website
- Check comments on the website (if possible)
- Check email from the website
Documentation
I'm a documentation freak so I say, once a week:
- Check that all of the week's work is documented
- Check documentation from colleagues
Really Miscellaneous
- As a system administrator you have to deal with prejudices, so:
- Clean up your desk once a week
- Read and reply to your email
- Update the status of your open incidents/issues/tickets/calls
Monthly Checks
Backup
- Backup data paths in nodes
- Note the date when you last did this
AntiVirus
- Make sure all clients and the management server are up to date
- Note the date when you last did this
Half Yearly Checks
Backup
- Check backup overview and procedure
- See Backup Procedure
- Have all responsible key employees recheck it as well
- Note the date when you last did this
Applications and Services
- eDirectory health check
Yearly Checks
Backup
- Check the yearly backup
- Check if there's been a full restore this year; if not, perform one
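Since several of the checks say to note the date when you last did them, those dates can also tell you what is overdue. The sketch below is one possible way to do that; the interval lengths in days and the record layout are my own assumptions, not part of the procedure.

```python
from datetime import date, timedelta

# Assumed maximum age per frequency; adjust to your own procedure.
INTERVALS = {
    "weekly": timedelta(days=7),
    "monthly": timedelta(days=31),
    "half-yearly": timedelta(days=183),
    "yearly": timedelta(days=366),
}


def overdue(last_done, today):
    """Return the (frequency, check) pairs that are past their interval.

    last_done maps (frequency, check name) to the date the check was
    last performed, or None if it has never been done.
    """
    late = []
    for (frequency, name), done in last_done.items():
        if done is None or today - done > INTERVALS[frequency]:
            late.append((frequency, name))
    return sorted(late)


if __name__ == "__main__":
    # Hypothetical example records, for illustration only.
    records = {
        ("monthly", "AntiVirus"): date(2024, 1, 1),
        ("yearly", "Full restore"): None,
    }
    for frequency, name in overdue(records, date.today()):
        print(f"Overdue ({frequency}): {name}")
```

A check that has never been done counts as overdue, which fits the yearly rule above: if there has been no full restore, perform one.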