One needs to understand the OS, networking, hardware, RDBMS, and clustering to tune the performance of any ERP product. SAP provides very good transaction codes for monitoring and troubleshooting all of these components. Familiarity with the components helps in understanding the transaction code descriptions, which in turn helps in managing an SAP system more efficiently.
When analyzing the root cause of a performance issue, first check whether CPU usage is close to 100%.
- If the CPU is not close to 100%, monitoring tools such as Toad, Foglight, and Grid Control will help in resolving the performance issue. If the CPU is at 100%, these tools will not help because
1. Working with them becomes very time consuming, or
2. The saturated CPU may prevent these tools from connecting at all
In that case, native tools such as SQL*Plus, niping, netstat, sar, vmstat, mpstat, top, ps, iostat, msprot, ensmon, dbmon, and msmon will help.
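The decision above can be sketched in a few lines of Python. This is only an illustration: the tool names come from the text, while the 95% cut-off for "close to 100%" and the function names are my assumptions, and the idle percentage would come from something like the idle column of sar or vmstat.

```python
# Tool groups named in the text above.
GUI_TOOLS = ["Toad", "Foglight", "Grid Control"]
NATIVE_TOOLS = ["sqlplus", "niping", "netstat", "sar", "vmstat",
                "mpstat", "top", "ps", "iostat"]


def cpu_busy_percent(idle_percent: float) -> float:
    """Convert an idle percentage into a busy percentage."""
    return 100.0 - idle_percent


def pick_tools(idle_percent: float, saturation_threshold: float = 95.0) -> list:
    """Return the tool group to start with.

    saturation_threshold is an assumed value for "close to 100%";
    adjust it to whatever your site treats as saturated.
    """
    if cpu_busy_percent(idle_percent) >= saturation_threshold:
        return NATIVE_TOOLS
    return GUI_TOOLS


if __name__ == "__main__":
    print(pick_tools(40.0))  # CPU 60% busy: the heavier monitoring tools are usable
    print(pick_tools(2.0))   # CPU 98% busy: fall back to native tools
```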
Although Oracle provides features such as ADDM and SQL Tuning Advisor for performance tuning, it is better to understand how the optimizer itself works. In one real case, ADDM and SQL Tuning Advisor helped me tune database batch jobs, reducing the run time from 14 hours to 3 hours. Further manual tuning (using materialized views and SQL hints) reduced it to 30 minutes.
Certain databases (for example in the telecom industry) are very sensitive and need immediate attention when a critical performance issue occurs. The cause can be a table buffer or lock issue (SAP), latches, locks, or bad queries (Oracle), a disk I/O queue issue (operating system, > 30%), a message server that does not come up in an MSCS environment after the MSCS resource switchover (the message server port has not been released by the old active host; check with netstat -a), or the MSCS cluster freezing at the shared disk level (cluster level, quorum disks).
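As an illustration of the port check mentioned above, here is a minimal Python sketch that scans netstat output for a given port. The sample output, the IP addresses, and port 3600 are made up for the example; use the message server port configured on your system. It also assumes numeric, Windows-style output (for example from netstat -an), since plain netstat -a may show service names instead of port numbers.

```python
# Hypothetical sample of numeric netstat output; not taken from a real system.
SAMPLE = """
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
  TCP    10.0.0.5:3600          10.0.0.9:51234         TIME_WAIT
  TCP    10.0.0.5:3389          0.0.0.0:0              LISTENING
"""


def port_entries(netstat_output: str, port: int) -> list:
    """Return (local_address, state) for every TCP line bound to `port`."""
    entries = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and anything that is not a TCP row.
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        local, state = fields[1], fields[3]
        if local.rsplit(":", 1)[-1] == str(port):
            entries.append((local, state))
    return entries


if __name__ == "__main__":
    # Any entry still listed on the old active host means the port is not yet free.
    print(port_entries(SAMPLE, 3600))
```

Run against the old active host, an empty result suggests the port has been released; any remaining entry is worth a closer look before restarting the message server.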
In general, use the sequence below as the starting point for any troubleshooting.
Level 1 –> OS
Level 2 –> Network
Level 3 –> Oracle
Level 4 –> SAP
So root cause analysis runs from level 1 to level 4 (possibly more if additional layers are in use, for example a cluster).
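The level sequence can be sketched as a small Python loop that walks the layers in order and collects whatever each check reports. The check functions and their findings below are hypothetical placeholders; in practice each one would wrap a real OS, network, Oracle, or SAP check.

```python
# Troubleshooting order from the list above.
LEVELS = ["OS", "Network", "Oracle", "SAP"]


def walk_levels(checks: dict) -> list:
    """Run the checks level by level and collect findings in order.

    `checks` maps a level name to a function that returns a short
    description of a problem, or None when the level looks healthy.
    """
    findings = []
    for number, level in enumerate(LEVELS, start=1):
        check = checks.get(level)
        if check is None:
            continue  # no check registered for this level
        finding = check()
        if finding:
            findings.append((number, level, finding))
    return findings


if __name__ == "__main__":
    # Hypothetical checks, for illustration only.
    example_checks = {
        "OS": lambda: "disk I/O queue high",
        "Network": lambda: None,
        "Oracle": lambda: "index rebuild running",
        "SAP": lambda: None,
    }
    for number, level, finding in walk_levels(example_checks):
        print(f"Level {number} ({level}): {finding}")
```

All levels are walked rather than stopping at the first finding, because a symptom at one level often has its cause at another.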
From my experience at one client (telecommunications), calls were being dropped during peak hours. I started troubleshooting at level 1 and found the disk I/O queue was > 25%. At level 3 (Oracle) I found that one user was rebuilding a huge index, which caused the disk I/O to peak. I had to kill that session (dropped calls mean lost revenue) to bring the I/O back to normal, < 10%. This fixed the call drop issue. A disk I/O queue above 20-25% is expensive for any sensitive database; non-sensitive databases can tolerate more (even > 40%). So never allow expensive operations during peak hours, and always monitor such operations.
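The thresholds above can be written down as a simple check. This is a rough sketch: the limits come from the figures quoted above, but using 20% as the hard limit for sensitive databases and 40% for non-sensitive ones is my reading of them, and the function name is made up.

```python
# Assumed limits, based on the figures quoted in the text.
SENSITIVE_LIMIT = 20.0      # lower end of the 20-25% range
NON_SENSITIVE_LIMIT = 40.0  # one reading of the 40% figure


def io_queue_needs_attention(io_queue_pct: float, sensitive: bool) -> bool:
    """Return True when the disk I/O queue percentage exceeds the limit."""
    limit = SENSITIVE_LIMIT if sensitive else NON_SENSITIVE_LIMIT
    return io_queue_pct > limit


if __name__ == "__main__":
    print(io_queue_needs_attention(25.0, sensitive=True))   # telecom-style database
    print(io_queue_needs_attention(25.0, sensitive=False))  # less critical database
```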
A few useful transaction codes that help troubleshoot the root cause of a performance issue
—————————————————
Level 1 (OS)
————
AL11 – SAP Directories
OS02 – OS configuration
OS06/ST06 – OS monitor
OSS7 – SAPOSCOL Targets
SM50 – Process Overview
Level 2 (Network)
—————–
OS01 – LAN Check via ping
Level 3 (Oracle)
—————
DBACOCKPIT – DBA Cockpit: configuration and maintenance (a wonderful Tcode)
DB01 – Oracle Lock monitor
DB02 – Database Performance
DB03 – Parameter changes at the database level
DB12 – Backup logs
DB17 – Database check condition
DBCO – Database connections
ST04N – Database Performance Monitor
DB20 – Table statistics
DB16 – Database checks
DB14 – DBA log display
Level 4 (SAP)
————
AL12 – Buffer monitoring
ST01 – System Trace
ST02 – SAP Memory Monitor
ST03N – System Load monitor
ST05 – Performance Analysis
ST07 – Application Monitor
ST10 – Table Access Statistics
SM50 – Process Overview
SM12 – Lock entries
SM59 – RFC Connections
SMICM – ICM Monitor
SMMS – Message server monitor
Finally, CCMS