Tuesday, April 12, 2011

What to do at the moment of an incident

Follow the steps below and try to collect as much evidence as
possible:

1) Analyse from the Operating System side:
a. Is it possible to connect to the OS level?
b. Is there any CPU, I/O, or memory bottleneck (check with OS tools)?
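
As a rough illustration of the OS-side check, the sketch below appends a timestamped snapshot of basic load figures to a file. The function name os_snapshot and the tool list (uptime, vmstat, iostat, free) are assumptions based on a typical Linux host, not part of the original procedure; use whatever your platform provides.

```shell
#!/bin/sh
# Hypothetical helper: append a snapshot of basic OS load figures to a file.
# The tool list is an assumption (typical Linux); adjust it for your platform.
os_snapshot() {
    out="$1"
    {
        echo "=== OS snapshot $(date '+%Y-%m-%d %H:%M:%S') ==="
        for tool in "uptime" "vmstat 1 3" "iostat" "free -m"; do
            name=${tool%% *}            # first word is the program name
            # Run each tool only if it is installed on this host.
            if command -v "$name" >/dev/null 2>&1; then
                echo "--- $tool ---"
                $tool 2>&1 || true      # keep going even if one tool fails
            else
                echo "--- $tool: not available ---"
            fi
        done
    } >> "$out"
}

# Example: os_snapshot /tmp/incident_os.log
```

Running it a few times during the incident gives a simple before/after comparison of CPU, I/O, and memory load.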


2) Collect the queue status of the SAP dialog instance:
a. dpmon pf=<instance profile> l
b. sapcontrol -nr <NR> -function ABAPGetWPTable

Capture the queue status of the instance for about 10 to 15 minutes
to check whether the work processes change status during that time or
are really hung. For sapcontrol, you can use the option
"-repeat <N> <D>" (repeat the call <N> times (-1=forever) with <D> sec
delay); for dpmon, see the script in SAP Note 675778.
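
The repeated sampling described above can also be done with a small generic wrapper that puts a timestamp on every sample. This is only a sketch: the function name sample_cmd, the output path, and the instance number 00 in the example are hypothetical.

```shell
#!/bin/sh
# sample_cmd COUNT DELAY OUTFILE COMMAND [ARGS...]
# Hypothetical helper: run COMMAND COUNT times, DELAY seconds apart,
# appending each output to OUTFILE under a timestamped header.
sample_cmd() {
    count="$1"; delay="$2"; out="$3"
    shift 3
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "### sample $((i + 1)) at $(date '+%Y-%m-%d %H:%M:%S')" >> "$out"
        "$@" >> "$out" 2>&1 || echo "### exit code $?" >> "$out"
        i=$((i + 1))
        # No need to wait after the last sample.
        [ "$i" -lt "$count" ] && sleep "$delay"
    done
    return 0
}

# Example (instance number 00 is a placeholder):
# 90 samples, 10 seconds apart, covers about 15 minutes.
# sample_cmd 90 10 /tmp/wptable.log sapcontrol -nr 00 -function ABAPGetWPTable
```

Comparing consecutive samples in the log shows whether a work process stays in the same state for the whole period.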


3) Collect queue statistics:
a. dpmon pf=<instance profile> d
b. sapcontrol -nr <NR> -function GetQueueStatistic

4) Collect enqueue statistics (ASCS):
a. sapcontrol -nr <NR> -function EnqGetStatistic
(as of release 7.10)
b. transaction SM12 -> Extras -> Statistics


5) Collect process status of the instance:
a. sapcontrol -nr <NR> -function GetProcessList

If you are not sure where the problem is, collect this information
for all available instances.
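
Collecting the process list for several instances can be scripted as a simple loop. The sketch below is an assumption about how one might do it: the function name collect_process_lists, the SAPCONTROL variable, and the instance numbers in the example are hypothetical. The exit code is recorded rather than treated as a failure, because sapcontrol may return a non-zero code even when the call itself worked.

```shell
#!/bin/sh
# Command used for collection; overridable so the loop can be dry-run.
SAPCONTROL="${SAPCONTROL:-sapcontrol}"

# collect_process_lists OUTDIR NR [NR...]
# Hypothetical helper: save the GetProcessList output of each instance
# number into its own file under OUTDIR.
collect_process_lists() {
    outdir="$1"; shift
    mkdir -p "$outdir" || return 1
    for nr in "$@"; do
        file="$outdir/processlist_$nr.txt"
        rc=0
        "$SAPCONTROL" -nr "$nr" -function GetProcessList > "$file" 2>&1 || rc=$?
        # Record the exit code next to the output instead of aborting.
        echo "exit code: $rc" >> "$file"
    done
    return 0
}

# Example (instance numbers are placeholders):
# collect_process_lists /tmp/incident 00 01 02
```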


6) Collect the gateway statistics:
a. gwmon


7) Increase the trace level of the work process:
a. kill -USR2 <PID>
b. sapntkill -USR2 <PID> (for Windows environments)

You can get the PID of the work process from SM50, dpmon, or sapcontrol
(see item 2). Each signal raises the trace level of the dispatcher/work
process by one step, so more detail is written to its trace file. Send
the signal twice to the PID.
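
Sending the signal twice can be wrapped in a small function so that the action also leaves a timestamp. This is a sketch only: the function name raise_trace_level and the KILL_CMD variable are hypothetical, and the one-second pause between signals is an assumption meant to keep the two signals from being merged.

```shell
#!/bin/sh
# Signal command; set KILL_CMD=sapntkill on Windows hosts (see item 7b).
KILL_CMD="${KILL_CMD:-kill}"

# raise_trace_level PID [TIMES]
# Hypothetical helper: send USR2 to PID, TIMES times (default 2).
raise_trace_level() {
    pid="$1"; times="${2:-2}"
    n=0
    while [ "$n" -lt "$times" ]; do
        "$KILL_CMD" -USR2 "$pid" || return 1
        n=$((n + 1))
        # Short pause so the signals are handled one at a time.
        sleep 1
    done
    echo "$(date '+%Y-%m-%d %H:%M:%S') sent USR2 x$times to PID $pid"
}

# Example (PID 12345 is a placeholder taken from SM50, dpmon, or sapcontrol):
# raise_trace_level 12345
```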


8) Always SAVE the work folder with the traces:
You can find the instance work folder in
/usr/sap/<SID>/<instance>/work
For every incident, save the work folder of the affected instance
before the restart, or as soon as possible after the startup.
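
One way to save the work folder is to pack it into a timestamped archive, as sketched below. The function name save_workdir and the destination directory are hypothetical, and the use of tar with gzip is an assumption. Because trace files may still be written while the archive is created, a non-zero tar exit code is reported but does not discard the archive.

```shell
#!/bin/sh
# save_workdir WORKDIR DESTDIR
# Hypothetical helper: pack WORKDIR into DESTDIR/work_<timestamp>.tar.gz
# and print the path of the archive.
save_workdir() {
    workdir="$1"; destdir="$2"
    [ -d "$workdir" ] || { echo "no such directory: $workdir" >&2; return 1; }
    mkdir -p "$destdir" || return 1
    stamp=$(date '+%Y%m%d_%H%M%S')
    archive="$destdir/work_$stamp.tar.gz"
    # -C makes the archived paths relative to the parent of the work folder.
    tar -czf "$archive" -C "$(dirname "$workdir")" "$(basename "$workdir")" \
        || echo "tar exit code $? (files may have changed while being read)" >&2
    echo "$archive"
}

# Example (the destination is a placeholder; use your own work folder path):
# save_workdir "$WORK_FOLDER" /tmp/incident
```

The timestamp in the file name also documents when the copy was taken, which helps with item 9.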


9) Record the timestamps (date and time) of all events.


10) Collect the SAP instance parameters:
a. sapcontrol -nr <NR> -function ParameterValue
b. sappfpar all pf=LOL_DVEBMGS01_lolserver
c. transaction rspfpar