Tuesday, 24 June 2014

CONCURRENT MANAGERS

Contents
Introduction
Concurrent Managers
Internal Concurrent Manager
Product Concurrent Managers
Control of the Concurrent Managers
Tuning Parameters
Install Issues
Passwords
Demo Database
Fresh Install or Upgrade
USDSOP failed during spawn of <Product>LIBR process
Migration
Startup of Concurrent Manager Failed
Are the Concurrent Managers Down?
The Concurrent Managers are Down
10.7 NT Concurrent Managers
Errors when Starting the Managers on NT

Concurrent Manager Tables
FND_CONCURRENT_PROCESSES
FND_CONCURRENT_REQUESTS
FND_CONCURRENT_QUEUES


Introduction

Concurrent Managers are among the most important, and often the most feared, processes within Oracle Applications. When a customer cannot get a concurrent manager up and all of their production or testing is backing up, it is a very stressful time. This paper will try to remove some of the mystery and ease some of your fears. It will NOT go over every possible problem, and it will only cover the concepts for UNIX, not VMS or NT; there are other white papers that cover most of the known errors. This paper will also cover some of the basic tuning aspects of concurrent managers, though NOT system-tuning parameters. Finally, it will go over the three most important tables for the concurrent managers, covering some of the most common problems and our solutions for them.

Concurrent Managers

Internal Concurrent Manager

Concurrent Managers are composed of two different types of managers. One type is the Internal Concurrent Manager (ICM). The ICM controls all of the other concurrent managers; one of its most important jobs is to ensure that the other concurrent managers are up and running. It does this each time it wakes up, which is normally every 20 minutes (discussed later in this paper). The ICM also applies any incompatibility rules, i.e. cases where one program cannot run with another; starting in 10.7 this can be done by the Conflict Resolution Manager. The other type of concurrent manager is the product concurrent manager (this is my phrasing).
The ICM logfiles will be located in either $FND_TOP/$APPLLOG or $APPLCSF/$APPLLOG ($APPLLOG is normally set to log). The file name is normally "std.mgr" but the name could also be "SID.mgr". When starting up the ICM, a parameter called "mgrname" can be used to rename the logfile.
APPLCSF is an optional parameter that can come in handy. The reasoning for APPLCSF is that you may have multiple APPL_TOPs and want to keep all of your logfiles and output files in the same place. If APPLCSF is set, all of the files (logfiles and output) for your products will be placed in this directory. This is convenient but never necessary. APPLCSF is set within the environment file on APPL_TOP.
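As a quick way to resolve where the logs will land, a small shell sketch (assuming the APPL_TOP environment file has been sourced so $FND_TOP is set; $APPLCSF and $APPLLOG may or may not be):

```shell
# $APPLCSF takes precedence when set; $APPLLOG normally defaults to "log"
LOGDIR=${APPLCSF:-$FND_TOP}/${APPLLOG:-log}
echo "$LOGDIR"
# then, for example:  ls -lt "$LOGDIR"/*.mgr | head
```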

Product Concurrent Managers
During the course of this paper, product concurrent managers will mean all concurrent managers except the ICM (e.g. Standard, Inventory, MRP, PA, and any user-defined managers). These are the concurrent managers that run your jobs: concurrent requests, reports, and processes. If $APPLCSF is NOT set, the log and output files will be in each product top directory. All of the manager logs will be in either $FND_TOP/$APPLLOG or $APPLCSF/$APPLLOG ($APPLLOG is normally set to log).

Control of the Concurrent Managers
The ICM can be controlled from the operating system or within the applications.
This is the only manager that can be controlled from the operating system. When all of the concurrent managers have to be shut down, shutting down the ICM is the way to do it. When the ICM is shut down it will automatically shut down any running concurrent managers. Likewise, when the ICM is started it will restart any concurrent manager that was enabled (before shutdown) and had "target processes > 0" when the ICM was shut down. Again, this can be accomplished from either the O/S or within the applications. If a product manager or the Standard manager was deactivated prior to the ICM coming down, it will have to be restarted manually after the ICM comes back up.
In Char Mode (under SYSADMIN) go to: \Nav Con Man Admin
In GUI (under SYSADMIN) go to: Concurrent Manager Administrator
    • You need to query the concurrent manager record, by doing a "Query Run".
    • In the Control column either type 'Terminate' or 'Deactivate' or do a QuickPick and then save the screen.
From the command line the following will bring down the ICM:
10.6 and prior Releases
CONCSUB applsys/<password> SYSADMIN 'System Administrator' SYSADMIN CONCURRENT FND ABORT
For 10.7 CONCSUB apps/<password> SYSADMIN 'System Administrator' SYSADMIN CONCURRENT FND ABORT
The ICM can also be started from within the applications and from the command line. From the applications do the following:
In Char Mode (under SYSADMIN) go to: \Nav Con Man Admin
In GUI (under SYSADMIN) go to: Concurrent Manager Administrator
    • You need to query the concurrent manager record, by doing a "Query Run".
    • In the Control column either type 'Activate' or do a QuickPick and then save the screen.
From the command line the following will start the ICM. If you are experiencing problems, you can add "diag=y" to the end of the command line.
10.6 and prior Releases
startmgr sysmgr=applsys/<password> mgrname='SID'
For 10.7 startmgr sysmgr=apps/<password> mgrname='SID'
startmgr is the command; the sysmgr parameter is mandatory.
mgrname is used if you want to name your concurrent managers.
If this option is chosen, the ICM logfile will be <mgrname>.mgr.
diag=y will create a more verbose logfile.
This option is normally used to troubleshoot ICM problems.
Other parameters like sleep, pmon, quesiz are described on pg. 8-18 of the System Administration Reference Manual Release 10.
Tuning Parameters

Internal Concurrent Manager
There are three parameters that affect the performance of the ICM. These parameters are sleep time, pmon cycle, and queue size.
First, sleep time. The default is 60 seconds. This is the number of seconds the Internal Concurrent Manager waits between times it looks for new concurrent requests.
Second, PMON cycle. This is the number of sleep cycles the ICM waits between times it checks for failed concurrent managers. The default is 20.
Finally, queue size. This is the number of PMON cycles the ICM waits between times it checks for new or disabled concurrent managers. The queue size default is 1.
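With the defaults above, the effective intervals work out as follows (a worked example, not tool output):

```
sleep time = 60 s          -> scan for new requests every 60 s
pmon cycle = 20 sleeps     -> check for failed managers every 60 x 20 = 1200 s (20 min)
queue size = 1 pmon cycle  -> check for new/disabled managers every 1 x 20 = 20 min
```

This is why the ICM is described earlier as normally waking up every 20 minutes.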
The above parameters can be set during startup of the ICM. The pmon and quesiz parameters can also be set if the user does a 'Verify' on the ICM within the applications. The path is \Nav Con Man Admin

Product Managers
The sleep time for the product managers is set to the default of 60 seconds. This is the number of seconds the manager waits between checks of the list of pending concurrent requests. The sleep parameter is modified in the 'Work Shifts' zone of the 'Define Concurrent Manager' screen. \Nav Con Man Def
The last tuning parameter is called 'Cache Size'. The Cache Size is the number of concurrent requests that the manager picks up from the FND_CONCURRENT_REQUESTS table when it wakes up. For customer-defined managers the Cache Size is either blank or set to 0. This number should be set to twice the number of target processes: if the target processes are set to three (\N C M A), the Cache Size should be six.
The reasoning is that a manager with three target processes can run three jobs, but if it only picks up one job every 60 seconds then you are losing valuable processing time. If it picks up six jobs then the managers will be processing continuously. This parameter can ONLY be set through the GUI; it cannot be set from within character mode.
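To spot managers whose cache is below the suggested two-times-target rule, a sketch query against FND_CONCURRENT_QUEUES (column names per the 10.x data model; verify against your release):

```sql
SELECT concurrent_queue_name, max_processes, cache_size
  FROM fnd_concurrent_queues
 WHERE NVL(cache_size, 0) < 2 * max_processes;
```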

Install Issues

Passwords
In 10.7 the APPLSYS and APPS passwords MUST be the same. The user can change the default passwords, but both passwords MUST be the same; functionality will be lost if the passwords are different. You will be able to start the concurrent managers, but other items (i.e. adadmin, adpatch) will not work correctly. If the user decides to change the passwords they MUST follow the steps in GSX 1009719.6 - Resetting Applications Passwords (GSX is Oracle's Global Solution Exchange).

Demo Database
When installing the 10.7 demo database the user must be aware that they might need to update a couple of tables for the concurrent managers to come up properly.
The reasoning is that these are compressed .dbf files and the data contained within them is from another system. The following tables need to be cleaned up:
(This is how we clean up the FND tables.)
truncate table FND_CONCURRENT_REQUESTS;
truncate table FND_CONCURRENT_PROCESSES;
After the above have been truncated, start the ICM in diagnostics mode. This will give you a more verbose logfile.

10.7

startmgr sysmgr=apps/<password> diag=y

10.6 and prior

startmgr sysmgr=applsys/<password> diag=y
When the managers come up look at the logfile 'std.mgr'. On the first page you should see (if you used diagnostics):
Process Monitor (PMON) Method is LOCK
If you do NOT see LOCK, do the following. It is always a good idea to check with WorldWide Support before changing the PMON method.
    • Shutdown the ICM (wait until it comes down)
    • Go to $FND_TOP/sql/afimpmon.sql
    • Run the above script as applsys/<password>
    • Type the word 'dual' when it asks.
    • Type 'LOCK'
The above will set the PMON method to LOCK. Now restart the ICM.
The PMON method gives the customer three choices and they are:
OS = The internal concurrent manager determines whether or not a process is still running by sending a kill -0 to the process. This is merely a test to see if the process is still there. (THIS METHOD IS NO LONGER USED.)
RDBMS = The internal concurrent manager queries against v$process to retrieve the session id (the one that started the request) and the operating system process id. The reason it needs to query against both columns and not just the OS process id (pid) is because if there are concurrent managers running on two different clients against the same RDBMS then two requests could have the same OS pid.
LOCK = The internal concurrent manager tries to get a lock on the process it is monitoring. The name of the lock is determined by a sequence (which is the id of the individual manager (Standard, etc.)) and the program in question. Once the manager is able to get the lock, it knows the process is no longer running. Two installs in one RDBMS cannot share the LOCK method, because if the two managers get the same sequence id (one in each instance) the second manager will never be able to start; the ICM will always think it is already running.
(THIS IS THE RECOMMENDED METHOD FROM 10.6 ON.) Caution: If using DCP, you can only have one system using LOCK. The others must be set to RDBMS.
If you are still getting errors, check whether you have any invalid objects in your database that start with FND_CON% owned by APPS or APPLSYS. If you have invalid objects, recompile them; they could be causing the problem. If the packages will NOT recompile, they will need to be recreated. Please contact support for help finding and recreating the packages.
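A sketch of that check (the ALTER statement is illustrative only; the real package names come out of the query):

```sql
SELECT owner, object_name, object_type
  FROM dba_objects
 WHERE status = 'INVALID'
   AND object_name LIKE 'FND_CON%'
   AND owner IN ('APPS', 'APPLSYS');

-- recompile each invalid object reported, e.g. for a package body:
-- ALTER PACKAGE apps.<package_name> COMPILE BODY;
```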

Fresh Install or Upgrade
With a fresh install you will not have to clean up the concurrent manager tables (they have just been installed). Once you have tried to start the ICM, go to the logfiles and check for errors. You may find errors such as the following. Please note these are NOT all of the possible errors you could receive; for a more comprehensive list please see GSX 1011248.6 - What are the common concurrent manager issues?
If the manager fails to start, please go to that section:

USDSOP failed during spawn of <Product>LIBR process
The above normally happens when an application is installed as shared. The seed data has been placed into FND_CONCURRENT_QUEUES but there is no executable on the file system: the install has enabled that manager and the ICM cannot find the corresponding executable. The best example is if '<Product>LIBR' is getting the USDSOP error and <Product> was not installed or is installed as shared.
The fix for this problem is to set the target processes to 0 in 'Define Concurrent Managers' screen.
In Char Mode (under SYSADMIN) go to: \Nav Con Man Define
    • Go to the last zone on the page and pick 'Work Shifts'
    • Under target processes place a 0
    • Save the Screen
    • Go to \Nav Con Man Admin
    • On the ICM tab over to the 'Control field' and do a 'Verify'
In GUI (under SYSADMIN) go to: Concurrent Manager Define
    • Select 'Work Shifts'
    • Place a 0 in 'Processes'
    • Save the Screen
    • Go to Concurrent Manager Admin
    • Select 'Verify'
APP-01167 - Internal Concurrent Manager has encountered an error:

This is a generic error which may be followed by an Oracle error or:

APP-01054 - Distributed concurrent manager enabled, lock PMON method requested

APP-01114 - AFPCAL received failure code while running FNDCPMGR

Solution: Make certain that $APPLDCP is set to off or, if using distributed concurrent processing, ensure the PMON method is LOCK. (Only on primary node. Other nodes should use rdbms.)

Migration
During a migration you may find the same type of problems as in the demo install. The reasoning is that you are moving from one platform to another. If you have not moved your file system, the ICM may not come up; it is possible that the concurrent manager is looking for files that were on the other system. The fix may be as simple as cleaning up the concurrent manager tables.

Start Up of Concurrent Manager Failed
Now we are getting to the point where users might be breathing down your neck. You need to ask a few basic questions:
      1. Has anything changed, either on your system or in the applications?
      2. What errors are you receiving?
        1. APP-1167
        2. APP-1114
Are the Concurrent Managers Down? Sounds like a stupid question. BUT, from within your applications, does the number of target processes equal the number of actual processes? If it does, then technically your managers are not down. Now, from the operating system, do you see FNDLIBR processes when you do a:
ps -ef | grep FNDLIBR
Is the number of FNDLIBR processes equal to 1 + the number of target processes from within your applications?
Please Note: If you have more than one set of applications running on the system then you WILL get more back from your grep. But if these are your only applications, you should not get anything back but the grep itself.
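One way to count the FNDLIBR processes without counting the grep itself is the bracket trick (a sketch; adjust if more than one APPL_TOP shares the machine):

```shell
# '[F]NDLIBR' matches FNDLIBR processes but never the grep command itself
ps -ef | grep '[F]NDLIBR' | wc -l
# expect: 1 (the ICM) + the number of target processes across your managers
```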
If the managers are not down submit a concurrent request to bring them down:

CONCSUB applsys/<password> SYSADMIN 'System Administrator' SYSADMIN CONCURRENT FND ABORT

If this request only goes to Pending and never runs, then we will need to kill the concurrent managers from the command line.

The Concurrent Managers are Down
1) Let's go to the logfiles. Remember they are in either $APPLCSF/$APPLLOG or $FND_TOP/$APPLLOG.
2) Rename the logfile. The reasoning here is that the ICM only appends to the log, and it is much easier to start with a fresh log.
Normally this file is called "std.mgr", but if you name your concurrent managers by using the "mgrname" parameter while starting the ICM then the logfile will normally be "SID.mgr".
    • Let's reset the tables in sqlplus.
    • Logon as applsys/<password>
select count(*) from fnd_concurrent_requests where status_code='T';
    • You should get a count of 0. If you do get a count back, then delete those rows.
select count(*) from fnd_concurrent_requests where status_code='R';
    • You should get a count of 0. If you do get a count back, then delete those rows.
You can remove all of the completed requests - This will reduce the size of the FND_CONCURRENT_REQUESTS table.
    • Delete from FND_CONCURRENT_REQUESTS where status_code = 'C';
The Next update statements should never be done without the help of support or development. It is possible to cause data corruption.
    • Let's update FND_CONCURRENT_QUEUES: update FND_CONCURRENT_QUEUES set running_processes=0, max_processes=0;
    • The next one will remove all of the entries in the table.
truncate table FND_CONCURRENT_PROCESSES;
    • Go back to the operating system (O/S).
    • From the O/S, let's start the concurrent managers with diagnostics mode turned on:
10.7
startmgr sysmgr=apps/<password> diag=y

10.6 or prior

startmgr sysmgr=applsys/<password> diag=y
    • Now let's check the logfile for any errors. Errors can range from USDSOP to APP-????; we need to address those.
    • Now let's see if the managers have come up. This can be done from within the applications:
In GUI: Concurrent Manager Administer
In Char: \Nav Con Man Admin
    • If the managers have not come up and you do not have any errors, let's check for invalid objects.
    • In sqlplus as system/<password>
Select count(*) from dba_objects where status='INVALID';
    • If the number that comes back is less than 10, we need to know if any of them start with 'FND_'. If so, compile them and try to restart the ICM.
    • If the number is greater than 10 and less than 100, this can still be alright. Again, we need to find out if any of them start with 'FND_'. If so, compile them and try to start the ICM again.
    • If the number is greater than 100, we need to recompile the objects. Once that is complete, we can try the ICM again.
    • What is the PMON method? If you started the ICM with diagnostics then you will be able to tell what the PMON method is set to. If the ICM is up, you can also check the PMON method by running afimchk.sql.
Process Monitor (PMON) Method is LOCK
If you do NOT see LOCK, do the following. It is always a good idea to check with WorldWide Support before changing the PMON method.
    • Shutdown the ICM (wait until it comes down)
    • Go to $FND_TOP/sql/afimpmon.sql
    • Run the above script as applsys/<password>
    • Type the word 'dual' when it asks.
    • Type 'LOCK'
The PMON method gives the customer three choices and they are:
OS = The internal concurrent manager determines whether or not a process is still running by sending a kill -0 to the process. This is merely a test to see if the process is still there. (THIS METHOD IS NO LONGER USED.)
RDBMS = The internal concurrent manager queries against v$process to retrieve the session id (the one that started the request) and the operating system process id. The reason it needs to query against both columns and not just the OS pid is because if there are concurrent managers running on two different clients against the same RDBMS then two requests could have the same OS pid.
LOCK = The internal concurrent manager tries to get a lock on the process it is monitoring. The name of the lock is determined by a sequence (which is the id of the individual manager (Standard, etc.)) and the program in question. Once the manager is able to get the lock, it knows the process is no longer running. Two installs in one RDBMS cannot share the LOCK method, because if the two managers get the same sequence id (one in each instance) the second manager will never be able to start; the ICM will always think it is already running.
(THIS IS THE RECOMMENDED METHOD FROM 10.6 ON.)

10.7 NT Concurrent Managers:
There are a few interesting issues with concurrent managers on NT. The concurrent managers are services on NT; they have to be added and removed with the following commands.
TO Add: (pg. 5-9 NT Install Manual) C:\> cmsrvadm add <APPL_CONFIG> [automatic | manual]
Note: When adding the service you MUST type in the UserID and password. It does NOT default. If you do NOT type in the password, you will get an ORA-1017 error in the event viewer log.
To Remove: (pg. A-5 NT Install Manual) C:\> cmsrvadm remove <APPL_CONFIG>
Normally the <APPL_CONFIG> will be either APPLSYS or APPDEMO. One point: only the user who added the service will be allowed to remove it. After the service has been added, it needs to be started from 'Services' in the 'Control Panel'.

Errors when Starting the Managers on NT
Now, if you're like most people, at some point you'll get an error when starting the concurrent managers. The normal generic error from 'Services' goes something like this:
At the command line:
Could not start the OracleConcMgr<APPL_CONFIG> service on 'Machine Name'
Error 2140: An internal Windows NT error occurred.
(The above error doesn't say a whole lot. To get a better error message go to: Start - Programs - Administrative Tools - Event Viewer. Go to Log and click on Application.)
Find the Source of OracleConcMgr<APPL_CONFIG>
The Event Log Says:
The Description for Event ID (0) in Source [OracleConcMgr<APPL_CONFIG>] could not be found. It contains the following insertion string(s): Cannot connect to database. Sid=ORACLE_SID ORA error 12154.
The fix:
Add a loop-back in your tnsnames with the ORACLE_SID as the alias.
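Such a loop-back entry might look like the following sketch; the host, port, and SID shown are placeholders for your environment:

```
<ORACLE_SID> =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = <local_machine>)(PORT = 1521))
    (CONNECT_DATA = (SID = <ORACLE_SID>))
  )
```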
ANOTHER ERROR:

At the command line:

Service not started. Check the event log for more information.
Exit Code: 0
Service Specific Exit Code: 0

The Event Log Says:

The Description for Event ID (0) in Source [OracleConcMgr<APPL_CONFIG>] could not be found. It contains the following insertion string(s): OracleConcMgr<APPL_CONFIG> error 3: The system cannot find the path specified.
Cannot delete Concurrent Manager logfile.
%FND_TOP%\logdemo\CM_LOG. Please check that there are no processes running.

The Fix:

Check to ensure the directory 'LOGDEMO' is in FND_TOP. On NT, even if APPLCSF is set, the concurrent manager log files will still be written to FND_TOP/log.
ANOTHER ERROR:

At the command line:

Service not started. Check the event log for more information.
Exit Code: 0
Service Specific Exit Code: 0

The Event Log Says:

The Description for Event ID (0) in Source [OracleConcMgr<APPL_CONFIG>] could not be found. It contains the following insertion string(s): Cannot connect to database SID = <ORACLE_SID> ORA error 1017.

The Fix:

One of the following was typed incorrectly:
APPS schema's UserID
APPS schema's password
LAST ERROR I'LL LIST:

At the command line:

Failed to start Service OracleConcMgr<APPL_CONFIG>
Error: 1069 The service did not start due to a logon failure

The Event Log Says:

No event log for this error
The Fix:
The 'Windows NT user account password' was typed incorrectly.
Remove the service and add it back.
Concurrent Manager Tables
The following tables are the most important tables for the concurrent managers. These are not the only tables that have to do with the concurrent managers.

FND_CONCURRENT_PROCESSES
Stores information about concurrent managers. Each row includes values that identify the ORACLE process, the operating system process, and the concurrent manager (QUEUE_APPLICATION_ID and CONCURRENT_QUEUE_ID). You need one row for each instance of a running concurrent manager (each process), as well as one row for the Internal Concurrent Manager. AOL uses this table to keep a history of concurrent managers. You should never update this table manually.
Use the Purge Concurrent Request and/or Manager Data program to delete history information periodically.

FND_CONCURRENT_REQUESTS
Stores information about individual concurrent requests. Each row includes values that identify the particular request and its parameters, such as who submitted it, the request type, whether the request should run sequentially with other requests in the same logical database (SINGLE_THREAD_FLAG), whether the request is on hold (HOLD_FLAG), whether to display the request in the View Requests form for the request submitter to review, and what status and phase the concurrent request is in. Each row also includes values that identify the concurrent program, its execution and argument methods, and whether the program is constrained (QUEUE_METHOD_CODE). Each row also includes flags that indicate the request's priority relative to other requests, as well as values that specify how the concurrent manager should print program output, if any.
ARGUMENT1 through ARGUMENT25 contain any arguments the application passes to the concurrent program. If the concurrent program needs more than 25 arguments to run, the first 25 arguments are stored in this table and ARGUMENT26 through ARGUMENT100 are stored in FND_CONC_REQUEST_ARGUMENTS. ARGUMENT_TEXT contains the concatenation of the concurrent request's arguments, and COMPLETION_TEXT contains a message about how the request completed. The row also contains the dates the request was submitted, requested to start, and actually ran. REQ_INFORMATION is used with report sets to remember the status of the request between runs. When the request is set to use automatic resubmission, RESUBMITTED is a flag indicating whether the request has been resubmitted or not.
RESUBMIT_INTERVAL_TYPE_CODE specifies whether to start the interval countdown from the requested start time or from the completion of the request. RESUBMIT_INTERVAL_UNIT_CODE indicates whether the interval unit is Days, Hours, Minutes, or Months. RESUBMIT_TIME sets the time of day to rerun the concurrent request. RESUBMIT_INTERVAL indicates the number of units of time after which the identical request will be resubmitted. RESUBMIT_END_DATE is the date the request stops resubmitting itself. IS_SUB_REQUEST is a flag that identifies a child request and HAS_SUB_REQUEST is a flag that identifies a parent request. Each child request also needs values in PARENT_REQUEST_ID, to show which parent request submitted the child request, and PRIORITY_REQUEST_ID, to tell what priority the parent request has and what priority the child request should have. You need one row for each concurrent request. Though you should occasionally delete from this table, you should not modify any of its data.
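As an illustration of the phase and status columns, a hedged sketch that lists requests which have not yet completed ('C' in PHASE_CODE means Completed; verify the codes against your release):

```sql
SELECT request_id, phase_code, status_code,
       requested_start_date, actual_start_date
  FROM fnd_concurrent_requests
 WHERE phase_code <> 'C'
 ORDER BY requested_start_date;
```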

FND_CONCURRENT_QUEUES
Stores information about concurrent managers. Each row includes the name and description of a concurrent manager, as well as values that identify the program library attached to the manager. CACHE_SIZE contains the buffer size (how many requests a concurrent manager should "remember" each time it checks the list of waiting requests), and MAX_PROCESSES determines the maximum number of concurrent requests a manager can run at a time (depending on work shifts). The manager process automatically updates RUNNING_PROCESSES during startup and shutdown. Each row also includes the time in seconds the concurrent manager waits before checking for pending concurrent requests, and a code to activate, deactivate, or reset the manager. You need one row for each concurrent manager defined at your site, with a minimum of one row. AOL uses this information to activate concurrent managers.
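The columns described above can be inspected directly; a sketch query (column names per the 10.x data model; verify against your release):

```sql
SELECT concurrent_queue_name, max_processes target_processes,
       running_processes, sleep_seconds, cache_size
  FROM fnd_concurrent_queues;
```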

Wednesday, 21 May 2014

Things to check when TEMP TABLESPACE USAGE IS HIGH.


You need to identify the SQL statements using the temp tablespace over time, by executing the queries below over and over at some interval.

Please run the following at intervals - say every 10 minutes.


1) Temporary Tablespace Size

The following query displays information about usage of temporary tablespace
===
SELECT A.tablespace_name tablespace, D.mb_total,
SUM (A.used_blocks * D.block_size) / 1024 / 1024 mb_used,
D.mb_total - SUM (A.used_blocks * D.block_size) / 1024 / 1024 mb_free
FROM v$sort_segment A,
(
SELECT B.name, C.block_size, SUM (C.bytes) / 1024 / 1024 mb_total
FROM v$tablespace B, v$tempfile C
WHERE B.ts#= C.ts#
GROUP BY B.name, C.block_size
) D
WHERE A.tablespace_name = D.name
GROUP by A.tablespace_name, D.mb_total;
====

2) Temporary Segments

The following query displays information about all sort segments in the database. Typically, Oracle will create a new sort segment the very first time a sort to disk occurs in a new temporary tablespace. The sort segment will grow as needed, but it will not shrink and will not go away after all sorts to disk are completed. A database with one temporary tablespace will typically have just one sort segment.

===
spool /PRD02/prdora1/temporary_segments.html
set markup html on
alter session set nls_date_format='DD-MON-YYYY HH24:MI:SS';
select sysdate from dual;

SELECT A.tablespace_name tablespace, D.mb_total,
SUM (A.used_blocks * D.block_size) / 1024 / 1024 mb_used,
D.mb_total - SUM (A.used_blocks * D.block_size) / 1024 / 1024 mb_free
FROM v$sort_segment A,
(
SELECT B.name, C.block_size, SUM (C.bytes) / 1024 / 1024 mb_total
FROM v$tablespace B, v$tempfile C
WHERE B.ts#= C.ts#
GROUP BY B.name, C.block_size
) D
WHERE A.tablespace_name = D.name
GROUP by A.tablespace_name, D.mb_total;
set markup html off
spool off

===

3) Sort Space Usage by Session

The following query displays information about each database session that is using space in a sort segment. Although one session may have many sort operations active at once, this query summarizes the information by session.

===
spool /PRD02/prdora1/sort_space_usage_session.html
set markup html on
alter session set nls_date_format='DD-MON-YYYY HH24:MI:SS';
select sysdate from dual;
SELECT S.sid || ',' || S.serial# sid_serial, S.username, S.osuser, P.spid, S.module, S.program, SUM (T.blocks) * TBS.block_size / 1024 / 1024 mb_used, T.tablespace,
COUNT(*) sort_ops
FROM v$sort_usage T, v$session S, dba_tablespaces TBS, v$process P
WHERE T.session_addr = S.saddr
AND S.paddr = P.addr
AND T.tablespace = TBS.tablespace_name
GROUP BY S.sid, S.serial#, S.username, S.osuser, P.spid, S.module, S.program, TBS.block_size, T.tablespace
ORDER BY 8 desc;
set markup html off
spool off

===

4) Sort Space Usage by Statement

The following query displays information about each statement that is using space in a sort
segment. The query displays information about each statement using space in a sort segment,
including information about the database session that issued the statement and the temporary
tablespace and amount of sort space being used.

===
spool /PRD02/prdora1/sort_space_usage_statement.html
set markup html on
alter session set nls_date_format='DD-MON-YYYY HH24:MI:SS';
select sysdate from dual;
SELECT S.sid || ',' || S.serial# sid_serial, S.username,
T.blocks * TBS.block_size / 1024 / 1024 mb_used, T.tablespace,
T.sqladdr address, Q.hash_value, Q.sql_text
FROM v$sort_usage T, v$session S, v$sqlarea Q, dba_tablespaces TBS
WHERE T.session_addr = S.saddr
AND T.sqladdr = Q.address (+)
AND T.tablespace = TBS.tablespace_name
ORDER BY 4 desc;
set markup html off
spool off
===

Please note that executing the above SQL at regular intervals (during high temp space usage) will help you locate any particular SQL statement which chews up the temp space; that statement is then an ideal candidate for SQL tuning.

Wednesday, 30 April 2014

Reorganizing a Table using DBMS_REDEFINITION



Posted at Thursday, July 03, 2008
There are many methods to reorganize/defragment a table in Oracle. If you want to avoid downtime, however, your choices are limited to using the 'SHRINK SPACE' feature and/or employing the DBMS_REDEFINITION PL/SQL package.

This post demonstrates the latter method. The DBMS_REDEFINITION package is most commonly used for redefining a table, but it can also be used to reorganize a table, amongst other things. Here is a great article which covers the package in a little more detail: On-line Table Reorganization and Redefinition.

This example rebuilds the OE.ORDER_ITEMS table using DBMS_REDEFINITION.

Collect information about the table's segments prior to the reorg
ANALYZE TABLE oe.order_items COMPUTE STATISTICS;

SELECT avg_row_len, num_rows, chain_cnt

  FROM dba_tables

 WHERE table_name = 'ORDER_ITEMS' AND owner = 'OE';

SELECT ROUND (BYTES / 1024) kb

  FROM dba_segments

 WHERE owner = 'OE' AND segment_name = 'ORDER_ITEMS';

Check whether the table in question violates any redefinition restrictions
EXEC dbms_redefinition.can_redef_table('OE','ORDER_ITEMS', dbms_redefinition.cons_use_pk);

Create the Intermediary Table - in my case, I use the exact same structure
CREATE TABLE OE.ORDER_ITEMS_REDEF
(
  ORDER_ID      NUMBER(12),
  LINE_ITEM_ID  NUMBER(3)    NOT NULL,
  PRODUCT_ID    NUMBER(6)    NOT NULL,
  UNIT_PRICE    NUMBER(8,2),
  QUANTITY      NUMBER(8)
) TABLESPACE USERS;

Start the Redefinition Process
EXEC dbms_redefinition.start_redef_table('OE', 'ORDER_ITEMS', 'ORDER_ITEMS_REDEF');

You will notice that Oracle creates a Materialized View in order to store the table changes, etc.
SELECT COUNT (*)
  FROM oe.mlog$_order_items;

Sync up the Source table with the Intermediary table
EXEC dbms_redefinition.sync_interim_table('OE', 'ORDER_ITEMS', 'ORDER_ITEMS_REDEF');

Duplicate the table's dependent objects on the Intermediary table
CREATE INDEX oe.item_order_redef_ix ON oe.order_items_redef
  (order_id) LOGGING TABLESPACE users;

CREATE INDEX oe.item_product_redef_ix ON oe.order_items_redef
  (product_id) LOGGING TABLESPACE users;

CREATE UNIQUE INDEX oe.order_items_redef_pk ON oe.order_items_redef
  (order_id, line_item_id) LOGGING TABLESPACE users;

CREATE UNIQUE INDEX oe.order_items_redef_uk ON oe.order_items_redef
  (order_id, product_id) LOGGING TABLESPACE users;

ALTER TABLE oe.order_items_redef ADD (
  CONSTRAINT order_items_redef_pk
  PRIMARY KEY (order_id, line_item_id) USING INDEX TABLESPACE USERS);

ALTER TABLE oe.order_items_redef ADD (
  CONSTRAINT order_items_redef_order_id_fk
  FOREIGN KEY (order_id)
  REFERENCES oe.orders (order_id)
  ON DELETE CASCADE DISABLE NOVALIDATE);

ALTER TABLE oe.order_items_redef ADD (
  CONSTRAINT order_items_rdef_product_id_fk
  FOREIGN KEY (product_id)
  REFERENCES oe.product_information (product_id) DISABLE NOVALIDATE);

GRANT SELECT ON oe.order_items_redef TO bi;

GRANT SELECT ON oe.order_items_redef TO pm;

Notice how I create the foreign key constraints initially as disabled.

To simplify things, I use the following to duplicate any triggers to the Intermediary table
DECLARE
  retval NUMBER(5);
BEGIN
  dbms_redefinition.copy_table_dependents('OE', 'ORDER_ITEMS', 'ORDER_ITEMS_REDEF',
      copy_indexes => 0, copy_triggers => TRUE, copy_constraints => FALSE,
      copy_privileges => FALSE, num_errors => retval);
  dbms_output.put_line(retval);
END;
/

Disable any foreign key constraints on the Source table before finishing the table redefinition process
ALTER TABLE oe.order_items DISABLE CONSTRAINT order_items_order_id_fk;

ALTER TABLE oe.order_items DISABLE CONSTRAINT order_items_product_id_fk;

Finish the Redefinition
EXEC dbms_redefinition.finish_redef_table('OE', 'ORDER_ITEMS', 'ORDER_ITEMS_REDEF');

Drop the Intermediary Table
DROP TABLE oe.order_items_redef;

Rename any dependent objects
ALTER TABLE oe.order_items RENAME CONSTRAINT order_items_rdef_product_id_fk TO order_items_product_id_fk;

ALTER TABLE oe.order_items RENAME CONSTRAINT order_items_redef_order_id_fk TO order_items_order_id_fk;

ALTER TABLE oe.order_items RENAME CONSTRAINT order_items_redef_pk TO order_items_pk;

ALTER INDEX oe.item_order_redef_ix RENAME TO item_order_ix;

ALTER INDEX oe.item_product_redef_ix RENAME TO item_product_ix;

ALTER INDEX oe.order_items_redef_pk RENAME TO order_items_pk;

ALTER INDEX oe.order_items_redef_uk RENAME TO order_items_uk;

At this point, the table should be reorganized/defragmented. Optionally, compare the storage footprint post-redefinition:
ANALYZE TABLE oe.order_items COMPUTE STATISTICS;

SELECT avg_row_len, num_rows, chain_cnt
  FROM dba_tables
 WHERE table_name = 'ORDER_ITEMS' AND owner = 'OE';

SELECT ROUND (BYTES / 1024) kb
  FROM dba_segments
 WHERE owner = 'OE' AND segment_name = 'ORDER_ITEMS';

Tuesday, 29 April 2014

Use this pl/sql script to determine which indexes need defragmentation.

 
 
create or replace procedure system.my_indfrag (schema_in in varchar2)
as
        cursor ind_cur(schema varchar2) is
        (select owner, index_name from dba_indexes
         where owner = upper(schema));

        owner    varchar2(40);
        ind_name varchar2(30);

        cursor badness_cur is
        (select name, del_lf_rows/(lf_rows+0.00001) badness, height
         from system.ind_temp_table
         where del_lf_rows/(lf_rows+0.00001) > .15 or height > 5);

        bad_row badness_cur%rowtype;

begin
        /*
         * Assumes that you have created a table system.ind_temp_table
         * of the same structure as index_stats.
         */

        dbms_output.enable(1000000);
        delete from system.ind_temp_table;

        open ind_cur(schema_in);
        fetch ind_cur into owner, ind_name;
        while ind_cur%found
        loop
                <<analyze_block>>
                begin
                        execute immediate('analyze index ' || owner || '.' ||
                                '"' || ind_name || '"' || ' validate structure');
                        insert into system.ind_temp_table
                        (select * from index_stats);
                        commit;
                exception
                        when others then
                                dbms_output.put_line(ind_name || ':' ||
                                        chr(09) || sqlerrm);
                                rollback;
                end analyze_block;

                fetch ind_cur into owner, ind_name;
        end loop;
        close ind_cur;

        open badness_cur;
        fetch badness_cur into bad_row;
        if badness_cur%found then
                dbms_output.put_line(rpad('Index Name',30) || chr(09) ||
                        '% Del. Entries' || chr(09) || 'Height');
        else
                dbms_output.put_line('No indexes in this schema need rebuilding.');
        end if;
        while badness_cur%found
        loop
                dbms_output.put_line(rpad(bad_row.name,30) || chr(09) ||
                        rpad(to_char(round(bad_row.badness*100,0)) || '%',14) ||
                        chr(09) || bad_row.height);
                fetch badness_cur into bad_row;
        end loop;
        close badness_cur;

end;
/

Sample output

ASP>  exec my_indfrag('HR_DATA');
chairs_index1:  ORA-01418: specified index does not exist
ASP>  exec system.my_indfrag('HR_DATA');
Index Name                      % Del. Entries  Height
IDX_DISC_PERSON_APPT_VERIFIED   50%             2
IDX_DISC_PERSON_APPT_VERIFIED   58%           1
IDX_LEAVE_END_RPT               63%           2
IDX_LEAVE_START_RPT             60%           3
IDX_LEAVES_NAME_KEY             65%           1
TC_WORKGRPS_PK                  48%           1
TC_SYSTEM_ACCRUALS_PK           44%           2
TC_WORKGROUP_SETTINGS_PK        10%           6
 
PL/SQL procedure successfully completed.
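Once an index is flagged, a rebuild reclaims the deleted entries and can reduce the height. A minimal sketch, reusing one of the index names from the sample output above (ONLINE requires Enterprise Edition; drop the keyword otherwise):

```
ALTER INDEX hr_data.idx_leave_end_rpt REBUILD ONLINE;

-- Re-validate to confirm the deleted-entry ratio has dropped:
ANALYZE INDEX hr_data.idx_leave_end_rpt VALIDATE STRUCTURE;
SELECT name, del_lf_rows, lf_rows, height FROM index_stats;
```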

How to check blocking locks



What's blocking my lock?

If you've ever gotten a phone call from an annoyed user whose transaction just won't go through, or from a developer who can't understand why her application sessions are blocking each other, you know how useful it can be to identify not just whose lock is doing the blocking, but what object is locked. Even better, you can identify the exact row that a session is waiting to lock.

Create a blocking lock

To begin, create a situation where one user is actively blocking another. Open two sessions. Issue the following commands in Session 1 to build the test table:
SQL> create table tstlock (foo varchar2(1), bar varchar2(1));
 
Table created.
 
SQL> insert into tstlock values (1,'a'); 
 
1 row created.
 
SQL> insert into tstlock values (2, 'b');
 
1 row created.
 
SQL> select * from tstlock ;
 
FOO BAR
--- ---
1   a
2   b
 
2 rows selected.
 
SQL> commit ;
 
Commit complete.
Now grab a lock on the whole table, still in Session 1:
SQL> select * from tstlock for update ;
And in Session 2, try to update a row:
SQL> update tstlock set bar=
  2  'a' where bar='a' ;
This statement will hang, blocked by the lock that Session 1 is holding on the entire table.

Identify the blocking session

Oracle provides a view, DBA_BLOCKERS, which lists the SIDs of all blocking sessions. But this view is often, in my experience, a good bit slower than simply querying V$LOCK, and it doesn't offer any information beyond the SIDs of any sessions that are blocking other sessions. The V$LOCK view is faster to query, makes it easy to identify the blocking session, and has a lot more information.
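For completeness, here is all that DBA_BLOCKERS gives you — just the blockers' SIDs, nothing more:

```
-- One row per session holding a lock that some other session
-- is waiting for; HOLDING_SESSION is the blocker's SID.
SELECT holding_session FROM dba_blockers;
```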
SQL> select * from v$lock ;
 
ADDR     KADDR           SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
-------- -------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
AF9E2C4C AF9E2C60        479 TX     131078      16739          0          6        685          0
ADDF7EC8 ADDF7EE0        422 TM      88519          0          3          0        697          0
ADDF7F74 ADDF7F8C        479 TM      88519          0          3          0        685          0
ADEBEA20 ADEBEB3C        422 TX     131078      16739          6          0        697          1
....     ....            ... ...      ....       ....       ....       ....        ....      ....
Note the BLOCK column. If a session holds a lock that's blocking another session, BLOCK=1. Further, you can tell which session is being blocked by comparing the values in ID1 and ID2. The blocked session will have the same values in ID1 and ID2 as the blocking session, and, since it is requesting a lock it's unable to get, it will have REQUEST > 0.
In the query above, we can see that SID 422 is blocking SID 479. SID 422 corresponds to Session 1 in our example, and SID 479 is our blocked Session 2.
To avoid having to stare at the table and cross-compare ID1's and ID2's, put this in a query:
SQL> select l1.sid, ' IS BLOCKING ', l2.sid
  2  from v$lock l1, v$lock l2
  3  where l1.block =1 and l2.request > 0
  4  and l1.id1=l2.id1
  5  and l1.id2=l2.id2
SQL> /
 
       SID 'ISBLOCKING'         SID
---------- ------------- ----------
       422  IS BLOCKING         479
 
1 row selected.
Even better, if we throw a little v$session into the mix, the results are highly readable:
SQL> select s1.username || '@' || s1.machine
  2  || ' ( SID=' || s1.sid || ' )  is blocking '
  3  || s2.username || '@' || s2.machine || ' ( SID=' || s2.sid || ' ) ' AS blocking_status
  4  from v$lock l1, v$session s1, v$lock l2, v$session s2
  5  where s1.sid=l1.sid and s2.sid=l2.sid
  6  and l1.BLOCK=1 and l2.request > 0
  7  and l1.id1 = l2.id1
  8  and l1.id2 = l2.id2 ;
 
 
BLOCKING_STATUS
----------------------------------------------------------------------------------------------------
BULKLOAD@yttrium ( SID=422 )  is blocking BULKLOAD@yttrium ( SID=479 )
 
1 row selected.
There's still more information in the v$lock table, but in order to read that information, we need to understand a bit more about lock types and the cryptically-named ID1 and ID2 columns.

Lock type and the ID1 / ID2 columns

In this case, we already know that the blocking lock is an exclusive DML lock, since we're the ones who issued the locking statement. But most of the time, you won't be so lucky. Fortunately, you can read this information from the v$lock table with little effort.
The first place to look is the TYPE column. There are dozens of lock types, but the vast majority are system types. System locks are normally only held for a very brief amount of time, and it's not generally helpful to try to tune your library cache, undo logs, etc. by looking in v$lock! (See the V$LOCK chapter in the Oracle Database Reference for a list of system lock types.)
There are only three types of user locks, TX, TM and UL. UL is a user-defined lock -- a lock defined with the DBMS_LOCK package. The TX lock is a row transaction lock; it's acquired once for every transaction that changes data, no matter how many objects you change in that transaction. The ID1 and ID2 columns point to the rollback segment and transaction table entries for that transaction.
The TM lock is a DML lock. It's acquired once for each object that's being changed. The ID1 column identifies the object being modified.

Lock Modes

You can see more information on TM and TX locks just by looking at the lock modes. The LMODE and REQUEST columns both use the same numbering for lock modes, in order of increasing exclusivity: from 0 for no lock, to 6 for exclusive lock. A session must obtain an exclusive TX lock in order to change data; LMODE will be 6. If it can't obtain an exclusive lock because some of the rows it wants to change are locked by another session, then it will request a TX in exclusive mode; LMODE will be 0 since it does not have the lock, and REQUEST will be 6. You can see this interaction in the rows we selected earlier from v$lock:
ADDR     KADDR           SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
-------- -------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
AF9E2C4C AF9E2C60        479 TX     131078      16739          0          6        685          0
ADEBEA20 ADEBEB3C        422 TX     131078      16739          6          0        697          1
Note that ID1 and ID2 in Session 2, which is requesting the TX lock (LMODE=0, REQUEST=6), point back to the rollback and transaction entries for Session 1. That's what lets us determine the blocking session for Session 2.
You may also see TX locks in mode 4, Shared mode. If a block containing rows to be changed doesn't have any interested transaction list (ITL) entries left, then the session acquires a TX lock in mode 4 while waiting for an ITL entry. If you see contention for TX-4 locks on an object, you probably need to increase INITRANS for the object.
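If you do hit TX-4 contention from ITL shortage, the remedy is more ITL slots per block. A hedged sketch (the table and index names are just examples; note that INITRANS only affects newly formatted blocks, so existing blocks need a rebuild to pick it up):

```
ALTER TABLE scott.hot_table INITRANS 4;

-- Rebuild so existing blocks are reformatted with the new setting;
-- a MOVE leaves the table's indexes UNUSABLE until rebuilt.
ALTER TABLE scott.hot_table MOVE;
ALTER INDEX scott.hot_table_pk REBUILD;
```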
TM locks are generally requested and acquired in mode 3 (Row Exclusive) and mode 6. DDL requires a TM Exclusive lock. (Note that CREATE TABLE doesn't require a TM lock -- it doesn't need to lock any objects, because the object in question doesn't exist yet!) DML requires a Row Exclusive lock. So, in the rows we selected earlier from v$lock, you can see from the TM locking levels that these are DML locks:
ADDR     KADDR           SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
-------- -------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
ADDF7EC8 ADDF7EE0        422 TM      88519          0          3          0        697          0
ADDF7F74 ADDF7F8C        479 TM      88519          0          3          0        685          0

Identifying the locked object

Now that we know that each TM row points to a locked object, we can use ID1 to identify the object.
SQL> select object_name from dba_objects where object_id=88519 ;
 
OBJECT_NAME
--------------
TSTLOCK
Sometimes just knowing the object is enough information; but we can dig even deeper. We can identify not just the object, but the block and even the row in the block that Session 2 is waiting on.

Identifying the locked row

We can get this information from v$session by looking at the v$session entry for the blocked session:
SQL> select row_wait_obj#, row_wait_file#, row_wait_block#, row_wait_row#
  2* from v$session where sid=479 ;
 
ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW#
------------- -------------- --------------- -------------
        88519             16          171309             0
This gives us the object ID, the relative file number, the block in the datafile, and the row in the block that the session is waiting on. If that list of data sounds familiar, it's because those are the four components of an extended ROWID. We can build the row's actual extended ROWID from these components using the DBMS_ROWID package. The ROWID_CREATE function takes these arguments and returns the ROWID:
SQL> select do.object_name,
  2  row_wait_obj#, row_wait_file#, row_wait_block#, row_wait_row#,
  3  dbms_rowid.rowid_create ( 1, ROW_WAIT_OBJ#, ROW_WAIT_FILE#, ROW_WAIT_BLOCK#, ROW_WAIT_ROW# )
  4  from v$session s, dba_objects do
  5  where sid=479
  6  and s.ROW_WAIT_OBJ# = do.OBJECT_ID ;
 
OBJECT_NAME     ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW# DBMS_ROWID.ROWID_C
--------------- ------------- -------------- --------------- ------------- ------------------
TSTLOCK                 88519             16          171309             0 AAAVnHAAQAAAp0tAAA
And, of course, this lets us inspect the row directly.
SQL> select * from tstlock where rowid='AAAVnHAAQAAAp0tAAA' ;
 
FOO BAR
--- ---
1   a

Conclusion

We've seen how to identify a blocking session, and how to inspect the very row that the waiting session is waiting for. And, I hope, learned a bit about v$lock in the process.

How Oracle Locking Works

When a transaction updates a row, it puts a lock so that no one can update the same row until it commits. When another transaction issues an update to the same row, it waits until the first one either commits or rolls back. After the first transaction performs a commit or rollback, the update by the second transaction is executed immediately, since the lock placed by the first transaction is now gone. How exactly does this locking mechanism work? Several questions come to mind in this context:



  1. Is there some kind of logical or physical structure called lock?
  2. How does the second transaction know when the first transaction has lifted the lock?
  3. Is there some kind of “pool” of such locks where transactions line up to get one?
  4. If so, do they line up to return it when they are done with the locking?
  5. Is there a maximum number of possible locks?
  6. Is there something called a block level lock? Since Oracle stores the rows in blocks, when all or the majority of rows in the blocks are locked by a single transaction, doesn’t it make sense for Oracle to lock the entire block to conserve the number of locks?
  7. The previous question brings up another question – does the number of active locks in the database at any point really matter?


If you are interested in learning about all this, please read on.

Lock Manager

Since locks convey information on who has which rows modified but not committed, anyone interested in making an update must check with some sort of system that is available across the entire database. So, it makes perfect sense to have a central locking system in the database, doesn’t it? But, when you think about it, a central lock manager can quickly become a single point of contention in a busy system where a lot of updates occur. Also, when a large number of rows are updated in a single transaction, an equally large number of locks will be required as well. The question is: how many? One can guess; but it will be at best a wild one. What if you guessed on the low side and the supply of available locks is depleted? In that case some transactions can’t get locks and therefore will have to wait (or, worse, abort). Not a pleasant thought in a system that needs to be scalable. To counter such a travesty you may want to make the available supply of locks really high. What is the downside of that action? Since each lock would potentially consume some memory, and memory is finite, it would not be advisable to create an infinite supply of locks.

Some databases actually have a lock manager with a finite supply of such locks. Each transaction must ask to get a lock from it before beginning and relinquish locks to it at completion. In those technologies, the scalability of applications suffers immensely as a result of the lock manager being a point of contention. In addition, since the supply of locks is limited, the developers need to commit frequently to release the locks for other transactions. When a large number of rows have locks on them, the database replaces the row locks with a block level lock to cover all the rows in the block – a concept known as lock escalation. Oracle does not follow that approach. In Oracle, there is no central lock manager, no finite limit on locks, and no such concept as lock escalation. The developers commit only when there is a logical need to do so; not otherwise.

Lock Management in Oracle

So, how is that approach different in case of Oracle? For starters, there is no central lock manager. But the information on locking has to be recorded somewhere. Where then? Well, consider this: when a row is locked, it must be available to the session, which means the session’s server process must have already accessed and placed the block in the buffer cache prior to the transaction occurring. Therefore, what is a better place for putting this information than right there in the block (actually the buffer in the buffer cache) itself?

Oracle does precisely that – it records the information in the block. When a row is locked by a transaction, that nugget of information is placed in the header of the block where the row is located. When another transaction wishes to acquire the lock on the same row, it has to access the block containing the row anyway (as you learned in Part 1 of this series) and upon reaching the block, it can easily confirm that the row is locked from the block header. A transaction looking to update a row in a different block puts that information on the header of that block. There is no need to queue behind some single central resource like a lock manager. Since lock information is spread over multiple blocks instead of a single place, this mechanism makes transactions immensely scalable.

Being the smart reader you are, you are now hopefully excited to learn more or perhaps you are skeptical. You want to know the nuts and bolts of this whole mechanism and, more, you want proof. We will see all that in a moment.

Transaction Address

Before understanding the locks, you should understand clearly what a transaction is and how it is addressed. A transaction starts when an update to data such as insert, update or delete occurs (or the intention to do so, e.g. SELECT FOR UPDATE) and ends when the session issues a commit or rollback. Like everything else, a specific transaction should have a name or an identifier to differentiate it from another one of the same type. Each transaction is given a transaction ID. When a transaction updates a row (it could also insert a new row or delete an existing one; but we will cover that little later in this article), it records two things:


  • The new value
  • The old value


The old value is recorded in the undo segments while the new value is immediately updated in the buffer where the row is stored. The data buffer containing the row is updated regardless of whether the transaction is committed or not. Yes, let me repeat – the data buffer is updated as soon as the transaction modifies the row (before commit). If you didn’t know that, please see the Part 1 of this series.

Undo information is recorded in a circular fashion. When new undo is created, it is stored in the next available undo “slot”. Each transaction occupies a record in the slot. After all the slots are exhausted and a new transaction arrives, the next processing depends on the state of the transactions. If the oldest transaction occupying any of the other slots is no longer active (that is either committed or rolled back), Oracle will reuse that slot. If none of the transactions is inactive, Oracle will have to expand the undo tablespace to make room. In the former case (where a transaction is no longer active and its information in undo has been erased by a new transaction), if a long running query that started before the transaction occurred selects the value, it will get an ORA-1555 error. But that will be covered in a different article in the future. If the tablespace containing the undo segment can’t extend due to some reason (such as in case of the filesystem being completely full), the transaction will fail.
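As a quick health check on undo, the workload and any "snapshot too old" occurrences can be read from V$UNDOSTAT, which keeps one row per ten-minute interval:

```
-- UNDOBLKS    = undo blocks consumed in the interval
-- SSOLDERRCNT = ORA-01555 (snapshot too old) errors raised
SELECT begin_time, undoblks, ssolderrcnt
  FROM v$undostat
 ORDER BY begin_time;
```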

Speaking of transaction identifiers: a transaction ID takes the form of three numbers separated by periods. These three numbers are:


  • Undo Segment Number where the transaction records its undo entry
  • Slot# in the undo segment
  • Sequence# (or wrap) in the undo slot


This is sort of like the social security number of the transaction. This information is recorded in the block header. Let’s see the proof now through a demo.

Demo

First, create a table:

SQL> create table itltest (col1 number, col2 char(8));

Insert some rows into the table.

SQL> begin
  2     for i in 1..10000 loop 
  3             insert into itltest values (i,'x');
  4     end loop;
  5     commit;
  6  end;
  7  /

Remember, this is a single transaction. It started at the “BEGIN” line and ended at “COMMIT”. The 10,000 rows were all inserted as a part of the same transaction. To know the transaction ID of this transaction, Oracle provides a special package - dbms_transaction. Here is how you use it. Remember, you must use it in the same transaction. Let’s see:

SQL> select dbms_transaction.local_transaction_id from dual;

LOCAL_TRANSACTION_ID
------------------------------------------------------------------------


1 row selected.

Wait? There is nothing. The transaction ID returned is null. How come?

If you followed the previous section closely, you will realize that the transaction ends when a commit or rollback is issued. The commit was issued inside the PL/SQL block, so the transaction had already ended before you called the dbms_transaction package. Since there was no active transaction, the package returned null.

Let’s see another demo. Update one row:

SQL> update itltest set col2 = 'y' where col1 = 1;

1 row updated.

In the same session, check the transaction ID:

SQL> select dbms_transaction.local_transaction_id from dual;

LOCAL_TRANSACTION_ID
-------------------------------------------------------------------------
3.23.40484

1 row selected.

There you see – the transaction ID. The three numbers separated by periods signify undo segment number, slot# and record# respectively. Now perform a commit:

SQL> commit;

Commit complete.

Check the transaction ID again:

SQL> select dbms_transaction.local_transaction_id from dual;

LOCAL_TRANSACTION_ID
-------------------------------------------------------------------------


1 row selected.

The transaction is gone so the ID is null, as expected.

Since the call to the package must be in the same transaction (and therefore in the same session), how can you check the transaction in a different session? In real life you will be asked to check transaction in other sessions, typically application sessions. Let’s do a slightly different test. Update the row one more time and check the transaction:

SQL> update itltest set col2 = 'y' where col1 = 1;

1 row updated.

SQL> select dbms_transaction.local_transaction_id from dual;

LOCAL_TRANSACTION_ID
-----------------------------------------------------------------------
10.25.31749

1 row selected.

From a different session, check for active transactions. This information is available in the view V$TRANSACTION. There are several columns; but we will look at four of the most important ones:


  • ADDR – the address of the transaction, which is a raw value
  • XIDUSN – the undo segment number
  • XIDSLOT – the slot#
  • XIDSQN – the sequence# or record# inside the slot


SQL> select addr, xidusn, xidslot, xidsqn
  2  from v$transaction;

ADDR         XIDUSN    XIDSLOT     XIDSQN
-------- ---------- ---------- ----------
3F063C48         10         25      31749

Voila! You see the transaction id of the active transaction from a different session. Compare the above output to the one you got from the call to dbms_transaction package. You can see that the transaction identifier shows the same set of numbers.

Interested Transaction List

You must be eager to know about the section of the block header that contains information on locking and how it records it. It is a simple data structure called the "Interested Transaction List" (ITL), a list that maintains information on transactions. The ITL contains several placeholders (or slots) for transactions. When a row in the block is locked for the first time, the transaction places a lock in one of the slots. In other words, the transaction makes it known that it is interested in some rows (hence the term "Interested Transaction List"). When a different transaction locks another set of rows in the same block, that information is stored in another slot, and so on. When a transaction ends after a commit or a rollback, the locks are released and the slot which was used to mark the row locks in the block is considered free (although it is not updated immediately - a fact about which you will learn later in a different installment).

[Updated Jan 22, 2011] [Thank you, Randolph Geist (info@www.sqltools-plusplus.org) for pointing it out. I follow his blog http://oracle-randolf.blogspot.com/, which is a treasure trove of information.]
The row also stores a bit that represents whether it is locked or not.
[end of Update Jan 22, 2011]


ITLs in Action

Let's see how ITLs really work. Here is an empty block. The block header is the only occupant of the block.


This is how the block looks after a single row has been inserted:


Note, the row was inserted from the bottom of the block. Now, a second row has been inserted:



A session comes in and updates the row Record1, i.e. it places a lock on the row, shown by the star symbol. The lock information is recorded in the ITL slot in the block header:


The session does not commit yet; so the lock is active. Now a second session - Session 2 - comes in and updates row Record2. It puts a lock on the record - as stored in the ITL slot.


I have used two different colors to show the locks (as shown by the star symbol) and the color of the ITL entry.

As you can clearly see, when a transaction wants to update a specific row, it doesn’t have to go anywhere but the block header itself to know if the row is locked or not. All it has to do is check the ITL slots. However, the ITL alone does not show with 100% accuracy that a row is locked (again, something I will explain in a different installment). The transaction must go to the undo segment to check if the locking transaction has been committed. How does it know which specific part of the undo segment to go to? Well, it has that information in the ITL entry. If the row is indeed locked, the transaction must wait and retry. As soon as the previous transaction ends, the undo information is updated and the waiting transaction completes its operation.

So, there is in fact a queue for the locks, but it's at the block level, not at the level of the entire database or even the segment.

Demo

The proof is in the pudding. Let’s see all this through a demo. Now that you know the transaction entry, let’s see how it is stored in the block header. To do that, we first need to know which blocks to look for. So, we should get the block numbers where the table is stored:

SQL> select file_id, relative_fno, extent_id, block_id, blocks
  2  from dba_extents
  3  where segment_name = 'ITLTEST';

   FILE_ID RELATIVE_FNO  EXTENT_ID   BLOCK_ID     BLOCKS
---------- ------------ ---------- ---------- ----------
         7            7          0       3576          8
         7            7          1       3968          8
         7            7          2       3976          8
         7            7          3       3984          8

To check inside the block, we need to “dump” the contents of the block to a tracefile so that we can read it. From a different session, issue a checkpoint so that the buffer data is written to the disk:

SQL> alter system checkpoint;

Now dump the data blocks 3576 through 3583.

SQL> alter system dump datafile 7 block min 3576 block max 3583;

System altered.

This will create a tracefile in the user dump destination directory. In the case of Oracle 11g, the tracefile will be in the diag structure under the /diag/rdbms///trace directory. It will most likely be the last tracefile generated. You can also get the precise name by getting the OS process ID of the session:


SQL> select p.spid
  2  from v$session s, v$process p
  3  where s.sid = (select sid from v$mystat where rownum < 2)
  4  and p.addr = s.paddr
  5  /

SPID
------------------------
9202

1 row selected.
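By the way, on 11g and later you can skip the hunt altogether: the view V$DIAG_INFO reports the current session's trace file name directly (the path shown below is illustrative; yours will reflect your own diag destination):

SQL> select value
  2  from v$diag_info
  3  where name = 'Default Trace File';

VALUE
--------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/d112d2/d112d2/trace/d112d2_ora_9202.trc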
Now look for a file named <instance_name>_ora_9202.trc. Open the file in vi and search for the phrase “Itl”. Here is an excerpt from the file:


Itl           Xid                  Uba          Flag  Lck        Scn/Fsc
0x01   0x000a.019.00007c05  0x00c00288.1607.0e  ----    1  fsc 0x0000.00000000
0x02   0x0003.017.00009e24  0x00c00862.190a.0f  C---    0  scn 0x0000.02234e2b

This is where the information on row locking is stored. Remember, the row locking information is known as the Interested Transaction List (ITL) and each ITL entry is stored in a “slot”. Here the block shows two slots, which is the default number. Look for the one where the “Lck” column shows a value. It shows “1”, meaning one of the rows in the block is locked by a transaction. But which transaction? To get that answer, note the value under the “Xid” column. It shows the transaction ID: 0x000a.019.00007c05. These numbers are in hexadecimal (as indicated by the 0x at the beginning). Using the scientific calculator in Windows, I converted the values to decimal: 10, 25 and 31749 respectively. Do they sound familiar? Of course they do; they are exactly as reported by both the record in v$transaction and the dbms_transaction.local_transaction_id function call.
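If you don't have a calculator handy, the database itself can do the hex-to-decimal conversion, using TO_NUMBER with the “X” format mask (a quick sketch; the hex digits are supplied in uppercase and the column aliases are mine):

SQL> select to_number('000A','XXXX')         as xidusn,
  2         to_number('019','XXX')           as xidslot,
  3         to_number('00007C05','XXXXXXXX') as xidsqn
  4  from dual;

    XIDUSN    XIDSLOT     XIDSQN
---------- ---------- ----------
        10         25      31749

The same trick works for any of the other hex values in the block dump.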

This is how Oracle determines that a transaction has locked the row, and correlates it to the various components in the other areas – mostly the undo segments, to determine whether it is still active. Now that you know the undo segments hold the transaction details, you may want to know more about the segment itself. Remember, an undo segment is just a segment, like tables, indexes, etc. It resides in a tablespace, which is on some datafile. To find out the specifics of the segment, we will look into some more columns of the view V$TRANSACTION:


SQL> select addr, xidusn, xidslot, xidsqn, ubafil, ubablk, ubasqn, ubarec,
  2  status, start_time, start_scnb, start_scnw, ses_addr
  3  from v$transaction;

ADDR         XIDUSN    XIDSLOT     XIDSQN     UBAFIL     UBABLK     UBASQN
-------- ---------- ---------- ---------- ---------- ---------- ----------
    UBAREC STATUS           START_TIME           START_SCNB START_SCNW SES_ADDR
---------- ---------------- -------------------- ---------- ---------- --------
3F063C48         10         25      31749          3        648       5639
        14 ACTIVE           12/30/10 20:00:25      35868240          0 40A73784


1 row selected.
The columns whose names start with UBA show the undo block address information. Look at the above output: UBAFIL shows the file#, which is “3” in this case. Let's check which datafile has that file_id:

SQL> select * from dba_data_files
  2  where file_id = 3;

FILE_NAME
-------------------------------------------------------------------------
   FILE_ID TABLESPACE_NAME                     BYTES     BLOCKS STATUS
---------- ------------------------------ ---------- ---------- ---------
RELATIVE_FNO AUT   MAXBYTES  MAXBLOCKS INCREMENT_BY USER_BYTES USER_BLOCKS
------------ --- ---------- ---------- ------------ ---------- -----------
ONLINE_
-------
+DATA/d112d2/datafile/undotbs1.260.722742813
         3 UNDOTBS1                       4037017600     492800 AVAILABLE
           3 YES 3.4360E+10    4194302          640 4035969024      492672
ONLINE


1 row selected.

Note the UBASQN (which is the undo block sequence#) value in the earlier output, which was 5639. Let’s revisit the ITL entries in the dump of block:
Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x000a.019.00007c05  0x00c00288.1607.0e  ----    1  fsc 0x0000.00000000
0x02   0x0003.017.00009e24  0x00c00862.190a.0f  C---    0  scn 0x0000.02234e2b

Look at the entry under the Uba column: 0x00c00288.1607.0e. As indicated by the “0x” at the beginning, these are hexadecimal numbers. Using a scientific calculator, let’s convert them: 1607 in hex is 5639 in decimal – the UBA sequence# (UBASQN). The value “e” is 14 in decimal, which corresponds to UBAREC. Finally, the value 288 is 648 in decimal, which is UBABLK. Now you see how the information recorded in the block header is also available to the DBA through the view V$TRANSACTION.

Let’s see some more important columns of the view. A typical database will have many sessions, not just one. Each session may have an active transaction, which means you have to link sessions to transactions to generate meaningful information. The transaction information also contains the session link: note the column SES_ADDR, which is the address of the session that issued the transaction. From that, you can get the session information:



SQL> select sid, username
  2  from v$session
  3  where saddr = '40A73784';

SID USERNAME
--- --------
123 ARUP

There you go – you now have the SID of the session. And now that you know the SID, you can look up any other relevant data on the session from the view V$SESSION.
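Instead of copying the SES_ADDR value by hand, you can of course join the two views directly. Here is a sketch using only the columns shown above (your SID and XID values will differ):

SQL> select s.sid, s.username, t.xidusn, t.xidslot, t.xidsqn
  2  from v$session s, v$transaction t
  3  where s.saddr = t.ses_addr;

       SID USERNAME     XIDUSN    XIDSLOT     XIDSQN
---------- -------- ---------- ---------- ----------
       123 ARUP             10         25      31749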

Takeaways

Here is a summary of what you learned so far:
  1. A transaction in Oracle starts with a data update (or an intention to update) statement. Actually, there are some exceptions, which we will cover in a later article.
  2. It ends when a commit or rollback is issued.
  3. A transaction is identified by a transaction ID (XID), which is a set of three numbers – undo segment#, undo slot# and undo sequence# – separated by periods.
  4. You can view the transaction ID in the session itself by calling the dbms_transaction.local_transaction_id function.
  5. You can also check all the active transactions in the view v$transaction, where the columns XIDUSN, XIDSLOT and XIDSQN denote the undo segment#, undo slot# and undo sequence# – the values that make up the transaction ID.
  6. The transaction information is also stored in the block header. You can check it by dumping the block and looking for the term “Itl”.
  7. The v$transaction view also contains the session address under SES_ADDR column, which can be used to join with the SADDR column of v$session view to get the session details.
  8. From the session details, you can find out other attributes of the session, such as the username, the SQL issued, the machine it was issued from, etc.
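To tie points 1 through 5 together, here is the whole round trip in a single session (a sketch: the table and column names are hypothetical, and your transaction ID will differ):

SQL> update itltest set col2 = 'CHANGED' where col1 = 1;

1 row updated.

SQL> select dbms_transaction.local_transaction_id from dual;

LOCAL_TRANSACTION_ID
--------------------------------------------------------------------------------
10.25.31749

SQL> select xidusn, xidslot, xidsqn from v$transaction;

    XIDUSN    XIDSLOT     XIDSQN
---------- ---------- ----------
        10         25      31749

SQL> rollback;

Rollback complete.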