Friday, August 8, 2014

Using 11gr2 Clusterware for Providing HA for Single Instance Database

We can use the Oracle cluster ware for providing the High availability instead of using the
third party cluster-ware.

In this practical session we can see how we are use the 11gr2 cluster ware can be used
to to provide the HA for a single instance database.

OS    - OEL 6.5
Cluster-ware - 11.2.0.3

Below is the status of the fresh installed clusterware.


[oracle@oelrac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATADG.dg
               ONLINE  ONLINE       oelrac1
               ONLINE  ONLINE       oelrac2
ora.FRA.dg
               ONLINE  ONLINE       oelrac1
               ONLINE  ONLINE       oelrac2
ora.LISTENER.lsnr
               ONLINE  ONLINE       oelrac1
               ONLINE  ONLINE       oelrac2
ora.OCR_MIRROR.dg
               ONLINE  ONLINE       oelrac1
               ONLINE  ONLINE       oelrac2
ora.OCR_VOTING.dg
               ONLINE  ONLINE       oelrac1
               ONLINE  ONLINE       oelrac2
ora.asm
               ONLINE  ONLINE       oelrac1                  Started
               ONLINE  ONLINE       oelrac2                  Started
ora.gsd
               OFFLINE OFFLINE      oelrac1
               OFFLINE OFFLINE      oelrac2
ora.net1.network
               ONLINE  ONLINE       oelrac1
               ONLINE  ONLINE       oelrac2
ora.ons
               ONLINE  ONLINE       oelrac1
               ONLINE  ONLINE       oelrac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       oelrac2
ora.cvu
      1        ONLINE  ONLINE       oelrac2
ora.oc4j
      1        ONLINE  ONLINE       oelrac2
ora.oelrac1.vip
      1        ONLINE  ONLINE       oelrac1
ora.oelrac2.vip
      1        ONLINE  ONLINE       oelrac2
ora.scan1.vip
      1        ONLINE  ONLINE       oelrac2



1) Install Standalone oracle home on both the nodes.
2) Create the database using dbca named testdb

--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------

ora.testdb.db
      1        ONLINE  ONLINE       oelrac1                  Open


3) Take the configuration properties of the database and remove it from the cluster resource.

[oracle@oelrac1 trace]$ crsctl stat res ora.testdb.db -p > /home/oracle/test_db_properties.txt


[oracle@oelrac1 trace]$ cat /home/oracle/test_db_properties.txt
NAME=ora.testdb.db
TYPE=ora.database.type
ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
ACTIVE_PLACEMENT=1
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
AUTO_START=restore
CARDINALITY=1
CHECK_INTERVAL=1
CHECK_TIMEOUT=30
CLUSTER_DATABASE=false
DATABASE_TYPE=SINGLE
DB_UNIQUE_NAME=testdb
DEFAULT_TEMPLATE=PROPERTY(RESOURCE_CLASS=database) PROPERTY(DB_UNIQUE_NAME= CONCAT(PARSE(%NAME%, ., 2), %USR_ORA_DOMAIN%, .)) ELEMENT(INSTANCE_NAME= %GEN_USR_ORA_INST_NAME%) ELEMENT(DATABASE_TYPE= %DATABASE_TYPE%)
DEGREE=1
DESCRIPTION=Oracle Database resource
ENABLED=1
FAILOVER_DELAY=0
FAILURE_INTERVAL=60
FAILURE_THRESHOLD=1
GEN_AUDIT_FILE_DEST=/u01/app/oracle/admin/testdb/adump
GEN_START_OPTIONS=
GEN_START_OPTIONS@SERVERNAME(oelrac1)=open
GEN_USR_ORA_INST_NAME=
GEN_USR_ORA_INST_NAME@SERVERNAME(oelrac1)=testdb
HOSTING_MEMBERS=
INSTANCE_FAILOVER=1
LOAD=1
LOGGING_LEVEL=1
MANAGEMENT_POLICY=AUTOMATIC
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
ONLINE_RELOCATION_TIMEOUT=0
ORACLE_HOME=/u01/app/oracle/product/11.2.0/db
ORACLE_HOME_OLD=
PLACEMENT=restricted
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=2
ROLE=PRIMARY
SCRIPT_TIMEOUT=60
SERVER_POOLS=ora.testdb
SPFILE=+DATADG/testdb/spfiletestdb.ora
START_DEPENDENCIES=hard(ora.DATADG.dg) weak(type:ora.listener.type,uniform:ora.ons) pullup(ora.DATADG.dg)
START_TIMEOUT=600
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=hard(intermediate:ora.asm,shutdown:ora.DATADG.dg)
STOP_TIMEOUT=600
TYPE_VERSION=3.2
UPTIME_THRESHOLD=1h
USR_ORA_DB_NAME=testdb
USR_ORA_DOMAIN=
USR_ORA_ENV=
USR_ORA_FLAGS=
USR_ORA_INST_NAME=testdb
USR_ORA_OPEN_MODE=open
USR_ORA_OPI=false
USR_ORA_STOP_MODE=immediate
VERSION=11.2.0.3.0


[oracle@oelrac1 dbs]$ srvctl status database -d testdb
Instance testdb is running on node oelrac1

[oracle@oelrac1 dbs]$ srvctl stop database -d testdb

[oracle@oelrac1 dbs]$ srvctl remove database -d testdb
Remove the database testdb? (y/[n]) y

Now the db resource will not get listed in crs resource. (crsctl stat res -t).


4) Copy the pfile and password file to the 2nd node. Create adump directory in 2nd node.

[oracle@oelrac1 dbs]$ scp inittestdb.ora oracle@oelrac2:/u01/app/oracle/product/11.2.0/db/dbs/
inittestdb.ora                                                                                                                        100%   41     0.0KB/s   00:00

[oracle@oelrac1 dbs]$ scp orapwtestdb oracle@oelrac2:/u01/app/oracle/product/11.2.0/db/dbs/
orapwtestdb  


[oracle@oelrac2 ~]$ mkdir -p /u01/app/oracle/admin/testdb/adump



5) Test the database startup in node 2.


[oracle@oelrac2 ~]$ export ORACLE_SID=testdb
[oracle@oelrac2 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Sat Aug 9 10:35:59 2014

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area  626327552 bytes
Fixed Size                  2230952 bytes
Variable Size             381683032 bytes
Database Buffers          239075328 bytes
Redo Buffers                3338240 bytes
Database mounted.
Database opened.


-- Database has opened in node 2 without issues.

Register the database with local listeners, since the local listeners are listening on the
vip there is no need to setup a seperate vip for this database. We can use the existing
vip and scan vips for this setup.


[oracle@oelrac2 ~]$ srvctl config scan
SCAN name: oelrac-scan.manzoor.com, Network: 1/192.168.0.0/255.255.255.0/eth0
SCAN VIP name: scan1, IP: /oelrac-scan.manzoor.com/192.168.0.44

[oracle@oelrac2 ~]$ srvctl config scan_listener
SCAN Listener LISTENER_SCAN1 exists. Port: TCP:1521

[oracle@oelrac2 ~]$ srvctl config vip -n oelrac1
VIP exists: /oelrac1-vip/192.168.0.42/192.168.0.0/255.255.255.0/eth0, hosting node oelrac1

[oracle@oelrac2 ~]$ srvctl config vip -n oelrac2
VIP exists: /oelrac2-vip/192.168.0.43/192.168.0.0/255.255.255.0/eth0, hosting node oelrac2


Enter the below on the tnsnames.ora file

On Node 1:-

TESTDB =
   (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = oelrac-scan.manzoor.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = testdb)
    )
 )


LISTENER_TESTDB =
        (DESCRIPTION=
        (ADDRESS_LIST=
        (ADDRESS=(PROTOCOL=TCP)(HOST=192.168.0.42)(PORT=1521))))


On Node 2:-


TESTDB =
   (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = oelrac-scan.manzoor.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = testdb)
    )
 )


LISTENER_TESTDB =
        (DESCRIPTION=
        (ADDRESS_LIST=
        (ADDRESS=(PROTOCOL=TCP)(HOST=192.168.0.43)(PORT=1521))))




5) Set the local_listener parameter in the db to listener on the above configuration.


SQL> alter system set local_listener = 'LISTENER_TESTDB' scope=both ;

System altered.

SQL> alter system set remote_listener = 'oelrac-scan:1521' scope=both;

System altered.


6) Check the dummy connection.

[oracle@oelrac2 admin]$ sqlplus -L a/a@testdb

SQL*Plus: Release 11.2.0.3.0 Production on Sat Aug 9 10:51:24 2014

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

ERROR:
ORA-01017: invalid username/password; logon denied


SP2-0751: Unable to connect to Oracle.  Exiting SQL*Plus


-- Dummy connection working fine.

-- Now shutdown the node 2 and start the db in node 1 and check the connections.


[oracle@oelrac2 admin]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Sat Aug 9 10:52:51 2014

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> shut immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> exit


[oracle@oelrac1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Sat Aug 9 10:54:46 2014

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area  626327552 bytes
Fixed Size                  2230952 bytes
Variable Size             381683032 bytes
Database Buffers          239075328 bytes
Redo Buffers                3338240 bytes
Database mounted.
Database opened.


[oracle@oelrac1 ~]$ sqlplus -L a/a@testdb

SQL*Plus: Release 11.2.0.3.0 Production on Sat Aug 9 10:56:21 2014

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

ERROR:
ORA-01017: invalid username/password; logon denied


SP2-0751: Unable to connect to Oracle.  Exiting SQL*Plus



-- So far manual cold failover has been succeeded with out any issues. Now lets create an action script
to perform this action by clusterware.

11gr2 clusterware action script requires 4 inputs (start, stop, check & clean). Lets create the action
script.


Create the below directories in both the nodes and change the directory owner to oracle.


[root@oelrac1 11.2.0]# mkdir ha_scripts
[root@oelrac1 11.2.0]# chown oracle:oinstall ha_scripts


[root@oelrac2 11.2.0]# mkdir ha_scripts
[root@oelrac2 11.2.0]# chown oracle:oinstall ha_scripts

Create the below scripts as oracle user.

vi testdb_action_script.sh



#!/bin/sh
export ORACLE_HOME=/u01/app/oracle/product/11.2.0/db
export ORACLE_SID=testdb



case $1 in
'start')
$ORACLE_HOME/bin/sqlplus /nolog <conn / as sysdba
startup;
exit;
EOF
RET=0
;;

'stop')
$ORACLE_HOME/bin/sqlplus /nolog <conn / as sysdba
shut immediate;
exit;
EOF
RET=0
;;

'clean')
$ORACLE_HOME/bin/sqlplus /nolog <conn / as sysdba
shut abort;
exit;
EOF
RET=0
;;

'check')
os=`ps -ef | grep smon | grep $ORACLE_SID | wc -l`
if [ $os -eq 0 ]; then
        RET=1
else
        RET=0
fi
;;

'*')
RET=0
;;
esac


if [ $RET -eq 0 ]; then
        echo "Return value is zero"
        exit 0
        else
        echo "Return value is one"
        exit 1
fi



Test the script manually.


[oracle@oelrac1 ha_scripts]$ ./testdb_action_script.sh start

SQL*Plus: Release 11.2.0.3.0 Production on Sat Aug 9 11:36:12 2014

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

SQL> Connected to an idle instance.
SQL> ORACLE instance started.

Total System Global Area  626327552 bytes
Fixed Size                  2230952 bytes
Variable Size             381683032 bytes
Database Buffers          239075328 bytes
Redo Buffers                3338240 bytes
Database mounted.
Database opened.
SQL> Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options



[oracle@oelrac1 ha_scripts]$ ./testdb_action_script.sh stop

SQL*Plus: Release 11.2.0.3.0 Production on Sat Aug 9 11:38:05 2014

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

SQL> Connected.
SQL> Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options


--- Script is working fine. Copy the script to the other node and test.


[oracle@oelrac2 ha_scripts]$ ./testdb_action_script.sh start

SQL*Plus: Release 11.2.0.3.0 Production on Sat Aug 9 11:44:16 2014

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

SQL> Connected to an idle instance.
SQL> ORACLE instance started.

Total System Global Area  626327552 bytes
Fixed Size                  2230952 bytes
Variable Size             381683032 bytes
Database Buffers          239075328 bytes
Redo Buffers                3338240 bytes
Database mounted.
Database opened.
SQL> Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
Return value is zero
[oracle@oelrac2 ha_scripts]$ ./testdb_action_script.sh check
Return value is zero
[oracle@oelrac2 ha_scripts]$ ./testdb_action_script.sh stop

SQL*Plus: Release 11.2.0.3.0 Production on Sat Aug 9 11:46:24 2014

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

SQL> Connected.
SQL> Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
Return value is zero

-- Now we need a new cluster resource which will manage the failover automatically, Since we
are creating as cluster_resource type we can remove all the properties which are related to
database resource. Below is the modified properties.



NAME=TESTDB.db
TYPE=cluster_resource
ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
ACTION_SCRIPT=/u01/grid/11.2.0/ha_scripts/testdb_action_script.sh
ACTIVE_PLACEMENT=0
AUTO_START=restore
CARDINALITY=1
CHECK_INTERVAL=10
DEGREE=1
DESCRIPTION=Oracle Database resource
ENABLED=1
LOGGING_LEVEL=1
PLACEMENT=restricted
RESTART_ATTEMPTS=1
SCRIPT_TIMEOUT=60
START_DEPENDENCIES=hard(ora.DATADG.dg) weak(type:ora.listener.type,uniform:ora.ons) pullup(ora.DATADG.dg)
START_TIMEOUT=600
STOP_DEPENDENCIES=hard(intermediate:ora.asm,shutdown:ora.DATADG.dg)
STOP_TIMEOUT=600
UPTIME_THRESHOLD=1h
HOSTING_MEMBERS=oelrac1 oelrac2



copy the contents to one text file eg. myresource.txt and add the cluster resource.


[oracle@oelrac2 ha_scripts]$ crsctl add res TESTDB.db -type cluster_resource -file myresource.txt


[oracle@oelrac2 ha_scripts]$ crsctl stat res TESTDB.db
NAME=TESTDB.db
TYPE=cluster_resource
TARGET=OFFLINE
STATE=OFFLINE



[oracle@oelrac2 ha_scripts]$ crsctl start res TESTDB.db
CRS-2672: Attempting to start 'TESTDB.db' on 'oelrac1'
CRS-2676: Start of 'TESTDB.db' on 'oelrac1' succeeded



Cluster Resources
--------------------------------------------------------------------------------
TESTDB.db
      1        ONLINE  ONLINE       oelrac1



[oracle@oelrac1 ~]$ crsctl stop res TESTDB.db
CRS-2673: Attempting to stop 'TESTDB.db' on 'oelrac1'
CRS-2677: Stop of 'TESTDB.db' on 'oelrac1' succeeded

[oracle@oelrac1 ~]$ crsctl start res TESTDB.db
CRS-2672: Attempting to start 'TESTDB.db' on 'oelrac1'
CRS-2676: Start of 'TESTDB.db' on 'oelrac1' succeeded


-- Now relocate the res to another server.



[oracle@oelrac1 ~]$ crsctl relocate res TESTDB.db -s oelrac1 -n oelrac2 -f
CRS-2673: Attempting to stop 'TESTDB.db' on 'oelrac1'
CRS-2677: Stop of 'TESTDB.db' on 'oelrac1' succeeded
CRS-2672: Attempting to start 'TESTDB.db' on 'oelrac2'
CRS-2676: Start of 'TESTDB.db' on 'oelrac2' succeeded


-- Now lets check the failover test...


Cluster Resources
--------------------------------------------------------------------------------
TESTDB.db
      1        ONLINE  ONLINE       oelrac2




[oracle@oelrac2 ha_scripts]$ ps -ef | grep pmon
oracle    3529     1  0 08:50 ?        00:00:02 asm_pmon_+ASM2
oracle   13194     1  0 12:15 ?        00:00:00 ora_pmon_testdb


[oracle@oelrac2 ha_scripts]$ kill -9 13194


[oracle@oelrac2 ha_scripts]$ ps -ef | grep pmon
oracle    3529     1  0 08:50 ?        00:00:02 asm_pmon_+ASM2
oracle   13434 12098  0 12:17 pts/0    00:00:00 grep pmon



[oracle@oelrac2 ha_scripts]$ ps -ef | grep pmon
oracle    3529     1  0 08:50 ?        00:00:02 asm_pmon_+ASM2
oracle   13498     1  0 12:17 ?        00:00:00 ora_pmon_testdb
oracle   13768 12098  0 12:18 pts/0    00:00:00 grep pmon


-- Could see that the database automatically comes up.


-- Let shutdown the node 2.


--- Monitor the cluster res in node 1



--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
TESTDB.db
      1        ONLINE  ONLINE       oelrac2                  STOPPING





--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
TESTDB.db
      1        ONLINE  OFFLINE                               STARTING



--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
TESTDB.db
      1        ONLINE  ONLINE       oelrac1



Advantageous:-

1) No need to rely on third party clustware for managing the db resources and cost saving on acquiring those license.
2) Easy maintenance (using crsctl commands).
3) Automatic fail over.
4) Can be useful for application which will not support RAC.


















































No comments:

Post a Comment