Saturday, October 5, 2013

Set up a DNS Server for the SCAN IP for 11gR2 Grid (11.2)


Set up a DNS server, configure the SCAN IPs, and modify the SCAN details in 11gR2 Grid (11.2)
=============================================================================================

In this post we are going to cover the following:

 *  How to set up the yum installer.
 *  How to configure a DNS server.
 *  How to add the client server details to the DNS server.
 *  How to modify the SCAN IP details in 11gR2 Grid Infrastructure.

OS -  RHEL 5.7

Prepare the Yum Installer
=========================


1. Mount the RHEL ISO DVD in the server's drive.

2. Mount the DVD on /mnt:

[root@standalone2 media]# mount /dev/cdrom /mnt
mount: block device /dev/cdrom is write-protected, mounting read-only
[root@standalone2 media]# cd /mnt

3. Install the FTP Server.

[root@standalone2 Server]# ls -lrt vsf*
-r--r--r-- 75 root root 143483 May 24  2011 vsftpd-2.0.5-21.el5.x86_64.rpm
[root@standalone2 Server]# rpm -ivh vsftpd-2.0.5-21.el5.x86_64.rpm
warning: vsftpd-2.0.5-21.el5.x86_64.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:vsftpd                 ########################################### [100%]


4. Copy the Server and images directories and the RPM-GPG-KEY files from the DVD to the /var/ftp/pub directory.


[root@standalone2 Server]# cp -av /mnt/Server /var/ftp/pub/

[root@standalone2 Server]# cp -av /mnt/images /var/ftp/pub/

[root@standalone2 Server]# cp -av /mnt/RPM-GPG-KEY* /var/ftp/pub/


5. Install the createrepo package.

[root@standalone2 ~]# cd /var/ftp/pub/Server/
[root@standalone2 Server]# rpm -ivh createrepo-0.4.11-3.el5.noarch.rpm
warning: createrepo-0.4.11-3.el5.noarch.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:createrepo             ########################################### [100%]


6. Create a Repository for the /var/ftp/pub directory

[root@standalone2 Server]# createrepo -v /var/ftp/pub

[root@standalone2 Server]# createrepo -g /var/ftp/pub/Server/repodata/comps-rhel5-server-core.xml /var/ftp/pub/

[root@standalone2 Server]# yum clean all
Loaded plugins: rhnplugin, security
Cleaning up Everything

7. Create a repository file with the below contents.

[root@standalone2 Server]# vi /etc/yum.repos.d/Server.repo

[ser]
name=standalone2.manzoor.com
baseurl=file:///var/ftp/pub
enabled=1
gpgcheck=0
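Before testing yum, a quick sanity check of the repo file can save a round trip. Below is a minimal sketch; the check_repo helper and the /tmp path are illustrative only, not part of the actual setup:

```shell
# check_repo prints "ok" when the .repo file defines the keys yum needs
# (baseurl, enabled=1, gpgcheck); sketch only, helper name is made up.
check_repo() {
  local f="$1"
  grep -q '^baseurl=' "$f" && grep -q '^enabled=1' "$f" \
    && grep -q '^gpgcheck=' "$f" && echo ok || echo missing-keys
}

# write a copy of the repo file from this post and check it
cat > /tmp/Server.repo <<'EOF'
[ser]
name=standalone2.manzoor.com
baseurl=file:///var/ftp/pub
enabled=1
gpgcheck=0
EOF
check_repo /tmp/Server.repo   # prints "ok" when all keys are present
```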

8. Check the yum installer by uninstalling and reinstalling a package.


[root@standalone2 Server]# yum remove telnet
Loaded plugins: rhnplugin, security
This system is not registered with RHN.
RHN support will be disabled.
Setting up Remove Process
Resolving Dependencies
--> Running transaction check
---> Package telnet.x86_64 1:0.17-39.el5 set to be erased
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================
 Package                            Arch                               Version                                   Repository                             Size
=============================================================================================================================================================
Removing:
 telnet                             x86_64                             1:0.17-39.el5                             installed                             105 k

Transaction Summary
=============================================================================================================================================================
Remove        1 Package(s)
Reinstall     0 Package(s)
Downgrade     0 Package(s)

Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Erasing        : telnet                                                                                                                                1/1

Removed:
  telnet.x86_64 1:0.17-39.el5

Complete!


[root@standalone2 Server]# yum install telnet
Loaded plugins: rhnplugin, security
This system is not registered with RHN.
RHN support will be disabled.
Server                                                                                                                                | 1.1 kB     00:00
Server/primary                                                                                                                        | 1.1 MB     00:00
Server                                                                                                                                             3261/3261
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package telnet.x86_64 1:0.17-39.el5 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================
 Package                            Arch                               Version                                      Repository                          Size
=============================================================================================================================================================
Installing:
 telnet                             x86_64                             1:0.17-39.el5                                Server                              60 k

Transaction Summary
=============================================================================================================================================================
Install       1 Package(s)
Upgrade       0 Package(s)

Total download size: 60 k
Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing     : telnet                                                                                                                                1/1

Installed:
  telnet.x86_64 1:0.17-39.el5

Complete!


[root@standalone2 Server]# yum update




======= yum configuration completed =========================



DNS Server Configuration
========================


1) Install the necessary RPMs (bind packages) which are required to configure the DNS server.


[root@standalone2 ~]# yum install -y *bind* caching-nameserver


2) Note down the public IP address of the server.

[root@standalone2 ~]# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:0C:29:86:F8:24
          inet addr:192.168.0.30  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe86:f824/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:26338 errors:0 dropped:0 overruns:0 frame:0
          TX packets:40786 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1764870 (1.6 MiB)  TX bytes:8763994 (8.3 MiB)


IP Address - 192.168.0.30

3) Modify the named.conf configuration file.

[root@standalone2 ~]# cd /var/named/chroot/etc/

[root@standalone2 etc]# ls -lrt
total 16
-rw-r----- 1 root named  955 Dec  2  2010 named.rfc1912.zones
-rw-r----- 1 root named 1230 Dec  2  2010 named.caching-nameserver.conf
-rw-r--r-- 1 root root  2819 Oct 13  2012 localtime
-rw-r----- 1 root named  113 Oct  4 21:52 rndc.key


[root@standalone2 etc]# cp named.caching-nameserver.conf named.conf

[root@standalone2 etc]# vi named.conf

# edit the named.conf file...


Modify the below lines... 


Before Modification
===================

        listen-on port 53 { 127.0.0.1; };
        listen-on-v6 port 53 { ::1; };

        allow-query     { localhost; };
        allow-query-cache { localhost; };

        match-clients      { localhost; };
        match-destinations { localhost; };


After Modification
==================



        listen-on port 53 { 192.168.0.30; };
#       listen-on-v6 port 53 { ::1; };


        allow-query     { any; };
        allow-query-cache { any; };

        match-clients      { any; };
        match-destinations { 192.168.0.30; };



[root@standalone2 etc]# ls -lrt
total 20
-rw-r----- 1 root named  955 Dec  2  2010 named.rfc1912.zones
-rw-r----- 1 root named 1230 Dec  2  2010 named.caching-nameserver.conf
-rw-r--r-- 1 root root  2819 Oct 13  2012 localtime
-rw-r----- 1 root named  113 Oct  4 21:52 rndc.key
-rw-r----- 1 root root  1219 Oct  4 22:46 named.conf


4) Edit the zone files.


[root@standalone2 etc]# vi named.rfc1912.zones

# Now edit the zone file


Modify the below lines.


Before Modification.
====================


zone "localdomain" IN {
file "localdomain.zone";



zone "0.0.127.in-addr.arpa" IN {
file "named.local";



After Modification
===================

zone "manzoor.com" IN {
file "forward.zone";

zone "0.168.192.in-addr.arpa" IN {
file "reverse.zone";


[root@standalone2 etc]# chgrp named named.conf

[root@standalone2 etc]# ls -lrt
total 20
-rw-r----- 1 root named 1230 Dec  2  2010 named.caching-nameserver.conf
-rw-r--r-- 1 root root  2819 Oct 13  2012 localtime
-rw-r----- 1 root named  113 Oct  4 21:52 rndc.key
-rw-r----- 1 root named 1219 Oct  4 22:46 named.conf
-rw-r----- 1 root named  954 Oct  4 23:20 named.rfc1912.zones


[root@standalone2 etc]# cd /var/named/chroot/var/named

[root@standalone2 named]# ls -lrt
total 36
drwxrwx--- 2 named named 4096 Jul 27  2004 slaves
drwxrwx--- 2 named named 4096 Aug 25  2004 data
-rw-r----- 1 root  named  427 Dec  2  2010 named.zero
-rw-r----- 1 root  named  426 Dec  2  2010 named.local
-rw-r----- 1 root  named  424 Dec  2  2010 named.ip6.local
-rw-r----- 1 root  named 1892 Dec  2  2010 named.ca
-rw-r----- 1 root  named  427 Dec  2  2010 named.broadcast
-rw-r----- 1 root  named  195 Dec  2  2010 localhost.zone
-rw-r----- 1 root  named  198 Dec  2  2010 localdomain.zone


-- In the zone file above we changed localdomain.zone to forward.zone and named.local to reverse.zone,
   so copy the files to those names and then edit them.

[root@standalone2 named]# cp localdomain.zone forward.zone
[root@standalone2 named]# cp named.local reverse.zone


[root@standalone2 named]# vi forward.zone


# Whole file before modification.
================================

$TTL    86400
@               IN SOA  localhost root (
                                        42              ; serial (d. adams)
                                        3H              ; refresh
                                        15M             ; retry
                                        1W              ; expiry
                                        1D )            ; minimum
                IN NS           localhost
localhost       IN A            127.0.0.1



# whole file after modification.
================================

$TTL    86400
@               IN SOA  standalone2.manzoor.com. root.standalone2.manzoor.com. (
                                        42              ; serial (d. adams)
                                        3H              ; refresh
                                        15M             ; retry
                                        1W              ; expiry
                                        1D )            ; minimum
                IN NS           standalone2.manzoor.com.
standalone2     IN A            192.168.0.30


[root@standalone2 named]# vi reverse.zone

# Whole file before modification.
================================

$TTL    86400
@       IN      SOA     localhost. root.localhost.  (
                                      1997022700 ; Serial
                                      28800      ; Refresh
                                      14400      ; Retry
                                      3600000    ; Expire
                                      86400 )    ; Minimum
        IN      NS      localhost.
1       IN      PTR     localhost.


# whole file after modification.
================================

$TTL    86400
@       IN      SOA     standalone2.manzoor.com. root.standalone2.manzoor.com.  (
                                      1997022700 ; Serial
                                      28800      ; Refresh
                                      14400      ; Retry
                                      3600000    ; Expire
                                      86400 )    ; Minimum
        IN      NS      standalone2.manzoor.com.
30      IN      PTR     standalone2.manzoor.com.



-- In the above, 30 is the last octet of the IP address 192.168.0.30.
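The reverse-zone naming rule used above can be sketched in shell: for a /24 network the zone name is the first three octets reversed plus .in-addr.arpa, and the PTR label is the last octet. The reverse_zone and ptr_label helpers below are illustrative only:

```shell
# reverse_zone: 192.168.0.30 -> 0.168.192.in-addr.arpa (for a /24 network)
reverse_zone() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo "$c.$b.$a.in-addr.arpa"
}

# ptr_label: the PTR record label is the last octet of the IP
ptr_label() {
  echo "${1##*.}"
}

reverse_zone 192.168.0.30   # -> 0.168.192.in-addr.arpa
ptr_label 192.168.0.30      # -> 30
```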


-- Change the group of forward.zone and reverse.zone files to named group.

[root@standalone2 named]# chgrp named forward.zone
[root@standalone2 named]# chgrp named reverse.zone


[root@standalone2 named]# ls -lrt
total 44
drwxrwx--- 2 named named 4096 Jul 27  2004 slaves
drwxrwx--- 2 named named 4096 Aug 25  2004 data
-rw-r----- 1 root  named  427 Dec  2  2010 named.zero
-rw-r----- 1 root  named  426 Dec  2  2010 named.local
-rw-r----- 1 root  named  424 Dec  2  2010 named.ip6.local
-rw-r----- 1 root  named 1892 Dec  2  2010 named.ca
-rw-r----- 1 root  named  427 Dec  2  2010 named.broadcast
-rw-r----- 1 root  named  195 Dec  2  2010 localhost.zone
-rw-r----- 1 root  named  198 Dec  2  2010 localdomain.zone
-rw-r----- 1 root  named  258 Oct  4 23:25 forward.zone
-rw-r----- 1 root  named  482 Oct  4 23:28 reverse.zone


[root@standalone2 named]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               localhost.localdomain localhost
::1                     localhost6.localdomain6 localhost6
##################################################
#### Public ips #################################
192.168.0.30    standalone2.manzoor.com         standalone2



5) Edit the resolv.conf file: change the search domain to your domain name
and set the nameserver to the public IP of this server.


[root@standalone2 named]# vi /etc/resolv.conf

# Edit file as per below details.


search manzoor.com
nameserver 192.168.0.30
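To double-check which nameserver the resolver will pick up, the nameserver line can be extracted with awk. A small sketch; the ns_of helper and the sample file path are illustrative:

```shell
# ns_of: print the nameserver address from a resolv.conf-style file
ns_of() { awk '$1 == "nameserver" { print $2 }' "$1"; }

# sample file with the exact contents used above
cat > /tmp/resolv.conf.sample <<'EOF'
search manzoor.com
nameserver 192.168.0.30
EOF
ns_of /tmp/resolv.conf.sample   # -> 192.168.0.30
```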


-- The hostname should be updated in the network file as below.

[root@standalone2 named]# cat /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=standalone2.manzoor.com


-- Restart the named service


[root@standalone2 named]# service named restart
Stopping named:                                            [  OK  ]
Starting named:                                            [  OK  ]

-- Test the dns

[root@standalone2 named]# dig standalone2.manzoor.com

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-16.P1.el5 <<>> standalone2.manzoor.com
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6354
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;standalone2.manzoor.com.       IN      A

;; ANSWER SECTION:
standalone2.manzoor.com. 86400  IN      A       192.168.0.30

;; AUTHORITY SECTION:
manzoor.com.            86400   IN      NS      standalone2.manzoor.com.

;; Query time: 4 msec
;; SERVER: 192.168.0.30#53(192.168.0.30)
;; WHEN: Fri Oct  4 23:32:47 2013
;; MSG SIZE  rcvd: 71


- We got the answer without error.

[root@standalone2 named]# nslookup standalone2.manzoor.com
Server:         192.168.0.30
Address:        192.168.0.30#53

Name:   standalone2.manzoor.com
Address: 192.168.0.30

[root@standalone2 named]# nslookup 192.168.0.30
Server:         192.168.0.30
Address:        192.168.0.30#53

30.0.168.192.in-addr.arpa       name = standalone2.manzoor.com.



== DNS Configuration for the server has been completed =====================



Steps to add a client to the DNS server
=======================================


1) Update the client server details in the forward.zone file.

Here our client hostname is urac1rac2-scan.manzoor.com, and the IP addresses for
this host are 192.168.0.27, 192.168.0.28 and 192.168.0.29.

-- Note: in this example we are using three IP addresses for the same host because we are
going to set up the SCAN IP for the Oracle 11g Grid.


2) Edit the forward.zone file and add the client hostname and IP addresses as below.

[root@standalone2 named]# vi forward.zone

$TTL    86400
@               IN SOA  standalone2.manzoor.com. root.standalone2.manzoor.com. (
                                        42              ; serial (d. adams)
                                        3H              ; refresh
                                        15M             ; retry
                                        1W              ; expiry
                                        1D )            ; minimum
                IN NS           standalone2.manzoor.com.
                IN NS           urac1rac2-scan.manzoor.com.
standalone2     IN A            192.168.0.30
urac1rac2-scan  IN A            192.168.0.27
urac1rac2-scan  IN A            192.168.0.28
urac1rac2-scan  IN A            192.168.0.29


-- Note
NS --  Denotes a name server record.
A  --  Denotes an address record.

We have added the NS and A records for the client.

3) Update the client server details in the reverse.zone file.

[root@standalone2 named]# vi reverse.zone

$TTL    86400
@       IN      SOA     standalone2.manzoor.com. root.standalone2.manzoor.com.  (
                                      1997022700 ; Serial
                                      28800      ; Refresh
                                      14400      ; Retry
                                      3600000    ; Expire
                                      86400 )    ; Minimum
        IN      NS      standalone2.manzoor.com.
        IN      NS      urac1rac2-scan.manzoor.com.
30      IN      PTR     standalone2.manzoor.com.
27      IN      PTR     urac1rac2-scan.manzoor.com.
28      IN      PTR     urac1rac2-scan.manzoor.com.
29      IN      PTR     urac1rac2-scan.manzoor.com.

-- Note

PTR -- A pointer record; its label is the last octet of the client's IP address, and it maps that address back to the hostname.

4) Now test it.

[root@standalone2 named]# nslookup urac1rac2-scan.manzoor.com
Server:         192.168.0.30
Address:        192.168.0.30#53

** server can't find urac1rac2-scan.manzoor.com: NXDOMAIN

-- The lookup fails because named has not yet reloaded the updated zone files; restart the service and run the lookup again.

[root@standalone2 named]# service named restart
Stopping named:                                            [  OK  ]
Starting named:      


-- We have assigned three IPs to urac1rac2-scan.manzoor.com, so lookups should return them in round-robin fashion.


[root@standalone2 named]# nslookup urac1rac2-scan.manzoor.com
Server:         192.168.0.30
Address:        192.168.0.30#53

Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.27
Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.28
Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.29

[root@standalone2 named]# nslookup urac1rac2-scan.manzoor.com
Server:         192.168.0.30
Address:        192.168.0.30#53

Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.28
Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.29
Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.27

[root@standalone2 named]# nslookup urac1rac2-scan.manzoor.com
Server:         192.168.0.30
Address:        192.168.0.30#53

Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.29
Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.27
Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.28
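BIND's default rrset-order is cyclic, which is why each nslookup above starts one address further into the list. Below is a pure-shell illustration of that rotation; the rotated helper is illustrative, and the IPs are the three SCAN addresses from this post:

```shell
# the three A records registered for the SCAN name
ips=(192.168.0.27 192.168.0.28 192.168.0.29)

# rotated N: print the list starting N entries in, wrapping around --
# the same cyclic ordering BIND applies to successive responses
rotated() {
  local n=${#ips[@]} off=$1 i
  for ((i = 0; i < n; i++)); do
    echo "${ips[(off + i) % n]}"
  done
}

rotated 1   # prints .28, .29, .27 -- matching the second nslookup above
```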


Update the /etc/resolv.conf file on the client with the DNS server address.

----- Client Configuration in DNS Server is Completed ------------------------------------


Updating the SCAN IP in 11gR2 Grid
==================================


Currently we have a two-node RAC setup running with one SCAN IP; since we did not
have DNS, we used the /etc/hosts file to resolve the SCAN name.

Now we have set up the DNS server and registered three IPs for the SCAN (urac1rac2-scan.manzoor.com).


Current SCAN details in the Grid:

[oracle@rhel11gr2rac1 bin]$ srvctl status scan
SCAN VIP scan1 is enabled
SCAN VIP scan1 is running on node rhel11gr2rac2

[oracle@rhel11gr2rac1 bin]$ ./srvctl status scan_listener
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is running on node rhel11gr2rac2



[oracle@rhel11gr2rac1 bin]$ srvctl config scan
SCAN name: urac1rac2-scan.manzoor.com, Network: 1/192.168.0.0/255.255.255.0/eth0
SCAN VIP name: scan1, IP: /urac1rac2-scan.manzoor.com/192.168.0.28


-- As we can see, it is currently running with one IP, 192.168.0.28.


1) Update the DNS server ip details on both the rac nodes.

[root@rhel11gr2rac1 ~]# vi /etc/resolv.conf

search manzoor.com
nameserver 192.168.0.30

[root@rhel11gr2rac2 ~]# vi /etc/resolv.conf

; generated by /sbin/dhclient-script
search manzoor.com
nameserver 192.168.0.30


2) Check whether nslookup returns the details properly.

[root@rhel11gr2rac2 ~]# nslookup urac1rac2-scan.manzoor.com
Server:         192.168.0.30
Address:        192.168.0.30#53

Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.28
Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.29
Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.27

[root@rhel11gr2rac2 ~]# nslookup urac1rac2-scan.manzoor.com
Server:         192.168.0.30
Address:        192.168.0.30#53

Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.29
Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.27
Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.28

[root@rhel11gr2rac2 ~]# nslookup urac1rac2-scan.manzoor.com
Server:         192.168.0.30
Address:        192.168.0.30#53

Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.27
Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.28
Name:   urac1rac2-scan.manzoor.com
Address: 192.168.0.29

3) Remove the SCAN entry from the /etc/hosts file on all the nodes.
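A quick way to confirm no stale SCAN entry was left behind is to grep for the SCAN name on each node. A sketch, using the SCAN name from this post:

```shell
# run on every node after editing /etc/hosts; a leftover entry would
# shadow the DNS round-robin resolution
if grep -q urac1rac2-scan /etc/hosts; then
  echo "stale SCAN entry still present in /etc/hosts"
else
  echo "/etc/hosts is clean"
fi
```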

4) Stop the scan listener and scan.

[oracle@rhel11gr2rac1 bin]$ ./srvctl stop scan_listener
[oracle@rhel11gr2rac1 bin]$ ./srvctl stop scan


5) Modify the SCAN as the root user, then update the SCAN listener as the oracle user (-u refreshes the SCAN listeners to match the new number of SCAN VIPs).

[root@rhel11gr2rac1 bin]# ./srvctl modify scan -n urac1rac2-scan.manzoor.com

[oracle@rhel11gr2rac1 bin]$ ./srvctl modify scan_listener -u

6) Start the Scan listener.

[oracle@rhel11gr2rac1 bin]$ ./srvctl start scan_listener

7) Check the status of the SCAN.

[oracle@rhel11gr2rac1 bin]$ ./srvctl status scan
SCAN VIP scan1 is enabled
SCAN VIP scan1 is running on node rhel11gr2rac1
SCAN VIP scan2 is enabled
SCAN VIP scan2 is running on node rhel11gr2rac2
SCAN VIP scan3 is enabled
SCAN VIP scan3 is running on node rhel11gr2rac1

[oracle@rhel11gr2rac1 bin]$ ./srvctl config scan
SCAN name: urac1rac2-scan.manzoor.com, Network: 1/192.168.0.0/255.255.255.0/eth0
SCAN VIP name: scan1, IP: /urac1rac2-scan.manzoor.com/192.168.0.28
SCAN VIP name: scan2, IP: /urac1rac2-scan.manzoor.com/192.168.0.29
SCAN VIP name: scan3, IP: /urac1rac2-scan.manzoor.com/192.168.0.27

[oracle@rhel11gr2rac1 bin]$ ./srvctl status scan_listener
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is running on node rhel11gr2rac1
SCAN Listener LISTENER_SCAN2 is enabled
SCAN listener LISTENER_SCAN2 is running on node rhel11gr2rac2
SCAN Listener LISTENER_SCAN3 is enabled
SCAN listener LISTENER_SCAN3 is running on node rhel11gr2rac1


-- Scan configuration has been completed.


References:

How to Modify SCAN Setting or SCAN Listener Port after Installation (Doc ID 972500.1)
Linux: How to Configure the DNS Server for 11gR2 SCAN (Doc ID 1107295.1)
How To Convert an 11gR2 GNS Configuration To A Standard Configuration Using DNS Only (Doc ID 1489121.1)
http://www.youtube.com/watch?v=XLcryY6Ndlg 







Friday, October 4, 2013

HAIP - Configure Multiple Private Interconnect Interfaces in Linux (11.2)

How to add one more network to Private interconnect (11.2)
==========================================================


1) Below is the current setup

a) Two-node RAC with 11.2.0.3 Oracle Clusterware.

b) Node details.

[oracle@rhel11gr2rac1 bin]$ ./olsnodes -n -i -s
rhel11gr2rac1   1       rhel11gr2rac1-vip       Active
rhel11gr2rac2   2       rhel11gr2rac2-vip       Active

c) Private interconnect ips.

[oracle@rhel11gr2rac1 bin]$ ./olsnodes -l -p
rhel11gr2rac1   10.10.10.20

[oracle@rhel11gr2rac2 bin]$ ./olsnodes -l -p
rhel11gr2rac2   10.10.10.21


2) Below are the IPs for the new interface we are going to add to the private interconnect; update the /etc/hosts file on both nodes with these details.

10.10.10.30     rhel11gr2rac1-priv2.manzoor.com rhel11gr2rac1-priv2
10.10.10.31     rhel11gr2rac2-priv2.manzoor.com rhel11gr2rac2-priv2


3) Configure the network and assign the above IPs to the new interface.


Node 1 -

[root@rhel11gr2rac1 ~]# cd /etc/sysconfig/network-scripts/
[root@rhel11gr2rac1 network-scripts]# ifdown eth2

-- Open the eth2 config file and update the necessary details (you can refer to the eth1 config).

[root@rhel11gr2rac1 network-scripts]# vi ifcfg-eth2
# Intel Corporation 82545EM Gigabit Ethernet Controller (Copper)
DEVICE=eth2
HWADDR=00:0c:29:89:94:4d
ONBOOT=yes
HOTPLUG=no
BOOTPROTO=none
NETMASK=255.255.255.0
IPADDR=10.10.10.30
GATEWAY=10.10.10.0
TYPE=Ethernet
USERCTL=no
IPV6INIT=no
PEERDNS=yes

[root@rhel11gr2rac1 network-scripts]# ifup eth2
[root@rhel11gr2rac1 network-scripts]# ifconfig eth2

eth2      Link encap:Ethernet  HWaddr 00:0C:29:89:94:4D
          inet addr:10.10.10.30  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe89:944d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:375 errors:0 dropped:0 overruns:0 frame:0
          TX packets:215 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:77632 (75.8 KiB)  TX bytes:35114 (34.2 KiB)


Node 2 -

[root@rhel11gr2rac2 ~]# cd /etc/sysconfig/network-scripts/
[root@rhel11gr2rac2 network-scripts]# ifdown eth2
[root@rhel11gr2rac2 network-scripts]# vi ifcfg-eth2

# Intel Corporation 82545EM Gigabit Ethernet Controller (Copper)
DEVICE=eth2
HWADDR=00:0c:29:75:b5:10
ONBOOT=yes
HOTPLUG=no
BOOTPROTO=none
NETMASK=255.255.255.0
IPADDR=10.10.10.31
GATEWAY=10.10.10.0
TYPE=Ethernet
USERCTL=no
IPV6INIT=no
PEERDNS=yes

[root@rhel11gr2rac2 network-scripts]# ifup eth2
[root@rhel11gr2rac2 network-scripts]# ifconfig eth2
eth2      Link encap:Ethernet  HWaddr 00:0C:29:75:B5:10
          inet addr:10.10.10.31  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe75:b510/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:489 errors:0 dropped:0 overruns:0 frame:0
          TX packets:186 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:103027 (100.6 KiB)  TX bytes:27884 (27.2 KiB)


4) Follow the steps below to add the new interface to the private network.


a) As of 11.2 Grid Infrastructure, the private network configuration is stored not only in the OCR but also
in the gpnp profile. If the private network is not available or its definition is incorrect, the
CRSD process will not start and any subsequent changes to the OCR will be impossible. Therefore care needs to be taken when modifying the private network configuration, and it is important to perform the changes in the correct order. Please also note that manual modification of the gpnp profile is not supported.

b) Take a backup of the profile.xml file on all the nodes.

Node 1

[oracle@rhel11gr2rac1 ~]$ cd /grid/11.2/gpnp/rhel11gr2rac1/profiles/peer/
[oracle@rhel11gr2rac1 peer]$ cp profile.xml profile.xml_bkp_4thoct
[oracle@rhel11gr2rac1 peer]$ ls -lrt
total 20
-rw-r--r-- 1 oracle oinstall 1873 Mar 23  2013 profile_orig.xml
-rw-r--r-- 1 oracle oinstall 1880 Mar 23  2013 profile.old
-rw-r--r-- 1 oracle oinstall 1886 Mar 23  2013 profile.xml
-rw-r--r-- 1 oracle oinstall 1886 Oct  3 18:35 pending.xml
-rw-r--r-- 1 oracle oinstall 1886 Oct  3 19:48 profile.xml_bkp_4thoct


Node 2

[oracle@rhel11gr2rac2 peer]$ cd /grid/11.2/gpnp/rhel11gr2rac2/profiles/peer
[oracle@rhel11gr2rac2 peer]$ cp profile.xml profile.xml_bkp_4thoct
[oracle@rhel11gr2rac2 peer]$ ls -lrt
total 20
-rw-r--r-- 1 oracle oinstall 1873 Mar 23  2013 profile_orig.xml
-rw-r--r-- 1 oracle oinstall 1880 Mar 23  2013 profile.old
-rw-r--r-- 1 oracle oinstall 1886 Mar 23  2013 profile.xml
-rw-r--r-- 1 oracle oinstall 1886 Oct  3 18:35 pending.xml
-rw-r--r-- 1 oracle oinstall 1886 Oct  3 19:45 profile.xml_bkp_4thoct


c) Ensure Oracle Clusterware is up and running on all the nodes.

Node 1

[oracle@rhel11gr2rac1 bin]$ ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online


Node 2


[oracle@rhel11gr2rac2 bin]$ ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

d) We need to use the oifcfg tool for configuring the network.

[oracle@rhel11gr2rac1 bin]$ ./oifcfg -h

Name:
        oifcfg - Oracle Interface Configuration Tool.

Usage:  oifcfg iflist [-p [-n]]
        oifcfg setif {-node <nodename> | -global} {<if_name>/<subnet>:<if_type>}...
        oifcfg getif [-node <nodename> | -global] [ -if <if_name>[/<subnet>] [-type <if_type>] ]
        oifcfg delif {{-node <nodename> | -global} [<if_name>[/<subnet>]] [-force] | -force}
        oifcfg [-help]

        <nodename>  - name of the host, as known to a communications network
        <if_name>   - name by which the interface is configured in the system
        <subnet>    - subnet address of the interface
        <if_type>   - type of the interface { cluster_interconnect | public }



e) Get the current configuration details.


[oracle@rhel11gr2rac1 bin]$ ./oifcfg getif
eth0  192.168.0.0  global  public
eth1  10.10.10.0   global  cluster_interconnect


f) Add the new cluster interconnect information.


$ oifcfg setif -global <interface>/<subnet>:cluster_interconnect

interface -- eth2
subnet    -- We are going to add the new interface on the same subnet as the previous interconnect (10.10.10.0).

-- We can use the below command to find the subnet of an interface.

[oracle@rhel11gr2rac1 bin]$ ./oifcfg iflist
eth0  192.168.0.0
eth1  10.10.10.0
eth1  169.254.0.0
eth2  10.10.10.0


-- Our new network interface is eth2, and hence the subnet is 10.10.10.0. (The 169.254.0.0 entry on eth1 is the link-local HAIP address managed by the clusterware.)
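The subnet oifcfg expects is the network address, i.e. the interface's IP ANDed with its netmask. A pure-shell sketch of that computation (the subnet_of helper is illustrative only):

```shell
# subnet_of IP NETMASK: print the network address (bitwise AND per octet)
subnet_of() {
  local a b c d m1 m2 m3 m4
  IFS=. read -r a b c d <<< "$1"
  IFS=. read -r m1 m2 m3 m4 <<< "$2"
  echo "$((a & m1)).$((b & m2)).$((c & m3)).$((d & m4))"
}

subnet_of 10.10.10.30 255.255.255.0   # -> 10.10.10.0, the subnet for eth2
```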

-- Note

i) This can be done with the -global option even if the interface is not available yet, but it cannot be done
with the -node option if the interface is not available; doing so will lead to node eviction.

ii) If you are adding a second private network (not replacing the existing one), ensure the MTU size of both interfaces is the same; otherwise instance startup will report the below error:


ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:if MTU failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpcini2
ORA-27303: additional information: requested interface lan1:801 has a different MTU (1500) than lan3:801 (9000), which is not supported. Check output from ifconfig command


Check the MTU of the private interface.

-- Node 1


[root@rhel11gr2rac1 ~]# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:0C:29:89:94:43
          inet addr:10.10.10.20  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe89:9443/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:220044 errors:0 dropped:0 overruns:0 frame:0
          TX packets:186665 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:147597750 (140.7 MiB)  TX bytes:108973996 (103.9 MiB)


[root@rhel11gr2rac1 ~]# ifconfig eth2
eth2      Link encap:Ethernet  HWaddr 00:0C:29:89:94:4D
          inet addr:10.10.10.30  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe89:944d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:410 errors:0 dropped:0 overruns:0 frame:0
          TX packets:215 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:84329 (82.3 KiB)  TX bytes:35114 (34.2 KiB)




[root@rhel11gr2rac2 ~]# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:0C:29:75:B5:06
          inet addr:10.10.10.21  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe75:b506/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:187806 errors:0 dropped:0 overruns:0 frame:0
          TX packets:220819 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:109461264 (104.3 MiB)  TX bytes:148275753 (141.4 MiB)



[root@rhel11gr2rac2 ~]# ifconfig eth2
eth2      Link encap:Ethernet  HWaddr 00:0C:29:75:B5:10
          inet addr:10.10.10.31  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe75:b510/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:498 errors:0 dropped:0 overruns:0 frame:0
          TX packets:186 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:103963 (101.5 KiB)  TX bytes:27884 (27.2 KiB)

-- All the above interfaces have the same MTU (1500).
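This MTU comparison can also be scripted. A minimal sketch, assuming ifconfig-style output; the sample strings below stand in for real `ifconfig eth1` / `ifconfig eth2` output:

```shell
#!/bin/sh
# Sketch: extract the MTU from ifconfig-style output and compare interfaces.
# In real use, capture `ifconfig eth1` / `ifconfig eth2` instead of samples.
get_mtu() {
    # Print the number that follows "MTU:" in the given text.
    printf '%s\n' "$1" | sed -n 's/.*MTU:\([0-9][0-9]*\).*/\1/p'
}

eth1_out='eth1  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1'
eth2_out='eth2  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1'

mtu1=$(get_mtu "$eth1_out")
mtu2=$(get_mtu "$eth2_out")
if [ "$mtu1" = "$mtu2" ]; then
    echo "MTU match: $mtu1"
else
    echo "MTU mismatch: $mtu1 vs $mtu2"
fi
```

A mismatch here is exactly the lan1/lan3 situation reported by ORA-27303 above.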

Now add the interface as below.


[oracle@rhel11gr2rac1 bin]$ ./oifcfg setif -global eth2/10.10.10.0:cluster_interconnect

Verify the changes made.

[oracle@rhel11gr2rac1 bin]$ ./oifcfg getif
eth0  192.168.0.0  global  public
eth1  10.10.10.0  global  cluster_interconnect
eth2  10.10.10.0  global  cluster_interconnect

[oracle@rhel11gr2rac2 bin]$ ./oifcfg getif
eth0  192.168.0.0  global  public
eth1  10.10.10.0  global  cluster_interconnect
eth2  10.10.10.0  global  cluster_interconnect


g) Shut down the Clusterware on all nodes.


[root@rhel11gr2rac1 bin]# ./crsctl stop crs
[root@rhel11gr2rac2 bin]# ./crsctl stop crs


h) If you configured oifcfg before the network card was available, make the changes at the
OS level now and verify that the network is up before bringing up CRS.


Ping test

Node 1

[root@rhel11gr2rac1 bin]# ping rhel11gr2rac1-priv2
PING rhel11gr2rac1-priv2.manzoor.com (10.10.10.30) 56(84) bytes of data.
64 bytes from rhel11gr2rac1-priv2.manzoor.com (10.10.10.30): icmp_seq=1 ttl=64 time=0.042 ms
64 bytes from rhel11gr2rac1-priv2.manzoor.com (10.10.10.30): icmp_seq=2 ttl=64 time=0.038 ms
64 bytes from rhel11gr2rac1-priv2.manzoor.com (10.10.10.30): icmp_seq=3 ttl=64 time=0.040 ms

--- rhel11gr2rac1-priv2.manzoor.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.038/0.040/0.042/0.001 ms

[root@rhel11gr2rac1 bin]# ping rhel11gr2rac2-priv2
PING rhel11gr2rac2-priv2.manzoor.com (10.10.10.31) 56(84) bytes of data.
64 bytes from rhel11gr2rac2-priv2.manzoor.com (10.10.10.31): icmp_seq=1 ttl=64 time=1.77 ms
64 bytes from rhel11gr2rac2-priv2.manzoor.com (10.10.10.31): icmp_seq=2 ttl=64 time=0.333 ms
64 bytes from rhel11gr2rac2-priv2.manzoor.com (10.10.10.31): icmp_seq=3 ttl=64 time=0.292 ms
64 bytes from rhel11gr2rac2-priv2.manzoor.com (10.10.10.31): icmp_seq=4 ttl=64 time=0.300 ms
64 bytes from rhel11gr2rac2-priv2.manzoor.com (10.10.10.31): icmp_seq=5 ttl=64 time=0.299 ms
64 bytes from rhel11gr2rac2-priv2.manzoor.com (10.10.10.31): icmp_seq=6 ttl=64 time=0.463 ms

--- rhel11gr2rac2-priv2.manzoor.com ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 4999ms
rtt min/avg/max/mdev = 0.292/0.576/1.772/0.538 ms


Node 2


[root@rhel11gr2rac2 bin]# ping rhel11gr2rac2-priv2
PING rhel11gr2rac2-priv2.manzoor.com (10.10.10.31) 56(84) bytes of data.
64 bytes from rhel11gr2rac2-priv2.manzoor.com (10.10.10.31): icmp_seq=1 ttl=64 time=0.048 ms
64 bytes from rhel11gr2rac2-priv2.manzoor.com (10.10.10.31): icmp_seq=2 ttl=64 time=0.050 ms
64 bytes from rhel11gr2rac2-priv2.manzoor.com (10.10.10.31): icmp_seq=3 ttl=64 time=0.045 ms

--- rhel11gr2rac2-priv2.manzoor.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.045/0.047/0.050/0.008 ms
[root@rhel11gr2rac2 bin]# ping rhel11gr2rac1-priv2
PING rhel11gr2rac1-priv2.manzoor.com (10.10.10.30) 56(84) bytes of data.
64 bytes from rhel11gr2rac1-priv2.manzoor.com (10.10.10.30): icmp_seq=1 ttl=64 time=2.20 ms
64 bytes from rhel11gr2rac1-priv2.manzoor.com (10.10.10.30): icmp_seq=2 ttl=64 time=0.401 ms
64 bytes from rhel11gr2rac1-priv2.manzoor.com (10.10.10.30): icmp_seq=3 ttl=64 time=0.321 ms

--- rhel11gr2rac1-priv2.manzoor.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.321/0.976/2.207/0.871 ms


i) Start the CRS.


[root@rhel11gr2rac1 bin]# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

[root@rhel11gr2rac2 bin]# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.


j) Now verify the status

Node 1

[oracle@rhel11gr2rac1 bin]$ ./oifcfg getif
eth0  192.168.0.0  global  public
eth1  10.10.10.0  global  cluster_interconnect
eth2  10.10.10.0  global  cluster_interconnect

[oracle@rhel11gr2rac1 bin]$ ./olsnodes -l -p
rhel11gr2rac1   10.10.10.20,10.10.10.30


-- Both private networks are listed.


[oracle@rhel11gr2rac2 bin]$ ./oifcfg getif
eth0  192.168.0.0  global  public
eth1  10.10.10.0  global  cluster_interconnect
eth2  10.10.10.0  global  cluster_interconnect

[oracle@rhel11gr2rac2 bin]$ ./olsnodes -l -p
rhel11gr2rac2   10.10.10.21,10.10.10.31




========================================================================
For 11.2.0.2+: (HAIP address will show in alert log instead of private IP)
eg.

Cluster communication is configured to use the following interface(s) for this instance
  169.254.86.97
=======================================================================================

From alert log
==============

Private Interface 'eth1:1' configured from GPnP for use as a private interconnect.
  [name='eth1:1', type=1, ip=169.254.48.13, mac=00-0c-29-75-b5-06, net=169.254.0.0/17, mask=255.255.128.0, use=haip:cluster_interconnect/62]
Private Interface 'eth2:1' configured from GPnP for use as a private interconnect.
  [name='eth2:1', type=1, ip=169.254.227.73, mac=00-0c-29-75-b5-10, net=169.254.128.0/17, mask=255.255.128.0, use=haip:cluster_interconnect/62]

.....
Cluster communication is configured to use the following interface(s) for this instance
  169.254.48.13
  169.254.227.73



Note: interconnect communication will use both virtual private IPs; in case of a network failure, as long as one private network adapter is functioning, both IPs will remain active.


From Database


SQL> select * from GV$configured_interconnects where is_public = 'NO';

   INST_ID NAME            IP_ADDRESS       IS_ SOURCE
---------- --------------- ---------------- --- -------------------------------
         2 eth1:1          169.254.48.13    NO
         2 eth2:1          169.254.227.73   NO
         1 eth1:1          169.254.62.58    NO
         1 eth2:1          169.254.250.70   NO


Here each private interface has a virtual IP: on node 1, eth1 has the VIP 169.254.62.58 and eth2 has 169.254.250.70; likewise on node 2, eth1 has 169.254.48.13 and eth2 has 169.254.227.73.

These VIPs provide failover: if one network interface goes down, its VIP fails over to the other
available interface.

Eg.

If interface eth1 on node 1 fails, the VIP 169.254.62.58 fails over to eth2. Thus, as long as one private network adapter is functioning, both IPs remain active.
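The failover can be spotted in ifconfig output by checking which interface alias hosts a given HAIP address. A minimal parsing sketch; the exact output format is an assumption, and the sample text mirrors the ifconfig output shown in this post:

```shell
#!/bin/sh
# Sketch: given ifconfig-style text, print which interface alias currently
# hosts a given HAIP address. In real use, feed it `ifconfig` output.
find_haip_if() {
    # $1 = ifconfig text, $2 = HAIP address to look for
    printf '%s\n' "$1" | awk -v a="$2" '
        NF && $0 !~ /^[ \t]/ { ifc = $1 }              # remember current alias
        index($0, "inet addr:" a " ") { print ifc }    # address found under it
    '
}

sample='eth2:1    Link encap:Ethernet  HWaddr 00:0C:29:89:94:4D
          inet addr:169.254.62.58  Bcast:169.254.127.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1'

find_haip_if "$sample" 169.254.62.58
# prints: eth2:1
```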



Testing..

   INST_ID NAME            IP_ADDRESS       IS_ SOURCE
---------- --------------- ---------------- --- -------------------------------
         2 eth1:1          169.254.48.13    NO
         2 eth2:1          169.254.227.73   NO
         1 eth1:1          169.254.62.58    NO
         1 eth2:1          169.254.250.70   NO


Let's bring down interface eth1 on node 1.

[root@rhel11gr2rac1 ~]# ifdown eth1


Snippet from the node 1 DB alert log

Thu Oct 03 23:38:45 2013
SKGXP: ospid 16542: network interface query failed for IP address 169.254.62.58.
SKGXP: [error 11132]


ifconfig

--output

eth1      Link encap:Ethernet  HWaddr 00:0C:29:89:94:43
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:375106 errors:0 dropped:0 overruns:0 frame:0
          TX packets:310254 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:251856284 (240.1 MiB)  TX bytes:186387264 (177.7 MiB)

eth2      Link encap:Ethernet  HWaddr 00:0C:29:89:94:4D
          inet addr:10.10.10.30  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe89:944d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:157698 errors:0 dropped:0 overruns:0 frame:0
          TX packets:139343 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:113436206 (108.1 MiB)  TX bytes:74727012 (71.2 MiB)

eth2:1    Link encap:Ethernet  HWaddr 00:0C:29:89:94:4D
          inet addr:169.254.62.58  Bcast:169.254.127.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth2:2    Link encap:Ethernet  HWaddr 00:0C:29:89:94:4D
          inet addr:169.254.250.70  Bcast:169.254.255.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1



--- Since eth1 is down, the VIP 169.254.62.58 has failed over to the eth2 interface, appearing as eth2:1.




[root@rhel11gr2rac2 ~]# ifdown eth1


ifconfig output

eth1      Link encap:Ethernet  HWaddr 00:0C:29:75:B5:06
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:326893 errors:0 dropped:0 overruns:0 frame:0
          TX packets:377493 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:192462174 (183.5 MiB)  TX bytes:259305297 (247.2 MiB)

eth2      Link encap:Ethernet  HWaddr 00:0C:29:75:B5:10
          inet addr:10.10.10.31  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe75:b510/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:132279 errors:0 dropped:0 overruns:0 frame:0
          TX packets:165414 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:72715227 (69.3 MiB)  TX bytes:114056247 (108.7 MiB)

eth2:1    Link encap:Ethernet  HWaddr 00:0C:29:75:B5:10
          inet addr:169.254.48.13  Bcast:169.254.127.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth2:2    Link encap:Ethernet  HWaddr 00:0C:29:75:B5:10
          inet addr:169.254.227.73  Bcast:169.254.255.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1


-- Since eth1 is down, the VIP 169.254.48.13 has failed over to eth2, appearing as eth2:1.


-- Even though one interface is down on each node, both VIPs are still present on each node, served by the remaining network interface.

The oifcfg output as below.

[root@rhel11gr2rac2 bin]# ./oifcfg iflist -n -p
eth0  192.168.0.0  PRIVATE  255.255.255.0
eth2  10.10.10.0  PRIVATE  255.255.255.0
eth2  169.254.0.0  UNKNOWN  255.255.128.0
eth2  169.254.128.0  UNKNOWN  255.255.128.0


-- Now let's bring up eth1 on node 2.




[root@rhel11gr2rac2 bin]# ifup eth1


ifconfig output in node 2


eth1      Link encap:Ethernet  HWaddr 00:0C:29:75:B5:06
          inet addr:10.10.10.21  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe75:b506/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:327807 errors:0 dropped:0 overruns:0 frame:0
          TX packets:378590 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:192898043 (183.9 MiB)  TX bytes:260037599 (247.9 MiB)

eth1:1    Link encap:Ethernet  HWaddr 00:0C:29:75:B5:06
          inet addr:169.254.48.13  Bcast:169.254.127.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth1:2    Link encap:Ethernet  HWaddr 00:0C:29:75:B5:06
          inet addr:169.254.227.73  Bcast:169.254.255.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth2      Link encap:Ethernet  HWaddr 00:0C:29:75:B5:10
          inet addr:10.10.10.31  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe75:b510/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:138848 errors:0 dropped:0 overruns:0 frame:0
          TX packets:173925 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:75828753 (72.3 MiB)  TX bytes:120251886 (114.6 MiB)


-- Now both VIPs are served by eth1 even though eth2 is up and running; this is because one interface is still down on node 1.


[root@rhel11gr2rac1 ~]# ifup eth1

ifconfig output in node 1

eth1      Link encap:Ethernet  HWaddr 00:0C:29:89:94:43
          inet addr:10.10.10.20  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe89:9443/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:375296 errors:0 dropped:0 overruns:0 frame:0
          TX packets:310382 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:251983004 (240.3 MiB)  TX bytes:186445931 (177.8 MiB)

eth1:1    Link encap:Ethernet  HWaddr 00:0C:29:89:94:43
          inet addr:169.254.62.58  Bcast:169.254.127.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth2      Link encap:Ethernet  HWaddr 00:0C:29:89:94:4D
          inet addr:10.10.10.30  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe89:944d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:186819 errors:0 dropped:0 overruns:0 frame:0
          TX packets:161939 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:134733101 (128.4 MiB)  TX bytes:85910612 (81.9 MiB)

eth2:2    Link encap:Ethernet  HWaddr 00:0C:29:89:94:4D
          inet addr:169.254.250.70  Bcast:169.254.255.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1




ifconfig output in node 2 (after the eth1 is up on both the nodes)

eth1      Link encap:Ethernet  HWaddr 00:0C:29:75:B5:06
          inet addr:10.10.10.21  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe75:b506/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:333637 errors:0 dropped:0 overruns:0 frame:0
          TX packets:386233 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:196228732 (187.1 MiB)  TX bytes:265869284 (253.5 MiB)

eth1:1    Link encap:Ethernet  HWaddr 00:0C:29:75:B5:06
          inet addr:169.254.48.13  Bcast:169.254.127.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth2      Link encap:Ethernet  HWaddr 00:0C:29:75:B5:10
          inet addr:10.10.10.31  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe75:b510/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:140802 errors:0 dropped:0 overruns:0 frame:0
          TX packets:175889 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:76612446 (73.0 MiB)  TX bytes:121438787 (115.8 MiB)

eth2:1    Link encap:Ethernet  HWaddr 00:0C:29:75:B5:10
          inet addr:169.254.227.73  Bcast:169.254.255.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1



-- As long as one interface is healthy there won't be any impact to the ASM/DB instances.

Now we brought down eth2 on node 1 and eth1 on node 2. Below is the oifcfg output.

node 1

[root@rhel11gr2rac1 bin]# ./oifcfg iflist -n -p
eth0  192.168.0.0  PRIVATE  255.255.255.0
eth1  10.10.10.0  PRIVATE  255.255.255.0
eth1  169.254.0.0  UNKNOWN  255.255.128.0
eth1  169.254.128.0  UNKNOWN  255.255.128.0


Node 2

[root@rhel11gr2rac2 bin]# ./oifcfg iflist -n -p
eth0  192.168.0.0  PRIVATE  255.255.255.0
eth2  10.10.10.0  PRIVATE  255.255.255.0
eth2  169.254.128.0  UNKNOWN  255.255.128.0
eth2  169.254.0.0  UNKNOWN  255.255.128.0


Below is the oifcfg output when both interfaces are up on both nodes.

Node 1


[root@rhel11gr2rac1 bin]# ./oifcfg iflist -n -p
eth0  192.168.0.0  PRIVATE  255.255.255.0
eth1  10.10.10.0  PRIVATE  255.255.255.0
eth1  169.254.0.0  UNKNOWN  255.255.128.0
eth2  10.10.10.0  PRIVATE  255.255.255.0
eth2  169.254.128.0  UNKNOWN  255.255.128.0


Node 2


[root@rhel11gr2rac2 bin]# ./oifcfg iflist -n -p
eth0  192.168.0.0  PRIVATE  255.255.255.0
eth1  10.10.10.0  PRIVATE  255.255.255.0
eth1  169.254.0.0  UNKNOWN  255.255.128.0
eth2  10.10.10.0  PRIVATE  255.255.255.0
eth2  169.254.128.0  UNKNOWN  255.255.128.0





Reference:-
11gR2 Grid Infrastructure Redundant Interconnect and ora.cluster_interconnect.haip (Doc ID 1210883.1)
How to Modify Private Network Information in Oracle Clusterware (Doc ID 283684.1)

Wednesday, October 2, 2013

Oracle Golden Gate High Availability using Oracle Clusterware (11.2)

OGG High Availability using Oracle Clusterware
==============================================
1) Oracle GoldenGate cluster high availability prerequisites.

a. Oracle GoldenGate runs on only one server at any time.
b. In the event of a failure on one node, Oracle GoldenGate can be started on another node.
c. In order to resume processing on another node, the recovery-related files
(checkpoint files and trail files) must be stored on a shared location.
d. Oracle ACFS is the recommended cluster file system for Oracle GoldenGate binaries and
trail files in Real Application Clusters configurations, for ease of management and high availability.


Note: ACFS can be used for Oracle Golden Gate trail files with no restrictions. Oracle GoldenGate installation can be done on ACFS and you can also store the recovery-related files in a cluster configuration in ACFS to make them accessible to all nodes. However if your Oracle Grid Infrastructure version is older than 11.2.0.3 then ACFS mounted on multiple servers concurrently does not currently support file locking, thus you would need to mount ACFS on only one server.

If ACFS is mounted on one server at a time then file locking is supported in pre 11.2.0.3 Grid Infrastructure releases. This file locking issue has been resolved in Oracle Grid Infrastructure release 12c and the fix has been back ported up to version 11.2.0.3.


2) Oracle clusterware

a) Oracle clusterware provides the capability to manage the third-party applications.
b) There are commands to register an application and instruct Oracle Clusterware how to manage the application in a clustered environment.
c) This capability will be used to register the Oracle GoldenGate manager process as an application managed through Oracle Clusterware.
d) Oracle Clusterware can be installed standalone without an Oracle RAC database and still manage a cluster of  servers and various applications running on these servers. As such Oracle Clusterware can also be installed on more than just the database servers to form a single cluster.

3) Oracle Golden Gate Installations.

a) You may choose to perform a local installation on every server, or a single installation on a shared file system. You will need shared storage for the recovery-related files. On a Unix/Linux platform you can use a symbolic link to a central location for the shared directories.

4) Virtual IP.

a) Oracle Clusterware uses the concept of a Virtual IP address (VIP) to manage high availability for applications that require incoming network traffic (including the Oracle RAC database).
b) A VIP is an IP address on the public subnet that can be used to access a server. If the server hosting the VIP were to go down, then Oracle Clusterware will migrate the VIP to a surviving server to minimize interruptions for the application  accessing the server (through the VIP).
c) This concept enables faster failovers compared to time-out based failovers on a server's actual IP address in case of a server failure.
d) For Oracle GoldenGate, you should use a VIP to access the manager process to isolate access to the manager process from the physical server that is running Oracle GoldenGate. Remote pumps must use the VIP to contact the Oracle GoldenGate manager. The VIP must be an available IP address on the public subnet and cannot be determined through DHCP.
Ask a system administrator for an available fixed IP address for Oracle GoldenGate managed through Oracle Clusterware.

5. We need to instruct Oracle Clusterware how to start, stop, and check the process.

   i) Start
a) Oracle GoldenGate manager is the process that starts all other Oracle GoldenGate processes. The only process that Oracle Clusterware should start is the manager process. Use the AUTOSTART parameter in the manager parameter file to start extract and replicat processes. You can use wild cards (AUTOSTART ER *) to start all extract and replicat processes.
b) Also note that once manager is started through Oracle Clusterware, it is Oracle Clusterware that manages its availability. If you would stop manager through the command interface ggsci, then Oracle Clusterware will attempt to restart it.  Use the Oracle Clusterware commands to stop Oracle GoldenGate and prevent Oracle Clusterware from attempting to restart it.

   ii) check
a) The validation whether Oracle GoldenGate is running is equivalent to making sure the Oracle GoldenGate manager runs.

   iii) Stop
a) Stop must stop all Oracle GoldenGate processes, including manager. Stop may be called during a planned downtime (e.g. a server is taken out of a cluster for maintenance reasons) and/or if you manually instruct Oracle Clusterware to relocate Oracle GoldenGate to a different server (e.g. to change the load on a server). If a server crashes then all processes will go down with it, in which case they can be started on another server.
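The start step above relies on manager's AUTOSTART behaviour, which is driven by the manager parameter file. A minimal mgr.prm sketch; the port number and retry values are assumptions, not taken from this setup:

```
-- mgr.prm (sketch; PORT value is an assumption)
PORT 7809
-- start all extract/replicat groups when manager starts
AUTOSTART ER *
-- restart abended groups: up to 3 retries, 5 minutes apart (assumed values)
AUTORESTART ER *, RETRIES 3, WAITMINUTES 5
```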


Setup
=====


1. As of now the below setup is running.

a. Source is a two-node RAC where GoldenGate is configured using ACFS.
b. One extract and one pump process are configured on the source.
c. Target is a standalone DB.
d. One replicat process is configured on the target.



2. Now we need to register GoldenGate in Oracle Clusterware. We need to use Oracle Clusterware commands to create, register and set privileges on the VIP and the Oracle GoldenGate application. Once registered, use the Oracle Clusterware commands to start, relocate and stop Oracle GoldenGate.


3. Add an application VIP.

a) The first step is to create an application VIP, which will be used to access Oracle GoldenGate.
   Oracle Clusterware will assign the VIP to a physical server and migrate the VIP if that server goes down or if you instruct Clusterware to do so.


b. Add the below VIP entry to the /etc/hosts file on both nodes (the VIP should be on the same subnet as the public IP).

########## VIP FOR GOLDENGATE ################################

192.168.0.22    goldengate-vip.manzoor.com      goldengate-vip


c. Create an application VIP using the below command as the root user.


[root@rhel11gr2rac1 bin]# cd /grid/11.2/bin

[root@rhel11gr2rac1 bin]# ./appvipcfg -help
Production Copyright 2007, 2008, Oracle.All rights reserved
Unknown option: help

  Usage: appvipcfg create -network= -ip= -vipname=
                          -user=[-group=] [-failback=0 | 1]
                   delete -vipname=


To identify the network number, execute the below command.

[root@rhel11gr2rac1 bin]# ./crsctl stat res -p | grep  -ie.network -ie subnet | grep -ie name -ie subnet
NAME=ora.net1.network
USR_ORA_SUBNET=192.168.0.0

Here ora.net1 in NAME denotes network number 1, and USR_ORA_SUBNET denotes the subnet under which
the VIP will be created.
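The network number can also be pulled out of the resource name programmatically. A minimal sketch; the input string is a NAME line of the form printed by crsctl above:

```shell
#!/bin/sh
# Sketch: derive the appvipcfg -network number from the ora.netN.network
# resource name reported by crsctl.
name='NAME=ora.net1.network'
netnum=$(printf '%s\n' "$name" | sed -n 's/.*ora\.net\([0-9][0-9]*\)\.network.*/\1/p')
echo "network number: $netnum"
# prints: network number: 1
```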

Execute the below command to create the application vip.

./appvipcfg create -network=1 -ip=192.168.0.22 -vipname=goldengate-vip -user=root
Production Copyright 2007, 2008, Oracle.All rights reserved
2013-10-01 23:18:11: Creating Resource Type
2013-10-01 23:18:11: Executing /grid/11.2/bin/crsctl add type app.appvip_net1.type -basetype ora.cluster_vip_net1.type -file /grid/11.2/crs/template/appvip.type
2013-10-01 23:18:11: Executing cmd: /grid/11.2/bin/crsctl add type app.appvip_net1.type -basetype ora.cluster_vip_net1.type -file /grid/11.2/crs/template/appvip.type
2013-10-01 23:18:13: Create the Resource
2013-10-01 23:18:13: Executing /grid/11.2/bin/crsctl add resource goldengate-vip -type app.appvip_net1.type -attr "USR_ORA_VIP=192.168.0.22,START_DEPENDENCIES=hard(ora.net1.network) pullup(ora.net1.network),STOP_DEPENDENCIES=hard(ora.net1.network),ACL='owner:root:rwx,pgrp:root:r-x,other::r--,user:root:r-x',HOSTING_MEMBERS=rhel11gr2rac1.manzoor.com,APPSVIP_FAILBACK="
2013-10-01 23:18:13: Executing cmd: /grid/11.2/bin/crsctl add resource goldengate-vip -type app.appvip_net1.type -attr "USR_ORA_VIP=192.168.0.22,START_DEPENDENCIES=hard(ora.net1.network) pullup(ora.net1.network),STOP_DEPENDENCIES=hard(ora.net1.network),ACL='owner:root:rwx,pgrp:root:r-x,other::r--,user:root:r-x',HOSTING_MEMBERS=rhel11gr2rac1.manzoor.com,APPSVIP_FAILBACK="


d) Now allow the Oracle Clusterware owner (e.g. oracle or grid) to run the script to start the VIP.

Execute the below as root.

./crsctl setperm resource goldengate-vip -u user:oracle:r-x


e) As the oracle user, start the VIP.


[oracle@rhel11gr2rac1 bin]$ ./crsctl start resource goldengate-vip
CRS-2672: Attempting to start 'goldengate-vip' on 'rhel11gr2rac1'
CRS-2676: Start of 'goldengate-vip' on 'rhel11gr2rac1' succeeded

f) Check the status of the vip.

[oracle@rhel11gr2rac1 bin]$ ./crsctl stat res goldengate-vip
NAME=goldengate-vip
TYPE=app.appvip_net1.type
TARGET=ONLINE
STATE=ONLINE on rhel11gr2rac1


g) Now we can ping the VIP from the other nodes. Test it from node 2.

[root@rhel11gr2rac2 ~]# ping 192.168.0.22
PING 192.168.0.22 (192.168.0.22) 56(84) bytes of data.
64 bytes from 192.168.0.22: icmp_seq=1 ttl=64 time=2.29 ms
64 bytes from 192.168.0.22: icmp_seq=2 ttl=64 time=0.389 ms
64 bytes from 192.168.0.22: icmp_seq=3 ttl=64 time=0.372 ms

--- 192.168.0.22 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 0.372/1.018/2.295/0.903 ms



4) Now develop an agent script.

a) Oracle Clusterware runs resource-specific commands through an entity called an agent.
The agent script must be able to accept 5 parameter values: start, stop, check, clean and abort (optional).

b) Now we will create a script and place it in a shared location; here we have placed the script under the GoldenGate home, which is accessible from both nodes. (This is the sample script provided by Oracle; you can also have a customized script as per your requirements.)

Script name = gg_monitor_start.sh


#!/bin/sh
#goldengate_action.scr
. ~oracle/.bash_profile
[ -z "$1" ]&& echo "ERROR!! Usage $0 "&& exit 99
GGS_HOME=/golden_gate
#specify delay after start before checking for successful start
start_delay_secs=5
#Include the Oracle GoldenGate home in the library path to start GGSCI
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${GGS_HOME}
#set the oracle home to the database to ensure Oracle GoldenGate will get
#the right environment settings to be able to connect to the database
export ORACLE_HOME=/u01/app/oracle/product/11.2/db
export CRS_HOME=/grid/11.2
#Set NLS_LANG otherwise it will default to US7ASCII

export NLS_LANG=AMERICAN_AMERICA.AL32UTF8
logfile=/tmp/crs_gg_start.log

###########################
function log
###########################
{
DATETIME=`date +%d/%m/%y-%H:%M:%S`
echo $DATETIME "goldengate_action.scr>>" $1
echo $DATETIME "goldengate_action.scr>>" $1 >> $logfile
}
#check_process validates that a manager process is running at the PID
#that Oracle GoldenGate specifies.
check_process () {
dt=`date +%d/%m/%y-%H:%M:%S`
if ( [ -f "${GGS_HOME}/dirpcs/MGR.pcm" ] )
then
pid=`cut -f8 "${GGS_HOME}/dirpcs/MGR.pcm"`
if [ ${pid} = `ps -e |grep ${pid} |grep mgr |awk '{ print $1 }'` ]
then
#manager process is running on the PID . exit success
echo $dt "manager process is running on the PID . exit success">> /tmp/check.out
exit 0
else
#manager process is not running on the PID
echo $dt "manager process is not running on the PID" >> /tmp/check.out
exit 1
fi
else
#manager is not running because there is no PID file
echo $dt "manager is not running because there is no PID file" >> /tmp/check.out
exit 1
fi
}
#call_ggsci is a generic routine that executes a ggsci command
call_ggsci () {
log "entering call_ggsci"
ggsci_command=$1
#log "about to execute $ggsci_command"
log "id= $USER"
cd ${GGS_HOME}
ggsci_output=`${GGS_HOME}/ggsci << EOF
${ggsci_command}
exit
EOF`
log "got output of : $ggsci_output"
}

case $1 in
'start')
#Updated by Sourav B (02/10/2011)
# During failover, if the "mgr.pcm" file is not deleted at the node crash,
# Oracle Clusterware won't start the manager on the new node, assuming the
# manager process is still running on the failed node. To get around this issue
# we delete the "mgr.pcm" file before starting up the manager on the new
# node. We also delete the other process files with the pc* extension and, to
# avoid any file locking issue, we first back up the checkpoint files, then
# delete them from the dirchk directory. After that we restore the checkpoint
# files from backup to the original location (dirchk directory).
log "removing *.pc* files from dirpcs directory..."
cd $GGS_HOME/dirpcs
rm -f *.pc*
log "creating tmp directory to backup checkpoint file...."
cd $GGS_HOME/dirchk
mkdir tmp
log "backing up checkpoint files..."
cp *.cp* $GGS_HOME/dirchk/tmp
log "Deleting checkpoint files under dirchk......"
rm -f *.cp*
log "Restore checkpoint files from backup to dirchk directory...."
cp $GGS_HOME/dirchk/tmp/*.cp* $GGS_HOME/dirchk
log "Deleting tmp directory...."
rm -rf tmp
log "starting manager"
call_ggsci 'start manager'


#there is a small delay between issuing the start manager command
#and the process being spawned on the OS . wait before checking
log "sleeping for start_delay_secs"
sleep ${start_delay_secs}
#check whether manager is running and exit accordingly
check_process
;;
'stop')
#attempt a clean stop for all non-manager processes
call_ggsci 'stop er *'
#ensure everything is stopped
call_ggsci 'stop er *!'
#stop manager without (y/n) confirmation
call_ggsci 'stop manager!'
#exit success
exit 0
;;
'check')
check_process
exit 0
;;
'clean')
#attempt a clean stop for all non-manager processes
call_ggsci 'stop er *'
#ensure everything is stopped
call_ggsci 'stop er *!'
#in case there are lingering processes
call_ggsci 'kill er *'
#stop manager without (y/n) confirmation
call_ggsci 'stop manager!'
#exit success
exit 0
;;
'abort')
#ensure everything is stopped
call_ggsci 'stop er *!'
#in case there are lingering processes
call_ggsci 'kill er *'
#stop manager without (y/n) confirmation
call_ggsci 'stop manager!'
#exit success
exit 0
;;
esac

c) Now we need to add a Clusterware resource for the GoldenGate application. As the oracle user, execute the
below command.


[oracle@rhel11gr2rac1 bin]$ ./crsctl add resource ggateapp -type cluster_resource -attr "ACTION_SCRIPT=/golden_gate/gg_monitor_start.sh,CHECK_INTERVAL=30,START_DEPENDENCIES='hard(goldengate-vip) pullup(goldengate-vip)', STOP_DEPENDENCIES='hard(goldengate-vip)'"


where ggateapp is the name we have given to the GoldenGate resource.

START_DEPENDENCIES: there is a hard start dependency on goldengate-vip. This indicates that the VIP and the ggateapp application should  always start together.

STOP_DEPENDENCIES: there is a hard stop dependency on goldengate-vip. This indicates that the VIP and the ggateapp application should always stop together.

d) Now set the ownership of the Oracle GoldenGate application if its owner (e.g. ggowner) is different from the Oracle Clusterware owner. If the GoldenGate owner is the same, skip this step.

As root execute the below command.

./crsctl setperm resource ggateapp -o ggowner


e) Now start the resource as the oracle user.

[oracle@rhel11gr2rac1 bin]$ ./crsctl start res ggateapp
CRS-2672: Attempting to start 'ggateapp' on 'rhel11gr2rac1'
CRS-2676: Start of 'ggateapp' on 'rhel11gr2rac1' succeeded

[oracle@rhel11gr2rac1 bin]$ ./crsctl status res ggateapp
NAME=ggateapp
TYPE=cluster_resource
TARGET=ONLINE
STATE=ONLINE on rhel11gr2rac1


[oracle@rhel11gr2rac1 bin]$ ./crsctl stop res ggateapp
CRS-2673: Attempting to stop 'ggateapp' on 'rhel11gr2rac1'
CRS-2677: Stop of 'ggateapp' on 'rhel11gr2rac1' succeeded


-- Now let's check the status in ggsci.


[oracle@rhel11gr2rac1 golden_gate]$ ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 11.2.1.0.1 OGGCORE_11.2.1.0.1_PLATFORMS_120423.0230_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Apr 23 2012 08:32:14

Copyright (C) 1995, 2012, Oracle and/or its affiliates. All rights reserved.



GGSCI (rhel11gr2rac1.manzoor.com) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     STOPPED
EXTRACT     STOPPED     PTBLS       00:00:00      00:00:29
EXTRACT     STOPPED     XTBLS       00:00:02      00:00:27



--- It is showing STOPPED, since stopping the Clusterware resource also stopped the GoldenGate processes.


[oracle@rhel11gr2rac1 bin]$ ./crsctl start res ggateapp
CRS-2672: Attempting to start 'ggateapp' on 'rhel11gr2rac1'
CRS-2676: Start of 'ggateapp' on 'rhel11gr2rac1' succeeded

[oracle@rhel11gr2rac1 golden_gate]$ ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 11.2.1.0.1 OGGCORE_11.2.1.0.1_PLATFORMS_120423.0230_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Apr 23 2012 08:32:14

Copyright (C) 1995, 2012, Oracle and/or its affiliates. All rights reserved.



GGSCI (rhel11gr2rac1.manzoor.com) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     PTBLS       00:00:00      00:00:10
EXTRACT     RUNNING     XTBLS       00:00:01      00:00:09


--- Now let's relocate the ggateapp resource to the other node (e.g. for a scheduled downtime).


[oracle@rhel11gr2rac1 bin]$ ./crsctl relocate resource ggateapp -f
CRS-2673: Attempting to stop 'ggateapp' on 'rhel11gr2rac1'
CRS-2677: Stop of 'ggateapp' on 'rhel11gr2rac1' succeeded
CRS-2673: Attempting to stop 'goldengate-vip' on 'rhel11gr2rac1'
CRS-2677: Stop of 'goldengate-vip' on 'rhel11gr2rac1' succeeded
CRS-2672: Attempting to start 'goldengate-vip' on 'rhel11gr2rac2'
CRS-2676: Start of 'goldengate-vip' on 'rhel11gr2rac2' succeeded
CRS-2672: Attempting to start 'ggateapp' on 'rhel11gr2rac2'
CRS-2676: Start of 'ggateapp' on 'rhel11gr2rac2' succeeded

-- Let's check the gg processes on node 2.

[oracle@rhel11gr2rac2 golden_gate]$ ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 11.2.1.0.1 OGGCORE_11.2.1.0.1_PLATFORMS_120423.0230_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Apr 23 2012 08:32:14

Copyright (C) 1995, 2012, Oracle and/or its affiliates. All rights reserved.



GGSCI (rhel11gr2rac2.manzoor.com) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     PTBLS       00:00:00      00:00:04
EXTRACT     RUNNING     XTBLS       00:00:05      00:00:06



[oracle@rhel11gr2rac1 bin]$ ./crsctl status resource ggateapp
NAME=ggateapp
TYPE=cluster_resource
TARGET=ONLINE
STATE=ONLINE on rhel11gr2rac2


-- Now let's check how the failover works.

Let's crash node 2 (power off from the VMware console).

Below is the status after node 2 goes down:

Cluster Resources
--------------------------------------------------------------------------------
ggateapp
      1        ONLINE  OFFLINE
goldengate-vip
      1        ONLINE  OFFLINE                               STARTING



Cluster Resources
--------------------------------------------------------------------------------
ggateapp
      1        ONLINE  ONLINE       rhel11gr2rac1
goldengate-vip
      1        ONLINE  ONLINE       rhel11gr2rac1



-- We can see that the ggateapp resource and goldengate-vip have failed over from
node 2 to node 1.


Below is the output from ggsci.


GGSCI (rhel11gr2rac1.manzoor.com) 3> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     ABENDED     PTBLS       00:00:00      00:04:10
EXTRACT     RUNNING     XTBLS       00:00:00      00:00:09



-- The manager and the extract process have started, but the pump extract has abended with the below
error.

2013-10-02 22:20:56  ERROR   OGG-01031  There is a problem in network communication, a remote file problem, encryption keys for target and source do not matc
h (if using ENCRYPT) or an unknown error. (Reply received is Unable to open file "./dirdat/XT000016" (error 11, Resource temporarily unavailable)).

2013-10-02 22:20:56  ERROR   OGG-01668  PROCESS ABENDING.



Source --


GGSCI (rhel11gr2rac1.manzoor.com) 17> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     ABENDED     PTBLS       00:00:00      00:08:18
EXTRACT     RUNNING     XTBLS       00:00:00      00:00:00


GGSCI (rhel11gr2rac1.manzoor.com) 18> info PTBLS

EXTRACT    PTBLS     Last Started 2013-10-02 22:24   Status ABENDED
Checkpoint Lag       00:00:00 (updated 00:08:24 ago)
Log Read Checkpoint  File ./dirdat/XT000019
                     2013-10-02 22:16:24.415084  RBA 1111



Target --


GGSCI (standalone2.manzoor.com) 2> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
REPLICAT    RUNNING     RTBLS       00:00:00      00:00:05


GGSCI (standalone2.manzoor.com) 3> info rtbls

REPLICAT   RTBLS     Last Started 2013-10-02 20:38   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:07 ago)
Log Read Checkpoint  File ./dirdat/XT000016
                     2013-10-02 22:16:24.422522  RBA 1991


GGSCI (standalone2.manzoor.com) 4> send rtbls status

Sending STATUS request to REPLICAT RTBLS ...
  Current status: At EOF
  Sequence #: 16
  RBA: 1991
  0 records in current transaction


--- The replicat process shows it has processed all the data and is currently at end of file (EOF).



-- Now we will do an ET (extract trail) rollover on the source and then reposition the replicat on the target.


Source--

GGSCI (rhel11gr2rac1.manzoor.com) 21> alter extract ptbls etrollover

2013-10-02 22:34:04  INFO    OGG-01520  Rollover performed.  For each affected output trail of Version 10 or higher format, after
starting the source extract, issue ALTER EXTSEQNO for that trail's reader (either pump EXTRACT or REPLICAT) to move the reader's scan to the new trail file;  it will not happen automatically.
EXTRACT altered.


GGSCI (rhel11gr2rac1.manzoor.com) 23> start extract ptbls

Sending START request to MANAGER ...
EXTRACT PTBLS starting


GGSCI (rhel11gr2rac1.manzoor.com) 24> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     PTBLS       00:00:00      00:00:58
EXTRACT     RUNNING     XTBLS       00:00:00      00:00:08

GGSCI (rhel11gr2rac1.manzoor.com) 31> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     PTBLS       00:00:00      00:00:05
EXTRACT     RUNNING     XTBLS       00:00:00      00:00:00

GGSCI (rhel11gr2rac1.manzoor.com) 32> info ptbls

EXTRACT    PTBLS     Last Started 2013-10-02 22:35   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:02 ago)
Log Read Checkpoint  File ./dirdat/XT000020
                     2013-10-02 22:20:46.216461  RBA 1111



--- Target

On the target we need to reposition the replicat to start from the next trail sequence, since we
did an ET rollover on the source.

GGSCI (standalone2.manzoor.com) 3> stop replicat rtbls

Sending STOP request to REPLICAT RTBLS ...
Request processed.


GGSCI (standalone2.manzoor.com) 4> alter replicat rtbls extseqno 17 extrba 0
REPLICAT altered.



GGSCI (standalone2.manzoor.com) 5> start replicat rtbls
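The repositioning done above follows simple trail-sequence arithmetic: the replicat was last reading ./dirdat/XT000016, so after the source-side ETROLLOVER it must be pointed at the next sequence with RBA 0. A minimal sketch (the trail prefix XT and group name rtbls are the ones used in this setup):

```shell
# Derive the EXTSEQNO to use on the target after an ETROLLOVER on the
# source. The replicat was last reading trail file XT000016.
last_trail="XT000016"            # last trail file the replicat read
seq=${last_trail#XT}             # strip the two-letter prefix -> 000016
next_seq=$(expr "$seq" + 1)      # expr parses the zero-padded number as decimal -> 17

# The GGSCI command to issue on the target:
echo "alter replicat rtbls extseqno ${next_seq} extrba 0"
```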


-- Now let's insert some rows on the source.

SQL> select count(*) from emp;

  COUNT(*)
----------
      4000

SQL> begin
  2     for i in 4001..5000 loop
  3             insert into emp values (i, dbms_random.string('U',30),30);
  4     END LOOP;
  5     commit;
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> select count(*) from emp;

  COUNT(*)
----------
      5000


-- Let's check whether it replicated to the target.




SQL> select count(*) from emp;

  COUNT(*)
----------
      5000




---- Reference: Oracle White Paper, "Oracle GoldenGate High Availability with Oracle Clusterware"

Note - 1527310.1