Oracle Real Application Cluster failure test cases


Kamran Aghayev A.
Oracle Certified Master, ACE Director

My country: Azerbaijan
• It has an ancient and historical culture
• Azerbaijan is one of the birthplaces of the oil industry
• 9 out of 11 existing climate zones are present in Azerbaijan

About me
• Database Administrator at AzerCell Telecom
• Oracle Certified Master
• Oracle ACE Director
• Author of the book "RMAN Backup and Recovery"
• Blogger at http://www.kamranagayev.com
• President of Azerbaijan Oracle User Group (AzerOUG)

The Requisite Room Survey
• How many of you have installed a RAC database?
CAN YOU PROVE IT?
• How many of you have tested the clusterware?
• Is your Clusterware stable?

Some suggestions
• Have your own "ready scripts" under the "~/scripts" folder
    • @asmdisks
    • @asmdiskgroups
• Use formatted "crs_stat -t" output (ID 259301.1)
• "tail -f" log files before any critical action

Creating a workload

    #!/bin/bash
    # Load the oracle user's environment (ORACLE_HOME, PATH, ...)
    . /home/oracle/.bash_profile
    # Start 100 background sessions; each one sleeps for 20 seconds in the database
    for ((i=1; i <= 100 ; i++))
    do
    nohup sqlplus -S system/oracle@RAC <<eof > file.log &
    begin
    dbms_lock.sleep(20);
    end;
    /
    eof
    done

The main test cases
• Node failure
• Database Instance failure
• ASM instance failure
• Listener failure
• Public network failure
• Private network failure
• Losing access to OCR and Voting disk
• OCR, Voting disk, ASM disk failure

1. Node failure test cases
Procedure
• Start the workload
• Shut down the node
Expected result:
• Resources will go OFFLINE
• VIP fails over to the surviving node
• Client connections are moved to the surviving instance
Procedure 2
• Reboot all nodes
Investigation:
• Check if the cluster is up on both nodes after the reboot
• Note the time it takes for the cluster to be up

2. Instance failure
Procedure
• Start the workload
• Shut down the instance ("shut abort" or kill the PMON process)
Expected result:
• Instance recovery will be performed
    • The surviving instance reads the online redo log files of the failed instance and ensures that committed transactions are recorded to the database.
    • If all nodes fail, one instance will perform recovery of all instances.
• Services will be moved to an available instance
• The failed instance will be restarted by the clusterware automatically

3. ASM Instance failure
Procedure
• Start the workload
• Kill the PMON process of the ASM instance
Expected result:
• The ASM resource will be OFFLINE and will be automatically restarted by the clusterware
• Instance recovery will be performed
• Services will be moved to an available instance

4. Listener failure
Procedure
• Kill the Listener process
Expected result:
• New connections will be redirected to the second listener
• The listener failure will be detected by CRSD and the listener will be restarted

5. Public Network failure
Procedure
• Unplug the public network cable
Expected result:
• VIP will fail over to the surviving node
• The DB instance will stay up. The DB service will fail over to the surviving node
• If TAF is configured, clients should fail over to an available instance

6. Private Network failure
Procedure
• Unplug the private network cable
Expected result:
• CSSD will detect a split-brain situation. The node with the lowest node number survives; the second node will be evicted
• The CRS, ASM and DB instances on the evicted node will shut down
• All processes will be terminated. If not, the node will be rebooted
• After the cable is reconnected, the CRS stack and resources will be started

7. OCR and Voting disk failure test
Procedure
• Unplug the storage cable to the OCR disk from node 1
Expected result:
• CRSD will detect the failure and abort. The DB instance and ASM will not be impacted
Procedure
• Unplug the storage cable to the Voting disk from node 1
Expected result:
• CSS will detect the failure and evict the node from the cluster
Procedure
• Remove a mirrored copy of the OCR file or Voting disk
Expected result:
• There will be no impact on the cluster
Procedure
• Take a backup of the OCR file and Voting disk and remove them completely
• Recover them from the backup

8. ASM disk failure
Procedure
• Remove a disk from a diskgroup with normal redundancy
• Add a disk to the diskgroup
Expected result:
• No impact on database instances
• Check the rebalance operation in V$ASM_OPERATION

Thanks for coming!!
Questions?
http://www.kamranagayev.com
http://www.facebook.com/KamranAgayev
http://www.twitter.com/KamranAgayev
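
A note on the node reboot test ("Note the time it takes for the cluster to be up"): the timing can be captured with a small polling helper instead of a stopwatch. The sketch below is only an illustration and is not part of the original slides. The function name wait_until and its arguments are hypothetical, and it assumes that whatever check command you pass in exits with status 0 once the cluster is healthy.

```shell
#!/bin/bash
# wait_until TIMEOUT INTERVAL CMD [ARGS...]   (hypothetical helper, not from the slides)
# Runs CMD every INTERVAL seconds until it exits 0 or TIMEOUT seconds have passed.
# Prints the elapsed seconds; returns 0 on success and 1 on timeout.
wait_until() {
  local timeout=$1 interval=$2
  shift 2
  local start=$SECONDS          # bash counter of seconds since the shell started
  until "$@" >/dev/null 2>&1; do
    if (( SECONDS - start >= timeout )); then
      echo $(( SECONDS - start ))
      return 1
    fi
    sleep "$interval"
  done
  echo $(( SECONDS - start ))
  return 0
}
```

It could be called as, for example, `wait_until 900 10 crsctl check crs` right after the node comes back. Whether `crsctl check crs` returns a non-zero exit status while the stack is still starting is an assumption here and should be verified on your version; if it does not, wrap the call in a small script that greps its output for the state you expect.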