LoveUnix » ORACLE等数据库 » Oracel RAC一台服务器宕机时,客户端程序连接奇慢
让LU留住您的每

一天 让LU博客留住您的每一天
2006-11-3 07:59 dooza
Oracel RAC一台服务器宕机时,客户端程序连接奇慢

环境:
2台 ibm 650  AIX 5.2 ML04
HACMP 5.2
oracle 9i   9.2.0.4 RAC
oracle数据库使用裸设备
现在的情况是:
当一台服务器宕机:群集服务停止,服务IP未启的情况下,客户端程序连接数据库时,大概需要45秒钟,而正常情况下,连接时间只需要5秒钟左右。
经过测试发现:
如果在2台服务器都正常启动群集和oracle数据库的情况下(oracle是由hacmp来启动的),将其中任何一台服务器的oracle实例停止(也就是通过sqlplus以sysdba连接,并执行shutdown命令),客户端程序连接速度正常,然后再停止oracle监听,连接速度也正常。然后停止群集服务,连接速度奇慢。为了进一步确认是否跟群集的服务IP有关系,在其中一台服务器群集服务停止的情况下,将其服务IP设置到另一台毫不相关的测试机器上,客户端程序连接速度很正常。
这种情况实在令人惊讶,但不知道oracle rac的轮询机制是如何的,也就是说客户端的配置文件tnsnames.ora是如何检查冗余功能的,在一台服务器宕机时,它是如何作出判断,并连接到另一台服务器呢?

2006-11-3 13:24 boypoo
先看看你tnsnames.ora 是怎么写的

如果你直接写一个IP,而不是failover ,情况也一样的遭么?

2006-11-3 17:21 netbbs
据说rac关掉一台速度更快

2006-11-4 23:55 charly
不知道你的TAF是怎么配置的?

2006-11-6 10:05 dooza
tnsnames.ora:
nbrac =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.101)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.102)(PORT = 1521))
      (LOAD_BALANCE=ON)
      (FAILOVER=ON)
    )
    (CONNECT_DATA =
      (SERVICE_NAME = nbrac)
      (FAILOVER_MODE=(TYPE=SELECT)(METHOD = BASIC)(RETRIES=10)(DELAY=1))
    )
  )
nbrac1 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.101)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = nbrac)
   (INSTANCE_NAME = nbrac1)
    )
  )
nbrac2 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.102)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = nbrac)
      (INSTANCE_NAME = nbrac2)
    )
  )

2006-11-6 11:37 boypoo
TAF Basic Configuration: Example
Before using TAF, it is recommended that you create and start a service that is used during connections. By doing so, you benefit from TAF and services integration. When you want to use BASIC TAF with a service, you should have the -P BASIC option when creating the service. After the service is created, you simply start it on your database.
Then, your application needs to connect to the service using a connection descriptor similar to the one shown in the slide above. The FAILOVER_MODE parameter must be included in the CONNECT_DATA section of your connection descriptor:
TYPE specifies the type of failover. The SESSION value means that only the user session is reauthenticated on the server-side, whereas open cursors in the OCI application need to be reexecuted. The SELECT value means that not only the user session is reauthenticated on the server side, but the open cursors in the OCI can continue fetching. This implies that the client-side logic maintains fetch-state of each open cursor. A SELECT statement is reexecuted by using the same snapshot, discarding those rows already fetched, and retrieving those rows that were not fetched initially. TAF verifies that the discarded rows are those that were returned initially, or it returns an error message.
METHOD=BASIC is used to reconnect at failover time.
RETRIES specifies the number of times to attempt to connect after a failover.
DELAY specifies the amount of time in seconds to wait between connect attempts.
Note: If using TAF, do no set the GLOBAL_DBNAME parameter in your listener.ora file.


如果你直接用sqlplus username/pwd@nbrac1连接应该很快吧

2006-11-6 16:31 dooza
对,非常正常,比rac双机正常时稍快。

另外:
nbrac1的listener.ora:
LISTENER_NBRAC1 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.101)(PORT = 1521))
      )
    )
  )
SID_LIST_LISTENER_NBRAC1 =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /oracle/app/oracle/product/9.2.0)
      (PROGRAM = extproc)
    )
  )


nbrac2的listener.ora:
LISTENER_NBRAC2 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.102)(PORT = 1521))
      )
    )
  )
SID_LIST_LISTENER_NBRAC2 =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /oracle/app/oracle/product/9.2.0)
      (PROGRAM = extproc)
    )
  )

SPFILE:/dev/rora_spfile:
nbrac1.local_listener='nbrac1'
nbrac2.local_listener='nbrac2'
nbrac1.remote_listener='nbrac2'
nbrac2.remote_listener='nbrac1'

各位大侠,这些设置有没有问题呢。

[[i] 本帖最后由 dooza 于 2006-11-6 16:39 编辑 [/i]]

2006-11-6 16:44 dooza
另外,我看到有些设置如下:

*.remote_listeners='LISTENERS_NBRAC'

然后tnsnames.ora内容为:
LISTENERS_NBRAC =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.101)(PORT = 1521))
        (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.102)(PORT = 1521))
      )
    )
  )

这跟我的设置有什么不同?

[[i] 本帖最后由 dooza 于 2006-11-6 20:04 编辑 [/i]]

2006-11-7 22:14 charly
你把RETRIES=10  改为RETRIES=2 试试看?
是不是会好些?

2006-11-7 22:21 charly
个人感觉:在1台服务IP未启的情况下,某个客户端一直在try to connect the failed ip address,由于这个IP地址没有任何反应,所以有多次try和多次Delay。这个过程会比较慢。

但是如果这个IP地址live ,只是没有服务的情况下多次的Try是很快的。这个也许就是你配置到1台无关Server快的原因。

无法验证,dooza试试看?

页: [1]


Powered by Discuz! Archiver 5.5.0  © 2001-2006 Comsenz Inc.