2008-3-25 11:55 joebora
Urgent on-site help needed: HP two-node cluster problem

OS:HPUX 11.23   MC/SG:11.17

I configured the two-node cluster with the following steps:
vgchange -a y vglock
cmquerycl -n xiaojiA -n xiaojiB -v -C /etc/cmcluster/cmclconfig.asc
cmcheckconf -v -C cmclconfig.asc
cmapplyconf -v -C cmclconfig.asc
vgchange -a n vglock
Then I started the cluster:
# cmruncl -v
cmrunnode: Validating network configuration...
Gathering network information
Beginning network probing (this may take a while)
Completed network probing
cmrunnode: Network validation complete
Waiting for cluster to form .............. timed out
Check the syslog files for information.
cmrunnode failed: timed out waiting for cluster to form

Here is the syslog:
Mar 25 12:00:45 xiaojiB CM-CMD[6221]: cmruncl -v
Mar 25 12:00:45 xiaojiB cmclconfd[6223]: Request from root on node xiaojiB to start the cluster on this node
Mar 25 12:00:45 xiaojiB cmcld[6229]: Logging level changed to level 0.
Mar 25 12:00:45 xiaojiB cmcld[6229]: Daemon Initialization - Maximum number of packages supported for this incarnation is 150.
Mar 25 12:00:45 xiaojiB cmcld[6229]: Global Cluster Information:
Mar 25 12:00:45 xiaojiB cmcld[6229]: Heartbeat Interval is 1.00 seconds.
Mar 25 12:00:45 xiaojiB cmcld[6229]: Logging level changed to level 0.
Mar 25 12:00:45 xiaojiB cmcld[6229]: Node Timeout is 10.00 seconds.
Mar 25 12:00:45 xiaojiB cmcld[6229]: Network Polling Interval is 2.00 seconds.
Mar 25 12:00:45 xiaojiB cmcld[6229]: Auto Start Timeout is 600.00 seconds.
Mar 25 12:00:45 xiaojiB cmcld[6229]: Failover Optimization is disabled.
Mar 25 12:00:45 xiaojiB cmcld[6229]: Information Specific to node xiaojiB:
Mar 25 12:00:45 xiaojiB cmcld[6229]: Cluster lock disk: /dev/dsk/c4t0d0.
Mar 25 12:00:45 xiaojiB cmcld[6229]: lan1  0x001a4b07cade  10.88.5.11  bridged net:1
Mar 25 12:00:45 xiaojiB cmcld[6229]: lan2  0x001a4b07cadf  192.168.0.11  bridged net:1
Mar 25 12:00:45 xiaojiB cmcld[6229]: Heartbeat Subnet: 10.88.5.0
Mar 25 12:00:45 xiaojiB cmcld[6229]: Heartbeat Subnet: 192.168.0.0
Mar 25 12:00:45 xiaojiB cmcld[6229]: The maximum # of concurrent local connections to the daemon that will be supported is 4066.
Mar 25 12:00:45 xiaojiB cmlvmd[6235]: lvm online query ioctl success- supports online feature
Mar 25 12:00:45 xiaojiB cmcld[6229]: Waiting for connection request from CMGMSD
Mar 25 12:00:45 xiaojiB cmcld[6229]: CMGMSD (pid=6237) successfully started
Mar 25 12:00:45 xiaojiB cmcld[6229]: rcomm health:  Initializing timeout to 120000000 microseconds
Mar 25 12:00:46 xiaojiB cmcld[6229]: Total allocated: 35108680 bytes, used: 2005040 bytes, unused 33103632 bytes
Mar 25 12:00:46 xiaojiB cmcld[6229]: Starting cluster management protocols.
Mar 25 12:00:46 xiaojiB cmcld[6229]: Attempting to form a new cluster
Mar 25 12:00:46 xiaojiB cmcld[6229]: Beginning standard election
Mar 25 12:01:46 xiaojiB cmcld[6229]: Cluster formation failed
Mar 25 12:01:46 xiaojiB cmcld[6229]: Reason: Ran out of time for manually starting the cluster
Mar 25 12:01:43 xiaojiB cmcld[6229]: Attempting to form a new cluster
Mar 25 12:01:46 xiaojiB  above message repeats 5 times
Mar 25 12:01:46 xiaojiB cmsrvassistd[6232]: The cluster daemon aborted our connection (231).
Mar 25 12:01:43 xiaojiB cmcld[6229]: Beginning standard election
Mar 25 12:01:46 xiaojiB  above message repeats 5 times
Mar 25 12:01:46 xiaojiB cmsrvassistd[6232]: Lost connection with Serviceguard cluster daemon (cmcld): Software caused connection abort
Mar 25 12:01:46 xiaojiB cmnetassistd[6234]: The cluster daemon aborted our connection (231).
Mar 25 12:01:46 xiaojiB cmnetassistd[6234]: Lost connection with Serviceguard cluster daemon (cmcld): Software caused connection abort
Mar 25 12:01:46 xiaojiB cmlvmd[6235]: The cluster daemon aborted our connection (231).
Mar 25 12:01:46 xiaojiB cmlvmd[6235]: Could not read messages from /usr/lbin/cmcld: Software caused connection abort
Mar 25 12:01:46 xiaojiB cmlvmd[6235]: CLVMD exiting
Mar 25 12:01:46 xiaojiB cmlvmd[6235]: Could not read messages from /usr/lbin/cmcld: Software caused connection abort
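
In a log this noisy, the line that matters is the "Reason:" message from cmcld. A minimal Python sketch for pulling it out is shown here; the function name failure_reasons and the regular expression are illustrative assumptions, not part of Serviceguard, and the sample lines are copied from the syslog above.

```python
import re

# Sample lines copied from the syslog in this post.
SAMPLE = """\
Mar 25 12:00:46 xiaojiB cmcld[6229]: Attempting to form a new cluster
Mar 25 12:01:46 xiaojiB cmcld[6229]: Cluster formation failed
Mar 25 12:01:46 xiaojiB cmcld[6229]: Reason: Ran out of time for manually starting the cluster
Mar 25 12:01:46 xiaojiB  above message repeats 5 times
"""

# Assumed layout: "<Mon> <day> <hh:mm:ss> <host> <daemon>[<pid>]: <message>"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+"
    r"(?P<daemon>[\w-]+)\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)


def failure_reasons(text):
    """Return (timestamp, reason) pairs for every 'Reason:' line logged by cmcld."""
    reasons = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # e.g. "above message repeats N times" has no daemon[pid] part
        if m.group("daemon") == "cmcld" and m.group("msg").startswith("Reason:"):
            reasons.append((m.group("stamp"), m.group("msg")[len("Reason:"):].strip()))
    return reasons


print(failure_reasons(SAMPLE))
```

Running the same function over the full syslog file should list one reason per failed start attempt, which makes it easier to compare the cmruncl and cmrunnode failures side by side.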


If I start a single node with cmrunnode -v instead, I get the same error, but cmviewcl -v shows the node's status as starting and its state as reforming.
root@xiaojiB:/etc/cmcluster#cmrunnode -v
cmrunnode: Validating network configuration...
Gathering network information
Beginning network probing (this may take a while)
Completed network probing
cmrunnode: Network validation complete
Waiting for cluster to form .............. timed out
Check the syslog files for information.
cmrunnode failed: timed out waiting for cluster to form

Output of cmviewcl -v while cmrunnode -v was starting:
CLUSTER        STATUS      
cluster10g     starting     
  
  NODE           STATUS       STATE        
  xiaojiA        unknown      unknown      
   
    Cluster_Lock_LVM:
    VOLUME_GROUP          PHYSICAL_VOLUME       STATUS              
    /dev/vglock           /dev/dsk/c4t0d0       unknown            
   
    Network_Parameters:
    INTERFACE    STATUS       PATH                NAME         
    PRIMARY      unknown      0/4/2/0             lan1         
    PRIMARY      unknown      0/4/2/1             lan2         
  
  NODE           STATUS       STATE        
  xiaojiB        starting     reforming   
   
    Cluster_Lock_LVM:
    VOLUME_GROUP          PHYSICAL_VOLUME       STATUS              
    /dev/vglock           /dev/dsk/c4t0d0       unknown            
   
    Network_Parameters:
    INTERFACE    STATUS       PATH                NAME         
    PRIMARY      up           0/4/2/0             lan1         
    PRIMARY      up           0/4/2/1             lan2  
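
The per-node status is the key part of this output: xiaojiB is starting/reforming while xiaojiA is reported as unknown. A small Python sketch for extracting those rows from saved cmviewcl -v output is shown here; node_states is a hypothetical helper name, and the parsing assumes each NODE/STATUS/STATE header is followed by exactly one data row, as in the output above.

```python
# Trimmed copy of the cmviewcl -v output from this post.
SAMPLE = """\
CLUSTER        STATUS
cluster10g     starting

  NODE           STATUS       STATE
  xiaojiA        unknown      unknown

  NODE           STATUS       STATE
  xiaojiB        starting     reforming
"""


def node_states(text):
    """Map node name -> (status, state) from cmviewcl -v style output."""
    nodes = {}
    expect_row = False
    for line in text.splitlines():
        fields = line.split()
        if fields[:3] == ["NODE", "STATUS", "STATE"]:
            expect_row = True          # the next non-empty line is the node row
        elif expect_row and len(fields) >= 3:
            nodes[fields[0]] = (fields[1], fields[2])
            expect_row = False
    return nodes


states = node_states(SAMPLE)
unknown = [name for name, (status, _) in states.items() if status == "unknown"]
print(states)
print(unknown)
```

Here the list of unknown nodes contains only xiaojiA, the node that xiaojiB is trying to reach. What that status means for the root cause is not established in this thread.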

syslog during cmrunnode -v:
Mar 25 12:11:07 xiaojiB CM-CMD[6554]: cmrunnode -v
Mar 25 12:11:07 xiaojiB cmclconfd[6556]: Request from root on node xiaojiB to start the cluster on this node
Mar 25 12:11:07 xiaojiB cmcld[6562]: Logging level changed to level 0.
Mar 25 12:11:07 xiaojiB cmcld[6562]: Daemon Initialization - Maximum number of packages supported for this incarnation is 150.
Mar 25 12:11:07 xiaojiB cmcld[6562]: Global Cluster Information:
Mar 25 12:11:07 xiaojiB cmcld[6562]: Heartbeat Interval is 1.00 seconds.
Mar 25 12:11:07 xiaojiB cmcld[6562]: Logging level changed to level 0.
Mar 25 12:11:07 xiaojiB cmcld[6562]: Node Timeout is 10.00 seconds.
Mar 25 12:11:07 xiaojiB cmcld[6562]: Network Polling Interval is 2.00 seconds.
Mar 25 12:11:07 xiaojiB cmcld[6562]: Auto Start Timeout is 600.00 seconds.
Mar 25 12:11:07 xiaojiB cmcld[6562]: Failover Optimization is disabled.
Mar 25 12:11:07 xiaojiB cmcld[6562]: Information Specific to node xiaojiB:
Mar 25 12:11:07 xiaojiB cmcld[6562]: Cluster lock disk: /dev/dsk/c4t0d0.
Mar 25 12:11:07 xiaojiB cmcld[6562]: lan1  0x001a4b07cade  10.88.5.11  bridged net:1
Mar 25 12:11:07 xiaojiB cmcld[6562]: lan2  0x001a4b07cadf  192.168.0.11  bridged net:1
Mar 25 12:11:07 xiaojiB cmcld[6562]: Heartbeat Subnet: 10.88.5.0
Mar 25 12:11:07 xiaojiB cmcld[6562]: Heartbeat Subnet: 192.168.0.0
Mar 25 12:11:07 xiaojiB cmcld[6562]: The maximum # of concurrent local connections to the daemon that will be supported is 4066.
Mar 25 12:11:07 xiaojiB cmlvmd[6568]: lvm online query ioctl success- supports online feature
Mar 25 12:11:07 xiaojiB cmcld[6562]: Waiting for connection request from CMGMSD
Mar 25 12:11:07 xiaojiB cmcld[6562]: CMGMSD (pid=6569) successfully started
Mar 25 12:11:07 xiaojiB cmcld[6562]: rcomm health:  Initializing timeout to 120000000 microseconds
Mar 25 12:11:07 xiaojiB cmcld[6562]: Total allocated: 35108680 bytes, used: 2002832 bytes, unused 33105840 bytes
Mar 25 12:11:07 xiaojiB cmcld[6562]: Starting cluster management protocols.
Mar 25 12:11:07 xiaojiB cmcld[6562]: Attempting to form a new cluster
Mar 25 12:11:07 xiaojiB cmcld[6562]: Beginning standard election
Mar 25 12:21:07 xiaojiB cmcld[6562]: Cluster formation failed
Mar 25 12:21:07 xiaojiB cmcld[6562]: Reason: Ran out of time for automatically joining a cluster
Mar 25 12:20:56 xiaojiB cmcld[6562]: Attempting to form a new cluster
Mar 25 12:21:07 xiaojiB  above message repeats 51 times
Mar 25 12:21:07 xiaojiB cmcld[6562]: Unable to contact all nodes in the cluster, thus it is not
Mar 25 12:20:56 xiaojiB cmcld[6562]: Beginning standard election
Mar 25 12:21:07 xiaojiB  above message repeats 51 times
Mar 25 12:21:07 xiaojiB cmcld[6562]:   possible to join the cluster at this time.
Mar 25 12:21:07 xiaojiB cmsrvassistd[6565]: The cluster daemon aborted our connection (231).
Mar 25 12:21:07 xiaojiB cmsrvassistd[6565]: Lost connection with Serviceguard cluster daemon (cmcld): Software caused connection abort
Mar 25 12:21:07 xiaojiB cmnetassistd[6567]: The cluster daemon aborted our connection (231).
Mar 25 12:21:07 xiaojiB cmnetassistd[6567]: Lost connection with Serviceguard cluster daemon (cmcld): Software caused connection abort
Mar 25 12:21:07 xiaojiB cmlvmd[6568]: The cluster daemon aborted our connection (231).
Mar 25 12:21:07 xiaojiB cmlvmd[6568]: Could not read messages from /usr/lbin/cmcld: Software caused connection abort
Mar 25 12:21:07 xiaojiB cmlvmd[6568]: CLVMD exiting
Mar 25 12:21:07 xiaojiB cmcld[6562]: If the cluster is not running, use the cmruncl command to
Mar 25 12:21:07 xiaojiB cmcld[6562]:   start it. If the cluster is running on other nodes, verify
Mar 25 12:21:07 xiaojiB cmlvmd[6568]: Could not read messages from /usr/lbin/cmcld: Software caused connection abort
Mar 25 12:21:07 xiaojiB cmcld[6562]:   this node's ability to send messages to the other nodes,
Mar 25 12:21:07 xiaojiB cmcld[6562]:   then re-issue the cmrunnode command
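
The timestamps show how long each attempt waited before giving up. As a quick worked check in Python, using only times copied from the two syslog excerpts (the year 2008 is taken from the post date because syslog omits it, and elapsed_seconds is an illustrative helper name):

```python
from datetime import datetime


def elapsed_seconds(start, end, year=2008):
    """Seconds between two syslog timestamps such as 'Mar 25 12:00:46'."""
    fmt = "%Y %b %d %H:%M:%S"
    t0 = datetime.strptime(f"{year} {start}", fmt)
    t1 = datetime.strptime(f"{year} {end}", fmt)
    return (t1 - t0).total_seconds()


# cmruncl: first "Attempting to form a new cluster" to "Cluster formation failed"
cmruncl_wait = elapsed_seconds("Mar 25 12:00:46", "Mar 25 12:01:46")

# cmrunnode: same two messages in the second log
cmrunnode_wait = elapsed_seconds("Mar 25 12:11:07", "Mar 25 12:21:07")

# "Auto Start Timeout is 600.00 seconds." as logged by cmcld
AUTO_START_TIMEOUT = 600.0

print(cmruncl_wait, cmrunnode_wait, cmrunnode_wait == AUTO_START_TIMEOUT)
```

The cmrunnode attempt ran for exactly the logged Auto Start Timeout, which fits the "Ran out of time for automatically joining a cluster" reason. In both runs the node kept retrying the election for the whole window and never heard from the other node.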

[Last edited by joebora on 2008-3-25 12:10]

2008-3-27 17:06 alex_an
Network probing:
please check the network configuration.
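
One possible starting point for that check is the interface lines that cmcld writes at startup. The Python sketch below groups interfaces by the "bridged net" ID reported in the syslog. The helper name interfaces_by_bridged_net and the regular expression are assumptions for illustration, and the sample lines are copied from the log in the first post.

```python
import re

# Interface lines copied from the syslog in the first post.
SAMPLE = """\
Mar 25 12:00:45 xiaojiB cmcld[6229]: lan1  0x001a4b07cade  10.88.5.11  bridged net:1
Mar 25 12:00:45 xiaojiB cmcld[6229]: lan2  0x001a4b07cadf  192.168.0.11  bridged net:1
"""

# Assumed layout: "<lanN>  <0xMAC>  <IPv4>  bridged net:<id>"
IFACE_RE = re.compile(
    r"(?P<name>lan\d+)\s+(?P<mac>0x[0-9a-fA-F]+)\s+"
    r"(?P<ip>\d+(?:\.\d+){3})\s+bridged net:(?P<net>\d+)"
)


def interfaces_by_bridged_net(text):
    """Group (interface, ip) pairs by the bridged net ID that cmcld logged."""
    groups = {}
    for m in IFACE_RE.finditer(text):
        groups.setdefault(int(m.group("net")), []).append((m.group("name"), m.group("ip")))
    return groups


print(interfaces_by_bridged_net(SAMPLE))
```

In this log both lan1 (10.88.5.11) and lan2 (192.168.0.11) are reported under bridged net 1, although they carry different heartbeat subnets. Whether that matches the intended cabling, and whether it has anything to do with the failure, is not settled in this thread. It is only something to compare against the planned network layout on both nodes.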
