CUCM failover to subscriber failure!

Hi everyone!
I have a CUCM cluster of one publisher and one subscriber; the active version is 7.0 and the inactive version is 5.1.3.
The publisher failed due to a power failure, the Cisco DB service wasn't starting at all, and there was no DRS backup. So I did an upgrade from 5.1 to 7.0 again on the publisher while telephony continued to run normally on the subscriber node.
After the upgrade, I uploaded the publisher's and subscriber's licenses, manually re-added all the changes made to CUCM between the 5.x and 7.x databases (by comparing against the subscriber's), and then replicated to the subscriber. The replication state was good ('2'), and I took an immediate DRS backup.
However, the problem appeared when I restarted the publisher node: none of the phones registered with the subscriber. I thought it was a network problem or a slow server, so I turned the publisher off for around 20 minutes, but nothing changed.
The Call Manager group configuration is correct, the licenses are correct, and everything seems to be OK. When the phones are registered with the publisher, I can see them as registered from the subscriber's phone page, but when I stop the publisher they change to 'unknown'.
Does anyone have any clue why this is happening? Do I have to upgrade the subscriber again from 5 to 7? That's the only idea I have, but it doesn't make sense to me since replication is working fine between the servers.
One more thing: I use freeSSHd on an XP machine as the SFTP server for DRS backups, but it wouldn't connect from my laptop running Windows 7. What are you using on Windows 7? I tried the search results on Google but nothing worked.
Thank you for reading and for tips!
Regards,
Mazen

Hi Mazen,
I think you are on the right track with your thought of rebuilding
the subscriber.
You would be hitting this CUCM 5.x restriction:
Replacing the Publisher Node
Complete the following tasks to replace the Cisco Unified CallManager publisher server. If you are replacing a single server that is not part of a cluster, follow this procedure to replace your server.
Caution     If you are replacing a publisher node in a cluster, you must also reinstall all the subscriber nodes and dedicated TFTP servers in the cluster after replacing the publisher node. For instructions on reinstalling these other node types, see the "Replacing a Subscriber or Dedicated TFTP Server Node" section.
Follow the references in parentheses to get more information about a step.
Table 4     Replacing the Publisher Node Process Overview
Step 1: Perform the tasks in the "Server or Cluster Replacement Preparation Checklist" section.
Step 2: Gather the necessary information about the old publisher server. (See the "Gathering System Configuration Information to Replace or Reinstall a Server" section.)
Step 3: Back up the publisher server to a remote SFTP server by using the Disaster Recovery System (DRS) and verify that you have a good backup. (See the "Creating a Backup File" section.)
Step 4: Get the new license and verify it before system replacement. You only need a new license if you are replacing the publisher node. (See the "Obtaining a License File" section.)
Step 5: Shut down and turn off the old server.
Step 6: Connect the new server.
Step 7: Install the same Cisco Unified CallManager release on the new server that was installed on the old server, including any Engineering Special releases, and configure the server as the publisher server for the cluster. (See the "Installing Cisco Unified CallManager on the New Publisher Server" section.)
Step 8: Upload the new license file to the publisher server. (See the "Uploading a License File" section.)
Step 9: Restore the backed-up data to the publisher server by using DRS. (See the "Restoring a Backup File" section.)
Step 10: Reboot the publisher server.
Step 11: Perform the post-replacement tasks in the "Post-Replacement Checklist" section.
http://www.cisco.com/en/US/docs/voice_ip_comm/cucm/install/5_1/clstr513.html#wp87717
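Once the publisher is rebuilt and the subscriber reinstalled, replication can be sanity-checked from the admin CLI of either node. A rough sketch (command names per the 6.x/7.x admin CLI; output abbreviated and illustrative, and the exact command set varies by release):

```
admin: utils dbreplication runtimestate
    ...
    REPLICATION SETUP: 2   <-- state 2 means replication is good

admin: show perf query class "Number of Replicates Created and State of Replication"
    ...
    Replicate_State: 2
```

If the state is anything other than 2 on all nodes, repairing replication before troubleshooting phone failover saves a lot of time.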
Cheers!
Rob

Similar Messages

  • SUBSCRIBE failure: unrecognized format: 'multipart/related'

    Hi,
In one of my networks I have a few SPA942 phones and one SPA504 with an SPA500S console.
Asterisk is 1.4.22. After connecting the SPA500S console to the network, I have a small problem.
    The console works fine but there are many such messages:
    [Jun 13 09:42:46] WARNING[2938]: chan_sip.c:15456 handle_request_subscribe: SUBSCRIBE failure: unrecognized format: 'multipart/related' pvt: subscribed: 0, stateid: -1, laststate: 0, dialogver: 0, subscribecont: 'hinty', subscribeuri: ''
    What may cause this symptom?
    Best regards,
    Daniel

    Hi,
I have found the solution to this problem.
One of the SPA942 phones had an error in its configuration.
    Best regards,
    Daniel

  • CUCM 10 publisher & subscriber.

    Dears,
I have a publisher and a subscriber running CUCM 10.5, with two voice gateways, in a single site. Instead of the subscriber always sitting idle, I want it to carry load as well, so my thought is to split the 1000 phones between the publisher and subscriber using two CUCM groups; phones registered with the publisher will use VG1 and VG2 (MGCP gateways), and phones registered with the subscriber will also use VG1 and VG2.
Is this a sound design, and what obstacles could I face with it?

    Dear Manish,
    Thanks for the reply,
TFTP: for example, the phones on the 2nd and 3rd floors will receive option 150 with the subscriber as primary, and the phones on the 1st and 2nd floors will receive option 150 with the publisher as primary.
MGCP gateways: the gateway configuration will be the same for both, as each will have a redundant publisher and subscriber.
I have two E1 PRIs, so can I add them to two different route groups?
For example, as below:
RG-PUB
port 1 -- router 1
port 2 -- router 2
RG-SUB
port 2 -- router 2
port 1 -- router 1
These route groups will then be referenced in the route lists used by the route patterns.
    Thanks

  • CUCM 10 Install Subscriber Error

    Hi All,
I plan to install CUCM 10 in two sites:
- Site A: Publisher, Subscriber 1
- Site B: Subscriber 2, Subscriber 3
When I install in Site A there is no problem, but when I install Subscriber 2 and Subscriber 3 in Site B, I always get this error:
I have tried using OVA templates and installers from different sources, but with no result.
Any help?
    Best Regards,
    Tommy

    Hi Tommy,
Did you already upgrade to the fixed version on the publisher and Subscriber 1?
If you already upgraded from v10.0.1.10000-24 to v10.0.1.11006.1 on the publisher and Subscriber 1, then before you install Subscribers 2 and 3 you must first install CUCM v10.0.1.10000-24, choose the patch option when the CUCM installer starts, and then supply the fixed-version ISO (v10.0.1.11006.1) on DVD.
The error during installation of a secondary node occurs because, after the upgrade, the primary node has two CUCM versions (active and inactive); you need to bring the secondary node to the same version.
You can contact me on YM (biemabbit) to clarify this problem. :)
    Regards,
    Habibi

  • CUCM Publisher and Subscriber unreachable

I am presently having issues with my publisher and subscriber servers.
I am working with CUCM version 7.1.5.20000-6 on Cisco MCS 7800 Series Media Convergence Servers.
Through Cisco Unified Reporting > Unified CM Database Status, I get this report:
For every server, it shows whether you can read from the local and publisher databases.
172.20.160.11 is down and cannot be reached.
The publisher database could not be reached from 172.20.160.11.
The local database could not be reached from 172.20.160.11.
https://172.20.160.10:8443/cucreports/showReport.do?isStandard=true&name=Unified%20CM%20Database%20Status&transform=true (View Details)
Server          Publisher DB Reachable                        Local DB Reachable
172.20.160.10   true                                          true
172.20.160.11   172.20.160.11 is down and cannot be reached   172.20.160.11 is down and cannot be reached
I have also attached a debug script for review.
Help!

    Hi Val,
    You can download the recovery ISO from the following location
    http://software.cisco.com/download/release.html?mdfid=282421166&flowid=5328&softwareid=282074294&release=7.1%285b%29&relind=AVAILABLE&rellifecycle=&reltype=latest
Burn it to a DVD and boot up using the recovery ISO, then select the option to automatically fix the file system (and the option to manually fix the file system if needed).
Check for any DNS-related issues, if DNS is configured, as the network connectivity errors could be due to the same.
However, looking at the errors and the state of the services, if the issue persists I would recommend a reinstall of the subscriber, as the publisher will replicate the data back to the sub.
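Before reinstalling, basic reachability and service state can be checked from the subscriber's OS admin CLI. A sketch (the IP address is the publisher's from the report above; commands per the 7.x CLI, output omitted):

```
admin: utils network ping 172.20.160.10
admin: show network eth0
admin: utils service list
admin: utils diagnose test
```

If the subscriber cannot even ping the publisher, fix the network or DNS problem first; a reinstall will not help until connectivity is restored.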
    HTH
    Manish

  • SQL SERVER Failover Cluster switch failure because the passive node automatically reassign drive letter

I switched the SQL Server resource group to the standby node. When the disk resource was brought online on the passive node, an exception occurred: the original dependent disk resource had drive letter 'K:', but when the disk came online it was automatically reassigned a new drive letter, 'H:', so the SQL Server resource could not come online. After I manually changed the drive letter back to 'K:' on the passive node, it worked! So my question is: why did it not use the original drive letter, and why was a new one assigned? What could be the cause? Mount points? Some log entries follow:
    00001cbc.000004e0::2015/03/12-14:41:11.377 WARN  [RES] Physical Disk <FltLowestPrice_K>: OnlineThread: Failed to set volguid \??\Volume{e32c13d5-02e6-4924-a2d9-59a6fae1a1be}. Error: 183.
    00001cbc.000004e0::2015/03/12-14:41:11.377 INFO  [RES] Physical Disk <FltLowestPrice_K>: Found 2 mount points for device \Device\Harddisk8\Partition2
    00001cbc.00001cdc::2015/03/12-14:41:11.377 INFO  [RES] Physical Disk: PNP: Update volume exit, status 1168
    00001cbc.00001cdc::2015/03/12-14:41:11.377 INFO  [RES] Physical Disk: PNP: Updating volume
    \\?\STORAGE#Volume#{1a8ddb8e-fe43-11e2-b7c5-6c3be5a5cdca}#0000000008100000#{53f5630d-b6bf-11d0-94f2-00a0c91efb8b}
    00001cbc.00001cdc::2015/03/12-14:41:11.377 INFO  [RES] Physical Disk: PNP: Update volume exit, status 5023
    00001cbc.000004e0::2015/03/12-14:41:11.377 ERR   [RES] Physical Disk: Failed to get volname for drive H:\, status 2
    00001cbc.000004e0::2015/03/12-14:41:11.377 INFO  [RES] Physical Disk <FltLowestPrice_K>: VolumeIsNtfs: Volume
    \\?\GLOBALROOT\Device\Harddisk8\Partition2\ has FS type NTFS
    00001cbc.000004e0::2015/03/12-14:41:11.377 INFO  [RES] Physical Disk: Volume
    \\?\GLOBALROOT\Device\Harddisk8\Partition2\ has FS type NTFS
    00001cbc.000004e0::2015/03/12-14:41:11.377 INFO  [RES] Physical Disk: MountPoint H:\ points to volume
    \\?\Volume{e32c13d5-02e6-4924-a2d9-59a6fae1a1be}\
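For reference, the manual fix described above can be done on the passive node with diskpart before bringing the resource online. The volume number below is illustrative; select whichever volume is currently mounted as H::

```
DISKPART> list volume
DISKPART> select volume 5
DISKPART> remove letter=H
DISKPART> assign letter=K
```

This is only a workaround; the underlying stale mount-point/registry information still needs to be corrected so the cluster stops reassigning the letter.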

Sounds like you have a cluster hive that is out of date or bad, or some registry settings that are incorrect. You'll want to have this question transferred to the Windows forum, as that's really what you're asking about.
    -Sean
    The views, opinions, and posts do not reflect those of my company and are solely my own. No warranty, service, or results are expressed or implied.

  • Unity Connection - Certificate from cucm no more trusted for encrypted calls after upgrade to 10.5(1)

    Hello Support Community,
I have a strange problem:
After upgrading my CUCM and Unity Connection from 9.1 to 10.5(1), encrypted calls no longer work.
Situation 1: the CUCM publisher is down, the subscriber is up: encrypted calls to Unity Connection work correctly.
Situation 2: the CUCM publisher is up: encrypted calls to Unity Connection do not work.
I get the following info in the log for the Connection Conversation Manager:
    19:35:21.053 |15865,,,MiuGeneral,25,Invalid Certificate: Received Certificate -----BEGIN CERTIFICATE-----
    MIID8zCCAtugAwIBAgIQc/fBdUz1Zdh4CXhcPqGVuDANBgkqhkiG9w0BAQsFADBw
    MQswCQYDVQQGEwJERTELMAkGA1UEChMCSVQxGzAZBgNVBAsTEkhlbGxnYXRlIFRl
    XD0oD9d5MQ==
    -----END CERTIFICATE-----
     doesn't match with stored Certificate: -----BEGIN CERTIFICATE-----
    MIIC2DCCAkGgAwIBAgIIJWCm4bSdt+kwDQYJKoZIhvcNAQEFBQAw
    -----END CERTIFICATE-----
So where does Unity Connection cache this certificate, and how can I delete or replace it?
The cert shown in the logs is the one from CUCM ("CallManager"). I regenerated it through CUCM OS Administration, and now I see the same error message on Unity Connection for the newly regenerated certificate.

Actually it doesn't. It says he's on a MacBook. I don't know all the different types of Macs. I was having a ton of problems with iChat. I opened the DMZ to my computer, knocked down all firewalls, etc., and left everything exposed, still with bad results. A few weeks ago the power supply went out on my D-Link. I bought a Linksys. Since I'd left all firewalls off, I figured it couldn't be the router. I power-cycled everything on the network, still no luck. Today I bought a universal power supply and started up my D-Link router. Everything worked perfectly. My wife's computer, a laptop running Tiger, worked fine with the Linksys, and so did my machine before the Leopard upgrade. Now that I've got the D-Link online, everything is working.

  • Running CUCM 8x/9x on Cisco UCS - Design

    Hi,
I have a design question regarding running CUCM 8.x/9.x on UCS.
Let us say that I initially configured a CUCM publisher and subscriber to run on a Cisco UCS server using the VMware OVA template that allows for 2500 IP phones.
Both CUCM servers were installed and set up using the specs from the 2500-phone OVA. Later, I decide to increase the capacity of the same hosts from 2500 IP phones to 7500 IP phones.
Do I need to build new CUCM servers using the OVA template that allows 7500 IP phones per server, or can I modify the existing CUCM VM settings, such as increasing the RAM, HDD, etc.?
    Warm regards,
    JK.

    Check out the resize support:
    http://docwiki.cisco.com/wiki/Unified_Communications_VMware_Requirements#Resize_Virtual_Machine
Since the 2500-phone OVA has different storage requirements from the 7500-phone OVA, you will need to rebuild the VM for full support.
    http://docwiki.cisco.com/wiki/Virtualization_for_Cisco_Unified_Communications_Manager_(CUCM)#Version_9.1.28x.29
    HTH,
    Chris

  • Can fast start failover be achieved in maximum performance mode

For a physical standby, I would like to know if it is possible to configure high availability in async (maximum performance) mode and still achieve fast-start failover. If so, please briefly describe how. Thanks.

    I've always said, size doesn't matter to Data Guard except when you are creating a standby database. After that it is the amount of Redo that is generated by the Primary database. A failover is a failover, for a 10GB database or for a 60TB database. The scary part is having to get the old Primary back up as a standby as quickly as possible.
    Of course if the original Primary is completely toast then you have to recreate it with a new backup of the new primary or use a backup of the original Primary that happens to be on the Primary system and restore it as a standby and let Data Guard catch it up with the current state of affairs.
    But, if the original Primary is still intact then you can use flashback database to reinstate it as a physical standby with little effort and no major backup restores. And of course, that is one of the requirements of Fast-Start Failover (FSFO). To enable it you have to have flashback database enabled on both the Primary and the FSFO target standby. Since enabling FB Database requires the database to be at the mount state I would suggest you enable it right after the upgrade to 11g.
    When FSFO triggers a failover after a failure of the Primary, the Observer will automatically re-instantiate the failed Primary as a standby when it is started again and then it will be resynchronized with the new Primary automatically by Data Guard as normal.
The only difference between Maximum Availability FSFO and Maximum Performance FSFO is that you have to decide how much data you are willing to lose when a failover occurs and tell the Broker by configuring the FastStartFailoverLagLimit property (see section 9.2.12 of the Broker manual).
You also want to be sure to place the Observer on a system where it is not affected by the Primary or Standby server going down, or by a network break between the Primary and the Standby. And if you can, use Grid Control. When you enable FSFO in Grid Control (on the Data Guard home page) you can specify two systems where the Observer can live. If it goes down on the original system, an attempt will be made to restart it there. If that system is no longer available, then Grid Control will attempt to start a new Observer on the second system.
    I would highly recommend you read the section [5.5 Fast-Start Failover|http://download.oracle.com/docs/cd/B28359_01/server.111/b28295/sofo.htm#i1027843] for lots of good information. And of course, the [Data Guard 11g Handbook|http://www.amazon.com/Oracle-Guard-Handbook-Osborne-ORACLE/dp/0071621113] (sorry, still plugging the book :^)
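The pieces above (flashback on both databases, a lag limit for Maximum Performance mode, and the Observer) can be sketched in SQL*Plus and DGMGRL roughly as follows; the 30-second lag limit is illustrative and should be set to your tolerable data loss:

```
-- On both the primary and the FSFO target standby (database mounted):
SQL> ALTER DATABASE FLASHBACK ON;

-- In the Broker configuration:
DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverLagLimit = 30;
DGMGRL> ENABLE FAST_START FAILOVER;

-- On the Observer host (keep this session running):
DGMGRL> START OBSERVER;
```

With the protection mode left at Maximum Performance, the lag limit is what bounds the data loss the Observer will accept before triggering a failover.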
    Larry

  • IPCC Failover issue

I have two IPCC 7.x servers configured for failover. The issue is that when I am logged in as an agent and my primary server fails over to the backup server, my agent loses its connection/goes offline and reconnects in Not Ready after 15 seconds. The same applies if I fail back from my secondary to my primary server.
Is this normal behaviour? Is there a document from Cisco that describes this behaviour? Please let me know.
    Thanks

    Not sure with IPCC Enterprise, but IPCC Express (or UCCX nowadays) the behavior you are reporting is expected.  From the design guide:
    Automatic Failover. Upon failure of the active Cisco Unified CCX server, CAD will automatically re-login agents on the standby server, and the agent will be placed into a Not Ready state. Upon failure of the active Cisco Unified CCX server, active calls on agents phones will survive. However, the call duration and other information that is associated with the call in the historical reporting database may be affected. Historical reports generated for time periods in which a failover occurred will have missing or incorrect data. It will be called out in the report that a failover occurred.
    http://www.cisco.com/en/US/docs/voice_ip_comm/cust_contact/contact_center/crs/express_7_0/design/guide/uccx70srnd.pdf  (page 25 of the pdf)
It should also be noted that with UCCX, failback to the primary node (when it comes back online) isn't automatic, mainly because the failback will exhibit the same behavior (along with an approximately 5-second hit on ACD/IVR functionality). So, in my experience anyway, the failback should be manual.
    HTH.
    Regards,
    Bill
    Please remember to rate helpful posts.

  • CONCURRENT MANAGER SETUP AND CONFIGURATION REQUIREMENTS IN AN 11I RAC ENVIR

Product: AOL
Date written: 2004-05-13
PURPOSE
This document describes the setup required for a RAC-PCP configuration.
PCP is implemented to distribute the Concurrent Manager workload and to provide failover.
Explanation
Failure scenarios can be divided into the following three cases.
    1. The database instance that supports the CP, Applications, and Middle-Tier
    processes such as Forms, or iAS can fail.
    2. The Database node server that supports the CP, Applications, and Middle-
    Tier processes such as Forms, or iAS can fail.
    3. The Applications/Middle-Tier server that supports the CP (and Applications)
    base can fail.
The section below describes the CM and AP configuration and
the relationship between the CM and GSM (Global Service Management).
    The concurrent processing tier can reside on either the Applications, Middle-
    Tier, or Database Tier nodes. In a single tier configuration, non PCP
environment, a node failure will impact Concurrent Processing operations due to
    any of these failure conditions. In a multi-node configuration the impact of
    any these types of failures will be dependent upon what type of failure is
    experienced, and how concurrent processing is distributed among the nodes in
    the configuration. Parallel Concurrent Processing provides seamless failover
    for a Concurrent Processing environment in the event that any of these types of
    failures takes place.
In an Applications environment where the database tier utilizes Listener
(server) load balancing, and also in a non-load-balanced environment,
    there are changes that must be made to the default configuration generated by
    Autoconfig so that CP initialization, processing, and PCP functionality are
    initiated properly on their respective/assigned nodes. These changes are
    described in the next section - Concurrent Manager Setup and Configuration
    Requirements in an 11i RAC Environment.
    The current Concurrent Processing architecture with Global Service Management
    consists of the following processes and communication model, where each process
    is responsible for performing a specific set of routines and communicating with
    parent and dependent processes.
The following describes the roles of the ICM, FNDSM, IM, and Standard Manager
in a PCP environment.
    Internal Concurrent Manager (FNDLIBR process) - Communicates with the Service
    Manager.
    The Internal Concurrent Manager (ICM) starts, sets the number of active
    processes, monitors, and terminates all other concurrent processes through
    requests made to the Service Manager, including restarting any failed processes.
    The ICM also starts and stops, and restarts the Service Manager for each node.
    The ICM will perform process migration during an instance or node failure.
    The ICM will be
    active on a single node. This is also true in a PCP environment, where the ICM
    will be active on at least one node at all times.
    Service Manager (FNDSM process) - Communicates with the Internal Concurrent
    Manager, Concurrent Manager, and non-Manager Service processes.
    The Service Manager (SM) spawns, and terminates manager and service processes (
    these could be Forms, or Apache Listeners, Metrics or Reports Server, and any
other process controlled through Generic Service Management). When the ICM
terminates, the SM that resides on the same node as the ICM will also
terminate. The SM is "chained" to the ICM. The SM will only reinitialize
after termination when there is a
    function it needs to perform (start, or stop a process), so there may be
    periods of time when the SM is not active, and this would be normal. All
    processes initialized by the SM
    inherit the same environment as the SM. The SM environment is set by APPSORA.
    env file, and the gsmstart.sh script. The TWO_TASK used by the SM to connect
    to a RAC instance must match the instance_name from GV$INSTANCE. The apps_<sid>
    listener must be active on each CP node to support the SM connection to the
    local instance. There
    should be a Service Manager active on each node where a Concurrent or non-
    Manager service process will reside.
    Internal Monitor (FNDIMON process) - Communicates with the Internal Concurrent
    Manager.
    The Internal Monitor (IM) monitors the Internal Concurrent Manager, and
    restarts any failed ICM on the local node. During a node failure in a PCP
    environment the IM will restart the ICM on a surviving node (multiple ICM's may
    be started on multiple nodes, but only the first ICM started will eventually
    remain active, all others will gracefully terminate). There should be an
    Internal Monitor defined on each node
    where the ICM may migrate.
    Standard Manager (FNDLIBR process) - Communicates with the Service Manager and
    any client application process.
    The Standard Manager is a worker process, that initiates, and executes client
    requests on behalf of Applications batch, and OLTP clients.
    Transaction Manager - Communicates with the Service Manager, and any user
    process initiated on behalf of a Forms, or Standard Manager request. See Note:
    240818.1 regarding Transaction Manager communication and setup requirements for
    RAC.
    Concurrent Manager Setup and Configuration Requirements in an 11i RAC
    Environment
The basic setup procedure for using PCP is described below.
    In order to set up Setup Parallel Concurrent Processing Using AutoConfig with
    GSM,
    follow the instructions in the 11.5.8 Oracle Applications System Administrators
    Guide
    under Implementing Parallel Concurrent Processing using the following steps:
    1. Applications 11.5.8 and higher is configured to use GSM. Verify the
    configuration on each node (see WebIV Note:165041.1).
    2. On each cluster node edit the Applications Context file (<SID>.xml), that
    resides in APPL_TOP/admin, to set the variable <APPLDCP oa_var="s_appldcp">
    ON </APPLDCP>. It is normally set to OFF. This change should be performed
    using the Context Editor.
    3. Prior to regenerating the configuration, copy the existing tnsnames.ora,
    listener.ora and sqlnet.ora files, where they exist, under the 8.0.6 and iAS
    ORACLE_HOME locations on the each node to preserve the files (i.e./<some_
    directory>/<SID>ora/$ORACLE_HOME/network/admin/<SID>/tnsnames.ora). If any of
    the Applications startup scripts that reside in COMMON_TOP/admin/scripts/<SID>
    have been modified also copy these to preserve the files.
    4. Regenerate the configuration by running adautocfg.sh on each cluster node as
    outlined in Note:165195.1.
    5. After regenerating the configuration merge any changes back into the
    tnsnames.ora, listener.ora and sqlnet.ora files in the network directories,
    and the startup scripts in the COMMON_TOP/admin/scripts/<SID> directory.
    Each nodes tnsnames.ora file must contain the aliases that exist on all
    other nodes in the cluster. When merging tnsnames.ora files ensure that each
    node contains all other nodes tnsnames.ora entries. This includes tns
    entries for any Applications tier nodes where a concurrent request could be
    initiated, or request output to be viewed.
    6. In the tnsnames.ora file of each Concurrent Processing node ensure that
    there is an alias that matches the instance name from GV$INSTANCE of each
    Oracle instance on each RAC node in the cluster. This is required in order
    for the SM to establish connectivity to the local node during startup. The
    entry for the local node will be the entry that is used for the TWO_TASK in
    APPSORA.env (also in the APPS<SID>_<HOSTNAME>.env file referenced in the
    Applications Listener [APPS_<SID>] listener.ora file entry "envs='MYAPPSORA=<
    some directory>/APPS<SID>_<HOSTNAME>.env)
    on each node in the cluster (this is modified in step 12).
    7. Verify that the FNDSM_<SID> entry has been added to the listener.ora file
    under the 8.0.6 ORACLE_HOME/network/admin/<SID> directory. See WebiV Note:
    165041.1 for instructions regarding configuring this entry. NOTE: With the
    implementation of GSM the 8.0.6 Applications, and 9.2.0 Database listeners
    must be active on all PCP nodes in the cluster during normal operations.
    8. AutoConfig will update the database profiles and reset them for the node
    from which it was last run. If necessary reset the database profiles back to
    their original settings.
    9. Ensure that the Applications Listener is active on each node in the cluster
    where Concurrent, or Service processes will execute. On each node start the
    database and Forms Server processes as required by the configuration that
    has been implemented.
10. Navigate to Install > Nodes and ensure that each node is registered. Use
the node name as it appears when executing a nodename command from the Unix
prompt on the server. GSM will add the appropriate services for each node at startup.
    11. Navigate to Concurrent > Manager > Define, and set up the primary and
    secondary node names for all the concurrent managers according to the
    desired configuration for each node workload. The Internal Concurrent
    Manager should be defined on the primary PCP node only. When defining the
    Internal Monitor for the secondary (target) node(s), make the primary node (
    local node) assignment, and assign a secondary node designation to the
    Internal Monitor, also assign a standard work shift with one process.
    12. Prior to starting the Manager processes it is necessary to edit the APPSORA.
    env file on each node in order to specify a TWO_TASK entry that contains
    the INSTANCE_NAME parameter for the local nodes Oracle instance, in order
    to bind each Manager to the local instance. This should be done regardless
    of whether Listener load balancing is configured, as it will ensure the
    configuration conforms to the required standards of having the TWO_TASK set
    to the instance name of each node as specified in GV$INSTANCE. Start the
    Concurrent Processes on their primary node(s). This is the environment
    that the Service Manager passes on to each process that it initializes on
    behalf of the Internal Concurrent Manager. Also make the same update to
    the file referenced by the Applications Listener APPS_<SID> in the
    listener.ora entry "envs='MYAPPSORA= <some directory>/APPS<SID>_<HOSTNAME>.
    env" on each node.
    13. Navigate to Concurrent > Manager > Administer and verify that the Service
    Manager and Internal Monitor are activated on the secondary node, and any
other additional nodes in the cluster. The Internal Monitor should not be
    active on the primary cluster node.
    14. Stop and restart the Concurrent Manager processes on their primary node(s),
    and verify that the managers are starting on their appropriate nodes. On
    the target (secondary) node in addition to any defined managers you will
    see an FNDSM process (the Service Manager), along with the FNDIMON process (
    Internal Monitor).
    Reference Documents
    Note 241370.1
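As an illustration of steps 6 and 12 above, a minimal sketch; the host, port, and instance name PROD1 are hypothetical, and the alias name must match INSTANCE_NAME in GV$INSTANCE for the local node:

```
# tnsnames.ora on CP node 1: one alias per RAC instance, named after the instance
PROD1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = racnode1)(PORT = 1521))
    (CONNECT_DATA = (SID = PROD1)))

# APPSORA.env on CP node 1: bind the local managers to the local instance
TWO_TASK=PROD1
export TWO_TASK
```

The same TWO_TASK value also goes into the env file referenced by the Applications Listener's "envs='MYAPPSORA=..." entry on that node.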

    What is your database version? OS?
We are using the VCP suite for planning purposes. We are using a VCP environment (12.1.3) in a decentralized structure, connecting to three different source environments (consisting of 11i and R12). As per the Oracle note "RAC Configuration Setup For Running MRP Planning, APS Planning, and Data Collection Processes [ID 279156]", we have implemented RAC in our test environment to get better performance.
But after doing all the setups and assigning the concurrent programs to different nodes, we are seeing a huge performance issue. The Complete Collection, which generally takes 180 minutes on average in production, is taking more than 6 hours to complete on RAC.
So I would like to get suggestions from this forum: has anyone implemented RAC in a pure VCP (decentralized) environment? Will there be any improvement if we make our VCP instance RAC?
Do you have PCP enabled? Can you reproduce the issue when you stop the CM?
    Have you reviewed these docs?
    Value Chain Planning - VCP - Implementation Notes & White Papers [ID 280052.1]
    Concurrent Processing - How To Ensure Load Balancing Of Concurrent Manager Processes In PCP-RAC Configuration [ID 762024.1]
    How to Setup and Run Data Collections [ID 145419.1]
    12.x - Latest Patches and Installation Requirements for Value Chain Planning (aka APS Advanced Planning & Scheduling) [ID 746824.1]
    APSCHECK.sql Provides Information Needed for Diagnosing VCP and GOP Applications Issues [ID 246150.1]
    Thanks,
    Hussein

  • What logical device for COMSTAR whole disk iSCSI?

    Hi,
    We have a poor man’s cluster with data redundancy through locally mirrored iSCSI disks.
    Solaris 11.1 x86 + Solaris Cluster 4.1
    Almost all of the instructions I found describe mirroring the physical disks on the SAN, creating a ZFS volume, and exporting that volume over iSCSI.
    We’d like to use the whole disk in raw mode without any filesystem between the physical SAS disks and the iSCSI target.
    What is the correct logical device for stmfadm create-lu ?
    One of our disk pairs was configured with the LU data file /dev/rdsk/c0tXXXd0
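    The whole-disk setup above can be sketched roughly as follows. This is an assumption-laden sketch, not our exact configuration: device paths are placeholders, the GUID comes from the create-lu output, and it assumes the COMSTAR (stmf) and iSCSI target services are already enabled.

    ```shell
    # Sketch: COMSTAR LU directly on a raw disk device (path is a placeholder)
    stmfadm create-lu /dev/rdsk/c0tXXXd0

    # Expose the LU to initiators (all hosts here; use host/target groups
    # to restrict access in a real cluster):
    stmfadm add-view <lu-guid-from-create-lu>

    # Create an iSCSI target for the initiators to log in to:
    itadm create-target
    ```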
    After a few months of use, the zpool has become degraded and scrub fails to fix the problem. zpool status claims both disks are degraded, but smartctl test gives both physical disks a clean bill of health.
    We used this zpool/zones for testing automatic failover during node failure, so it is very possible that this pool had been corrupted in the course of these tests and the logical device selection has nothing to do with the problem.
    We have another pair of disks configured with /dev/rdsk/c0tXXXd0s0
    I benchmarked the two configurations by copying data to a dataset on the mirrored iSCSI disks.
    /dev/rdsk/c0tXXXd0s0 gave a result of 80 MB/sec while the /dev/rdsk/c0tXXXd0 configuration gave a result of 50 MB/sec.
    This may be due to the fact that the /dev/rdsk/c0tXXXd0 configuration is degraded.
    I've seen some references to using c0tXXXd0p0 but I haven’t had a chance to try this out yet.
    Thanks for your help and advice.
    Mikko

    I tested the volume-based iSCSI method and, performance-wise, it was abysmal.
    If it had been 50% slower, we could always have bought another node, but the performance was truly pathetic.
    My makeshift benchmark consists of cp'ing 10 gigs of random test data from the local disk to the HA mirror of iSCSI disks, with a sync at the end.
    Volume-based iSCSI results:     17m25.932s total write time, 9.8 MB/sec
    XXXd0s0 whole-disk iSCSI:       2m13.468s total write time, 77 MB/sec
    I made the test configuration as zpool -> volume -> iSCSI -> EFI partition according to the following instructions:
    www tokiwinter com / solaris-cluster-4-1-part-two-iSCSI-quorum-server-and-cluster-software-installation /
    I made the volume 5% smaller than the available space to account for metadata, spare blocks, etc.
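    The zpool -> volume -> iSCSI chain described above can be sketched as follows. Pool and volume names are placeholders of mine, the 130g size assumes roughly 5% headroom on a 146 GB disk, and the GUID again comes from the create-lu output.

    ```shell
    # Sketch: volume-based iSCSI export (names and sizes are placeholders)
    # Mirror the physical disks into a pool:
    zpool create tank mirror c0tXXXd0 c0tYYYd0

    # Carve out a zvol ~5% smaller than the pool to leave headroom:
    zfs create -V 130g tank/iscsivol

    # Export the zvol's raw device as a COMSTAR LU and publish it:
    stmfadm create-lu /dev/zvol/rdsk/tank/iscsivol
    stmfadm add-view <lu-guid-from-create-lu>
    itadm create-target
    ```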
    I did a little digging into my current partition tables and there is corruption/weirdness in both the XXXd0 and XXXd0s0 cases.
    The XXXd0s0 configuration has not seen many reads or writes, which is why it still appears to be OK.
    I'm convinced that in 6 months it will be in just as bad a condition as the XXXd0 configuration.
    Running prtvtoc on one of the XXXd0 disks shows:
    *                          First     Sector    Last
    * Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
           0     24    00        256    524288    524543
           1      4    00     524544 286198368 286722911
           8     11    00  286722912     16384 286739295
          24    255    1030          0 9007199254741057538 9007199254741057537
          28    255    6332  3256158825667702840 3906306445338542081 7162465271006244920
          29    255    3038  3559300795614704941 18143602103762549516 3256158825667702840
    I was impressed by partition 24 being 4194304000.00 TB, i.e. a 4-zettabyte disk size. That's a great compression ratio for a 146G drive.
    Has anyone in the community successfully used iSCSI without zpools and volumes on the physical disk?
    If you have, please share your results and configuration. I’ve hit the wall with this one.
    -Mikko

  • VDI3: what happens when the Primary VDI core fails?

    A potential customer asked me this and I did not really know how to answer the question ... sorry if this was asked before. So here we go:
    What happens when and if the Primary VDI core fails? Will the entire VDI3 setup go down? Will running sessions be lost? And how do you fix that?
    I found this page:
    http://wikis.sun.com/display/VDI3/MySQL+Cluster+Reconfiguration+Scenarios
    So I assume the "Non-VDI host ==> Primary Management Host" scenario is what would need to be done?
    http://wikis.sun.com/display/VDI3/MySQL+Cluster+Reconfiguration+Scenarios#MySQLClusterReconfigurationScenarios-NonVdiToPrimary
    If this is correct then this would answer the "how do you fix that?" part. OK then.
    But still, I'd welcome it if anyone could explain what exactly would happen if a Primary VDI core fails. The customer's view is that it is a "single point of failure" because the Primary VDI host is not redundant (as opposed to the 2 x Secondary VDI cores). So, is this assumption correct or not?
    Thanks in advance.

    Hi,
    Assume you have 3 VDI core hosts, one of which is the primary, and the primary goes down. What happens then:
    * The underlying database is still running on the remaining hosts
    * All desktop sessions are still running on the remaining hosts
    * New session requests will be handled by the remaining hosts.
    * All desktops are still running on the virtualization hosts.
    So in essence, your VDI cluster is still healthy. The operation is just impacted in this way:
    * You can't add new VDI core hosts
    * You can't change the configuration of the Sun Ray server failover group
    * A failure of another VDI core host (data node) will result in a complete outage of the underlying database
    You should bring the primary back up as soon as possible in order to regain failover capability.
    -Dirk

  • Call Manager upgrade to 10.x version

    Dear Community members,
    I need your urgent assistance.
    My company purchased licenses from a Cisco Gold Partner to upgrade our existing CUCM 7.1.5 to the latest version.
    My plan was to do the upgrade in my lab first and then move on to the production environment,
    but I cannot get anywhere; I always get the same error message.
    Below are the steps I followed:
    1. Installed CUCM version 7.0.2 on my MCS server
    2. Upgraded it to version 7.1.5
    3. Took a backup of my production CUCM 7.1.5
    4. Restored that backup onto the 7.1.5 lab system
    5. Tried to upgrade it to 9.0 but got this error: Upgrades are prohibited during License Grace Period
    6. I've read in the community that it's a bug and that I have 2 options:
         6.1   Delete the license files
         6.2   Install the .cop file: ciscocm.refresh_upgrade_v1.5.cop.sgn
    7. First I tried to delete all license files with this CLI command: file delete license */
        All licenses were deleted.
    8. Tried to install the mentioned .cop file from DVD/CD but got the same error:
    Upgrades are prohibited during License Grace Period
    Please help me fix this issue. What else can I do? How can I fix this upgrade problem?
    P.S. I would like to mention again that this is a closed (offline) environment.

    Hi Rafig,
    Refer to this discussion:
    https://supportforums.cisco.com/discussion/12453416/cucm-715-jump-upgrade-failure
    regds,
    aman

  • Need for idempotent stateless session beans

    I'm trying to find a solution for failing over method calls on a stateless
    session bean, even for method calls that have already started.
    I understand that failover for failures that occur while the method
    is in progress is supported only for idempotent methods.
    My question is: why?
    Assuming that I start a transaction each time I call a method on a bean, I
    believe that the work will be committed only after the method returns
    successfully. Why can't the stub decide that if something went wrong in
    the transaction, then it wasn't committed and the call can be run again?
    We need the method-call failover badly, while making the methods
    idempotent can be very awkward sometimes.
    TIA,
    Eran.


    Eran Erlich wrote:
              > I'm trying to find a solution for failovering method calls on a stateless
              > session bean, even for method calls that already have started.
              > I understand that failover services for failures that occur while the method
              > is in progress, are supported only for idempotent method.
              > My question is why ???
              > Assuming that I start a transaction each time I call a method on a bean, I
              > believe that committing the work will be done only after the method returned
              > successfully. Why can't the stub decide that if it something went wrong in
              > the transaction then it wasn't committed and it can be run again ?
              It's hard for the stub to decide whether to retry, since the failure
              could have happened at any time. So the stub will retry only if it knows
              for sure that it is safe to retry.
              - Prasad
              >
              > We need the method call failover services real bad, while making the methods
              > idempotent can be very awkward sometime.
              >
              > TIA
              > Eran.
              
