Zero downtime deployment

Hi, I was just wondering about "best practice" for supporting zero-downtime deployment.
We have a cluster with N nodes that are not storage nodes, and M nodes that are storage nodes. We use Java all around, and POF-serialize the objects that we store in Coherence.
We want to deploy a new codebase, which requires a restart of all the processes, and it might also include changes to the objects that are being stored (i.e., their serialization might be different).
The typical approach is a "rolling restart" of the various processes, but I fear that all the synchronization Coherence does might not work correctly while some members run the older version of the code and the restarted ones run the newer version.
Does anybody have any experience with this?
Thanks.

Hi,
If you change the serialisation format then a rolling restart will not work unless all your POF classes are evolvable, that is, unless they implement EvolvablePortableObject. Doing this can take a lot of effort. We do it, not for rolling restart, but to make sure clients of our system do not constantly need to upgrade their client libraries. It can be quite complicated, as it is not just serialisation that you need to be aware of: even changing how methods that get called on the server side work can break evolvability, for example changing how a filter works, how an aggregator works, a method return type, etc.
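To make the idea of an evolvable class concrete, here is a minimal, language-neutral sketch in Python. It is not the Coherence API: the classes `TradeV1` and `TradeV2`, the dict-based wire format, and the field indices are all hypothetical. It only illustrates the general technique, where old code keeps the fields it does not understand and writes them back untouched.

```python
from dataclasses import dataclass, field


@dataclass
class TradeV1:
    """Old code: understands only field 0 (trade_id) and field 1 (amount)."""

    trade_id: str
    amount: float
    data_version: int = 1  # version of the data this object was read from
    future_data: dict = field(default_factory=dict)  # fields from newer code

    IMPL_VERSION = 1  # version this class implements (hypothetical)

    @classmethod
    def read(cls, wire):
        fields = dict(wire["fields"])
        return cls(
            trade_id=fields.pop(0),
            amount=fields.pop(1),
            data_version=wire["version"],
            future_data=fields,  # whatever is left is unknown to this version
        )

    def write(self):
        # Write known fields, then replay the unknown ones unchanged.
        fields = {0: self.trade_id, 1: self.amount, **self.future_data}
        return {"version": max(self.IMPL_VERSION, self.data_version), "fields": fields}


@dataclass
class TradeV2:
    """New code: adds field 2 (currency), with a default for old data."""

    trade_id: str
    amount: float
    currency: str = "USD"
    data_version: int = 2
    future_data: dict = field(default_factory=dict)

    IMPL_VERSION = 2

    @classmethod
    def read(cls, wire):
        fields = dict(wire["fields"])
        return cls(
            trade_id=fields.pop(0),
            amount=fields.pop(1),
            currency=fields.pop(2, "USD"),  # absent when written by V1 code
            data_version=wire["version"],
            future_data=fields,
        )

    def write(self):
        fields = {0: self.trade_id, 1: self.amount, 2: self.currency, **self.future_data}
        return {"version": max(self.IMPL_VERSION, self.data_version), "fields": fields}


# A new node writes an entry, an old node updates it, a new node reads it back.
new_wire = TradeV2("T1", 100.0, "EUR").write()
old_view = TradeV1.read(new_wire)
old_view.amount = 250.0
round_trip = TradeV2.read(old_view.write())
```

The point of the sketch is the last three lines: the old node never saw a currency field, yet the value survives its update. As the reply notes, this covers only the data format and not behavioural changes in filters, aggregators, or method signatures.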
We have looked at zero-downtime rolling restarts in the past and decided they were more trouble than they were worth. In our case we have large clusters of 300+ nodes, so a rolling restart would take a very long time, as you need to wait for the cluster to re-balance the partitions each time you stop and then restart a node. We found it could take quite a few hours to do a rolling restart, whereas a full deployment and reload of the data takes about 90 minutes.
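A rough back-of-the-envelope comparison shows why this adds up. The 300 nodes and the 90 minutes come from the figures above; the per-node restart and re-balance times are purely assumed values for illustration and will differ per cluster.

```python
import math


def rolling_restart_minutes(nodes, restart_min, rebalance_min, batch_size=1):
    """Total time if each batch must restart and then wait for re-balancing."""
    batches = math.ceil(nodes / batch_size)
    return batches * (restart_min + rebalance_min)


NODES = 300              # cluster size mentioned above
FULL_REDEPLOY_MIN = 90   # full deployment plus data reload, as reported above

# Assumed: 1 minute to restart a node, 1 minute for partitions to re-balance.
estimate = rolling_restart_minutes(NODES, restart_min=1.0, rebalance_min=1.0)

# Per-node time below which a one-at-a-time rolling restart would beat a full redeploy.
break_even_min_per_node = FULL_REDEPLOY_MIN / NODES

print(f"rolling restart: {estimate / 60:.1f} h, full redeploy: {FULL_REDEPLOY_MIN / 60:.1f} h")
print(f"break-even: {break_even_min_per_node * 60:.0f} s per node")
```

Under these assumed timings the rolling restart takes hours, and it would only beat the full redeploy if each node could be restarted and re-balanced in well under a minute.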
You would also need to be careful if your cluster sits on top of a database. If you need to make any DB changes then this will not work unless they can somehow be evolvable too; otherwise either the new code or the old code will be incompatible with the state of the DB.
Other people may have different experiences of zero-downtime, but as I said, we just found it too much effort and just go for the normal small-amount-of-planned-downtime approach.
JK

Similar Messages

  • ASA 5520 Upgrade 8.0(4)-- 8.4.2--Zero Downtime

    Hello Everyone,
    We are currently on 8.0(4) and planning to upgrade our failover pair to 8.4.2. I read some documents saying that we can perform a zero-downtime upgrade.
    According to the documents below, version 8.2 supports failover between units with mismatched memory:
    http://www.cisco.com/en/US/docs/security/asa/asa82/configuration/guide/ha_overview.html#wp1077536
    https://supportforums.cisco.com/message/3549760#3549760//
    Upgrade Path:
    Active Firewall:                  Standby Firewall:
    8.0(4)                            8.0(4) --> 8.2.2
    8.0(4)                            Upgrade RAM to 2G, reload
    Fail over to standby              8.2.2
    8.0(4) --> 8.2.2                  8.2.2
    Upgrade RAM to 2G, reload         8.2.2 -- Fail over
    8.2.2 -- Active                   8.2.2 -- Standby
    8.2.2                             8.3.1
    8.2.2                             8.4.2
    Fail over to standby              8.4.2
    8.2.2 -- Standby                  8.4.2 -- Active
    Can I perform a zero-downtime upgrade with the above upgrade path? Will the two firewalls act as a failover pair if one is on 8.2.2 and the other is on 8.4.2?
    "Performing Zero Downtime Upgrades for Failover Pairs
    The two units in a failover configuration should have the same major  (first number) and minor (second number) software version. However, you  do not need to maintain version parity on the units during the upgrade  process; you can have different versions on the software running on each  unit and still maintain failover support."  (http://www.cisco.com/en/US/docs/security/asa/asa83/configuration/guide/admin_swconfig.html)
    Upgrade RAM-2G

    You can do it in a lot fewer steps.
    1. Upgrade RAM on standby, reload and make it active.
    2. Repeat process for newly standby unit.
    Now you have 2 units still on 8.0(4) with requisite RAM for 8.3+. TAC will recommend you go up in "baby steps" but the software will work upgrading directly from 8.0 to 8.4. 8.4(3) is the current version for the 5520 platform. At most conservative, I might upgrade to 8.2(4) as an interim but it's not strictly necessary. So my next step would be:
    3. Upgrade standby unit from 8.0(4) to 8.4(3). At this point take stock of the script syntax changes. Examine the upgrade log (on disk0:) and address any discrepancies.
    Note active/standby failover will work here but should not be run this way for any extended time as syntax changes would affect the ability to synchronize if changes are introduced on the active member.
    Finally:
    4. Flip upgraded standby unit to active and upgrade remaining standby unit to 8.4(3).
    If you follow these steps and check your work after each step, this would all be zero downtime.
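The software part of that sequence (steps 3 and 4) can be sketched as a small simulation. This is only a model, not ASA CLI: the `Unit` class, the unit names, and the `upgrade_failover_pair` helper are hypothetical, and the RAM upgrade in steps 1 and 2 is not modelled. What it shows is that one unit stays active at every step.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    name: str
    version: str
    role: str  # "active" or "standby"


def upgrade_failover_pair(units, target):
    """Upgrade the standby, fail over, then upgrade the new standby."""
    history = []

    def snapshot(step):
        history.append((step, [(u.name, u.version, u.role) for u in units]))

    def unit_with_role(role):
        return next(u for u in units if u.role == role)

    snapshot("start")

    # Step 3: only the standby is upgraded and reloaded; the active keeps serving.
    unit_with_role("standby").version = target
    snapshot("standby upgraded")

    # Step 4a: flip roles so the upgraded unit carries traffic.
    active, standby = unit_with_role("active"), unit_with_role("standby")
    active.role, standby.role = "standby", "active"
    snapshot("failed over")

    # Step 4b: upgrade the remaining (now standby) unit.
    unit_with_role("standby").version = target
    snapshot("remaining unit upgraded")
    return history


pair = [Unit("asa-1", "8.0(4)", "active"), Unit("asa-2", "8.0(4)", "standby")]
history = upgrade_failover_pair(pair, "8.4(3)")
for step, state in history:
    print(step, state)
```

Between "standby upgraded" and "remaining unit upgraded" the pair runs mixed versions, which is the window the note above says should be kept short.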

  • WinPE 4.0 Startup Script on a Zero Touch Deployment

    Hi,
    First, the scenario I'm using is SCCM 2012 SP1 + MDT 2012 Update 1, Zero Touch deployment, started from the client, so no USB or PXE boot.
    I want to run a startup command when WinPE boots.
    WinPE has several ways to run startup scripts, but you have to modify or include some files in your boot image to use them. Please correct me if I'm wrong; the methods are:
    Startnet.cmd - When you update the boot image on the Distribution Point, SCCM adds its binaries and Startnet.cmd doesn't run on boot.
    TSConfig.ini - Not valid for my Zero Touch scenario because it doesn't run when the computer reboots to the boot image. Only Media and PXE.
    Unattend.xml - Not valid for my Zero Touch scenario because it doesn't run when the computer reboots to the boot image. Only Media and PXE.
    AutoUnattend.xml - Not valid for my Zero Touch scenario because it doesn't run when the computer reboots to the boot image. Only Media and PXE.
    WinPEshl.ini - The file gets overwritten when you update, so I can't add commands to the file in the boot image and then update. I'm trying to use this method anyway, because you should be able to inject the file into the boot image somehow during the update process.
    I'm trying to inject the file during the update distribution point process by adding it to osdinjection.xml, but the update process then fails to inject the OSD binaries into the boot image.
    Maybe someone knows how to do it; the file and directories have the correct permissions.

    I also wanted to inject some startup scripts into the WinPE environment. SCCM 2012 ignored WinPEshl.ini and startnet.cmd.
    This solved my issue ->
    First I enabled PowerShell for the boot image and then added the script as a "Prestart Command".
    Now my PowerShell script runs before the task sequence starts.

  • Zero Downtime Migration from Oracle to Sybase

    Is there any way/ tool to migrate from oracle to sybase with Zero Downtime??
    Thanks

    Better answered on a Sybase forum I suppose...

  • Zero downtime Upgrade ASA 8.0(4) TO 8.4(7)

    Hi All,
    I checked a few blogs and plan to upgrade an ASA 5520 from 8.0(4) to 8.4(7) following the path below. I will be upgrading the RAM to 2GB at version 8.2.5. The reason for stopping at 8.4.6 is that we may get the error message "No Cfg structure found in downloaded image file" if we upgrade directly to 8.4.7.
    Please advise whether we can perform a zero-downtime upgrade if I follow the path below, and whether the units will still be in HA (active/standby).
    8.0.4-->8.2.5 (Active on 8.0.4 and standby 8.2.5)--> Will they be in HA?
    8.2.5--->8.4.6(Active on 8.2.5 and standby 8.4.6)--> Will they be in HA?
    I believe below one should not be a problem.
    8.4.6-->8.4.7(Active on 8.4.6 and standby 8.4.7)--> Will they be in HA?
    Thanks in advance.
    Regards

    8.0.4-->8.2.5 (Active on 8.0.4 and standby 8.2.5)--> Will they be in HA?
    HA will work, as in the units will fail over. But due to changes in configuration syntax you could run into problems with config synchronisation, and this could also cause issues in traffic flow if a failover occurs. So it is best to upgrade the second ASA to the new version ASAP. It is also the reason Cisco recommends using the same major and minor software versions.
    8.2.5--->8.4.6(Active on 8.2.5 and standby 8.4.6)--> Will they be in HA?
    Same as above.
    8.4.6-->8.4.7(Active on 8.4.6 and standby 8.4.7)--> Will they be in HA?
    This should be fine

  • Cisco ASA non zero downtime upgrade

    Hello,
    With a non-zero-downtime upgrade procedure, are all connections lost, even the NAT and ARP tables? Looking at Table 61-2 (State Information) at http://www.cisco.com/c/en/us/td/docs/security/asa/asa84/configuration/guide/asa_84_cli_config/ha_overview.html#wp1078922, I think it applies only to plain failover and not to an upgrade done with a non-zero-downtime procedure.

    Assuming you have a working HA pair with stateful failover, the Cisco-supported answer is that you cannot skip minor releases (e.g. going from 9.1 directly to 9.3).
    You CAN upgrade directly from 9.1(2) to 9.1(5) as that third ordinal (the number in parentheses) is known as the maintenance release level.
    See table 1-6 in the Release notes for confirmation, excerpted here:
    "You can upgrade from any maintenance release to any other maintenance release within a minor release.
    For example, you can upgrade from 8.4(1) to 8.4(6) without first installing the maintenance releases in between."
    Note that 9.1(3) or later have some restrictions that are unique to those more recent code levels as some file system changes were put in place that requires certain prerequisites for a successful upgrade. Given that you are on 9.1(2) already that doesn't affect you in this case but it may be a consideration for other readers. Those requirements are noted just above Table 1-6 in those release notes.
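As a small illustration of the quoted rule, the check below treats a direct upgrade as covered when both versions share the same major and minor numbers and differ only in the maintenance level. The function names are hypothetical, it encodes nothing beyond that one quoted sentence, and it does not model the extra prerequisites for 9.1(3) and later mentioned above.

```python
import re

# Matches versions written as major.minor(maintenance), e.g. "8.4(6)".
_VERSION = re.compile(r"^(\d+)\.(\d+)\((\d+)\)$")


def parse_version(text):
    """Split '8.4(6)' into (major, minor, maintenance) integers."""
    match = _VERSION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def same_minor_release(current, target):
    """True when only the maintenance level differs between the two versions."""
    return parse_version(current)[:2] == parse_version(target)[:2]


print(same_minor_release("8.4(1)", "8.4(6)"))
print(same_minor_release("9.1(2)", "9.1(5)"))
print(same_minor_release("9.1(2)", "9.3(1)"))
```

A `False` result here only means the quoted rule does not cover the jump; the release notes remain the authority on which other paths are supported.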

  • Zero-downtime Database Migration tool ?

    We are exploring\evaluating tools provided by Oracle (or its partners) that ensure zero-downtime database migration. Migration should include:
    - Migration of data from one version of the application to another version with or without changes to the database schema.
    - Migration of data from staging to production where staging was used for beta testing to host customers who created live data which need to be migrated to production. (Oracle to Oracle, SQL Server to Oracle, MySQL to Oracle, etc)
    - if a data type changes (say int to varchar) in staging database for a particular column in a table, the change migration should happen in the production database as well
    - if a column is added\deleted in a table of staging database, the same table alteration should migrate to production database
    - records in production database should not be deleted\truncated during data\schema migration
    - maintain zero-downtime
    By Zero-Downtime we mean: both the source and the target should be up and accept updates in real time during migration process. This should again be synced across and hence help to eliminate downtime during migration between various vendor databases.
    We are not looking for any ETL product, but out-of-the-box products like GoldenGate and Celona that ensure Zero-downtime database migration.

    Hi,
    I don't think there is any easy answer. It looks like a huge project, so it should be done part by part.
    If I understand correctly:
    1) you have created a staging database with all the changes
    2) production still has the old structure
    3) now you want to merge these two databases into one? Or apply all changes from staging to prod?
    I see one solution: clone your staging database and create a new prod from it. When it's done, switch the connection to your new prod database.
    Regards,
    Tom

  • Zero downtime backups

    I was shocked to read the following notation on Sun's Training website web page:
    "Please note that the Sun Web Learning Center will be down once a week for backup on Saturdays from 1:00 to 3:00 A.M. MDT (Friday 19:00 to 21:00 GMT)."
    I have been researching various methods of backup, including fssnap, ufsdump, flar and others, only to find that there seems to be no "best way" to back up without downtime. It appears that it is still best and most reliable to bring the system into single-user mode to do a proper backup; and the fact that Sun's sysadmins feel it best to bring down the Learning Center system "teaches" me it must be best.
    Does anyone have any insight into why this might be, and how it might be possible for the system backup to take place with zero downtime?

    If you have enough disk space you can go a long way with the use of fssnap. But like you said, your mileage may vary. I usually run ufsdump directly on the slices I wish to have backed up because I know there won't be many writes going on. So far this approach has never failed me (and yes, it has been crash tested a few times).

  • Zero touch deployment using SCCM

    Hi,
    SCCM is described as a zero-touch deployment method for operating system deployment, so please help me clear up the doubt below:
    How does SCCM take care of backup and restore of the entire HDD? If SCCM does not take care of backup and restore of the entire HDD, then how can we say it is a zero-touch deployment method?

    We can use the USMT tool with SCCM to decide what we need to back up. I think a well-planned infrastructure will always keep users' data on file servers rather than on the users' local systems. Please refer to the URLs below for more information:
    http://technet.microsoft.com/en-us/magazine/jj127984.aspx
    http://technet.microsoft.com/en-us/library/dd560793(v=ws.10).aspx
    http://technet.microsoft.com/en-us/library/hh397289.aspx
    http://blogs.technet.com/b/meacoex/archive/2013/02/19/migrate-windows-xp-to-windows-7-using-sccm-2012.aspx
    Prashant Patil

  • Zero-fingerprint deployment scenario - peer review invite

    School/College implementation with student BYO laptop concept.
    Packaging the vApps and securing the desired sandboxing is of course a part of the project as well as having expirydates.
    Delivery is another question: By website for download, using streaming features, delivery by common iFolder or whatever you might have found a solid solution ??
    And on top of the applications what about filesharing ? (iFolder) and printing ? (iPrint) or any other solutions.
    KR /Bjrn

    bkelsen,
    It appears that in the past few days you have not received a response to your
    posting. That concerns us, and has triggered this automated reply.
    Has your problem been resolved? If not, you might try one of the following options:
    - Visit http://support.novell.com and search the knowledgebase and/or check all
    the other self support options and support programs available.
    - You could also try posting your message again. Make sure it is posted in the
    correct newsgroup. (http://forums.novell.com)
    Be sure to read the forum FAQ about what to expect in the way of responses:
    http://forums.novell.com/faq.php
    If this is a reply to a duplicate posting, please ignore and accept our apologies
    and rest assured we will issue a stern reprimand to our posting bot.
    Good luck!
    Your Novell Product Support Forums Team
    http://forums.novell.com/

  • Major version upgrade of WebLogic with zero/minimal downtime

    From what I can tell, the recommended approach for supporting minimal downtime during major version upgrades (e.g. WL 9 -> WL 10) is to have 2 domains available in the production environment.
    Leave one running to support existing users, upgrade the other domain, then swap to perform the upgrade on the first domain.
    We are planning on starting out with WL 9.1, but moving forward we require very high availability...(99.99%).
    Is this my only option?
    According to BEA marketing literature, service pack upgrades can be applied with "zero" downtime...but if this isn't reality, I'd like to hear more...
    Thanks...
    Chuck
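For scale, an availability target translates directly into a yearly downtime budget. The sketch below works that out for the 99.99% figure mentioned in the question; the 365-day year and the helper name `downtime_budget_minutes` are assumptions made for the example.

```python
from decimal import Decimal

MINUTES_PER_YEAR = Decimal(365 * 24 * 60)  # assumes a 365-day year


def downtime_budget_minutes(availability_percent):
    """Minutes of downtime per year allowed by an availability percentage."""
    pct = Decimal(availability_percent)  # pass a string to avoid float rounding
    return (Decimal(100) - pct) / Decimal(100) * MINUTES_PER_YEAR


budget = downtime_budget_minutes("99.99")
print(f"99.99% availability allows about {budget:.2f} minutes of downtime per year")
```

That budget has to cover unplanned outages as well as every upgrade, which is why the two-domain swap described above is attractive for major version changes.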

    Have gotten as far as upgrading all of the software, deleting /var/db/.AppleSetupDone, and rebooting.  It brought me back in to Setup Assistant and let me choose "migrate from another mac os x server" and is now sitting waiting for me to take the old server down and boot it into target disk mode.  Which we can probably do Sunday at about 2am or so...
    You know, Setup Assistant should really let you run Software Update BEFORE migrating from another machine.  We have servers that can't be down for SoftwareUpdates in the middle of the day...

  • How to achieve no-downtime solution deployment on farms with multiple WFEs and LB

    While taking SharePoint Solution Deployer, my open-source PowerShell deployment script, to the next level,
    Bill Simser gave me the idea of making the deployment even smoother on farms with multiple WFEs and a load balancer, in order to achieve a no-downtime deployment.
    The basic idea is to deploy the solutions to the WFEs one by one by
    1. Taking one WFE offline
    2. Installing the solution with the -local switch
    # Solution deployment
    Install-SPSolution -Identity <solutionname>.wsp -GACDeployment -CASPolicies -Local
    # Solution upgrade
    Update-SPSolution -Identity <solutionname>.wsp -LiteralPath LocalPathOfTheSolution.wsp -GACDeployment -Local
    3. Run post-deployment actions on the WFE (ie. restart services, recycle apppools or IIS reset, warmup server), which my script already does for each server
    4. Take WFE online again
    5. Repeat step 1-4 for all other WFEs
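The five steps above can be sketched as a small orchestration loop. This is not the SharePoint Solution Deployer script: the function `rolling_deploy` and its four callbacks are hypothetical placeholders for whatever takes a WFE out of rotation, runs the `-Local` install, performs the post-deployment actions, and puts the server back. The sketch assumes the rollout should stop at the first failure, leaving the failed server offline and the remaining ones untouched.

```python
def rolling_deploy(servers, take_offline, deploy_local, post_deploy, bring_online):
    """Deploy to one server at a time; stop at the first failure.

    Returns (completed_servers, failed_server_or_None).
    """
    completed = []
    for server in servers:
        take_offline(server)          # step 1
        try:
            deploy_local(server)      # step 2
            post_deploy(server)       # step 3
        except Exception:
            # Leave the failed server offline and do not touch the rest,
            # so the farm keeps serving the old version from the others.
            return completed, server
        bring_online(server)          # step 4
        completed.append(server)      # step 5: continue with the next server
    return completed, None


# Usage with stand-in callbacks that only record what would happen.
events = []
completed, failed = rolling_deploy(
    ["wfe1", "wfe2"],
    take_offline=lambda s: events.append(("offline", s)),
    deploy_local=lambda s: events.append(("deploy", s)),
    post_deploy=lambda s: events.append(("post", s)),
    bring_online=lambda s: events.append(("online", s)),
)
print(completed, failed)
```

Stopping early keeps at most one server in an unknown state, which narrows the rollback concern raised below to a single machine.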
    I am struggling with three things here:
    1. The whole deployment process could be quite risky if something goes wrong in between. In order to roll back I would need the original solution if it was already deployed before (which I can of course back up before I replace it).
    Anything which involves changing the content DBs should of course be done after the solution is deployed to the whole farm, so this should not hurt in this case.
    Anyway, MSDN says that the "DeployLocal" method (which I assume is the same as the -Local switch in PS) should be used only for troubleshooting purposes.
    So it would be great to hear about anyone's experiences with it.
    2. As there can be different types of load balancers (hardware, software) which might not be configurable through my script, I assume that taking the WFE out of the load balancer may not always be possible.
    So I thought about just taking the server offline.
    I haven't yet found an option to take only one server in the farm offline (without removing it from the farm, of course), so maybe I am missing something. Any ideas?
    3. Before taking a single WFE offline, I would like to make sure that this server does not have any open sessions or ongoing user operations. Unfortunately I found only the possibility to quiesce the whole farm, but not a single server. Am I missing something?
    Appreciate any ideas which might point me in the direction to solve the overall goal!
    SharePoint Architect, Speaker, MCP, MCPD, MCITP, MCSA, MCTS, Scrum Master/Product Owner
    Blog: www.matthiaseinig.de, Twitter:
    @mattein
    CodePlex: SharePoint Software Factory,
    SharePoint Solution Deployer

    Hi Mike, 
    unfortunately not. I tried several different approaches but didn't really succeed reliably with any of them. So eventually I gave up on it.
    Interesting idea though that Eric Hasley is commenting on the blog post you mentioned.
    "There is another approach that has worked for me in the past.  Because the deployment to each server is handled through a timer job,
    by stopping the timer service in a controlled fashion you can rollout your solution without incurring any user outage."
    It could work like that (in theory).
    Stop the SPTimerV4 on all servers in the farm apart from one.
    Take out the one to deploy to from the NLB
    Wait until it has no connections
    Deploy the solutions on it in the ordinary way (eg. with my SharePoint Solution Deployer ;))
    Put it back into the NLB and take the others out
    Wait until they have no connections left
    Activate the timer service on the other servers and let them deploy
    Put them back into the NLB
    No clue if this is actually working and you still have the problem with the NLB, so it could take a while.
    Also I am not certain what happens in step 5 if users use different versions of your solutions at the same time (old version on the remaining open connections, new version on the updated server).
    I do not have a suitable farm at hand to play with it though, so can't test it.
    Cheers
    Matthias
    Matthias Einig, CEO, SharePoint MVP
    Blog: www.matthiaseinig.de, Twitter:
    @mattein
    Projects: SharePoint Code Analysis Framework (SPCAF),SharePoint Code Check (SPCop),
    SharePoint Software Factory,
    SharePoint Solution Deployer

  • Deploy WSP without IISRESET

    Hi Team,
    I have a WSP with multiple Visual Web Parts containing custom .NET code. I want to deploy it on the production server with zero downtime, without resetting IIS. Can you please help me out?
    Regards Sourabh Soni

    hi!
    Do you have the source code? In the source code, when you open the package, you can set the property "Reset Web Server" to false.
    As the custom code gets deployed to the GAC, the web server won't recognize the new DLL without an IIS reset. So in the end you nevertheless need to do an IIS reset.
    I don't know what your web parts are doing, but if it's possible to create a sandboxed solution you can add them to a site collection without an IIS reset.
    Plan a short downtime for the deployment. Every company will survive being offline for 5 minutes at most.
    br,
    ronald

  • Downtime needs to be reduced while processing releases

    Hi All,
    We have SQL Server 2012 configured with mirroring. Both servers are Standard Edition.
    These servers are deployed in a customer environment and we process frequent releases on them. Each release includes master data changes, table changes, (sometimes) schema changes, etc. Usually we take a downtime of an hour or two to complete the release process. Now our customer is asking for a ZERO downtime setup. They are ready to invest in additional hardware if needed.
    Can anyone suggest how to design an architecture to achieve ZERO downtime with NO data loss?
    Thanks,
    Balu Kalepu

    Hello,
    Try the following technical considerations to speed up deployments:
    - Understand processes and environments in depth.
    - Automate database changes and deployments, and avoid manual changes. This should speed up releases and improve their quality too.
    - Define the deployment orchestration model.
    - Parameterize scripts and configurations.
    - Create build verification tests to validate automated deployments.
    - Schedule database changes at different times (during a maintenance window).
    - Find tools and practices to create smart automated tests.
    Organizational considerations:
    - The fewer functional dependencies there are, the faster changes can be implemented.
    - Improve release planning.
    Hope this helps.
    Regards,
    Alberto Morillo
    SQLCoffee.com

  • ASA 8.2 8.4 9.1 possible with no downtime as we run active/standby?

    Hello,
    We have 2 x ASA 5520s (with 2GB mem) in active/standby mode, they also include the IPS modules.
    The current firmware is 8.2 and I was wondering if it is possible to upgrade these firewalls with no downtime. In the past I have upgraded the standby ASA, rebooted it, made it the active ASA, and then upgraded the new standby ASA.
    I have quite a lot of NAT exemptions (no-NATs?) and a few static NATs. How did you approach this during your upgrades?
    I guess I can roll back, as the 8.2 firmware will still be on the flash and I will have the config?
    Thanks

    Yeah it's supported:
    Release Notes for the Cisco ASA Series, 9.1(x)
    http://www.cisco.com/en/US/docs/security/asa/asa91/release/notes/asarn91.html#wp732442
    This document has the information that you need; it talks about the requirements and zero downtime procedure.
    But you need to take a lot of considerations that you can reference in the document:
    https://supportforums.cisco.com/docs/DOC-12690
    If you don't mind me asking why are you upgrading?
    Because of a fix or feature?
