Distributed Cache Host Down

When I start the cache host on the server with the command below
Start-CacheHost -Computername "name" -CachePort 22233
it immediately displays
HostName : CachePort Service Name            Service Status Version Info
name:22233           AppFabricCachingService DOWN           3 [3,3][1,3]
When I checked the logs, I saw
 AppFabricCachingService.Crash 
  Param System.UriFormatException: Invalid URI: The hostname could not be parsed. at System.Uri.CreateThis
I am able to ping the hostname.
Can somebody point out what is missing?

I tried running the command
cmdlet Get-AFCacheHostConfiguration at command pipeline position 1
Supply values for the following parameters:
ComputerName: ABC1
CachePort: 22233
HostName        : ABC1
ClusterPort     : 22234
CachePort       : 22233
ArbitrationPort : 22235
ReplicationPort : 22236
Size            : 819 MB
ServiceName     : AppFabricCachingService
HighWatermark   : 99%
LowWatermark    : 90%
IsLeadHost      : True
It's running, but when I run
Start-CacheHost -Computername "abc1" -CachePort 22233
it immediately displays
HostName : CachePort Service Name            Service Status Version Info
abc1:22233           AppFabricCachingService DOWN           3 [3,3][1,3]
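
Since the crash log is a System.UriFormatException about the hostname, one thing that may be worth ruling out is a host name containing characters that are not valid in a DNS name (an underscore, for example). The Python sketch below checks a name against the RFC 1123 label rules. It is only an illustration: `hostname_problems` is a hypothetical helper, and it does not reproduce how System.Uri or AppFabric actually parse host names.

```python
import re

# One DNS label per RFC 1123: letters, digits and hyphens only,
# 1 to 63 characters, not starting or ending with a hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def hostname_problems(name):
    """Return a list of reasons why `name` is not a valid RFC 1123 host name.

    Hypothetical helper for illustration; an empty list means no problem found.
    """
    if not name:
        return ["empty hostname"]
    problems = []
    if len(name) > 253:
        problems.append("longer than 253 characters")
    # A single trailing dot (fully qualified form) is tolerated.
    for label in name.rstrip(".").split("."):
        if not _LABEL.fullmatch(label):
            problems.append(f"invalid label: {label!r}")
    return problems


if __name__ == "__main__":
    for candidate in ("ABC1", "my_host", "app-01.example.local"):
        print(candidate, hostname_problems(candidate))
```

If the name stored in the cluster configuration passes a check like this, the cause lies elsewhere and the full stack trace from the log would be the next thing to look at.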

Similar Messages

  • How to know (cmdlet) If my Distributed Cache hosts belong to the same Cluster or not ?

    Forum,
    Our Farm has two servers that are hosting and running the Distributed Cache service. How can I know if both servers/hosts belong to the exact same Cluster? What is the command for that?

    Hi,
    The article below lists PowerShell commands that show the details of each host in the cluster:
    http://almondlabs.com/blog/manage-the-distributed-cache/

  • Distributed cache

    HI,
    We have a server (Server 1) on which the Distributed Cache service was in the "Error Starting" state.
    While applying a service pack, an issue prevented us from applying the patch on Server 1, so we decided to remove the affected server from the farm and work on it. The affected server (Server 1) was removed from the farm through the configuration wizard.
    Even after running the configuration wizard, we could still see Server 1 on the SharePoint Central Administration site (Servers in farm). When we clicked it, the "Distributed cache" service was still listed with the status "Error Starting".
    We tried deleting the server from the farm and got an error message. The ULS logs showed the following:
    A failure occurred in SPDistributedCacheServiceInstance::UnprovisionInternal. cacheHostInfo is null for host 'servername'.
    8130ae9c-e52e-80d7-aef7-ead5fa0bc999
    A failure occurred SPDistributedCacheServiceInstance::UnprovisionInternal()... isGraceFulShutDown 'False' , isGraceFulShutDown, Exception 'System.InvalidOperationException: cacheHostInfo is null     at Microsoft.SharePoint.DistributedCaching.Utilities.SPDistributedCacheServiceInstance.UnProvisionInternal(Boolean isGraceFulShutDown)'
    8130ae9c-e52e-80d7-aef7-ead5fa0bc999
    A failure occurred SPDistributedCacheServiceInstance::UnProvision() , Exception 'System.InvalidOperationException: cacheHostInfo is null     at Microsoft.SharePoint.DistributedCaching.Utilities.SPDistributedCacheServiceInstance.UnProvisionInternal(Boolean isGraceFulShutDown)     at Microsoft.SharePoint.DistributedCaching.Utilities.SPDistributedCacheServiceInstance.Unprovision()'
    8130ae9c-e52e-80d7-aef7-ead5fa0bc999
    We are unable to install or repair SharePoint on the affected server (Server 1). As the server is no longer in the farm, we are also unable to run any PowerShell commands.
    Questions:-
    What would cause that to happen?
    Is there a way to resolve this issue? (please provide the steps)
    Satyam

    Hi
    try this:
    http://edsitonline.com/2014/03/27/unexpected-exception-in-feedcacheservice-isrepopulationneeded-unable-to-create-a-datacache-spdistributedcache-is-probably-down/
    Hope this helps.

  • Distributed Cache queries

    Hi,
    In a distributed cache scheme (across multiple servers/JVMs):
    1. How can I find out which server hosts which data (cache store), and which server holds the backup of that data?
    2. Can this distribution be controlled? For example, can a cache store 'xyz' be required to live only on a specified server '123', with its backup required to be on server '234'?
    Thanks,
    ~Ravi Shanker

    Hi,
    In a redundancy system only one server will be serving and the secondary will be idle. I just want to ensure that these idle systems are also used instead of lying idle.
    Hence the question was raised: can we control the distribution logic, so that the least-used data can be moved onto these idle systems and usage of that data redirected to them?
    In a Coherence cluster, all the servers hold both primary and backup data. Every server is serving requests and holding backups as well, so there are no idle systems.
    But I have a few things that need clarification.
    While running the sample programs as per the documentation, we need to start a Default Cache Server and the Java programs that join the cache server as cluster members.
    But I have seen that joining the cluster works even if the Default Cache Server is shut down?
    Can you provide any info (links) or clarification on how the cache server and cluster mechanism works? I have gone through the documentation, but none of it gives a clear picture.
    This is a wrong assumption: every storage-enabled node can become a cluster member. DefaultCacheServer is just one of the implementations for running a Coherence server.
    HTH
    Cheers,
    _NJ

  • Distributed Cache service stuck in Starting Provisioning

    Hello,
    I'm having a problem with starting/stopping the Distributed Cache service on one of the SharePoint 2013 farm servers. Initially, Distributed Cache was enabled on all the farm servers by default and it was running as a cluster. I wanted to remove it from all hosts but one (APP server) using the PowerShell commands below, which worked fine.
    Stop-SPDistributedCacheServiceInstance -Graceful
    Remove-SPDistributedCacheServiceInstance
    But later I attempted to add the service back to two hosts (WFE servers) using the command below, and unfortunately one of them got stuck in the process. When I look at Services on Server in Central Admin, the status says "Starting".
    Add-SPDistributedCacheServiceInstance
    Also, when I execute the script below, the status says "Provisioning".
    Get-SPServiceInstance | ? {($_.service.tostring()) -eq "SPDistributedCacheService Name=AppFabricCachingService"} | select Server, Status
    I get a "cacheHostInfo is null" error when I use "Stop-SPDistributedCacheServiceInstance -Graceful".
    I tried the script below
    $instanceName ="SPDistributedCacheService Name=AppFabricCachingService" 
    $serviceInstance = Get-SPServiceInstance | ? {($_.service.tostring()) -eq $instanceName -and ($_.server.name) -eq $env:computername}
    $serviceInstance.Unprovision()
    $serviceInstance.Delete()
    but it didn't work either, and I got the error below.
    "SPDistributedCacheServiceInstance", could not be deleted because other objects depend on it.  Update all of these dependants to point to null or 
    different objects and retry this operation.  The dependant objects are as follows: 
    SPServiceInstanceJobDefinition Name=job-service-instance-{GUID}
    Has anyone come across this issue? I would appreciate any help.
    Thanks!

    Hi ,
    Are you able to ping the server that is already running Distributed Cache from this server? For example:
    ping WFE01
    As you are using more than one cache host in your server farm, you must configure the first cache host running the Distributed Cache service to allow inbound ICMP (ICMPv4) traffic through the firewall. If an administrator removes the first cache host from the cluster, which was configured to allow inbound ICMP (ICMPv4) traffic through the firewall, you must configure the first server of the new cluster to allow inbound ICMP (ICMPv4) traffic through the firewall.
    You can create an inbound firewall rule to allow this traffic.
    For more information, refer to this blog post:
    http://habaneroconsulting.com/insights/Distributed-Cache-Needs-Ping#.U4_nmPm1a3A
    Thanks,
    Eric
    Eric Tao
    TechNet Community Support

  • Foundation 2013 Farm and Distributed Cache settings

    We are on a 3-tier farm (1 WFE + 1 APP + 1 SQL) and have had many issues with AppFab and Dist Cache, plus an additional issue with noderunner/Search Services. Memory and CPU are running very high. I read that we shouldn't be running Search and Dist Cache on the same server, nor using a WFE as a cache host. I don't have the budget to add another server to my environment.
    I found an article (IderaWP_CachingFormSharePointPerformance.pdf) saying "To make use of SharePoint's caching capabilities requires a Server version of the platform." because it requires the publishing feature, which Foundation doesn't have.
    So, I removed Distributed Cache (using PowerShell) from my deployment and disabled AppFab. This resolved 90% of the server errors, but performance didn't improve. Now, not only am I getting errors in Central Admin (it expects Dist Cache), but I'm also seeing disk operation read times of 4000 ms.
    Questions:
    1) Should I enable AppFab and disable cache?
    2) Does Foundation support Dist Cache?  Do I need to run Distributed Cache?
    3) If so, can I run with just 1 cache host?  If I shouldn't run it on a WFE or an App server with Search, do I have to stop Search all together?  What happens with 2 tier farms out there? 
    4) Reading through the labyrinth of links on TechNet and MSDN on the subject, most of them say "Applies to SharePoint Server".
    5) Anyone out there on a Foundation 2013 production environment that could share your experience?
    Thanks in advance for any help with this!
    Monica

    That article is referring to BlobCache, not Distributed Cache. BlobCache requires Publishing, hence Server, but DistributedCache is required on all SharePoint 2013 farms, regardless of edition.
    I would leave your DistCache on the WFE, given the App Server likely runs Search. Make sure you install AppFabric CU5 and make the changes noted in the KB for AppFabric CU3.
    You'll need to separately investigate your disk performance issues. It could be poor disk layout, under-spec'ed disks, and so on. Details on the disks that support SharePoint would be valuable (type, kind, RPM if applicable, LUNs in place, etc.).
    Trevor Seward

  • Write behind cache, DB down, when should the system stop taking new data in

    Hello:
    We are trying to use Coherence for our custom ESB, which brokers payloads of various sizes between consumer and provider applications.
    Before Coherence, stopping our DB meant an organization-wide outage for critically important business services.
    Since we have at least 40G of RAM in the production environment, we believe that our app can use the Coherence write-behind option to tolerate at least several hours' worth of DB outage.
    We are currently using a near cache backed by a distributed cache in write-behind mode.
    9 business service JVMs (storage enabled=false) use 30 storage-enabled JVMs.
    IMPORTANT: We need to create an automated alerting facility that determines when the amount of unsaved data reaches a critical level after the DB goes down. This alert should help us decide when our application stops accepting inbound traffic.
    It is hard to use the QueueSize parameter for that because our payload memory footprint can vary from 1KB to 3MB.
    We do not expire any entries, in order to support queries against the cache during a DB outage.
    Our experiments with various flavors of overflow-scheme resulted in OutOfMemoryError, so we decided to implement a RAM-only cache as a first step.
    <near-scheme>
      <scheme-name>message_payload_scheme</scheme-name>
      <front-scheme>
        <local-scheme>
          <scheme-ref>limited_entities_front_scheme</scheme-ref>
          <high-units>100</high-units>
        </local-scheme>
      </front-scheme>
      <back-scheme>
        <distributed-scheme>
          <backing-map-scheme>
            <read-write-backing-map-scheme>
              <internal-cache-scheme>
                <local-scheme>
                  <scheme-ref>limited_bytes_scheme</scheme-ref>
                  <high-units>199229440</high-units>
                </local-scheme>
              </internal-cache-scheme>
              <cachestore-scheme>
                <class-scheme>
                  <class-name>com.comp.MessagePayloadStore</class-name>
                </class-scheme>
              </cachestore-scheme>
              <read-only>false</read-only>
              <write-delay-seconds>3</write-delay-seconds>
              <write-requeue-threshold>2147483646</write-requeue-threshold>
            </read-write-backing-map-scheme>
          </backing-map-scheme>
          <autostart>true</autostart>
        </distributed-scheme>
      </back-scheme>
    </near-scheme>
    <local-scheme>
      <scheme-name>limited_entities_front_scheme</scheme-name>
      <eviction-policy>LRU</eviction-policy>
      <unit-calculator>FIXED</unit-calculator>
    </local-scheme>
    <local-scheme>
      <scheme-name>limited_bytes_scheme</scheme-name>
      <eviction-policy>HYBRID</eviction-policy>
      <unit-calculator>BINARY</unit-calculator>
    </local-scheme>

    Good info ... I feel like I need to restate my original question along with a couple of new questions caused by the discussion above.
    Q1. Does Coherence evict 'dirty', or 'queued', or 'unsaved' objects for cache configuration provided above?
    The answer should be 'NO', otherwise Coherence is unsafe to use as a system of record,
    it should not just drop unsaved information on the floor.
    Q2. What happens to the front tier of the near+partitioned write-behind cache described above when the amount of unsaved data exceeds the max cache capacity defined via high-units?
    I would expect that map.put starts throwing exceptions: cache storage is full, so it should not accept more data.
    Q3. How can I determine the moment when the amount of dirty data in bytes(!), not in objects, hits 85% of the max allowed cache capacity configured in bytes (using the high-units param and the BINARY calculator)?
    A 'DirtyUnits' counter can probably be built with some lower-level Coherence API. Can we use this API?
    Please, understand, that we purchased Coherence for reliability, for making our
    system independent from short DB outages, for keeping our business services up
    and running when DBA need some time for admin operations like rebuilding an index.
    Performance benefits are secondary and are not as obvious for our system which
    uses primary keys only and has a well-tuned co-located Oracle back-end.
    We simply cannot put Coherence to production unless we prove that Coherence
    can reliably hold the data and give us information about approaching crisis
    (the cache full of unsaved data).
    If possible, forward this message to Cameron Purdy,
    who was presenting Coherence to our team several months ago.
    Thanks,
    Vasili Smaliak
    Applications Architect, Enterprise App Integration
    GMAC ResCap
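
For Q3 above, the bookkeeping could look roughly like the Python sketch below. It is not a Coherence API: `DirtyBytesTracker`, `on_put` and `on_stored` are hypothetical names, and the sketch assumes the application can observe every write and every successful CacheStore flush, together with the serialized payload size. The high-units value from the configuration above is used only as an example capacity.

```python
class DirtyBytesTracker:
    """Tracks bytes written to the cache but not yet flushed to the database.

    Hypothetical helper for illustration only, not part of Coherence.
    """

    def __init__(self, capacity_bytes, alert_ratio=0.85):
        if capacity_bytes <= 0:
            raise ValueError("capacity_bytes must be positive")
        self.capacity_bytes = capacity_bytes
        self.alert_ratio = alert_ratio
        self._dirty = {}      # key -> size in bytes of the unsaved payload
        self.dirty_bytes = 0  # running total of unsaved bytes

    def on_put(self, key, size_bytes):
        # Overwriting a key that is still dirty replaces its old size.
        self.dirty_bytes += size_bytes - self._dirty.get(key, 0)
        self._dirty[key] = size_bytes

    def on_stored(self, key):
        # Called once the write-behind store has persisted the entry.
        self.dirty_bytes -= self._dirty.pop(key, 0)

    def should_alert(self):
        # True once unsaved data reaches the configured share of capacity.
        return self.dirty_bytes >= self.alert_ratio * self.capacity_bytes


if __name__ == "__main__":
    tracker = DirtyBytesTracker(capacity_bytes=199229440)
    tracker.on_put("msg-1", 3 * 1024 * 1024)  # a 3MB payload waiting for the DB
    print(tracker.dirty_bytes, tracker.should_alert())
```

Counting bytes rather than entries sidesteps the 1KB to 3MB payload spread that makes QueueSize hard to interpret. Whether the limit applies per storage JVM or to the whole cluster has to be decided separately.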

  • Error handling for distributed cache synchronization

    Hello,
    Can somebody explain to me how error handling works for distributed cache synchronization?
    Say I have four nodes of a WebLogic cluster and 4 different sessions on each one of those nodes.
    On Node A an update happens on object B. This update is going to be propagated to all the other nodes B, C, D. But for some reason the connection between node A and node B is lost.
    In the following xml
    <cache-synchronization-manager>
    <clustering-service>...</clustering-service>
    <should-remove-connection-on-error>true</should-remove-connection-on-error>
    If I set this to true, does this mean that TopLink will stop sending updates from node A to node B? I presume all of this is transparent, and that I do not have to write any code to capture this kind of error.
    Is that correct?
    Aswin.

    This "should-remove-connection-on-error" option mainly applies to RMI or RMI_IIOP cache synchronization. If you use JMS for cache synchronization, then connectivity and error handling is provided by the JMS service.
    For RMI, when this is set to true (which is the default) if a communication exception occurs in sending the cache synchronization to a server, that server will be removed and no longer synchronized with. The assumption is that the server has gone down, and when it comes back up it will rejoin the cluster and reconnect to this server and resume synchronization. Since it will have an empty cache when it starts back up, it will not have missed anything.
    You do not have to perform any error handling, however if you wish to handle cache synchronization errors you can use a TopLink Session ExceptionHandler. Any cache synchronization errors will be sent to the session's exception handler and allow it to handle the error or be notified of the error. Any errors will also be logged to the TopLink session's log.
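
The behaviour described in this reply can be modelled in a few lines of Python. This is a toy model, not TopLink code: `CacheSyncManager`, `connect` and `propagate` are hypothetical names, and a plain callable stands in for an RMI connection to a remote node.

```python
class CacheSyncManager:
    """Toy model of remove-connection-on-error; not the TopLink API."""

    def __init__(self, remove_connection_on_error=True, exception_handler=None):
        self.remove_connection_on_error = remove_connection_on_error
        self.exception_handler = exception_handler
        self.connections = {}  # node name -> callable that sends a change

    def connect(self, node, send):
        # A node that comes back up would call this again to rejoin.
        self.connections[node] = send

    def propagate(self, change):
        """Send `change` to every connected node; return the nodes reached."""
        delivered = []
        for node, send in list(self.connections.items()):
            try:
                send(change)
                delivered.append(node)
            except ConnectionError as exc:
                # Errors are reported to the handler, if one is registered.
                if self.exception_handler is not None:
                    self.exception_handler(node, exc)
                # With the flag set, the failing node is dropped and is not
                # synchronized with again until it reconnects.
                if self.remove_connection_on_error:
                    del self.connections[node]
        return delivered


if __name__ == "__main__":
    def unreachable(change):
        raise ConnectionError("node B unreachable")

    manager = CacheSyncManager(exception_handler=lambda n, e: print(n, e))
    manager.connect("B", unreachable)
    manager.connect("C", lambda change: None)
    print(manager.propagate("update object"))
    print(sorted(manager.connections))
```

With the flag set to false in this model, node B stays in the connection list and every later update attempts it again.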

  • Newsfeed error - The operation failed because the server could not access the distributed cache.

    Recently installed SharePoint 2013 RTM, on the newsfeed page an error is displayed, and no entries display in the following or everyone tabs.
    "The operation failed because the server could not access the distributed cache."
    Reading through various posts, I've checked:
    - Activity feeds and mentions tabs are working as expected.
    - User Profile Service is operational and syncing as expected
    - Search is operational and indexing as expected
    - The farm was installed based on the autospinstaller scripts.
    - Don't believe this to be a permissions issue, during testing added accounts to the admin group to verify
    Any suggestions are welcomed, thanks.
    The full error message and trace logs is as follows.
    SharePoint returned the following error: The operation failed because the server could not access the distributed cache. Internal type name: Microsoft.Office.Server.Microfeed.MicrofeedException. Internal error code: 55. Contact your system administrator for help in resolving this problem.
    From the trace logs there's several messages which are triggered around the same time:
    http://msdn.microsoft.com/en-AU/library/System.ServiceModel.Diagnostics.TraceHandledException.aspx
    Handling an exception. Exception details: System.ServiceModel.FaultException`1[Microsoft.Office.Server.UserProfiles.FeedCacheFault]: Unexpected exception in FeedCacheService.GetPublishedFeed: Object reference not set to an instance of an object.. (Fault Detail is equal to Microsoft.Office.Server.UserProfiles.FeedCacheFault).
    /LM/W3SVC/2/ROOT/d71732192b0d4afdad17084e8214321e-1-129962393079894191
    System.ServiceModel.FaultException`1[[Microsoft.Office.Server.UserProfiles.FeedCacheFault, Microsoft.Office.Server.UserProfiles, Version=15.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c]], System.ServiceModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
    Unexpected exception in FeedCacheService.GetPublishedFeed: Object reference not set to an instance of an object..
     at Microsoft.Office.Server.UserProfiles.FeedCacheService.Microsoft.Office.Server.UserProfiles.IFeedCacheService.GetPublishedFeed(FeedCacheRetrievalEntity fcTargetEntity, FeedCacheRetrievalEntity fcViewingEntity, FeedCacheRetrievalOptions fcRetOptions)
     at SyncInvokeGetPublishedFeed(Object , Object[] , Object[] )    
     at System.ServiceModel.Dispatcher.SyncMethodInvoker.Invoke(Object instance, Object[] inputs, Object[]& outputs)
     at System.ServiceModel.Dispatcher.DispatchOperationRuntime.InvokeBegin(MessageRpc& rpc)
     at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage5(MessageRpc& rpc)
     at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage31(MessageRpc& rpc)
     at System.ServiceModel.Dispatcher.MessageRpc.Process(Boolean isOperationContextSet)
    System.ServiceModel.FaultException`1[Microsoft.Office.Server.UserProfiles.FeedCacheFault]: Unexpected exception in FeedCacheService.GetPublishedFeed: Object reference not set to an instance of an object.. (Fault Detail is equal to Microsoft.Office.Server.UserProfiles.FeedCacheFault).
    SPSocialFeedManager.GetFeed: Exception: Microsoft.Office.Server.Microfeed.MicrofeedException: ServerErrorFetchingConsolidatedFeed : ( Unexpected exception in FeedCacheService.GetPublishedFeed: Object reference not set to an instance of an object.. ) : Correlation ID:db6ddc9b-8d2e-906e-db86-77e4c9fab08f : Date and Time : 31/10/2012 1:40:20 PM
     at Microsoft.Office.Server.Microfeed.SPMicrofeedThreadCollection.PopulateConsolidated(SPMicrofeedRetrievalOptions retOptions, SPMicrofeedContext context)    
     at Microsoft.Office.Server.Microfeed.SPMicrofeedThreadCollection.Populate(SPMicrofeedRetrievalOptions retrievalOptions, SPMicrofeedContext context)    
     at Microsoft.Office.Server.Microfeed.SPMicrofeedManager.CommonGetFeedFor(SPMicrofeedRetrievalOptions retrievalOptions)    
     at Microsoft.Office.Server.Microfeed.SPMicrofeedManager.CommonPubFeedGetter(SPMicrofeedRetrievalOptions feedOptions, MicrofeedPublishedFeedType feedType, Boolean publicView)    
     at Microsoft.Office.Server.Microfeed.SPMicrofeedManager.GetPublishedFeed(String feedOwner, SPMicrofeedRetrievalOptions feedOptions, MicrofeedPublishedFeedType typeOfPubFeed)    
     at Microsoft.Office.Server.Social.SPSocialFeedManager.Microsoft.Office.Server.Social.ISocialFeedManagerProxy.ProxyGetFeed(SPSocialFeedType type, SPSocialFeedOptions options)    
     at Microsoft.Office.Server.Social.SPSocialFeedManager.<>c__DisplayClass4b`1.<S2SInvoke>b__4a()
    Microsoft.Office.Server.Social.SPSocialFeedManager.GetFeed: Microsoft.Office.Server.Microfeed.MicrofeedException: ServerErrorFetchingConsolidatedFeed : ( Unexpected exception in FeedCacheService.GetPublishedFeed: Object reference not set to an instance of an object.. ) : Correlation ID:db6ddc9b-8d2e-906e-db86-77e4c9fab08f : Date and Time : 31/10/2012 1:40:20 PM
     at Microsoft.Office.Server.Microfeed.SPMicrofeedThreadCollection.PopulateConsolidated(SPMicrofeedRetrievalOptions retOptions, SPMicrofeedContext context)    
     at Microsoft.Office.Server.Microfeed.SPMicrofeedThreadCollection.Populate(SPMicrofeedRetrievalOptions retrievalOptions, SPMicrofeedContext context)    
     at Microsoft.Office.Server.Microfeed.SPMicrofeedManager.CommonGetFeedFor(SPMicrofeedRetrievalOptions retrievalOptions)    
     at Microsoft.Office.Server.Microfeed.SPMicrofeedManager.CommonPubFeedGetter(SPMicrofeedRetrievalOptions feedOptions, MicrofeedPublishedFeedType feedType, Boolean publicView)    
     at Microsoft.Office.Server.Microfeed.SPMicrofeedManager.GetPublishedFeed(String feedOwner, SPMicrofeedRetrievalOptions feedOptions, MicrofeedPublishedFeedType typeOfPubFeed)    
     at Microsoft.Office.Server.Social.SPSocialFeedManager.Microsoft.Office.Server.Social.ISocialFeedManagerProxy.ProxyGetFeed(SPSocialFeedType type, SPSocialFeedOptions options)    
     at Microsoft.Office.Server.Social.SPSocialFeedManager.<>c__DisplayClass4b`1.<S2SInvoke>b__4a()    
     at Microsoft.Office.Server.Social.SPSocialUtil.InvokeWithExceptionTranslation[T](ISocialOperationManager target, String name, Func`1 func)
    Microsoft.Office.Server.Social.SPSocialFeedManager.GetFeed: Microsoft.Office.Server.Social.SPSocialException: The operation failed because the server could not access the distributed cache. Internal type name: Microsoft.Office.Server.Microfeed.MicrofeedException. Internal error code: 55.
     at Microsoft.Office.Server.Social.SPSocialUtil.TryTranslateExceptionAndThrow(Exception exception)    
     at Microsoft.Office.Server.Social.SPSocialUtil.InvokeWithExceptionTranslation[T](ISocialOperationManager target, String name, Func`1 func)    
     at Microsoft.Office.Server.Social.SPSocialFeedManager.<>c__DisplayClass48`1.<S2SInvoke>b__47()    
     at Microsoft.Office.Server.Social.SPSocialUtil.InvokeWithExceptionTranslation[T](ISocialOperationManager target, String name, Func`1 func)

    Thanks Thuan,
    I've restarted the Distributed Cache service and the error is still occurring.
    The AppFabric Caching Service is running under the service apps account, and does appear operational based on:
    > use-cachecluster
    > get-cache
    CacheName            [Host]
                         Regions
    default
    DistributedAccessCache_1e9f4999-0187-40e8-aa92-f8308d47d6e9
    DistributedActivityFeedCache_1e9f4999-0187-40e8-aa92-f8308d47d6e9
    DistributedActivityF [SERVER:22233]
    eedLMTCache_1e9f4999 LMT(Primary)
    -0187-40e8-aa92-f830
    8d47d6e9
    DistributedBouncerCache_1e9f4999-0187-40e8-aa92-f8308d47d6e9
    DistributedDefaultCache_1e9f4999-0187-40e8-aa92-f8308d47d6e9
    DistributedLogonToke [SERVER:22233]
    nCache_1e9f4999-0187 Default_Region_0538(Primary)
    -40e8-aa92-f8308d47d Default_Region_0004(Primary)
    6e9                  Default_Region_0451(Primary)
    DistributedSearchCache_1e9f4999-0187-40e8-aa92-f8308d47d6e9
    DistributedSecurityTrimmingCache_1e9f4999-0187-40e8-aa92-f8308d47d6e9
    DistributedServerToAppServerAccessTokenCache_1e9f4999-0187-40e8-aa92-f8308d47d6e9

  • Different distributed caches within the cluster

    Hi,
    I have three machines, n1, n2 and n3, that host Tangosol. Two of them act as the primary distributed cache and the third acts as the secondary cache. I also have WebLogic running on n1, which, based on some requests, pumps data into the distributed cache on n1 and n2. I have a listener configured on n1 and n2, and on the entry-deleted event I would like to populate the Tangosol distributed service running on n3. All three nodes are within the same cluster.
    I would like to ensure that data coming directly from WebLogic is distributed only across n1 and n2 and NOT n3. For example, say I do not start an instance of Tangosol on node n3 and an object gets pruned from either n1 or n2. Ideally I should get a storage-not-configured exception, which does not happen.
    The point is that the moment I call CacheFactory.getCache("Dist:n3") in the cache listener, Tangosol populates the secondary cache by creating an instance of Dist:n3 on either n1 or n2, depending on where the object was pruned from.
    From my understanding, I don't think we can have a config file on n1 and n2 that does not have a scheme for n3. I tried doing that and got an IllegalStateException.
    My next step was to define the Dist:n3 scheme on n1 and n2 with local storage false, and to have a similar config file on n3 with local-storage for Dist:n3 as true and local storage for the primary cache as false.
    Can I configure local-storage specific to a cache rather than to a node?
    I also have an EJB deployed on WebLogic that serves a getData request, i.e. this EJB will also check the primary cache and the secondary cache for data. I would have the statement
    NamedCache n3 = CacheFactory.getCache("n3") in the bean as well.

    Hi Jigar,
    i've three machines n1 , n2 and n3 respectively that
    host tangosol. 2 of them act as the primary
    distributed cache and the third one acts as the
    secondary cache.
    First, I am curious as to the requirements that drive this configuration setup.
    i would like to ensure that the data directly coming
    from weblogic should only be distributed across n1
    and n2 and NOT n3. for e.g. i do not start an
    instance of tangosol on node n3. and an object gets
    pruned from either n1 or n2. so ideally i should get
    a storage not configured exception which does not
    happen.
    The point is the moment is say
    CacheFactory.getCache("Dist:n3") in the cache
    listener, tangosol does populate the secondary cache
    by creating an instance of Dist:n3 on either n1 or n2
    depending from where the object has been pruned.
    from my understanding i dont think we can have a
    config file on n1 and n2 that does not have a scheme
    for n3. i tried doing that and got an illegalstate
    exception.
    my next step was to define the Dist:n3 scheme on n1
    and n2 with local storage false and have a similar
    config file on n3 with local-storage for Dist:n3 as
    true and local storage for the primary cache as
    false.
    can i configure local-storage specific to a cache
    rather than to a node.
    i also have an EJB deployed on weblogic that also
    entertains a getData request. i.e. this ejb will also
    check the primary cache and the secondary cache for
    data. i would have the statement
    NamedCache n3 = CacheFactory.getCache("n3") in the
    bean as well.
    In this scenario, I would recommend having the "primary" and "secondary" caches on different cache services (i.e. distributed-scheme/service-name). Then you can configure local storage on a service-by-service basis (i.e. distributed-scheme/local-storage).
    Later,
    Rob Misek
    Tangosol, Inc.
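
    To make the recommendation above concrete, here is a minimal sketch in plain Java with no Coherence dependency. It only models the idea that local storage is decided per service and per node, so a cache mapped to its own service can be stored on n3 alone. The class, the method names, the service names "PrimaryService" and "SecondaryService", and the cache name "primary" are all hypothetical; in a real setup this mapping would live in the cache configuration file under distributed-scheme/service-name and distributed-scheme/local-storage.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model, NOT the Coherence API: each cache belongs to one service,
// and storage is enabled per service on individual nodes.
class StoragePlan {
    private final Map<String, String> cacheToService = new HashMap<>();
    private final Map<String, Set<String>> storageNodes = new HashMap<>();

    // Assign a cache to a (hypothetical) service name.
    void mapCache(String cache, String service) {
        cacheToService.put(cache, service);
    }

    // Mark a node as storage-enabled for one service only.
    void enableStorage(String service, String node) {
        storageNodes.computeIfAbsent(service, k -> new TreeSet<>()).add(node);
    }

    // Nodes that may hold data for the given cache; empty if none are configured.
    Set<String> storageNodesFor(String cache) {
        Set<String> nodes = storageNodes.get(cacheToService.get(cache));
        return nodes == null ? Collections.<String>emptySet() : nodes;
    }

    // The layout discussed in this thread: primary data on n1 and n2, Dist:n3 on n3 only.
    static StoragePlan example() {
        StoragePlan plan = new StoragePlan();
        plan.mapCache("primary", "PrimaryService");     // hypothetical names
        plan.mapCache("Dist:n3", "SecondaryService");
        plan.enableStorage("PrimaryService", "n1");
        plan.enableStorage("PrimaryService", "n2");
        plan.enableStorage("SecondaryService", "n3");
        return plan;
    }
}
```

    With a single shared service, both caches would resolve to the same set of storage nodes, which matches the behaviour described in the question where Dist:n3 data ended up on n1 or n2.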

  • Setup failover for a distributed cache

    Hello,
    For our production setup we will have 4 app servers with one clone per app server, so there will be 4 clones in a cluster. We will also have 2 JVMs for our distributed cache, one being a failover; both of those will be in the cluster.
    How would I configure the failover for the distributed cache?
    Thanks

    user644269 wrote:
    Right - so each of the near cache schemes defined would need to have the back map high-units set to where it could take on 100% of data.
    Specifically, the near-scheme/back-scheme/distributed-scheme/backing-map-scheme/local-scheme/high-units value (take a look at Cache Configuration Elements: http://coherence.oracle.com/display/COH34UG/Cache+Configuration+Elements).
    There are two options:
    1) No Expiry -- In this case you would have to size the storage-enabled JVMs so that an individual JVM could store all of the data.
    or
    2) Expiry -- In this case you would set the high-units to a value that you determine. If you want it to store all the data then it needs to be set higher than the total number of objects that you will store in the cache at any given time, or you can set it lower with the understanding that once that high-units value is reached Coherence will evict some data from the cluster (i.e. remove it from the "cluster memory").
    user644269 wrote:
    Other than that - there is no configuration needed to ensure that these JVMs act as a failover in the event one goes down.
    Correct, data fault tolerance is on by default (set to one level of redundancy).
    :Rob:
    Coherence Team
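
    The sizing rule discussed above can be sketched as simple arithmetic. This is only a rule-of-thumb helper, not Coherence behaviour: the class and method names are hypothetical, and it assumes the data is spread evenly over the surviving storage-enabled JVMs and is counted in the same units as high-units.

```java
// Rule-of-thumb sizing, NOT part of Coherence: how much each storage-enabled JVM
// must be able to hold so the survivors can still carry all of the data.
class FailoverSizing {
    static long requiredPerJvm(long totalUnits, int storageJvms, int toleratedFailures) {
        int survivors = storageJvms - toleratedFailures;
        if (survivors <= 0) {
            throw new IllegalArgumentException("at least one JVM must survive");
        }
        // Ceiling division: round up so the survivors never fall short.
        return (totalUnits + survivors - 1) / survivors;
    }
}
```

    For the setup in this thread (2 cache JVMs, one allowed to fail), `FailoverSizing.requiredPerJvm(total, 2, 1)` returns the full total, which is the "100% of data" figure mentioned above.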

  • Distributed cache during solution deployment

    Hi,
    We are using MySite newsfeed.
    What is the best practice during solution deployment so that the distributed cache is not affected?
    Last time we did an IIS reset the feed was lost, and we had to use the repopulation job to pull the data back. Is there any better way to follow during deployment and server upgrades?
    Thanks,
    Sudan

    Hi Sudan,
    The Distributed Cache service stores data in memory only, so executing iisreset might cause a cache flush. Please refer to the thread below to move all cached items from the local cache host to another cache host in the cluster:
    http://social.technet.microsoft.com/Forums/sharepoint/en-US/6a415c75-4ca3-4c43-9110-25a68db93a54/sharepoint-2013-my-site-newsfeed-posts-disappear?forum=sharepointgeneral 
    Regards,
    Rebecca Tu
    TechNet Community Support

  • Node/Machine fail behavior of distributed caches

    My high level question is: what happens to a distributed cache when nodes fail?
    We have 2 servers which run 4 JVMs each. We have the default of 1 backup set.
    What happens when an entire machine fails (all 4 JVMs go down with the ship)?
    What happens when I stop and restart each JVM one at a time?
    My main concern is data-loss. Since I have backup set to 1 my expectations for both of my scenarios above is that I would lose no cached data, but that does not appear to be the case. I am left wondering in what scenarios the backups help.
    How does the cluster tell the difference between (a) a node failed but will be restored soon enough so don't reduce the cluster size, and (b) a node was removed and will never come back so reduce the cluster size?
    It would be nice to see a wiki page that describes the gory details of how the cluster handles various failure scenarios.

    Each partition is allocated to a JVM and a backup of that partition is allocated to another JVM. If you are running on multiple physical machines then Coherence will put the backup partition on another machine to the primary. You can tell how successful Coherence has been at doing this by looking at the StatusHA value for your services in JMX using something like JConsole. If the backup partitions are on different machines to the primary partitions the StatusHA value will say MACHINE-SAFE, if the backup is on the same machine as the primary the StatusHA value will be NODE-SAFE and if there is no backup the StatusHA value will be ENDANGERED.
    There is also a status called BALANCED, which means that besides being MACHINE-SAFE, the partitions are also as evenly distributed between nodes (not boxes) as possible.
    When you lose a JVM (or multiple JVMs if you lose a whole machine) this causes a partition loss event for the partitions that were allocated to the dead JVMs. In the case of losing a single JVM the backup partition now becomes the primary and a new backup is created (following the same rules about creating the backup on another machine if possible). If you lose a whole machine then the same thing happens but on a bigger scale.
    A small correction: a partition loss event happens only when you lose both the primary and all backups. What you described is not a partition loss, as a backup is there and is promoted to primary.
    Also, losing a whole machine is the same only in the case when you were machine-safe (or at least those partitions which had primaries on the lost box were machine-safe). If those partitions were not machine-safe, then you would have lost partitions, as all copies of the non-machine-safe partitions on that box were lost.
    Other than that it does happen as described.
    In your case you should not necessarily see data loss if you kill a single node from the cluster you described, and neither should you lose data if you kill a whole machine, provided, as mentioned, that the cluster or at least the partitions having primaries on the killed box are machine-safe.
    There are scenarios where data loss can occur, for example losing two JVMs on different machines at exactly the same time - this is because there is a very high chance that those two JVMs shared primary and backups for at least one partition.
    If you lose a JVM the cluster size will always be reduced - it cannot be anything else as a node has just departed the cluster.
    The above descriptions may be a bit simplified but I think they are close enough to describe what you wanted to know.
    JK
    Best regards,
    Robert
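
    The failure scenarios described in this thread can be checked with a small self-contained simulation. This is a simplified model in plain Java, not Coherence code: it assumes one backup per partition, ignores rebalancing after a failure, and the class and method names are hypothetical. A partition counts as lost only when its primary and its backup are both on dead nodes, following the correction above.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// One partition with the node holding its primary copy and the node holding
// its single backup copy (null when there is no backup).
class Partition {
    final String primary;
    final String backup;

    Partition(String primary, String backup) {
        this.primary = primary;
        this.backup = backup;
    }
}

// Simplified model of the StatusHA values discussed above.
class HaModel {
    // ENDANGERED if any partition has no backup, NODE-SAFE if some backup shares
    // a machine with its primary, otherwise MACHINE-SAFE.
    static String statusHa(List<Partition> partitions, Map<String, String> machineOfNode) {
        boolean machineSafe = true;
        for (Partition p : partitions) {
            if (p.backup == null) {
                return "ENDANGERED";
            }
            if (machineOfNode.get(p.primary).equals(machineOfNode.get(p.backup))) {
                machineSafe = false;
            }
        }
        return machineSafe ? "MACHINE-SAFE" : "NODE-SAFE";
    }

    // Counts partitions whose primary and backup are both gone after the given
    // nodes die at the same time; a surviving backup would be promoted instead.
    static int lostPartitions(List<Partition> partitions, Set<String> deadNodes) {
        int lost = 0;
        for (Partition p : partitions) {
            boolean primaryDead = deadNodes.contains(p.primary);
            boolean backupDead = p.backup == null || deadNodes.contains(p.backup);
            if (primaryDead && backupDead) {
                lost++;
            }
        }
        return lost;
    }
}
```

    With backups placed on the other machine, killing every node of one machine loses nothing in this model, while killing one node on each machine at the same moment loses any partition whose primary and backup sat on exactly those two nodes.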

  • How can i configure Distributed cache servers and front-end servers for Streamlined topology in share point 2013??

    My question is regarding SharePoint 2013 farm topology. If I want to go with the Streamlined topology and have 2 Distributed Cache and Rm servers + 2 front-end servers + 2 batch-processing servers + a clustered SQL Server, then how will the Distributed Cache servers connect to the front-end servers? Can I use the Windows 2012 NLB feature? If I use NLB, do I need to install NLB on all Distributed Cache servers and front-end servers and split out services? What would the configuration be for my scenario?
    Thanks in advance!

    For the Distributed Cache servers, you simply make them farm members (like any other SharePoint servers) and turn on the Distributed Cache service (while making sure it is disabled on all other farm members). Then, validate that no other services (except for the Foundation Web service, due to ease of solution management) are enabled on the DC servers and that no end user requests or crawl requests are being routed to the DC servers. You do not need/use NLB for DC.
    Trevor Seward
    This post is my own opinion and does not necessarily reflect the opinion or view of Microsoft, its employees, or other MVPs.

  • Limitation on number of objects in distributed cache

    Hi,
    Is there a limitation on the number (or total size) of objects in a distributed cache? I am seeing a big increase in response time when the number of objects exceeds 16,000. Normally, the ServiceMBean.RequestAverageDuration value is in the 6-8ms range as long as the number of objects in the cache is less than 16K - I've run our application for weeks at a time without seeing any problems. However, once the number of objects exceeds the magic number of 16K the average request duration almost immediately jumps to over 100ms and continues to climb as more objects are added.
    I'm fairly confident that the cache is indexed properly (as Dimitri helped us with that). Are there any configuration changes that could possibly help out here? We are using Coherence 3.3.
    Any suggestions would be greatly appreciated.
    Thanks,
    Jim

    Hi Jim,
    The results from the load test look quite normal: the system fairly quickly stabilizes at a particular performance level and remains there for the duration of the test. In terms of latency results, we see that the cache.putAll operations are taking ~45ms per bulk operation, where each operation is putting 100 1K items; for cache.getAll operations we see ~15ms per bulk operation. Additionally, note that the test runs over 256,000 items, so it is well beyond the 16,000 limit you've encountered.
    So it looks like your application is exhibiting different behavior than this test. You may wish to try to configure this test to behave as similarly to yours as possible. For instance, you can set the size of the cache to just over/under 16,000 using the -entries parameter, set the size of the entries to 900 bytes using the -size parameter, and set the total number of threads per worker using the -threads parameter.
    What is quite interesting is that at 256,000 1K objects the latency measured with this test is apparently less than half the latency you are seeing with a much smaller cache size. This would seem to point at the issue being related to or rooted in your test. Would you be able to provide a more detailed description of how you are using the cache, and the types of operations you are performing?
    thanks,
    mark
