Why Clustering

Hi,
We are in the process of deciding whether we should use clustering to meet our high-scalability needs.
What we are planning is to run WebLogic in a multi-JVM environment in which the instances have no knowledge of each other. FYI, we have one dedicated WL server for the data access layer and data caching, which will allow us to achieve persistence integrity. We will be using TopLink for WL.
I have a few basic questions regarding clustering:
1) Why should I use clustering?
2) What does WL clustering buy me?
3) What will I lose if I do not use WebLogic clustering?
4) Is scalability the only reason people use clustering?
Any help or pointer is highly appreciated.
Thanks in advance,
Ahimanikya Satapathy
OrderCare.Com

What you describe is something like clustering. It seems you are saying that you can just run many boxes, each with its own WebLogic, and that this is fine. The problem is that if each of those servers runs its own non-clustered WebLogic, all the sessions stored on a server are lost if that server dies. Even if you can load-balance among many individual non-clustered WebLogic servers, you won't be able to share session state with the other running servers.
Clustering makes sure (through in-memory replication) that if server A holds sessions, it looks for a "buddy" server to copy that session information to. Say it finds server B. Now A and B have identical HTTP session information (assuming we are talking about the front end). If server A dies, server B takes over, and your clients don't lose everything they had entered (JSP pages, JavaBeans, session-scoped data, etc.). So if you have a cart system that visitors add items to, or a multi-page "wizard-like" form that uses the same bean across pages, and that server dies, their information is still in memory on another WebLogic server. Even better, if one of the two servers dies, the remaining server locates another server (if you have three or more running) and replicates its session data to that server, so it automatically "fails over" for you in case one dies.
Hope that helps a little bit.
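The buddy mechanism described above can be sketched as a small simulation. This is only a toy model under stated assumptions, not WebLogic's actual implementation: the `Server` and `Cluster` classes, their method names, and the "first live server that is not the owner" buddy selection are all hypothetical and exist purely for illustration.

```python
import copy


class Server:
    """Hypothetical app server that keeps session data in memory."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.sessions = {}  # session id -> attributes (owned or replica copy)


class Cluster:
    """Toy cluster: each session lives on an owner plus one buddy copy."""

    def __init__(self, names):
        self.servers = {n: Server(n) for n in names}
        self.primary = {}  # session id -> name of the owning server
        self.buddy = {}    # session id -> name of the server holding the copy

    def _pick_buddy(self, owner):
        # Assumption: any other live server may act as the buddy.
        for name, server in self.servers.items():
            if server.alive and name != owner:
                return name
        return None

    def _replicate(self, sid):
        owner = self.primary[sid]
        buddy = self.buddy.get(sid)
        if buddy is None or not self.servers[buddy].alive:
            buddy = self._pick_buddy(owner)
        if buddy is None:
            self.buddy.pop(sid, None)  # nobody left to hold a copy
            return
        self.buddy[sid] = buddy
        self.servers[buddy].sessions[sid] = copy.deepcopy(
            self.servers[owner].sessions[sid])

    def set_attribute(self, server_name, sid, key, value):
        # The first server to see a session becomes its owner.
        owner = self.primary.setdefault(sid, server_name)
        self.servers[owner].sessions.setdefault(sid, {})[key] = value
        self._replicate(sid)

    def get_session(self, sid):
        return self.servers[self.primary[sid]].sessions[sid]

    def fail(self, name):
        """Simulate a crash: everything in that server's memory is gone."""
        self.servers[name].alive = False
        self.servers[name].sessions.clear()
        for sid in list(self.primary):
            if self.primary[sid] == name:
                survivor = self.buddy.pop(sid, None)
                if survivor is None:
                    del self.primary[sid]     # no copy existed: session lost
                    continue
                self.primary[sid] = survivor  # the buddy takes over
            elif self.buddy.get(sid) == name:
                del self.buddy[sid]           # only the copy was lost
            self._replicate(sid)              # look for a new buddy


cluster = Cluster(["A", "B", "C"])
cluster.set_attribute("A", "s1", "cart", ["book"])
cluster.fail("A")
print(cluster.primary["s1"], cluster.get_session("s1"), cluster.buddy["s1"])
```

In this sketch, after server A fails the session is served from B and a fresh copy is placed on C, which mirrors the "remaining server locates another server" behaviour described in the reply.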
          "Ahimanikya Satapathy" <[email protected]> wrote in message
> Thanks Wei,
>
> I think the answer is not that simple, and it does not convince me. I can
> achieve load balancing, failover, performance, availability, and scalability
> by other means; for example, I can run multiple WebLogic instances on
> multiple powerful boxes to achieve all of these. So how does WebLogic
> clustering help me in this regard, and where does it stand? I would
> appreciate an in-depth answer. Also, if anybody knows of good documentation
> that makes a good argument for clustering, that would be a great help.
>
> -- Ahimanikya
>
> Wei Guan wrote in message:
> >For all your questions, load-balancing, failover, performance, availability,
> >scalability, etc. are the answers.
> >
> >--
> >Cheers - Wei

Similar Messages

MSCS on Windows 2003 EE Server and x64

Hello Peter,
I was told at the Fujitsu-Siemens SAP Competence Center that you are the only one who can answer this question about the compatibility of MSCS with ECC 4.7 Ext2.
One of our customers wants to install the following on an x64 Intel machine:
SAP BASIS 6.20; R/3 4.7 Ext2.0 (SAP Kernel 6.40) release for x86_64 64-bit Windows Enterprise Server 2003
SQL Server 2005
According to SAP Note 106275 this is only possible with Itanium servers, but I heard it should already be possible with x64 servers, so the note is probably outdated (see the answers from the Microsoft-SAP Alliance team and the note below).
My questions are:
1) Is MSCS released for x64 and ECC 4.7 Ext2?
2) If yes, I guess the MSCS configuration would be a classic two-node one, just as Peter Simon indicates in your slides:
3) And last but not least, do you know of a certified partner who is able to implement such a cluster with SAP?

Hi Javier,
Before answering your questions directly:
Why cluster the application servers (dialog instances)?
App servers are not considered a SPOF (single point of failure). Since your DB+CI is protected by a cluster, the CI will take over when an app server fails.
If you have additional servers, I would prefer to install two app servers rather than one protected by a cluster. When one app server fails you still have the other, which gives the same "protection" as a cluster would, but in normal operation you have two app servers, giving much better utilization of the servers than the cluster would.
So my answers to your questions would be:
1) It may be possible/supported to cluster an app server, but it makes no sense to do so.
2) No, do not install additional app servers into the same cluster that protects your DB+CI instance; use additional (standalone) servers for the app server(s).
Regards
Rolf

Why are variants sometimes better to use than clusters?

I am using variants in a driver I am developing to avoid having to pass a ton of wires between subVIs. My mentor asked me why I was doing this and why I wasn't just using clusters. The only answer I could come up with was that it was convenient to be able to "dereference" (I'm not sure if this is the correct term) the variant for attributes with the Get Variant Attribute VI and a string associated with the attribute of interest. Are there any other advantages of using variants?

Yes! Variants are great when you are trying to write code that must operate on many different data types. However, there are caveats: error handling becomes both more difficult and more critical. Take a look at the OpenG LabVIEW Data Tools, which make variant usage practical. This library is part of the OpenG Toolkit.
LabVIEW Data Tools Presentation - variants, run-time type checking, and data manipulation design patterns
Examples:
Get Object Attributes as XML
OpenG Flatten to XML (http://www.openg.org/tiki/tiki-index.php?page=EXAMPLE%20-%20OpenG%20Flatten%20to%20XML)
Python Client to LabVIEW Server
Universal Probe

Why we need clusters

Hello
Just curious to know why we need to cluster servers. Why can't we just have, say, 3 or 4 managed servers as part of a WebLogic domain, deploy the application onto them, and configure a load balancer to target the managed servers individually rather than the cluster? I know the general concept of clusters, the advantages of using them, etc., but I wanted to know whether there are any specific disadvantages if we don't cluster the servers.
thanks

Hi,
Scenario: Suppose you have two managed servers, MS1 and MS2, and one application (AppOne.war) deployed on both.
Now see what the load balancer can do in this scenario:
1) When a request comes from Client-A, the load balancer forwards it to either MS1 or MS2. Suppose this time it is MS1, so MS1 starts serving the request.
2) This means the HTTP session of Client-A lives inside the MS1 server JVM. Because the session was created on MS1, Client-A can now access any page in the application that requires a valid HttpSession object.
3) Now suppose MS1 suddenly goes down or crashes. If Client-A refreshes the browser, a fresh request is sent, but because MS1 is already down, the load balancer forwards the request to MS2.
The problem is that MS2 doesn't have Client-A's HttpSession data, so the client is sent to the login page rather than the page actually requested.
This is the point where we need clustering. If MS1 and MS2 had been part of a cluster in the above scenario, the session data could have been replicated to both managed servers (as a primary HttpSession and a secondary HttpSession), so even if one of the managed servers crashes, the client can keep using the application in the same manner without interruption, and without even knowing that something went wrong with the server.
Thanks
Jay SenSharma
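The non-clustered scenario above (steps 1 to 3) can be reproduced with a few lines of simulation. This is a minimal sketch under assumptions: `ManagedServer` and `LoadBalancer` are hypothetical classes, the balancer simply picks the first server that is up, and nothing here uses real WebLogic APIs.

```python
class ManagedServer:
    """Hypothetical standalone server; sessions live only in its own memory."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.sessions = {}  # session id -> logged-in user

    def handle(self, session_id, user=None):
        if user is not None:
            self.sessions[session_id] = user  # login creates the session
        if session_id in self.sessions:
            return f"{self.name}: page for {self.sessions[session_id]}"
        return f"{self.name}: redirect to login"


class LoadBalancer:
    """Hypothetical balancer that targets the servers individually."""

    def __init__(self, servers):
        self.servers = servers
        self.sticky = {}  # session id -> server that served it last

    def route(self, session_id, user=None):
        server = self.sticky.get(session_id)
        if server is None or not server.up:
            # Assumption: fall back to the first server that is still up.
            server = next(s for s in self.servers if s.up)
            self.sticky[session_id] = server
        return server.handle(session_id, user)


ms1, ms2 = ManagedServer("MS1"), ManagedServer("MS2")
lb = LoadBalancer([ms1, ms2])
print(lb.route("client-a", user="alice"))  # served by MS1, session created
ms1.up = False                             # MS1 crashes
print(lb.route("client-a"))                # MS2 has no session for client-a
```

The second request lands on MS2, which has never seen the session, so the client is pushed back to the login page. With session replication in a cluster, MS2 would already hold a secondary copy.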

Why Sort operation on clustered columnstore index insert?

Looking at the execution plan for a clustered columnstore index insert I noticed a Sort operation. My T-SQL has no sort and I understand that the clustered columnstore is not a sorted index. Why would there be a Sort operation in the execution plan?
This is running on:
Microsoft SQL Server 2014 - 12.0.2000.8 (X64)
Feb 20 2014 20:04:26
Copyright (c) Microsoft Corporation
Enterprise Edition: Core-based Licensing (64-bit) on Windows NT 6.3 <X64> (Build 9600: ) (Hypervisor)

Hello,
This is because of how a columnstore index works: the index is built and compressed at the column level, not at the row level. SQL Server orders the data so that equal values are adjacent, which lets it calculate the compressed index values.
Olaf Helper
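To see why placing equal values next to each other helps column-level compression in general, here is a small run-length encoding sketch. It illustrates the general principle only: it is not SQL Server's actual compression algorithm, the function name and sample column are made up, and it does not by itself confirm that this is the reason the Sort operator appears in the plan.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Hypothetical column values in arrival order.
column = ["DE", "US", "DE", "FR", "US", "DE", "FR", "US"]

unsorted_runs = run_length_encode(column)
sorted_runs = run_length_encode(sorted(column))

print(len(unsorted_runs))  # 8 runs: no two neighbours are equal
print(sorted_runs)         # [('DE', 3), ('FR', 2), ('US', 3)]
```

The same eight values need eight runs in arrival order but only three once equal values are adjacent, which is the kind of gain a column-oriented store is after.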

Why we need to go for clusters ?

Hi ,
Please clarify why we need to go for clusters in ABAP-HR programming.
Thanks in advance..
Suresh..

Because that is where the transactional data (Time and Payroll) is stored. For reporting on the master data tables you can use the PAnnnn and HRPnnnn tables with the relevant authority checks. Please go through the following <a href="http://help.sap.com/saphelp_erp2005vp/helpdata/en/4f/d528ff575e11d189270000e8322f96/frameset.htm">SAP Help</a> for additional information on HR clusters.
~Suresh

Why does qmaster fail on my clustered computers?

I'm using FCP 7 with Compressor 3.5, and I'm getting failures when I try to run Compressor against a render farm. I can't figure out why.
All of my computers are set to managed, and I've used QAdministrator to build a cluster. My main iMac is also set as the controller. I've shared my folders off of the iMac with the network so that the other computers can R/W to the local disk. I do NOT have a SAN or NAS currently configured.
All of my computers are statically assigned their IP addresses and are not connected to a DNS/DHCP server or Domain Controller. It's all local network without internet connection.
What could I be doing wrong?
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<services>
   <service type="jobcontroller:com.apple.qmaster.cluster.admin" displayName="Tim Bergmann’s iMac" address="tcp://10.1.1.2:52248" hostName="Tim-Bergmanns-iMac.local">
      <logs tms="385495925.807" tmt="03/20/2013 12:12:05.807" pnm="qmasterqd">
         <mrk tms="385495925.808" tmt="03/20/2013 12:12:05.808" pid="1997" kind="begin" what="log-session"/>
         <log tms="385495925.810" tmt="03/20/2013 12:12:05.810" pid="1997" msg="Starting up"/>
         <log tms="385495925.817" tmt="03/20/2013 12:12:05.817" pid="1997" msg="Keep completed targets in history = true"/>
         <log tms="385495925.817" tmt="03/20/2013 12:12:05.817" pid="1997" msg="Keep completed segments in history = false"/>
         <mrk tms="385495953.686" tmt="03/20/2013 12:12:33.686" pid="1997" kind="begin" what="CJobControllerService::publishClusterStorage"></mrk>
         <log tms="385495953.687" tmt="03/20/2013 12:12:33.687" pid="1997" msg="Cluster storage URL = file2nfs://localhost/Cluster%20Storage/54170E4A-5AEE9756/shared/"/>
         <log tms="385495953.687" tmt="03/20/2013 12:12:33.687" pid="1997" msg="Publishing shared storage."/>
         <log tms="385495953.698" tmt="03/20/2013 12:12:33.698" pid="1997" msg="Subscribing to shared storage, local path = /var/spool/qmaster/54170E4A-5AEE9756/shared"/>
         <log tms="385495954.692" tmt="03/20/2013 12:12:34.692" pid="1997" msg="Result cluster storage URL = nfs://Tim-Bergmanns-iMac.local/Cluster%20Storage/54170E4A-5AEE9756/shared"/>
         <mrk tms="385495954.692" tmt="03/20/2013 12:12:34.692" pid="1997" kind="end" what="CJobControllerService::publishClusterStorage"></mrk>
         <log tms="385496028.510" tmt="03/20/2013 12:13:48.510" pid="1997" msg="Service E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C sessionID mismatch (6C2F81C0-62AD-49C5-A255-13F26B4F5A22 != EB8B2DD5-1D53-4977-B54F-CC13B5C3FBE5) - did the service go down?"/>
         <log tms="385496028.511" tmt="03/20/2013 12:13:48.511" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C for 4 sec. and connecting failed - the service is down. %3Cad id=%22E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C%22 unmg=%220%22 suid=%22-1%22 host=%22Render-1.local%22 session=%22EB8B2DD5-1D53-4977-B54F-CC13B5C3FBE5%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%201%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.3:62071%22%3E%3C/ad%3E"/>
         <log tms="385496028.526" tmt="03/20/2013 12:13:48.526" pid="1997" msg="Service AE991F65-4E48-4DC7-807A-5B5CACCE9FEF sessionID mismatch (3240C3A7-9CAD-4389-9096-EAB3F1C6683C != CCD764A6-5236-4CA9-9D86-2CDC308E4B37) - did the service go down?"/>
         <log tms="385496028.527" tmt="03/20/2013 12:13:48.527" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service AE991F65-4E48-4DC7-807A-5B5CACCE9FEF for 4 sec. and connecting failed - the service is down. %3Cad id=%22AE991F65-4E48-4DC7-807A-5B5CACCE9FEF%22 unmg=%220%22 suid=%22-1%22 host=%22Render-2.local%22 session=%22CCD764A6-5236-4CA9-9D86-2CDC308E4B37%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%202%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.4:64334%22%3E%3C/ad%3E"/>
         <log tms="385496028.532" tmt="03/20/2013 12:13:48.532" pid="1997" msg="Not releasing service E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C, sessionID has changed (6C2F81C0-62AD-49C5-A255-13F26B4F5A22 != EB8B2DD5-1D53-4977-B54F-CC13B5C3FBE5)"/>
         <log tms="385496028.533" tmt="03/20/2013 12:13:48.533" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:02:58;04 to 01:05:56;09, host = Render-1.local, exception = service down, fail count = 1. There are 2 hosts which haven't failed this request yet."/>
         <log tms="385496028.537" tmt="03/20/2013 12:13:48.537" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496028.565" tmt="03/20/2013 12:13:48.565" pid="1997" msg="Service B55BC0BF-7DC1-4131-B0A7-D958643FE8AB sessionID mismatch (69B6CF92-6F19-4A19-9F6E-A07F51C1DC10 != E9582F6E-70BE-45CF-8D1E-440DA722C755) - did the service go down?"/>
         <log tms="385496028.565" tmt="03/20/2013 12:13:48.565" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service B55BC0BF-7DC1-4131-B0A7-D958643FE8AB for 4 sec. and connecting failed - the service is down. %3Cad id=%22B55BC0BF-7DC1-4131-B0A7-D958643FE8AB%22 unmg=%220%22 suid=%22-1%22 host=%22Render-1.local%22 session=%22E9582F6E-70BE-45CF-8D1E-440DA722C755%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%201%202%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.3:62070%22%3E%3C/ad%3E"/>
         <log tms="385496028.598" tmt="03/20/2013 12:13:48.598" pid="1997" msg="Service FEBA9794-9834-4499-97E4-672D979ABF8A sessionID mismatch (9A9651EB-AD73-4B0B-B84A-1ACA7B75B1F9 != 74CFF689-E81F-4725-9B96-615F1C29A006) - did the service go down?"/>
         <log tms="385496028.611" tmt="03/20/2013 12:13:48.611" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service FEBA9794-9834-4499-97E4-672D979ABF8A for 4 sec. and connecting failed - the service is down. %3Cad id=%22FEBA9794-9834-4499-97E4-672D979ABF8A%22 unmg=%220%22 suid=%22-1%22 host=%22Render-2.local%22 session=%2274CFF689-E81F-4725-9B96-615F1C29A006%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%202%202%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.4:64335%22%3E%3C/ad%3E"/>
         <log tms="385496028.613" tmt="03/20/2013 12:13:48.613" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:00:00;00 to 01:02:58;03, host = Render-2.local, exception = service down, fail count = 1. There are 2 hosts which haven't failed this request yet."/>
         <log tms="385496028.616" tmt="03/20/2013 12:13:48.616" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496028.771" tmt="03/20/2013 12:13:48.771" pid="1997" msg="Not releasing service B55BC0BF-7DC1-4131-B0A7-D958643FE8AB, sessionID has changed (69B6CF92-6F19-4A19-9F6E-A07F51C1DC10 != E9582F6E-70BE-45CF-8D1E-440DA722C755)"/>
         <log tms="385496028.772" tmt="03/20/2013 12:13:48.772" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:05:56;10 to 01:08:54;15, host = Render-1.local, exception = service down, fail count = 1. There are 2 hosts which haven't failed this request yet."/>
         <log tms="385496028.776" tmt="03/20/2013 12:13:48.776" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496028.811" tmt="03/20/2013 12:13:48.811" pid="1997" msg="Not releasing service FEBA9794-9834-4499-97E4-672D979ABF8A, sessionID has changed (9A9651EB-AD73-4B0B-B84A-1ACA7B75B1F9 != 74CFF689-E81F-4725-9B96-615F1C29A006)"/>
         <log tms="385496028.812" tmt="03/20/2013 12:13:48.812" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:08:54;16 to 01:11:52;19, host = Render-2.local, exception = service down, fail count = 1. There are 2 hosts which haven't failed this request yet."/>
         <log tms="385496028.815" tmt="03/20/2013 12:13:48.815" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496032.900" tmt="03/20/2013 12:13:52.900" pid="1997" msg="Service B55BC0BF-7DC1-4131-B0A7-D958643FE8AB sessionID mismatch (E9582F6E-70BE-45CF-8D1E-440DA722C755 != 3A68B052-3EF7-4975-A7FC-9F2530112724) - did the service go down?"/>
         <log tms="385496032.901" tmt="03/20/2013 12:13:52.901" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service B55BC0BF-7DC1-4131-B0A7-D958643FE8AB for 4 sec. and connecting failed - the service is down. %3Cad id=%22B55BC0BF-7DC1-4131-B0A7-D958643FE8AB%22 unmg=%220%22 suid=%22-1%22 host=%22Render-1.local%22 session=%223A68B052-3EF7-4975-A7FC-9F2530112724%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%201%202%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.3:62086%22%3E%3C/ad%3E"/>
         <log tms="385496032.907" tmt="03/20/2013 12:13:52.907" pid="1997" msg="Not releasing service B55BC0BF-7DC1-4131-B0A7-D958643FE8AB, sessionID has changed (E9582F6E-70BE-45CF-8D1E-440DA722C755 != 3A68B052-3EF7-4975-A7FC-9F2530112724)"/>
         <log tms="385496032.908" tmt="03/20/2013 12:13:52.908" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:17:49;02 to 01:20:47;05, host = Render-1.local, exception = service down, fail count = 1. There are 1 hosts which haven't failed this request yet."/>
         <log tms="385496032.911" tmt="03/20/2013 12:13:52.911" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496032.920" tmt="03/20/2013 12:13:52.920" pid="1997" msg="Service E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C sessionID mismatch (EB8B2DD5-1D53-4977-B54F-CC13B5C3FBE5 != 9D7BDC2F-6A77-4B7E-B053-A176D3BDD7BB) - did the service go down?"/>
         <log tms="385496032.920" tmt="03/20/2013 12:13:52.920" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C for 4 sec. and connecting failed - the service is down. %3Cad id=%22E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C%22 unmg=%220%22 suid=%22-1%22 host=%22Render-1.local%22 session=%229D7BDC2F-6A77-4B7E-B053-A176D3BDD7BB%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%201%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.3:62087%22%3E%3C/ad%3E"/>
         <log tms="385496032.941" tmt="03/20/2013 12:13:52.941" pid="1997" msg="Service FEBA9794-9834-4499-97E4-672D979ABF8A sessionID mismatch (74CFF689-E81F-4725-9B96-615F1C29A006 != 5E121676-96B6-44B7-9338-1FEFCFCC60C7) - did the service go down?"/>
         <log tms="385496032.941" tmt="03/20/2013 12:13:52.941" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service FEBA9794-9834-4499-97E4-672D979ABF8A for 4 sec. and connecting failed - the service is down. %3Cad id=%22FEBA9794-9834-4499-97E4-672D979ABF8A%22 unmg=%220%22 suid=%22-1%22 host=%22Render-2.local%22 session=%225E121676-96B6-44B7-9338-1FEFCFCC60C7%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%202%202%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.4:64345%22%3E%3C/ad%3E"/>
         <log tms="385496032.980" tmt="03/20/2013 12:13:52.980" pid="1997" msg="Service AE991F65-4E48-4DC7-807A-5B5CACCE9FEF sessionID mismatch (CCD764A6-5236-4CA9-9D86-2CDC308E4B37 != 325251D4-DFD5-4726-8481-6FE854BDF331) - did the service go down?"/>
         <log tms="385496032.980" tmt="03/20/2013 12:13:52.980" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service AE991F65-4E48-4DC7-807A-5B5CACCE9FEF for 4 sec. and connecting failed - the service is down. %3Cad id=%22AE991F65-4E48-4DC7-807A-5B5CACCE9FEF%22 unmg=%220%22 suid=%22-1%22 host=%22Render-2.local%22 session=%22325251D4-DFD5-4726-8481-6FE854BDF331%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%202%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.4:64344%22%3E%3C/ad%3E"/>
         <log tms="385496033.132" tmt="03/20/2013 12:13:53.132" pid="1997" msg="Not releasing service E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C, sessionID has changed (EB8B2DD5-1D53-4977-B54F-CC13B5C3FBE5 != 9D7BDC2F-6A77-4B7E-B053-A176D3BDD7BB)"/>
         <log tms="385496033.133" tmt="03/20/2013 12:13:53.133" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:14:50;26 to 01:17:49;01, host = Render-1.local, exception = service down, fail count = 1. There are 2 hosts which haven't failed this request yet."/>
         <log tms="385496033.156" tmt="03/20/2013 12:13:53.156" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496033.164" tmt="03/20/2013 12:13:53.164" pid="1997" msg="Not releasing service FEBA9794-9834-4499-97E4-672D979ABF8A, sessionID has changed (74CFF689-E81F-4725-9B96-615F1C29A006 != 5E121676-96B6-44B7-9338-1FEFCFCC60C7)"/>
         <log tms="385496033.165" tmt="03/20/2013 12:13:53.165" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:20:47;06 to 01:23:45;11, host = Render-2.local, exception = service down, fail count = 1. There are 2 hosts which haven't failed this request yet."/>
         <log tms="385496033.168" tmt="03/20/2013 12:13:53.168" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496033.171" tmt="03/20/2013 12:13:53.171" pid="1997" msg="Not releasing service AE991F65-4E48-4DC7-807A-5B5CACCE9FEF, sessionID has changed (CCD764A6-5236-4CA9-9D86-2CDC308E4B37 != 325251D4-DFD5-4726-8481-6FE854BDF331)"/>
         <log tms="385496033.172" tmt="03/20/2013 12:13:53.172" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:02:58;04 to 01:05:56;09, host = Render-2.local, exception = service down, fail count = 2. There are 1 hosts which haven't failed this request yet."/>
         <log tms="385496033.210" tmt="03/20/2013 12:13:53.210" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496037.193" tmt="03/20/2013 12:13:57.193" pid="1997" msg="Service E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C sessionID mismatch (9D7BDC2F-6A77-4B7E-B053-A176D3BDD7BB != D00018FA-C018-4D96-8079-4789501B97BF) - did the service go down?"/>
         <log tms="385496037.193" tmt="03/20/2013 12:13:57.193" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C for 4 sec. and connecting failed - the service is down. %3Cad id=%22E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C%22 unmg=%220%22 suid=%22-1%22 host=%22Render-1.local%22 session=%22D00018FA-C018-4D96-8079-4789501B97BF%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%201%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.3:62101%22%3E%3C/ad%3E"/>
         <log tms="385496037.203" tmt="03/20/2013 12:13:57.203" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:08:54;16 to 01:11:52;19, host = Render-1.local, exception = service down, fail count = 2. There are 1 hosts which haven't failed this request yet."/>
         <log tms="385496037.207" tmt="03/20/2013 12:13:57.207" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496037.215" tmt="03/20/2013 12:13:57.215" pid="1997" msg="Service B55BC0BF-7DC1-4131-B0A7-D958643FE8AB sessionID mismatch (3A68B052-3EF7-4975-A7FC-9F2530112724 != FD78A150-CEF8-4CAF-9D26-1A8F195C41B4) - did the service go down?"/>
         <log tms="385496037.215" tmt="03/20/2013 12:13:57.215" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service B55BC0BF-7DC1-4131-B0A7-D958643FE8AB for 5 sec. and connecting failed - the service is down. %3Cad id=%22B55BC0BF-7DC1-4131-B0A7-D958643FE8AB%22 unmg=%220%22 suid=%22-1%22 host=%22Render-1.local%22 session=%22FD78A150-CEF8-4CAF-9D26-1A8F195C41B4%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%201%202%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.3:62100%22%3E%3C/ad%3E"/>
         <log tms="385496037.232" tmt="03/20/2013 12:13:57.232" pid="1997" msg="Service AE991F65-4E48-4DC7-807A-5B5CACCE9FEF sessionID mismatch (325251D4-DFD5-4726-8481-6FE854BDF331 != 69C39B21-D68F-4FAE-B604-BF260DF77EA4) - did the service go down?"/>
         <log tms="385496037.233" tmt="03/20/2013 12:13:57.233" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service AE991F65-4E48-4DC7-807A-5B5CACCE9FEF for 4 sec. and connecting failed - the service is down. %3Cad id=%22AE991F65-4E48-4DC7-807A-5B5CACCE9FEF%22 unmg=%220%22 suid=%22-1%22 host=%22Render-2.local%22 session=%2269C39B21-D68F-4FAE-B604-BF260DF77EA4%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%202%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.4:64356%22%3E%3C/ad%3E"/>
         <log tms="385496037.262" tmt="03/20/2013 12:13:57.262" pid="1997" msg="Service FEBA9794-9834-4499-97E4-672D979ABF8A sessionID mismatch (5E121676-96B6-44B7-9338-1FEFCFCC60C7 != BE768F9B-EC6E-49B5-AA47-7CDA25AE5AE0) - did the service go down?"/>
         <log tms="385496037.263" tmt="03/20/2013 12:13:57.263" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service FEBA9794-9834-4499-97E4-672D979ABF8A for 5 sec. and connecting failed - the service is down. %3Cad id=%22FEBA9794-9834-4499-97E4-672D979ABF8A%22 unmg=%220%22 suid=%22-1%22 host=%22Render-2.local%22 session=%22BE768F9B-EC6E-49B5-AA47-7CDA25AE5AE0%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%202%202%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.4:64357%22%3E%3C/ad%3E"/>
         <log tms="385496037.426" tmt="03/20/2013 12:13:57.426" pid="1997" msg="Not releasing service B55BC0BF-7DC1-4131-B0A7-D958643FE8AB, sessionID has changed (3A68B052-3EF7-4975-A7FC-9F2530112724 != FD78A150-CEF8-4CAF-9D26-1A8F195C41B4)"/>
         <log tms="385496037.427" tmt="03/20/2013 12:13:57.427" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:00:00;00 to 01:02:58;03, host = Render-1.local, exception = service down, fail count = 2. There are 1 hosts which haven't failed this request yet."/>
         <log tms="385496037.430" tmt="03/20/2013 12:13:57.430" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496037.494" tmt="03/20/2013 12:13:57.494" pid="1997" msg="Not releasing service AE991F65-4E48-4DC7-807A-5B5CACCE9FEF, sessionID has changed (325251D4-DFD5-4726-8481-6FE854BDF331 != 69C39B21-D68F-4FAE-B604-BF260DF77EA4)"/>
         <log tms="385496037.495" tmt="03/20/2013 12:13:57.495" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:23:45;12 to 01:26:43;17, host = Render-2.local, exception = service down, fail count = 1. There are 2 hosts which haven't failed this request yet."/>
         <log tms="385496037.498" tmt="03/20/2013 12:13:57.498" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496037.499" tmt="03/20/2013 12:13:57.499" pid="1997" msg="Not releasing service FEBA9794-9834-4499-97E4-672D979ABF8A, sessionID has changed (5E121676-96B6-44B7-9338-1FEFCFCC60C7 != BE768F9B-EC6E-49B5-AA47-7CDA25AE5AE0)"/>
         <log tms="385496037.500" tmt="03/20/2013 12:13:57.500" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:05:56;10 to 01:08:54;15, host = Render-2.local, exception = service down, fail count = 2. There are 1 hosts which haven't failed this request yet."/>
         <log tms="385496037.503" tmt="03/20/2013 12:13:57.503" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496041.515" tmt="03/20/2013 12:14:01.515" pid="1997" msg="Service E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C sessionID mismatch (D00018FA-C018-4D96-8079-4789501B97BF != A0030092-5C6E-46AF-8737-05654083AB51) - did the service go down?"/>
         <log tms="385496041.515" tmt="03/20/2013 12:14:01.515" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C for 4 sec. and connecting failed - the service is down. %3Cad id=%22E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C%22 unmg=%220%22 suid=%22-1%22 host=%22Render-1.local%22 session=%22A0030092-5C6E-46AF-8737-05654083AB51%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%201%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.3:62112%22%3E%3C/ad%3E"/>
         <log tms="385496041.523" tmt="03/20/2013 12:14:01.523" pid="1997" msg="Not releasing service E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C, sessionID has changed (D00018FA-C018-4D96-8079-4789501B97BF != A0030092-5C6E-46AF-8737-05654083AB51)"/>
         <log tms="385496041.523" tmt="03/20/2013 12:14:01.523" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:20:47;06 to 01:23:45;11, host = Render-1.local, exception = service down, fail count = 2. There are 1 hosts which haven't failed this request yet."/>
         <log tms="385496041.530" tmt="03/20/2013 12:14:01.530" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496041.536" tmt="03/20/2013 12:14:01.536" pid="1997" msg="Service B55BC0BF-7DC1-4131-B0A7-D958643FE8AB sessionID mismatch (FD78A150-CEF8-4CAF-9D26-1A8F195C41B4 != DBA9799B-76A0-492B-A3B8-CDC083A06E80) - did the service go down?"/>
         <log tms="385496041.537" tmt="03/20/2013 12:14:01.537" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service B55BC0BF-7DC1-4131-B0A7-D958643FE8AB for 4 sec. and connecting failed - the service is down. %3Cad id=%22B55BC0BF-7DC1-4131-B0A7-D958643FE8AB%22 unmg=%220%22 suid=%22-1%22 host=%22Render-1.local%22 session=%22DBA9799B-76A0-492B-A3B8-CDC083A06E80%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%201%202%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.3:62113%22%3E%3C/ad%3E"/>
         <log tms="385496041.641" tmt="03/20/2013 12:14:01.641" pid="1997" msg="Service AE991F65-4E48-4DC7-807A-5B5CACCE9FEF sessionID mismatch (69C39B21-D68F-4FAE-B604-BF260DF77EA4 != B67ED0BE-3332-49F8-8BE0-DCC902A7565F) - did the service go down?"/>
         <log tms="385496041.641" tmt="03/20/2013 12:14:01.641" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service AE991F65-4E48-4DC7-807A-5B5CACCE9FEF for 4 sec. and connecting failed - the service is down. %3Cad id=%22AE991F65-4E48-4DC7-807A-5B5CACCE9FEF%22 unmg=%220%22 suid=%22-1%22 host=%22Render-2.local%22 session=%22B67ED0BE-3332-49F8-8BE0-DCC902A7565F%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%202%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.4:64369%22%3E%3C/ad%3E"/>
         <log tms="385496041.672" tmt="03/20/2013 12:14:01.672" pid="1997" msg="Service FEBA9794-9834-4499-97E4-672D979ABF8A sessionID mismatch (BE768F9B-EC6E-49B5-AA47-7CDA25AE5AE0 != 41626CB7-EF93-46D3-B602-A26B1A24E513) - did the service go down?"/>
         <log tms="385496041.673" tmt="03/20/2013 12:14:01.673" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service FEBA9794-9834-4499-97E4-672D979ABF8A for 4 sec. and connecting failed - the service is down. %3Cad id=%22FEBA9794-9834-4499-97E4-672D979ABF8A%22 unmg=%220%22 suid=%22-1%22 host=%22Render-2.local%22 session=%2241626CB7-EF93-46D3-B602-A26B1A24E513%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%202%202%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.4:64368%22%3E%3C/ad%3E"/>
         <log tms="385496041.706" tmt="03/20/2013 12:14:01.706" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:02:58;04 to 01:05:56;09, host = Render-1.local, exception = service down, fail count = 3. There are 1 hosts which haven't failed this request yet."/>
         <log tms="385496041.709" tmt="03/20/2013 12:14:01.709" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496041.710" tmt="03/20/2013 12:14:01.710" pid="1997" msg="Not releasing service AE991F65-4E48-4DC7-807A-5B5CACCE9FEF, sessionID has changed (69C39B21-D68F-4FAE-B604-BF260DF77EA4 != B67ED0BE-3332-49F8-8BE0-DCC902A7565F)"/>
         <log tms="385496041.711" tmt="03/20/2013 12:14:01.711" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:14:50;26 to 01:17:49;01, host = Render-2.local, exception = service down, fail count = 2. There are 1 hosts which haven't failed this request yet."/>
         <log tms="385496041.717" tmt="03/20/2013 12:14:01.717" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496041.718" tmt="03/20/2013 12:14:01.718" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:17:49;02 to 01:20:47;05, host = Render-2.local, exception = service down, fail count = 2. There are 1 hosts which haven't failed this request yet."/>
         <log tms="385496041.722" tmt="03/20/2013 12:14:01.722" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496045.606" tmt="03/20/2013 12:14:05.606" pid="1997" msg="Service E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C sessionID mismatch (A0030092-5C6E-46AF-8737-05654083AB51 != FF52830A-6269-4E3C-ACC4-68035A29387E) - did the service go down?"/>
         <log tms="385496045.607" tmt="03/20/2013 12:14:05.607" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C for 4 sec. and connecting failed - the service is down. %3Cad id=%22E44EC2F0-236B-4CCC-BFF1-87D4A53DEE9C%22 unmg=%220%22 suid=%22-1%22 host=%22Render-1.local%22 session=%22FF52830A-6269-4E3C-ACC4-68035A29387E%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%201%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.3:62120%22%3E%3C/ad%3E"/>
         <log tms="385496045.626" tmt="03/20/2013 12:14:05.626" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:00:00;00 to 01:02:58;03, host = Render-1.local, exception = service down, fail count = 3. There are 1 hosts which haven't failed this request yet."/>
         <log tms="385496045.630" tmt="03/20/2013 12:14:05.630" pid="1997" msg="Rescheduling the failed request."/>
         <log tms="385496045.642" tmt="03/20/2013 12:14:05.642" pid="1997" msg="Service AE991F65-4E48-4DC7-807A-5B5CACCE9FEF sessionID mismatch (B67ED0BE-3332-49F8-8BE0-DCC902A7565F != C52501DD-9692-4244-BB38-091F5BB1A722) - did the service go down?"/>
         <log tms="385496045.645" tmt="03/20/2013 12:14:05.645" pid="1997" msg="CJobControllerServer::tickleService: we haven't heard from service AE991F65-4E48-4DC7-807A-5B5CACCE9FEF for 4 sec. and connecting failed - the service is down. %3Cad id=%22AE991F65-4E48-4DC7-807A-5B5CACCE9FEF%22 unmg=%220%22 suid=%22-1%22 host=%22Render-2.local%22 session=%22C52501DD-9692-4244-BB38-091F5BB1A722%22 hostPerfScore=%2213.8%22 ver=%222.2%22 kind=%22servicecontroller:com.apple.stomp.transcoder%22 name=%22Render%202%22 desc=%22Compressor%22 scope=%223%22 status=%224%22 url=%22tcp://10.1.1.4:64374%22%3E%3C/ad%3E"/>
         <log tms="385496045.774" tmt="03/20/2013 12:14:05.774" pid="1997" lvl="2" msg="Handling exception: batch = Untitled, job = Brokenness Aside Video, target = Brokenness Aside Video-MPEG-4 .mp4, segment = Video: 01:05:56;10 to 01:08:54;15, host = Render-2.local, exception = service down, fail count = 3. There are 1 hosts which haven't failed this request yet."/>
         <log tms="385496045.782" tmt="03/20/2013 12:14:05.782" pid="1997" msg="Rescheduling the failed request."/>
      </logs>
   </service>
   <service type="requestprocessor:com.apple.qmaster.contentcontroller" displayName="Tim Bergmann’s iMac" address="tcp://10.1.1.2:52374" hostName="Tim-Bergmanns-iMac.local">
      <logs tms="385495954.758" tmt="03/20/2013 12:12:34.758" pnm="ContentController">
         <mrk tms="385495954.760" tmt="03/20/2013 12:12:34.760" pid="2005" kind="begin" what="log-session"/>
         <log tms="385495954.764" tmt="03/20/2013 12:12:34.764" pid="2005" msg="Starting up"/>
         <mrk tms="385496003.935" tmt="03/20/2013 12:13:23.935" pid="2005" kind="begin" what="service-request" req-id="95659D0D-E473-42FC-880F-E233593BCCFA:1" msg="Preprocessing job."></mrk>
         <mrk tms="385496008.955" tmt="03/20/2013 12:13:28.955" pid="2005" kind="end" what="service-request" req-id="95659D0D-E473-42FC-880F-E233593BCCFA:1" msg="Preprocessing job request end."></mrk>
         <mrk tms="385496009.007" tmt="03/20/2013 12:13:29.007" pid="2005" kind="begin" what="service-request" req-id="EB965BDE-0B54-4B45-BA50-547CE9744E87:1" msg="Preprocessing."></mrk>
         <mrk tms="385496014.010" tmt="03/20/2013 12:13:34.010" pid="2005" kind="end" what="service-request" req-id="EB965BDE-0B54-4B45-BA50-547CE9744E87:1" msg="Preprocessing service request end."></mrk>
      </logs>
   </service>
   <service type="servicecontroller:com.apple.stomp.transcoder" displayName="Render 1 2" address="tcp://10.1.1.3:62113" hostName="Render-1.local">
      <logs tms="385497699.989" tmt="03/20/2013 12:41:39.989" pnm="CompressorTranscoderX">
         <log tms="385497699.989" tmt="03/20/2013 12:41:39.989" pid="4897" kind="mrk" sub="error" what="get-log" avail="false" msg="Not logging to file."/>
      </logs>
   </service>
   <service type="servicecontroller:com.apple.stomp.transcoder" displayName="Render 2 2" address="tcp://10.1.1.4:64368" hostName="Render-2.local">
      <logs tms="385497699.975" tmt="03/20/2013 12:41:39.975" pnm="CompressorTranscoderX">
         <log tms="385497699.975" tmt="03/20/2013 12:41:39.975" pid="4637" kind="mrk" sub="error" what="get-log" avail="false" msg="Not logging to file."/>
      </logs>
   </service>
   <service type="servicecontroller:com.apple.stomp.transcoder" displayName="Render 1" address="tcp://10.1.1.3:62120" hostName="Render-1.local">
      <logs tms="385497700.008" tmt="03/20/2013 12:41:40.008" pnm="CompressorTranscoderX">
         <log tms="385497700.008" tmt="03/20/2013 12:41:40.008" pid="4910" kind="mrk" sub="error" what="get-log" avail="false" msg="Not logging to file."/>
      </logs>
   </service>
   <service type="servicecontroller:com.apple.stomp.transcoder" displayName="Render 2" address="tcp://10.1.1.4:64374" hostName="Render-2.local">
      <logs tms="385497699.992" tmt="03/20/2013 12:41:39.992" pnm="CompressorTranscoderX">
         <log tms="385497699.992" tmt="03/20/2013 12:41:39.992" pid="4650" kind="mrk" sub="error" what="get-log" avail="false" msg="Not logging to file."/>
      </logs>
   </service>
   <service type="servicecontroller:com.apple.stomp.transcoder" displayName="Tim Bergmann’s iMac" address="tcp://10.1.1.2:52251" hostName="Tim-Bergmanns-iMac.local">
      <logs tms="385495925.821" tmt="03/20/2013 12:12:05.821" pnm="CompressorTranscoderX">
         <mrk tms="385495925.823" tmt="03/20/2013 12:12:05.823" pid="1996" kind="begin" what="log-session"/>
         <log tms="385495925.827" tmt="03/20/2013 12:12:05.827" pid="1996" msg="Starting up"/>
         <mrk tms="385496024.462" tmt="03/20/2013 12:13:44.462" pid="1996" kind="begin" what="service-request" req-id="12BFC5E9-94E1-459C-BE2B-6FAD6D10BF4D:1" msg="Processing."></mrk>
         <mrk tms="385496974.947" tmt="03/20/2013 12:29:34.947" pid="1996" kind="end" what="service-request" req-id="12BFC5E9-94E1-459C-BE2B-6FAD6D10BF4D:1" msg="Processing service request end."></mrk>
         <mrk tms="385496974.981" tmt="03/20/2013 12:29:34.981" pid="1996" kind="begin" what="service-request" req-id="562D097C-96A0-4045-8578-CC81CC852CB5:7" msg="Processing."></mrk>
      </logs>
   </service>
</services>
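
For a log dump like the one above, a short script can tally which hosts keep failing and how many times each segment has been retried. This is only a minimal sketch: it assumes the "Handling exception" messages always have the comma-separated `segment = ..., host = ..., exception = ..., fail count = N` layout seen in this excerpt, and the function name `tally_failures` is made up for illustration.

```python
import re
from collections import Counter

# Matches the "Handling exception" messages as they appear in the log excerpt.
# Assumption: the field order and separators never vary.
PATTERN = re.compile(
    r"segment = (?P<segment>Video: [^,]+), host = (?P<host>[^,]+), "
    r"exception = (?P<exc>[^,]+), fail count = (?P<count>\d+)"
)

def tally_failures(log_text):
    """Return (failures per host, highest fail count seen per segment)."""
    per_host = Counter()
    worst = {}
    for m in PATTERN.finditer(log_text):
        per_host[m["host"]] += 1
        seg = m["segment"]
        worst[seg] = max(worst.get(seg, 0), int(m["count"]))
    return per_host, worst

# Three messages shortened from the excerpt above.
sample = """
msg="Handling exception: segment = Video: 01:00:00;00 to 01:02:58;03, host = Render-1.local, exception = service down, fail count = 2."
msg="Handling exception: segment = Video: 01:23:45;12 to 01:26:43;17, host = Render-2.local, exception = service down, fail count = 1."
msg="Handling exception: segment = Video: 01:00:00;00 to 01:02:58;03, host = Render-1.local, exception = service down, fail count = 3."
"""

per_host, worst = tally_failures(sample)
for host, n in per_host.most_common():
    print(host, n)
```

Feeding it the whole log file instead of `sample` gives a quick view of whether one render node is responsible for most of the "service down" exceptions.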

Does the iOS device connect to other networks?
Does the iOS device see the network?
Any error messages?
Do other devices now connect?
Did the iOS device connect before?
Try the following to rule out a software problem:
- Reset the iOS device. Nothing will be lost
Reset iOS device: Hold down the On/Off button and the Home button at the same time for at
least ten seconds, until the Apple logo appears.
- Power off and then back on your router
- Reset network settings: Settings > General > Reset > Reset Network Settings
- iOS: Troubleshooting Wi-Fi networks and connections
- Wi-Fi: Unable to connect to an 802.11n Wi-Fi network
- iOS: Recommended settings for Wi-Fi routers and access points
- Restore from backup. See:
iOS: How to back up
- Restore to factory settings/new iOS device.
If the problem persists and the device does not connect to any network, make an appointment at the Genius Bar of an Apple store, since it appears you have a hardware problem.
Apple Retail Store - Genius Bar

Why would anyone want to use ASM Clustered File system?

DB Version: 11gR2
OS : Solaris, AIX, HP-UX
I've read about the new feature ACFS.
http://www.oracle-base.com/articles/11g/ACFS_11gR2.php
But why would anyone want to store database binaries in a separate Filesystem created by Oracle?

Hi Vitamind,
how do these binaries interact with the CPU when they want something to be done?
ACFS should work with Local OS (Solaris) to communicate with the CPU. Isn't this kind of double work?
ACFS doesn't work with .... but provides a filesystem to the local OS.
There may be extra work, but that's because it offers more resources than a common filesystem.
Oracle ACFS executes on operating system platforms as a native file system technology supporting native operating system file system application programming interfaces (APIs).
ACFS is a general purpose POSIX compliant cluster file system. Being POSIX compliant, all operating system utilities we use with ext3 and other file systems can also be used with Oracle ACFS given it belongs to the same family of related standards.
ACFS Driver Model
An Oracle ACFS file system is installed as a dynamically loadable vendor operating system (OS) file system driver and tool set that is developed for each supported operating system platform. The driver is implemented as a Virtual File System (VFS) and processes all file and directory operations directed to a specific file system.
It makes sense to use ACFS if you use some of the features below:
• Oracle RAC / RAC ONE NODE
• Oracle ACFS Snapshots
• Oracle ASM Dynamic Volume Manager
• Cluster Filesystem for regular files
ACFS Use Cases
• Shared Oracle DB home
• Other “file system” data
• External tables, data loads, data extracts
• BFILES and other data customer chooses not to store in db
• Log files (consolidates access)
• Test environments
• Copy back a previous snapshot after testing
• Backups
• Snapshot file system for point-in-time backups
• General purpose local or cluster file system
• Leverage ASM manageability
Note : Oracle ACFS file systems cannot be used for an Oracle base directory or an Oracle grid infrastructure home that contains the software for Oracle Clusterware, Oracle ASM, Oracle ACFS, and Oracle ADVM components.
Regards,
Levi Pereira

TDMS Icon and clusters... why not?

I want to use a TDMS file, but I can't wire a cluster to the TDMS icon. See this figure to understand:
I can use an array, I can use Merge Signals, but not a cluster, even if it is a cluster of numbers. Why?
What is the best way to use the TDMS icon? What type of wire is better to use?
Is it also possible to "unbundle" an array? Can you show me how?

If you want to use a cluster to store settings as properties for a file/group/channel, you can use the following code (here I store an error cluster):
It uses code from OpenG toolkit.
Ton

QMASTER hints for usual trouble (QM NOT running/CLUSTERED nodes/Networks etc.)

All, I just posted this with some hints and workarounds for the very common issues that people on this forum have, and keep asking about, concerning the use of APPLE QMASTER with FCP, SHAKE, COMPRESSOR and MOTION. I've had many of them over the last 2 years and see them coming up frequently.
Perhaps these symptoms are fixed in FCS2 as at MAY 2007 (now). However, if not, here are some rules of thumb (ROTs) that I used for FCP to Compressor via a QMASTER cluster, for example. They are in NO special order, but might help someone get around the stuff with QMASTER V2.3, FCP V5.1.4, compressor.app V2.3.
I saw the latest QMASTER UI and usage at NAB2007 and it looked a little more solid, with some "EASY SETUP" stuff. I hope it has been reworked underneath. I guess I will know soon if it has.
For most FCP, SHAKE, MOTION and COMPRESSOR use:
• provide access from ALL nodes to ALL the source and target objects (files) on their VOLUMES. Simply MOUNT those volumes through the APPLE file system (via NFS) using cmd+k or Finder/Go/Connect to Server, OR use an SSAFS such as XSAN™, where the file systems are all shared over FC, not the network. You will notice the CPUs going very busy for a short while. This is the APPLE FILE SYSTEM task; I guess it's doing "spotlight stuff". This goes away after a few minutes.
• set the COMPRESSOR preferences for "CLUSTER OPTIONS" to "Never copy source to Cluster". This means that all nodes can access your source and target objects (files) over NFS (as above). Failure to do this means LENGTHY times to COPY material back and forth, in some cases undermining the pleasure gained from initially using clustering (reduced job times).
• DON'T mix the PHYSICAL or LOGICAL networks in your local cluster. I don't know why, but I could never get this to work. Physical means stick with either ETHERNET or FIREWIRE or your other (AirPort etc., which will generally be way too slow and useless). Logical means keeping all nodes on the SAME subnet. You can do this simply by setting it up in System Preferences/QMASTER/Advanced tab under "Use Network Interfaces". In my current QUAD I set this to use BUILT IN ETHERNET1, and on the MacBook Pros I set this to their BUILT-IN ETHERNET.
• LOGICAL NETWORKS (Subnet): simply HARDCODE an IP address on the ETHERNET (for example) for your cluster nodes and the service controller. For example 3.1.1.x .... it will all connect fine.
• Physical Networks: As above, (1) DON'T MIX FireWire (IPoFW) and Ethernet (IPoE). (2) If you have more than one extra service node, USE A HUB or SWITCH. I went and bought a 10 port GbE HUB for about $HK400 (€40) and it worked fine. I was NEVER able to get a stable QMASTER system mixing FW and ETHERNET. (3) FWIW, using IP over FW caused me a LOAD of DISK errors and timeouts (I/O errors) on those DISKs that were FW400 (all gone now), which showed this was not stable overall.
• for the cluster controller node, MAKE SURE the CLUSTER STORAGE (System Preferences/QMASTER/shared cluster storage) for the CLUSTER CONTROLLER NODE IS ON A SHARED volume (see above). This seems essential for SHAKE to work (if not, check the Qmaster errors in the console.app [see below]). IF you have an SSAFS like XSAN™ then just add this cluster storage on a shared file path. Note that QMASTER does not permit the cluster storage to be on a NETWORK NODE for some reason. So in short, just MOUNT the volume where the SHARED CLUSTER file is maintained for the CLUSTER controller.
• FCP - avoid EXPORT to COMPRESSOR from the TIMELINE - it never seems to work properly (see later). Instead EXPORT FROM SEQUENCE in the BROWSER - consistent results
• FCP - "media missing" messages on EXPORT to COMPRESSOR: this seems to be a defect in FCP 5.1 when you EXPORT using a sequence that is NOT in the "root" or primary tree in the FCP PROJECT BROWSER. Simply put, if you have browser/Bin A (contains Bin B (contains Bin C (contains sequence X))), "EXPORT TO COMPRESSOR" will FAIL (won't work) if you use it in an FCP browser PANE that is separately OPEN. To get around this, simply OPEN/EXPOSE the triangles/trees in the BROWSER PANE for the PROJECT, select the SEQUENCE you want and "EXPORT to COMPRESSOR" from there. This has been documented in a few places in this forum, I think.
• FCP -> COMPRESSOR -> .M2V (for DVDSP3): some things here. EXPORTING from an FCP SEQUENCE with CHAPTER MARKERS to an MPEG2 .M2V encoding USING A CLUSTER causes errors in the placement of the chapter markers when it is imported to DVDSP3. In fact, CONSISTENTLY, ALL the chapter markers are PLACED AT THE END of the TRACK in DVDSP3 - somewhat useless. This seems to happen ALSO when the source is an FCP reference movie, although inconsistently. A simple workaround, if you have the machines, is to TURN OFF SEGMENTING in the COMPRESSOR ENCODER inspector and let each .M2V transcode run on the same service node. For the jobs at hand, just set up a CLUSTER and controller for each machine and then SELECT the cluster (myclusterA, hisclusterb, herclusterc) for each transcode job. Anyway, for me, in the time spent resolving all this I could have TRANSCODED everything on my QUAD and it would all have been done sooner! (LOL)
• CONSOLE logs: IF QMASTER fails, I would suggest your first port of call for diagnosis should be /Library/Logs/Qmaster. In there you will see (on the controller node) compressor.log, jobcontroller.com.apple.qmaster.cluster.admin.log, and lots of others including service controller.com.apple.qmaster.executorX.log (for each cpu/core and node) and qmasterca.log. All these are worth a look, and they helped me solve 90% of my qmaster errors and failures.
• MOTION 3 - fwiw.. EXPORT USING COMPRESSOR to a CLUSTER seems to fail EVERY TIME.. seems MOTION is writing stuff out to a /var/spool/qmaster
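
The "keep all nodes on the SAME subnet" rule above is easy to check before starting the cluster. Here is a minimal sketch using Python's standard `ipaddress` module. The node names, the addresses and the /24 subnet are made up for illustration; substitute whatever you hardcoded on each machine.

```python
import ipaddress

def nodes_outside_subnet(subnet, node_ips):
    """Return the nodes whose address does not fall inside the given subnet."""
    net = ipaddress.ip_network(subnet)
    return {
        name: ip
        for name, ip in node_ips.items()
        if ipaddress.ip_address(ip) not in net
    }

# Hypothetical cluster: one controller and two service nodes.
nodes = {
    "controller": "192.168.10.2",
    "render-1": "192.168.10.3",
    "render-2": "192.168.11.4",  # deliberately on a different subnet
}

strays = nodes_outside_subnet("192.168.10.0/24", nodes)
print(strays)  # {'render-2': '192.168.11.4'}
```

An empty result means every node sits on the one logical network, which is the setup that worked for the author.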
TROUBLESHOOTING QMASTER: IF QMASTER seems buggered up (hosed), then follow these steps PRIOR to restarting your machines.
Go read the TROUBLESHOOTING sections in the published APPLE docs for COMPRESSOR, SHAKE and "SET UP FOR DISTRIBUTED PROCESSING", and search these forums CAREFULLY. The answer is usually there somewhere.
ELSE, try these steps.
You'll know that QMASTER is in trouble when you
• see that the QMASTER ICON at the top of the screen says "NO SERVICES" even though that node is started, and
• find that the APPLE QMASTER ADMINISTRATOR is VERY SLOW after an "APPLY" (like minutes with a SPINNING BEACHBALL), or it WON'T LET YOU DELETE a cluster, or you see 'undefined' nodes in your cluster (meaning that one was shut down or had a network failure). All this means it's going to get worse and worse. SO DON'T submit any more work to QMASTER. Best count your gains and follow this list next.
(a) In COMPRESSOR.app, choose RESET BACKGROUND PROCESSES (it's under the COMPRESSOR name list box) and see if things get kick-started, but you will lose all the work that has been done up to that point for COMPRESSOR.app.
(b) If not OK, then on EACH node in that cluster, STOP QMASTER (System Preferences/QMASTER/Setup; set 0 minutes in the prompt and click OK). Then, when STOPPED, RESET the shared services by OPTION+CLICKing on the "START" button to reveal "RESET SERVICES". Then click "START" on each node to start the services. This has the action of REMOVING or, in the case where the CLUSTER CONTROLLER node is "RESET", of terminating the cluster that's under its control. If so, simply go to APPLE QMASTER ADMINISTRATOR and REDEFINE it. Go restart your cluster.
(c) If step (b) is no help, consult the QMASTER logs in /Library/Logs/Qmaster (using console.app) for any FILE MISSING, FILE not found or FILE ERROR. Look carefully for the NODENAME (the machine_name.local) where the error may have occurred. Sometimes it's very chatty; other times it is not. Also look in the BATCH MONITOR OUTPUT for error messages. Often these are NEVER written (or I can't find them) in the /var/logs... Try to resolve any issues you can see (mostly VOLUME or FILE path issues, in my experience).
(d) If still no joy, then try removing all the 'dead' cluster files from /var/tmp/qmaster, /var/spool/qmaster and also the file directory that you specified above for the controller to share the clustering. For SHAKE issues, go do the same (note also where the shake shared cluster file path is - it can also be specified in the RENDER FILEOUT node's prompt).
(e) If all this WON'T help you, it's time to get the BIG hammer out. Simply STOP all nodes if not stopped (if status/mode is "STOPPING" then it [QMASTER] is truly buggered). DISMOUNT the network volumes you had mounted and RESTART ALL YOUR NODES. This has the effect of RESTARTING all the QMASTERD tasks. Yes, sure, you can go in and SUDO restart them, but it is dodgy at best because they never seem to terminate cleanly; Kill -9 or FORCE QUIT is what one ends up doing, and then STILL having to restart.
(f) After the restart, perform the steps from (b) again, and it will usually (but not always) be right after that.
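
The log check in step (c) can be done with a small script instead of paging through console.app. This is a rough sketch, not anything Apple ships. It assumes the files in /Library/Logs/Qmaster end in `.log` and that the interesting lines contain phrases like "file missing", "file not found" or "file error". The function name `find_suspect_lines` is invented here.

```python
import re
from pathlib import Path

LOG_DIR = Path("/Library/Logs/Qmaster")  # log location named in the post
# Assumption: these phrases are what a file problem looks like in the logs.
SUSPECT = re.compile(r"file (missing|not found|error)", re.IGNORECASE)

def find_suspect_lines(log_dir=LOG_DIR, pattern=SUSPECT):
    """Return (file name, line number, line) for every matching log line."""
    hits = []
    for path in sorted(Path(log_dir).glob("*.log")):
        try:
            text = path.read_text(errors="replace")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path.name, lineno, line.strip()))
    return hits

if __name__ == "__main__":
    for name, lineno, line in find_suspect_lines():
        print(f"{name}:{lineno}: {line}")
```

Run it on the controller node and on each service node; the machine_name.local in a matching line usually points at the node with the bad volume or file path.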
Lastly, here are some posts I have made that may help others with QMASTER 2.3 (not the NEW QMASTER as at May 2007):
Topic "qmasterd not running" - how this happened and what we did to fix it. - http://discussions.apple.com/message.jspa?messageID=4168064#4168064
Topic: IP over Firewire AND Ethernet connected cluster? http://discussions.apple.com/message.jspa?messageID=4171772#4171772
Lastly, spend some DEDICATED time using OBJECTIVE keywords to search the FINAL CUT PRO, SHAKE, COMPRESSOR, MOTION and QMASTER forums.
Hope that helps.
G5 QUAD 8GB ram w/3.5TB + 2 x 15in MBPCore Mac OS X (10.4.9) FCS1, SHAKE 4.1

Warwick,
Thanks for joining the forum and for doing all this work and posting your results for our benefit.
As FCP2 arrives in our shop, we will try once again to make sense of it and to see if we can boost our efficiencies in rendering big projects and getting Compressor to embrace five or six idle Macs.
Nonetheless, I am still in "Major Disbelief Mode" that Apple has done so little to make this software actually useful.
bogiesan

References from nested clusters

Hi,
Currently our station can test only one product at a time, but we modified the wiring so now we can attach 2 units to the same station. A new application must be written to handle the new scenario. The test has to be executed several times on both units. The execution is sequential, so unit1 first, then unit2.
I have created a CONTROL cluster with the following elements:
- bool: boolean button (means unit enabled/disabled)
- PARAMS cluster: various text rings. This cluster is disabled and greyed out once the user has enabled the starter.
- MEASUREMENT graph
Rules:
- the unit cannot be enabled if any of the text rings is unconfigured.
- the test must be interrupted immediately for the given unit if the enabled button is pressed during the test (when the user disables the unit at runtime). So a reference to this button must be used and continuously monitored.
- the test must be interrupted immediately for both units if the stop button is pressed during the test
- after a test is completed the results must be evaluated, and the unit must be disabled if the measured values are outside the limits.
Now... this would be a very easy task if I had only one unit. I would just create the necessary control references, wire them to the measurement VI and here we go.
But it's getting inconveniently complex when I have to control 2 units. I cannot treat the control elements as an array (i.e. an array with 2 CONTROL clusters) because then I cannot disable the PARAMS clusters independently.
I don't see an easy way to add 2 of the CONTROL clusters to a new cluster (so treat them as one cluster), and I am not sure how to get the references this way. (If I combine them into one cluster it's pretty easy to get the property node/value for any of the elements, but I need control refs.)
So I handle both clusters as independent controls on the front panel, which means I have to add a lot of duplication to handle both units in the same way. I find this very inconvenient and error-prone, plus it complicates the block diagram.
I am wondering what would be the right approach to handle these types of problems.
(I have tried to create reentrant VIs, but I gave up because I had to communicate too much between my main VI and the reentrant VIs. That made the code hard to follow.)
I use LV2012, but the attachment is in LV8, so hopefully everybody can open it.
Thanks
Attachments:
Cluster.vi 16 KB

Well... if I create a reference to the main cluster then I can use the controls[] property, which will give me back 3 references in an array. First the button, second the params cluster, third the graph (maybe the order is different, it doesn't matter for now). But when I wire the params cluster reference to another property node, it does not offer me a controls[] property, so I cannot access the contents of the cluster itself. I could maybe use some sort of cast function, but it's really counter-intuitive.
I always have to know the order of the elements in any given cluster, and if I change the order my code will break instantly. And hell, why should I refer to my objects as control[][0], control[][1] etc. instead of by a real name?
Not sure if this can be resolved in the current LabVIEW environment...
The workaround I made is that I have created a cluster in which each element is a reference. I wire the button, graph and params cluster references into it, and as I have two units to control I made an array of this cluster.
Not sure if you agree, but this is overcomplicating the code, and I had to create an extra cluster just to access the references of my original clusters. Pain in the back.
Let me know your thoughts!
thx.

Difference between the design of clusters PCLx and others like RFBLG etc .

There are a few nagging questions whose answers I was not able to find in the forum, hence I have to post a new question.
I am a little confused about the difference between the different clusters.
If I start with RFBLG, i.e. the cluster for BSEG, BSEC etc., I can see that the tables which are part of this cluster
can be viewed through different methods like
1) the where-used list for RFBLG
2) table DD02L, giving the required parameters there
Now when I compare this with another so-called cluster, PCL1, I find that PCL1 is not recognized as a cluster,
and I am also not able to see it in table DD02L when I give PCL1 and the tabtype as cluster, which I was able to do
for RFBLG. There are other tables similar to RFBLG.
1) So what is the difference between the RFBLG type of cluster and the PCLx type of cluster?
2) Are PCLx and RFBLG-type clusters the same?
3) Why does PCL1 show as a transparent table, whereas RFBLG shows in a different way in SE11?
4) I know we access data from PCL1 using IMPORT and EXPORT statements. DO OR CAN WE DO THE SAME FOR RFBLG?
5) I found that each and every cluster table has different fields. This was kind of surprising for me, as I had been thinking
that all cluster tables need to follow a certain rule. SO WHO DECIDES THE FIELDS OF A TABLE CLUSTER?
6) PCL1 has the index button enabled, which again I think is not according to the cluster table rules. How?
7) I understand that we can save data in the form of internal tables in the PCL1 cluster. Can we do the same in RFBLG?
8) Can I think along the lines that PCL1 and RFBLG types of cluster are two totally different types of data dictionary objects,
that the usage and implementation of both are different, and that the design and the BASE of both such objects
are different?
I know this is a long list, but I am sure that answers to these questions would really require someone who has worked really hard and invested a lot of time in understanding the dictionary system. I am awaiting a few answers, a few hints and a healthy discussion till we get them.
Thanks ...
a

Hello,
1/
BSEG is a typical Cluster Table.
This means that the physical table BSEG does NOT exist in the database; the physical data for BSEG is stored in the database table cluster RFBLG.
In ABAP, however, you can perform selects on BSEG (with all fields from the SAP repository structure, see SE11 on BSEG). During execution the SAP database layer translates these statements into physical selects on the RFBLG database table, so in ABAP this is transparent.
More info :
http://help.sap.com/saphelp_nw04/helpdata/en/cf/21f083446011d189700000e8322d00/content.htm
2/
PCL1, PCL2, ... are normal SAP transparent tables; in HR, however, they are often called HR cluster tables.
Transparent tables are SAP objects for which there is also a database table with the same name that contains the physical data.
However, the PCL tables are somewhat different from normal transparent tables (the data is compressed, external programs cannot interpret the data, ...).
This means that in ABAP you cannot use simple SQL statements to access data in PCL tables (because of the compressed format).
Instead, statements like EXPORT TO DATABASE and IMPORT FROM DATABASE need to be used.
More info :
http://fuller.mit.edu/hr/cluster_tables.html
Wim
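To make the point about the compressed format concrete, here is a loose analogy in Python. It is not ABAP and not how SAP actually stores PCLx data; the table name `pcl_like`, its columns and the two helper functions are all hypothetical, chosen only to mirror the idea behind EXPORT TO DATABASE / IMPORT FROM DATABASE: the table itself is an ordinary one, but the payload column holds a compressed blob that only the matching read routine can interpret.

```python
import json
import sqlite3
import zlib


def export_to_db(conn, record_key, payload):
    """Hypothetical stand-in for an export: pack the payload and store it as a blob."""
    packed = zlib.compress(json.dumps(payload).encode("utf-8"))
    conn.execute(
        "INSERT OR REPLACE INTO pcl_like (record_key, packed) VALUES (?, ?)",
        (record_key, packed),
    )


def import_from_db(conn, record_key):
    """Hypothetical stand-in for an import: fetch the blob and unpack it."""
    row = conn.execute(
        "SELECT packed FROM pcl_like WHERE record_key = ?", (record_key,)
    ).fetchone()
    if row is None:
        return None
    return json.loads(zlib.decompress(row[0]).decode("utf-8"))


# An ordinary table with a key column and an opaque payload column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pcl_like (record_key TEXT PRIMARY KEY, packed BLOB)")

original = {"wage_types": [{"code": "1000", "amount": 2500}]}
export_to_db(conn, "EMP0001", original)

# A plain SELECT works, but it only returns compressed bytes, not fields.
raw = conn.execute("SELECT packed FROM pcl_like").fetchone()[0]

# Only the matching import routine recovers the structured data.
restored = import_from_db(conn, "EMP0001")
```

A plain SELECT can find the row by its key, but it cannot filter on anything inside the payload, which is why a dedicated read routine is needed.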

Upgrading a 3-node Hyper-V cluster's storage for £10k and getting the most bang for our money.

Hi all, looking for some discussion and advice on a few questions I have regarding storage for our next cluster upgrade cycle.
Our current system for a bit of background:
3x Clustered Hyper-V Servers running Server 2008 R2 (72TB Ram, dual cpu etc...)
1x Dell MD3220i iSCSI with dual 1Gb connections to each server (24x 146GB 15k SAS drives in RAID 10) - Tier 1 storage
1x Dell MD1200 Expansion Array with 12x 2TB 7.2K drives in RAID 10 - Tier 2 storage, large VMs, files etc...
~25 VMs running all manner of workloads: SQL, Exchange, WSUS, Linux web servers etc...
1x DPM 2012 SP1 Backup server with its own storage.
Reasons for upgrading:
Storage throughput is becoming an issue, as we only get around 125MB/s over the dual 1Gb iSCSI connections to each physical server. (We have tried everything under the sun to improve bandwidth, but I suspect the MD3220i RAID is the bottleneck here.)
Backup times for VMs (once every night) are now in the 5-6 hour range.
Poor storage performance during backups and large file synchronisations (DPM).
Tier 1 storage is running out of capacity and we would like to build in more IOPS for future expansion.
Tier 2 storage is massively underused (6TB of 12TB RAID 10 space).
Migrating to 10Gb server links.
Total budget for the upgrade is in the region of £10k so I have to make sure we get absolutely the most bang for our buck.
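As a quick sanity check on the 125MB/s figure, the theoretical ceiling of an Ethernet link can be worked out from its line rate. The sketch below ignores iSCSI/TCP/IP overhead, so real numbers will be lower; the function name is just for illustration. Note that 125MB/s is exactly the ceiling of a single 1Gb link. That may suggest traffic is flowing over only one of the two paths, though this is an inference from the arithmetic and not something confirmed above.

```python
def link_ceiling_mb_per_s(gigabits_per_s, links=1):
    """Theoretical ceiling in MB/s for `links` links, ignoring protocol overhead."""
    bits_per_s = gigabits_per_s * 1_000_000_000 * links
    return bits_per_s / 8 / 1_000_000  # bits -> bytes -> megabytes


single_1gb = link_ceiling_mb_per_s(1)         # one 1Gb link
dual_1gb = link_ceiling_mb_per_s(1, links=2)  # both 1Gb links used in parallel
single_10gb = link_ceiling_mb_per_s(10)       # one 10Gb link

print(f"1x 1Gb : {single_1gb:.0f} MB/s")
print(f"2x 1Gb : {dual_1gb:.0f} MB/s")
print(f"1x 10Gb: {single_10gb:.0f} MB/s")
```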
Current Plan:
Upgrade the cluster to Server 2012 R2
Install a dual-port 10Gb NIC team in each server and virtualize cluster, live migration, VM and management traffic (with QoS of course)
Purchase a new JBOD SAS array and leverage the new Storage Spaces and SSD caching/tiering capabilities. Use our existing 2TB drives for capacity and purchase sufficient SSDs to replace the 15k SAS disks.
On to the questions:
Is it supported to use Storage Spaces directly connected to a Hyper-V cluster? I have seen that for our setup we are on the verge of requiring a separate SOFS for storage, but the extra costs and complexity are out of our reach (RDMA, extra 10Gb NICs etc...).
When using a storage space in a cluster, I have seen various articles suggesting that each CSV will be active/passive within the cluster. Does this cause redirected I/O for all cluster nodes that are not currently active?
If CSVs are active/passive, it's suggested that you should have a CSV for each node in your cluster. How, in production, do you balance VMs across 3 CSVs without manually moving them to keep 1/3 of the load on each CSV? Ideally I would like just a single active/active CSV for all VMs to sit on (ease of management etc...).
If the CSV is active/active, am I correct in assuming that DPM will back up VMs without causing any redirected I/O?
Will DPM backups of VMs be incremental in terms of data transferred from the cluster to the backup server?
Thanks in advance for anyone who can be bothered to read through all that and help me out! I'm sure there are more questions I've forgotten but those will certainly get us started.
Also lastly, does anyone else have a better suggestion for how we should proceed?
Thanks

1) You can of course use a direct SAS connection with a 3-node cluster (or 4-node, 5-node etc.). It would certainly be much faster than running with an additional SoFS layer: with SAS fed directly to your Hyper-V cluster nodes, all reads and writes are local and travel down the SAS fabric. With an SoFS layer added you have the same amount of I/O targeting SAS, plus Ethernet, with its huge latency compared to SAS, sitting between the requestor and your data on the SAS spindles, and the I/Os wrapped into SMB-over-TCP-over-IP-over-Ethernet requests between the hypervisor and SoFS layers. The reason SoFS is recommended is that the final SoFS-based solution would be cheaper, as SAS-only is a pain to scale beyond basic 2-node configs. Instead of getting SAS switches, adding redundant SAS controllers to every hypervisor node and/or looking for expensive multi-port SAS JBODs, you have a pair (at least) of SoFS boxes acting as a file-level proxy in front of a SAS-controlled back end. So you compromise performance in favor of cost. See:
http://davidzi.com/windows-server-2012/hyper-v-and-scale-out-file-cluster-home-lab-design/
The interconnect diagram used in this design would actually scale beyond 2 hosts, but you'd have to get a SAS switch (actually at least two of them for redundancy, as you don't want any component to become a single point of failure, do you?).
2) With 2012 R2, all I/O from multiple hypervisor nodes goes through the storage fabric (in your case that's SAS), and only metadata updates go through the coordinator node over Ethernet. Redirected I/O is used in only two cases: a) the hypervisor node has no SAS connectivity (but its Ethernet connectivity is still present), and b) badly implemented backup software holds access to the CSV through the snapshot mechanism for too long. In a nutshell: you'll be fine :) For reference, see:
http://www.petri.co.il/redirected-io-windows-server-2012r2-cluster-shared-volumes.htm
http://www.aidanfinn.com/?p=12844
3) These are independent things. CSV is not active/passive (see 2), so with the interconnect design you'll be using there's virtually no point in having one CSV per hypervisor. There are cases where you'd still probably do this. For example, if you have all-flash and combined spindle/flash LUNs and you know for sure you want some VMs to sit on flash and others (not so I/O hungry) to stay on "spinning rust". Another case is a many-node cluster. There, multiple nodes basically fight for a single LUN and a lot of time is wasted resolving SCSI reservation conflicts (ODX has no reservation offload like VAAI has, so even if ODX is present it's not going to help). Again, this is a place where SoFS "helps": the intermediate proxy level turns block I/O into file I/O, so SCSI reservation conflicts are triggered for only the two SoFS nodes instead of every node in the hypervisor cluster. One more good example is a mix of local I/O (SAS) and Ethernet with Virtual SAN products. A Virtual SAN runs directly as part of the hypervisor and emulates a high-performance SAN using cheap DAS. To increase performance it DOES make sense to create a "local LUN" (and thus a "local CSV"), as reads targeting this LUN/CSV are passed down the local storage stack instead of hitting the wire (Ethernet) and going to partner hypervisor nodes to fetch the VM data. See:
http://www.starwindsoftware.com/starwind-native-san-on-two-physical-servers
http://www.starwindsoftware.com/sw-configuring-ha-shared-storage-on-scale-out-file-servers
(basically feeding DAS to Hyper-V and SoFS to avoid expensive SAS JBODs and SAS spindles). It is the same thing VMware is doing with their VSAN on vSphere. But again, that's NOT your case, so it DOES NOT make sense to keep many CSVs with only 3 nodes present or with SoFS possibly in use.
4) DPM is going to put your cluster in redirected mode for a very short period of time. Microsoft says NEVER. See:
http://technet.microsoft.com/en-us/library/hh758090.aspx
Direct and Redirect I/O
Each Hyper-V host has a direct path (direct I/O) to the CSV storage Logical Unit Number (LUN). However, in Windows Server 2008 R2 there are a couple of limitations:
For some actions, including DPM backup, the CSV coordinator takes control of the volume and uses redirected instead of direct I/O. With redirection, storage operations are no longer through a host’s direct SAN connection, but are instead routed through the CSV coordinator. This has a direct impact on performance.
CSV backup is serialized, so that only one virtual machine on a CSV is backed up at a time.
In Windows Server 2012, these limitations were removed:
Redirection is no longer used.
CSV backup is now parallel and not serialized.
5) Yes, VSS and CBT would be used, so the data transferred would be incremental after the initial "seed" backup. See:
http://technet.microsoft.com/en-us/library/ff399619.aspx
http://itsalllegit.wordpress.com/2013/08/05/dpm-2012-sp1-manually-copy-large-volume-to-secondary-dpm-server/
I'd look at some other options. There are a few good discussions you may want to read. See:
http://arstechnica.com/civis/viewtopic.php?f=10&t=1209963
http://community.spiceworks.com/topic/316868-server-2012-2-node-cluster-without-san
Good luck :)
StarWind iSCSI SAN & NAS

Using single SMB share with multiple Hyper-V clusters

Hello,
I'm trying to find out if I can use a single SMB share with multiple Hyper-V Clusters. Looking at:
How to Assign SMB 3.0 File Shares to Hyper-V Hosts and Clusters in VMM
I think it's possible. Since the File Server is going to handle the file locking it shouldn't be a problem.
Has anyone tried that?
Thank you in advance!

Hello,
I'm not sure that's possible. I get this from this statement: "Assign the share - Assign the share to a virtual machine host or cluster."
Even if it worked I wouldn't do that. Why don't you just create multiple shares?

Why does my 10Gb iSCSI setup seem to see such high latency and how can I fix it?

I have an iSCSI server set up with the following configuration:
Dell R510
PERC H700 RAID controller
Windows Server 2012 R2
Intel Ethernet X520 10Gb
12 near-line SAS drives
I have tried both StarWind and the built-in Server 2012 iSCSI software but see similar results. I am currently running the latest version of StarWind's free iSCSI server.
I have connected it to a 10Gb port on an HP 8212, which is also connected via 10Gb to our VMware servers. I have a dedicated VLAN just for iSCSI and have enabled jumbo frames on the VLAN.
I frequently see very high latency on my iSCSI storage, so much so that it can time out or hang VMware. I am not sure why. I can run IOmeter and get some pretty decent results.
I am trying to determine why I see such high latency (100s of ms). It doesn't always happen, but several times throughout the day VMware complains about the latency of the datastore. I have a 10Gb iSCSI connection between the servers. I wouldn't expect the disks to be able to max that out. The highest I could see when running IOmeter was around 5Gb. I also don't see much load at all on the iSCSI server when I see the high latency. It seems network related, but I am not sure what settings I could check. The 10Gb connection should be plenty, as I said, and it is nowhere near maxing that out.
Any thoughts about configuration changes I could make to my VMware environment or network card settings, or any ideas on where I can troubleshoot this? I am not able to find what is causing it. I referenced this document for changes to my iSCSI settings:
http://en.community.dell.com/techcenter/extras/m/white_papers/20403565.aspx
Thank you for your time.

If both StarWind and the MSFT target show the same numbers, my guess is that it's a network configuration issue. Anything higher than 30 ms is a nightmare :( Did you properly tune your network stacks? What numbers (throughput and latency) do you get for raw TCP (NTttcp and iperf are handy for showing this)?
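To illustrate what a raw TCP latency number means here, below is a minimal Python sketch that times small round trips against an echo server. It is not a replacement for NTttcp or iperf, and it runs over loopback only so that it is self-contained; in a real test the echo side would have to run on the iSCSI target host. All function names are made up for the example, and the 30 ms threshold is simply the figure mentioned above.

```python
import socket
import threading
import time


def _echo_one_client(server_sock):
    """Accept a single connection and echo back whatever arrives."""
    conn, _ = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def measure_rtt_ms(host="127.0.0.1", samples=20, payload=b"x" * 64):
    """Return a list of TCP round-trip times in milliseconds."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(target=_echo_one_client, args=(server,), daemon=True)
    worker.start()

    rtts = []
    with socket.create_connection((host, port)) as client:
        # Disable Nagle so small writes are sent immediately.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(samples):
            start = time.perf_counter()
            client.sendall(payload)
            received = 0
            while received < len(payload):
                chunk = client.recv(4096)
                if not chunk:
                    raise ConnectionError("echo server closed early")
                received += len(chunk)
            rtts.append((time.perf_counter() - start) * 1000.0)

    worker.join(timeout=2)
    server.close()
    return rtts


def count_slow(rtts_ms, threshold_ms=30.0):
    """Count samples above the threshold (30 ms per the reply above)."""
    return sum(1 for rtt in rtts_ms if rtt > threshold_ms)


if __name__ == "__main__":
    results = measure_rtt_ms()
    print(f"max RTT: {max(results):.3f} ms, slow samples: {count_slow(results)}")
```

If the raw TCP round trips between initiator and target already show spikes, the problem sits below the iSCSI layer; if they stay flat, the target or disks are the next place to look.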
StarWind VSAN [Virtual SAN] clusters Hyper-V without SAS, Fibre Channel, SMB 3.0 or iSCSI, uses Ethernet to mirror internally mounted SATA disks between hosts.
