Microsoft Exchange Replication Service keeps stopping
Not sure what is going on here. The replication service keeps stopping on one of my servers within the DAG. I can't think of anything different about this server compared to the others (nothing known, at least). This server sits at our DR facility and hosts the third copy of a few of our databases, whose primary and secondary copies live at our HQ. The application log reports this every 5-10 seconds:
Watson report about to be sent for process id: 31860, with parameters: E12IIS, c-RTL-AMD64, 15.00.0995.029, msexchangerepl, M.Exchange.Common, M.E.C.H.DatabaseFailureItem.Parse, System.ArgumentOutOfRangeException, fe19, 15.00.0995.012.
ErrorReportingEnabled: False
The system log reports this every 5-10 seconds also:
The Microsoft Exchange Replication service terminated unexpectedly. It has done this 1 time(s). The following corrective action will be taken in 5000 milliseconds: Restart the service.
I have run Windows Update, and it seems that every time I reboot the machine, the shutdown tracker comes up. I think it has something to do with hardware, but I have no idea where to look next.
Hi,
This issue occurs because the Microsoft Exchange Replication service is unable to write to the "Microsoft-Exchange-MailboxDatabaseFailureItems/Operational" event log.
The fix for the issue will be available in the next cumulative update. In the meantime, please follow the steps below as a workaround. The workaround clears the event log that stores mailbox database failure items.
On the problem server, open a command prompt with administrator privileges.
Run the following command to clear the event entries from the log:
Wevtutil.exe cl "Microsoft-Exchange-MailboxDatabaseFailureItems/Operational"
Best Regards.
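For anyone who has to repeat this workaround on several DAG members, here is a minimal Python sketch that builds the same wevtutil command. The channel name comes from the reply above; the backup path is a hypothetical example, and the optional /bu: switch (which saves the existing entries to an .evtx file before clearing) is an addition to the suggested workaround, not part of it.

```python
import subprocess

# Channel name taken from the reply above.
FAILURE_ITEMS_LOG = "Microsoft-Exchange-MailboxDatabaseFailureItems/Operational"


def build_clear_command(log_name, backup_path=None):
    """Return the wevtutil argument list that clears one event log.

    If backup_path is given, wevtutil's /bu: option saves the existing
    entries to that .evtx file before they are cleared.
    """
    command = ["wevtutil.exe", "cl", log_name]
    if backup_path:
        command.append("/bu:" + backup_path)
    return command


def clear_log(log_name, backup_path=None):
    """Run the command. Needs an elevated prompt on the affected server."""
    subprocess.run(build_clear_command(log_name, backup_path), check=True)


# Hypothetical backup location; adjust or drop it as needed.
print(build_clear_command(FAILURE_ITEMS_LOG, r"C:\Temp\FailureItems.evtx"))
```

Only build_clear_command is exercised here; clear_log actually clears the log and should be run only on the problem server.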
Similar Messages
-
We are getting a SCOM alert: "Database copy isn't keeping up with the changes on the active database copy and has failed."
When we go to the mailbox server and check, we find that all the mailbox DBs are mounted properly and their copies are in a healthy state. Digging further into this problem, we found no critical events for this DB in Event Viewer.
When we run Get-MailboxDatabaseCopyStatus -Identity DBNAME | fl, the ErrorMessage section shows the error "The Microsoft Exchange Replication service encountered an error while attempting to start a replication instance for . If the copy doesn't recover automatically, administrator intervention may be necessary. Error: The directory is not empty."
ErrorEventID : 4126
In Event Viewer we get an Information event 4114 saying the DB redundancy health check passed, and in the details section we get the same error message mentioned above.
We also observed that the "Failed" counter under "MSExchange Replication" is set to 1 for this particular DB but is set to 0 for the other DBs. We tried restarting the Microsoft Exchange Replication service, but the "Failed" counter is still 1 and the SCOM alert is still active.
The version is Exchange 2010 SP3 UR5
Any clues?
Hi,
From your description, this issue affects only one mailbox database.
Please run the Update-MailboxDatabaseCopy command and check the result.
If that doesn't work, please dismount your active database copy and check whether it is in a Clean Shutdown state. If it is in Dirty Shutdown, bring it to Clean Shutdown and mount the database copy. Then run the Update-MailboxDatabaseCopy command again and check the result.
Best regards,
Belinda Ma
TechNet Community Support -
Hi Board,
I've searched across the board, TechNet and the Symantec sites but did not find a hint about my problem.
We run a 2-node DAG (Location1-Ex1-mb1, Location2-exc1-mb1) on SP2 RU4 patch level with 40 databases.
For some time the backup of one DB, and only one, has been failing with these events, logged on the Mailbox server that hosts the passive DB.
Log Name: Application
Source: MSExchangeRepl
Date: 28.09.2012 00:37:17
Event ID: 2112
Task Category: Exchange VSS Writer
Level: Error
Keywords: Classic
User: N/A
Computer: Location1-Exc1-MB1
Description: The Microsoft Exchange Replication service VSS Writer instance 1ab7d204-609a-4aea-b0a7-70afb0db38de failed with error code 80070020 when preparing for a backup of database 'DB012'.
Followed by
Log Name: Application
Source: MSExchangeRepl
Date: 01.10.2012 03:33:06
Event ID: 2024
Task Category: Exchange VSS Writer
Level: Error
Keywords: Classic
User: N/A
Computer: Location1-Exc1-MB1
Description: The Microsoft Exchange Replication service VSS Writer (Instance 42916d80-36c1-4f73-86d0-596d30226349) failed with error 80070020 when preparing for a backup.
The backup application, Symantec Backup Exec 2010 R3, reports this error:
Snapshot provider error (0xE000FED1): A failure occurred querying the Writer status.
Check the Windows Event Viewer for details.
Writer Name: Exchange Server, Writer ID: {76FE1AC4-15F7-4BCD-987E-8E1ACB462FB7}, Last error: The VSS Writer failed, but the operation can be retried (0x800423f3), State: Stable (1).
Symantec suggests in http://www.symantec.com/business/support/index?page=content&id=TECH184095 restarting the MS Exchange Replication service, BUT the mentioned event ID 8229 isn't present on either of the two Mailbox servers.
The affected database is active on the Location2-Exc1-Mb1 server and in an overall healthy state. During my research I found that on the Location2-Exc1-Mb1 server there are shadow copies present that were not removed!
This confuses me, since all backups are normally taken from the passive copy of a database.
So my questions to the board are:
* Is anyone facing similar issues?
* Can someone explain why snapshots are present on the Mailbox server hosting the active database, while the errors are logged on the passive one?
* Does someone know the conditions under which shadow copies remain and aren't removed properly?
* What can cause only one DB to be affected by such issues?
Any suggestion is welcome!
BR
Markus
Hi Lenora,
I increased the VSS / Exchange backup log levels to expert before starting. These are the things I've tried so far:
- Backup from passive DB (forced within Symantec Backup Exec)
- Backup from active DB (forced within Symantec Backup Exec)
- Backup from passive DB without GRT enabled (forced within Symantec Backup Exec)
- Backup from active DB without GRT enabled(forced within Symantec Backup Exec)
All those attempts failed.
But they brought some more details: the backup against the active DB states that there is still a backup in progress, and therefore this backup is cancelled by VSS.
The solution was to restart the Exchange Replication service on the Mailbox server hosting the passive DB.
Backups are working again on all DBs!
Thanks for your replies.
Best regards
Markus -
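A note on the error code in events 2112 and 2024: 80070020 is an HRESULT that wraps Win32 error 32 (ERROR_SHARING_VIOLATION, "the file is being used by another process"), which is consistent with the finding that a previous backup was still considered in progress. The Python sketch below shows how such a code splits into its fields. The function name and the small lookup table are illustrative only, not part of any Exchange or Windows tooling.

```python
FACILITY_WIN32 = 7

# Tiny illustrative lookup; only the code seen in this thread is listed.
WIN32_ERROR_NAMES = {32: "ERROR_SHARING_VIOLATION"}


def decode_hresult(value):
    """Split a 32-bit HRESULT into its failure bit, facility and code."""
    if isinstance(value, str):
        value = int(value, 16)      # accepts "80070020" or "0x80070020"
    value &= 0xFFFFFFFF             # also normalises signed 32-bit values
    return {
        "failure": bool(value >> 31),        # top bit set means failure
        "facility": (value >> 16) & 0x7FF,   # 11-bit facility field
        "code": value & 0xFFFF,              # low 16 bits
    }


def describe(value):
    """Return a short text description of an HRESULT."""
    fields = decode_hresult(value)
    if fields["facility"] == FACILITY_WIN32:
        name = WIN32_ERROR_NAMES.get(fields["code"], "unknown")
        return "Win32 error {} ({})".format(fields["code"], name)
    return "facility {}, code 0x{:04x}".format(fields["facility"], fields["code"])


print(describe("80070020"))
```

The same helper can be pointed at the 0x800423f3 value reported by Backup Exec; it only reports the facility and code there, since that code is not a wrapped Win32 error.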
We have a 2-server DAG with a witness; all servers are Server 2012 R2. Both members of the DAG are Exchange 2013 SP1 CU5.
We hadn't seen this until recently, but during a VSS-enabled storage snapshot the following errors occur, all of which are related to MSExchangeRepl and the Exchange VSS Writer.
Event IDs 2140, 2026, 2028, 2030.
The errors start with the 2140 seen here:
The Microsoft Exchange Replication service VSS Writer encountered an exception in function Microsoft::Exchange::Cluster::ReplicaVssWriter::CReplicaVssWriterInterop::PrepareSnapshot. HResult -2147467259. Exception Microsoft.Mapi.MapiExceptionDatabaseError: MapiExceptionDatabaseError:
Unable execute Prepare on snapshot. (hr=0x80004005, ec=1108)
Diagnostic context:
Lid: 1494 ---- Remote Context Beg ----
Lid: 56264 StoreEc: 0x1388
Lid: 46280 StoreEc: 0x454
Lid: 1750 ---- Remote Context End ----
Lid: 59216
Lid: 34640 StoreEc: 0x454
Lid: 52264
Lid: 46120 StoreEc: 0x454
at Microsoft.Mapi.MapiExceptionHelper.InternalThrowIfErrorOrWarning(String message, Int32 hresult, Boolean allowWarnings, Int32 ec, DiagnosticContext diagCtx, Exception innerException)
at Microsoft.Mapi.MapiExceptionHelper.ThrowIfError(String message, Int32 hresult, IExInterface iUnknown, Exception innerException)
at Microsoft.Mapi.ExRpcAdmin.ProcessSnapshotOperation(Guid mdbGuid, SnapshotOperationCode operationCode, UInt32 flags)
at Microsoft.Exchange.Cluster.Replay.StoreRpcController.<>c__DisplayClass1c.<SnapshotPrepare>b__1b()
at Microsoft.Exchange.Cluster.Replay.SafeRefCountedTimeoutWrapper.<>c__DisplayClass2.<ProtectedCallWithTimeout>b__0()
at Microsoft.Exchange.Data.HA.InvokeWithTimeout.Invoke(Action invokableAction, Action foregroundAction, TimeSpan invokeTimeout, Boolean sendWatsonReportNoThrow, Object cancelEvent)
at Microsoft.Exchange.Cluster.Replay.SafeRefCountedTimeoutWrapper.ProtectedCallWithTimeout(String operationName, TimeSpan timeout, Action operation)
at Microsoft.Exchange.Cluster.ReplicaVssWriter.CReplicaVssWriterInterop.SnapshotPrepare(Guid dbGuid, UInt32 flags)
at Microsoft.Exchange.Cluster.ReplicaVssWriter.CReplicaVssWriterInterop.PrepareSnapshot().
Hi Julez K,
According to your description, you could try upgrading to CU6.
In addition, you could review the Application and System event logs on the Exchange server for errors. The following article lists some related resolutions for your reference:
http://www.symantec.com/business/support/index?page=content&id=TECH209938
Note: Microsoft is providing this information as a convenience to you. The sites are not controlled by Microsoft. Microsoft cannot make any representations regarding the quality, safety,
or suitability of any software or information found there. Please make sure that you completely understand the risk before retrieving any suggestions from the above link.
Hope it can be helpful.
Best regards,
Eric -
VSS timeouts on Hyper-V guest, Exchange Replication Service, CU5
I'm having a problem with a Hyper-V guest that is running Exchange 2013 CU5. I will say I have had issues with this VM since installing CU5.
Note: I am doing a backup from within the VM guest, not from the host.
The scheduled backup takes a long time to complete: an entire weekend to back up 150GB to an iSCSI disk. In addition, CPU usage is very high (58%) while this is happening. Attempting to open the backup manager window consistently makes CPU usage hit 99%. When this happens Outlook clients will fail. When backup manager opens it will continually say "Reading data; please wait..." If the backup manager happened to be open already, the backup job will say "Volume 1 0% of 4 volumes."
The processes taking up the CPU time are Microsoft Volume Shadow Copy Service (24%), Microsoft Block level backup service (62%) and virtual disk service (12%). Memory use always hovers around 65%. If I attempt to kill the processes with
task manager, there is no change. If I use the kill executable it will say the process is not running. I cannot stop the corresponding service either. I cannot stop the backup. I cannot query vss writer status. I cannot restart
the ISCSI service (device in use.) Restarting the NAS that contains the ISCSI target does nothing. The only recourse is to restart the server.
If I restart the server and start a backup fairly soon after, the backup completes normally, in about an hour. During a normal backup CPU usage is at about 30%. The Microsoft Volume Shadow Copy Service runs at 0% CPU time as well as the virtual
disk service. The Microsoft Block Level Backup Engine runs at 10% CPU time. The scheduled backup is set to start at 9:30pm. I have also tried changing backup times. If I restart the server at 4 am and do not run a manual backup, the
scheduled backup performs poorly.
After some digging I find these errors:
Log Name: Application
Source: MSExchangeRepl
Date: 10/14/2014 9:30:41 PM
Event ID: 2112
Task Category: Exchange VSS Writer
Level: Error
Keywords: Classic
User: N/A
Description:
The Microsoft Exchange Replication service VSS Writer instance 7a465f3f-25ba-45b2-952a-870a6ddc2f2b failed with error code 80070020 when preparing for a backup of database 'Mailbox Database 2123847568'.
Log Name: Application
Source: VSS
Date: 10/14/2014 9:30:41 PM
Event ID: 8229
Task Category: None
Level: Warning
Keywords: Classic
User: N/A
Description:
A VSS writer has rejected an event with error 0x00000000, The operation completed successfully.
. Changes that the writer made to the writer components while handling the event will not be available to the requester. Check the event log for related events from the application hosting the VSS writer.
Operation:
PrepareForBackup event
Context:
Execution Context: Writer
Writer Class Id: {7e47b561-971a-46e6-96b9-696eeaa53b2a}
Writer Name: MSMQ Writer (MSMQ)
Writer Instance Name: MSMQ Writer (MSMQ)
Writer Instance ID: {b8ae6140-7fcb-427d-9493-e070221f752f}
Command Line: C:\Windows\system32\mqsvc.exe
Process ID: 1676
Log Name: Application
Source: MSExchangeRepl
Date: 10/14/2014 9:30:41 PM
Event ID: 2024
Task Category: Exchange VSS Writer
Level: Error
Keywords: Classic
User: N/A
Description:
The Microsoft Exchange Replication service VSS Writer (Instance 7a465f3f-25ba-45b2-952a-870a6ddc2f2b) failed with error 80070020 when preparing for a backup.
As you can see the errors happen almost immediately after the backup starts.
In addition the following VSSwriters show a last error "timed out"
Microsoft Exchange Writer
Com+ RegDB Writer
Shadow Copy Optimization Writer
Registry Writer
I will add also the following issues I have been experiencing ever since installing CU5:
1 Version buckets threshold easily reached. Had to modify the threshold and set limits on email size. It sometimes still happens.
2 After a restart of the server, the server may have no, or limited connection to the network. It may or may not have an exclamation point on the network icon. If there is no exclamation point, it can ping other network resources, but inbound requests
and pings, will fail. The event log shows that the network should be available while booting, but it's not since it cannot communicate with active directory. The fix for this is to disable/enable the network adapter and then all is well.
I really need some help figuring this out. Again, I never had an issue with this server prior to Exchange 2013 CU5 being installed.
Hi,
The reason I think it is Exchange related, is that from the error message it mentioned:
A VSS writer has rejected an event with error 0x00000000
And then it indicated it is "Microsoft Exchange Replication service VSS Writer".
The Microsoft Exchange Replication service VSS Writer (Instance 7a465f3f-25ba-45b2-952a-870a6ddc2f2b) failed with error 80070020 when preparing for a backup
Later I found this article:
How to turn on the Exchange writer for the Volume Shadow Copy service in Windows Small Business Server 2003
http://support2.microsoft.com/kb/838183/en-us
It mentions that with the Exchange writer turned on, a system state backup and an Exchange backup cannot be run at the same time:
The Exchange writer may cause conflicts with the information store backup feature of the Backup utility. The information store backup feature uses online streaming to back up the Exchange databases. If the Exchange writer is registered, the Backup utility
may log errors if you try to back up the system state and the Exchange information store at the same time. (For example, the Backup utility may log Event ID 8019.)
To confirm whether this is the cause, please run the following tests:
1. Create a backup-once task to back up only a simple file. This is a quick test to confirm whether Windows Server Backup itself works.
2. Create a second backup-once task to do a system state backup without any Exchange-related information.
3. If both fail, please disable the Exchange Writer and repeat tests 1 and 2.
These tests will take some time. The goal is to figure out whether the problem is Exchange related or not. I'll continue the discussion with you if any new clue is found.
If you have any feedback on our support, please send to [email protected] -
Exchange replication service not starting after installing Rollup update 3 v3
Hi,
I have installed Update Rollup 3 v3 for Exchange 2010 SP1 on my mailbox server. This mailbox server is a DAG member. After installing the update rollup, the Microsoft Exchange Replication service on this server attempts to start, shows a status of Started for a couple of seconds, and then stops. This behaviour continues until I uninstall the rollup. I have installed the same rollup on the other DAG member as well with no problem. The following errors are generated in the application log:
Event ID 4096: The Microsoft Exchange Replication service failed to start the Tcp Listener. Please review the Event Log for more information. The system will attempt to start the service again in 30 seconds.
Event ID 2121: The Microsoft Exchange Replication service failed to start the TCP listener. Error: System.Net.Sockets.SocketException: Only one usage of each socket address (protocol/network address/port) is normally permitted
at System.Net.Sockets.Socket.DoBind(EndPoint endPointSnapshot, SocketAddress socketAddress)
at System.Net.Sockets.Socket.Bind(EndPoint localEP)
at Microsoft.Exchange.Cluster.Replay.TcpListener.StartListening()
Event ID 4999:
Watson report about to be sent for process id: 6000, with parameters: E12, c-RTL-AMD64, 14.01.0289.001, msexchangerepl, M.E.Cluster.Replay, M.E.C.R.ReplicaInstance.TargetGetCopyStatus, System.MethodAccessException, a379, 14.01.0289.001.
ErrorReportingEnabled: True
Event ID 2060: The Microsoft Exchange Replication service encountered a transient error while attempting to start a replication instance for DB1\SRV-MBX2. The copy will be set to failed. Error:
The NetworkManager has not yet been initialized. Check the event logs to determine the cause.
Event ID 2060:
The Microsoft Exchange Replication service encountered a transient error while attempting to start a replication instance for DB2\SRV-MBX2. The copy will be set to failed. Error: The NetworkManager has not yet been initialized.
Check the event logs to determine the cause.
Thanks in advance!
I hope you are following the best practice for installing a rollup on DAG members:
Suspend activation for the databases on the server being updated.
Perform a server switchover so that all databases on the server are passive copies. There will be a brief interruption in service for the mailboxes hosted on the active databases during the switchover process.
Install the update rollup.
Resume activation for the databases on the updated server.
Perform database switchovers as needed.
http://blogs.technet.com/b/scottschnoll/archive/2009/12/10/installing-update-rollup-1-for-exchange-2010-on-dag-members.aspx
Thanks Uday Kiran,
Senior Consultant
Cyquent Technology Consultants, Dubai
Please Mark as answer if it helps you -
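On the Event ID 2121 text in the question: "Only one usage of each socket address (protocol/network address/port) is normally permitted" is what a bind call reports when another socket already holds the same address and port, so it may be worth checking what else is bound to the replication listener's port on that member. The Python sketch below reproduces that condition in isolation. It is a generic illustration of the socket error, not Exchange code, and it uses a throwaway port chosen by the operating system.

```python
import socket


def bind_twice(host="127.0.0.1"):
    """Bind two TCP sockets to one address; return (port, error from 2nd bind)."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind((host, 0))           # port 0: let the OS pick a free port
        first.listen(1)
        port = first.getsockname()[1]
        try:
            second.bind((host, port))   # same protocol, address and port
        except OSError as exc:
            return port, exc            # the "address already in use" error
        return port, None
    finally:
        first.close()
        second.close()


port, error = bind_twice()
print("second bind on port", port, "failed with:", error)
```

If a second bind fails this way for the replication service, the owner of the conflicting socket is what needs to be identified.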
Microsoft Exchange Transport Service has been stopped.
I changed out failing drives on a customer's SBS 2011, which uses Intel Rapid Storage Technology. I saw that it had an option to email a failing condition to up to three addresses. I configured it but got the Could Not Connect error from the RST software.
I Googled the condition and determined that I needed to install another Receive Connector on a different port, which is what I tried.
I opened all of the receive connectors in the MS Exchange Console - Server Configuration - Hub Transport section. As far as I know I did not change anything there; I just looked and cancelled. None of the new receive connectors I configured worked. Every time one configuration made no difference, I deleted the connector, re-added a connector using different options and tried the RST email test again.
Getting nowhere, I deleted the last new connector I configured and closed the Exchange Management Console, figuring I had a job for some time in the future when things were slow.
This morning I get the call No e-mail, and found the Exchange Transport Service stopped. I restarted it, made sure it was set to automatic, and it ran for about 3 minutes and then stopped. Right now I am keeping my customer working by starting the service
every 5 to 10 minutes.
Since I am not going to stay here indefinitely restarting the service, I need help!
J R Weymouth
Thanks for the links, I'll check them out. I finally got an error from the logs; in fact there are so many errors in the log that I don't even know which one to follow. It gets a cluster of errors every 5 minutes (6 errors, one critical), but that is when the transport shuts down. I am relaying the first message that comes up after I restart the transport.
Log Name: Application
Source: MSExchangeTransport
Date: 9/16/2014 9:11:48 AM
Event ID: 17003
Task Category: Storage
Level: Error
Keywords: Classic
User: N/A
Computer: SBSSRVR2011.MyDomain.local
Description:
Sender Reputation Database: An operation has encountered a fatal error. The database may be corrupted. The Microsoft Exchange Transport service is shutting down. Manual database recovery or repair may be required. Exception details: Microsoft.Exchange.Isam.IsamDbTimeCorruptedException:
Dbtime on current page is greater than global database dbtime (-344)
at Microsoft.Exchange.Isam.JetInterop.MJetDelete(MJET_TABLEID tableid)
at Microsoft.Exchange.Isam.Interop.MJetDelete(MJET_TABLEID tableid)
at Microsoft.Exchange.Transport.Storage.DataTableCursor.DeleteCurrentRow()
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="MSExchangeTransport" />
<EventID Qualifiers="49156">17003</EventID>
<Level>2</Level>
<Task>17</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2014-09-16T16:11:48.000000000Z" />
<EventRecordID>1141638</EventRecordID>
<Channel>Application</Channel>
<Computer>SBSSRVR2011.BainInsuranceAgency.local</Computer>
<Security />
</System>
<EventData>
<Data>Sender Reputation Database</Data>
<Data>Microsoft.Exchange.Isam.IsamDbTimeCorruptedException: Dbtime on current page is greater than global database dbtime (-344)
at Microsoft.Exchange.Isam.JetInterop.MJetDelete(MJET_TABLEID tableid)
at Microsoft.Exchange.Isam.Interop.MJetDelete(MJET_TABLEID tableid)
at Microsoft.Exchange.Transport.Storage.DataTableCursor.DeleteCurrentRow()</Data>
</EventData>
</Event>
J R Weymouth -
Exchange 2010 SP3 services keep stopping
Hi Folks,
Been struggling to find any answer to this problem. Hoping someone here can give me some guidance.
Have a Windows SBS 2011 Server which was migrated from SBS 2003. Have a secondary DC purely as a backup DC.
Every so often three Exchange services stop:
EdgeSync
Forms-Based Authentication service
RPC Client Access
Which of course stops clients accessing emails. If I simply start the three services all is good. Sometimes it will be a few days before they will stop again. At the time they stop I get the following Event ID's
MSExchangeRPC Event ID: 1002
Failed to register service principal name ExchangeMDB. Failed with error code No authority could be contacted for authentication (0x80090311).
MSExchange EdgeSync Event ID: 1045
Initialization failed with exception: Microsoft.Exchange.Data.Directory.NoSuitableServerFoundException: The Microsoft Exchange Active Directory Topology service on server localhost did not return any suitable domain controllers.
at Microsoft.Exchange.Data.Directory.DSAccessTopologyProvider.GetConfigDCInfo(Boolean throwOnFailure)
at Microsoft.Exchange.Data.Directory.TopologyProvider.PopulateConfigNamingContexts()
at Microsoft.Exchange.Data.Directory.ADSession.GetConfigurationNamingContext()
at Microsoft.Exchange.Data.Directory.SystemConfiguration.ADSystemConfigurationSession.GetLocalSite()
at Microsoft.Exchange.EdgeSync.EdgeSyncConfig.<Initialize>b__0()
at Microsoft.Exchange.Data.Directory.ADNotificationAdapter.RunADOperation(ADOperation adOperation, Int32 retryCount)
at Microsoft.Exchange.Data.Directory.ADNotificationAdapter.TryRunADOperation(ADOperation adOperation, Int32 retryCount). If this warning frequently occurs, contact Microsoft Product Support.
I'm wondering if something went wrong with the migration as we didn't do it. It is a fairly new server 6 months old.
Any suggestions would be appreciated.
Thanks Brandan
Thanks Guys,
I can confirm that IPv6 is on. I ran the BPA and it did throw up a few issues, one with the old server in the DNS forward lookup zone. I have fixed these and am waiting to see if they make a difference. The problem has been happening ever since the migration.
I have also switched the Forms-Based Authentication and RPC Client Access services to delayed start because I noticed these sometimes don't start after a reboot.
Keeping an eye on it and will report any updates.
Thanks Brandan -
Microsoft Forefront Threat Management Gateway services keep stopping
Please assist urgently
Microsoft Forefront Threat Management Gateway 2010 services keep stopping. We are on Service Pack 2 Rollup Update 5.
The event viewer does not display a reason why the services stopped.
Your assistance will be highly appreciated
Regards
Daniel Nkuna
Hi,
Here is a similar thread in which TMG keeps stopping and no error is displayed in the event log. It was fixed by uninstalling Surf cop from the TMG servers. Do you have such an application installed on your TMG server?
TMG Firewall service stopping
Best Regards,
Joyce
Please remember to mark the replies as answers if they help and unmark them if they provide no help. If you have feedback for TechNet Subscriber Support, contact [email protected] -
Hi All,
After installing the lovely update (https://technet.microsoft.com/library/security/ms14-068) that Microsoft released last week, I restarted my Exchange 2013 server and now the Microsoft Exchange Transport service will not start. It is pretty much constantly in a starting state. I have tried all the standard stuff, restarting the server etc.
Running on Server 2012 R2
Hopefully someone can shed some light.
Pete
Hi,
A few questions here :
- Have you installed all roles on the same server ? ( MultiRole server )
- If yes, do you use the preconfigured receive connectors from Exchange or did you add custom connectors ?
- Do you have any anti-spam service installed on the server that could be listening on port 25 ? -
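One quick way to answer the port 25 question while the Transport service is stuck in a starting state is to probe the port and look at whatever greeting comes back. The Python sketch below is a generic TCP probe under that assumption; probe_port is a hypothetical helper name, and the host and port in the last line are only an example. An SMTP server normally greets with a line starting with 220, so a banner naming a product other than Exchange would point at another listener.

```python
import socket


def probe_port(host, port, timeout=2.0):
    """Return (is_open, banner) for a TCP port.

    banner is whatever the listener sends first, or "" if it stays
    silent until the timeout expires.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            try:
                data = conn.recv(256)
                banner = data.decode("ascii", errors="replace").strip()
            except OSError:
                banner = ""         # connected, but no greeting arrived
            return True, banner
    except OSError:
        return False, ""            # refused, unreachable or timed out


# Example call; run it on the Exchange server itself.
print(probe_port("127.0.0.1", 25))
```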
I'm replicating between two servers in two sites (Server A - Server 2012 R2 STD, Server B - Server 2008 R2) over a VPN (SonicWall firewall). Though the initial replication seems to be happening, it is very slow (the folder in question is less than 3GB). I'm seeing these in the event viewer every few minutes:
The DFS Replication service is stopping communication with partner PPIFTC for replication group FTC due to an error. The service will retry the connection periodically.
Additional Information:
Error: 1726 (The remote procedure call failed.)
and then....
The DFS Replication service successfully established an inbound connection with partner PPIFTC for replication group FTC.
Here are all my troubleshooting steps (keep in mind that our VPN goes through a SonicWall, on which I increased the TCP timeout to 24 hours):
-Increased TCP timeout to 24 hours
-Added the following values on both sending and receiving members and rebooted the servers
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
Value =DisableTaskOffload
Type = DWORD
Data = 1
Value =EnableTCPChimney
Type = DWORD
Data = 0
Value =EnableTCPA
Type = DWORD
Data = 0
Value =EnableRSS
Type = DWORD
Data = 0
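Since the same four values have to match on both members, one option is to generate a single .reg file and import it on each server. The Python sketch below renders the values listed above in .reg syntax. The key path and value names are copied from this post; render_reg_file is a hypothetical helper, and the output is meant to be reviewed before being imported anywhere.

```python
TCPIP_PARAMETERS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
)

# Value names and data exactly as listed in the post (all DWORDs).
OFFLOAD_SETTINGS = {
    "DisableTaskOffload": 1,
    "EnableTCPChimney": 0,
    "EnableTCPA": 0,
    "EnableRSS": 0,
}


def render_reg_file(key, dword_values):
    """Render DWORD values for one registry key in .reg file syntax."""
    lines = ["Windows Registry Editor Version 5.00", "", "[" + key + "]"]
    for name, data in dword_values.items():
        # .reg files store DWORD data as eight hexadecimal digits
        lines.append('"{}"=dword:{:08x}'.format(name, data))
    return "\r\n".join(lines) + "\r\n"


print(render_reg_file(TCPIP_PARAMETERS_KEY, OFFLOAD_SETTINGS))
```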
---------------------------------more troubleshooting--------------------------
-Disabled AntiVirus on both members
-Made sure DFSR TCP ports 135 & 5722 are open
-Installed all hotfixes for 2008 R2 (http://support.microsoft.com/kb/968429) and rebooted
-Ran NETSTAT -ANOBP TCP; the results for the DFS executable are listed below:
Sending Member:
[DFSRs.exe]
TCP 10.x.x.x:53      0.0.0.0:0        LISTENING    1692
[DFSRs.exe]
TCP 10.x.x.x:54669   10.x.x.x:5722    TIME_WAIT    0
TCP 10.x.x.x:54673   10.x.x.x:5722    ESTABLISHED  1656
[DFSRs.exe]
TCP 10.x.x.x:64773   10.x.x.x:389     ESTABLISHED  1692
[DFSRs.exe]
TCP 10.x.x.x:64787   10.x.x.x:389     ESTABLISHED  1656
[DFSRs.exe]
TCP 10.x.x.x:64795   10.x.x.x:389     ESTABLISHED  2104
Receiving Member:
[DFSRs.exe]
TCP 10.x.x.x:56683   10.x.x.x:389     ESTABLISHED  7472
[DFSRs.exe]
TCP 10.x.x.x:57625   10.x.x.x:54886   ESTABLISHED  2808
[DFSRs.exe]
TCP 10.x.x.x:61759   10.x.x.x:57625   TIME_WAIT    0
TCP 10.x.x.x:61760   10.x.x.x:57625   TIME_WAIT    0
TCP 10.x.x.x:61763   10.x.x.x:57625   TIME_WAIT    0
TCP 10.x.x.x:61764   10.x.x.x:57625   TIME_WAIT    0
TCP 10.x.x.x:61770   10.x.x.x:57625   TIME_WAIT    0
TCP 10.x.x.x:61771   10.x.x.x:57625   TIME_WAIT    0
TCP 10.x.x.x:61774   10.x.x.x:57625   TIME_WAIT    0
TCP 10.x.x.x:61775   10.x.x.x:57625   TIME_WAIT    0
TCP 10.x.x.x:61776   10.x.x.x:57625   TIME_WAIT    0
TCP 10.x.x.x:61777   10.x.x.x:57625   TIME_WAIT    0
TCP 10.x.x.x:61778   10.x.x.x:57625   TIME_WAIT    0
TCP 10.x.x.x:61779   10.x.x.x:57625   TIME_WAIT    0
TCP 10.x.x.x:61784   10.x.x.x:52757   ESTABLISHED  7472
[DFSRs.exe]
TCP 10.x.x.x:63661   10.x.x.x:63781   ESTABLISHED  4880
------------------------------more troubleshooting--------------------------
-Increased Staging to 32GB
-Opened the ADSIedit.msc console to verify the "Authenticated Users" is set with the default READ permission on the following object:
a. The computer object of the DFS server
b. The DFSR-LocalSettings object under the DFS server computer object
-Ran ping 10.x.x.x -f -l 1472 and got replies back from both servers
-AD replication is successful on all partners
-Nslookup is working so DNS is working
-Updated NIC drivers on both servers
- I ran the following to set the Primary Member:
dfsradmin Membership Set /RGName:<replication group name> /RFName:<replicated folder name> /MemName:<primary member> /IsPrimary:True
Then Dfsrdiag Pollad /Member:<member name>
I'm seeing these errors in the dfsr logs:
20141014 19:28:17.746 9116 SRTR 957 [WARN] SERVER_EstablishSession Failed to establish a replicated folder session. connId:{45C8C309-4EDD-459A-A0BB-4C5FACD97D44} csId:{7AC7917F-F96F-411B-A4D8-6BB303B3C813}
Error:
+ [Error:9051(0x235b) UpstreamTransport::EstablishSession upstreamtransport.cpp:808 9116 C The content set is not ready]
+ [Error:9051(0x235b) OutConnection::EstablishSession outconnection.cpp:532 9116 C The content set is not ready]
+ [Error:9051(0x235b) OutConnection::EstablishSession outconnection.cpp:471 9116 C The content set is not ready]
---------------------------------------more troubleshooting-----------------------------
I've done a lot of research on the Internet and most of it points to the same things I've tried. Does anyone have any other suggestions? Maybe I need to look somewhere else on the server side or firewall side?
I tried replicating from a 2012 R2 server to another 2012 server and am getting the same events in the event log so maybe it's not a server issue.
Some other things I'm wondering:
-Could it be the speed of the NICs? Server A is a 2012 server that has Hyper-V installed. NIC teaming was initially set up, and since Hyper-V is installed the NIC is a "vEthernet (Microsoft Network Adapter Multiplexor Driver Virtual Switch)" running at a speed of 10.0Gbps, whereas Server B is running a single NIC at 1.0Gbps.
-Could occasional ping timeouts cause the issue? From time to time I get a timeout, but not as often as the events I'm seeing. I'm getting 53ms pings. The folder is only 3 GB so it shouldn't take that long to replicate, but it's been days. The schedule I have set allows replication for most of the day except during our backup window, 11pm-5am. Throughout the rest of the time I have it set anywhere from 4Mbps to 64 Kbps. Server A is on a 5mb circuit and Server B is on a 10mb circuit.
I'm seeing the same errors; all servers are running 2008 R2 x64, across multiple sites, and the VPN is steady and reliable.
185 events from 12:28:21 to 12:49:25
Events are for all five servers (one per office, five total offices, no two in the same city, across three states).
Events are not limited to one replication group. I have quite a few replication groups, so I don't know for sure but I'm running under the reasonable assumption that none are spared.
Reminder from original post (and also, yes, same for me), the error is: Error: 1726 (The remote procedure call failed.)
Some way to figure out what code triggers an Event ID 5014, and what code therein specifies an Error 1726, would be extremely helpful. Trying random command line/registry changes on live servers is exceptionally unappealing.
Side note, 1726 is referenced here:
https://support.microsoft.com/kb/976442?wa=wsignin1.0
But it says, "This RPC connection problem may be caused by an unstable WAN connection." I don't believe this is the case for my system.
It also says...
For most RPC connection problems, the DFS Replication service will try to obtain the files again without logging a warning or an error in the DFS Replication log. You can capture the network trace to determine whether the cause of the problem is at the network
layer. To examine the TCP ports that the DFS Replication service is using on replication partners, run the following command in a
Command Prompt window:
NETSTAT -ANOBP TCP
This returns all open TCP connections. The connections in question are "DFSRs.exe", which the command won't let you filter for.
Instead, I used the NETSTAT command as advertised, dumping output to info.txt:
NETSTAT -ANOBP TCP >> X:\info.txt
Then I opened Excel and opened the .TXT manually to get the import wizard. I chose fixed-width fields based on the first row for each result, and then added a column:
=IF(A3="Can not", "Can not obtain ownership information", IF(LEFT(A3,1) = "[", A3&B3&C3, ""))
Dragging this down through the entire file let me see that column (Column F) as the file name. Some anomalies were present but none impacted DFSRs.exe results.
Finally, you can sort/filter (I sorted because I like being able to see everything, should I choose to) to get just the results you need, with the partial rows removed from the result set, or bumped to the end.
My server had 125 connections open.
That is a staggering number of connections to review, and I feel like I'm looking for a needle in a haystack.
I'll see if I can find anything useful out, but a better solution would be most wonderful.
-
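A possible shortcut for the Excel step described above: a minimal Python sketch that reads saved NETSTAT output and keeps only the connections owned by one executable. It assumes the usual layout produced with the B switch, where each connection line is followed by an optional component name and then either the owning executable in square brackets or "Can not obtain ownership information". The function names and the sample addresses and PIDs are made up for illustration.

```python
import re

# One connection line: protocol, local address, remote address, optional state, PID.
CONN_RE = re.compile(r"^\s*(TCP|UDP)\s+(\S+)\s+(\S+)\s+(?:(\S+)\s+)?(\d+)\s*$")
# Owner line printed by the B switch, e.g. " [DFSRs.exe]".
EXE_RE = re.compile(r"^\s*\[(.+?)\]\s*$")


def parse_netstat(text):
    """Return one dict per connection, with 'exe' set when an owner line follows it."""
    connections = []
    current = None
    for line in text.splitlines():
        conn = CONN_RE.match(line)
        if conn:
            current = {
                "proto": conn.group(1),
                "local": conn.group(2),
                "remote": conn.group(3),
                "state": conn.group(4) or "",
                "pid": int(conn.group(5)),
                "exe": None,  # stays None if ownership could not be obtained
            }
            connections.append(current)
            continue
        owner = EXE_RE.match(line)
        if owner and current is not None and current["exe"] is None:
            current["exe"] = owner.group(1)
    return connections


def filter_by_exe(connections, exe_name):
    """Keep only connections owned by exe_name (case-insensitive)."""
    wanted = exe_name.lower()
    return [c for c in connections if (c["exe"] or "").lower() == wanted]


# Hypothetical sample in the assumed layout; not taken from a real server.
SAMPLE = """
Active Connections

  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       888
  RpcSs
 [svchost.exe]
  TCP    10.0.0.5:5722          10.0.1.7:49213         ESTABLISHED     1432
 [DFSRs.exe]
  TCP    10.0.0.5:49155         10.0.2.9:5722          ESTABLISHED     1432
 [DFSRs.exe]
  TCP    10.0.0.5:445           10.0.0.20:51000        ESTABLISHED     4
 Can not obtain ownership information
"""

if __name__ == "__main__":
    for c in filter_by_exe(parse_netstat(SAMPLE), "dfsrs.exe"):
        print(c["local"], "->", c["remote"], c["state"], "PID", c["pid"])
```

To try it on real output, read the text of X:\info.txt and pass it to parse_netstat in place of SAMPLE. The file encoding may need adjusting depending on how the console wrote it.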
DFSR Event 5014 - DFS Replication service is stopping communication
Hi,
I seem to be having issues with the DFSR warnings in the event log. I receive the below warning every 5 minutes:
"The DFS Replication service is stopping communication with partner 'Servername' for replication group Domain System Volume due to an error. The service will retry the connection periodically.
Additional Information:
Error: 1726 (the remote procedure call failed.)"
The errors are immediately followed by an information entry (5004) stating that a connection was successfully established but then the warning repeats after 5 minutes again. Replication does actually seem to be working fine and the SYSVOL shares on both
domain controllers are identical.
I have run diagnostic reports from the DFS Management snapin and the only error reported is that
"The DFS Replication service is restarting frequently".
I have disabled TCP Offloading on the server as per other suggestions, which doesn't seem to have made a difference.
For reference, the domain controllers are in separate sites connected via site-to-site VPN. The AD sites are configured with the correct subnets and the WAN/VPN connection seems stable as I am getting consistent 87ms ping responses.
Any assistance would be greatly appreciated.
Thanks,
Charlie.
Although I haven't specifically run into this, doing a Bing search I have seen a common theme: if there is a firewall between the two, it drops the connection after a set period of inactivity. You might want to investigate that possibility.
http://faultbucket.ca/2011/02/dfsr-event-5014-the-remote-procedure-call-failed/
http://social.technet.microsoft.com/Forums/windowsserver/en-US/68c4f402-6c77-4388-9701-51a4fc112964/error-1726-the-remote-procedure-call-failed-every-7-minutes-dfsr-backlogs?forum=winserverDS
Paul Bergson
MVP - Directory Services
MCITP: Enterprise Administrator
MCTS, MCT, MCSE, MCSA, Security+, BS CSci
2008, Vista, 2003, 2000 (Early Achiever), NT4
Twitter @pbbergs
http://blogs.dirteam.com/blogs/paulbergson
Please no e-mails, any questions should be posted in the NewsGroup. This posting is provided "AS IS" with no warranties, and confers no rights.
-
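One low-risk way to look into the firewall idle-timeout idea from the reply above, without changing anything on live servers, is to check how evenly the 5014 warnings are spaced. The sketch below is only an illustration: the timestamps are hypothetical, the function names are made up, and the 10% tolerance is an arbitrary assumption. Even spacing would only hint that some timer (a firewall idle timeout or a service retry interval) is involved; it does not prove which one.

```python
from datetime import datetime
from statistics import median


def gaps_in_seconds(timestamps, fmt="%H:%M:%S"):
    """Return the gaps, in seconds, between consecutive same-day timestamps."""
    times = sorted(datetime.strptime(t, fmt) for t in timestamps)
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


def looks_periodic(gaps, tolerance=0.1):
    """True if every gap is within `tolerance` (as a fraction) of the median gap."""
    if len(gaps) < 2:
        return False
    mid = median(gaps)
    if mid <= 0:
        return False
    return all(abs(gap - mid) <= tolerance * mid for gap in gaps)


if __name__ == "__main__":
    # Hypothetical times of Event ID 5014 warnings copied from the event log.
    event_times = ["09:00:02", "09:05:01", "09:10:03", "09:15:02"]
    gaps = gaps_in_seconds(event_times)
    print("gaps (s):", gaps)
    print("median gap (s):", median(gaps))
    print("evenly spaced:", looks_periodic(gaps))
```

If the gaps turn out to be regular, comparing the median gap with the idle timeout configured on the firewall or VPN device would be the next thing to check.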
Microsoft Exchange Transport service terminated unexpectedly
Hi All,
We have configured Exchange 2013 on Windows Server 2012. For the last two days the Exchange Transport service has kept stopping.
Microsoft Exchange Transport service terminated unexpectedly.
Log Name: System
Source: Service Control Manager
Date: 11/13/2014 1:10:28 AM
Event ID: 7031
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: BMBPRMBX01.BMB.LOCAL
Description:
The Microsoft Exchange Transport service terminated unexpectedly. It has done this 1 time(s). The following corrective action will be taken in 5000 milliseconds: Restart the service.
Hi,
This issue may be due to a corrupt message in the queue. Please try these steps to solve this issue.
Pause the transport service and let all the messages go out.
When the queue is empty, stop the transport service and go to the mail.que database in the transport-->queue folder. Cut and paste all the files from the queue folder to some other folder as a backup.
Now start the transport service; it will create a fresh mail.que database. Observe whether the service is crashing again.
In addition, please try disabling any AV software, as it can sometimes cause problems with the Transport service.
Best Regards.
-
Noticed at about noon that no emails had been received all day. Began to investigate and found that the MS Exchange Transport service had been set to deny email submission because it was using too much memory on the server (91%).
The error message makes me think that we may have been getting used by malware or something similar. “The Microsoft Exchange Transport service is rejecting message submissions because the service continues to consume more memory than the
configured threshold.”
There are also several warning messages that list particular IP addresses and say that a connection from that IP was denied because there were already the maximum number of connections (20).
From what I can tell, all of the IP addresses are from Taiwan.
The time period for which some emails may be missing is from close of business yesterday ( 4/3/2014) through about 12:45 today (4/4/2014).
From the time I spent reading and trying to figure out the error, I think we may need to readjust our throttling policies to prevent this from happening.
The exchange server is currently running at 90%+ CPU and 50%+ memory usage the majority of the time, and I’m not sure how to fix it.
Also, I cannot get into EMS; I get an access denied message from the destination computer (the Exchange server). I want to get in there to change the throttling policy back to default, since we disabled it.
The Error reads:
The WinRM client cannot process the request. The WinRM client tried to use Kerberos authentication mechanism, but the destination computer <Exchange> returned an 'access denied' error. Change the configuration to allow Kerberos authentication
mechanism to be used or specify one of the authentication mechanism supported by the server. (How do I do this?) To use Kerberos, specify the local computer name as the remote destination. (I'm trying to use EMS while logged into the local Exchange server)
Also verify that the client computer and the destination computer are joined to a domain. (Exchange is on our domain, and the computer trying to connect is the same computer) To use basic, specify the local computer name as the remote destination, specify
Basic authentication and provide user name and password. Possible authentication mechanisms reported by server.
At line:1 char:1
+ New-PSSession -ConnectionURI "$connectionUri" -ConfigurationName Microsoft.Excha ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : OpenError: (System.Manageme....RemoteRunspace:RemoteRunspace) [New-PSSession], PSRemotingTransportException
+ FullyQualifiedErrorId : AccessDenied,PSSessionOpenFailed
I assumed control of this exchange system already in place and I do not have much experience with exchange 2013 or server 2012. I do know 2008, but that doesn't help very much in this situation.
Recent changes to the system:
About three days ago we switched our sessions policy to allow many more connections, and I believe this caused the issue. This is what I changed it to:
Made the registry DWORD (32-bit) "Maximum Allowed Sessions Per User" and modified the value to 1000. Location of registry change @ HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\MSExchangeIS\ParametersSystem
I just changed it to 10 from the 1000. I'm hoping this solves this. So far no.
Also, I am not the best in the shell or command line interfaces. Any help would be wonderful!
Hi,
Yes, this could be a hardware performance issue. Try recycling the Transport process and see if the issue persists.
Thanks,
Simon Wu
TechNet Community Support
-
"The Microsoft Exchange Diagnostics service terminated unexpectedly"
Hi all, this happens about twice per hour and started some days ago, no reconfigs etc that should make it do that as far as I can find. Have been running CU6 for some months without this happening until recently.
Also getting AppEventID 1007:
Failed to create or start performance logs with error: System.ArgumentException: Value does not fall within the expected range.
at PlaLibrary.DataCollectorSetClass.start(Boolean Synchronous)
at Microsoft.Exchange.Diagnostics.PerformanceLogger.PerformanceLogSet.StartLog(Boolean synchronous)
at Microsoft.Exchange.Diagnostics.PerformanceLogger.PerformanceLogMonitor.CheckPerflogStatus(). Performance log: ExchangeDiagnosticsPerformanceLog.
Found this article
http://blogs.technet.com/b/anya/archive/2014/11/25/microsoft-exchange-diagnostics-service-crashing-post-cu6-upgrade.aspx, but I do not actually understand what the "fix" really is here. Many steps are taken for Diagnostics, but which
of those are necessary? And I'm not so into exporting/importing those XMLs. I also have only one E2013 server.
Hope someone can clarify this. I was thinking of installing CU7 in case that fixes it, but I'm not sure I dare, given the risk of running into trouble with things not working as they should.
Thanks,
-Ray.
Hi,
Based on the description in that article, the root issue was caused by the upgrade. While upgrading to Exchange 2013 CU6, the upgrade made some modifications in the registry hives, and ExchangeDiagnosticsDailyPerformanceLog went missing from Task Scheduler and Performance
Monitor, and was also missing from the location C:\Windows\System32\Tasks\Microsoft\Windows\PLA
Afterwards he went through many troubleshooting steps for this issue.
Actually we don’t need to export/import templates key to XML file. We can just delete both ExchangeDiagnosticsDailyPerformanceLog and ExchangeDiagnosticsPerformanceLog entries from the above registry location and then reboot server. After rebooting server,
the new default ExchangeDiagnosticsDailyPerformanceLog will be recreated automatically in the task scheduler and performance monitor.
Here are my suggested steps for reference:
Ensure that the Templates key is present under
HKLM\Software\Microsoft\PLA\Templates
Navigate to
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Schedule\TaskCache\Tree\Microsoft\Windows\PLA\ExchangeDiagnosticsDailyPerformanceLog and delete the ExchangeDiagnosticsDailyPerformanceLog and ExchangeDiagnosticsPerformanceLog entries.
Reboot the server and check in the registry whether those two entries have been recreated.
Check whether the default ExchangeDiagnosticsDailyPerformanceLog was recreated in Task Scheduler and Performance Monitor.
If this issue persists, please let me know.
Best Regards.