Why finding replication stream matchpoint takes too long

Hi,
I am using BDB JE 5.0.58 HA (a two-node group, with a 6 GB JVM on each node).
Sometimes a BDB node takes too long to restart (about 2 hours).
When this occurs, I capture the stack of the BDB JVM process with jstack.
After analyzing the stacks, I found that "ReplicaFeederSyncup.findMatchpoint()" accounts for all of the time.
I want to know why this method takes so much time, and how I can avoid this bad case.
Thanks.
Post edited by liang_mic

Liang,
2 hours is indeed a huge amount of time for a node restart. It's hard to be sure what may be going wrong without a more detailed analysis of your log, but I do wonder if it is related to the problem you reported in "outOfMemory error presents when cleaner occurs" [#21786]. Perhaps the best approach is for me to describe in more detail what happens when a replicated node connects with a new master, which might give you more insight into what is happening in your case.
The members of a BDB JE HA replication group share the same logical stream of replicated records, where each record is identified with a virtual log sequence number, or VLSN. In other words, the log record described by VLSN x on any node is the same data record, although it may be stored in a physically different place in the log of each node.
When a replica in a group connects with a master, it must find a common point, the matchpoint, in that replication stream. There are different situations in which a replica may connect with a master. For example, it may have come up and just joined the group. Another case is when the replica is up already but a new master has been elected for the group. One way or another, the replica wants to find the most recent point in its log, which it has in common with the log of the master. Only certain kinds of log entries, tagged with timestamps, are eligible to be used for such a match, and usually, these are transaction commits and aborts.
Now, in your previous forum posting, you reported an OOME because of a very large transaction, so this syncup issue at first seems like it might be related. Perhaps your replication nodes need to traverse a great many records, in an incomplete transaction, to find the matchpoint. But the syncup code does not blindly traverse all records; it uses the vlsn index metadata to skip to the optimal locations. In this case, even if the last transaction was very long and incomplete, it should know where the previous transaction ended and find that location directly, without having to do a scan.
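To make the matchpoint idea concrete, here is a toy model in Java. It is only a sketch and is not JE's implementation: the Entry class, the log() helper, and the rule that two entries match when their VLSN and timestamp agree are all assumptions made for illustration. It shows why an index of sync-eligible entries (commits and aborts) lets the search jump from one candidate to the next instead of scanning every record of a long, incomplete transaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model only: none of these types are JE classes. */
final class MatchpointSketch {

    /** One replicated log entry (hypothetical shape). */
    static final class Entry {
        final long vlsn;         // position in the shared replication stream
        final boolean syncable;  // true for commit/abort style entries
        final long timestamp;    // assumed to be compared when matching

        Entry(long vlsn, boolean syncable, long timestamp) {
            this.vlsn = vlsn;
            this.syncable = syncable;
            this.timestamp = timestamp;
        }
    }

    /** Builds an index that holds only the sync-eligible entries, keyed by VLSN. */
    static NavigableMap<Long, Entry> syncIndex(List<Entry> log) {
        NavigableMap<Long, Entry> index = new TreeMap<>();
        for (Entry e : log) {
            if (e.syncable) {
                index.put(e.vlsn, e);
            }
        }
        return index;
    }

    /**
     * Visits the replica's sync-eligible entries from newest to oldest and
     * returns the highest VLSN that the master also holds with the same
     * timestamp, or -1 if the two logs share no such entry.
     */
    static long findMatchpoint(List<Entry> replicaLog, List<Entry> masterLog) {
        NavigableMap<Long, Entry> masterSync = syncIndex(masterLog);
        for (Entry candidate : syncIndex(replicaLog).descendingMap().values()) {
            Entry onMaster = masterSync.get(candidate.vlsn);
            if (onMaster != null && onMaster.timestamp == candidate.timestamp) {
                return candidate.vlsn;
            }
        }
        return -1;
    }

    /** Demo helper: VLSNs 1..lastVlsn, with commits at the given VLSNs. */
    static List<Entry> log(long lastVlsn, long... commitVlsns) {
        List<Entry> entries = new ArrayList<>();
        for (long v = 1; v <= lastVlsn; v++) {
            boolean isCommit = false;
            for (long c : commitVlsns) {
                if (c == v) {
                    isCommit = true;
                }
            }
            entries.add(new Entry(v, isCommit, v * 10));
        }
        return entries;
    }
}
```

With a master built as log(10, 3, 6, 9) and a replica built as log(8, 3, 6), findMatchpoint(replica, master) returns 6: the replica's trailing entries 7 and 8 belong to an unfinished transaction and are never visited, because only the indexed commits are candidates.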
On a possibly related note, I did wonder if something was unusual about your vlsn index metadata. I did not explain this in "outOfMemory error presents when cleaner occurs", but I later calculated that the transaction which caused the OOME should have contained only about 1,500 records. I think you said that you were deleting about 15 million records, and that you figured out it was the vlsn index update transaction which was holding many locks. But because the vlsn index does not record every single record, it should take only about 1,500 metadata records in the vlsn index to cover 15 million application data records. It is still a bug in our code to update that many records in a single transaction, but the OOME was surprising, because 1,500 locks shouldn't be catastrophic.
There are a number of ways to investigate this further.
- You may want to try using a SyncupProgress listener described at http://docs.oracle.com/cd/E17277_02/html/java/com/sleepycat/je/rep/SyncupProgress.html to get more information on which part of the syncup process is taking a long time.
- If that confirms that finding the matchpoint is the problem, we have an unadvertised utility, meant for debugging, to examine the vlsn index. Its usage is shown below; you would use the -dumpVLSN option and run it on the replica node. Interpreting the results would require our assistance. We would look for the records that mention where the "sync" points are and correlate them with the replica's log, which might tell us whether this is indeed the problem and why the vlsn index was not optimizing the search.
$ java -jar build/lib/je.jar DbStreamVerify
usage: java { com.sleepycat.je.rep.utilint.DbStreamVerify | -jar je-<version>.jar DbStreamVerify }
-h <dir> # environment home directory
-s <hex> # start file
-e <hex> # end file
-verifyStream # check that replication stream is ascending
-dumpVLSN # scan log file for log entries that make up the VLSN index, don't run verify.
-dumpRepGroup # scan log file for log entries that make up the rep group db, don't run verify.
-i # show invisible. If true, print invisible entries when running verify mode.
-v # verbose
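For the SyncupProgress suggestion above, a listener that records how long each syncup phase lasts could look roughly like the sketch below. It is deliberately independent of je.jar so it can be run on its own: PhaseTimer and the DemoPhase enum are hypothetical stand-ins, and DemoPhase only mirrors the constants I believe SyncupProgress defines. The assumption to verify against the Javadoc for your JE version is that the real listener implements com.sleepycat.je.ProgressListener<SyncupProgress>, whose progress(phase, n, total) method has the same shape as the one here, and that it is registered through ReplicationConfig.setSyncupProgressListener before the environment is opened.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/** Stand-in for the JE SyncupProgress enum (assumed constants, for the demo only). */
enum DemoPhase { FIND_MATCHPOINT, CHECK_FOR_ROLLBACK, DO_ROLLBACK, END }

/** Records the elapsed time of each phase as the phase changes. */
final class PhaseTimer<T extends Enum<T>> {
    private final LongSupplier clockMillis;          // injectable clock, e.g. System::currentTimeMillis
    private final List<String> report = new ArrayList<>();
    private T current;                               // phase currently in progress
    private long startedAt;                          // clock value when that phase began

    PhaseTimer(LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
    }

    /**
     * Same shape as a progress callback: called with the current phase.
     * Returns true so that the monitored operation continues.
     */
    public boolean progress(T phase, long n, long total) {
        long now = clockMillis.getAsLong();
        if (phase != current) {
            if (current != null) {
                report.add(current + " took " + (now - startedAt) + " ms");
            }
            current = phase;
            startedAt = now;
        }
        return true;
    }

    /** One line per completed phase, in the order the phases finished. */
    List<String> report() {
        return report;
    }
}
```

If FIND_MATCHPOINT dominates the report, that supports the jstack observation, and the -dumpVLSN output would be the next thing to look at.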

Similar Messages

  • Why does my iphone 4 take soooo long to stream video?

    Why does my iphone 4 take soooo long to stream a video???  I will have an iphone 3GS next to it, and we will play a youtube video at the same time, and the 3GS goes significantly faster.  Any advice??

    Bad cable? Bad USB port? Something corrupt in the album itself?
    We need more info as to what your configuration is and what you've tried.

  • I don't know why it takes too long time to sample flat file.

    I don't know why it takes too long time to sample flat file.
    OWB Client 10.1
    While importing a flat file of fixed width ,
    in the screen "Flat File Sample Wizard" shows the text box number of rows with default value 200.
    I want to extend this value to 700,000.
    But, it takes too long time (over 5 hours) to sample it.
    Do you know why this happens, or how I can fix this problem?
    Thanks in Advance.
    Regards,
    JWS.

    Hello,
    Actually flat file sampling process’ goal is to capture the structure of the file. That’s why initially the sample size is set to 200 lines.
    The question is why you are trying to perform sampling by 700000 rows? Are you expecting some change in structure beyond this mark?
    If so, and you want to capture the fact that your source file is multi-typed, you had better prepare a small file for sampling outside OWB.
    Sergey

  • Why does it take too long to open attachments from email account?

    It takes forever to download attachments from my email account. I have tried to do the following tasks explained on the vista forum :http://windowshelp.microsoft.com/Windows/en-US/help/6b046ae9-1434-4423-9303-400ff6fe686b1033.mspx#ESD but none of the possible fixes work.
    After clicking Open in the pop-up box asking whether I want to open or save the attachment, it takes too long to download. The transfer window stays open showing that it is ready to download, but it stays at that box. I press Cancel and try to open it again; if I am lucky the file opens, otherwise it takes forever, forcing me to cancel. The files are very small most of the time, usually around 50 KB, so they should take seconds.
    I have even tried to save the files but again same process. The transfer box stays open but does not download.
    Any one any ideas.
    Thanx in advance.

    Hello
    It is not easy to say what happen exactly but it must be something with email account provider and their page. For me this case is not some typical Vista problem but you can try to find solution on Microsoft Vista IT Pro forum.
    By the way: do you have alternative mail address by some other provider? Is there the same situation?

  • Why ipad 2 / iphone 3g resetting takes too long?

    I'm trying to reset and delete all content on my iPad and iPhone 3G by choosing the reset options under General. Why does the reset take so long? For almost 2 days my iPad has been stuck on the Apple logo with the looping circle, and I'm still waiting for my iPad / iPhone 3G to return to the main screen. What should I do? Thanks...

    There is no other way except to restore the iPad - plain and simple. You have to restore the device within iTunes. You want to use the same computer that you always sync with so that you can restore your app data and settings. You can restore with any other computer, but you will lose everything on the iPad.
    You will need to use recovery mode
    iPad: Unable to update or restore

  • Indexing and categorization takes too long - why?

    I have set up a news publishing system where journalists access a folder and publish their news there using an xml form. So far so good.
    Readers of the news access a page with a km-navigation iview that points to a taxonomy folder. The query based taxonomy is set up to categorize news based on property-values chosen by the journalist in the xml-form. This also works as supposed.
    Lifetime (time based publishing) is activated for the folder the journalists access to publish their news. Corresponding start and end dates/times for setting lifetime for an article is set by the journalists. This also works as supposed.
    BUT: after saving each article the system takes awfully long to actually categorize the news and thereby make the articles visible for others than the journalists. It might be as long as half an hour or so. This also repeats itself every time an article is edited.
    I also feel that basic indexing of all other documents in the portal takes too long. I want all new documents to be indexed as soon as they're saved.
    Any tips?
    Henning

  • [SOLVED] initramfs takes too long to load

    Using systemd-analyze I found out that initramfs takes too long to load:
    463ms (kernel) + 11875ms (initramfs) + 6014ms (userspace) = 18353ms
    My HOOKS array in mkinitcpio.conf is the following:
    HOOKS="base udev autodetect modconf block encrypt lvm2 filesystems usbinput fsck"
    I suspect that the long loading time of initramfs is caused by partitions decryption (I am using dm-crypt / LUKS on top of LVM).
    Is there any tool that can report loading times of HOOKS separately, just like systemd-analyze plot does for userspace?
    Last edited by nasosnik (2013-01-21 14:45:28)

    cfr wrote:
    In what sense is it "too long"?
    I'm just wondering: suppose that you find out that it is because you are using encryption. Would you then switch to a non-encrypted system? Would you make better use of the extra seconds you might save on those rare occasions when you reboot? Even if you reboot twice a day, you might save what? Suppose you would even save 5s per boot. That will give you a whole extra 1 minute and 10 seconds a week. Assuming you don't multitask. Obviously if you multitask, the gain will be less. Would that make it worth risking the security of your data?
    EDIT: I didn't mean this to sound as confrontational as it does now I read it back. It just always puzzles me that people are so concerned about shaving a few milliseconds here and there. I always hope that they put the time they save to good use but then I realise that the time they spent shaving the milliseconds off will obviously outstrip the time saved.
    I really don't care about the boot time, for the reasons you have already mentioned. I just want to figure out if there is any misconfiguration. I am investigating why initramfs takes significantly longer to load than on my desktop Arch installation (non-encrypted), which needs 1316ms for initramfs. My desktop has a Pentium 4 CPU and the laptop has a quad-core i7.
    roentgen wrote:11875ms (initramfs)  means the time it takes you to type the password.
    systemd-analyze is not counting the time is spent to type the password.

  • AME CS6 rendering with AE and Pr takes too long

    Hi Guys,
    Need some help here. I rendered a 30-second MP4 video (1920 x 1080 HD, 25 frames, without scripting) in AME and it took 4 hours!
    Why does it take so long? I have rendered a 2-minute video in the same format with scripting, and it took less than 30 minutes to render.
    I'm using After Effects and Premiere Pro, both CS6, with Dynamic Link in AME.
    What seems to be wrong in my current settings?
    Any help would be appreciated.
    Thanks!

    This may be a waste of time, but it won't take a minute and is something you should always do whenever things go strangely wrong  ............ trash the preferences, assuming you haven't done it already.
    Many weird things happen as a result of corrupt preferences which can create a vast range of different symptoms, so whenever FCP X stops working properly in any way, trashing the preferences should be the first thing you do using this free app.
    http://www.digitalrebellion.com/prefman/
    Shut down FCP X, open PreferenceManager and in the window that appears:-
    1. Ensure that only  FCP X  is selected.
    2. Click Trash
    The job is done instantly and you can re-open FCP X.
    There is absolutely no danger in trashing preferences and you can do it as often as you like.
    The preferences are kept separately from FCP X and if there aren't any when FCP X opens it automatically creates new ones  .  .  .  instantly.

  • My Query takes too long ...

    Hi ,
    Env   , DB 10G , O/S Linux Redhat , My DB size is about 80G
    My query takes too long ,  about 5 days to get results , can you please help to rewrite this query in a better way ,
    declare
    x number;
    y date;
    START_DATE DATE;
    MDN VARCHAR2(12);
    TOPUP VARCHAR2(50);
    begin
    for first_bundle in
    select min(date_time_of_event) date_time_of_event ,account_identifier  ,top_up_profile_name
    from bundlepur
    where account_profile='Basic'
    AND account_identifier='665004664'
    and in_service_result_indicator=0
    and network_cause_result_indicator=0
    and   DATE_TIME_OF_EVENT >= to_date('16/07/2013','dd/mm/yyyy')
    group by account_identifier,top_up_profile_name
    order by date_time_of_event
    loop
    select sum(units_per_tariff_rum2) ,max(date_time_of_event)
    into x,y
    from OLD_LTE_CDR
    where account_identifier=(select first_bundle.account_identifier from dual)
    and date_time_of_event >= (select first_bundle.date_time_of_event from dual)
    and -- no more than a month
    date_time_of_event < ( select add_months(first_bundle.date_time_of_event,1) from dual)
    and -- finished his bundle then buy a new one
      date_time_of_event < ( SELECT MIN(DATE_TIME_OF_EVENT)
                             FROM OLD_LTE_CDR
                             WHERE DATE_TIME_OF_EVENT > (select (first_bundle.date_time_of_event)+1/24 from dual)
                             AND IN_SERVICE_RESULT_INDICATOR=26);
    select first_bundle.account_identifier ,first_bundle.top_up_profile_name
    ,FIRST_BUNDLE.date_time_of_event
    INTO MDN,TOPUP,START_DATE
    from dual;
    insert into consumed1 VALUES(X,topup,MDN,START_DATE,Y);
    end loop;
    COMMIT;
    end;

    > where account_identifier=(select first_bundle.account_identifier from dual)
    Why are you doing this?  It's a completely unnecessary subquery.
    Just do this:
    where account_identifier = first_bundle.account_identifier
    Same for all your other FROM DUAL subqueries.  Get rid of them.
    More importantly, don't use a cursor for loop.  Just write one big INSERT statement that does what you want.

  • Unlike IE, when the back button is pressed it takes too long. Please do something about that. Thanks.

    Unlike IE, when the back button is pressed it takes too long. Please do something about that. I like Firefox and I use the back button quite often.
    thnx

    In order to be able to find the correct solution to your problem, we require some more non-personal information from you. Please do the following:
    *Click the Firefox button at the top left, then click the ''Help'' menu and select ''Troubleshooting Information'' from the submenu. If you don't have a Firefox button, click the Help menu at the top and select ''Troubleshooting Information'' from the menu.
    Now, a new tab containing your troubleshooting information should open.
    *At the top of the page, you should see a button that says "Copy text to clipboard". Click it.
    *Now, go back to your forum post and click inside the reply box. Press Ctrl+V to paste all the information you copied into the forum post.
    If you need further information about the Troubleshooting information page, please read the article [[Use the Troubleshooting Information page to help fix Firefox issues]].
    Thanks in advance for your help!

  • Business Rules Project Takes Too Long to Open

    Does anyone know why it takes so long (~3-5 minutes) to open/edit a security project defined for assigning business rules to planning application forms? We are using Hyperion v11.1.1.3.0. Essbase is on a Windows server, Shared Services on Solaris 10 Unix. Even before we migrated Essbase to Windows to gain better performance running calcs, opening projects using EAS was always very slow. Please advise if there is a way to improve performance on this.

    Clear Cookies & Cache
    * https://support.mozilla.com/en-US/kb/Template:clearCookiesCache
    Clear the Network Cache
    * https://support.mozilla.com/en-US/kb/How%20to%20clear%20the%20cache#w_clear-the-cache
    Firefox takes a long time to start up
    * https://support.mozilla.com/en-US/kb/firefox-takes-long-time-start-up
    Check and tell if its working.

  • RTF export takes too long

    Hi all,
    Do you know why a report with a lot of data and many pages (about 300) exports to PDF quickly (1 min), but exporting it to RTF takes too long (even 10 minutes) and sometimes even times out?
    Any ideas?
    Thanks

    re: Paul.  Not true, or shouldn't be.  XDCAM HD timeline, XDCAM HD output.
    re: Michael.  Well, close.  I mixdown the edited timeline, then place that into another sequence where I apply a Broadcast Safe filter and hit it with a bit of audio compression.  I render that, ridding myself of the dreaded "red lines".  However, when I splat-E, I get red lines all over again while I'm exporting.  When the export is done, the red lines disappear.  Now as an aside, I do experience actual "missing" render files, but that occurs when I quit a project and then re-open it later...they sometimes don't seem to re-link.  As I mentioned in this post, I have a feeling that's because this system is set up in the worst way possible (a single RAID5 which acts as system drive and media storage).
    Don't blame me, I'm just trying to work within the confines of what I'm given. My major concern here is that it seems to be getting worse.

  • Takes too long to hibernate when I close the lid - Also random device noise when it boots up

    Hello guys.
    Ever since i've wiped the machine, i've been having these two problems. When I close the lid, it used to go to sleep straight away, but now I can see the sleep light (and the power button) flash and flash, then it goes to sleep.
    When waking up it goes through the lenovo startup screen and resuming windows, then it asks for a password, before I used to open the lid and it would ask me for the password straight away. I know it was going to sleep because I could hear the beep straight away when I closed and opened it, but now it just takes too long.
    Also, everytime I boot up into windows or resume into windows from a sleep state, I can hear the device noise, like something is being plugged in/out. But there is nothing being plugged in or out at the time. I can't get to device manager quick enough to see what it is that is causing it. 
    But all drivers seem okay.
    Thanks in advance.
    Sam.
    EDIT: Also noticed, when the lid is closed, randomly the laptop turns off (hear the beep) and then turns back off again.
    Weird.
     T420 model number: 4180-PR1 with OS: Windows 7 Pro 64 bit

    Hi Sam,
    is this to do with the T420 model number: 4180-PR1 with OS: Windows 7 64 bit installed on it as in another thread you posted in?
    Maybe you could pop the information into your signature; members like to know which system and OS are involved.  At the top next to Sign Out choose   My Settings > Personal Profile > Personal Information - Signature
    Andy

  • 11gR2:crsctl, srvctl commands takes too long to respond

    Hi,
    I have successfully configured *11gR2 two node RAC on ASM on Win 2008 64bit.*
    Everything work very well like connecting to database, querying database. Node restart also takes acceptable time to go down & restart the clusterware & database.
    But when I execute crsctl status resource -t or srvctl status database -d db_name commands from any node takes 10-15min to give output.
    They give output & everything completes successfully but takes too long to respond.
    The questions are:
    - If everything works fine then why crsctl, srvctl takes too long to respond?
    - what could be blocking these command to gather clusterware and database status on all nodes?
    - what additional info would be helpful that I can provide?

    dag wrote:
    I dont have this issue either. are you auto starting mpd?
    that time is how long it is up in other words you opened it then closed it at that time
    I'm not sure if you are referring to me or not, but in my screenshot I am timing the lag in ncmpcpp by pressing 'q' in the terminal during the delay, so it quits ncmpcpp immediately after the lag. The lag is longer for ncmpc because the interface loads before the program connects to mpd, so I have to stop it manually immediatly after it connects.
    WonderWoofy wrote:I don't have this problem... if you are starting it as a systemd user service, maybe there is relevant information in the journal.
    The journal did not reveal anything relevant sadly. I have now tried launching mpd without systemd, and the lag remains. I have also noticed that a small mpd programming project I am writing also experiences the same lag when it tries to connect to mpd.

  • HT4352 apple tv takes too long to load photos

    I am running iTunes (11.1.5.5) on Windows 7 and have setup home sharing and accessing the Photo Library from Apple TV 3rd Generation.
    We have a lot of photos (approx. 21000) organized in folders, e.g.:
    Main Photo Folder
    Folder 1 (5500 photos)
    F1-Sub 1
    F1-Sub 2
    F1-Sub 3
    Folder 2 (3500 photos)
    F1-Sub 1
    F1-Sub 2
    F1-Sub 3
    Folder 3 (12000 photos)
    F1-Sub 1
    F1-Sub 2
    F1-Sub 3
    Apple TV doesn't show the sub-folders beyond the first set of folders and as a result it takes too long to load our list of photos, is there anything that can be done to fix this?
    Thanks!

    I am having exactly the same issue. Moved house and now starting a movie takes minutes, not seconds... I've got both the Mac and ATV connected via ethernet to my Airport Extreme base station, but have no explanation for this bizarre slowdown. I also don't have internet currently, but I can't understand with content not purchased on iTunes, why that would make any difference...?
