What is the best practice for deleting a large number of records?

Hi,
I need your suggestions on the best practice for regularly deleting a large number of records from a SQL Azure database.
Scenario:
I have a SQL Azure database (P1) into which I insert data every day. To keep the database from growing too fast, I need to remove all records older than 3 days, once a day.
For on-premises SQL Server I could use a SQL Server Agent job, but since SQL Azure does not support SQL Agent jobs yet, I use a WebJob scheduled to run every day to delete the old records.
To avoid locking the table by deleting too many records at once, my automation/WebJob code limits each call of the purge stored procedure to a maximum of
5000 records, deleted in batches of 1000:
1. Get the total number of old records (older than 3 days)
2. Compute the number of iterations: iterations = (total count / 5000)
3. Call the stored procedure in a loop:
for (int i = 0; i < iterations; i++)
   Exec PurgeRecords @BatchCount=1000, @MaxCount=5000
And the stored procedure is something like this:
 -- Collect up to @MaxCount candidate keys (older than 3 days) into a table variable
 DECLARE @table TABLE ([RecordId] BIGINT PRIMARY KEY)   -- key type assumed
 INSERT INTO @table ([RecordId])
 SELECT TOP (@MaxCount) [RecordId] FROM [MyTable] WHERE [CreateTime] < DATEADD(DAY, -3, GETDATE())
 -- Delete the collected keys from the base table in batches of @BatchCount
 DECLARE @RowsDeleted INTEGER
 SET @RowsDeleted = 1
 WHILE (@RowsDeleted > 0)
 BEGIN
  WAITFOR DELAY '00:00:01'
  DELETE TOP (@BatchCount) FROM [MyTable] WHERE [RecordId] IN (SELECT [RecordId] FROM @table)
  SET @RowsDeleted = @@ROWCOUNT
 END
It basically works, but the performance is poor. For example, it took around 11 hours to delete about 1.7 million records, which is far too long.
Here is the WebJob log for deleting those ~1.7 million records:
[01/12/2015 16:06:19 > 2f578e: INFO] Start getting the total counts which is older than 3 days
[01/12/2015 16:06:25 > 2f578e: INFO] End getting the total counts to be deleted, total count:
1721586
[01/12/2015 16:06:25 > 2f578e: INFO] Max delete count per iteration: 5000, Batch delete count
1000, Total iterations: 345
[01/12/2015 16:06:25 > 2f578e: INFO] Start deleting in iteration 1
[01/12/2015 16:09:50 > 2f578e: INFO] Successfully finished deleting in iteration 1. Elapsed time:
00:03:25.2410404
[01/12/2015 16:09:50 > 2f578e: INFO] Start deleting in iteration 2
[01/12/2015 16:13:07 > 2f578e: INFO] Successfully finished deleting in iteration 2. Elapsed time:
00:03:16.5033831
[01/12/2015 16:13:07 > 2f578e: INFO] Start deleting in iteration 3
[01/12/2015 16:16:41 > 2f578e: INFO] Successfully finished deleting in iteration 3. Elapsed time:
00:03:33.6439434
Per the log, SQL Azure takes more than 3 minutes to delete 5000 records in each iteration, and the total run takes around
11 hours.
Any suggestions for improving the delete performance?

This is one approach:
Assume:
1. There is an index on [CreateTime].
2. The peak insert rate is roughly N times the average rate (avgN vs. avg). For example, if the average is 10,000 rows per hour and the peak is 5 times that, the peak is 50,000 rows per hour. This does not have to be precise.
3. The desired maximum number of records deleted per batch is about 5,000; this does not have to be exact either.
Steps:
1. Find the count of records more than 3 days old (TotalN), say 1,000,000.
2. Dividing TotalN (1,000,000) by 5,000 gives the number of delete batches (200) if inserts were perfectly even. Since they are not, and peak inserts can be 5 times the average, set the number of delete batches to 200 * 5 = 1,000.
3. Dividing 3 days (4,320 minutes) by 1,000 batches gives a time slice of 4.32 minutes per batch.
4. Create a delete statement and a loop where iteration I (I running from 1 to 1,000) deletes records with CreateTime earlier than (3 days ago minus (4,320 - 4.32 * I) minutes), i.e. the cutoff starts roughly 3 days before the 3-day boundary and advances by one 4.32-minute slice per iteration until it reaches the boundary.
This way the number of records deleted in each batch is uneven and not known in advance, but it should mostly stay within about 5,000; you run many more batches, but each batch is a fast range delete on the CreateTime index.
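For illustration, a minimal T-SQL sketch of this rolling-cutoff idea (not tested; the table and column names come from the question, and it assumes the backlog of old rows spans at most about 3 days - otherwise start the cutoff at the oldest CreateTime instead):
 DECLARE @EndCutoff DATETIME = DATEADD(DAY, -3, GETDATE())  -- keep the last 3 days
 DECLARE @Slices    INT      = 1000                         -- number of delete batches (step 2)
 DECLARE @SliceMin  FLOAT    = 4320.0 / @Slices             -- 4.32 minutes per slice (step 3)
 DECLARE @Cutoff    DATETIME
 DECLARE @I         INT      = 1
 WHILE @I <= @Slices
 BEGIN
  -- cutoff starts ~3 days before the 3-day boundary and advances one slice per iteration
  SET @Cutoff = DATEADD(SECOND, -CAST((4320 - @SliceMin * @I) * 60 AS INT), @EndCutoff)
  DELETE FROM [MyTable] WHERE [CreateTime] < @Cutoff  -- simple range seek on the CreateTime index
  SET @I = @I + 1
 END
Because every DELETE is an open-ended range predicate on [CreateTime], no key list has to be collected first, and the cost of each batch is bounded by however many rows were inserted during that 4.32-minute slice.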
Frank

Similar Messages

  • What is the best way to migrate large amounts of data from a G3 to an Intel Mac?

    I want to help my mom transfer her photos and other info from an older blueberry G3 iMac to a new Intel one. There appears to be no migration provision on the older Mac. Also, the FireWire connections are different. Somebody must have done this before.

    Hello
    the cable above can be used to enable Target Disk mode for data transfer
    http://support.apple.com/kb/ht1661 for more info
    To enable Target Disk mode, hold down the "T" key on the keyboard just after the startup sound until you see the FireWire symbol on the screen (it looks like a screen saver), then plug the FireWire cable between the two Macs.
    HTH
    Pierre

  • What is the best practice for inserting (unique) rows into a table containing key columns constraint where source may contain duplicate (already existing) rows?

    My final data table contains a two key columns unique key constraint.  I insert data into this table from a daily capture table (which also contains the two columns that make up the key in the final data table but are not constrained
    (not unique) in the daily capture table).  I don't want to insert rows from daily capture which already exists in final data table (based on the two key columns).  Currently, what I do is to select * into a #temp table from the join
    of daily capture and final data tables on these two key columns.  Then I delete the rows in the daily capture table which match the #temp table.  Then I insert the remaining rows from daily capture into the final data table. 
    Would it be possible to simplify this process by using an Instead Of trigger in the final table and just insert directly from the daily capture table?  How would this look?
    What is the best practice for inserting unique (new) rows and ignoring duplicate rows (rows that already exist in both the daily capture and final data tables) in my particular operation?
    Rich P

    Please follow basic Netiquette and post the DDL we need to answer this. Follow industry and ANSI/ISO standards in your data. You should follow ISO-11179 rules for naming data elements. You should follow ISO-8601 rules for displaying temporal data. We need
    to know the data types, keys and constraints on the table. Avoid dialect in favor of ANSI/ISO Standard SQL. And you need to read and download the PDF for: 
    https://www.simple-talk.com/books/sql-books/119-sql-code-smells/
    >> My final data table contains a two key columns unique key constraint. [unh? one two-column key or two one column keys? Sure wish you posted DDL] I insert data into this table from a daily capture table (which also contains the two columns that make
    up the key in the final data table but are not constrained (not unique) in the daily capture table). <<
    Then the "capture table" is not a table at all! Remember the fist day of your RDBMS class? A table has to have a key.  You need to fix this error. What ETL tool do you use? 
    >> I don't want to insert rows from daily capture which already exists in final data table (based on the two key columns). <<
    MERGE statement; Google it. And do not use temp tables. 
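    For illustration only, a minimal MERGE sketch along those lines (untested, and every table and column name here is a placeholder, since no DDL was posted):
    MERGE INTO FinalData AS tgt
    USING (SELECT DISTINCT key_col_1, key_col_2, other_col
           FROM DailyCapture) AS src          -- DISTINCT also collapses duplicates inside the capture table
       ON  tgt.key_col_1 = src.key_col_1
      AND  tgt.key_col_2 = src.key_col_2
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (key_col_1, key_col_2, other_col)
        VALUES (src.key_col_1, src.key_col_2, src.other_col);
    Only rows whose two-column key is absent from the final table are inserted; keys that already exist are left untouched, so no #temp table or pre-delete step is needed.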
    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking
    in Sets / Trees and Hierarchies in SQL

  • What are the best practices for generating an EPS logo from InDesign?

    Our customer is running into technical issues with the logo we sent them, which was exported from InDesign. Images were not embedded and fonts were missing. I was able to embed the images and fonts. However, we DO NOT want them to be able to make any text changes. So after exporting an EPS, I opened the file in Adobe Illustrator and converted all the text to outlines. I hope this works. But I just wanted to ask what the best practices for doing this are.
    The client needs the logo with a transparent background, images embedded and type in outlines. Also, they need some space around the text. When I exported the EPS, the file was cropped right up to the edge of the type.

    It sounds like you are pretty far from "best practice" with regard to logo design and delivery.
    These days, the very use of the EPS format should be considered bad practice, and some of the other terms in your post (e.g., 'images,' 'missing fonts') make it sound like there is not a seasoned logo designer involved.
    That said, you probably already got the advice you need to get out of the immediate jam. However, without proper logo design, you and the client will soon be facing other problems. You should be delivering a 100% vector graphic in single-color (black) and corporate-color(s) versions, with no live font data, that has been test-scaled to very small and very large sizes; ensuring it will work at postage-stamp size and on the side of a truck or building, with specific spot color(s) and proportions that will enable it to be offset printed, embroidered and screen-printed on apparel, and cut into signage materials and decals.

  • Terabyte Plus Libraries - What is the Best Practice?

    In my current Aperture library I have 150 GB, and I have only been using it for about a year. Before that I used iPhoto for a short time, which still has a 60 GB library, and before that I stored my images in file folders on a Windows server, where I have around 500+ GB. Then there are those non-electronic images which should be scanned in one day.....
    My hope was to import everything into one tool and eventually have better organization and management of my images. At the rate I am shooting now it won't be long before I break the terabyte mark, and as I try to pull all these sources together I am wondering what the best practice is.
    I know I can have more than one library now with Aperture; do folks manage their libraries by themes? Weddings, Family, Commercial, etc.? I just picked up a 2 terabyte drive to start moving stuff off my MacBook, but am not sure if I should use the Archive tool to do this or break my images into libraries, store them there, and just keep a working library on my Mac.
    Within Aperture I am using the Project -> Album hierarchy to manage my shoots now as well.
    Also, I don't have a ton of video yet, but have started shooting a little, plus have been making slideshows and books now, so I need to start planning for that as well. Just wondering what is the best, most efficient way of large data management with Aperture.
    Thanks!

    The solution is to avoid the (unfortunately default) Managed Masters and instead use a Referenced Masters Library kept on an internal drive. Back up originals prior to importing and keep Masters off the Library drive. That way the Aperture Library will remain small enough to live on a standard internal drive without overfilling it.
    Note that working drives (as opposed to backup-only drives) should not be allowed to exceed ~70% full for ideal speed and stability.
    Multiple Libraries are almost always poor image management in a digital world, unless all rights to the images, including the right to simply view any image (such as security work), belong exclusively to the client. Using multiple Libraries is a big step backward into film-think and significantly limits the power of digital image management.
    -Allen

  • What are the best practices to connect 30-40 iPads to Wi-Fi in a single room?

    What are the best practices to connect 30-40 iPads to Wi-Fi in a single room?

    I don't use it but it does say this in the help section...

  • What are the best practices to migrate VPN users for an inter-forest migration?

    What are the best practices to migrate VPN users for an inter-forest migration?

    It depends on various factors. There is no "generic" solution or best practice recommendation. Which migration tool are you planning to use?
    Quest (QMM) has a VPN migration solution/tool.
    ADMT - you can develop your own service-based solution if required. I believe this was mentioned in my blog post.
    Santhosh Sivarajan | Houston, TX | www.sivarajan.com
    ITIL,MCITP,MCTS,MCSE (W2K3/W2K/NT4),MCSA(W2K3/W2K/MSG),Network+,CCNA
    Windows Server 2012 Book - Migrating from 2008 to Windows Server 2012
    This posting is provided AS IS with no warranties, and confers no rights.

  • What are the best practices to replace a disk in a 6140?

    What are the best practices to replace a disk in a 6140?
    Regards

    The best way is to follow CAM Service Advisor instructions.

  • What is the best practice for changing view states?

    I have a component with two Pie Charts that display
    percentages at two specific dates (think start and end values).
    But, I have three views: Start Value only, End Value only, or show
    Both. I am using a ToggleButtonBar to control the display. What is
    the best practice for changing this kind of view state? Right now
    (since this code was inherited), the view states are changed in an
    ActionScript function which sets the visible and includeInLayout
    properties on each Pie Chart based on the selectedIndex of the
    ToggleButtonBar, but, this just doesn't seem like the best way to
    do this - not very dynamic. I'd like to be able to change the state
    based on the name of the selectedItem, in case the order of the
    ToggleButtons changes, and since I am storing the name of the
    selectedItem for future reference.
    Would using States be better? If so, what would be the best
    way to implement this?
    Thanks.

    I would stick with non-states, as I have always heard that
    states are more for smaller components that need to change under
    certain conditions, like a login screen that changes if the user
    needs to register.
    That said, if the UI of what you are dealing with is not
    overly complex, and if it will not become overly complex, maybe
    states is the way to go.
    Looking at your code, I don't think you'll save much in terms
    of lines of code.

  • What is the best practice in securing deployed source files

    hi guys,
    Just yesterday, I developed a simple image cropper using Ajax
    and Flash. After compiling the package, I noticed the
    package/installer delivers the exact same source files as
    developed to the installed folder.
    This didn't concern me much at first, but come to think of
    it, this question keeps coming to mind:
    "What is the best practice in securing deployed source
    files?"
    How do we protect an application's installed source files from
    being tampered with, especially after the application has been
    installed? For example, a file like spraydata.js
    can easily be modified with an editor.

    Hi,
    You could compute a SHA or MD5 hash of your source files on
    first run and save these hashes to EncryptedLocalStore.
    On startup, recompute and verify. (This, of course, fails to
    address when the main app's swf / swc / html itself is
    decompiled)

  • What is the best practice to display info of completed task in process flow

    Hi all,
    I'm starting to study BPM modeling with CE 7.1 EHP1. Thanks to the tutorials and examples on the SDN site, I can easily build my own process in NWDS, deploy it to the server, start it and finish it.
    I like the new runtime, which can show a BPMN diagram to the processors. However, I can't find a way to let the follow-up processor review the task results completed in the previous step. I'm more familiar with Guided Procedures, and know there is a "Display Callable Object" which can be used to show some info of a completed task when the processor/owner/admin/overseer clicks on a completed task. Where is this feature in BPM? What is the best practice for showing such task information in the BPM environment?
    For example, in a multi-level approval process, the higher-level approver needs to know the comment written by the previous approver. Can he read this information from the process flow?
    I think this is a very important feature for a BPM platform. In Guided Procedures, such a requirement can be met with Display Callable Object + View Permission, and you just need some coding for the UI. If BPM is superior to GP, there must be a way to achieve this; I just do not know how.
    Can anyone shed some light on this?

    Oliver,
    Thanks for your quick reply.
    Yes, Notes and Attachments CAN BE USED for this purpose. But I'm still looking for a more elegant solution.
    With the Notes/Attachments solution, the processor needs to provide input in two places, the task UI and the Note/Attachment, with similar or identical data. It is really annoying.
    Are there any real-world SAP BPM deployments? Has no customer had this requirement?

  • What is the best practice for full browser video to achieve the highest quality?

    I'd like to get your thoughts on the best way to deliver full-browser (scale to the size of the browser window) video. I'm skilled in the creation of the content but learning to make the most out of Flash CS5 and would love to hear what you would suggest.
    Most of the tutorials I can find on full browser/scalable video are for earlier versions of Flash; what is the best practice today? Best resolution/format for the video?
    If there is an Adobe guide to this I'm happy to eat humble pie if someone can redirect me to it; I'm using CS5 Production Premium.
    I like the full-screen video effect they have on the "Sounds of Pertussis" website; this is exactly what I'm trying to create, but I'm not sure of the best way to approach it - any hints/tips you can offer would be great.
    Thanks in advance!

    Use the little squares over your video to mask the quality. Sounds of Pertussis is not full screen video, but rather full stage. Which is easier to work with since all the controls and other assets stay on screen. You set up your html file to allow full screen. Then bring in your video (netstream or flvPlayback component) and scale that to the full size of your stage  (since in this case it's basically the background) . I made a quickie demo here. (The video is from a cheapo SD consumer camera, so pretty poor quality to start.)
    In AS3 it would look something like
    import flash.display.Loader;
    import flash.net.URLRequest;
    import flash.display.Bitmap;
    import flash.display.BitmapData;
    import flash.ui.Mouse;
    import flash.events.Event;
    import flash.events.MouseEvent;
    import flash.display.StageDisplayState;
    // imports that were missing from the snippet as posted
    import flash.display.StageAlign;
    import flash.display.StageScaleMode;
    import flash.net.NetConnection;
    import flash.net.NetStream;
    import flash.media.Video;
    stage.align = StageAlign.TOP_LEFT;
    stage.scaleMode = StageScaleMode.NO_SCALE;
    // determine current stage size
    var sw:int = int(stage.stageWidth);
    var sh:int = int(stage.stageHeight);
    // load video
    var nc:NetConnection = new NetConnection();
    nc.connect(null);
    var ns:NetStream = new NetStream(nc);
    var vid:Video = new Video(656, 480); // size of video
    this.addChildAt(vid, 0);
    vid.attachNetStream(ns);
    // path to your video_file
    ns.play("content/GS.f4v");
    // client object so onMetaData callbacks don't throw errors
    var netClient:Object = new Object();
    netClient.onMetaData = function(info:Object):void {};
    ns.client = netClient;
    // add listener for resizing of the stage so we can scale our assets
    stage.addEventListener(Event.RESIZE, resizeHandler);
    stage.dispatchEvent(new Event(Event.RESIZE));
    function resizeHandler(e:Event = null):void
    {
        // determine current stage size
        var sw:int = stage.stageWidth;
        var sh:int = stage.stageHeight;
        // scale video size depending on stage size
        vid.width = sw;
        vid.height = sh;
        // don't scale the video smaller than a certain size
        if (vid.height < 480)
            vid.height = 480;
        if (vid.width < 656)
            vid.width = 656;
        // take the larger scale factor (x or y) and match the other to it so the size stays proportional
        (vid.scaleX > vid.scaleY) ? vid.scaleY = vid.scaleX : vid.scaleX = vid.scaleY;
    }
    // add event listener for the full screen button
    fullScreenStage_mc.buttonMode = true;
    fullScreenStage_mc.mouseChildren = false;
    fullScreenStage_mc.addEventListener(MouseEvent.CLICK, goFullStage, false, 0, true);
    function goFullStage(event:MouseEvent):void
    {
        //vid.fullScreenTakeOver = false; // keeps the FLVPlayback component from going full screen if you use it instead
        if (stage.displayState == StageDisplayState.NORMAL)
            stage.displayState = StageDisplayState.FULL_SCREEN;
        else
            stage.displayState = StageDisplayState.NORMAL;
    }

  • What is the best practice for the ceramic industry?

    Dear All;
    I would like to ask two questions:
    1- Which manufacturing category (process or discrete) fits the ceramic industry?
    2- What is the best practice for the ceramic industry?
    Please note that from the link below
    [https://websmp103.sap-ag.de/~form/sapnet?_FRAME=CONTAINER&_OBJECT=011000358700000409682008E ]
    I can see that the ceramic industry falls under a category called building materials, which in turn falls under mill products and mining,
    but there are no best practices for building materials or even mill products; only fabricated metal and mining best practices are available.
    Thanks in advance

    Hi,
    I understand that you are referring to the production of ceramic tiles. The solution for PP was process manufacturing, with these steps: raw materials preparation (glazes and frits), dry pressing (I don't know the extrusion process), glazing, firing (single fire), sorting and packing. In Spain these are usually all-in-one solutions (R/3 or ECC). Perhaps the production of decors has fast firing and additional processes.
    In my opinion, the interesting part is batch determination in SD, which you must set in the sales order, because builders want the order to be homogeneous in tone and caliber, and they can split the order into different deliveries. You must think of the batch as tone (different colours from firing and so on) and caliber.
    I hope this helps you
    Regards,
    Eduardo

  • Database log file becomes very big - what's the best practice to handle it?

    The log of my production database is getting very big and the hard disk is almost full. I am pretty new to SAP but familiar with SQL Server. Can anybody give me advice on the best practice for handling this issue?
    Should I Shrink the Database?
    I know a bigger hard disk is needed in the long term.
    Thanks in advance.

    Hi Finke,
    Usually the log file fills up and grows huge due to not having regular transaction log backups. If your database is in FULL recovery mode, every transaction is logged in the transaction log file, and the log space is only released when you take a log backup. If it is a production system and you don't have regular transaction log backups, the problem is just sitting there waiting to explode when you need a point-in-time restore. Please check your backup/restore strategy.
    Follow these steps to get the transaction log file back into shape (see the sketch after the steps):
    1.) Take a transaction log backup.
    2.) Shrink the log file: DBCC SHRINKFILE('logfilename', 10240).
          The above command shrinks the file to 10 GB (a recommended size for high-transaction systems).
    Finke Xie wrote:
    > Should I Shrink the Database?
    "NEVER SHRINK DATA FILES" - shrink only the log file.
    3.) Schedule log backups every 15 minutes.
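    For illustration, a minimal T-SQL sketch of steps 1 and 2 (the database name, backup path and logical log file name are placeholders you would replace with your own):
    -- 1) Back up the transaction log so the inactive portion can be reused
    BACKUP LOG [MyProductionDB]
    TO DISK = N'X:\Backups\MyProductionDB_log.trn';
    -- 2) Shrink only the log file (never the data files); target size is in MB, so 10240 = 10 GB
    USE [MyProductionDB];
    DBCC SHRINKFILE (N'MyProductionDB_log', 10240);
    Step 3 would then be a scheduled job (or maintenance plan) that runs BACKUP LOG every 15 minutes.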
    Thanks
    Mush

  • What is the Best way to move large mailboxes between datacenters?

    What is the Best way to move large mailboxes between datacenters?

    Hi, 
     Are you asking with regard to on-premises Exchange? With Microsoft Online SaaS services (aka Exchange Online) there is no control over, and no need to control, which data center a mailbox resides in.
     With regard to on-premises Exchange, you have two choices. You can move the mailbox over the WAN: either do a native mailbox move (with Exchange 2010 or later you can suspend the move after the copy, so you can control the
    time of the cutover), or create a database copy in the second data center and, once the database copies have synchronized, change the active copy.
    The other choice is to move it out of band, which would usually involve an offline seed of the database (you could conceivably move via PST file, but that would disrupt access to the mailbox and is not really the 'best way').
    In general, Exchange on-premises questions are best asked on the Exchange forum: http://social.technet.microsoft.com/Forums/office/en-US/home?category=exchangeserver
    Thanks,
    Guy 
