Scalability issues w/rt large documents

I use the following abbreviations in my query.
BDB = Berkeley DB
BDBXML = Berkeley DB XML
After reviewing the docs, I have two major gating concerns about BDBXML's handling of large XML documents. By large, I mean documents that can range from 2MB to several tens of MB, particularly w/ frequent repetition of keys.
(A) XML documents tend to be very verbose due to the repetition of keys. RDBMSes avoid that verbosity by encoding the keys into table column names. We want to use an embedded DB to eliminate the dependency on an RDBMS. However, this verbosity could kill us b/c our XML "documents" are going to be relatively large w/ a lot of repetition of keys. The key question here is, will our BDBXML documents use a correspondingly large amount of storage? Or does BDBXML optimize storage utilization somehow, e.g. by interning repeated key strings?
(B) Depending on the answer to (A), this issue may not be a concern. The BDB docs strongly warn programmers to minimize overflow-page utilization, and give detailed advice for setting the page size to an appropriate value: start w/ the block size of the underlying filesystem, then increase the value to optimize, etc., the core idea being to minimize use of overflow pages. However, for documents on the order of megabytes, use of overflow pages is unavoidable w/o an encoding scheme that breaks the document into smaller parts. Based on the BDBXML docs, it is not clear to me that BDBXML optimizes storage of large XML documents. So - my question is, how does BDBXML handle large XML documents so as to avoid the performance degradation that accompanies overflow pages?
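To make the overflow-page concern concrete, here is a back-of-envelope sketch in plain Java (not the BDB API) estimating how many chained overflow pages a single large record would occupy at a given page size. The per-page overhead figure is an assumption for illustration only, not a documented BDB constant.

```java
class OverflowEstimate {
    // A record larger than the page size spills onto a chain of overflow
    // pages. perPageOverhead is an assumed figure for illustration,
    // not a documented BDB constant.
    static long overflowPages(long recordBytes, long pageSize, long perPageOverhead) {
        long usable = pageSize - perPageOverhead;
        return (recordBytes + usable - 1) / usable; // ceiling division
    }

    public static void main(String[] args) {
        long tenMb = 10L * 1024 * 1024;
        // A 10 MB document on 4 KB pages needs thousands of chained pages,
        // each a potential extra I/O on retrieval.
        System.out.println(overflowPages(tenMb, 4096, 26));
        System.out.println(overflowPages(tenMb, 65536, 26)); // 64 KB pages: far fewer
    }
}
```

Even with a generous 64 KB page size, a multi-megabyte record still spans well over a hundred overflow pages, which is why the question of document decomposition matters.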
-rlehr
Software Architect
Cadence Design Systems, Inc.

George,
Thank you very much for clarifying those aspects.
RE (A) - Although query speed concerns me, the "bloated" database files currently concern me more. "Bloated" databases have been a concern for us previously. We might be able to compensate for it in our app's current rev; however, that is still uncertain, so the issue remains.
Speed also concerns me since, after the index search, the data still needs to be loaded and decoded from string to native data.
If I understand correctly, BDB XML treats the XML documents as a hierarchical structure of nodes that are either
* simple string/string key/value pairs - two categories for these, attributes and elements
* string/node key/value pairs
This reduces essentially to a hierarchical structure of string/string key/value pairs. IOW, a lot of string/string key/value pairs are being written to storage and indexed.
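Whether BDB XML actually interns repeated key strings is exactly the open question in (A), but the general dictionary-encoding idea can be sketched as follows (a hypothetical helper, not the BDB XML API): each distinct element/attribute name is stored once and referenced by a small integer id thereafter, so a large document with a few dozen distinct tag names pays the string cost only once per name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Dictionary-encodes repeated key strings: each distinct name is stored
// once and referenced by a small integer id thereafter. Illustrative
// sketch only -- not how BDB XML is documented to work internally.
class KeyInterner {
    private final Map<String, Integer> ids = new HashMap<>();
    private final List<String> names = new ArrayList<>();

    int intern(String name) {
        Integer id = ids.get(name);
        if (id == null) {
            id = names.size();
            ids.put(name, id);
            names.add(name);
        }
        return id;
    }

    String resolve(int id) { return names.get(id); }
    int distinctNames() { return names.size(); }
}
```

With such an encoding, ten thousand occurrences of `<quantity>` cost one string plus ten thousand small integers rather than ten thousand copies of the string.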
Two aspects of this problem have me considering using bare BDB instead of BDB XML.
(A) to eliminate the add'l overhead that accompanies
[1] the add'l IO that accompanies plain-text vs native data
[2] the add'l CPU ops for converting between string and native data
(B) our application's data is well structured. we KNOW what we will be storing, so that aspect of XML storage has little value for us. That being said, BDB XML offers a potential simplicity that would be valuable.
In one sense, this is uninteresting for you. OTOH, that summarizes my concerns w/rt BDB XML vs BDB or another solution, to which you can respond with add'l insight, having worked w/ BDB XML much more extensively. I am interested in your perspective.
Again, thanks. I look forward to your reply.
-rlehr

Similar Messages

  • How do I increase the font size of a large document?

    Whenever I try to increase the font size of a large document, the text boxes cross their boundaries and mix with each other, or partly disappear at the end of every page. Do I really have to adjust every text box in the document, or is there a faster way? I'm having this issue with Adobe Acrobat 11 Pro.

    This is not a feasible thing to do. Editing a PDF is a desperate last resort for so many reasons.
    Whatever you need to solve, you are unlikely to solve it in Acrobat. Probably best to export the text and remake it.

  • Can't print large documents.

    For years, we've been printing large documents to our plotter without issue.  Suddenly last week, nothing large will print.
    I've seen a few different error messages, and have tried different computers, and different versions of Illustrator, but nothing large will print.
    When trying to print, I currently get the following:
    I've also seen an error suggesting not enough memory is available, which is bogus.  And I think another error that was more vague, something to the effect of "can't print"
    I've set up a clean install of Illustrator on a clean Windows install, same issue.
    I set up an alternate print server with our RIP software, the same software we've been using for years. Same error on the alternate server.
    Nothing changed, and it's affecting everything here, including new installs.  I don't know what happened, but this is getting critical; we need the ability to print.

    EFI eXpress v. 4.1
    This all worked fine until a week ago, so I don't think there should be a hardware/software compatibility issue.  The software and hardware were all working fine, and then we got blindsided by this stubborn error that came out of nowhere, won't go away, and has no error or log that gives us any useful information.
    It isn't a font issue.  I've created a basic test document that just consists of the word test.  Using this same font, I've had success printing documents that are 37"x79", 39"x75".  But, when I print 39"x79", it fails.  I've not zeroed in on the size that seems to be the cutoff, but I knew that a 37x75 worked, and 40x80 didn't, and 39x79 didn't either.  So, playing around with the numbers I found the above that did work.
    Also, using the general epson driver that you can get on epson's site, any print job can be sent without issue.  It's just that anything beyond 90 inches gets cut off. 
    I suppose this suggests the problem lies with EFI.  But, a clean install of EFI didn't resolve the issue.  What it comes down to for me is, the error is generated from Adobe.  So, at the least, I'd like an explanation from Adobe as to what the error is, since their software is generating it, and giving me no reason why it failed.  Plus, there's no way I can justify to management the cost of bringing in a 3rd party support for the EFI software, unless I have a better reason why the EFI software worked last week, but not this week, so I really need better data out of Adobe as to why it fails.
    I currently have a paid case open with Adobe, but they seem to be having difficulty as well, so far it's been a lot of time on hold, and a little bit of "try this, try that, try this", which is getting nowhere.  I need something more methodical to happen, and some actual following of data, or checking of logs, to get some detailed information, hopefully the case will get escalated to someone with a bit more in depth knowledge on these things.
    I will keep posting the details here as they emerge, because there's nothing I hate more than a forum thread with a problem I'm experiencing, that doesn't end with an answer.  I think forums should do a better job of either posting a resolution, or purging the cases that don't resolve, to get rid of the useless junk that clogs the internet and wastes people's time searching for answers.

  • 8500 A909a Won't Print Large Documents

    My OfficeJet Pro 8500 A909a will not print large documents. It prints fine on websites and 2 or 3 page docs, but nothing larger. Trying to print a mail merge of 450 letters - the print spooler goes through the normal process but the printer never prints - it just beeps when the print spooler is completed. The printer is installed via Ethernet on XP. HP check software reports OK. Thank you.

    Take a look at this HP support page.
    Please mark the post that solves your issue as "Accept as Solution".
    If my answer was helpful click the “Thumbs Up" on the left to say “Thanks”!
    I am not an HP employee.

  • Scalability issue with global temporary table.

    Hi All,
    Does CREATE GLOBAL TEMPORARY TABLE lock the data dictionary like CREATE TABLE? If yes, wouldn't that be a scalability issue in a multi-user environment?
    Thanks and Regards,
    Rudra

    Billy  Verreynne  wrote:
    acadet wrote:
    am I correct in interpreting your response that we should be using GTT's in favour of bulk operations and collections and in-memory operations?
    No. I said collections cannot scale. This means that, because collections reside in expensive PGA memory, you cannot stuff large data volumes into them. Thus they do not make an ideal storage bin for temporary data (e.g. data loaded from a file or a web service). GTTs otoh do not suffer from the same restrictions, can be indexed, and offer vastly better scalability and so on.
    Multiple passes are often needed using such a data structure. Or filtering to find specific data. As a GTT is a SQL native, it offers a lot more flexibility and performance in this regard.
    And this makes sense - as where do we put out persistent data? Also in tables, but ones of a persistent and not temporary kind like a GTT.
    Collections are pretty useful - but limited in size and capability.
    Rudra states:
    I want to pull out a few metrics from different tables and process it.
    If this can't be achieved in a SQL statement, unless Rudra is a master of understatement, then I would see GTT's as a waste of IO and programming effort. I agree.
    My comments however were about choices for a temporary data storage bin in PL/SQL.
    I agree with your general comments regarding temporary storage bins in Oracle, but to say that collections don't scale is putting too narrow a definition on scaling. True, collections can be resource intensive in terms of memory and CPU requirements, but their persistence will generally be much shorter than other types of temporary storage. Given the right characteristics, collections will scale, and given the wrong characteristics GTT's won't scale.
    As you say it is all about choice. Getting back to the theme of this thread though, the original poster should be made aware that well designed and well coded applications are most likely to scale. Creating tables on the fly is generally considered bad practice and letting the database do what it does best, join tables in queries at the SQL level is considered good practice. The rest lies somewhere in between and knowing when to do which is why we get paid the big bucks (not). ;-)
    Regards
    Andre

  • File  Adapter :- Handling Large documents

    Hi
    I am currently working on the File Adapter, reading large documents and writing them to another file location.
    I came across the following techniques:
    1. Scalable DOM
    2. File Chunk Read.
    Can anyone explain the exact use cases of the above-mentioned techniques in the File Adapter?
    Thanks

    1. Scalable DOM - is used to move/copy large files intact.
    2. File Chunk Read - is used to process large documents (it uses a while loop).
    When you're using File ChunkRead, you can take a large document with many elements and for each of those elements, perform some operations.
    -----------Documentation-----------
    **Oracle File Adapter Scalable DOM
    http://docs.oracle.com/cd/E23943_01/integration.1111/e10231/adptr_file.htm#BABCHCEI
    This use case demonstrates how a scalable DOM process uses the streaming feature to copy/move huge files from one directory to another.
    The streaming option is not supported with DB2 hydration store.
    You can obtain the Adapters-103FileAdapterScalableDOM sample by accessing the Oracle SOA Sample Code site.
    **Oracle File Adapter ChunkedRead
    http://docs.oracle.com/cd/E23943_01/integration.1111/e10231/adptr_file.htm#BABJFCBH
    This is an Oracle File Adapter feature that uses an invoke activity within a while loop to process the target file. This feature enables you to process arbitrarily large files.
    You can obtain the Adapters-106FileAdapterChunkedRead sample by accessing the Oracle SOA Sample Code site.
    An additional reference that may be helpful is, Handling Binary Content and Large Documents in Oracle SOA Suite 11g
    http://www.oracle.com/technetwork/middleware/soasuite/learnmore/binarycontentlargepayloadhandling-1705355.pdf
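    Independent of the adapter's actual API, the ChunkedRead idea - an invoke inside a while loop - can be sketched in plain Java: read a fixed-size chunk, process it, and repeat until EOF, so memory use stays bounded by the chunk size rather than the file size.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ChunkedCopy {
    // Processes an arbitrarily large stream one fixed-size chunk at a time,
    // so memory stays bounded by chunkSize regardless of input length.
    static long countBytes(InputStream in, int chunkSize) throws IOException {
        byte[] chunk = new byte[chunkSize];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) { // the "while loop" of ChunkedRead
            // per-chunk processing would go here (e.g. write to the target file)
            total += n;
        }
        return total;
    }
}
```

    The adapter's ChunkedRead feature wraps the same pattern in BPEL invoke activities; this sketch just shows why the approach scales to arbitrarily large files.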

  • Performance Issues when editing large PDFs

    We are using Adobe 9 and X Professional and are experiencing performance issues when attempting to edit large PDF files (Windows 7 OS). When editing PDFs that are 200+ pages, we are seeing pregnant pauses (that feel like lockups), slow open times, and slow-to-print issues.
    Are there any tips or tricks with regard to working with these large documents that would improve performance?

    You said "edit." If you are talking about actual editing, that should be done in the original and a new PDF created. Acrobat is not a very good editing tool and should only be used for minor, critical edits.
    If you are talking about simply using the PDF, a lot depends on the structure of the PDF. If it is full of graphics, it will be slow. You can improve this performance by using the PDF Optimize to reduce graphic resolution and such. You may very likely have a bloated PDF that is causing the problem and optimizing the structure should help.
    Be sure to work on a copy.

  • Inserting a large document

    Hi, I am trying to insert a large document, a little over 7MB, but I get
    com.sleepycat.dbxml.XmlException: Error: Buffer: failed to allocate memory, err
    code = NO_MEMORY_ERROR
         at com.sleepycat.dbxml.dbxml_javaJNI.XmlContainer_putDocumentInternal__SWIG_0(
    Native Method)
         at com.sleepycat.dbxml.XmlContainer.putDocumentInternal(XmlContainer.java:884)
         at com.sleepycat.dbxml.XmlContainer.putDocument(XmlContainer.java:103)
         at com.sleepycat.dbxml.XmlContainer.putDocument(XmlContainer.java:54)
    I am using Java
    4 caches of 33554432 each
    jvm runs with -Xms256m -Xmx512m
    I already tried using XmlInputStream created with XmlManager.createLocalFileInputStream(java.lang.String) as well as XmlEventWriter, but both fail because of memory.
    I just found out that the shell loads it so fast I could not blink twice, so I guess I must be doing something wrong in Java code.
    Edited by: MauricioSC on Jul 6, 2009 6:10 PM
    Edited by: MauricioSC on Jul 6, 2009 9:35 PM
    OK something came to my mind while driving home and I tested it with almost 100% assurance that it was THE issue. Yes it was. My wrapper class around dbxml java stuff was creating a default index (a fully featured one) to any container by default. That's it, the 7M doc was crashing because of indexing operations set as default by the container while loading.
    Sorry to bother.

    2009-07-06 18:42:45.640 dbxml DEBUG: c.getContainerType()=1
    so, no. It has to be something else.
    Edited by: MauricioSC on Jul 6, 2009 9:47 PM

  • Loading large documents

    I need a way to load large documents without major performance issues.
    At present, the best I can find in Swing itself is the PlainDocument class, which has issues once it crosses about 1MB in size. As it gets closer to 10MB, it gets slower and slower, and anything larger is just unacceptable.
    I started off writing a custom implementation of Document, but the need to implement the listener methods bogged me down and I can't progress any further without further help.
    I find it hard to believe that nobody would have written a Swing editor that can handle over 10MB of text, but I can't find anything with Google, so...
    Has someone out there already tackled this problem?

    I think you should create a document which reads the text from the
    file whenever the data is requested by getText-method in the document.
    You could also use some caching of the previous read text in order to reach a good performance.
    For instance you could reserve a buffer (byte-array) of about 500 kb,
    which I would split up in 500 arrays of 1000 chars, since 1000 chars should be enough to fill at
    least a screen and somewhat above.
    I would propose to keep a map for holding these arrays, and coupling them to the
    offset in the file from which they are read.
    By aligning the offsets from which you read in the file with the size of such an array,
    you will never read overlapping parts in your buffer.
    You will also need some mechanism for remembering which array is least frequently used or least
    recently used and may be replaced by the newly read data once the buffer is full.
    You will also need some mechanism for adding or changing data from the file,
    since you cannot change the data on the fly in the file. You should use some caching of this
    data (perhaps in some other file) and mix this cached data with the cached buffer when the
    getText-method is called.
    While saving the new file this data should be mixed with the original file and written to a new file which
    will then replace the old file. Afterwards all buffers should be cleaned (especially the change-buffers).
    You could adapt some auto-saving mechanism for doing this save after a specified time,
    or when the change-buffers are growing too large.
    Finally you could also opt to split the read-buffer in a buffer using strong-references to the arrays and
    an additional buffer which uses soft- or weak-references to the arrays.
    The strong-references will make the first buffer a solid reliable buffer of fixed size (minimum),
    while the soft- or weak-references will make the second buffer a more soft buffer which can lose
    data when the memory-requirements of the application are high.
    This way you get a maximum size buffer when memory-requirements for the application do not claim the
    soft-references of the buffer and you will always keep a minimum buffer for the application.
    Data will always flow from the solid to the soft buffer and never backwards.
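    As a minimal sketch of the LRU part of the read-buffer described above (chunk size and capacity here are illustrative), Java's LinkedHashMap in access-order mode can serve as the least-recently-used eviction mechanism:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Fixed-capacity cache of file chunks keyed by aligned file offset.
// An access-order LinkedHashMap evicts the least recently used chunk
// once the capacity is exceeded.
class ChunkCache extends LinkedHashMap<Long, char[]> {
    private final int maxChunks;

    ChunkCache(int maxChunks) {
        super(16, 0.75f, true); // true => access order, i.e. LRU behaviour
        this.maxChunks = maxChunks;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Long, char[]> eldest) {
        return size() > maxChunks;
    }
}
```

    A real editor buffer would add the file-reading, change-buffer, and soft-reference layers described above on top of this map.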
    kind regards,

  • Pagecount: problem with larger documents

    Hi,
    To print the page count I print &SFSY-PAGE&/&SFSY-JOBPAGES& on every page. This works for smaller documents (2-3 pages), but for larger documents (12-15 pages) it will print 1/, 2/, 3/, 4/ etc. until close to the end, where it will print 12/13, 13/13 correctly.
    Anyone had this issue before?
    Thanks!

    Hi,
    I think you should specify a length option for the field, like:
    &SFSY-PAGE(4ZC)& / &SFSY-FORMPAGES(4ZC)&
    "4: set length to 4
    "Z: Suppresses leading zeros in numbers
    "C: This effect corresponds to that of the ABAP statement CONDENSE
    Please try,
    Thanks

  • Have Windows XP and Adobe 9 Reader and need to send a series of large documents to clients as a matter of urgency

    I have Windows XP and Adobe 9 Reader and need to send a series of large documents to clients as a matter of urgency. When I convert a 10-page MS Word file to PDF, this results in a 6.7 MB file which can't be emailed.  Do I combine them and then copy to JPEG 2000, or do I have to save each page separately, which is very time consuming? Please advise me how to reduce the size and send 10-plus pages quickly by Adobe without the huge hassles I am enduring.

    What kind of software do you use for the conversion to pdf? Adobe Reader can't create pdf files.

  • How can I change a page position in a large document?,

    How can I change a page position in a large document?

    Question asked and answered many times !
    Insert a section break just before the page to move.
    Insert a section break just after the page to move.
    Select the page's thumbnail
    cut
    Insert a section break where you want to insert the page.
    paste
    The required infos are available in Pages User Guide which isn't delivered to help helpers to help you.
    Yvan KOENIG (VALLAURIS, France) mercredi 5 octobre 2011 14:33:24
    iMac 21”5, i7, 2.8 GHz, 4 Gbytes, 1 Tbytes, mac OS X 10.6.8 and 10.7.0
    My iDisk is : <http://public.me.com/koenigyvan>
    Please : Search for questions similar to your own before submitting them to the community

  • I have an issue with some PDF documents & some MS Office documents, notably MS Word, received in eMail as attachments not displaying on iPhones & iPad 2s

    Hi,
    I have an issue with some PDF documents & some MS Office documents, notably MS Word, which when received in eMail as attachments do not display properly on iPhones, iPad 2s & new iPads.
    PDF docs - areas of the doc only show as grey splotches regardless of viewer
    MS Word docs -
    inserted graphics don't display at all,
    tables & lists, especially with borders, are broken,
    tables & lists are missing their borders, and
    tables & lists are missing the first 1-2 lines of the lists/tables.
    This is replicated on iPhone 4s, iPhone 4Ss, iPad 2s, iPad new (x2)
    It certainly happens in iOS 5.1.1 & iOS 5.1
    We believe it all worked aok in iOS 5.0.1(?) & prior
    There are no problems reading/seeing these PDF & Word docs on anything other than iDevices.
    This is rather critical for us & if not quickly fixed/rectified will prevent us from further purchases of these devices
    Rolling back to iOS 5.0.1 or prior I believe isn't possible because of the BaseBand update(?) & isn't much of an option because of the quite noticeable Battery/Charging/WiFi improvements in 5.1 & 5.1.1

    Hi,
    I logged a call with AppleCare & have since had explained why this occurs (some time ago)
    I'll try to explain what was indicated to me.
    iOS has a limited Font Set & this affects what PDFs can display. You need to use iOS supported fonts in your PDFs to see/read them properly  
    These supported fonts are indicated here => http://support.apple.com/kb/HT4980
    Regards MS office docs, iOS doesn't have much of an API to work with MS Office documents at all so is stuck with 3rd Party Apps to try to do it.
    Unfortunately Mail in/on iOS uses the API to attempt to open/use MS Office attachments unless you tell it to use an App to open the attachment.
    I have had success opening & reading MS office docs now with CloudOn, but find it slow & very awkward to use.
    Not too sure if this helps others, but at least it explains why this is occurring

  • Links in each page of a large document

    I have a 350+ page PDF document and would like to add a link to the TOC on each page. Obviously not one by one, because that would take a lot of work/time.
    I thought to use a general header or footer with the linked text. But I don't know how to put linked text in a header/footer. Can this be done?
    Is there any other way you know to easily add a link to all pages of a large document?

    Another way is to put your link in a button: buttons can be duplicated to all (or a range) of pages. If you don't like a button that 'looks like a button' just use 'no border, no fill', or even an icon to customise it.

  • Date issue in processing billing documents that shipped in a prior period

    Date issue in processing billing documents that shipped in a prior period that is now closed:
    SAP values those deliveries with the current document date in A/R when it should be the original delivery date.  Baseline date needs to stay the delivery date, but is getting copied from the billing date in the new period.  A user exit exists (& needs to be applied) to manipulate the baseline date when the accounting document is being created.
    Any input is appreciated.

    Hi
    Try with Invoice Correction Request concept
    Sale document Type: RK
    Reference Document Type : Billing Number
    for Understanding the Invoice Correction request check below link
    [Creating Invoice Correction Requests|http://help.sap.com/saphelp_46c/helpdata/en/dd/55feeb545a11d1a7020000e829fd11/content.htm]
    Regards,
    Prasanna
