Most efficient method of storing configuration data for huge volume of data

The scenario I'm stuck on is as follows:
I have a huge volume of raw data (as CSV files).
This data needs to be rated based on the configuration tables.
The output is again CSV data with some new fields appended to the original records.
These new fields are derived from original data based on the configuration tables.
There are around 15 configuration tables.
Out of these 15 tables, 4 have huge configurations.
One table has 15 million configuration records with 10 columns.
The other three tables each have around 1-1.5 million configuration records with 10-20 columns.
Now, in order to carry out my rating process, I'm left with the following options:
1) Leave the configurations in database tables and query them for each configuration required.
Disadvantage: Even with indexes on the tables, it takes a lot of time to query 15 configuration tables for each record in the file.
2) Load the configurations as key-value pairs in RAM using a suitable collection (e.g. HashMap).
Advantage: Processing is fast.
Disadvantage: Takes around 2 GB of RAM per instance.
Also, when the CPU context switches (I'm using an 8-CPU server), the process hangs for 10 seconds.
This happens very frequently, so the net speed I get is again low.
3) Store the configurations as sorted CSV files and perform a binary search on them.
Advantages: No RAM usage; the same configuration can be shared by multiple instances.
Disadvantages: Only one configuration table has an integer key, so I can't use this concept for the other tables.
(If I'm wrong about that, please correct me.)
4) Store the configurations as an XML file.
I don't know the advantages/disadvantages of this one.
Please suggest which methodology should be used.
Edited by: Vishal_Vinayak on Jul 6, 2009 11:56 PM
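On option 3: binary search doesn't actually require an integer key — it works for any key type, as long as the file is sorted with the same comparator the search uses, so string keys are fine too. A minimal Java sketch under that assumption (key in the first CSV column, plain string ordering): index the line-start offsets once (about 8 bytes per line, roughly 120 MB for 15 million rows — far below the 2 GB HashMap), then answer lookups with seeks.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;

/**
 * Binary search over a CSV file that is sorted by its first column
 * using plain String ordering. The key does not have to be an integer;
 * any key works if the file is sorted with the same comparator.
 */
public class SortedCsvLookup {
    private final RandomAccessFile file;
    private final long[] offsets;   // byte offset of the start of each line

    public SortedCsvLookup(String path) throws IOException {
        file = new RandomAccessFile(path, "r");
        List<Long> starts = new ArrayList<>();
        starts.add(0L);
        while (file.readLine() != null) {
            starts.add(file.getFilePointer());   // start of the next line (or EOF)
        }
        offsets = new long[starts.size() - 1];   // drop the trailing EOF entry
        for (int i = 0; i < offsets.length; i++) {
            offsets[i] = starts.get(i);
        }
    }

    /** Returns the full CSV line whose first field equals key, or null if absent. */
    public String lookup(String key) throws IOException {
        int lo = 0, hi = offsets.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            file.seek(offsets[mid]);
            String line = file.readLine();
            int comma = line.indexOf(',');
            String lineKey = (comma >= 0) ? line.substring(0, comma) : line;
            int cmp = key.compareTo(lineKey);
            if (cmp == 0) return line;
            if (cmp < 0) hi = mid - 1;
            else lo = mid + 1;
        }
        return null;
    }
}
```

The offset index is read-only after construction, so multiple worker instances could each hold one against the same shared file; the configuration data itself never has to be duplicated in RAM.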

Vishal_Vinayak wrote:
2) Load the configurations as key-value pairs in RAM using a suitable collection (e.g. HashMap).
Advantage: Processing is fast.
Disadvantage: Takes around 2 GB of RAM per instance.
Also, when the CPU context switches (I'm using an 8-CPU server), the process hangs for 10 seconds.
This happens very frequently, so the net speed I get is again low.

Sounds like you don't have enough physical memory. Your application shouldn't be hanging at all.
How much memory is attached to each CPU? e.g. numactl --show

Similar Messages

  • IFRAME into iMOVIE - most efficient method for importing?

    What would be the most efficient method for importing iFrame movies from a camera into iMovie?
iFrame is supposed to save time and let you work more efficiently at the expense of quality, but I can't find a way to import the movies faster than in other formats.
On a second note, importing into iMovie from DV (tape) cameras dramatically reduced the image quality. Do we still have the same issue when importing an iFrame movie?
    Thank you for your help!

I'm completely new myself to importing iFrame into iMovie 11, as I only got my new Panasonic X920 camcorder 2 days ago. Can you please tell me: is there a big drop in quality from 1080/60p to iFrame quality?

  • Most efficient method to process 2 million plus records from & to a Ztable

    Hi All,
    My requirement is as follows:
There is a table which has 20-odd columns and close to 2 million records.
Initially only 5 or 6 columns will have data. Now the requirement is to fetch the records and populate the remaining columns of the table by looking into other tables.
I'm looking for the most efficient method to handle this, as the data count is huge.
    There should be an optimum balance between memory usage and time consumption.
    Kindly share your expertise in this regard.
    Thanks
    Mani

Write a program to download the data for the table columns to be filled into a local file in .XLS format.
Then write a report to upload the data from the local .XLS file into the database table through an internal table itab:
LOOP AT itab.
UPDATE the database table WHERE the primary fields match.
ENDLOOP.
First try this on the development and testing servers, then go for production.
But take a backup of the full existing production data into a local file, and also get the necessary approvals for doing this task.
Reward points if it is useful.
Girish

  • Need advice on storing configuration variables for use by FP2000 Controller for embedded application.

    I am creating 8 machines that generally operate in the same way and each will be controlled using a FP-2000 controller. The only difference between the machines is a set of scaling constants and pass values for determining if the machine completed its process successfully.
In the past, using an independent PC and LabVIEW, I have created a configuration.vi for writing the constants and configuration variables to a data file on my hard drive. Then in the auto.vi I read the file (only once, each time the program is started) and store the data in the program. I would like to do something similar with this system but am not familiar with the FieldPoint system.
I know it is probably not difficult to store the data on the host computer and transfer it to the modules, but I would be better off writing to the modules once and storing the data onboard the FP controller for use by an embedded application. This way, if the network connection is lost for any reason, the machine can still operate. Is this possible, and if not, what do you suggest in order to prevent being so reliant on the host computer?
    Thank you for your help.

    Mike,
There are a number of ways to accomplish what you desire. The easiest is to continue doing what you are already doing. The FP-20xx series modules treat their flash memory as if it were a hard drive, so the file I/O VIs in LabVIEW work just the same on a FP-20xx as on a regular computer running LabVIEW. The primary variation will be in how you write the files over the network. Since mapping network drives is more of a Windows functionality, you cannot simply have a VI running on your host computer use a File I/O VI to write to a FP-20xx. Instead, what you will need to do is write the file to your local drive and then FTP (file transfer protocol) the file to the FP-20xx module. This can be done using the LabVIEW Internet Toolkit or any 3rd-party FTP utility. One word of advice: the OS on the FP-20xx does not support long filenames, but due to a problem in the FTP server, long (non-8.3-compliant) filenames may be uploaded, and once there, you will be unable to access the file again, even to delete it.
    An alternative method that I have seen used is to use a global VI and write to it from the host machine through the use of VI server. You can then have the program on-board the FP-20xx save the globals to your configuration file.
    Regards,
    Aaron

  • I need a more efficient method of transferin​g data from RT in a FP2010 to the host.

    I am currently using LV6.1.
My host program is currently using Datasocket to read and write data to and from a FieldPoint 2010 system. My controls and indicators are defined as datasockets. In FP I have an RT loop talking to a communication loop using RT-FIFOs. The communication loop is using Publish to send and receive via the Datasocket indicators and controls in the host program. I am running out of bandwidth in getting data to and from the host, and there is not very much data. The RT program includes 2 PIDs and 2 filters. There are 10 floats going to the host and 10 floats coming back from the host. The desired time-critical loop time is 20 ms; the actual loop time is about 14 ms. Data is moving back and forth between the host and FP several times a second without regularity (not a problem). If I add a couple more floats in each direction, the communication drops to once every several seconds (too slow).
    Is there a more efficient method of transfering data back and forth between the Host and the FP system?
    Will LV8 provide faster communications between the host and the FP system? I may have the option of moving up.
    Thanks,
    Chris

    Chris, 
    Sounds like you might be maxing out the CPU on the Fieldpoint.
Datasocket is considered a pretty slow method of moving data between hosts and targets, as it has quite a bit of overhead associated with it. There are several things you could do. One: instead of using a datasocket for each float you want to transfer (which I assume you are doing), try using an array of floats and one datasocket transfer for the whole array. This is often quite a bit faster than calling a publish VI for many different variables.
Also, as Xu mentioned, using a raw TCP connection would be the fastest way to move data. I would recommend taking a look at the TCP examples that ship with LabVIEW to see how to use these effectively.
LabVIEW 8 introduced the shared variable, which, when network-enabled, makes data transfer very simple and is quite a bit faster than a comparable datasocket transfer. While faster than datasocket, shared variables are still slower than flat out using a raw TCP connection, but they are much more flexible. Also, shared variables can function in the RT FIFO capacity and clean up your diagram quite a bit (while maintaining the RT FIFO functionality).
    Hope this helps.
    --Paul Mandeltort
    Automotive and Industrial Communications Product Marketing

  • Most efficient method of online replication!

    Hello Guys,
I want to know the most efficient way of doing synchronous (real-time) replication between 2 Oracle databases that are in 2 different geographical locations, with connectivity between them over the internet.
    Both systems are linux based and oracle 11gR1 is installed.
    The constraint is performance.
    Kindly help.
    Regards, Imran

    1) Do you really, really need synchronous replication? Or just "near real time" replication? Synchronous replication requires that neither database commit until both databases have the change, which implies that you're using two-phase commit which implies that every single transaction requires multiple messages to be exchanged between the servers. Two servers that are widely separated are likely going to add a substantial overhead to the transaction. There are a small handful of cases where that might be reasonable, but for 99.9% of the applications out there, synchronous replication is something to be avoided at all costs.
    2) What is the business problem you are trying to solve? Are you trying to create a failover database? Are you trying to replicate a small subset of data from one database to another? Something else?
    3) What edition of Oracle are you using? Enterprise or standard?
    Justin

  • What is the most efficient way to pass LV data to a dll?

    For efficiency, this question primarily becomes important when passing large arrays, structures containing large arrays, or generally, any large block of data to a dll.
The way the dll setup appears and the .c file it creates for the dll call, it appears that labVIEW directly passes data in whatever native passing format LV requires, without copying the actual data, if you select "Adapt to Type" as the "Type" option. If I pass an array, for example, the .c file contains the type definition of a labVIEW array, i.e., (size, data), and depending on whether I select handle or handle pointer, the data passed is either a handle or a handle pointer to the array. Likewise, if I pass a LV structure, the .c file will contain the typedef of the structure, and the data passed is either a pointer, or a pointer to a pointer. These are, I believe, labVIEW native types and do not require copying.
    On the other hand if an array is passed as an array type, then it is converted to a C array that requires LV to copy the array on both sides of the call.
    I further assume all structures can be passed to the memory manager to be manipulated, although I'm actually not sure that you could resize an array pointer in the dll. That seems a bit dubious, but then I guess upon return LV could query the memory manager to determine the array pointer size.
That's how I would think things work. If not, could someone please correct me?
    Kind regards,
    Eric

    Eric,
    Let me tell you something about me too...
I've been working with LabVIEW for (just) 4 years. That is, 40 hours a week professionally, 10 hours a week privately. I started with LV4 and went through all versions and revisions until 6.0.2 (6.1 will come soon, but first I have to finish some major projects).
During this time I've been working on lots of interfaces with the Windows OS. Some 'dll' things I've worked on: OpenGL interface, MSXML driver, keyboard hooks, mouse hooks, GDI interfacing, calling LV dll's from assembler, calling assembler dll's from LV, creating threads, using serial interrupts, etc. I'm now (also) working on a way to automatically generate documentation (much more than the 'export VI strings') from a VI. This requires 'under the hood' knowledge about how VIs work.
When I had to make a fast routine for a project one time, I chose assembler, because I had this knowledge. Also, I wanted to use pure SIMD operations. The operation had to modify an array of DBLs. SIMD uses the same format (IEEE 754, I think), so it was easy. But when it came to testing, it appeared that the routine only paid off if the routine was 'long' enough. The routine was n*O^2, where n was a parameter. When the array was large and n small, the overhead of copying the array to modifiable memory was relatively large, and the LV routine was faster.
When I get a pointer to a LV array, I can use this pointer to modify the data in the array. This can (I think) only be done if LV copied this data, just like LV does when a wire is split to be modified.
It might be that this copying can be prevented, e.g. by using other data types, or by fiddling with threads and reentrance... If you want to optimally benefit from dll's, I'd look for a way to keep the data in the dll space, or pass it once at initialisation. You could use CreateHeap, HeapAlloc, GlobalAlloc, and other functions. You can use these functions in LV, or in the dll. Once you have a pointer to the (one and only) data space, you can use it to pass to the dll functions.
I think LV does not show the memory in question in the profiler, but I'm not sure.
Using the "Adapt to Type" option might just result in an internal conversion at 'compile' time, and might be exactly the same as doing it yourself.
    Perhaps you can share a bit about the application you are making, or at
    least why you need the speed you are seeking?
    Regards,
    Wiebe.
    "Eric6756" wrote in message
news:[email protected]...
    Greg,
There are two relevant documents which are distributed with labVIEW; in labVIEW 6i (hey, I'll get around to upgrading), the first is titled "Using External Code in LabVIEW", and the second is Application Note 154, "LabVIEW Data Formats".
Actually, a statement Wiebe@air made on my previous question regarding dll calls ("Do dll calls monopolize the calling thread?") provoked this line of questions. Based on other things he has said, I gather he is also using dlls. So as long as we're here, let me ask the next question...
If labVIEW must make a copy of the passed data, does it show up as additional memory blocks in the vi profiler? In other words, can you use the profiler to infer what labVIEW is doing or, as you put it, infer whether there is a clever passing method available?
As a personal note, Greg:
First, as a one-time engineering student and teaching assistant, I don't recall hearing or using the terms "magical" or "clever". Nor, I might add, do I find them in print elsewhere in technical journals. While I don't mind NI marketing in their marketing documents, used here in this mostly educational forum they strike me as arrogant and/or pompous.
I like NI's products because they work and are reliable. I doubt it has anything to do with magic; it has somewhat more to do with being clever, but is mostly due to the dogmatic persistence of your engineers. I rather doubt any of them adjoin the term "magical" or even "clever" to their solutions. I believe the term "best" is generally accepted with the qualifier "I've or we've found". At least, that has been my engineering experience.
Second, many of my questions I can sort out on my own, but I figure as long as you're willing to answer the questions, your answers are generally available to others. The problem is that one question seems to lead to another, and specific information gets buried in a rather lengthy discourse. When I come here with a specific question, it would be nice to find it asked and answered specifically, rather than buried in the obscurity of some other question. As such, at some point in these discussions it might be appropriate to reframe a question and put it at the top. In my opinion, that decision is primarily yours, as you have a better feel for the redundancy of questions asked and/or your answers.
Anyway, the next question I'm posting at the top is, "Do the handles passed to a dll have to be locked down to ensure other threads don't move the data?"
Thanks,
Kind Regards,
Eric

  • Best method of storing/receiving data

    I'm goofing around with developing a Flash game that would be accessible via a Facebook application but am still trying to figure out how I want to organize the game itself. The game essentially is an educational game with competitive components - completing tests faster results in more points with which you can customize your avatar and other items in the game. Also, passing certain "lessons" open up new lessons. So the major components I see are:
    - Player data storage (scores, progress, etc.)
    - Social connections - transference of rewards, comparing high scores, etc.
    - Lesson data (what questions to ask, possible answers, etc.)
    How is this best controlled? PHP and SQL? XML? Should I use datagrids as an intermediary? I'm having a lot of problems finding good information, both on the internet and at the bookstore about how to manage this. If only I had access to the source for the more popular Facebook games...
    Thanks!

Player data you can store using the SharedObject, which is easier to code than using a database like MySQL or XML.
But to compare high scores among different players you'll need to use a database and server-side scripting, so you may as well use that for all the data.

  • How to extract data from table for huge volume

    Hi,
I have around 200000 material doc numbers for which I need to get the material numbers from the MSEG table, but when using SE16 it gives a dump. I have even tried breaking it into batches of 20000 records, but SAP still gives a dump on executing SE16 for MSEG. Please advise if there is any alternate way to get data from SE16 for such a large volume.
Note: In our system SE16N does not work; only SE16 is there in our SAP version.
    Thanks,
    Vihaan

    Hi Jurgen,
    Thanks for your reply.
I am getting a dump when I enter more than 5000 records as input parameters in MSEG; beyond that it gives the dump "ABAP runtime error SAPSQL_STMNT_TOO_LARGE".
I understand that I can extract the data while restricting the input to 5000 records each time, but I have around 250000 material docs, so with batches of 5000 I would need to run the step 50 more times --> 50 Excel files. I wanted to avoid that, as it is going to take a lot of my time.
Any suggestion? Please help.
Also wanted to highlight that apart from the material doc numbers, I am entering plant (8 plants) and movement type (14 mvt types) as input parameters.
    Regards,
    Vihaan
    Edited by: Vihaan on Mar 25, 2010 12:30 AM

  • Need to raise an Alert for huge Volume of messages struck in the queue

    Dear All,
    I have a query which is mentioned below.
The partners send a huge volume of messages at a time to the PI server, and for this reason the messages get stuck in the inbound/outbound queues. Every time, the user manually checks for the stuck messages and reprocesses them.
Example: one partner sends 50,000 ORDERS at a time.
Now an alert needs to be raised for those messages which got stuck in the queue (i.e. messages on hold and "not failed").
    Please share your inputs /suggestion.
    Warm Regards
    B.Dheepa

    Hi,
Else you can implement the logic in this blog:
XI: How to Re-Process Failed XI Messages Automatically
You can schedule the standard reports to automatically release the stuck messages in the queues.
    Regards
    Seshagiri

  • Efficient method to insert large number of data into table

    Hi,
I have a procedure that accepts an input parameter containing comma-separated values.
Something like G12-UHG,THA-90HJ,NS-98039,... There can be more than 90,000 values in that comma-separated input parameter.
What is the most efficient way to do the inserts in this case?
The 3 methods I have in mind are:
1) Get the individual tokens from the CSV and do the inserts in a plain old loop.
2) Use BULK COLLECT & FORALL. However, I don't know how to do this, since the input is not from a cursor but from a parameter.
3) Use table collections. Again, this involves plain old looping through the collection; same as the 1st method.
    Please do suggest the most efficient method.
    Thanks

90,000 values?? What's the data type of the input parameter?
You can use the string-to-row conversion trick if you want and do a single insert:
    SQL> with t as (select 'ABC,DEF GHI,JKL' str from dual)
      2  select regexp_substr(str,'[^,]+', 1, level) list
      3  from t connect by level <= NVL( LENGTH( REGEXP_REPLACE( str, '[^,]+', NULL ) ), 0 ) + 1
      4  /
LIST
--------
ABC
DEF GHI
JKL

Edited by: Karthick_Arp on Feb 13, 2009 2:18 AM
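For comparison, if the caller could also reach the database from application code over JDBC, the loop-per-value cost can be cut with statement batching. This is only a sketch — the table name `item_codes` and column `code` are hypothetical placeholders, not from the original post:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/**
 * Splits a comma-separated parameter and inserts the tokens in JDBC
 * batches rather than one round trip per value.
 */
public class CsvBatchInsert {

    /** Splits the raw parameter into trimmed, non-empty tokens. */
    public static List<String> tokens(String csv) {
        List<String> out = new ArrayList<>();
        for (String t : csv.split(",")) {
            t = t.trim();
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }

    /** Inserts all tokens, flushing a batch every batchSize rows; returns the row count. */
    public static int insertAll(Connection conn, String csv, int batchSize) throws SQLException {
        String sql = "INSERT INTO item_codes (code) VALUES (?)";  // hypothetical table/column
        int n = 0;
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (String token : tokens(csv)) {
                ps.setString(1, token);
                ps.addBatch();
                if (++n % batchSize == 0) ps.executeBatch();  // flush a full batch
            }
            ps.executeBatch();  // flush the remainder
        }
        return n;
    }
}
```

Within PL/SQL itself, the string-to-row query above can likewise feed a single `INSERT ... SELECT`, which avoids row-by-row processing entirely.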

  • Most efficient coding method to get from WFM Digital to Timestamp Array and Channel Number Array - Earn Kudos Here !

    I'm Fetching data from a digitizer card using the niHSDIO Fetch Waveform VI.  After Fetching the data I want to keep the Timestamp and Digital Pattern which I'll stream to TMDS file.
What is the most efficient method of stripping out the arrays I'm interested in keeping?
    The attached VI shows the input format and desired output.  The Record Length is always 1.  I'll be streaming 100,000+ records to file using a producer-consumer architecture with the consumer performing a TDMS write.
    I'm assuming only the WDT Fetch gives you the time from t0.
    Attachments:
    Digital Waveform to Array Coding.vi ‏11 KB

    Hi bmann2000,
    I'm not sure about efficiency but this method definitely works. I've just used a 'Get Digital Waveform Component' function and the 'Digital to Boolean Array' function.
    Hope this helps.
    Chris
    National Instruments - Tech Support
    Attachments:
    Digital Waveform to Array Coding 2.vi ‏15 KB

  • What is the most efficient way of passing large amounts of data through several subVIs?

I am acquiring data at a rate of once every 30 ms. This data is sorted into clusters with relevant information being grouped together. These clusters are then added to a queue. I have a cluster of queue references to keep track of all the queues. I pass this cluster around to the various sub VIs where I dequeue the data. Is this the most efficient way of moving the data around? I could also use "Obtain Queue" and the queue name to create the reference whenever I need it.
Or would it be more efficient to create one large cluster which I pass around? Then I can use unbundle by index to pick off the values I need. This large cluster could have all the values individually, or it could be composed of the previously mentioned clusters (i.e. a large cluster of clusters).

> I am acquiring data at a rate of once every 30 ms. This data is sorted
> into clusters with relevant information being grouped together. These
> clusters are then added to a queue. I have a cluster of queue
> references to keep track of all the queues. I pass this cluster
> around to the various sub VIs where I dequeue the data. Is this the
> most efficient way of moving the data around? I could also use
> "Obtain Queue" and the queue name to create the reference whenever I
> need it.
> Or would it be more efficient to create one large cluster which I pass
> around? Then I can use unbundle by index to pick off the values I
> need. This large cluster can have all the values individually or it
> could be composed of the previously mentioned clusters (i.e. a large
> cluster of clusters).
It sounds pretty good the way you have it. In general, you want to sort these into groups that make sense to you. Then if there is a performance problem, you can arrange them so that it is a bit better for the computer, but let's face it, our performance counts too. Anyway, this generally means a smallish number of groups with a reasonable number of references or objects in them. If you need to group them into one to pass somewhere, bundle the clusters together and unbundle them on the other side to minimize the connectors needed. Since the references are four bytes, you don't need to worry about the performance of moving these around anyway.
    Greg McKaskle
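Greg's point — pass small queue references around instead of the data itself — isn't LabVIEW-specific. As an illustration only (not the original poster's code), the same pattern can be sketched in Java with a shared BlockingQueue reference:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/**
 * Producer-consumer via a shared queue reference: only the cheap
 * reference crosses function boundaries; the data itself is enqueued
 * once and dequeued once, never copied between the two sides.
 */
public class QueueRefDemo {

    /** Moves n small "clusters" from a producer thread to the caller through one shared queue. */
    public static List<double[]> transfer(int n) throws InterruptedException {
        BlockingQueue<double[]> queue = new ArrayBlockingQueue<>(16);

        Thread producer = new Thread(() -> {
            for (int i = 0; i < n; i++) {
                try {
                    queue.put(new double[] {i, i * 0.5});  // enqueue one "cluster"
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });
        producer.start();

        List<double[]> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            out.add(queue.take());   // dequeue on the consumer side
        }
        producer.join();
        return out;
    }
}
```

As in the LabVIEW case, grouping related values into one element per enqueue keeps the number of queue operations (and their overhead) small.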

  • I am giving my old MacBook Air to my granddaughter.  What is the most efficient way to erase all the data on it?

    I am giving my old MacBook Air to my granddaughter.  What is the most efficient way to erase the data?

    You have two options.....
    One is to do a clean reinstall of your OS - if you still have the USB installer that came with your Macbook Air...
The second option is to create a new user (your granddaughter's name)..... Deauthorize your Macbook Air from your iTunes and App Store.....
Restart your Macbook after you've created your granddaughter's user name, log in under your granddaughter's username, and delete your username.
    Search your Macbook for your old files and delete them.....
    Good luck...

  • How to Cool Down a MacBook Pro most efficient?

    Hello,
I bought a MacBook Pro 13" a few days ago, and I'm very happy with the computer! Very good quality! I just have one little problem: when I'm gaming (Flight Simulator, X-Plane), my MacBook gets really hot. I know there are methods to cool down a computer, but what's the best and most efficient method to cool down my MacBook Pro 13" during gaming?
    Hope some one is able to help ! Thank you !
    AMLaursen

No, I just found it on Amazon. I have a Belkin one I picked up at a supermarket near home. I don't game, but I find Skype gets the computer hot, so I use the cooler then.
The one I have brings the temp down, and the fans slow down as well.
I would have thought that they are all very similar, just blowing air under the computer case; the Belkin ones leave a gap under the computer for the hot air to escape, though I'm not sure if that helps or not.
http://www.amazon.co.uk/Belkin-Cooling-Stand-Laptops-17-inch/dp/B001HNOLBI/ref=sr_1_3?ie=UTF8&qid=1400791130&sr=8-3&keywords=laptop+cooler
    It is a case I think of getting one and trying it out.
