Schema advice for a huge CSV file

Guys, I need some advice: I had a huge CSV file (500 million rows) to load into a table, and I did. But now I need to alter the columns (they all came in as varchar(50)). I'm just changing one column and it's taking ages... what kind of schema should I adopt? So far I've applied
a simple data flow, but I'm wondering if I should do something like:
drop table
create table (all varchar)
data flow
alter table
Not sure about it.

Is this a once-off/ad-hoc load, or something that'll be ongoing/BAU?
If it's ongoing, then Arthur's post is the standard approach:
create a staging table with varchar(50) columns or whatever fits. Load into that, then from the staging table go into your 'normal' table that has the correct column types.
If it's a once-off, what I'd do is create a new table with the correct data types and do a bulk insert from your table with 500 million rows,
then drop the old table and rename the new one.
Converting the columns in your 500-million-row table one by one is going to take a very long time; it'll be faster to do one bulk insert into a table with the correct schema.
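For the once-off route, the steps above can be sketched in T-SQL (a sketch only: the table and column names here are invented, and TRY_CONVERT assumes SQL Server 2012 or later):

```sql
-- New table with the correct data types (names are illustrative)
CREATE TABLE dbo.BigTable_new (
    id     BIGINT        NOT NULL,
    label  VARCHAR(50)   NOT NULL,
    amount DECIMAL(18,2) NULL
);

-- One bulk insert, converting as we go; TABLOCK helps enable minimal logging
INSERT INTO dbo.BigTable_new WITH (TABLOCK) (id, label, amount)
SELECT CAST(id AS BIGINT),
       label,
       TRY_CONVERT(DECIMAL(18,2), amount)  -- yields NULL on bad values instead of failing
FROM dbo.BigTable;

-- Swap: drop the old table and rename the new one
DROP TABLE dbo.BigTable;
EXEC sp_rename 'dbo.BigTable_new', 'BigTable';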
Jakub @ Adelaide, Australia Blog

Similar Messages

  • Creating a header for a .CSV file

    Hi,
    I have looked through the forums and cannot find a solution for creating a header for a CSV file. I am using LabVIEW 8.2. I want a label for each column at the top of the file, and then I will append new rows to the file as the data is collected. An example of what I am looking for is attached.
    It would also be nice to have the labels descending in the first column, pretty much a transposed version of what was described above. 
    Thanks,
    Gary 
    Attachments:
    exampleLOG.csv (1 KB)

    Thank you very much. That worked well.
    If I wanted to transpose the data, how would I do that? I can get the header to be vertical, but I can't get the data to append to the 2nd column, then the third, and so on, with the data descending from top to bottom. I attached an example of what I might want the file to look like. Each column would be added one at a time.
    Attachments:
    data3.csv (1 KB)

  • Dynamic Resource name for Excel/CSV files

    Hi All,
    Reading from a source Excel (as well as CSV) file, I tried passing the resource name via a variable set to contain the name of the file (refresh). However, on execution, the file name shows up with an additional "\" character that was never specified in the variable text, causing the interface to fail on the load step.
    The error says:
    ODI-1227: Task SrcSet0 (Loading) fails on the source FILE connection <ConnNameHere>.
    Caused By: java.sql.SQLException: ODI-40438: File not found: C:\DirectoryNameHere\ /OTD.
    Is this expected behavior where one needs to escape the special character?
    I've verified that the extra / is not present in either the variable value or the directory for the Physical Schema.
    TIA.


  • XML publisher report not generating output for huge XML files

    Changed Depreciation Projections Report output type to XML.
    Defined a Data Definition and a new Data Template (RTF) for this report.
    Ran the Depreciation Projection Report to generate the XML output.
    Ran the XML Report Publisher report to generate the PDF/Excel output of the above report.
    Output is generated for smaller XML files, but when the XML is big, the program runs for hours without generating the output.
    The RTF template is basically a matrix report in which the number of columns depends on the number of periods the report is run for.
    It does not work in the desktop version either; the system hangs when I try to preview the PDF.
    The XML file size is approximately 33 MB.
    Please let me know if there is any way we can increase the memory size to see the output.
    Thanks,
    Ram.

    For Publisher questions, use the category: E-Business Suite

  • Huge CSV File

    I'm trying to read the first double value of every line of a gigantic CSV file in an efficient manner. Using "readLine()" creates a gigantic String that is discarded right after the first value is parsed out. This seems incredibly inefficient, and it's taking about 30 minutes just to complete an analysis of this file. Is there any way to just grab the values out?
    I've tried reading one byte at a time, grabbing the first value until I reach a comma, then reading bytes until I reach the end of the line. But this has the obvious disadvantage of reading byte by byte, and an inherent slowness to it.
    Any solutions to this? Anything in NIO?
    -Jason Thomas.

    this:
    http://ostermiller.org/utils/CSVLexer.html
    Works nicely.
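    For reference, the byte-by-byte idea in the question can be made fast simply by buffering the input and never building a String for the remainder of each line. A minimal sketch (class and method names are mine, not from the thread):

```java
import java.io.*;
import java.util.*;

public class FirstFieldReader {
    // Collects the first comma-separated value of each line without
    // allocating a String for the rest of the line.
    public static List<Double> firstValues(Reader source) {
        List<Double> values = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean skipping = false; // true once the first comma is passed
        try (BufferedReader in = new BufferedReader(source, 1 << 16)) {
            int c;
            while ((c = in.read()) != -1) {
                if (c == '\n') {
                    if (field.length() > 0) values.add(Double.parseDouble(field.toString()));
                    field.setLength(0);
                    skipping = false; // next line starts a fresh first field
                } else if (!skipping) {
                    if (c == ',') skipping = true;
                    else if (c != '\r') field.append((char) c);
                }
            }
            if (field.length() > 0) values.add(Double.parseDouble(field.toString()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        String csv = "1.5,foo,bar\n2.25,baz\n3.0,x,y\n";
        System.out.println(firstValues(new StringReader(csv))); // [1.5, 2.25, 3.0]
    }
}
```

    NIO with a memory-mapped buffer can go further, but buffered character reads alone usually remove most of the cost of per-line String allocation.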

  • Schema (xsd) for CVT Tag File

    Where can I find the .xsd file for tag files loaded to/stored from the Current Values Table (CVT) library?
    I have a pressing situation.  I am the software lead for a machine control project that relies heavily on customer configurability and tag management.  I proposed an architecture that included use of CVT, CCC and the TCE to manage configurations; only to find out the update to TCE v2.x broke XML compatibility with CVT.  The project at NI to update the CVT to remedy this doesn't seem to be making much progress, so I have to fashion a workaround to meet my timeline.  My plan is to use an open source or COTS XML tool to manage configuration instead of the TCE.  However, I need a copy of the xsd to make a solution that's consumable for the customer.
    Where do I find this?
    Side note: I've noticed similar questions raised for other LabVIEW components. I would suggest posting standards and schemas in a document area of decibel.ni.com.

    deepforge,
    Have you already read the following information? http://zone.ni.com/reference/en-XX/help/371361J-01/lvconcepts/converting_data_to_and_from_xml/
    At the end it tells the following:
    LabVIEW XML Schema
    LabVIEW converts data to an established XML schema. Currently, you cannot create customized schemas, and you cannot control how LabVIEW tags each piece of data. Also, you cannot convert entire VIs or functions to XML.
    The predefined XML schema that LabVIEW uses is LVXMLSchema.xsd located in the labview\vi.lib\Utility directory. You can open the file in a text editor to read the schema.

  • How to indicate the schema path for an XML file when parsing?

    I have to validate an XML file. In the header line of this file, I need to specify only the name of the schema, not the full path:
    <DATAMODULES xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="Datamodules.xsd">
    When validating, is there any way to tell the parser the correct location of the XSD file?
    I know I could copy the XSD file to the XML file directory or vice versa, but I cannot do that and I need a software solution.
    Could someone help me?

    An External Parsed Entity could be used to reference a schema.
    In your DTD:
    <!ENTITY datamodules SYSTEM "file:///c:/Datamodules/Datamodules.xsd">
    Refer the external entity in xml document:
    <datamodules>&datamodules;</datamodules>
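    Alternatively (a sketch not from the thread; the tiny inline schema below is a stand-in for the real Datamodules.xsd), javax.xml.validation lets you load the XSD from whatever location you choose, so the noNamespaceSchemaLocation hint inside the document never has to be a full path:

```java
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import java.io.StringReader;

public class ValidateWithExplicitXsd {
    // A trivial schema standing in for the real Datamodules.xsd.
    public static final String XSD =
        "<?xml version=\"1.0\"?>" +
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">" +
        "  <xs:element name=\"DATAMODULES\" type=\"xs:string\"/>" +
        "</xs:schema>";

    // Validates the document against a schema we locate ourselves; the
    // xsi:noNamespaceSchemaLocation hint inside the XML is not consulted.
    public static boolean isValid(String xml, String xsd) {
        try {
            SchemaFactory f = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = f.newSchema(new StreamSource(new StringReader(xsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false; // parse or validation failure
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<DATAMODULES>hello</DATAMODULES>", XSD)); // true
        System.out.println(isValid("<WRONG/>", XSD)); // false
    }
}
```

    In practice you would pass the schema as a StreamSource over the real file path on disk rather than an inline string.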

  • Partition Scheme advice for external HDD & Missing HDD

    I have two single partitioned external WD My Book HDDs and I've just noticed they have different setups on the single partition - this is unintentional.
    1.GUID Partition Table
    2.Master Boot Record
    3.Apple Partition Map
    From the list above one is 2. the other is 3. What should they be? One is HFS+ Journaled while the other is HFS+. Reading the notes against the partition types I can't decide what they should be as it's not a boot disc for the iMac.
    This problem may be related to the above?
    I installed 10.4.8 yesterday (after the iLife update the day before); then, having earlier formatted one of the external My Books, I ran SuperDuper to create a backup of user files. This morning I found I couldn't launch any apps from the Dock or directly.
    So I rebooted and all seemed well, except I couldn't sleep or shut down the backup My Book; the other My Book was fine, and they're both on the same FireWire hub. It unmounted OK but wouldn't power down manually until I pulled the PSU. An hour later I tried switching it on: it powers up OK but isn't mounting.
    Any ideas? Disk Utility reports it OK and unmounted, so it can be seen by OS X.
    iMac Core Duo 1GB   Mac OS X (10.4.7)   iBook G3-800 & Airport §

    Why just one?
    Shopping list I can think of:
    Vista: NTFS for backups and storage. Can be read, but not written, by Mac OS.
    MacDrive 7: can access HFS+ from within Windows (but not x64 versions)
    WinClone 1.5: to make backups of Windows from within Mac OS (requires disk image or partition to hold volume image)
    SuperDuper: essential for OS X backups and bootable clone.
    A drive that is used for emergency only and maybe backup partition, separate from your OS X and windows clone drive.
    You can create large partitions with Disk Utility to be FAT32. Read and write in both, but not the best file system and limited to files less than 4GB.
    FW800 would be fast and efficient interface. Also, eSATA if possible.

  • Advice for cleaning up files from my drive

    I'm going through old directories and want to delete what I do not need from old projects and transfer what I do to other drives.
    Is there any reason to keep:
    1.) Thumbnail Cache Files?
    2.) Render Files that have subfolders called Constant Frames?
    3.) I found a whole bunch of 2 GB files called Sequence 1-av1, Sequence 1-av2, etc. These files will not open when double-clicked; a message indicating "File Error: Wrong type." displays, and when I drag them into the QT player, it indicates "The movie could not be opened. The file is not a movie file." What are these? Can I delete them?
    Thanks.

    What's up, Nick?
    If I delete my Compressor files from my external hard drive, will my project files (my movie) still be intact without any changes at all?
    How about my Color files? Can I delete them from my external HD without altering my project files (my movie)?
    I'm working on a very short movie, and I have a 1 TB external, but it's almost full from all my different compressions and color gradings.
    Also, can I delete older saved versions of my project without affecting my latest version?
    Thanks,
    Trey

  • How to plot a line chart for huge XML files?

    Guys,
    I would like to plot files of over 2,000 lines in a line chart,
    but the application is getting very slow!
    Does anyone have any tips to improve performance?

    Can we see how you implement the LineChart and bind the XML, please?
    It should normally be fast.

  • JDBC wrapper for CSV files?

    I wrote my own method to read CSV files into a table structure (String[][]). For big CSV files, I added several features to ignore data lines that have specific values. All this looks quite similar to a database table that I run a SELECT * against and then reduce the resulting rows via WHERE-clause criteria. So I wonder: is there already such a JDBC wrapper around CSV files?

    Yes. I believe the JDBC-ODBC bridge can use an Excel URL to read in a CSV. Though don't quote me on that one.
    However, why not simply use your RDBMS data-import utility? You can invoke it from a scheduler or from Runtime.exec(). It should perform MUCH better than middleware for a huge CSV file. If manipulation needs to occur for the data, write it first to a temp table, then manipulate it.
    - Saish
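    Whatever driver is chosen, the SELECT */WHERE analogy from the question can also be sketched directly on the String[][] structure, with no JDBC involved (class and method names are mine):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class CsvTable {
    // Keeps only the rows matching the predicate, mimicking a WHERE clause.
    public static String[][] where(String[][] rows, Predicate<String[]> clause) {
        List<String[]> kept = new ArrayList<>();
        for (String[] row : rows) {
            if (clause.test(row)) kept.add(row);
        }
        return kept.toArray(new String[0][]);
    }

    public static void main(String[] args) {
        String[][] table = {
            {"100", "Ram", "20000"},
            {"101", "Shyam", "25000"},
        };
        // SELECT * ... WHERE salary > 21000 (column index 2 holds the salary)
        String[][] result = where(table, r -> Integer.parseInt(r[2]) > 21000);
        System.out.println(result.length + " row(s): " + Arrays.toString(result[0]));
    }
}
```

    A Predicate<String[]> per "WHERE clause" keeps the filtering logic pluggable, much like composing SQL conditions.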

  • Dashboard for csv file monitoring

    Can someone explain to me how to add a widget to a dashboard for only the alerts generated by a specific rule, or for text/CSV file monitoring?

    Here is a post about custom alert views which should help you set up the view you want.
    http://social.technet.microsoft.com/Forums/systemcenter/en-US/3c391854-ec0c-4114-8e63-b7511cb79913/scom-2012-custom-alert-view-or-condition?forum=operationsmanagergeneral#90577c8c-a80c-40cf-81b8-bce3b1a7ae9a
    Cheers,
    Martin

  • JDBC driver for CSV files

    Do you know of a free driver for using CSV files as databases in Java?

    I've tried this one and it seems to work. It's read-only, though.
    http://csvjdbc.sourceforge.net/
    Col

  • Reading a CSV file using a file adapter

    Hi,
    I am working on SOA 11g. I am reading a CSV file using a file adapter. Below are the file contents and the XSD generated by JDeveloper.
    .csv file:
    empid,empname,empsal
    100,Ram,20000
    101,Shyam,25000
    XSD generated by JDeveloper:
    <?xml version="1.0" encoding="UTF-8" ?>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:nxsd="http://xmlns.oracle.com/pcbpel/nxsd" xmlns:tns="http://TargetNamespace.com/EmpRead" targetNamespace="http://TargetNamespace.com/EmpRead" elementFormDefault="qualified" attributeFormDefault="unqualified"
    nxsd:version="NXSD"
    nxsd:stream="chars"
    nxsd:encoding="ASCII"
    nxsd:hasHeader="true"
    nxsd:headerLines="1"
    nxsd:headerLinesTerminatedBy="${eol}">
    <xsd:element name="Root-Element">
    <xsd:complexType>
    <xsd:sequence>
    <xsd:element name="Child-Element" minOccurs="1" maxOccurs="unbounded">
    <xsd:complexType>
    <xsd:sequence>
    <xsd:element name="empid" type="xsd:string" nxsd:style="terminated" nxsd:terminatedBy="," nxsd:quotedBy="&quot;" />
    <xsd:element name="empname" minOccurs="1" type="xsd:string" nxsd:style="terminated" nxsd:terminatedBy="," nxsd:quotedBy="&quot;" />
    <xsd:element name="empsal" type="xsd:string" nxsd:style="terminated" nxsd:terminatedBy="${eol}" nxsd:quotedBy="&quot;" />
    </xsd:sequence>
    </xsd:complexType>
    </xsd:element>
    </xsd:sequence>
    </xsd:complexType>
    </xsd:element>
    </xsd:schema>
    For empname I have added minOccurs="1". Now when I remove the empname column, the CSV file still gets read by the server without giving any error.
    Now, I created the following XML file and read it through the file adapter:
    <?xml version="1.0" encoding="UTF-8" ?>
    <Root-Element xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://TargetNamespace.com/EmpRead xsd/EmpXML.xsd" xmlns="http://TargetNamespace.com/EmpRead">
    <Child-Element>
    <empid>100</empid>
    <empname></empname>
    <empsal>20000</empsal>
    </Child-Element>
    <Child-Element>
    <empid>101</empid>
    <empname>Shyam</empname>
    <empsal>25000</empsal>
    </Child-Element>
    </Root-Element>
    When I removed the value of empname, it threw the proper error for the above XML.
    Please tell me why the behaviour of the file adapter is different for the CSV file and the XML file in this case.
    Thanks


  • Error while creating table from csv file

    I am getting an error while creating a table using the 'Import Data' button for a CSV file containing 22 columns and 8 rows. For the primary key, I am using an existing column 'Line' and the 'Not generated' option.
    ORA-20001: Excel load run ddl error: drop table "RESTORE" ORA-00942: table or view does not exist ORA-20001: Excel load run ddl error: create table "RESTORE" ( "LINE" NUMBER, "PHASE" VARCHAR2(30), "RDC_MEDIA_ID" VARCHAR2(30), "CLIENT_MEDIA_LABEL" VARCHAR2(30), "MEDIA_TYPE" VARCHAR2(30), "SIZE_GB" NUMBER, "RDC_IMG_HD_A" NUMBER, "START_TECH" VARCHAR2(30), "CREATE_DATE" VARCHAR2(30), "RDC_MEDIA_DEST" VARCHAR2(30), "POD" NUMBER, "TAPE" NUMBER, "ERRORS_YN" VA
    Any idea?

