Reading/parsing a CSV file in UTF-16?
Hello everyone,
I'm in a rush to modify my current CSV file parser, which works fine for files in UTF-8, so that it can parse UTF-16 as well. As far as I checked the sample plugins, I didn't find any code for this.
Also, how could I support both encodings? To do this I need to recognize the encoding by reading the file first and then decide how to read from the stream. Any advice/snippet will be greatly appreciated.
P.S. I'm using this code to read a file
stream = StreamUtils::CreateFileStreamRead()
stream->XferByte(aChar) // in a loop until an EOL character is found
I need to read 2 bytes at a time; I experimented with XferInt16, but it doesn't seem to do what I want...
Regards,
Kamran
I had forgotten to skip the first two bytes (the byte-order mark) in this case. Now I can read the file properly with XferInt16. You may also need to byte-swap for big-endian files during parsing.
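The resolution above (skip the two-byte BOM, then byte-swap for big-endian data) generalizes into a small encoding sniffer. The original code is InDesign SDK C++; purely as an illustration, here is a hedged Java sketch of BOM detection (the class and method names are mine, not from any SDK):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BomSniffer {
    // Guess the charset from a byte-order mark; fall back to UTF-8 if none is found.
    public static Charset sniff(byte[] head) {
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return StandardCharsets.UTF_16LE; // FF FE: little-endian UTF-16
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16BE; // FE FF: big-endian UTF-16
        }
        // A UTF-8 BOM (EF BB BF) is optional; with or without it, treat the file as UTF-8.
        return StandardCharsets.UTF_8;
    }
}
```

Once the charset is known, the two-bytes-per-character read (and any byte swapping) follows from it, which is exactly the XferInt16-plus-swap logic described above.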
-Kamran
Similar Messages
-
Parsing BLOB (CSV file with special characters) into table
Hello everyone,
In my application, the user uploads a CSV file (stored as a BLOB), which is later read and parsed into a table. The parsing engine is shown below...
The problem is that it won't read national characters such as Ö, Ü etc.; they simply disappear.
Is there any CSV parser that supports national characters? Or, said in other words: is it possible to read a BLOB character by character (where a character can be Ö, Ü etc.)?
Regards,
Adam
/*-----------------------------------------------
| helper function for csv parsing
+-----------------------------------------------*/
FUNCTION hex_to_decimal(p_hex_str in varchar2) return number
--this function is based on one by Connor McDonald
--http://www.jlcomp.demon.co.uk/faq/base_convert.html
is
v_dec number;
v_hex varchar2(16) := '0123456789ABCDEF';
begin
v_dec := 0;
for indx in 1 .. length(p_hex_str) loop
v_dec := v_dec * 16 + instr(v_hex, upper(substr(p_hex_str, indx, 1))) - 1;
end loop;
return v_dec;
end hex_to_decimal;
/*-----------------------------------------------
| csv parsing
+-----------------------------------------------*/
FUNCTION parse_csv_to_imp_table(in_import_id in number) RETURN boolean IS
PRAGMA autonomous_transaction;
v_blob_data BLOB;
n_blob_len NUMBER;
v_entity_name VARCHAR2(100);
n_skip_rows INTEGER;
n_columns INTEGER;
n_col INTEGER := 0;
n_position NUMBER;
v_raw_chunk RAW(10000);
v_char CHAR(1);
c_chunk_len number := 1;
v_line VARCHAR2(32767) := NULL;
n_rows number := 0;
n_temp number;
BEGIN
-- shortened
n_blob_len := dbms_lob.getlength(v_blob_data);
n_position := 1;
-- Read and convert binary to char
WHILE (n_position <= n_blob_len) LOOP
  v_raw_chunk := dbms_lob.substr(v_blob_data, c_chunk_len, n_position);
  v_char := chr(hex_to_decimal(rawtohex(v_raw_chunk)));
  n_temp := ascii(v_char);
  n_position := n_position + c_chunk_len;
  -- When a whole line has been retrieved
  IF v_char = CHR(10) THEN
    n_rows := n_rows + 1;
    if n_rows > n_skip_rows then
      -- Shortened
      -- Perform some action with the line (store into table etc.)
    end if;
    -- Clear out
    v_line := NULL;
    n_col := 0;
  ELSIF v_char != chr(10) and v_char != chr(13) THEN
    v_line := v_line || v_char;
    if v_char = ';' then
      n_col := n_col + 1;
    end if;
  END IF;
END LOOP;
COMMIT;
return true;
EXCEPTION
-- some exception handling
END;
Uploading CSV files into LOB columns and then reading them in PL/SQL: [It's|http://forums.oracle.com/forums/thread.jspa?messageID=3454184]
Re: Reading a Blob (CSV file) and displaying the contents
Re: Associative Array and Blob
Number of rows in a clob doncha know.
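The disappearing Ö/Ü symptom is exactly what byte-at-a-time decoding produces on a multi-byte encoding such as UTF-8: each national character spans two or more bytes, and converting every byte to a character on its own mangles it. A hedged sketch, in Java rather than PL/SQL (in Oracle itself the usual fix is DBMS_LOB.CONVERTTOCLOB with the file's character set), to illustrate the difference:

```java
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Wrong: treat each byte as one character, which is what the chr()-per-byte loop does.
    public static String byteByByte(byte[] raw) {
        StringBuilder sb = new StringBuilder();
        for (byte b : raw) {
            sb.append((char) (b & 0xFF)); // multi-byte sequences become mojibake
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] raw = "Öl;Über".getBytes(StandardCharsets.UTF_8);
        System.out.println(byteByByte(raw));                        // garbled
        System.out.println(new String(raw, StandardCharsets.UTF_8)); // Öl;Über
    }
}
```

The right approach is always to decode a whole buffer (or at least whole character sequences) with the actual encoding, never isolated bytes.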
Anyway, it would help if you gave us some basic information: database version and NLS settings would seem particularly relevant here.
Cheers, APC
blog: http://radiofreetooting.blogspot.com -
Hello everyone,
I’ve been assigned a requirement wherein I need to read around 50 CSV files from a specified folder.
In step 1 I would like to create the schema for these files, i.e. take the CSV files one by one and create a SQL table for each, if it does not already exist at the destination.
In step 2 I would like to append the data of these 50 CSV files into the respective tables.
In step 3 I would like to purge data older than a given date.
Please note, the data in these CSV files will be very bulky; I would like to know the best way to insert bulky data into a SQL table.
Also, in some of the CSV files there will be 4 rows at the top of the file which contain the header details/header rows.
To my knowledge I will be asked to implement this on SSIS 2008, but I'm not 100% sure of it.
So, please feel free to provide multiple approaches if we can achieve these requirements elegantly in newer versions like SSIS 2012.
Any help would be much appreciated.
Thanks,
Ankit
Thanks, Ankit Shah | Inkey Solutions, India | Microsoft Certified Business Management Solutions Professional | http://ankit.inkeysolutions.com
Hello Harry and Aamir,
Thank you for the responses.
@Aamir, thank you for sharing the link. Yes, I'm going to use a Script Task to read the header columns of the CSV files, preparing one SSIS variable which will hold the SQL script to create the required table (with an IF EXISTS check) inside the Script Task itself.
I will have an Execute SQL Task following the Script Task, and this will create the actual table for a CSV.
Both these components will be inside a Foreach Loop container and will process all 50 CSV files one by one.
Some points to be clarified,
1. In the bunch of these 50 CSV files there will be some exceptions for which we first need to purge the tables and then insert the data. Meaning for 2 files out of 50 we need to first clean the tables and then perform the data insert, while the remaining 48 files should be appended on a daily basis.
Can you please advise what is the best way to achieve this requirement? Where should we configure such exceptional cases for the package?
2. For some of the CSV files we will have more than one file with the same name. For example, the 2nd file out of 50 is divided into 10 different CSV files, so in total we have 60 files, of which 10 out of 60 have repeated file names. How can we manage this within the same loop? Do we need one more Foreach loop inside the parent one? What is the best way to achieve this requirement?
3. There will be another package, used to purge data from the SQL tables. Unlike the above package, this one will not run on a daily basis. At some point we would like these 50 tables to be purged with an older-than criterion, say remove data older than 1st Jan 2015. What is the best way to achieve this requirement?
Please know, I'm very new in SSIS world and would like to develop these packages for client using best package development practices.
Any help would be greatly appreciated.
Thanks, Ankit Shah | Inkey Solutions, India | Microsoft Certified Business Management Solutions Professional | http://ankit.inkeysolutions.com
1. In the bunch of these 50 CSV files there will be some exception for which we first need to purge the tables and then insert the data. Meaning for 2 files out of 50, we need to first clean the tables and then perform data insert, while for the rest 48 files, they should be appended on daily basis.
Can you please advise what is the best way to achieve this requirement? Where should we configure such exceptional cases for the package?
How can you identify these files? Is it based on the file name, or is there some info in the file which indicates that it requires a purge? If so, you can pick this information up during the file-name or file-data parsing step and set a boolean variable. Then, in the control flow, have a conditional precedence constraint which checks the boolean variable and, if it is set, executes an Execute SQL Task to do the purge (you can use TRUNCATE TABLE or DELETE FROM TableName statements).
2. For some of the CSV files we would be having more than one file with the same name. Like out of 50 the 2nd file is divided into 10 different CSV files. so in total we're having 60 files wherein the 10 out of 60 have repeated file names. How can we manage this criteria within the same loop, do we need to do one more for each looping inside the parent one, what is the best way to achieve this requirement?
The best way to achieve this is to append a sequential value to the filename (perhaps a timestamp) and then process them in sequence. This can be done prior to the main loop so that you can use the same loop to process these duplicate filenames as well. The best choice would be the file-creation-date attribute value, so that the files get processed in the right order. You can use a script task to get this for each file, as below:
http://microsoft-ssis.blogspot.com/2011/03/get-file-properties-with-ssis.html
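SSIS script tasks are usually written in C# or VB.NET; purely to illustrate the ordering idea outside SSIS, here is a hedged Java sketch that returns a folder's CSV files sorted by their last-modified timestamp (true creation time would need java.nio file attributes, so last-modified stands in for it here):

```java
import java.io.File;
import java.util.Arrays;
import java.util.Comparator;

public class FileOrder {
    // Return the CSV files in a folder ordered by last-modified time,
    // so that split files sharing a base name are processed in sequence.
    public static File[] csvByAge(File dir) {
        File[] files = dir.listFiles((d, name) -> name.toLowerCase().endsWith(".csv"));
        if (files == null) {
            return new File[0]; // directory missing or not readable
        }
        Arrays.sort(files, Comparator.comparingLong(File::lastModified));
        return files;
    }
}
```

The same sort-before-loop idea applies in the SSIS Script Task: compute the ordering once, then let the single Foreach loop consume the files in that order.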
3. There will be another package, which will be used to purge data for the SQL tables. Meaning unlike the above package, this package will not run on daily basis. At some point we would like these 50 tables to be purged with older than criteria, say remove data older than 1st Jan 2015. what is the best way to achieve this requirement?
You can use a SQL script for this. Just call a SQL procedure with a single date parameter and then write logic like below:
CREATE PROC PurgeTableData
@CutOffDate datetime
AS
DELETE FROM Table1 WHERE DateField < @CutOffDate;
DELETE FROM Table2 WHERE DateField < @CutOffDate;
DELETE FROM Table3 WHERE DateField < @CutOffDate;
GO
@CutOffDate denotes the date before which data has to be purged.
You can then schedule this SP in a SQL Agent job to get it executed at your required frequency.
Please Mark This As Answer if it solved your issue
Please Vote This As Helpful if it helps to solve your issue
Visakh
My Wiki User Page
My MSDN Page
My Personal Blog
My Facebook Page -
Hi all,
I want some clarification about CSV files. We use a CSV file to upload data from our front-end site to the DB (Oracle). First we read data from the CSV file via a Java CSV reader, and the problem starts here. The CSV file may contain foreign characters (i.e. Chinese, Spanish, Japanese, Hindi, Arabic), so we need to create the CSV file in UTF-8 format. So:
Step 1: we create an Excel file and save it in CSV format.
Step 2: we open the same file in EditPlus and change the encoding to UTF-8.
Then the file can be read by the Java CSV reader.
However, if I change the second step, i.e. open the file in Notepad and convert it to UTF-8, the same file is not recognized by the Java CSV reader.
Please give some idea how to create a CSV file in the UTF-8 charset.
If your input file is in CSV format, try importing it directly into the database using SQL Developer. If you have a table created already, right-click on the table in the Connections navigator and select Import Data. On the first page of the wizard, select the correct encoding from the Encoding list. You should see the characters in the file displayed correctly at the bottom of the page. Select the other options like format, delimiters and line terminators. When these options are specified correctly, you should see the file displayed as rows and columns at the bottom of the screen. Continue with the import using the wizard. If the table is not already created, you can right-click on the Tables folder in the Connections navigator. The second page of the wizard will allow you to enter a new table name; the wizard will automatically create a column in the table for each column in the file, and you can refine the column definitions in step 4, the Column Definition panel.
Joyce Scapicchio
SQLDeveloper Team -
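A likely reason the Notepad-converted file fails in the Java reader: Windows Notepad prepends a UTF-8 byte-order mark (EF BB BF), which naive readers hand to the parser as part of the first field. A hedged sketch of opening a stream explicitly as UTF-8 and skipping an optional BOM (the helper name is mine):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class Utf8CsvReader {
    // Wrap an input stream as a UTF-8 reader, discarding a leading BOM if present.
    public static BufferedReader open(InputStream raw) throws IOException {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(raw, StandardCharsets.UTF_8));
        in.mark(1);
        if (in.read() != '\uFEFF') {
            in.reset(); // no BOM: rewind so the first real character is not lost
        }
        return in;
    }
}
```

With the BOM stripped, a file saved as UTF-8 by Notepad and one saved by EditPlus should read identically.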
I just want to read in a CSV file and send it to an instrument, one line per second
I want to read in a CSV file that has power levels given at 1 Hz and send each line, using Format Into String, to the signal generator at 1 Hz. I've just never read in an entire file to automate the procedure. Thanks!
duplicate post
Please give time for people to respond. -
hi all,
I am working on an app in which I need to parse a CSV file every hour. The CSV file is of average size. I need to parse the file (I will use a simple StringTokenizer), organise the data in the file (using simple string manipulation) and export it to some format (I'll worry about that later). Now, what's the most efficient and quick way to do this?
And what about the hourly loop, how should I implement that? Please help.
Thanks.
ag2011 wrote:
hi all,
i am working on this app, in which i need a parse a CSV file every 1hr. now the CSV file is average size. i need to parse the file (i will use simple stringtokenizer), organise the data in the file (using simple string manipulation) and export to some format (will worry about later). now whats the most efficient and quick way to do this.
and what about the 1hr loop, how should i implement that. pls help.
thanks.
Hi,
Look at the Quartz API! It has a very efficient job-scheduling engine.
SchedulerFactory schedFact = new org.quartz.impl.StdSchedulerFactory();
Scheduler sched = schedFact.getScheduler();
sched.start();
// create trigger
Trigger trigger = TriggerUtils.makeHourlyTrigger(1); // fire every one hour
JobDetail jobDetail = new JobDetail("myJob", "MyGrp", CSVParser.class); // CSVParser is the class containing your CSV parsing code
//schedule job
sched.scheduleJob(jobDetail, trigger);
I hope it helps.
See the Quartz API for more details!
http://www.opensymphony.com/quartz/
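If pulling in Quartz is more than the job needs, the JDK's own ScheduledExecutorService can also run a parse job hourly. A minimal hedged sketch (the csvParseJob Runnable stands in for the parser class above):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class HourlyRunner {
    // Start a single-threaded scheduler that runs the job immediately,
    // then once every hour at a fixed rate.
    public static ScheduledExecutorService start(Runnable csvParseJob) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(csvParseJob, 0, 1, TimeUnit.HOURS);
        return ses;
    }
}
```

Quartz is the better choice when you need cron expressions, persistence, or clustering; for a plain "every hour" loop inside one JVM this is enough.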
--------Amit
Edited by: AmitChalwade123456 on Jan 5, 2009 10:57 AM -
Hi there,
I have a form sending user input to a CSV file. I am trying to read the data back from the .csv and calculate some feedback. I found a taglib, but first I can't make it work, and second it returns the data as String while I need int so I can calculate with it.
Please give me some advice on the best way to do this.
I wrote CSV writing and parsing libraries for Java:
http://ostermiller.org/utils/CSV.html -
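For the String-to-int part of the question above, no taglib is needed if the fields are plain numbers without embedded commas or quotes; a minimal sketch (the method name is mine):

```java
public class CsvInts {
    // Parse one CSV line of integers and return their sum.
    // Assumes simple fields: no quoting, no embedded commas.
    public static int sumLine(String line) {
        int sum = 0;
        for (String field : line.split(",")) {
            sum += Integer.parseInt(field.trim());
        }
        return sum;
    }
}
```

For real-world CSV with quoting and escapes, a proper parser such as the library linked above is the safer route.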
Hi,
I am doing a File (SalesData.csv) -> PI -> IDoc scenario and have created a DT and MT, and also configured FCC in the sender file channel.
However, when I run the scenario I get an error message in channel monitoring.
CSV File:
12,36,45,78,89
2154,789,65,78,99
This CSV file should be converted into an XML by FCC like:
<MT_SalesData>
<Data>12,36,45,78,89
2154,789,65,78,99</Data>
</MT_SalesData>
I used FCC in the channel like:
Document Name: MT_SalesData
Document Namespace: actual namespace of the MT
Recordset Name: MT_SalesData
Recordset Structure: Data,1
Error in Channel Monitoring:
Conversion initialization failed: java.lang.Exception: java.lang.Exception: java.lang.Exception: Error(s) in XML conversion parameters found: Parameter 'Data.fieldFixedLengths' or 'Data.fieldSeparator' is missing Mandatory parameter 'Data.fieldNames': no value found
What more information should I include in the FCC? If required, I can change the DT.
The entire content of the SalesData.csv file should be placed inside the Data tag, as-is, without any modification.
Thanks
Pankaj
Hi Pankaj,
Your file content conversion configuration is incomplete.
Please check the below link for the additional parameters you need to specify:
[http://help.sap.com/saphelp_nwpi71/helpdata/en/44/682bcd7f2a6d12e10000000a1553f6/frameset.htm]
Please check if you have maintained following parameters:
a. NameA.fieldFixedLengths or NameA.fieldSeparator
b. NameA.fieldNames
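For the goal of landing the whole file in a single Data tag, the conversion still needs both mandatory parameters named in the error. A plausible minimal set (parameter names are per the SAP help page linked above; the values are an assumption to verify against your PI release):

```
Data.fieldNames      line
Data.fieldSeparator  'nl'
```

Note that with a Recordset Structure of Data,* FCC will still emit one Data record per input line; collapsing those into a single tag, if really required, would then be done in the mapping.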
Regards,
Beena. -
Table unable to read data from CSV file
Dear All,
I have created a table which reads data from an external CSV file. The table gives an error if the file is not at the specified location, but when I put the file at that location there is no error, yet no rows are returned.
Please suggest what I should do.
Thanks in advance.
No version.
No operating system information.
No DDL.
No help is possible.
I want to drive home a point here to the many people who write posts similar to yours.
"My car won't start, please tell me why" is insufficient information. Perhaps it is out of gas. Perhaps the battery is dead. Perhaps you didn't turn the key in the ignition.
Parsing a csv file with carriage return replaced with #
Hi,
We have a weird problem. We are able to download a CSV file using the standard FM HTTP_GET. We want to parse the file and upload the data into our SAP CRM system. However, in the downloaded file the carriage returns have been replaced by the character '#', and everything looks like one line.
I understand that the system replaces the carriage return with the character '#'. My question is: if I pass this file to my program to parse the data, will there be any issue with the system recognizing that '#' is a carriage return, i.e. that the data in the file is not 1 record but multiple records?
Hi,
'#' is only what you see in the SAP editor; the actual ASCII value underneath is the carriage return itself. So to identify whether you have multiple records, don't use a hard-coded '#'; instead use the constant CL_ABAP_CHAR_UTILITIES=>CR_LF.
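The same idea expressed in Java terms, as a hedged illustration of Ranganath's point (the original context is ABAP; CL_ABAP_CHAR_UTILITIES=>CR_LF is the ABAP constant for the real control characters):

```java
public class CrLfSplit {
    // '#' in the SAP editor is only a display substitute for control characters;
    // split the file content on the actual CR+LF bytes, never on the '#' glyph.
    public static String[] records(String fileContent) {
        return fileContent.split("\r\n");
    }
}
```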
Regards
Ranganath -
How to read the entire CSV file content
Hi All,
I am using the JDBC adapter and I need to send the entire CSV file to a BLOB-datatype field.
How do I achieve this?
Thanks
Mahi.
Hi Mahi,
So you want to send the content of the entire CSV file to a particular field in the database?
You can write a Java mapping to read the content of the CSV file and then populate it into the database field.
If you need help on java mapping then please provide the sample CSV file and the target JDBC structure. -
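A sketch of that Java-mapping idea with plain JDBC (the table and column names here are invented for illustration, not from the thread): read the whole file into a byte array, then bind it to the BLOB parameter.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class CsvToBlob {
    // Read the whole CSV, unparsed, into memory.
    public static byte[] readAll(Path csv) throws IOException {
        return Files.readAllBytes(csv);
    }

    // Bind the raw bytes to a BLOB column (table/column names are illustrative).
    public static void insert(Connection con, Path csv) throws IOException, SQLException {
        try (PreparedStatement ps =
                 con.prepareStatement("INSERT INTO file_store (file_data) VALUES (?)")) {
            ps.setBytes(1, readAll(csv)); // the driver streams the bytes into the BLOB
            ps.executeUpdate();
        }
    }
}
```

In a PI Java mapping the file bytes arrive on the input stream rather than from a Path, but the setBytes binding is the same.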
Hi Guys,
I am trying to read data from a CSV file character by character. What's the best way to do this? Any examples around?
Thanks
tzaf
Does this mean your file will have multiple lines, with each line indicating a new record? If so, you should use a BufferedReader and read in each line as a String. Then you can use a StringTokenizer to separate the string into tokens, using the comma as your delimiter. From there you can convert the string tokens into whatever form you like, though by default they are already Strings. To make one an Integer I'd use Integer.parseInt().
I will show you some code on how to get the values from your file using the BufferedReader and the StringTokenizer, but what you do with those values afterwards I'm going to leave up to you.
// requires imports: java.io.*, java.util.StringTokenizer
File csvfile = new File("myfile.csv");
try {
    BufferedReader fileIn = new BufferedReader(new FileReader(csvfile));
    String readLine; // holds one line from the file
    while ((readLine = fileIn.readLine()) != null) {
        StringTokenizer tokens = new StringTokenizer(readLine, ",", false);
        // false means the commas are used only as delimiters, not returned as tokens
        while (tokens.hasMoreTokens()) {
            String aValue = tokens.nextToken();
            // ... do what you want with the value
            // ... change to Integer or whatever
            System.out.println(aValue); // print the value to the screen
        }
    }
    fileIn.close();
} catch (IOException e) {
    e.printStackTrace();
}
Good luck,
.kim -
How to Read excel or .csv files in java
I am writing a program which takes an Excel or .csv file as input.
How do I read these files?
Are there existing APIs, or do I need a third-party jar?
Please advise.
Thanks & Regards
Did you search on Google? Did you search here? There are so many Excel-related questions here, including answers about third-party libraries.
I have the impression that you didn't research at all.
_[How to ask questions|http://faq.javaranch.com/view?HowToAskQuestionsOnJavaRanch]_ It's the same here. -
Reading/Parsing an EXE file with Java
Hey guys,
Is there a way (in Java) to parse an EXE file and get its version, description, etc? (mostly the information in the VERSION tab inside the file properties window).
Thanks.
I can't find a thing about that. :(
Perhaps I'll tell you what I need that for:
I got tired of rearranging my start menu and drag-n-dropping every time I install a new program, so I wanted to create an application that scans a folder of my choice (i.e. c:/progra~1) and creates a folder in the start menu with shortcuts to all of the EXEs in the folder and in its subfolders, arranged by application name (some applications have more than one EXE, including uninstall or update programs) and usage (that I'll do at the end, based on rules and lists I'll create from DOWNLOAD.COM, for instance). I've done everything but the usage-type filtering (putting them into folders: "system", "media", "internet", etc).
So my problem was, that the shortcuts' names are ugly most of the times, and I can't tell the full name of an application just from its filename. For example, Google Earth's main EXE is "googleearth.exe". A shortcut that says "Googleearth" isn't so nice to look at. also, with this kind of name you can know what it does, but what about other filenames that don't exactly say what that file does?
I needed a way to get the -true- name of the application file, and the only way I see is through the properties, in the "Description" field under "Version".
But alas, that's not so simple :P
I thought about simply getting the name from the folder the EXE's in, but then there are more than one EXE per folder.
Any other suggestions will be great.
Thanks again, guys. -
Reading a Blob (CSV file) and displaying the contents
Hello Experts,
I’m currently working on a system that allows users to upload an Excel spreadsheet (.xls). The upload page is a PL/SQL cartridge. I’ve written a Java servlet (using Oracle Clean Content) to convert the XLS into a CSV and store it back in the database (it is stored in the “uploaded_files” table as a BLOB). I’m trying to create another procedure to read the contents of the BLOB and display a preview of the data on screen, in an HTML table (also done with a cartridge). After the preview, the user can choose to submit the data into the “detail_records” table or simply discard everything.
Can anyone provide me any guidelines and/or code samples for reading the contents of a blob and displaying it as an html table? My data contains about 10 rows and 20 columns in the spreadsheet. I’ve been looking into using associative arrays or nested tables but I’m getting confused about how to implement them in my situation.
Any help is greatly appreciated.
BluShadow,
Thanks for your response. The reason it is a BLOB is that we use the same table for uploaded files across several applications. Some files are text, some Excel, some other formats; hence the table has been set up generically.
However, in my procedure, I'm using
DBMS_LOB.createtemporary and
DBMS_LOB.converttoclob
Is there any way you could possibly provide an example of how to read the contents of the file?