Flat Files - Column Concatenation
I was wondering whether anyone has written, or knows of, a list of handy hints for dealing with flat files in SSIS, so as to avoid the common pitfalls of loading data from them. In my experience flat files can be quite a headache. I have worked with them for a while now and I'm relatively comfortable with them, but occasionally I hit stubborn issues. My latest challenge is that two of the columns from a CSV file are being concatenated in the intended destination. Please see the illustration below.
The surprising thing is that the concatenated columns have a comma separating them.
In the affected rows the remaining columns are shifted out of their proper mapping.
In the illustration below, the first two rows are the offending rows and the last two rows show what the expected results should look like.
The data type is VARCHAR on all columns.
ProductCode  ProductName  Quantity  BestBeforeDate
NULL         BRD          Bread     13,2014-03-06
NULL         MLK          Milk      5,2014-03-15
BTR          Butter       4         2014-05-02
EGG          Eggs         12        2014-03-12
These are the things I have tried so far, without success:
Ticking the “Retain NULL Values” box on the Flat File Source component
Using double quotes for Text Qualifier when exporting data to CSV
Using double quotes for Text Qualifier when loading data from CSV
Not using anything for Text Qualifier
Exporting data to a Raw File destination and importing data from a Raw File source, hoping that the raw file might preserve the original format.
Please note that none of the values in the source columns being exported contain commas. The values in all columns have no special characters; they are simple alphanumeric values.
Suggestions will be warmly welcome.
Many thanks,
Mpumelelo
Thank you for your responses
Jonathan: I've never used a hex editor before. Is there any way to do this other than with a hex editor?
B3nt3n: "What is your delimiter?" I don't know if I understand your question correctly. I am using the default settings on the Flat File component. The only changes I have made are adding the double quotes to the Text Qualifier fields and ticking the “retain null values …” option. Everything else is default.
"A csv would have commas. But you are saying there are no commas when you open the .csv in notepad?"
I meant that there are no commas in the original data values as they are in the table before being exported to the CSV file. That is, none of the values in any given column has a comma in it. There are, however, commas in the CSV file itself, as you have rightly said about CSV file formats, just not in the table holding the data that I am dealing with.
Mpumelelo
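For anyone who would rather not use a hex editor, a short script can show which lines of the file carry more delimiters than the header. The sketch below is in Python rather than SSIS; the function name and the sample bytes are made up for illustration, and it assumes a comma delimiter with no line breaks inside quoted fields. As far as I know, when a row has one delimiter too many, a flat file reader that ends the last column at the row delimiter will leave the extra comma inside that last column, which would match the symptom described above.

```python
import csv
import io


def find_misaligned_rows(raw: bytes, delimiter: str = ",", encoding: str = "utf-8"):
    """Return (header, bad_rows); bad_rows lists (line_number, fields) for every
    row whose field count differs from the header's."""
    text = raw.decode(encoding, errors="replace")
    # newline="" keeps the original line endings so the csv module handles them.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    header = next(reader)
    bad_rows = []
    for line_number, fields in enumerate(reader, start=2):
        if len(fields) != len(header):
            bad_rows.append((line_number, fields))
    return header, bad_rows


# Hypothetical sample: the second line has a stray leading comma.
sample = (
    b"ProductCode,ProductName,Quantity,BestBeforeDate\r\n"
    b",BRD,Bread,13,2014-03-06\r\n"
    b"BTR,Butter,4,2014-05-02\r\n"
)
header, bad_rows = find_misaligned_rows(sample)
for line_number, fields in bad_rows:
    print(line_number, fields)
```

Running it against the real export (read with open(path, "rb").read()) would list only the offending lines, which is usually enough to see where the extra delimiter comes from.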
Similar Messages
-
Check flat file column list and column types
Hi guys!
Is there any "easy" way to check if the source flat file column names and column types correspond to target datastore column name and types ?
Regards,
PsmakR
Hi,
There is an approach I have used before to validate that the data is what the target expects.
Conditions:
1) The source file must have all columns defined as "String"
2) All mappings for the analysed columns must be executed in the "staging area"
How I do it (the Oracle way):
1) Create a database function (through an ODI procedure) like:
create or replace function F$_DATATYPE (pData in varchar2, pDatatype in varchar2, pFormat in varchar2)
return varchar2 as
  vDate date;
  vNumber number;
BEGIN
  -- The converted value is discarded; only success or failure matters.
  if pDatatype = 'D' then /* Date */
    vDate := to_date(pData, pFormat);
  elsif pDatatype = 'N' then /* Number */
    if pFormat is null then
      vNumber := to_number(pData);
    else
      vNumber := to_number(pData, pFormat);
    end if;
  end if;
  return 'OK';
EXCEPTION
  When OTHERS then
    return 'KO';
end F$_DATATYPE ;
3) Now you can create a constraint on each source column whose data you wish to validate, like:
'OK' = F$_DATATYPE(my_source_column, 'D', 'ddmmyyyy hh24:mi:ss' ) /* a date column, as an example */
4) Drag and drop the source datastore (the table from the model) into a package, and an E$ table containing all the errors will be created.
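The same "try the conversion and report OK or KO" idea can be checked outside the database. This is only a sketch in Python, not part of ODI: check_datatype is a made-up name, and the date format is a Python strptime pattern rather than an Oracle format mask, so 'ddmmyyyy hh24:mi:ss' becomes '%d%m%Y %H:%M:%S'.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation


def check_datatype(value: str, datatype: str, fmt: str = "") -> str:
    """Return 'OK' if value converts to the requested type, otherwise 'KO'."""
    try:
        if datatype == "D":  # date: fmt is a strptime pattern, not an Oracle mask
            datetime.strptime(value, fmt)
        elif datatype == "N":  # number
            Decimal(value)
        return "OK"
    except (ValueError, InvalidOperation):
        return "KO"


print(check_datatype("06032014 13:45:00", "D", "%d%m%Y %H:%M:%S"))
print(check_datatype("12,5x", "N"))
```

As in the database function, an unknown datatype code falls through and returns 'OK'.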
Does it help you? -
Export table to flat file and need to insert sysdate in flat file column
Hi, I created an interface to export an Oracle table to a CSV file. All of the table columns work well. Now I need to insert the sysdate into a column of the CSV file.
I set the mapping to execute in the staging area, with the implementation to_char(sysdate,'dd/mm/yyyy'), but the result is that 14 is inserted into the column.
I have tried creating a variable refreshed with select to_char(sysdate,'dd/mm/yyyy') from dual and mapping it to the column in the CSV file, but the value is inserted into only 1 row and the format is yyyymmdd.
I have also tried SELECT '<%=odiRef.getSysDate( "yyyyMMdd")%>' from dual for the variable, and it too inserts into only one row of the flat file.
I used the same approach in ODI 10g and it works fine.
So I am wondering how it can be implemented in 11g.
Thanks
The first option you have stated seems like the obvious choice; I don't see any reason why it shouldn't work. What do you mean by "the result is that 14 is inserted into the column"? Do you mean that it inserted the string "14" (I can't imagine why this would be the case) or that it inserted 14 rows?
-
Split flat file column data into multiple columns using ssis
Hi All, I need some help with SSIS.
I have a source file with a single column, Column1. I want to split the Column1 data into multiple columns wherever there is a semicolon (';'); there is no fixed length between the semicolons. For example:
Column1:
John;Sam;Greg;David
At the destination we have 4 columns, say D1, D2, D3, D4.
I want to map
John -> D1
Sam -> D2
Greg -> D3
David -> D4
Please, I need it ASAP.
Thanks in Advance,
RH
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
Imports System.IO

Public Class ScriptMain
    Inherits UserComponent

    Private textReader As StreamReader
    Private exportedAddressFile As String

    ' Get the file path from the connection manager named Connection.
    Public Overrides Sub AcquireConnections(ByVal Transaction As Object)
        Dim connMgr As IDTSConnectionManager90 = _
            Me.Connections.Connection
        exportedAddressFile = _
            CType(connMgr.AcquireConnection(Nothing), String)
    End Sub

    Public Overrides Sub PreExecute()
        MyBase.PreExecute()
        textReader = New StreamReader(exportedAddressFile)
    End Sub

    ' Read the file line by line and emit one output row per line.
    Public Overrides Sub CreateNewOutputRows()
        Dim nextLine As String
        Dim columns As String()
        Dim cols As String()
        Dim delimiters As Char()
        delimiters = ",".ToCharArray
        nextLine = textReader.ReadLine
        Do While nextLine IsNot Nothing
            ' columns(0) is the ID, columns(1) the semicolon-separated list.
            columns = nextLine.Split(delimiters)
            With Output0Buffer
                cols = columns(1).Split(";".ToCharArray)
                .AddRow()
                .ID = Convert.ToInt32(columns(0))
                ' Fill only as many output columns as there are values.
                If cols.GetUpperBound(0) >= 0 Then
                    .Col1 = cols(0)
                End If
                If cols.GetUpperBound(0) >= 1 Then
                    .Col2 = cols(1)
                End If
                If cols.GetUpperBound(0) >= 2 Then
                    .Col3 = cols(2)
                End If
                If cols.GetUpperBound(0) >= 3 Then
                    .Col4 = cols(3)
                End If
            End With
            nextLine = textReader.ReadLine
        Loop
    End Sub

    Public Overrides Sub PostExecute()
        MyBase.PostExecute()
        textReader.Close()
    End Sub
End Class
Put this code in your script component. Before that, add 5 columns to the script component output and name them ID, Col1, Col2, Col3 and Col4; ID is of data type int. Create a flat file connection, name it Connection, and point it to the flat file that is the source.
I'm not sure what the delimiter between the 2 columns is in your flat file. I have used a comma; change it accordingly.
This is the output I get:
ID  Col1   Col2  Col3   Col4
1   john   Greg  David  Sam
2   tom    tony  NULL   NULL
3   harry  NULL  NULL   NULL
-
SSIS Flat File Source not populating varchar column
I have an SSIS package that imports flat files with the column separator | from a "Flat File Source" into the database through an "OLE DB Destination".
For one of the columns (the file contains UIDs in curly brackets) the destination column is varchar(250), nullable. When "Retain nulls" is set to true the complete data is imported, but when "Retain nulls" is set to false the value is not imported and the destination column contains an empty string, despite the fact that the data in the file is neither null nor an empty string.
I know that "Retain nulls" applies to columns that contain NULLs, but that is not the case here.
If anyone has experience with this issue, please help.
Thank you in advance.
From your statement it looks as though a value with curly brackets is skipped and a null is inserted instead. That may be the reason it works fine when you change the retain null setting.
I would suggest:
Set "Retain nulls" to true and load the flat file, then check what happens to the values with curly braces (is there a value or a null?). If there is a value, we need to look into that (please share a sample file). If there is a null instead of "{value}", I would suggest adding a script task to remove the { from your flat file and then trying to load the data again.
Hope this helps.
Regards, -Amit -
How to extract required data from a column to a flat file
My SSIS package is working OK. However, I want to refine the extraction of one of the columns.
When data is extracted to the flat file, I want just the initials, first name and last name, e.g.
FZ = Ben Smith, Add1, add1, etc
The only bit that I want is Ben Smith.
How can I tell the package to give me just the name and exclude the rest?
sukai
Add a Derived Column transformation to extract the name part alone, with the expression below:
LEFT([ColumnName],FINDSTRING([ColumnName],",",1)-1)
If you are on a version before SSIS 2012, use SUBSTRING instead:
SUBSTRING([ColumnName],1,FINDSTRING([ColumnName],",",1)-1)
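One thing to watch with both expressions: they assume every row contains a comma. As far as I know FINDSTRING returns 0 when nothing is found, which would make the requested length -1. The logic, with a guard for that case, can be sketched as follows; this is Python for illustration only, and text_before_first_comma is a made-up name.

```python
def text_before_first_comma(value: str) -> str:
    """Return the text before the first comma, or the whole value if there is none."""
    position = value.find(",")  # -1 when no comma is present
    if position == -1:
        return value.strip()
    return value[:position].strip()


print(text_before_first_comma("Ben Smith, Add1, add1"))
```

In the package itself the equivalent guard would be a conditional around the FINDSTRING result.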
Please Mark This As Answer if it helps to solve the issue Visakh ---------------------------- http://visakhm.blogspot.com/ https://www.facebook.com/VmBlogs -
SSIS : How to create Column Header dynamically using expression in Flat File Source
Hi Team,
I need to keep configured header names for the columns. Is there any way to set each column name from an expression, or is there any other way?
Nope.
But you could add a dummy row to your source to hold the column headers, and then use the "Column names in the first data row" option in the flat file connection manager.
So suppose you have three columns Column0, Column1, Column2 and you want to name them ID, Name, Date. Then make the source query as follows (cast any non-character columns to varchar so that both SELECTs return compatible types):
SELECT 'ID' AS Col1,'Name' AS Col2,'Date' AS Col3, 0 AS ord
UNION ALL
SELECT Column0,Column1,Column2,1
FROM YourTable
ORDER BY Ord
Then choose the "Column names in the first data row" option.
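The "header as an extra first row" idea is easy to try out away from SSIS. The sketch below is Python, with a made-up function name and sample values; it writes a configurable header list as the first line and then the data rows, all as text, which mirrors what the UNION ALL query produces.

```python
import csv
import io


def write_with_header(header, rows) -> str:
    """Write header as the first line, followed by the data rows, as CSV text."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\r\n")
    writer.writerow(header)
    # Every value is written as text, like the all-string rows of the query.
    writer.writerows([str(value) for value in row] for row in rows)
    return buffer.getvalue()


# Hypothetical header names, e.g. read from configuration.
print(write_with_header(["ID", "Name", "Date"], [(1, "Bread", "2014-03-06")]))
```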
Please Mark This As Answer if it helps to solve the issue Visakh ---------------------------- http://visakhm.blogspot.com/ https://www.facebook.com/VmBlogs -
I am using SSIS to extract fixed-width data into a flat file destination and I keep getting the error below. I have tried almost everything in this forum but still have no solution. Can anyone help me solve this problem?
[Flat File Destination [220]] Error: Failed to write out column name for column "Column 2".
[SSIS.Pipeline] Error: component "Flat File Destination" (220) failed the pre-execute phase and returned error code 0xC0202095
Thanks
Hi Giss68,
Could you check the Advanced tab of the Flat File Connection Manager to see whether the InputColumnWidth and OutputColumnWidth properties of Column 2 have the same value? Please refer to the following link on the same topic:
http://stackoverflow.com/questions/10292091/how-do-i-fix-failed-to-write-error-while-exporting-data-to-ragged-right-flat-fil
If it doesn’t work, please post the sample data and the advanced settings of Column2 for further analysis.
Regards,
Mike Yin
TechNet Community Support -
Dynamic Column Names in Flat File Destination
Hello,
Inside a Data Flow Task, I have an ADO.Net data source which executes a stored procedure that provides results in 5 columns.
The requirement is to connect it to a flat file destination such that the column names depend on what data was pulled by the data source. There is an indicator variable which identifies the data that was pulled. For example:
If the indicator is 0, then the column names will be A,B,C,D,E. Otherwise, if the indicator is 1, then the column names will be V,W,X,Y,Z.
Any suggestions will be of great help.
AJ
If you only have two variations, then use a branched execution (based on precedence constraints) and direct it to one DFT or another based on the result returned by the stored procedure.
Otherwise use .NET code to create one package or another dynamically.
PS: I suggest not bothering with SSIS for such a simple scenario.
Arthur My Blog -
Importing From Flat File with Dynamic Columns
Hi,
I am using SSIS 2008. I have a folder containing four ".txt" files, each with 2 columns (ID, NAME). I loaded the 4 files into one destination, but today I received one more ".txt" file, and this one has 3 columns (ID, NAME, JOB). How can I get a message that a new column has arrived in the source? And how can I create the extra column in my destination table dynamically? Please help me.
Hi Sasidhar,
You need a Script Task to read the names and number of columns in the first row of the flat file each time and store them in a variable, then create a staging table dynamically based on this variable, modify the destination table definition if one or more new columns need to be added, and then use the staging table to load the destination table. I am afraid there is no ready-made working script for your scenario, and you need some .NET coding experience to achieve your goal. Here is an example you can refer to:
http://www.citagus.com/citagus/blog/importing-from-flat-file-with-dynamic-columns/
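The first step described above, reading the header row and spotting columns the destination does not know about, can be sketched like this. It is Python rather than a .NET Script Task, the function name is made up, and it assumes a comma delimiter and a header row in the first line of each file.

```python
def compare_header(first_line: str, expected_columns, delimiter: str = ","):
    """Return (new_columns, missing_columns) for a file's header line."""
    found = [name.strip() for name in first_line.rstrip("\r\n").split(delimiter)]
    expected_upper = {name.upper() for name in expected_columns}
    found_upper = {name.upper() for name in found}
    # Column names are compared case-insensitively.
    new_columns = [name for name in found if name.upper() not in expected_upper]
    missing_columns = [name for name in expected_columns if name.upper() not in found_upper]
    return new_columns, missing_columns


new_columns, missing_columns = compare_header("ID,NAME,JOB\r\n", ["ID", "NAME"])
print("new:", new_columns, "missing:", missing_columns)
```

A non-empty new_columns list is the signal to raise a message and to extend the staging and destination tables before loading.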
Regards,
Mike Yin
TechNet Community Support -
Ragged right with Flat file in SSIS for last column
When I import from a flat file to OLE DB using a data flow in SSIS,
each column has a fixed length,
but the last column has a length of 20.
So for the last column I used {LF}, with input width 0 and output width 20,
but the problem is that the last column shows more data in the preview, so I am missing some records.
Is there any solution?
Thanks
Hi Madhu,
I totally agree with Visakh. If your row delimiter is {CR}{LF}, you need to account for the two delimiter characters when defining the column length; that is to say, you need to set the length of the last column to 22.
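A rough illustration of why the two extra characters matter, assuming a purely positional reader that cuts the file every sum-of-widths characters and knows nothing about row delimiters. This is Python, not SSIS, and the function name, widths and data are made up for the example.

```python
def split_fixed_width(data: str, widths):
    """Cut data into records of sum(widths) characters, then into columns."""
    record_length = sum(widths)
    records = []
    for start in range(0, len(data) - record_length + 1, record_length):
        record = data[start:start + record_length]
        columns, offset = [], 0
        for width in widths:
            columns.append(record[offset:offset + width])
            offset += width
        records.append(columns)
    return records


# Two rows: a 2-character ID, a 20-character last column, then CR LF.
data = "01" + "x" * 20 + "\r\n" + "02" + "y" * 20 + "\r\n"

# Last column declared as 22: the CR LF is absorbed and rows stay aligned.
print(split_fixed_width(data, [2, 22])[1][0])
# Last column declared as 20: the second row starts with the leftover CR LF.
print(repr(split_fixed_width(data, [2, 20])[1][0]))
```

With the width set to 22 the trailing CR LF still has to be trimmed from the last column's value afterwards.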
Regards,
Mike Yin
TechNet Community Support -
Reorder columns in Flat File Destination
Hi friends,
I did some googling to find a way of reordering the columns before exporting to a text file using a FLAT FILE DESTINATION. One workaround I found is editing the final package XML file and moving the DTS:FlatFileColumn fields as required.
Is there any other solution that avoids this approach? I have some 40 fields to write to the text file, and manually reordering them in the XML would be cumbersome.
I need to create the text file using a FLAT FILE DESTINATION.
The SSIS data flow doesn't support dynamic metadata, and any change in metadata, such as rearranging mappings, has to be done manually. If you want columns that can be rearranged and remapped dynamically and simply, it is better to look for another way than the data flow task; dynamic T-SQL queries can be a good alternative.
http://www.rad.pasfu.com
My Submitted sessions at sqlbits.com -
Reject columns which exceed a particular length during OWB flat file import
Hi,
I have a file in CSV format. This file has a lot of columns, of which I need only a few. I created a file in OWB from the given CSV file using the file import wizard. I am using this file to create an external table.
The import works fine, but the problem is that certain columns (which do not need to be processed) contain very large values. This leads to the rejection of certain rows (when I deploy the external table) that have valid data in the required columns but oversized data in columns I do not need.
These rows must not be rejected.
I wanted to know if there is any way to truncate, or insert null into, these unwanted columns (either during the flat file import or during creation of the external table), so that the row is taken in partially (with the wanted columns).
Thanks a lot!
NS
Hi Julie,
As Jim posted, you could try a raw file instead of a CSV file. In SSIS, the Raw File destination writes raw data to a file. Because the format of the data is native to the destination, the data requires no translation and little parsing. The same is true of the Raw File source. Besides, the Raw File destination and source can improve package performance because they write data more quickly than other destinations such as the Flat File, OLE DB, and Recordset destinations.
References:
http://www.katieandemil.com/ssis-raw-file-source-example-ssis-2012
http://www.jasonstrate.com/2011/01/31-days-of-ssis-raw-files-are-awesome-131/
Regards,
Mike Yin
TechNet Community Support -
How to split column wise into separate flat files in ssis
In SSIS:
1. I have a sales table with one column per country/region (india, usa, srilanka):
india  usa  srilanka
a      b    c
d      e    f
I want output like this:
flat file1.txt (india)  flat file2.txt (usa)  flat file3.txt (srilanka)
a                       b                     c
d                       e                     f
2. I don't know how many regions are in my table, so it needs to be split dynamically into separate flat files.
Please help me. Thank you.
I think what you can do is this:
1. Do a query based on UNPIVOT to get the data as rows instead of columns.
For that you can use a query like this:
IF OBJECT_ID('temp') IS NOT NULL DROP TABLE temp
CREATE TABLE temp
(
Country varchar(100),
Val decimal(25,5)
)
DECLARE @CountryList varchar(3000),@SQL varchar(max)
SELECT @CountryList = STUFF((SELECT ',[' + Column_Name + ']' FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = '<SalesTableNameHere>' FOR XML PATH('')),1,1,'')
SET @SQL= 'SELECT Country, Val FROM <SalesTableNameHere> t UNPIVOT (Val FOR Country IN (' + @CountryList + '))p'
INSERT temp (Country, Val)
EXEC (@SQL)
Once this is done you'll have the data unpivoted into the table.
Then you can use an Execute SQL Task with a query like the one below:
SELECT DISTINCT Country FROM Temp
Set the ResultSet option to Full result set and store the result in an object variable.
Then add a ForEach Loop container with the ADO enumerator and map it to the object variable created above. Have a variable inside the loop to receive the individual country values.
Inside the loop place a data flow task. Use a variable to store the source query, set EvaluateAsExpression to true for it and set Expression as below (note the single quotes around the country value):
"SELECT Val FROM Temp WHERE Country = '" + @[User::LoopVariable] + "'"
where LoopVariable is the variable created inside the loop to receive the iterated values.
Inside the data flow task place an OLE DB source, choose the option SQL command from variable and map it to the query variable above.
Link this to a flat file destination and create a flat file connection manager. Set a dynamic flat file connection using the expression builder: base it on a variable and set the variable to increment on each loop iteration.
The core logic looks similar to this
http://visakhm.blogspot.ae/2013/09/exporting-sqlserver-data-to-multiple.html
dynamic file naming can be seen here
http://jahaines.blogspot.ae/2009/07/ssis-dynamically-naming-destination.html
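To make the intended result concrete, here is what "one file per country column" looks like outside SSIS. The sketch is Python with made-up names and sample data, and it names each file after its column rather than using an incrementing counter.

```python
import csv
import os
import tempfile


def write_one_file_per_column(header, rows, folder):
    """Write each column's values to <folder>/<column name>.txt; return the paths."""
    paths = []
    for index, column_name in enumerate(header):
        path = os.path.join(folder, f"{column_name}.txt")
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle)
            for row in rows:
                writer.writerow([row[index]])
        paths.append(path)
    return paths


header = ["india", "usa", "srilanka"]
rows = [["a", "b", "c"], ["d", "e", "f"]]
with tempfile.TemporaryDirectory() as folder:
    for path in write_one_file_per_column(header, rows, folder):
        with open(path, newline="") as handle:
            print(os.path.basename(path), handle.read().split())
```

The number of files follows the number of columns in the header, so nothing needs to be known about the regions in advance.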
Please Mark This As Answer if it solved your issue
Please Vote This As Helpful if it helps to solve your issue
Visakh
-
SSIS: Why do columns become misaligned when importing flat files?
Hi All
I am stumped with the following.
When I try to load a fixed length flat file into a table, the first few thousand records load correctly but then the columns start going out of sync.
Visually it looks like the data is drifting off to the right. Below is what the table looks like when the load completes:
Col1 Col2 Col3 Col4
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
Additional info:
1. The file has 400 000 records.
2. The first few thousand records load OK.
3. Source file is a flat file, fixed length, no delimiters.
4. All rows are same length, with LFCR.
I tried using a script task to check with C# whether all rows are the same length and have the same line terminator, and could not pick up anything out of the ordinary.
What could be the cause?
Thank you for the reply.
I discovered the source of the problem, but still don't understand how to solve it.
There is one extra character in one of the columns every few thousand lines. This is the character: �
I don't understand why all the rows following a corrupt row are shifted by one character and not just the affected row.
Secondly, via a script task, SSIS indicates that the length of the row is still the same, despite the extra character. Is it possible that SSIS does not recognize this character and that this is what causes all the columns to shift / misalign?
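One possible explanation, offered as an assumption rather than something confirmed in this thread: a length check on decoded text counts characters, while a fixed-width reader without delimiters may cut the file by position in bytes. A character that takes two bytes then leaves the character count unchanged but makes the row one byte longer, and with no delimiter to resynchronise on, every later row is pushed along as well. The Python sketch below, with a made-up function name and sample data, checks the length of each line in bytes instead of characters.

```python
def find_odd_length_lines(raw: bytes, expected_bytes: int, terminator: bytes = b"\r\n"):
    """Return (line_number, byte_length) for lines whose byte length differs from expected_bytes."""
    odd = []
    for line_number, line in enumerate(raw.split(terminator), start=1):
        if line and len(line) != expected_bytes:
            odd.append((line_number, len(line)))
    return odd


# Hypothetical sample: every row is 8 characters, but row 2 contains a
# character that needs two bytes in UTF-8, so it is 9 bytes long.
row_two = "AAA\u00e91111"
raw = b"AAAA1111\r\n" + row_two.encode("utf-8") + b"\r\n" + b"BBBB2222\r\n"
print(len(row_two), find_odd_length_lines(raw, 8))
```

If the real file shows the same pattern, reading it with the correct code page, or cleaning the stray character before the load, would be the things to try.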