Parsing XMLTV file with Dom4J

Hi,
I'm fairly new to Java programming, although I usually manage to do what I want in JSP, since JSP pages usually contain only small scriptlets.
But when it comes to pure Java programming, I don't feel at home. My program below parses an XMLTV (http://xmltv.org) file, and my intention is to insert its data into a database. So far, I'm only using System.out.println() and redirecting the output to a file, which I then import into SQL Query Analyzer to populate my database.
I would like some pointers on whether there is a better way of coding what I have done (I just feel there is). Perhaps other methods, like using getters and setters?
package java2db;

import java.util.*;
import java.text.*;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

/**
 * @author chka
 */
public class Java2Db {
    private static Document document;

    /** Creates a new instance of Java2Db */
    public Java2Db() {
    }

    public static String formatStr(String instr) {
        instr = instr.replaceAll("'","''");
        /*instr = instr.replaceAll("'","&apos;");
        instr = instr.replaceAll("\"","&quot;");
        instr = instr.replaceAll(">","&gt;");
        instr = instr.replaceAll("<","&lt;");*/
        return instr;
    }

    public static void main(String[] args) {
        String szActor = "";
        String szDirector = "";
        String szStart = "";
        String szEnd = "";
        String szChId = "";
        String szTitle = "";
        String szSubTitle = "";
        String szID = "";
        String szChannel = "";
        String szAspect = "";
        String szIcon = "";
        String szDisplayNameLang = "";
        String szDate = "";
        String szDescription = "";
        String szCategory = "";
        String szEpisodeNum = "";
        Date startTime = null;
        Date endTime = null;
        SimpleDateFormat xmlDateFormat = new SimpleDateFormat("yyyyMMddHHmmss Z");
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        SAXReader reader = new SAXReader();
        try {
            document = reader.read(args[0]);
            //document = reader.read("c:/development/tvguide.xml");
            Element rootElement = document.getRootElement();
            Iterator itChannel = rootElement.elementIterator("channel");
            while (itChannel.hasNext() ) {
                Element chElement = (Element) itChannel.next();
                szID = chElement.attribute("id").getStringValue();
                szChannel = chElement.element("display-name").getText();
                szDisplayNameLang = chElement.element("display-name").attribute("lang").getStringValue();
                if (chElement.element("icon") != null) {
                    szIcon = chElement.element("icon").attribute("src").getStringValue();
                } else
                    szIcon = "";
                System.out.println("insert into channel (displayname,lang,channelid,iconsrc) values ('"+szChannel+"','"+szDisplayNameLang+"','"+szID+"','"+szIcon+"')");
            }
            Iterator itProgramme = rootElement.elementIterator("programme");
            while (itProgramme.hasNext() ) {
                Element pgmElement = (Element) itProgramme.next();
                szChId = pgmElement.attribute("channel").getStringValue();
                szStart = pgmElement.attribute("start").getStringValue();
                szEnd = pgmElement.attribute("stop").getStringValue();
                szTitle = pgmElement.element("title").getText();
                if (pgmElement.element("sub-title") != null) {
                    szSubTitle = pgmElement.element("sub-title").getText();
                } else
                    szSubTitle = "";
                if (pgmElement.element("video") != null) {
                    szAspect = pgmElement.element("video").element("aspect").getText();
                } else
                    szAspect = "";
                if (pgmElement.element("credits") != null) {
                    StringBuffer sb = new StringBuffer();
                    if (pgmElement.element("credits").element("director") != null ) { // There can be more than 1 director, should be handled.
                        szDirector = formatStr(pgmElement.element("credits").element("director").getText());
                    }
                    else szDirector = "";
                    Iterator itActor = pgmElement.element("credits").elementIterator("actor");
                    while (itActor.hasNext()) {
                        Element elementActor = (Element) itActor.next();
                        sb.append(formatStr(elementActor.getText())+";");
                    }
                    szActor = sb.toString();
                } else {
                    szDirector = "";
                    szActor = "";
                }
                if (pgmElement.element("date") != null) {
                    szDate = pgmElement.element("date").getText();
                } else
                    szDate = "";
                if (pgmElement.element("category") != null) {
                    StringBuffer sb = new StringBuffer();
                    List listCategory = pgmElement.elements("category");
                    Iterator itCategory = listCategory.iterator();
                    while (itCategory.hasNext()) {
                        Element elementCategory = (Element) itCategory.next();
                        sb.append(elementCategory.getText()+";");
                    }
                    szCategory = sb.toString();
                } else
                    szCategory = "";
                if (pgmElement.element("episode-num") != null) {
                    List listEpisode = pgmElement.elements("episode-num");
                    Iterator itEpisode = listEpisode.iterator();
                    while (itEpisode.hasNext()) {
                        Element elementEpisode = (Element) itEpisode.next();
                        if (elementEpisode.attribute("system").getStringValue().equalsIgnoreCase("onscreen")) {
                            szEpisodeNum = elementEpisode.getText();
                        }
                    }
                } else
                    szEpisodeNum = "";
                if (pgmElement.element("desc") != null) {
                    szDescription = formatStr(pgmElement.element("desc").getText() );
                } else
                    szDescription = "";
                try {
                    startTime = xmlDateFormat.parse(szStart);
                    endTime = xmlDateFormat.parse(szEnd);
                }
                catch (ParseException pe) {
                    System.out.println(pe.getMessage());
                }
                System.out.println("insert into programme (title,subtitle,channelid,starttime,endtime,copyrightdate,aspect,category,episodenum,director,actor,description) values ('"+formatStr(szTitle)+"','"+formatStr(szSubTitle)+"','"+szChId+"','"+sdf.format(startTime)+"','"+sdf.format(endTime)+"','"+szDate+"','"+szAspect+"','"+formatStr(szCategory)+"','"+szEpisodeNum+"','"+szDirector+"','"+szActor+"','"+szDescription+"')");
            }
        } // try
        catch (Exception e) {
            System.out.println("! Exception: "   );
            e.printStackTrace();
        }
    }
}

Thanks for any suggestions on how I can improve this small program!
--chris

1. Only the text of one "tv/programme/episode-num" element (the last one whose "system" attribute is "onscreen") ends up in the "szEpisodeNum" variable, even if there are several such elements, so the loop below overwrites every earlier match. Did you perhaps want to concatenate the elementEpisode.getText() values into a single string (as you do with the values of the "tv/programme/category" elements), or did you omit a break statement?
> while (itEpisode.hasNext()) {
>     Element elementEpisode = (Element) itEpisode.next();
>     if (elementEpisode.attribute("system").getStringValue().equalsIgnoreCase("onscreen")) {
>         szEpisodeNum = elementEpisode.getText();
>     }
> }
2. You wrote:
> There can be more than 1 director, should be handled.
They are not handled in your code. If you want to handle them, you should process them in a loop as you do with the actors.
3. The if statement below is not prepared for "tv/programme/episode-num" elements that do not have a "system" attribute: for those, attribute("system") returns null, and calling getStringValue() on it throws a NullPointerException.
> if (elementEpisode.attribute("system").getStringValue().equalsIgnoreCase("onscreen")) {
>     szEpisodeNum = elementEpisode.getText();
> }
4. I rewrote your Java2Db class using XPath queries. My class is almost equivalent to yours; the only difference is that it does not reformat the date values. That would have required a few extra lines, which would have detracted from the readability of the code.

import java.util.ArrayList;
import java.util.List;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.io.SAXReader;
import org.dom4j.Node;
import org.dom4j.XPath;

/**
 * Generates SQL queries for inserting data from org.dom4j.Node nodes into a database table.
 * @author prgguy
 */
public class Java2Db {
  /**
   * Generates SQL queries for inserting data from org.dom4j.Node nodes into a database table.
   * @param node    The node, or its descendant nodes, of which the data will be inserted as
   *                records into a database table.
   * @param table   The name of the database table that the records will be inserted into.
   * @param entity  The XPath definition of the nodes that will be interpreted as entities
   *                in the database.
   * @param record  For each field, the field name (index 0) and the XPath expression
   *                that selects its value(s) relative to the entity node (index 1).
   * @return A list containing the SQL queries that insert the records into the table.
   */
  public static ArrayList<String> getSQLCommands(Node node, String table, String entity, String[][] record) {
    ArrayList<String> sqlList = new ArrayList<String>();
    String fields = "";
    String values = "";
    String value = "";
    String sqlcommand = "";
    String comma;
    XPath xp = DocumentHelper.createXPath(entity);
    List entList = xp.selectNodes(node);
    List subList = null;
    Node entNode = null;
    Node subNode = null;
    // One SQL command per entity node (e.g. per "channel" or "programme" element).
    for (int i = 0; i < entList.size(); i++) {
      entNode = (Node)entList.get(i);
      // One field per record definition.
      for (int j = 0; j < record.length; j++) {
        xp = DocumentHelper.createXPath(record[j][1]);
        subList = xp.selectNodes(entNode);
        // Several matching nodes are joined into one ";"-separated value.
        for (int k = 0; k < subList.size(); k++) {
          subNode = (Node)subList.get(k);
          // Double single quotes so the text stays a valid SQL string literal,
          // as formatStr() does in the original class.
          value += subNode.getText().replace("'", "''") + (subList.size() > 1? ";": "");
        }
        comma = (j == record.length - 1)? "" : ",";
        fields += record[j][0] + comma;
        values += (subNode == null? "''" : ("'" + value + "'")) + comma;
        value = "";
      }
      sqlcommand = "insert into " + table + " (" + fields + ") values (" + values + ")";
      sqlList.add(sqlcommand);
      fields = "";
      values = "";
    }
    return sqlList;
  }

  public static void main(String[] args) {
    Document document = null;
    try {
      SAXReader reader = new SAXReader();
      document = reader.read("c:/development/tvguide.xml");
    }
    catch (DocumentException e) {
      e.printStackTrace();
      return; // Nothing to process if the file could not be read.
    }
    // In this section we define the entities that will provide the records. The definitions
    // are written in XPath language.
    // This array defines the records of the "channel" entity.
    String[][] record_channel = {
      {"displayname",   "display-name"},
      {"lang",          "display-name/@lang"},
      {"channelid",     "@id"},
      {"iconsrc",       "icon/@src"}
    };
    // This array defines the records of the "programme" entity.
    // The fields "episodenum" and "director" are restricted to contain only one value (node)
    // per field. I only did it to make my code work exactly as your code does so that
    // the output of your code and mine will be easy to compare.
    String[][] record_programme = {
      {"title",         "title"},
      {"subtitle",      "sub-title"},
      {"channelid",     "@channel"},
      {"starttime",     "@start"},
      {"endtime",       "@stop"},
      {"copyrightdate", "date"},
      {"aspect",        "video/aspect"},
      {"category",      "category"},
      {"episodenum",    "episode-num[@system='onscreen'][last()]"},
      {"director",      "credits/director[1]"},
      {"actor",         "credits/actor"},
      {"description",   "desc"}
    };
    // Now let's see what we have done:
    String sqlcommand;
    ArrayList<String> sqlList;
    sqlList = getSQLCommands(document, "channel", "tv/channel", record_channel);
    for (int i = 0; i < sqlList.size(); i++) {
      sqlcommand = sqlList.get(i);
      System.out.println(sqlcommand);
    }
    sqlList = getSQLCommands(document, "programme", "tv/programme", record_programme);
    for (int i = 0; i < sqlList.size(); i++) {
      sqlcommand = sqlList.get(i);
      System.out.println(sqlcommand);
    }
  }
}
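
Point 4 leaves the date formatting out. In case it helps, here is a minimal sketch of that step on its own, using the same two patterns as the original program ("yyyyMMddHHmmss Z" in, "yyyy-MM-dd HH:mm:ss" out). The class name XmltvDates and the method toSqlDateTime are made up for this example, and formatting the result in UTC is an assumption of the sketch (the original program formats in the JVM's default time zone).

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class, not part of either Java2Db version above.
class XmltvDates {
    /**
     * Converts an XMLTV timestamp such as "20060101123000 +0100" into
     * "yyyy-MM-dd HH:mm:ss". The result is expressed in UTC (an assumption).
     */
    static String toSqlDateTime(String xmltvTime) {
        // SimpleDateFormat is not thread-safe, so new instances are created per call.
        SimpleDateFormat in = new SimpleDateFormat("yyyyMMddHHmmss Z", Locale.ROOT);
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        out.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            Date parsed = in.parse(xmltvTime);
            return out.format(parsed);
        } catch (ParseException e) {
            // Fail loudly instead of carrying a null date into the SQL string.
            throw new IllegalArgumentException("Not an XMLTV timestamp: " + xmltvTime, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toSqlDateTime("20060101123000 +0100")); // prints 2006-01-01 11:30:00
    }
}
```

Throwing on a bad timestamp is a deliberate choice here: in the original program a failed parse leaves startTime as null (or as the previous programme's value), and the following sdf.format() call then misbehaves.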

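Points 1 to 3 can be sketched together: collect every director in a loop, and pick the "onscreen" episode number without risking a NullPointerException. To keep the example runnable without extra jars it uses the JDK's built-in DOM parser (javax.xml.parsers, org.w3c.dom) rather than dom4j, so treat it as an illustration of the logic, not as dom4j code. The class and method names are invented for this sketch, and it returns the first "onscreen" match (the "break" reading of point 1), which is an assumption about what was intended.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class using the JDK DOM API instead of dom4j.
class ProgrammeFields {
    /** Parses an XML string and returns its root element. */
    static Element parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Joins the text of every descendant element with the given tag name, separated by ";". */
    static String joinAll(Element programme, String tagName) {
        NodeList nodes = programme.getElementsByTagName(tagName);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < nodes.getLength(); i++) {
            if (i > 0) {
                sb.append(';');
            }
            sb.append(nodes.item(i).getTextContent());
        }
        return sb.toString();
    }

    /** Returns the text of the first episode-num with system="onscreen", or "" if there is none. */
    static String onscreenEpisode(Element programme) {
        NodeList nodes = programme.getElementsByTagName("episode-num");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            // DOM's getAttribute returns "" for a missing attribute, so this cannot throw.
            if ("onscreen".equalsIgnoreCase(e.getAttribute("system"))) {
                return e.getTextContent();
            }
        }
        return "";
    }

    public static void main(String[] args) {
        Element p = parseRoot("<programme><credits><director>A</director><director>B</director></credits>"
                + "<episode-num>1.2.</episode-num><episode-num system=\"onscreen\">S01E02</episode-num></programme>");
        System.out.println(joinAll(p, "director"));
        System.out.println(onscreenEpisode(p));
    }
}
```

Unlike the original program, joinAll does not leave a trailing ";" after the last value. Putting the constant first in "onscreen".equalsIgnoreCase(...) is the usual way to stay null-safe when the other side might be null.
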
Similar Messages

  • How to parse an XML file with a namespace?

    Hi,
       I am trying to parse an XML file that has a namespace, but no data is returned.
    Sample Code:
    public class XMLFileLoader
    {
        var xml:XML = new XML();
        var myXML:XML = new XML();
        var XML_URL:String = "file:///C:/Documents and Settings/Administrator/Desktop/MyData.xml";
        var myLoader:URLLoader = null;
        public function XMLFileLoader()
        {
            var myXMLURL:URLRequest = new URLRequest(XML_URL);
            myLoader= new URLLoader(myXMLURL);
            myLoader.addEventListener(Event.COMPLETE,download);
        }
        public function download(event:Event):void
        {
            myXML = XML(myLoader.data);
            var ns:Namespace=myXML.namespace("xsi");
            for(var prop:String in myXML)
                 trace(prop);
            //Alert.show(myXML..Parameters);
            //trace("Data loadedww."+myXML.toString());
            //Alert.show(myXML.DocumentInfo.attributes()+"test","Message");
        }
    }
    The XML has the following format:
    <Network xmlns="http://www.test.com/2005/test/omc/conf"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.test.com/2005/test/omc/conf/TestConfigurationEdition3proposal4.xsd">
        <TestDomain>
          <WAC>
            <!--Release Parameter  -->
            <Parameters ParameterName="ne_release" OutageType="None"
                        accessRight="CreateOnly" isMandatory="true"
                        Planned="false"
                        Reference="true" Working="true">
              <DataType>
                <StringType/>
              </DataType>
              <GUIInfo graphicalName="Release"
                       tabName="All"
                       description="Describes the release version of the managed object"/>
            </Parameters>
          </WAC>
        </TestDomain>
    </Network>
    Is there any sample code showing how to parse this kind of XML file with namespaces?
    Regards,
    Purushotham
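
The post above uses ActionScript. The underlying issue is the same in any language: every element in that file lives in the default namespace http://www.test.com/2005/test/omc/conf, so a lookup by bare element name finds nothing unless the namespace is supplied. As a rough sketch of the idea in Java (a deliberate swap of language, using the JDK's DOM parser), with the class name NamespaceDemo and method parameterNames invented for the example:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the namespace URI is the one from the sample XML above.
class NamespaceDemo {
    static final String NS = "http://www.test.com/2005/test/omc/conf";

    /** Returns the ParameterName attribute of every Parameters element in the namespace NS. */
    static List<String> parameterNames(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Without this call the parser ignores namespaces entirely.
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Match on namespace URI plus local name, not on the bare tag name.
            NodeList params = doc.getElementsByTagNameNS(NS, "Parameters");
            List<String> names = new ArrayList<String>();
            for (int i = 0; i < params.getLength(); i++) {
                // Unprefixed attributes are in no namespace, so plain getAttribute works.
                names.add(((Element) params.item(i)).getAttribute("ParameterName"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Network xmlns=\"" + NS + "\"><TestDomain><WAC>"
                + "<Parameters ParameterName=\"ne_release\"/>"
                + "</WAC></TestDomain></Network>";
        System.out.println(parameterNames(xml));
    }
}
```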

    i have exactly the same problem with KXml2, but using a j2me-polish netbeans project.
    i've tried to work around with similar ways like you, but none of them worked. now i've spent 3 days for solving this problem, i'm a bit disappointed :( what is wrong with setting the downloaded kxml2 jar path in libraries&resources?

  • Parse an XML file while validating against a DTD

    I have an XML file that looks like this:
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE map SYSTEM "map.dtd">
    <map width="20" height="15" goal="25" name="eXtreme Labyrinth of Dooom">
    <random-item type='lantern' amount='5' />
    <random-item type='health' amount='10' />
    <tile x="14" y="0" type="wall">
    <renderhint>wall:rock,cracked</renderhint>
    </tile>
    <tile x="15" y="0" type="wall" />
    <tile x="16" y="0" type="floor">
    <renderhint>floor:marble,cracked</renderhint>
    </tile>
    <tile x="17" y="0" type="floor">
    <renderhint>floor:stone,rubble</renderhint>
    </tile>
    <tile x="18" y="0" type="floor" />
    <tile x="0" y="1" type="floor" />
    <tile x="1" y="1" type="floor" startlocation="1" />
    <tile x="2" y="1" type="floor" />
    <tile x="3" y="1" type="floor">
    <item type="treasure">Bar of Silver</item>
    <renderhint>floor:stone,blood</renderhint>
    </tile>
    <tile x="4" y="1" type="wall" />
    <tile x="5" y="1" type="wall" />
    <tile x="6" y="1" type="wall">
    <renderhint>wall:bricks,cracked</renderhint>
    </tile>
    </map>
    and a DTD document like:
    <!ELEMENT map (random-item+, tile+)>
    <!ATTLIST map
    width CDATA #REQUIRED
    height CDATA #REQUIRED
    goal CDATA #REQUIRED
    name CDATA #REQUIRED
    >
    <!ELEMENT random-item EMPTY>
    <!ATTLIST random-item
    type (armour|health|sword|treasure|lantern) #REQUIRED
    amount CDATA #REQUIRED
    >
    <!ELEMENT tile (item|renderhint)*>
    <!ATTLIST tile
    x CDATA #REQUIRED
    y CDATA #REQUIRED
    type (exit|floor|wall) #REQUIRED
    startlocation CDATA #IMPLIED
    >
    <!ELEMENT item (#PCDATA)>
    <!ATTLIST item
    type (armour|health|sword|treasure|lantern) #REQUIRED
    >
    <!ELEMENT renderhint (#PCDATA)>
    I need to validate the XML file against the DTD document and parse it in Java using DOM.
    Can anyone give me any suggestions on how to do it?
    thank you

    i have started my coding like:
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.ParserConfigurationException;
    import org.w3c.dom.*;
    import org.xml.sax.SAXException;
    import java.io.*;
    class loadxml
    {
        public static void main(String[] args)
        {
            try {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                // Validate against the DTD named in the DOCTYPE declaration.
                factory.setValidating(true);
                factory.setIgnoringElementContentWhitespace(true);
                DocumentBuilder parser = factory.newDocumentBuilder();
                Document doc = parser.parse(new File("hallways.xml"));
                loadxml load = new loadxml();
                load.parseNode(doc);
            } catch (ParserConfigurationException e) {
                e.printStackTrace();
            } catch (SAXException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        public void parseNode (Node node) throws IOException
        {
            // here is where i have problem with
        }
    }
    Since my XML file has ATTLIST declarations, this really confuses me when I try to code it.
    Can anyone help me, please.
    Thank you.
    Edited by: mujingyue on Mar 12, 2008 3:10 PM
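One way to approach this, as a rough sketch rather than a definitive answer: turn validation on, install an `ErrorHandler` so that validity errors actually stop the parse, and then read the ATTLIST-declared attributes with `Element.getAttribute`. The class name `MapLoader` and the cut-down inline DTD are assumptions made so that the example runs without `map.dtd` or `hallways.xml` on disk.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class MapLoader {
    // Hypothetical, cut-down version of the map DTD, inlined as an internal subset.
    static final String DOCTYPE =
          "<!DOCTYPE map [\n"
        + "<!ELEMENT map (tile+)>\n"
        + "<!ATTLIST map name CDATA #REQUIRED>\n"
        + "<!ELEMENT tile (renderhint)*>\n"
        + "<!ATTLIST tile x CDATA #REQUIRED y CDATA #REQUIRED type (exit|floor|wall) #REQUIRED>\n"
        + "<!ELEMENT renderhint (#PCDATA)>\n"
        + "]>\n";

    /** Validates the document and returns one "x,y:type" string per tile element. */
    static List<String> loadTiles(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            factory.setIgnoringElementContentWhitespace(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Rethrow validity errors so an invalid document aborts the parse.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) {
                    System.err.println("warning: " + e.getMessage());
                }
                public void error(SAXParseException e) throws SAXParseException { throw e; }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            List<String> tiles = new ArrayList<>();
            NodeList nodes = doc.getElementsByTagName("tile");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element tile = (Element) nodes.item(i);
                // Attributes declared in ATTLIST are read like any other attribute.
                tiles.add(tile.getAttribute("x") + "," + tile.getAttribute("y")
                        + ":" + tile.getAttribute("type"));
            }
            return tiles;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("invalid map document: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        String xml = DOCTYPE + "<map name=\"demo\">"
                + "<tile x=\"14\" y=\"0\" type=\"wall\"><renderhint>wall:rock,cracked</renderhint></tile>"
                + "<tile x=\"16\" y=\"0\" type=\"floor\"/></map>";
        System.out.println(loadTiles(xml));
    }
}
```

With an external DTD the same code should apply unchanged to `builder.parse(new File("hallways.xml"))`, provided `map.dtd` can be resolved relative to the XML file.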

  • Correct way to parse the file with named captures

    Hello!
    I'm trying to build a configuration parser script. As I'm new to PowerShell and regex, I have a lot of questions.
    I have the following source file:
    add chain=input comment=Accept dst-port=53 protocol=udp src-address-list=DNS_servers
    add chain=input comment=AcceptDNSqueries dst-port=53 protocol=udp src-address-list=Servers
    add chain=input comment=AcceptDNS dst-port=53 protocol=tcp src-address-list=DNS_servers
    add chain=input comment=DHCP_relay dst-address-list=DHCP dst-port=67,68 protocol=udp src-address-list=IntIp
    The result should look like this (CSV format; captures with no value are left empty between the separators):
    Chain,Comment,dst-port,protocol,src-address-list,dst-address-list
    input,Accept,53,udp,DNS_servers,,
    input,AcceptDNSqueries,53,udp,,Servers
    input,Acceptudp,,udp,,Servers
    According to the following link, we can use named captures. I have played with it and noticed that the script works only when the data is in a predefined order (or by using a lot of "or" expressions, repeated where needed).
    Question: how do I parse the file when the captured values can appear in a different order and some values are missing from the configuration file? Is it possible to modify the script from the link, or do we need a different approach?
    Thanks!

    Try this:
    $text = @'
    add chain=input comment=Accept dst-port=53 protocol=udp src-address-list=DNS_servers
    add chain=input comment=AcceptDNSqueries dst-port=53 protocol=udp src-address-list=Servers
    add chain=input comment=AcceptDNS dst-port=53 protocol=tcp src-address-list=DNS_servers
    add chain=input comment=DHCP_relay dst-address-list=DHCP dst-port=67,68 protocol=udp src-address-list=IntI
    '@ | set-content sourcefile.txt
    get-content sourcefile.txt |
    foreach {
    new-object PSObject -Property $($_ -replace '^add ' -replace ' ',"`n" | convertfrom-stringdata) |
    select Chain,Comment,dst-port,protocol,src-address-list,dst-address-list
    } | ft -auto
    chain comment          dst-port protocol src-address-list dst-address-list
    input Accept           53       udp      DNS_servers                      
    input AcceptDNSqueries 53       udp      Servers                          
    input AcceptDNS        53       tcp      DNS_servers                      
    input DHCP_relay       67,68    udp      IntI             DHCP            
    I know, it's not "named captures", but those key=value pairs just beg to be run through convertfrom-stringdata.
    Edit: just noticed I missed the dst-address-list.  Fixed.
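The same idea (treat each line as a bag of key=value pairs and then select a fixed column list, so order and missing keys stop mattering) can be sketched outside PowerShell as well. The following is a Java illustration only; the class name `KeyValueToCsv` and the quoting rule for values that contain commas (such as `dst-port=67,68`) are assumptions, not part of the original script.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class KeyValueToCsv {
    // Output columns, in the order requested in the question.
    static final List<String> COLUMNS = Arrays.asList(
            "chain", "comment", "dst-port", "protocol", "src-address-list", "dst-address-list");

    /** Quotes a cell when it contains a comma or a double quote. */
    static String csvCell(String value) {
        if (value.contains(",") || value.contains("\"")) {
            return "\"" + value.replace("\"", "\"\"") + "\"";
        }
        return value;
    }

    /** Converts one "add key=value ..." line into a CSV row; missing keys become empty cells. */
    static String toCsvRow(String line) {
        Map<String, String> values = new HashMap<>();
        for (String token : line.trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq > 0) { // tokens without '=' (such as the leading "add") are skipped
                values.put(token.substring(0, eq), token.substring(eq + 1));
            }
        }
        List<String> cells = new ArrayList<>();
        for (String column : COLUMNS) {
            cells.add(csvCell(values.getOrDefault(column, "")));
        }
        return String.join(",", cells);
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", COLUMNS));
        System.out.println(toCsvRow(
                "add chain=input comment=Accept dst-port=53 protocol=udp src-address-list=DNS_servers"));
    }
}
```

Like the `convertfrom-stringdata` approach, this assumes values never contain spaces.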

  • Parsing XML file with different languages (Xerces)

    How do we parse an XML file that contains text in different languages, say English and Spanish? When we parse such a document with the default locale, the presence of special characters throws errors. For example, when I use Xerces:
    DOMParser parser = new DOMParser();
    try
    {
        // Parse the XML Document
        parser.parse(xmlFile);
    }
    catch (SAXException se)
    {
        se.printStackTrace();
    }
    I get:
    org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0xfc) was found in the element content of the document.
    System Error:          void org.apache.xerces.framework.XMLParser.parse(org.xml.sax.InputSource)
    So what locale do we set before we parse? How do we handle this problem?

    You need an encoding attribute in the XML declaration. If you don't have one, the parser assumes UTF-8. In UTF-8 only the ASCII characters up to 127 are single bytes; anything above that has to be a multi-byte sequence, so a lone byte such as 0xFC written by an editor that saves Latin-1 is rejected as invalid.
    So, something like this would allow you to use single-byte characters above 127. ISO-8859-1 (Latin-1) covers most Western European languages, including Spanish.
    <?xml version="1.0" encoding="ISO-8859-1"?>
    You can find an (official) list of encodings at:
    http://www.iana.org/assignments/character-sets
    I'm not sure about mixing various encodings. I think you have to resort to external parsed entities, which can each have their own encoding, but I think you cannot mix encodings in one XML file.
    Good luck.
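To make the effect of the encoding declaration concrete, here is a small sketch that parses the same ISO-8859-1 bytes with and without the declaration. It uses the JAXP parser bundled with the JDK instead of the old Xerces `DOMParser` class from the question (an assumption), and the names `EncodingDemo`, `latin1` and `parseText` are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class EncodingDemo {
    /** Encodes the text as ISO-8859-1, the way a Latin-1 editor would save it. */
    static byte[] latin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Parses raw bytes and returns the text content of the root element. */
    static String parseText(byte[] xmlBytes) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Passing bytes (not a Reader) lets the parser pick the encoding from the declaration.
            Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("parse failed: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        // \u00fc is the character 0xFC from the error message in the question.
        String body = "<name>M\u00fcller</name>";
        String declared = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" + body;

        System.out.println("with declaration: " + parseText(latin1(declared)));
        try {
            parseText(latin1(body)); // no declaration, so the bytes are read as UTF-8
        } catch (IllegalArgumentException e) {
            System.out.println("without declaration: " + e.getMessage());
        }
    }
}
```

The alternative fix is to save the file as real UTF-8, in which case no declaration is needed.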

  • Parsing a file with different groupings!

    I need to parse a file. The file looks like this.
    ABC45     CLIENT-90      MY COMPANY-0333    STOCK  VS POSITIONS      DATE
    ......n number of records here
    ABC45     CLIENT-90      MY COMPANY-0897    STOCK VS POSITIONS       DATE
    .....n number of records here
    ABC 45    CLIENT-90      MY COMPANY-0333     STOCK VS POSITIONS       DATE
    ....n number of records here
    ABC 45    CLIENT-90      MY COMPANY-0367     STOCK VS POSITIONS       DATE
    ....n number of records here
    I need to get all the records under, say, "MY COMPANY-0333", parse them further, and output them to a text file. How can I do that?
    Thanks

    You have header and detail records, I suppose. Let's call them that. Pseudocode:
    boolean rightStuff = false;
    for each record:
      if it's a header record:
        if it has MY COMPANY-0333 in the right place then rightStuff = true else rightStuff = false
      if it's a detail record:
        if rightStuff then process it and write to the file
    PC²
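The pseudocode above could be written in Java roughly as follows. How a header record is recognised is not stated in the thread, so this sketch assumes that any line containing a marker string such as "MY COMPANY-" is a header; `GroupFilter`, `headerMarker` and `wanted` are illustrative names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GroupFilter {
    /**
     * Returns the detail lines that follow any header containing `wanted`.
     * A line counts as a header when it contains `headerMarker` (an assumption).
     */
    static List<String> detailsFor(List<String> lines, String headerMarker, String wanted) {
        List<String> selected = new ArrayList<>();
        boolean rightStuff = false;
        for (String line : lines) {
            if (line.contains(headerMarker)) {
                // Header record: switch collection on or off for the group that follows.
                rightStuff = line.contains(wanted);
            } else if (rightStuff) {
                // Detail record inside a wanted group.
                selected.add(line);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "ABC45     CLIENT-90      MY COMPANY-0333    STOCK  VS POSITIONS      DATE",
                "record 1",
                "ABC45     CLIENT-90      MY COMPANY-0897    STOCK VS POSITIONS       DATE",
                "record 2",
                "ABC 45    CLIENT-90      MY COMPANY-0333     STOCK VS POSITIONS       DATE",
                "record 3");
        System.out.println(detailsFor(lines, "MY COMPANY-", "MY COMPANY-0333"));
    }
}
```

For a real file the list would come from something like `Files.readAllLines`, and the selected lines would be parsed further before being written out.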

  • How to get the filename when parsing a file with d3l

    All
    After some time working with InterConnect and iStudio, I need the following info. Is it possible to get the filename when parsing a flat file using a D3L? This is needed because we need to store the filename together with the data in the database.
    Any examples or directions to some documents are welcome.
    Regards
    Olivier De Groef

    has anyone some info on this

  • Parse CSV file with some dynamic columns

    I have a CSV file that I receive once a week that is in the following format:
    "Item","Supplier Item","Description","1","2","3","4","5","6","7","8" ...Linefeed
    "","","","Past Due","Day 13-OCT-2014","Buffer 14-OCT-2014","Week 20-OCT-2014","Week 27-OCT-2014", ...LineFeed
    "Part1","P1","Big Part","0","0","0","100","50", ...LineFeed
    "Part4","P4","Red Part","0","0","0","35","40", ...LineFeed
    "Part92","P92","White Part","0","0","0","10","20", ...LineFeed
    An explanation of the data - Row 2 is dynamic data signifying the date parts are due. Row 3 begins the part numbers with description and number of parts due on a particular date. So looking at the above data: row 3 column7 shows that PartNo1 has 100 parts
    due on the Week of OCT 20 2014 and 50 due on the Week of OCT 27, 2014.
    How can I parse this csv to show the data like this:
    Item, Supplier Item, Description, Past Due, Due Date Amount Due
    Part1 P1 Big Part 0 20 OCT 2014 100
    Part1 P1 Big Part 0 27 OCT 2014 50
    Part4 P4 Red Part 0 20 OCT 2014 35
    Part4 P4 Red Part 0 27 OCT 2014 40
    Is there a way to manipulate the format to rearrange the data like I need or what is the best method to resolve this? Moreover how do I go about doing this? 

    Hello,
    If the files always have the same structure, you can create an Integration Services (SSIS) package.
    See this article:
    http://www.mssqltips.com/sqlservertip/2923/configure-the-flat-file-source-in-sql-server-integration-services-2012-to-read-csv-files/
    Javier Villegas |
    @javier_vill | http://sql-javier-villegas.blogspot.com/
    Please click "Propose As Answer" if a post solves your problem or "Vote As Helpful" if a post has been useful to you
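If a small program is an option instead of a package, the reshaping asked for in the question is an unpivot: the first three columns plus "Past Due" stay fixed, and every later column becomes its own output row labelled with the text from row 2. The sketch below is only one possible reading of the requirement. It assumes every field is double-quoted with no embedded quotes, that the dated columns start at index 4, and that zero amounts are skipped (as in the sample output); `CsvUnpivot` is a made-up name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CsvUnpivot {
    /** Splits a line of the form "a","b","c" into fields (assumes no embedded quotes). */
    static List<String> fields(String line) {
        String trimmed = line.trim();
        if (trimmed.length() < 2) {
            return new ArrayList<>();
        }
        String inner = trimmed.substring(1, trimmed.length() - 1); // drop outer quotes
        return Arrays.asList(inner.split("\",\"", -1));
    }

    /** Emits one row per non-zero dated column: item, supplier, description, past due, label, amount. */
    static List<String> unpivot(List<String> csvLines) {
        List<String> labels = fields(csvLines.get(1)); // row 2 holds the dynamic date labels
        List<String> out = new ArrayList<>();
        for (String line : csvLines.subList(2, csvLines.size())) {
            List<String> row = fields(line);
            for (int col = 4; col < row.size() && col < labels.size(); col++) {
                if (!"0".equals(row.get(col))) {
                    out.add(String.join(",", row.get(0), row.get(1), row.get(2),
                            row.get(3), labels.get(col), row.get(col)));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "\"Item\",\"Supplier Item\",\"Description\",\"1\",\"2\",\"3\",\"4\",\"5\"",
                "\"\",\"\",\"\",\"Past Due\",\"Day 13-OCT-2014\",\"Buffer 14-OCT-2014\",\"Week 20-OCT-2014\",\"Week 27-OCT-2014\"",
                "\"Part1\",\"P1\",\"Big Part\",\"0\",\"0\",\"0\",\"100\",\"50\"");
        for (String row : unpivot(lines)) {
            System.out.println(row);
        }
    }
}
```

The label is kept verbatim ("Week 20-OCT-2014"); stripping the "Week"/"Day" prefix or reformatting the date would be a further step.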

  • Parsing text file with SGML tags

    <EMAIL>
    <ADDRESS>>[email protected]
    <DESC>>Email your questions to Click Here to E-mail
    <POP>
    <ADDRESS>N/A
    <ZIP>N/A
    How do you parse the above text file, which has SGML tags? Thanks

    Oh, sorry, if it's not an XML file, then you probably should use another approach. Where does the file come from? Could you possibly make it an XML file? ;-)
    Possible approaches are:
    * StringTokenizer, possibly with the "return delimiters" option enabled
    * the good old do-it-yourself indexOf/substring method; OK, but complicated
    * if you would like to handle the contents exactly like XML, you could add a <?xml version="1.0"?><root> at the beginning and a </root> at the end and still use an XML parser (this only works if every tag is also closed), but keep character encodings in mind here.
    * another quite simple way, which probably fits best, is the java.util.regex API. If you want to use it, I can give you the code. It's easy.
    regards
    sven
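The regex approach mentioned in the last bullet might look something like this. It is only a sketch: it assumes each line has the shape `<TAG>value` with nothing nested, and the class name `TagLineParser` is invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagLineParser {
    // One opening tag at the start of the line, followed by the (possibly empty) value.
    private static final Pattern TAG_LINE =
            Pattern.compile("^\\s*<([A-Za-z][A-Za-z0-9]*)>(.*)$");

    /** Returns {tag, value} pairs for every line that starts with a tag; other lines are ignored. */
    static List<String[]> parse(List<String> lines) {
        List<String[]> pairs = new ArrayList<>();
        for (String line : lines) {
            Matcher m = TAG_LINE.matcher(line);
            if (m.matches()) {
                pairs.add(new String[] { m.group(1), m.group(2).trim() });
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("<EMAIL>", "<POP>", "<ADDRESS>N/A", "<ZIP>N/A");
        for (String[] pair : parse(lines)) {
            System.out.println(pair[0] + " = '" + pair[1] + "'");
        }
    }
}
```

Lines such as `<EMAIL>` and `<POP>` come back with an empty value, which a caller could treat as the start of a new group.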

  • Parsing XML file with the word "Infinity" in it

    I am having a problem within Flex that parses an XML file,
    one of the text names within the file contains the word "Infinity".
    Unfortunately this is a constant within Flex and the element of my
    array does not think it is a text string but a value. I have tried
    type setting the xml element, and putting escape characters around
    the text but nothing works. If I change the text to lowercase
    "infinity" it works fine.
    Any help greatly appreciated.

    This sounds like a bug. Can you file it, with a small code
    example that illustrates it, here:
    http://bugs.adobe.com/jira
    Thanks!
    matt horn
    flex docs

  • Parse XML file with regex

    Hi,
    Is it possible to parse an XML file using regular expressions? If so, what API is needed?
    Thanks in advance

    Is that possible to parse a xml file using regular
    expressions....if s what is the API needed
    I'm sure it can be done. Here's the regex API:
    http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/package-summary.html
    http://www.javaregex.com/tutorial.html
    But that's not what regex is for. Better have a look at this:
    http://java.sun.com/xml/

  • Parsing Log file with PowerShell

    Hey Guys, I have the following line in a txt file (log file) 
    2012-08-14 18:00:00 [ERROR] . Exception SQL error 1
    2012-08-14 18:10:00 [ERROR] . Exception SQL error 2
    2012-08-15 18:00:00 [INFO] . Started
    - Check the most recent entries from the last 24 hours
    - If there's an error [ERROR], write out a statement that says (Critical) with the date-time of the error
    - If there are no errors, write out (Ok)
    So far I have learned to write this much and would like to learn more from you:
    $file = "C:\Users\example\Documents\Log.txt"
    cat $file | Select-String "ERROR" -SimpleMatch

    Hello,
    I am new to PowerShell, and looking for same requirement, here is my function.
    Function CheckLogs()
    {
        param ([string] $logfile)
        if(!$logfile) {write-host "Usage: ""<Log file path>"""; exit}
        cat $logfile | Select-String "ERROR" -SimpleMatch | select -expand line |
             foreach {
                        $_ -match '(.+)\s\[(ERROR)\]\s(.+)'| Out-Null 
                        new-object psobject -Property @{Timestamp = [datetime]$matches[1];Error = $matches[2]} |
                        where {$_.timestamp -gt (get-date).AddDays(-1)}
                        $error_time = [datetime]($matches[1])
                        if ($error_time -gt (Get-Date).AddDays(-1) )
                        {
                            write-output "CRITICAL: There is an error in the log file $logfile around 
                                          $($error_time.ToShortTimeString())"; exit(2)
                        }
                     }
      write-output "OK: There was no errors in the past 24 hours." 
    }
    CheckLogs "C:\Log.txt" #Function Call
    Content of my log file is as follows
    [ERROR] 2013-12-23 19:46:32
    [ERROR] 2013-12-24 19:46:35
    [ERROR] 2013-12-24 19:48:56
    [ERROR] 2013-12-24 20:13:07
    After executing above script, getting the below error, can you please correct me.
     $error_time = [datetime]($matches[1])
    +                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
        + FullyQualifiedErrorId : NullArray
    Cannot index into a null array.
    At C:\PS\LogTest.ps1:10 char:21
    +                     new-object psobject -Property @{Timestamp = 
    [datetime]$match ...
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ~~~
        + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
        + FullyQualifiedErrorId : NullArray
    Cannot index into a null array.
    At C:\Test\LogTest.ps1:12 char:21
    +                     $error_time = [datetime]($matches[1])
    +                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
        + FullyQualifiedErrorId : NullArray
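A likely cause, judging only from what is posted: the pattern expects some text followed by whitespace before `[ERROR]`, but the sample log lines start with `[ERROR]` and put the timestamp after it. The match then fails, its result is discarded by `Out-Null`, and `$matches` is never set, which would explain "Cannot index into a null array". The logic with the match checked before the groups are read is sketched below in Java rather than PowerShell, to illustrate the order of operations only; `LogCheck` and `recentErrors` are hypothetical names.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogCheck {
    // Matches lines such as "[ERROR] 2013-12-23 19:46:32" (timestamp after the level).
    private static final Pattern ERROR_LINE =
            Pattern.compile("^\\[ERROR\\]\\s+(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2})");
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Returns the timestamps of [ERROR] lines that fall within 24 hours before `now`. */
    static List<LocalDateTime> recentErrors(List<String> lines, LocalDateTime now) {
        List<LocalDateTime> hits = new ArrayList<>();
        for (String line : lines) {
            Matcher m = ERROR_LINE.matcher(line.trim());
            if (!m.find()) {
                continue; // never read capture groups from a failed match
            }
            LocalDateTime when = LocalDateTime.parse(m.group(1), STAMP);
            if (when.isAfter(now.minusDays(1))) {
                hits.add(when);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
                "[ERROR] 2013-12-23 19:46:32",
                "[ERROR] 2013-12-24 19:46:35");
        List<LocalDateTime> hits = recentErrors(log, LocalDateTime.of(2013, 12, 24, 20, 30));
        System.out.println(hits.isEmpty() ? "OK" : "CRITICAL: " + hits);
    }
}
```

In the PowerShell version the equivalent change would be to test the result of `-match` in an `if` instead of piping it to `Out-Null`, and to adjust the pattern to the actual line layout.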

  • Trying to parse a file with SAX

    Hi All,
    I have some code which takes a XML DTD and parses it, writing information to System.out. Or it's meant to - I cannot get the parser to see the DTD file.
    The code is:
    public static void main (String[] args)
    {
        try
        {
            Class loadedClass = Class.forName("com.ibm.xml.parser.SAXDriver");
            Parser xParser = (Parser)loadedClass.newInstance();
            CatalogueReader cr = new CatalogueReader();
            xParser.setDocumentHandler(cr);
            xParser.setErrorHandler(cr);
            xParser.parse("catalogue1.txt");
        }
        catch(Exception e)
        {System.out.println("Problem: " +e.getMessage());}
    }
    which produces the following message:
    Problem: catalogue1.txt (The system cannot find the file specified)
    I have added the text file to the project and tried putting it in various folders without success.
    I've been stuck on this for a day now and would appreciate a nudge in the right direction.
    Many thanks.

    You might try creating a separate class, putting all your code in that class's constructor, and seeing if that works (have main() instantiate that class in order to run the code). I try to avoid putting code that parses directly in main(). You can also add the following to your main() and/or class's constructor to see what directory it is running from:
              File file=new File("");
              System.out.println(file.getAbsolutePath());
    Let's say your catalogue1.txt file is two directories 'up' from the path given by getAbsolutePath() above, and one level down in a package called 'books'; then this
    might work:
    xParser.parse("../../books/catalogue1.txt");
    or
    xParser.parse("././books/catalogue1.txt");
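Before experimenting with relative paths, it can help to print exactly where a relative name resolves to and whether a file exists there. A minimal sketch using `java.nio.file` (the class name `WhereIsMyFile` is made up):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class WhereIsMyFile {
    /** Resolves a relative name against the working directory and reports whether it exists. */
    static String describe(String relativeName) {
        Path resolved = Paths.get(relativeName).toAbsolutePath().normalize();
        return (Files.exists(resolved) ? "found: " : "not found: ") + resolved;
    }

    public static void main(String[] args) {
        // The working directory is wherever the JVM was started, not where the class file lives.
        System.out.println("working directory: " + Paths.get("").toAbsolutePath());
        System.out.println(describe("catalogue1.txt"));
        System.out.println(describe("../../books/catalogue1.txt"));
    }
}
```

Once the "found" path is known, passing that absolute path to the parser takes the working directory out of the equation.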

  • Parsing Large files

    Has anyone tried parsing large XML files? I need to parse a file
    of about 70M+. When I try to parse this file I get
    java.io.UTFDataFormatException: Invalid UTF8 encoding
    When I break the file down into smaller pieces of about 2M, it has no
    problems. Also, the sample data I am using was created using
    OracleXMLQuery. Does anyone have the same problem, solutions, or
    suggestions?
    Any help is appreciated.
    Thanks

    I wasn't using any kind of encoding but I did have problems
    parsing large files with the DOMParser. It seemed that whenever
    you attempt to use a reader or InputSource, it would throw some
    error...I think it was arrayoutofbounds error. I used the
    following code, and it seemed to work:
    xmlDoc is a String of xml
    byte aByteArr [] = xmlDoc.getBytes();
    ByteArrayInputStream bais = new ByteArrayInputStream(aByteArr, 0, aByteArr.length);
    domParser.parse(bais);
    This also works if you use the URL version of .parse as well,
    but I am under the constraint of not being able to write out a
    file, so I need to use some kind of memory-based buffer.
    ByteArrayInputStream works for me. I think the reason this
    works is that the actual length of the stream is specified.
    Hope this helps.
    Dan
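For files in the 70M range, a streaming (SAX) parse avoids building the whole tree in memory and reads straight from a byte stream, so the parser decodes the bytes itself from the XML declaration. The sketch below uses the JAXP SAX API bundled with the JDK, which is an assumption (the thread is about Oracle's parser), and it is not claimed to cure the `UTFDataFormatException`; `StreamingCount` is an illustrative name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class StreamingCount {
    /** Counts elements with the given qualified name without building a DOM tree. */
    static int countElements(InputStream in, String name) {
        final int[] count = {0};
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(in, new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attributes) {
                    if (qName.equals(name)) {
                        count[0]++;
                    }
                }
            });
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("parse failed: " + e.getMessage(), e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        // In-memory stand-in for a large file; a FileInputStream would be used the same way.
        byte[] xml = "<ROWSET><ROW/><ROW/><OTHER/></ROWSET>".getBytes(StandardCharsets.UTF_8);
        System.out.println(countElements(new ByteArrayInputStream(xml), "ROW"));
    }
}
```

If a `String` has to be converted to bytes first, as in the code above, passing an explicit charset to `getBytes` that matches the XML declaration avoids depending on the platform default.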
    Arpan (guest) wrote:
    : Has anyone tried parsing large XML files. I need parse a fiel
    : about 70M+. When I try and parse this file I get
    : java.io.UTFDataFormatException: Invalid UTF8 encoding
    : When I break the file down into smaller size about 2M it has
    no
    : problems. Also the sample data I am using was created using
    : OracleXMLQuery. Anyone have the same problem, solutins,
    : suggestions?
    : Any help is appricated
    : Thanks

  • Need some help with DNG and error parsing the files

    Ok, so I found out that I can't open NEF files with CS2 from my Nikon D3100 - unless I upgrade to CS6, or use the DNG converter.  I did download the DNG converter that came with the Camera RAW 3.7 (for CS2) in one of Adobe's links, but when I try to convert the NEF files, the DNG converter says "There is an error parsing the file".   If it helps, I have Windows 7.  Is there a different version of the DNG converter I should be using?  Thanks!

    A simple Google search will find it.  For some reason it isn't on the Adobe website.  If you upgrade to CS6 you'll find much improvement in your raw conversions.
    http://blogs.adobe.com/lightroomjournal/2012/12/camera-raw-7-3-and-dng-converter-7-3-now-available.html
