Parsing multiple delimiters

I have to read and parse the contents of a .txt file. I can do this with the Scanner class by defining a delimiter (e.g. scanner.useDelimiter(",");).
However, my .txt file has multiple delimiters. For example, this is a snippet of the file:
Question: Question text goes here Location: Chicago, IL School: Unit 5 Grade: 8 Question: Question 2 text goes here Location: Inglewood, CA School: Ben Franklin Elementary Grade: 4
So, basically, there are numerous delimiters (Question:, Location:, School:, etc.) and nothing else that indicates the start of a new question. Note that I can't use : as a delimiter, as it often appears in the text as a legitimate character.
Can someone tell me the best way to parse using multiple delimiters so that it reads:
Question: Question Text
Location: Chicago, IL
School:
Grade: 8
Question: Question Text
Location: Inglewood, CA
School:
Grade: 4
thank you

Or you can use this code, but it has many limitations: each delimiter must start with an uppercase letter, and no other word may contain one (you can use a regexp instead to lift this restriction). Also, the result contains all entries for the given delimiter, separated by ',' (comma); you can use a different separator and then split on it if you want them separated.
This is very poor quality code; any improvements and tips from experienced users are more than welcome.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class Test {
    public static void main(String[] args) {
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader("c:\\file.txt"))) {
            String line = br.readLine();
            if (line != null) {
                String questions = processLine(line, "Question:");
                System.out.println(questions);
            }
        } catch (IOException ex) { // also covers FileNotFoundException
            System.out.println(ex.getMessage());
        }
    }

    // Collects every value that follows the given delimiter, each followed by a comma.
    // A value is assumed to end just before the next uppercase letter.
    public static String processLine(String line, String delimiter) {
        StringBuilder result = new StringBuilder();
        String rest = line;
        int pos;
        while ((pos = rest.indexOf(delimiter)) != -1) {
            rest = rest.substring(pos);                     // skip ahead to the delimiter
            int end = getFirstUpper(rest);                  // next uppercase letter, or end of input
            int start = Math.min(delimiter.length(), end);  // guard against a value that starts uppercase
            result.append(rest.substring(start, end).trim()).append(",");
            rest = rest.substring(end);
        }
        return result.toString();
    }

    // Index of the first uppercase letter after position 0, or the string length if there is none.
    public static int getFirstUpper(String myString) {
        for (int i = 1; i < myString.length(); i++) {
            if (Character.isUpperCase(myString.charAt(i))) {
                return i;
            }
        }
        return myString.length();
    }
}
Or... you can use StringTokenizer as instructed on the other forum :))) That's the advantage of Java: knowledge of already built-in classes that improve and ease your work...
Edit: for this code to do exactly what you want (the printout), you must first adapt it to your specific needs, but hopefully it helps a bit.
Edited by: don1983p on Apr 19, 2009 12:11 AM
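As a sketch of the regexp route mentioned above: the labels can be listed in one pattern, and each value is taken to run until the next label or the end of the input. The class name LabelSplitter and the label list are assumptions taken from the sample in the question, not from the original code. A label followed by a colon inside a value would still be treated as a delimiter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; labels are assumed from the sample file.
class LabelSplitter {
    private static final String LABELS = "(?:Question|Location|School|Grade):";

    // group(1) = label with colon, group(2) = value up to the next label or end of input
    private static final Pattern FIELD = Pattern.compile(
            "(" + LABELS + ")\\s*(.*?)\\s*(?=" + LABELS + "|$)", Pattern.DOTALL);

    static List<String> split(String text) {
        List<String> fields = new ArrayList<>();
        Matcher m = FIELD.matcher(text);
        while (m.find()) {
            fields.add((m.group(1) + " " + m.group(2)).trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = "Question: Question text goes here Location: Chicago, IL "
                + "School: Unit 5 Grade: 8";
        for (String field : split(sample)) {
            System.out.println(field);
        }
    }
}
```

Each list entry is one "Label: value" line, so printing the list one entry per line gives the layout asked for in the question.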

Similar Messages

  • Multiple delimiters for StringTokenizer

    If I am trying to parse a string which has multiple delimiters like
    13:10_76.5
    How do I tell the string tokenizer that : _ . (colon, underscore, period) are all delimiters? Or can I?
    Thanks,
    WF

    Constructs a string tokenizer for the specified
    string. The characters in the delim argument are
    the delimiters for separating tokens.
    Delimiter characters themselves will not be treated as
    tokens.
    Parameters:
    str - a string to be parsed.
    delim - the delimiters.
    Okay, I am new at this and perhaps I am missing something obvious, but if my string name is "inputstring"
    this should look like this: public StringTokenizer(inputstring, :_.) ??
    I am trying to parse out the stuff between these delimiters like in this example 13:10_78.5
    If I want 13 then 10 then 78 then 5 pulled out, how do I get it to do that? Is that more clear??
    Thanks for your patience!
    Wf
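A minimal sketch of what the quoted documentation describes: the second constructor argument is a String, and every character in it is treated as a delimiter. The class and method names here are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Illustrative wrapper around java.util.StringTokenizer.
class TokenDemo {
    // Every character of delims acts as a separate delimiter.
    static List<String> tokens(String input, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, delims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Colon, underscore and period are all delimiters here.
        System.out.println(tokens("13:10_76.5", ":_."));  // prints [13, 10, 76, 5]
    }
}
```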

  • Flat File Destination Multiple Delimiters

    Hi Guys,
    This might been an easy one for you. I need to find out if the data from a table can be loaded into a text file with multiple delimiters or not using SSIS? For example, if I have 2 columns of data from a table,
    ID Name
    100 Mark
    I need to load this into a flat file and the output should be like this
    100,@Mark
    So basically, I need to have 2 delimiters in the flat file destination namely "," and "@". How can this be done.
    Thanks in advance.

    Can't you just fill in two delimiters in the Column Delimiter field of the Flat File Connection Manager (never tried it)?
    Alternatives:
    -In the source query add a @ to the second column: select column1,
    '@' + column2 as column2 from yourTable
    -Add a Derived Column with an expression that adds a @ in front of column2: "@" + [column2]

  • Read from spreadsheet file with multiple delimiters

    Is there a way to specify multiple delimiters in the Read From Spreadsheet File VI? I have a file that I need to read in that contains both space and comma delimiters, and I would like to read that data into an array using both delimiters (or, not and). Below is the data I'm trying to read.
    ;attenuator data table
    att00:   db       000h,015h,017h,035h,03Ch,03Eh,03Eh,05Ch,05Eh,05Eh
    att10:   db       07Ch,07Eh,07Fh,09Dh,09Fh,09Fh,0BDh,015h,017h,035h
    att20:   db       03Eh,03Eh,05Ch,05Ch,05Eh,07Ch,07Ch,07Eh,09Dh,09Dh
    att30:   db       09Fh,0BDh,000h,000h,000h,002h,002h,002h,002h,003h
    att40:   db       021h,021h,021h,021h,021h,023h,023h,023h,023h,023h
    att50:   db       041h,041h,048h,048h,048h,04Ah,04Ah,04Ah,04Ah,068h
    att60:   db       068h,068h,068h,068h,068h,06Ah,06Bh,06Bh,06Bh,089h
    att70:   db       089h,089h,089h,08Bh,08Bh,08Bh,08Bh,0A9h,0A9h,0A9h
    att80:   db       0A4h,0A6h,0A6h,0A6h,0A6h,0C4h,0C4h,0C4h,0C4h,0C6h
    att90:   db       0C6h,0C6h,0C6h,0E4h,0E4h,0E5h,0E5h,0E7h,0E7h,0E7h
        END
    I'm looking to just read in the data, adjust the hex values, and then save the data in the exact form in which I read it. If Read From Spreadsheet File cannot recognize multiple delimiters, that is all I need to know. I do not want to spend time reading it in using a single delimiter and doing a bunch of string manipulation. I'm also working with LabVIEW 8.5, if that makes a difference.

    You should use "scan string for tokens", and wire an array of delimiters.
    One nice behavior is that consecutive delimiters are contracted into one (by default), so if, for example, your delimiters are an array containing a space and a comma, a sequence of three spaces and a comma still counts as one delimiter.
    For some ideas, have a look at my old example here:
    http://forums.ni.com/ni/board/message?board.id=170&message.id=192847#M192847
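For comparison outside LabVIEW, the same "contract consecutive delimiters" behaviour can be sketched in Java with a regular expression character class. This is only an illustration of the idea, not the LabVIEW function; the class name is made up.

```java
import java.util.Arrays;

// Illustration only: whitespace and commas are both delimiters,
// and any run of them counts as a single separator.
class MultiDelimSplit {
    static String[] split(String line) {
        return line.trim().split("[\\s,]+");
    }

    public static void main(String[] args) {
        String row = "att00:   db       000h,015h,017h";
        System.out.println(Arrays.toString(split(row)));  // prints [att00:, db, 000h, 015h, 017h]
    }
}
```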

  • How to parse multiple xml documents from single buffer

    Hello,
    I am trying to use jaxb 2.0 to parse a buffer which contains multiple xml documents. However, it seems that it is meant to only parse a single document at a time and throws an exception when it gets to the 2nd document.
    Is there a way I can tell jaxb to only parse the first complete document and not fetch the next one out of the buffer? Or what is the most efficient way to separate the buffer into two documents without parsing it manually. If I have to search the buffer for the next document root and then split the buffer, it seems like that defeats the purpose of using jaxb as the parser.
    I am using the Unmarshaller.unmarshal method and the exception I am getting is:
    org.xml.sax.SAXParseException: Illegal character at end of document, <.]
         at javax.xml.bind.helpers.AbstractUnmarshallerImpl.createUnmarshalException(AbstractUnmarshallerImpl.java:315)
         at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.createUnmarshalException(UnmarshallerImpl.java:476)
         at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:198)
         at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:167)
         at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:137)
         at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:184)
    Thank you for your help

    It's just like any other XML parser, it's only designed to parse one XML document. If you have something that concatenates two XML documents together (that's what your "buffer" sounds like), then stop doing that.

  • Parsing multiple lines

    hey
    i'm trying to parse some text
    name Pac Man
    email pac.man@@gamebots.com
    phone oh oh oh
    address smaith st
    name Luke Skywalker
    address Dirt Farm,
    near Los Isely,
    Tattooine
    name abc
    phone 9283042
    I've managed to parse single lines OK and put them into an ArrayList, but for Luke Skywalker's address I'm unable to parse the multiple lines.
    I think I'll set it so it parses up to the next name or blank line, but I'm having trouble writing this code; it still only parses a single line:
    else if (word.equalsIgnoreCase("address") && text.hasNext()) {
        address = text.nextLine();
        address = address.trim();
        pers.setAddress(address);
    }

    You could just read lines until you reach a blank line. Now that you have all the info for one person, you can process the lines to get the name, email and address, concatenating multiple lines as needed.
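A rough sketch of that approach, under some assumptions not stated in the thread: a record starts at a "name" line (or after a blank line), the keyword set is the one visible in the sample, and any line that does not start with a keyword continues the previous field. The class name RecordReader is hypothetical. A continuation line that happens to start with a keyword would be misread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reader; keywords are assumed from the sample input.
class RecordReader {
    static final List<String> KEYS = Arrays.asList("name", "email", "phone", "address");

    static List<Map<String, String>> parse(List<String> lines) {
        List<Map<String, String>> people = new ArrayList<>();
        Map<String, String> current = null;
        String lastKey = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {              // a blank line ends the current record
                current = null;
                lastKey = null;
                continue;
            }
            String[] parts = line.split("\\s+", 2);
            String key = parts[0].toLowerCase();
            if (KEYS.contains(key)) {
                if (key.equals("name") || current == null) {   // start a new record
                    current = new LinkedHashMap<>();
                    people.add(current);
                }
                current.put(key, parts.length > 1 ? parts[1] : "");
                lastKey = key;
            } else if (current != null && lastKey != null) {
                // continuation line: append it to the previous field
                current.merge(lastKey, line, (a, b) -> a + " " + b);
            }
        }
        return people;
    }
}
```

With the sample from the question, the second record's address would come out as the three address lines joined with spaces.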

  • Digester parsing multiple tags in XML

    Hi,
    I am using Digester parser to parse a XML which does have multiple tags in it like for example:-
    <Customer>
    <Customerdetails>
    <Customer name="xxx" age="xxx" />
    <Customer name="yyy" age="yyy" />
    </Customerdetails>
    <Messages>
    <Success msg="success" active="true"/>
    <Failure msg="failure" active="true"/>
    </Messages>
    </Customer>
    Here i am parsing the the customer details using the following code with Digester:-
    Digester digester = new Digester();
    digester.setValidating(false);
    ArrayList<Customer> customerList = new ArrayList<Customer>();
    digester.push(customerList);
    digester.addObjectCreate("Customer/Customerdetails/Customer", Customer.class);
              String[] attributeNames = new String[] { "name",
                        "age"};
              String[] propertyNames =new String[] { "name",
                        "age"};
              digester.addSetProperties("Customer/Customerdetails/Customer", attributeNames,
                        propertyNames);
              digester.addSetNext("Customer/Customerdetails/Customer", "add");
    // parsing for Message success & failure will go here
              try {
                   digester.parse(in);
              } catch (Exception e) {
                   throw new RuntimeException(e);
              }
              System.out.println("the customerList.size"+customerList.size());
              return customerList;
    here it is giving the CustomerList size as "2" , which is correct and i am getting the customer details (name & age) properly, now i need to parse the <Messages> both "Success" and "Failure", that should be done in the same method , when i try to do the same logic for Messages like:-
    digester.addObjectCreate("Customer/Messages/Success", Customer.class);
    String[] attributeNames = new String[] { "msg",
                        "active"};
              String[] propertyNames =new String[] { "msg",
                        "active"};
              digester.addSetProperties("Customer/Messages/Success", attributeNames,
                        propertyNames);
              digester.addSetNext("CustomerMessages/Success", "add");
    i am getting the success message data properly , but the customerdetails data is lost, since i am writing this logic next after the customerdetails.
    Now i need the final list (ie) CustomerList to hold all the values that is "Customerdetails", "MessageSuccess" & "MessageFailure" . But i am getting only the values for the last one that is parsed. Is there any way to get all those values in the final list (CustomerList), i am pretty new to Digester, similarly to parse the message failure details, whether we need create another digester.addObjectCreate("Customer/Messages/Failure", Customer.class); to parse it , cant we optimize this through Digester? Please do shed some light into this.
    Thanks,
    Rithu

    No reply guys?

  • Passing and Parsing Multiple values for 1 Parameter using ASP and reportInterface.ReportParameters

    Post Author: ckwizard77
    CA Forum: Crystal Reports
    HELP!!
    I have been knocking my head against a wall trying to figure out how to pass multiple values to 1 parameter and how to add it to the parameter collection. I have code that works fine if I pass a single value for each parameter. I am passing the parameters and values in a pipe-delimited string through a URL, where it gets parsed and passed in here.
    Any help would be greatly appreciated.
    Here is the single param code:
    Public Sub SetParamValues(ReportName, strParamName, ParamValue)
        Dim i,reportInterface, ParamName,strSubReprotName,CurrentValues
        Set reportInterface = reportObject.PluginInterface("")
        Set reportParameters = result.Item(1).PluginInterface("Report").ReportParameters
        For i=1 to reportParameters.Count
            if ReportName <>"" then
                strSubReprotName=reportInterface.ReportParameters.Item(i).ReportName
                if strSubReprotName=ReportName then
                    ParamName = reportInterface.ReportParameters.Item(i).ParameterName
                    if ucase(ParamName)=Ucase(strParamName) then
                        Set CurrentValues = reportInterface.ReportParameters.Item(i).CurrentValues
                        CurrentValues.Clear()
                        Dim newSingleParameter
                        Set newSingleParameter = ReportInterface.ReportParameters.Item(i).CreateSingleValue
                        newSingleParameter.Value = ParamValue
                        reportInterface.ReportParameters.Item(i).CurrentValues.Add newSingleParameter
                        reportInterface.ReportParameters.Item(i).PromptOnDemandViewing=false
                        iStore.Commit result
                    End if
                End if
            Else
                ParamName = reportInterface.ReportParameters.Item(i).ParameterName
                Set param1 = reportInterface.ReportParameters.Item(i)
                if Ucase(ParamName)=UCase(strParamName) then
                    Set CurrentValues = reportInterface.ReportParameters.Item(i).CurrentValues
                    CurrentValues.Clear()
                    Set newSingleParameter = ReportInterface.ReportParameters.Item(i).CreateSingleValue
                    newSingleParameter.Value = ParamValue
                    reportInterface.ReportParameters.Item(i).CurrentValues.Add newSingleParameter
                    reportInterface.ReportParameters.Item(i).PromptOnDemandViewing=false
                    iStore.Commit result
                End if
            End if
        Next
    End Sub

    That Makes sense.
    thanks a lot !
    While we are at it, mind if I ask you another quick question?
    If I add an option called ALL to the multiselect list, it should return all the results,
    i.e. it should act like this:
    select * from dept;
    Your solution was:
    select * from dept
    where INSTR(':'||:P1_EMPNO||':', ':'||empno ||':') > 0
    Can I modify this to return all the rows ?

  • Parse multiple files in one flat file?

    Hi all,
    I'm currently working with flat file with  this kind of structure:
    "849000","1","2","3","4"             <- begin of file
    "849HD","","1939","12"              <- header level
    "849D1","39193","313","1"         <- detail level
    "849D2","","description","48,13" <- detail description level
    "849RT","133,1","N4","203"        <- totals level
    The problem is that the file I have to pick up (the map is File => EDI)
    can contain many structures (every structure is an EDI to generate),
    example:
    "849000","1","2","3","4"             <- first file
    "849HD","","1939","12"             
    "849D1","39193","313","1"         
    "849D2","","description","48,13" 
    "849RT","133,1","N4","203"         <- end of first file
    "849000","2","","","6"              <- second file
    "849HD","","92","23"              
    "849D1","99","912","1"         
    "849D2","","second description","3,11" 
    "849RT","61","2","UP","102"         <- end of second file
    - How can i parse this file in order to get a nested structure?
    MT_file
    MT_file/row (0.unbounded) <- that would contain 2 files
    MT_file/row/849000
    MT_file/row/849HD
    MT_file/row/849D1
    MT_file/row/849D2
    MT_file/row/849RT
    I ask because I believe content conversion (KeyFieldValue) is not effective here, since it will take the key and generate the segments all together without respecting the order.
    Any ideas?
    Thanks!

    I'm not so sure about that; I think that the KeyField won't be able to understand the hierarchy,
    example:
    "849000","1","2","3","4" <- first file
    "849HD","","1939","12"
    "849D1","39193","313","1"
    "849D2","","description","48,13"
    "849RT","133,1","N4","203" <- end of first file
    "849000","2","","","6" <- second file
    "849HD","","92","23"
    "849D1","99","912","1"
    "849D2","","second description","3,11"
    "849RT","61","2","UP","102" <- end of second file
    with keys:
    849000
    849HD
    849D1
    849D2
    849RT
    I will probably get everything in one group, and in doing that I'll lose the reference to the first and second file,
    resulting:
    "849000","1","2","3","4" <- first file
    "849000","2","","","6" <- second file
    "849HD","","1939","12"
    "849HD","","92","23"
    "849D1","39193","313","1"
    "849D1","99","912","1"
    "849D2","","description","48,13"
    "849D2","","second description","3,11"
    "849RT","133,1","N4","203" <- end of first file
    "849RT","61","2","UP","102" <- end of second file
    Or am i missing something?
    This is file => EDI, so the channel would be SENDER
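Independently of the adapter's content conversion, the grouping being asked for can be sketched as a small pre-processing step: start a new group whenever a row begins with the "849000" key, so the order inside each file is preserved. This is only an illustration in plain Java, with a made-up class name, and is not a description of how the KeyFieldValue setting behaves.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-processing step: one inner list per logical file.
class FileGrouper {
    static List<List<String>> group(List<String> lines) {
        List<List<String>> files = new ArrayList<>();
        List<String> current = null;
        for (String line : lines) {
            // A row keyed "849000" marks the beginning of a new file.
            if (current == null || line.startsWith("\"849000\"")) {
                current = new ArrayList<>();
                files.add(current);
            }
            current.add(line);
        }
        return files;
    }
}
```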

  • VBScript for parsing multiple text files

    Hi,
    I have around 175 text files that contain inventory information that I am trying to parse into an Excel file. We are upgrading our Office platform from 2003 to 2010 and my boss wants to know which machines will have trouble supporting it. I found a script that will parse a single text file based upon ":" as the delimiter and I'm having trouble figuring out how to change it to open an entire folder of text files and write all of the data to a single Excel spreadsheet. Here is an example of the text file I'll be parsing. I'm interested in the "Memory and Processor Information" and "Disk Drive Information" sections mainly.
    ABEHRENS-XP Computer Inventory
    OS Information
    OS Details
    Caption: Microsoft Windows XP Professional
    Description:
    InstallDate: 20070404123855.000000-240
    Name: Microsoft Windows XP Professional|C:\WINDOWS|\Device\Harddisk0\Partition1
    Organization: Your Mom
    OSProductSuite:
    RegisteredUser: Bob
    SerialNumber: 55274-640-3763826-23029
    ServicePackMajorVersion: 3
    ServicePackMinorVersion: 0
    Version: 5.1.2600
    WindowsDirectory: C:\WINDOWS
    Memory and Processor Information
    504MB Total memory HOW CAN I PULL THIS WITHOUT ":" ALSO
    Computer Model: HP d330 uT(DG291A)
    Processor:               Intel(R) Pentium(R) 4 CPU 2.66GHz
    Disk Drive Information
    27712MB Free Disk Space ANY WAY TO PULL THIS WITHOUT ":"
    38162MB Total Disk Space
    Installed Software
    Here is the start of the script I have so far. . .
    Const ForReading = 1
    Set objDict = CreateObject("Scripting.Dictionary")
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objTextFile = objFSO.OpenTextFile("C:\Test\test.txt" ,ForReading)
    ' WANT THIS TO BE C:\Test
    Do Until objTextFile.AtEndOfStream
    strLine = objTextFile.ReadLine
    If Instr(strLine,":") Then
    arrSplit = Split(strLine,":") ' IS ":" THE BEST DELIMITER TO USE?
    strField = arrSplit(0)
    strValue = arrSplit(1)
    If Not objDict.Exists(strField) Then
    objDict.Add strField,strValue
    Else
    objDict.Item(strField) = objDict.Item(strField) & "||" & strValue
    End If
    End If
    Loop
    objTextFile.Close
    Set objExcel = CreateObject("Excel.Application")
    objExcel.Visible = True
    objExcel.Workbooks.Add
    intColumn = 1
    For Each strItem In objDict.Keys
    objExcel.Cells(1,intColumn) = strItem
    intColumn = intColumn + 1
    Next
    intColumn = 1
    For Each strItem In objDict.Items
    arrValues = Split(strItem,"||")
    intRow = 1
    For Each strValue In arrValues
    intRow = intRow + 1
    objExcel.Cells(intRow,intColumn) = strValue
    Next
    intColumn = intColumn + 1
    Next
    Thank you for any help.

    You are The Bomb.com! I had to play around with it to pull some additional data (model and processor) and then write a quick macro to remove the unwanted text and finally I wanted the data to write in columns instead of rows so this is what I ended up with:
    Option Explicit
    Dim objFSO, objFolder, strFolder, objFile
    Dim objReadFile, strLine, objExcel, objSheet
    Dim intCol, strExcelPath
    Const ForReading = 1
    strFolder = "c:\Test"
    strExcelPath = "c:\Test\Inventory.xlsx"
    Set objExcel = CreateObject("Excel.Application")
    objExcel.Workbooks.Add
    Set objSheet = objExcel.ActiveWorkbook.Worksheets(1)
    intCol = 0
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFolder = objFSO.GetFolder(strFolder)
    For Each objFile In objFolder.Files
      intCol = intCol + 1
      Set objReadFile = objFSO.OpenTextFile(objFile.Path, ForReading)
      Do Until objReadFile.AtEndOfStream
        strLine = objReadFile.ReadLine
        If (InStr(strLine, "Computer Inventory") > 0) Then
          objSheet.Cells(intCol, 1).Value = Left(strLine, InStr(strLine, "Computer Inventory") - 2)
        End If
        If (InStr(strLine, "Total memory") > 0) Then
          objSheet.Cells(intCol, 2).Value = Left(strLine, InStr(strLine, "Total memory") - 2)
        End If
        If (InStr(strLine, "Computer Model:") > 0) Then
          objSheet.Cells(intCol, 3).Value = (strLine)
        End If
        If (InStr(strLine, "Processor:") > 0) Then
          objSheet.Cells(intCol, 4).Value = (strLine)
        End If
        If (InStr(strLine, "Total Disk Space") > 0) Then
          objSheet.Cells(intCol, 5).Value = Left(strLine, InStr(strLine, "Total Disk Space") - 2)
        End If
        If (InStr(strLine, "Free Disk Space") > 0) Then
          objSheet.Cells(intCol, 6).Value = Left(strLine, InStr(strLine, "Free Disk Space") - 2)
        End If
      Loop
    Next
    objExcel.ActiveWorkbook.SaveAs strExcelPath
    objExcel.ActiveWorkbook.Close
    objExcel.Quit
    Thanks again!
    Hi ,
    I have very basic knowledge of VB scripting, but this code could be the perfect solution I am looking for. Could you guide me on exactly how to run and test it? I would be really thankful for your kind and generous support on this.
    Thanks ,
    Veer

  • Parsing multiple digit numbers from string

    I have a string, specifically "Cups = 30", that I need to parse out "30" from. How can I do this? I've gone through tokenizing, and it's driving me crazy!!

    This program will help you
    import java.util.StringTokenizer;

    class s21 {
        public static void main(String[] args) {
            String s1 = "Cups = 30";
            // '=' and ' ' are both delimiters, so the tokens are "Cups" and "30"
            StringTokenizer st = new StringTokenizer(s1, "= ");
            st.nextToken();                      // skip "Cups"
            System.out.println(st.nextToken());  // This statement will give 30 as output
        }
    }
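If the number is needed as an int rather than as a String, one possible variation is to look for the first run of digits with a regular expression and convert it. The class name and the -1 "not found" convention are assumptions for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative helper, not part of the original answer.
class FirstNumber {
    // Returns the first run of digits in the text as an int, or -1 if there is none.
    static int firstInt(String text) {
        Matcher m = Pattern.compile("\\d+").matcher(text);
        return m.find() ? Integer.parseInt(m.group()) : -1;
    }

    public static void main(String[] args) {
        System.out.println(firstInt("Cups = 30"));  // prints 30
    }
}
```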

  • Multiple Delimiters from a Text File

    I am having an issue trying to figure out how to separate this text file into 4 columns so I can use the data:
    S, 0, { }, {a, d}
    a, 7, {S},
    b, 5, {a}, {c, h}
    c, 2, {b, d}, {f}
    d, 10, {S}, {c, e}
    e, 1, {d}, {f}
    f, 3, {c, e, h}, {g}
    g, 4, {f}, {F}
    h, 4, {b}, {f}
    F, 0, {g}, { }
    That is the text file, and I am trying to figure out how I could split it up into four columns to manipulate the data.

    The best starting point: What have you come up with so far, and what's wrong with it?
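As one possible starting point, assuming the braces are balanced and never nested, each line can be split on the commas that are not inside a {...} group. The class name is made up for this sketch.

```java
import java.util.Arrays;

// Sketch: split a line such as "b, 5, {a}, {c, h}" into its four columns.
class BraceColumns {
    static String[] columns(String line) {
        // The negative lookahead rejects a comma when a '}' follows before any other brace,
        // i.e. when the comma sits inside a group. Limit -1 keeps a trailing empty column.
        return line.split(",\\s*(?![^{}]*\\})", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(columns("b, 5, {a}, {c, h}")));  // prints [b, 5, {a}, {c, h}]
    }
}
```

The line "a, 7, {S}," from the sample yields four columns as well, the last one empty.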

  • Parse multiple items in a field.

    I have a field that contains many part numbers. There can be 2 or more part numbers in the field, separated by a space. How can this be read dynamically so you can compare the contents in the where clause to what you are searching for? Example part numbers in the field: 1X 2X 3X 4Y 5Y 7Y
    I would like to read the field and make rows out of the data.
    1X
    2X
    3X
    4Y
    5Y
    7Y

    If by 'field' you mean 'column', then you are already starting with a seriously flawed design. Please do a little reading on 'data normalization' and 'third normal form'. Hopefully you aren't so committed to this design as to prevent correcting it before you get any deeper.

  • Multiple Delimiters in a single column of a Delimited File

    The source file does not have single- or double-quote encapsulation. Does the escape character option work? If so, please explain the procedure in detail.

    Consider a scenario: in a pipe-delimited file, a column value contains a pipe. How can the Integration Service be made to understand that the pipe inside a column value is not a delimiter?

  • Parsing across multiple namespaces... best practice?

    Howdy all, here's my situation:
    I am attempting to create a simple interface to a particular type of webdav server which has some unusual XML data coming back, and need some advice on the best way to parse this XML.
    I have a generic "query" object which consists of a set of "Fields", a "Scope" and a set of "Constraints". This query is ultimately formed as an XML request sent to the WebDAV server. The response is also a generic "ResultSet" response, which consists of "Rows". Each Row consists of "Fields" (the same type as in the query). A Field is basically a key/value pair (all Strings), and a namespace which describes the location of the field within the WebDAV server. (obviously modelled on a JDBC-style interaction)
    I need to build a generic parser which can create a ResultSet from a raw XML document.
    For Example:
    Given a "query" with the following fields:
    FIELD 1
    namespace: urn:schemas:httpmail
    name:fromemail
    FIELD 2
    namespace: DAV
    name: id
    The query will basically say (pseudo SQL)
    "select "urn:schemas:httpmail:fromemail", "DAV:id" from <scope> where <constraints>"
    This will return an XML document something like this:
    <a:multistatus
          xmlns:a="DAV:"
          xmlns:b="urn:uuid:c2f41010-65b3-11d1-a29f-00aa00c14882/"
          xmlns:c="xml:"
          xmlns:d="urn:schemas:httpmail:"
          xmlns:e="urn:schemas:mailheader:">
       <a:response>
          <a:href>
                http://blah.blah.com/someurl
          </a:href>
          <a:propstat>
             <a:status>
                   HTTP/1.1 200 OK
             </a:status>
             <a:prop>
                <d:fromemail>
                     [email protected]
                </d:fromemail>
                <a:id>
                      AQsAAAAARgAgCwAAAABGeAIAAAAA
                </a:id>
             </a:prop>
          </a:propstat>
       </a:response>
       <a:response>
       </a:response>
    </a:multistatus>
    So.. I then need to create a ResultSet object, which looks like this:
    ResultSet<Object>
    |---Rows<Collection>
        |---Row<Object>
            |---Field<Object>
                |---name: fromemail
                |---namespace: urn:schemas:httpmail
                |---value: [email protected]
            |---Field<Object>
                |---name: id
                |---namespace: DAV
                |---value: AQsAAAAARgAgCwAAAABGeAIAAAAA
        |---Row<Object>
                ....
    As you can see, I need BOTH the value, and the original data relating to the namespace and field name in the Java object returned.
    Problems I am having:
    1. Parsing multiple namespaces. The path to multistatus/response/prop/fromemail (for example) spans two namespaces, hence can't use Commons Digester
    2. Including XML element names as values in the returned object. I need/want to be able to store the XML element name (eg "fromemail") as a value in the java object returned.
    I'm looking for thoughts as to the best way to deal with this... XPath?, DOM/SAX? etc
    Any help you can provide.
    Cheers
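One option among those listed is a namespace-aware DOM parse, which exposes both the namespace URI and the local element name of every node. The sketch below assumes the fields are the element children of each prop element in the "DAV:" namespace, as in the sample response; the class name and the String[] layout are illustrative, not the poster's ResultSet/Field classes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: one row per <prop>, one {namespace, name, value} triple per child element.
class DavProps {
    static List<List<String[]>> parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);   // required for getNamespaceURI/getLocalName
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<List<String[]>> rows = new ArrayList<>();
            NodeList props = doc.getElementsByTagNameNS("DAV:", "prop");
            for (int i = 0; i < props.getLength(); i++) {
                List<String[]> row = new ArrayList<>();
                NodeList children = props.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node n = children.item(j);
                    if (n.getNodeType() == Node.ELEMENT_NODE) {
                        row.add(new String[] {
                                n.getNamespaceURI(), n.getLocalName(), n.getTextContent().trim() });
                    }
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse response", e);
        }
    }
}
```

Note that the namespace URI is reported exactly as declared in the document, so with the sample response it would include the trailing colon (for example "urn:schemas:httpmail:").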

