Parsing multiple delimiters

I need to read and parse the contents of a .txt file. I can do this with the Scanner class by defining a delimiter (e.g. scanner.useDelimiter(",");).
However, my text file has multiple delimiters. For example, here is a snippet of the file:
Question: Question text goes here Location: Chicago, IL School: Unit 5 Grade: 8 Question: Question 2 text goes here Location: Inglewood, CA School: Ben Franklin Elementary Grade: 4
So, basically, there are several delimiters (Question:, Location:, School:, etc.) and nothing that marks the start of a new question. Note that I can't use : alone as a delimiter, because it often appears in the text as a legitimate character.
Can someone tell me the best way to parse with multiple delimiters so that it reads:
Question: Question Text
Location: Chicago, IL
School:
Grade: 8
Question: Question Text
Location: Inglewood, CA
School:
Grade: 4
thank you

Or you can use this code, but it has many limitations: each delimiter must start with an uppercase letter, and no other word in the line may contain an uppercase letter (you could use a regexp instead to avoid this). Also, the result contains all entries for the given delimiter, separated by ',' (comma); you can use a different separator and then split on it if you want the entries separately.
This is very poor quality code; improvements and tips from more experienced users are welcome.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

public class test {
    public static void main(String[] args) {
        try {
            BufferedReader br = new BufferedReader(new FileReader(new File("c:\\file.txt")));
            // Only the first line of the file is processed.
            String questions = processLine(br.readLine(), "Question:");
            System.out.println(questions);
        } catch (FileNotFoundException ex) {
            System.out.println(ex.getMessage());
        } catch (IOException ex) {
            System.out.println(ex.getMessage());
        }
    }

    // Collects every value that follows the given delimiter, each followed by a comma.
    // A value is assumed to end just before the next uppercase letter in the line.
    public static String processLine(String line, String delimiter) {
        String processLine = "";
        if (line.contains(delimiter)) {
            String delimiterIncrement = line;
            while (delimiterIncrement.contains(delimiter)) {
                int pos = delimiterIncrement.indexOf(delimiter);
                if (pos != -1) {
                    int end = getFirstUpper(delimiterIncrement);
                    if (pos > end) {
                        // Skip ahead to the delimiter, then look for the end of its value.
                        delimiterIncrement = delimiterIncrement.substring(pos);
                        end = getFirstUpper(delimiterIncrement);
                        processLine = processLine + delimiterIncrement.substring(delimiter.length() + 1, end) + ",";
                        delimiterIncrement = delimiterIncrement.substring(end + 1);
                    } else {
                        processLine = processLine + delimiterIncrement.substring(pos + delimiter.length() + 1, end) + ",";
                        delimiterIncrement = delimiterIncrement.substring(end + 1);
                    }
                } else {
                    break;
                }
            }
        }
        return processLine;
    }

    // Returns the index, within myString.substring(1), of the first uppercase
    // letter after the first character, or 1000000 if there is none
    // (the caller does not handle that case).
    public static int getFirstUpper(String myString) {
        int first = 1000000;
        myString = myString.substring(1);
        for (char c : myString.toCharArray()) {
            if (Character.isUpperCase(c)) {
                first = myString.indexOf(c);
                break;
            }
        }
        return first;
    }
}
Or... you can use StringTokenizer as instructed on the other forum :))) That's the advantage of Java: knowing the built-in classes improves and eases your work...
edit: so this code does exactly what you want (printout) you must first adapt it to your specific needs... but hopefully helps a bit...
Edited by: don1983p on Apr 19, 2009 12:11 AM
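The regexp idea mentioned above could look roughly like the sketch below. It treats the four labels from the sample line as the only delimiters; the class name FieldSplitter and the method splitFields are made up for illustration. It will still split in the wrong place if the question text itself contains one of the labels followed by a colon.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FieldSplitter {
    // Labels taken from the sample line; extend the alternation for other labels.
    private static final Pattern LABEL =
            Pattern.compile("\\b(Question|Location|School|Grade):\\s*");

    /** Returns one "Label: value" string per field, in the order found. */
    static List<String> splitFields(String line) {
        List<String> fields = new ArrayList<>();
        Matcher m = LABEL.matcher(line);
        String label = null;   // label of the field currently being read
        int valueStart = 0;    // index where that field's value begins
        while (m.find()) {
            if (label != null) {
                // The previous value ends where the next label starts.
                fields.add(label + ": " + line.substring(valueStart, m.start()).trim());
            }
            label = m.group(1);
            valueStart = m.end();
        }
        if (label != null) {
            fields.add(label + ": " + line.substring(valueStart).trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "Question: Question text goes here Location: Chicago, IL "
                + "School: Unit 5 Grade: 8";
        for (String field : splitFields(line)) {
            System.out.println(field);
        }
    }
}
```

Because a label only counts when it is immediately followed by a colon, the word "Question" inside the question text is left alone.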

Similar Messages

Multiple delimiters for StringTokenizer

If I am trying to parse a string which has multiple delimiters like
13:10_76.5
How do I tell the string tokenizer that : _ . (colon, underscore, period) are all delimiters? Or can I?
Thanks,
WF

Constructs a string tokenizer for the specified
string. The characters in the delim argument are
the delimiters for separating tokens.
Delimiter characters themselves will not be treated as
tokens.
Parameters:
str - a string to be parsed.
delim - the delimiters.
Okay, I am new at this and perhaps I am missing something obvious, but if my string name is "inputstring",
should this look like this: public StringTokenizer(inputstring, :_.) ??
I am trying to parse out the stuff between these delimiters like in this example 13:10_78.5
If I want 13 then 10 then 78 then 5 pulled out, how do I get it to do that? Is that more clear??
Thanks for your patience!
Wf
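A minimal sketch of what the quoted documentation describes: the delimiters are passed as one string, and every character in it is a separate delimiter. The class name MultiDelimTokens and the method tokens are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class MultiDelimTokens {
    static List<String> tokens(String input) {
        // Each character of the second argument (':', '_' and '.') is a delimiter.
        StringTokenizer st = new StringTokenizer(input, ":_.");
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("13:10_78.5")); // prints [13, 10, 78, 5]
    }
}
```

Note that the delimiter argument is a quoted string literal, not bare characters.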

Flat File Destination Multiple Delimiters

Hi Guys,
This might be an easy one for you. I need to find out whether data from a table can be loaded into a text file with multiple delimiters using SSIS. For example, if I have 2 columns of data from a table,
ID Name
100 Mark
I need to load this into a flat file and the output should be like this
100,@Mark
So basically, I need to have 2 delimiters in the flat file destination, namely "," and "@". How can this be done?
Thanks in advance.

Can't you just fill in two delimiters in the Column Delimiter field of the Flat File Connection Manager (never tried it)?
Alternatives:
-In the source query add a @ to the second column: select column1,
'@' + column2 as column2 from yourTable
-Add a Derived Column with an expression that adds a @ in front of column2: "@" + [column2]

Read from spreadsheet file with multiple delimiters

Is there a way to specify multiple delimiters in the Read From Spreadsheet File VI? I have a file that I need to read in that contains both space and comma delimiters, and I would like to read that data into an array using both delimiters (or, not and). Below is the data I'm trying to read.
;attenuator data table
att00:   db       000h,015h,017h,035h,03Ch,03Eh,03Eh,05Ch,05Eh,05Eh
att10:   db       07Ch,07Eh,07Fh,09Dh,09Fh,09Fh,0BDh,015h,017h,035h
att20:   db       03Eh,03Eh,05Ch,05Ch,05Eh,07Ch,07Ch,07Eh,09Dh,09Dh
att30:   db       09Fh,0BDh,000h,000h,000h,002h,002h,002h,002h,003h
att40:   db       021h,021h,021h,021h,021h,023h,023h,023h,023h,023h
att50:   db       041h,041h,048h,048h,048h,04Ah,04Ah,04Ah,04Ah,068h
att60:   db       068h,068h,068h,068h,068h,06Ah,06Bh,06Bh,06Bh,089h
att70:   db       089h,089h,089h,08Bh,08Bh,08Bh,08Bh,0A9h,0A9h,0A9h
att80:   db       0A4h,0A6h,0A6h,0A6h,0A6h,0C4h,0C4h,0C4h,0C4h,0C6h
att90:   db       0C6h,0C6h,0C6h,0E4h,0E4h,0E5h,0E5h,0E7h,0E7h,0E7h
    END
I'm looking to just read in the data, adjust the hex values, and then save the data in the exact form in which I read it. If Read From Spreadsheet File cannot recognize multiple delimiters, that is all I need to know. I do not want to spend time reading it in using a single delimiter and doing a bunch of string manipulation. I'm also working with LabVIEW 8.5, if that makes a difference.

You should use "scan string for tokens", and wire an array of delimiters.
One nice behavior is the fact that consecutive delimiters are contracted into one (by default), so e.g. if your delimiters is an array containing a space and a comma, a sequence of three spaces and a comma would still count as one delimiter.
For some ideas, have a look at my old example here:
http://forums.ni.com/ni/board/message?board.id=170&message.id=192847#M192847

How to parse multiple xml documents from single buffer

Hello,
I am trying to use jaxb 2.0 to parse a buffer which contains multiple xml documents. However, it seems that it is meant to only parse a single document at a time and throws an exception when it gets to the 2nd document.
Is there a way I can tell jaxb to only parse the first complete document and not fetch the next one out of the buffer? Or what is the most efficient way to separate the buffer into two documents without parsing it manually. If I have to search the buffer for the next document root and then split the buffer, it seems like that defeats the purpose of using jaxb as the parser.
I am using the Unmarshaller.unmarshal method and the exception I am getting is:
org.xml.sax.SAXParseException: Illegal character at end of document, <.]
     at javax.xml.bind.helpers.AbstractUnmarshallerImpl.createUnmarshalException(AbstractUnmarshallerImpl.java:315)
     at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.createUnmarshalException(UnmarshallerImpl.java:476)
     at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:198)
     at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:167)
     at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:137)
     at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:184)
Thank you for your help

It's just like any other XML parser, it's only designed to parse one XML document. If you have something that concatenates two XML documents together (that's what your "buffer" sounds like), then stop doing that.

Parsing multiple lines

hey
i'm trying to parse some text
name Pac Man
email pac.man@@gamebots.com
phone oh oh oh
address smaith st
name Luke Skywalker
address Dirt Farm,
near Los Isely,
Tattooine
name abc
phone 9283042
I've managed to parse single lines fine and put them into an ArrayList, but for Luke Skywalker's address I'm unable to parse the multiple lines.
I think I'll set it so that it parses up to the next name or a blank line, but I'm having trouble writing this code; it still only parses a single line:
else if (word.equalsIgnoreCase("address") && text.hasNext()) {
    address = text.nextLine();
    address = address.trim();
    pers.setAddress(address);
}

You could just read lines until you reach a blank line. Now that you have all the info for one person, you can process the lines to get the name, email and address, concatenating multiple lines as needed.
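A rough sketch of that idea, under some assumptions: a record ends at a blank line or at the next "name" line, the four keywords from the sample are the only field names, and any line that does not start with a keyword continues the previous field. The class RecordParser and its parse method are made up for illustration, and a continuation line that happens to start with a keyword would be misread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RecordParser {
    // Field keywords taken from the sample data.
    private static final List<String> KEYS =
            Arrays.asList("name", "email", "phone", "address");

    static List<Map<String, String>> parse(List<String> lines) {
        List<Map<String, String>> records = new ArrayList<>();
        Map<String, String> current = null; // record being filled
        String lastKey = null;              // field that a continuation line belongs to
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {           // a blank line closes the current record
                current = null;
                lastKey = null;
                continue;
            }
            String[] parts = line.split("\\s+", 2);
            String key = parts[0].toLowerCase();
            if (KEYS.contains(key)) {
                if (current == null || key.equals("name")) {
                    current = new LinkedHashMap<>();
                    records.add(current);
                }
                current.put(key, parts.length > 1 ? parts[1] : "");
                lastKey = key;
            } else if (current != null && lastKey != null) {
                // Continuation line: append it to the previous field.
                current.merge(lastKey, line, (a, b) -> a + " " + b);
            }
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "name Luke Skywalker", "address Dirt Farm,", "near Los Isely,",
                "Tattooine", "name abc", "phone 9283042");
        System.out.println(parse(lines));
    }
}
```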

Digester parsing multiple tags in XML

Hi,
I am using Digester parser to parse a XML which does have multiple tags in it like for example:-
<Customer>
<Customerdetails>
<Customer name="xxx" age="xxx" />
<Customer name="yyy" age="yyy" />
</Customerdetails>
<Messages>
<Success msg="success" active="true"/>
<Failure msg="failure" active="true"/>
</Messages>
</Customer>
Here I am parsing the customer details using the following code with Digester:
Digester digester = new Digester();
digester.setValidating(false);
ArrayList<Customer> customerList = new ArrayList<Customer>();
digester.push(customerList);
digester.addObjectCreate("Customer/Customerdetails/Customer", Customer.class);
String[] attributeNames = new String[] { "name", "age" };
String[] propertyNames = new String[] { "name", "age" };
digester.addSetProperties("Customer/Customerdetails/Customer", attributeNames, propertyNames);
digester.addSetNext("Customer/Customerdetails/Customer", "add");
// parsing for Message success & failure will go here
try {
    digester.parse(in);
} catch (Exception e) {
    throw new RuntimeException(e);
}
System.out.println("the customerList.size" + customerList.size());
return customerList;
Here it gives the customerList size as 2, which is correct, and I get the customer details (name and age) properly. Now I need to parse <Messages>, both "Success" and "Failure", in the same method. When I apply the same logic to Messages, like this:
digester.addObjectCreate("Customer/Messages/Success", Customer.class);
String[] attributeNames = new String[] { "msg",
                    "active"};
          String[] propertyNames =new String[] { "msg",
                    "active"};
          digester.addSetProperties("Customer/Messages/Success", attributeNames,
                    propertyNames);
          digester.addSetNext("CustomerMessages/Success", "add");
I get the success message data properly, but the customer details data is lost, since I wrote this logic after the customer details logic.
I need the final list (customerList) to hold all the values: "Customerdetails", "MessageSuccess" and "MessageFailure". But I only get the values for the last one parsed. Is there any way to get all of those values into the final list? I am pretty new to Digester. Similarly, to parse the failure message details, do we need to create another digester.addObjectCreate("Customer/Messages/Failure", Customer.class); or can this be optimized through Digester? Please shed some light on this.
Thanks,
Rithu

No reply guys?

Passing and Parsing Multiple values for 1 Parameter using ASP and reportInterface.ReportParameters

Post Author: ckwizard77
CA Forum: Crystal Reports
HELP!!
I have been knocking my head against a wall trying to figure out how to pass multiple values to one parameter and how to add them to the parameter collection. I have code that works fine if I pass a single value for each parameter. I am passing the parameters and values in a pipe-delimited string through a URL, where it gets parsed and passed in here.
Any help would be greatly appreciated.
Here is the single param code:
Public Sub SetParamValues(ReportName, strParamName, ParamValue)
    Dim i,reportInterface, ParamName,strSubReprotName,CurrentValues
    Set reportInterface = reportObject.PluginInterface("")
    Set reportParameters = result.Item(1).PluginInterface("Report").ReportParameters
    For i=1 to reportParameters.Count
        if ReportName <>"" then
            strSubReprotName=reportInterface.ReportParameters.Item(i).ReportName
            if strSubReprotName=ReportName then
                ParamName = reportInterface.ReportParameters.Item(i).ParameterName
                if ucase(ParamName)=Ucase(strParamName) then
                    Set CurrentValues = reportInterface.ReportParameters.Item(i).CurrentValues
                    CurrentValues.Clear()
                    Dim newSingleParameter
                    Set newSingleParameter = ReportInterface.ReportParameters.Item(i).CreateSingleValue
                    newSingleParameter.Value = ParamValue
                    reportInterface.ReportParameters.Item(i).CurrentValues.Add newSingleParameter
                    reportInterface.ReportParameters.Item(i).PromptOnDemandViewing=false
                    iStore.Commit result
                End if
            End if
        Else
            ParamName = reportInterface.ReportParameters.Item(i).ParameterName
            Set param1 = reportInterface.ReportParameters.Item(i)
            if Ucase(ParamName)=UCase(strParamName) then
                Set CurrentValues = reportInterface.ReportParameters.Item(i).CurrentValues
                CurrentValues.Clear()
                Set newSingleParameter = ReportInterface.ReportParameters.Item(i).CreateSingleValue
                newSingleParameter.Value = ParamValue
                reportInterface.ReportParameters.Item(i).CurrentValues.Add newSingleParameter
                reportInterface.ReportParameters.Item(i).PromptOnDemandViewing=false
                iStore.Commit result
            End if
        End if
    Next
End Sub

That Makes sense.
thanks a lot !
While we are at it, mind if I ask you another quick question?
Suppose I add an option called ALL to the multiselect list, which should return all the results.
It should act like this:
select * from dept;
you solution was:
select * from dept
where INSTR(':'||:P1_EMPNO||':', ':'||empno ||':') > 0
Can I modify this to return all the rows ?

Parse multiple files in one flat file?

Hi all,
I'm currently working with flat file with this kind of structure:
"849000","1","2","3","4"             <- begin of file
"849HD","","1939","12"              <- header level
"849D1","39193","313","1"         <- detail level
"849D2","","description","48,13" <- detail description level
"849RT","133,1","N4","203"        <- totals level
The problem is that the file I have to pick up (the map is File => EDI)
can contain many structures (every structure is an EDI to generate)
example:
"849000","1","2","3","4"             <- first file
"849HD","","1939","12"
"849D1","39193","313","1"
"849D2","","description","48,13"
"849RT","133,1","N4","203"         <- end of first file
"849000","2","","","6"              <- second file
"849HD","","92","23"
"849D1","99","912","1"
"849D2","","second description","3,11"
"849RT","61","2","UP","102"         <- end of second file
- How can i parse this file in order to get a nested structure?
MT_file
MT_file/row (0.unbounded) <- that would contain 2 files
MT_file/row/849000
MT_file/row/849HD
MT_file/row/849D1
MT_file/row/849D2
MT_file/row/849RT
I ask because I believe content conversion (KeyFieldValue) is not effective here: it will take the key and generate the segments all together without respecting the order.
Any ideas?
Thanks!

I'm not so sure about that. I mean, I think the KeyField won't be able to understand the hierarchy
example:
"849000","1","2","3","4" <- first file
"849HD","","1939","12"
"849D1","39193","313","1"
"849D2","","description","48,13"
"849RT","133,1","N4","203" <- end of first file
"849000","2","","","6" <- second file
"849HD","","92","23"
"849D1","99","912","1"
"849D2","","second description","3,11"
"849RT","61","2","UP","102" <- end of second file
with keys:
849000
849HD
849D1
849D2
849RT
I will probably get everything in one group, and in doing that I'll lose the reference to the first and second file
resulting:
"849000","1","2","3","4" <- first file
"849000","2","","","6" <- second file
"849HD","","1939","12"
"849HD","","92","23"
"849D1","39193","313","1"
"849D1","99","912","1"
"849D2","","description","48,13"
"849D2","","second description","3,11"
"849RT","133,1","N4","203" <- end of first file
"849RT","61","2","UP","102" <- end of second file
Or am i missing something?
This is file => EDI, so the channel would be SENDER
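Independent of how the adapter's content conversion is configured, the grouping being asked for can be sketched in plain Java: start a new group whenever a line begins with the "849000" key and keep every following line in that group, so the original order is preserved. This is only an illustration of the grouping logic (for instance, for a preprocessing step), not adapter configuration; the class RecordGrouper and its group method are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RecordGrouper {
    /** Splits the lines into groups, each starting at a line whose first field is startKey. */
    static List<List<String>> group(List<String> lines, String startKey) {
        String prefix = "\"" + startKey + "\"";
        List<List<String>> groups = new ArrayList<>();
        List<String> current = null;
        for (String line : lines) {
            if (line.startsWith(prefix)) {
                current = new ArrayList<>();   // a new logical file begins here
                groups.add(current);
            }
            if (current != null) {
                current.add(line);             // lines before the first start key are skipped
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "\"849000\",\"1\",\"2\",\"3\",\"4\"",
                "\"849HD\",\"\",\"1939\",\"12\"",
                "\"849000\",\"2\",\"\",\"\",\"6\"",
                "\"849HD\",\"\",\"92\",\"23\"");
        System.out.println(group(lines, "849000").size()); // prints 2
    }
}
```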

VBScript for parsing multiple text files

Hi,
I have around 175 text files that contain inventory information that I am trying to parse into an Excel file. We are upgrading our Office platform from 2003 to 2010 and my boss wants to know which machines will have trouble supporting it. I found a script that will parse a single text file based upon ":" as the delimiter and I'm having trouble figuring out how to change it to open an entire folder of text files and write all of the data to a single Excel spreadsheet. Here is an example of the text file I'll be parsing. I'm interested in the "Memory and Processor Information" and "Disk Drive Information" sections mainly.
ABEHRENS-XP Computer Inventory
OS Information
OS Details
Caption: Microsoft Windows XP Professional
Description:
InstallDate: 20070404123855.000000-240
Name: Microsoft Windows XP Professional|C:\WINDOWS|\Device\Harddisk0\Partition1
Organization: Your Mom
OSProductSuite:
RegisteredUser: Bob
SerialNumber: 55274-640-3763826-23029
ServicePackMajorVersion: 3
ServicePackMinorVersion: 0
Version: 5.1.2600
WindowsDirectory: C:\WINDOWS
Memory and Processor Information
504MB Total memory HOW CAN I PULL THIS WITHOUT ":" ALSO
Computer Model: HP d330 uT(DG291A)
Processor:               Intel(R) Pentium(R) 4 CPU 2.66GHz
Disk Drive Information
27712MB Free Disk Space ANY WAY TO PULL THIS WITHOUT ":"
38162MB Total Disk Space
Installed Software
Here is the start of the script I have so far. . .
Const ForReading = 1
Set objDict = CreateObject("Scripting.Dictionary")
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFile = objFSO.OpenTextFile("C:\Test\test.txt" ,ForReading)
WANT THIS TO BE C:\Test
Do Until objTextFile.AtEndOfStream
strLine = objTextFile.ReadLine
If Instr(strLine,":") Then
arrSplit = Split(strLine,":") IS ":" THE BEST DELIMITER TO USE?
strField = arrSplit(0)
strValue = arrSplit(1)
If Not objDict.Exists(strField) Then
objDict.Add strField,strValue
Else
objDict.Item(strField) = objDict.Item(strField) & "||" & strValue
End If
End If
Loop
objTextFile.Close
Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = True
objExcel.Workbooks.Add
intColumn = 1
For Each strItem In objDict.Keys
objExcel.Cells(1,intColumn) = strItem
intColumn = intColumn + 1
Next
intColumn = 1
For Each strItem In objDict.Items
arrValues = Split(strItem,"||")
intRow = 1
For Each strValue In arrValues
intRow = intRow + 1
objExcel.Cells(intRow,intColumn) = strValue
Next
intColumn = intColumn + 1
Next
Thank you for any help.

You are The Bomb.com! I had to play around with it to pull some additional data (model and processor) and then write a quick macro to remove the unwanted text and finally I wanted the data to write in columns instead of rows so this is what I ended up with:
Option Explicit
Dim objFSO, objFolder, strFolder, objFile
Dim objReadFile, strLine, objExcel, objSheet
Dim intCol, strExcelPath
Const ForReading = 1
strFolder = "c:\Test"
strExcelPath = "c:\Test\Inventory.xlsx"
Set objExcel = CreateObject("Excel.Application")
objExcel.Workbooks.Add
Set objSheet = objExcel.ActiveWorkbook.Worksheets(1)
intCol = 0
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFolder = objFSO.GetFolder(strFolder)
For Each objFile In objFolder.Files
intCol = intCol + 1
Set objReadFile = objFSO.OpenTextFile(objFile.Path, ForReading)
Do Until objReadFile.AtEndOfStream
    strLine = objReadFile.ReadLine
    If (InStr(strLine, "Computer Inventory") > 0) Then
      objSheet.Cells(intCol, 1).Value = Left(strLine, InStr(strLine, "Computer Inventory") - 2)
    End If
    If (InStr(strLine, "Total memory") > 0) Then
      objSheet.Cells(intCol, 2).Value = Left(strLine, InStr(strLine, "Total memory") - 2)
    End If
    If (InStr(strLine, "Computer Model:") > 0) Then
      objSheet.Cells(intCol, 3).Value = (strLine)
    End If
    If (InStr(strLine, "Processor:") > 0) Then
      objSheet.Cells(intCol, 4).Value = (strLine)
    End If
    If (InStr(strLine, "Total Disk Space") > 0) Then
      objSheet.Cells(intCol, 5).Value = Left(strLine, InStr(strLine, "Total Disk Space") - 2)
    End If
    If (InStr(strLine, "Free Disk Space") > 0) Then
      objSheet.Cells(intCol, 6).Value = Left(strLine, InStr(strLine, "Free Disk Space") - 2)
    End If
Loop
Next
objExcel.ActiveWorkbook.SaveAs strExcelPath
objExcel.ActiveWorkbook.Close
objExcel.Quit
Thanks again!
Hi ,
I have very basic knowledge of VBScript, but this code could be the perfect solution I am looking for. Could you guide me on exactly how to run and test it? I would be really thankful for your kind and generous support on this.
Thanks ,
Veer

Parsing multiple digit numbers from string

I have a string, specifically "Cups = 30", that I need to parse "30" out of. How can I do this? I've gone through tokenizing, and it's driving me crazy!!

This program will help you:
import java.util.*;

class s21 {
    public static void main(String args[]) {
        try {
            String s1 = "Cups=30";
            StringTokenizer st = new StringTokenizer(s1, "=");
            while (st.hasMoreTokens()) {
                st.nextToken(); // skip the token before '='
                System.out.println(st.nextToken()); // This statement will give 30 as output
            }
        } catch (Exception e) {
            System.out.println(e);
        }
    }
}
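The string in the question has spaces around the equals sign ("Cups = 30"), so a token taken after "=" still carries a leading space. A small sketch that trims it and converts the result to an int; the class CupsParser and the method valueAfterEquals are made up for illustration.

```java
class CupsParser {
    /** Returns the integer that follows the first '=' in s, ignoring surrounding spaces. */
    static int valueAfterEquals(String s) {
        int eq = s.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("no '=' in: " + s);
        }
        // trim() removes the spaces around the number before parsing.
        return Integer.parseInt(s.substring(eq + 1).trim());
    }

    public static void main(String[] args) {
        System.out.println(valueAfterEquals("Cups = 30")); // prints 30
    }
}
```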

Multiple Delimiters from a Text File

I am having an issue trying to figure out how to separate this text file into 4 columns so I can use the data:
S, 0, { }, {a, d}
a, 7, {S},
b, 5, {a}, {c, h}
c, 2, {b, d}, {f}
d, 10, {S}, {c, e}
e, 1, {d}, {f}
f, 3, {c, e, h}, {g}
g, 4, {f}, {F}
h, 4, {b}, {f}
F, 0, {g}, { }
That is the text file, and I am trying to figure out how I could split it up into four columns to manipulate the data.

The best starting point: What have you come up with so far, and what's wrong with it?
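As one possible starting point, the commas inside the braces can be told apart from the column separators with a regular expression: split only on commas that are not followed by a closing brace before the next opening brace. This sketch assumes the braces are never nested; the class ColumnSplitter and the method columns are made up for illustration.

```java
import java.util.Arrays;

class ColumnSplitter {
    /** Splits a line on commas that lie outside {...}; assumes braces are not nested. */
    static String[] columns(String line) {
        // The negative lookahead rejects a comma if a '}' comes before any '{' or the line end.
        // The limit -1 keeps a trailing empty column, as in "a, 7, {S},".
        return line.split(",\\s*(?![^{}]*\\})", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(columns("b, 5, {a}, {c, h}")));
    }
}
```

Each brace column can then be stripped of its braces and split on commas again to get the individual members.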

Parse multiple items in a field.

I have a field that contains many part numbers. There can be 2 or more part numbers in the field, separated by a space. How can this be read dynamically so you can compare the contents in the WHERE clause to what you are searching for? Example part numbers in the field: 1X 2X 3X 4Y 5Y 7Y
I would like to read the field and make rows out of the data.
1X
2X
3X
4Y
5Y
7Y

997344 wrote:
I have a field that contains many part numbers. There can be 2 or more part numbers in the field separated by a space. How can this be dynamically read so you can compare the contents in the where clause to what you are search for? Example Part Numbers in the field: 1X 2X 3X 4Y 5Y 7Y
I would like to read the field and make rows out of the data.
1X
2X
3X
4Y
5Y
7Y
If by 'field' you mean 'column', then you are already starting with a seriously flawed design. Please do a little reading on 'data normalization' and 'third normal form'. Hopefully you aren't so committed to this design as to prevent correcting it before you get any deeper.

Multiple Delimiters in a single column of a Delimited File

The source file does not have single or double quote encapsulation. Does the escape character option work? If so, please explain the procedure in detail.

Consider a scenario: in a pipe-delimited file, a column value contains a pipe. How do I make the Integration Service understand that the pipe inside the column value is not a delimiter?

Parsing across multiple namespaces... best practice?

Howdy all, here's my situation:
I am attempting to create a simple interface to a particular type of webdav server which has some unusual XML data coming back, and need some advice on the best way to parse this XML.
I have a generic "query" object which consists of a set of "Fields", a "Scope" and a set of "Constraints". This query is ultimately formed as an XML request sent to the WebDAV server. The response is also a generic "ResultSet" response, which consists of "Rows". Each Row consists of "Fields" (the same type as in the query). A Field is basically a key/value pair (all Strings), and a namespace which describes the location of the field within the WebDAV server. (obviously modelled on a JDBC-style interaction)
I need to build a generic parser which can create a ResultSet from a raw XML document.
For Example:
Given a "query" with the following fields:
FIELD 1
namespace: urn:schemas:httpmail
name:fromemail
FIELD 2
namespace: DAV
name: id
The query will basically say (pseudo SQL)
"select "urn:schemas:httpmail:fromemail", "DAV:id" from <scope> where <constraints>"
This will return an XML document something like this:
<a:multistatus
      xmlns:a="DAV:"
      xmlns:b="urn:uuid:c2f41010-65b3-11d1-a29f-00aa00c14882/"
      xmlns:c="xml:"
      xmlns:d="urn:schemas:httpmail:"
      xmlns:e="urn:schemas:mailheader:">
   <a:response>
      <a:href>
            http://blah.blah.com/someurl
      </a:href>
      <a:propstat>
         <a:status>
               HTTP/1.1 200 OK
         </a:status>
         <a:prop>
            <d:fromemail>
                 [email protected]
            </d:fromemail>
            <a:id>
                  AQsAAAAARgAgCwAAAABGeAIAAAAA
            </a:id>
         </a:prop>
      </a:propstat>
   </a:response>
   <a:response>
   </a:response>
</a:multistatus>
So.. I then need to create a ResultSet object, which looks like this:
ResultSet<Object>
|---Rows<Collection>
    |---Row<Object>
        |---Field<Object>
            |---name: fromemail
            |---namespace: urn:schemas:httpmail
            |---value: [email protected]
        |---Field<Object>
            |---name: id
            |---namespace: DAV
            |---value: AQsAAAAARgAgCwAAAABGeAIAAAAA
    |---Row<Object>
            ....
As you can see, I need BOTH the value and the original data relating to the namespace and field name in the Java object returned.
Problems I am having:
1. Parsing multiple namespaces. The path to multistatus/response/prop/fromemail (for example) spans two namespaces, hence can't use Commons Digester
2. Including XML element names as values in the returned object. I need/want to be able to store the XML element name (eg "fromemail") as a value in the java object returned.
I'm looking for thoughts as to the best way to deal with this... XPath?, DOM/SAX? etc
Any help you can provide.
Cheers
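One way to approach both problems with the JDK alone is a namespace-aware DOM parse: with setNamespaceAware(true), every element reports its own namespace URI and local name, so a path that spans two namespaces is not an issue and the element name is available as data. The sketch below returns one list per response element, each entry holding {namespace, name, value}; the class DavFields and the method rows are made up for illustration. Note that the namespace comes back exactly as declared in the document (for example "urn:schemas:httpmail:" with the trailing colon).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class DavFields {
    /** One row per DAV: response; each field is {namespaceURI, localName, trimmed text}. */
    static List<List<String[]>> rows(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true); // needed for getNamespaceURI() and getLocalName()
            Document doc = f.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<List<String[]>> rows = new ArrayList<>();
            NodeList responses = doc.getElementsByTagNameNS("DAV:", "response");
            for (int i = 0; i < responses.getLength(); i++) {
                List<String[]> row = new ArrayList<>();
                NodeList props = ((Element) responses.item(i))
                        .getElementsByTagNameNS("DAV:", "prop");
                for (int p = 0; p < props.getLength(); p++) {
                    NodeList kids = props.item(p).getChildNodes();
                    for (int k = 0; k < kids.getLength(); k++) {
                        Node n = kids.item(k);
                        if (n.getNodeType() == Node.ELEMENT_NODE) {
                            row.add(new String[] {
                                n.getNamespaceURI(), n.getLocalName(), n.getTextContent().trim()
                            });
                        }
                    }
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Shortened document in the shape shown above; the id value is a placeholder.
        String xml = "<a:multistatus xmlns:a=\"DAV:\" xmlns:d=\"urn:schemas:httpmail:\">"
                + "<a:response><a:propstat><a:prop><a:id> ABC </a:id></a:prop>"
                + "</a:propstat></a:response></a:multistatus>";
        for (String[] field : rows(xml).get(0)) {
            System.out.println(String.join(" | ", field));
        }
    }
}
```

Mapping these triples onto the Field, Row and ResultSet objects is then a plain loop; XPath with a NamespaceContext would be an alternative if only a few known fields are needed.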
