Text processing

Hi All,
Please let me know if there is a way, using regexp_replace, to remove the spaces that occur between words/characters in a given string/text while keeping the spaces between numerics.
Eg.
'PERSONNEL_AREA_CD1 = 03-06-2013 09:07:01 to 30-10-2013 09:07:01 ' should be processed as below
'PERSONNEL_AREA_CD1=03-06-2013 09:07:01to30-10-2013 09:07:01'
Thanks !
DS

Hi,
We can even use the TRANSLATE function.
with data as (
select 'PERSONNEL_AREA_CD1 =     03-06-2013 09:07:01      to      30-10-2013 09:07:01   ' str from dual)
select str,translate(str,'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789:_-=~!@#$%^&*()+|}{"?><`\]['''';/., ','ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789:_-=') output from data ;
Output of above query is:
STR                                                                                                                                           OUTPUT
PERSONNEL_AREA_CD1 =     03-06-2013 09:07:01      to      30-10-2013 09:07:01            PERSONNEL_AREA_CD1=03-06-201309:07:01to30-10-201309:07:01
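
Note that the TRANSLATE output above also drops the space between each date and time, whereas the question asks to keep spaces that sit between numerics. As a rough illustration of that rule only (in Python rather than Oracle SQL, and with a function name made up for this sketch), one could remove every run of spaces unless it has a digit on both sides:

```python
import re

def strip_spaces_outside_numbers(text: str) -> str:
    """Remove runs of spaces unless the run sits between two digits."""
    def replace_run(match: re.Match) -> str:
        before = text[match.start() - 1] if match.start() > 0 else ""
        after = text[match.end()] if match.end() < len(text) else ""
        # Keep a single space only when there is a digit on both sides.
        return " " if before.isdigit() and after.isdigit() else ""
    return re.sub(r" +", replace_run, text)

# Sample string taken from the question.
sample = "PERSONNEL_AREA_CD1 = 03-06-2013 09:07:01 to 30-10-2013 09:07:01 "
result = strip_spaces_outside_numbers(sample)
print(result)
```

This is only a sketch of the matching rule. Whether the same thing can be expressed in a single regexp_replace call depends on the regex features the database supports.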

Similar Messages

How to use DS's text processing to parse the text data stored in a DB table?

I want to use DS's text processing to parse text data stored in a database table, but every demo I have seen parses text from a flat file. I tried editing the blueprint sample to change the input from a flat file to a database table, but I cannot map the column to the base entityExtraction. Can anyone give me a guide or an example? Thanks so much!
regards,
Bin

Still looking for some answers. Can anyone help me?

Can I pause the RECOGNIZE TEXT process?

I am running Recognize Text on a large file, which takes days. I also need to use Acrobat to find things while it is running. Is there a way to pause it so I can do a quick search and then turn it on again without having to go all the way back to the beginning?

Are you using Windows Vista or Windows 7? If so just open the start menu and type "Acrobat /n" without the quotes. It will start a new instance of Acrobat that will not be affected by the one doing OCR.
Regarding performance, seeing as how Acrobat only uses a single processor core and most machines these days are at least dual core, you can easily run multiple versions of Acrobat. I've had 4 instances all doing OCR at the same time on a quad core machine. Even if you just have a dual core, each instance of Acrobat can only use a maximum of 50% of your processing power.

Need help with text() processing in XSL

Hello,
I have an XML document that contains text like this:
before<a>inside</a>after
and an XSL stylesheet that transforms it to HTML (an excerpt of the XSL):
<xsl:template match="a">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="text()">
<xsl:value-of disable-output-escaping="yes" select="."/>
</xsl:template>
The result is: inside before after
but I need: before inside after
It seems it happens 'cause of this: http://www.w3.org/TR/xslt#conflict
but I cannot find a way to solve this problem :(
I had tried to use priority in xsl:template, but it didn't help :(
Thanks a lot.
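
For reference, xsl:apply-templates without a select or sort visits child nodes in document order, so the text pieces would normally be expected to come out as before, inside, after. The following is a minimal Python sketch (not XSLT) of that document-order walk. The wrapper element <p> is an assumption added only to make the fragment well-formed:

```python
import xml.etree.ElementTree as ET

# The fragment from the question, wrapped in a hypothetical <p> element.
fragment = ET.fromstring("<p>before<a>inside</a>after</p>")

# itertext() yields the text nodes in document order:
# the parent's leading text, the child's text, then the child's tail.
pieces = list(fragment.itertext())
print(pieces)
```

If a processor emits the pieces in a different order, comparing against a walk like this can help narrow down whether the input or the stylesheet is responsible.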

DrClap
Here are the XML and XSL.
They are not the real files, but they should illustrate the idea and the problem. I hope I have not missed anything.
P.S. I cannot control the XML, which is why I cannot use <xsl:text> in the XML.
Thank you!
xml:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<title>Page title</title>
<page>
Location: <red>http://host</red>
</page>
</root>
xsl:
<?xml version='1.0' encoding='ISO-8859-1'?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:fo="http://www.w3.org/1999/XSL/Format"
xmlns:fox="http://xml.apache.org/fop/extensions"
exclude-result-prefixes="fo">
<xsl:template match="root">
<html>
<head>
<title>
<xsl:apply-templates select="title"/>
</title>
</head>
<body>
<xsl:apply-templates select="page"/>
</body>
</html>
</xsl:template>
<xsl:template match="page">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="title">
[Test]: <xsl:apply-templates/>
</xsl:template>
<xsl:template match="red">
<xsl:element name="span"><xsl:attribute name="style">color:red</xsl:attribute><xsl:apply-templates/></xsl:element>
</xsl:template>
<xsl:template match="text()">
<xsl:value-of disable-output-escaping="yes" select="."/>
</xsl:template>
</xsl:stylesheet>

Problem in text processing

Hi,
I wrote a program that does a lot of processing. When I compile it, no errors appear, but I get a warning.
The warning is:
ResultServlet.java uses unchecked or unsafe operations
What exactly is the source of this warning?
Thanks.

http://onesearch.sun.com/search/onesearch/index.jsp?qt=unchecked+or+unsafe+operations&subCat=siteforumid%3Ajava31&site=dev&dftab=siteforumid%3Ajava31&chooseCat=javaall&col=developer-forums

Conditional text processing

I use RoboHelp 8 and my teammate uses RoboHelp X5. I opened one of her RoboHelp X5 projects in RoboHelp 8, made changes to some topic files, and passed those files back to her.
My teammate imported the files into RoboHelp X5 and found red boxes with the tags <?rh-cbt_start> and <?rh-cbt_end>. These red boxes appeared in areas where we had used conditional tags.
Is this because of version incompatibility? What is the best workaround to keep such tags from appearing in the files?

The problem is version incompatibility; the code is very different in RH8, as you have seen. Your teammate will have to upgrade if you are going to work on the same project.
See www.grainge.org for RoboHelp and Authoring tips
Follow me @petergrainge

Text processing question

I have a vCard file that I need to modify by removing all non-digit characters from telephone numbers.
Here's the exact process that needs to occur:
If a line begins with "TEL", then remove all non-digit characters from the portion of the line following ":". (Every line contains only one colon.)
I have no experience with sed or awk, but I was able to hack a solution together using awk:
awk -F: '/^TEL/ { gsub(/[^[:digit:]]/, "", $2); print $1 ":" $2 } !/^TEL/ { print }'
Is there a simpler way to accomplish my goal? The code above feels somewhat redundant.
I initially attempted to use sed, but I couldn't figure out how to remove non-digits from only part of a line.
-David
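
For comparison, here is a rough Python sketch of the same rule: touch only lines that begin with "TEL", and strip non-digits only from the part after the colon. The function name and the sample lines are made up for illustration:

```python
import re

def clean_tel_line(line: str) -> str:
    """Strip non-digits after the colon, but only on lines starting with TEL."""
    if not line.startswith("TEL"):
        return line
    # partition() splits at the first colon; the post says there is only one.
    head, sep, tail = line.partition(":")
    return head + sep + re.sub(r"[^0-9]", "", tail)

# Hypothetical vCard lines, without trailing newlines.
lines = ["FN:John Q. Public", "TEL;TYPE=CELL:+1 (555) 010-0199"]
cleaned = [clean_tel_line(line) for line in lines]
print(cleaned)
```

Splitting at the colon first is what keeps the substitution from touching the property name, which is the part that is awkward to express in a single sed substitution.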

Haha, I blame it on the fact that it was late when I began coding it, but it's a simple move of a ^, as you've already done.
The explanation of tx is that I could only get the s/// to substitute the first run of digits, e.g. TEL11:22x33 -> TEL11:x33 instead of -> TEL11:x
So at the beginning I define a label x with ":x", and at the end I jump back to label x with "tx" if an s/// successfully substituted something.
I.e.
"TEL11:22x33" --[s///]-> "TEL11:x33"
s/// was successful, jump to x
"TEL11:x33" --[s///]-> "TEL11:x"
s/// was successful, jump to x
"TEL11:x" --[s///]-> "TEL11:x"
s/// was unsuccessful, parse the next row of input
Last edited by tlvb (2009-09-13 13:13:58)
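
The label-and-branch idea described above (repeat one substitution until it stops matching) can be mimicked in Python. The actual sed command is not shown in this thread, so the pattern below is an assumption, written for the corrected behaviour that removes non-digits rather than digits:

```python
import re

# Assumed pattern: keep "TEL...:" plus any digits already cleaned,
# and drop the first run of non-digits that follows them.
STEP = re.compile(r"^(TEL[^:]*:[0-9]*)[^0-9]+")

def strip_by_looping(line: str) -> str:
    """Apply one substitution per pass and loop until nothing changes (like :x ... tx)."""
    while True:
        line, count = STEP.subn(r"\1", line, count=1)
        if count == 0:  # substitution failed: move on to the next line
            return line

looped = strip_by_looping("TEL11:22x33")
print(looped)
```

Each pass removes one run of non-digits, so a line with several separators takes several passes, just as the trace above walks through several jumps to x.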

Writing XML without doing text processing

Hello,
In all the examples I have, when XML is written to a file the formatting is done by hand; that is, you have something like:
System.out.print("<Tag attr= " + val + "/>");
In .NET, on the other hand, you can just save a DOM XML Document with one line of code, something like:
doc.write("myxml.xml");
How do I save a DOM Document in Java without having to put in tags, attributes, quotes and all the other syntax by hand?
Thank you,
--Sergey

Thank you for answering, DrClap,
I'm sorry if I wasn't clear enough. My question was: I have a DOM and I want to save it to a file (or, more generally, write it to a stream) with as little code as possible. That includes not doing the formatting (putting in tags, comments, etc.) by hand.
The link you've given has a section on saving a DOM into a file but it seems to be much more complex than it needs to be. Their way of doing it involves transforms. Like I said, in .NET this would be a one-liner. Are you sure there is no faster way to do it in Java? I'd expect so because saving a DOM into a file looks like one of the most common tasks one might want to perform with XML...
--Sergey

Text determination process in the Delivery Header and item

Hi Experts,
Please help me with the step-by-step process of configuring text determination at the delivery header and item level.
I have one scenario: text determination is already configured correctly on the Quality server. What I need to do is copy the Quality configuration to the Development client, change the access sequence in the Development client, delete the sequence number, and then move the change to Quality and finally to Production.
Please help me with the basic configuration of texts in the delivery and the process of copying the already-determined text configuration from Quality.
Thanks

Go to transaction VOTXN
for the delivery header or item level:
1. Create the text procedure for the text object VBBK.
2. Define the text IDs for the text procedure (for master data texts, the IDs and the access sequence are not required).
3. Define the access sequence and the requirements.
4. Then define the text objects/tables for the accesses.
5. Assign the procedure from step 1 to your delivery document type or item.
Thanks!
A S

How can I take text from a webpage that is in multiple rows and move it into a single row in Excel?

I need help figuring out how to take data from web pages and enter it into a single row in Excel, or Numbers if that is the easier way to go. I was also told Access might be good to use. Basically, I am going to a chamber of commerce page and want to extract the member listing and enter each member into a database as a single line. The data spans a different number of lines per member, as you will see below (info edited to take out personal info). I want to take the business name, business owner, address, city, state, zip, and phone and put them on one line of a spreadsheet, and I want to do this many times over. I think there is a way to do it with AppleScript and Automator, but I have not been successful after 2 weeks of trying and searching. I have over 800 listings and I surely don't want to go through them one at a time. Any suggestions?
Data from website:
Westrock Coffee
Mr.
Collins Industrial Place
North Little Rock, AR 72113
Phone:
Send Email
Member Since: 2011
Sweet Creations by DJ
Ms. J
allace Bridge Road
Perryville, AR 72126
Phone:
Fax:
Send Email
Member Since: 2013
See Also Woman Owned and/or CEO
Premium Refreshment Service
Mr. E
est Bethany Road
North , AR 72117
I want it to look like this
Company name, owner name, address, city, state, zip, phone
How can I get the extra data out of the way and remove the formatting so that it will go into Excel? Thanks for any help you can provide. I am not too savvy with code, but I have a friend who is an IT guy and can help. Thanks again.

So, basically, create 800 individual entries, each one containing everything from business name through the phone (not fax) number, add some commas and spaces to entries, and then put each entry on a separate line?
1. Go to a website page such as this one-- http://www.littlerockchamber.com/CWT/External/WCPages/WCDirectory/Directory.aspx?ACTION=newmembers --which seems format-wise very close to what you're trying to scrape.
2. Cmd-A to select all. Cmd-C to copy it to clipboard.
3. Open freeware TextWrangler. Cmd-V to paste info from clipboard into a blank TW document.
4. Remove lines from top and bottom so that only membership list remains.
5. Process lines to remove everything from "Fax" line through "See Also" line. Only business name through phone number will remain in the file.
--A. TW > Text > Process Lines containing . . .
-----(check "Delete matched lines"; uncheck all others)
-----Enter "Send Email" in the search box.
-----Click Process.
--B. Repeat 5A for other lines to be removed
------Member Since
------See Also
------Fax
6. Insert markers to separate entries:
TW: Search > Find . . .
------(check "Wrap around" and "Grep")
------in Find box: \r\r\r\r
------in replace box: \r***
------Click Replace All
7. Remove remaining blank lines:
TW: Search > Find . . .
------(check "Wrap around" and "Grep")
------in Find box: \r\r
------in replace box: \r
------Click Replace All
8. Add comma and space at end of each line:
TW: Search > Find . . .
------(check "Wrap around" and "Grep")
------in Find box: $
------in replace box: , (comma space)
------Click Replace All
9. Remove all returns:
TW: Search > Find . . .
------(check "Wrap around" and "Grep")
------in Find box: \r
------in replace box: (leave blank)
------Click Replace All
10. Insert returns in place of markers:
TW: Search > Find . . .
------(check "Wrap around" and "Grep")
------in Find box: \*\*\*, (backslash asterisk backslash asterisk backslash asterisk comma space)
------in replace box: \r
------Click Replace All
11. Remove trailing comma and blank on each line:
TW: Search > Find . . .
------(check "Wrap around" and "Grep")
------in Find box: , $ (comma space dollar sign)
------in replace box: (leave blank)
------Click Replace All
Import this text file into Excel or Numbers.
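
If a script is preferred over manual find-and-replace, the steps above could be sketched in Python roughly as follows. This assumes, as step 6 does, that entries in the pasted text are separated by four or more consecutive line breaks. The function name and the sample data are placeholders made up for illustration:

```python
import re

# Line prefixes deleted in step 5 of the procedure above.
DROP_PREFIXES = ("Send Email", "Member Since", "See Also", "Fax")

def listings_to_rows(raw: str) -> list:
    """Collapse each multi-line member entry into one comma-separated row."""
    # Normalize Windows (\r\n) and old Mac (\r) line endings to \n.
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Step 5: delete the unwanted lines; blank lines stay as separators.
    kept = [ln for ln in text.split("\n")
            if not ln.strip().startswith(DROP_PREFIXES)]
    rows = []
    # Step 6: four or more line breaks in a row mark the gap between entries.
    for block in re.split(r"\n{4,}", "\n".join(kept)):
        # Steps 7-11: drop blank lines and join the rest with ", ".
        fields = [ln.strip() for ln in block.split("\n") if ln.strip()]
        if fields:
            rows.append(", ".join(fields))
    return rows

# Placeholder data shaped like the listing in the question.
sample = (
    "Example Coffee\nMr. A\n1 Example Place\nNorth Little Rock, AR 72113\n"
    "Phone: 555-0100\nSend Email\nMember Since: 2011\n\n\n\n"
    "Sample Bakery\nMs. B\n2 Example Road\nPerryville, AR 72126\n"
    "Phone: 555-0101\nFax: 555-0102\nSend Email\nMember Since: 2013\n"
    "See Also Woman Owned and/or CEO\n"
)
rows = listings_to_rows(sample)
for row in rows:
    print(row)
```

One deliberate difference from step 5: the sketch matches the unwanted text only at the start of a line, so a business name that merely contains "Fax" is not deleted. The city line already contains a comma, so the address still splits into two columns on import, just as it does with the manual procedure.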

Is CPA Cache refresh linked with the FTP or file polling process in XI?

Hi,
I have a file to file scenario using Transport protocol as FTP in XI 3.0 SP 15.
When we try to send files using the FTP protocol, we use the following settings:
FTP connection parameters
Server = <CORRECT IP>
Port = 21
User name = <CORRECT NAME>
Password = <PASSWORD>
Data Connection = Active
Connection Seq = None
Connection Mode = Permanently
Transfer Mode = Text
Processing Parameters
Quality of Service = Exactly Once
Polling Interval = 1 sec
Processing Mode = Delete
File Type = Text
File encoding = utf-8
The problem we are facing is that sometimes the FTP pickup does not work even though the file is present in the pickup location. If a few files have stacked up and we manually run a CPA Cache refresh in full mode, it fetches all the files from the location. However, we have a time constraint: the process must complete within 60 seconds, and if a file is not picked up within 60 seconds it is treated as invalid.
So I want to know why a manual CPA Cache refresh in full mode generally solves the problem.
Also, if more files stack up, the cache refresh fails to solve the problem, and repeated cache refreshes result in XI NOT polling any other files, including the flow discussed above.
So, is the cache refresh linked in any way with the FTP or file polling process in XI?
Please help me understand the whole problem correctly and what solution could be put in place.
Thanks,
Satya
Edited by: Satya Jethy on Mar 14, 2008 12:28 PM

Hi Suraj,
If you look at my query, I mentioned that the polling interval is 1 second.
This is a real-time scenario, so if we are not able to pick up the file within 60 seconds, it is treated as an invalid file.
Moreover, this problem happens only some of the time.
I have also checked component monitoring. It says everything is OK, because the files we do receive arrive without any error and the file transfer succeeds. The only problem is that it is not collecting the files from the given location.
I hope this makes the problem clear. If not, please let me know and I will try to explain once again.
Thanks,
Satya

Local Process Chains

Hi Guys,
I am designing the process chains for data loading, and I plan to use local process chains inside a meta chain. The situation is this: I created a meta chain for master data loading that contains 2 local process chains, one for attributes and one for loading texts. I placed them in sequential order: first the Attributes chain, and on its successful completion the Text process chain should run. The system executes the Attributes chain successfully, but the Text process chain always stays in the YELLOW (in progress) state, even though that process chain contains only a start process. SM50 shows two background processes with the user ID ALEREMOTE, but the process chain never finishes. In the messages of the Text process chain I see the message "Communication buffer deleted from the previous run".
So I arranged the process chains in parallel instead, meaning that after the start process of the meta chain I attached both process chains so that they execute simultaneously. But in this case too, the Attributes local process chain is executed but the Text local process chain is not.
We tried debugging the process in SM50; there seems to be an infinite loop.
If anybody has any clue, please reply at your earliest convenience. Your help is highly appreciated.
Thanks.
Santosh Taware

First point: I think you can have a single process chain for both attributes and texts, one after the other.
Second, check in ST22 whether any dumps are occurring, and check in the monitor which step is still running.
Reward if it helps.

Processing routine ENTRY in program ZRVADIN0111 does not exist for smartfor

Hi ,
This is the log I am getting in VF02:
==========log==============
Message Text
Processing routine ENTRY in program ZRVADIN0111 does not exist
Technical Data
Message type__________ E (Error)
Message class_________ VN (Output control)
Message number________ 068
Message variable 1____ ENTRY
Message variable 2____ ZRVADIN0111
Message variable 3____
Message variable 4____
Message Attributes
Level of detail_______
Problem class_________ 0
Sort criterion________
Number________________ 1
======================================end log========================
My driver program is shown below, and the form is 'ZSUNDRY_INVOICES_VENU'.
================my driver program ===================
*& Report ZRVADIN0111
REPORT ZRVADIN0111.
TABLES : nast.
*TYPES : BEGIN OF ty_header,
       vbeln TYPE vbeln_vf,
       fkdat TYPE fkdat,
       XBLNR TYPE XBLNR_V1,
       STCEG TYPE STCEG,
       kunrg TYPE KUNRG,
       name1 TYPE AD_NAME1,
       city1 TYPE AD_CITY1,
       post_code1 TYPE AD_PSTCD1,
       street TYPE AD_STREET,
       total TYPE NETWR_FP,
       END OF ty_header.
DATA : sum TYPE i VALUE '0'.
*TYPES : BEGIN OF ty_item,
       matnr TYPE matnr,
       arktx TYPE arktx,
       fkimg TYPE fkimg,
       VRKME TYPE VRKME,
       netwr TYPE NETWR_FP,
       MWSBP TYPE MWSBP,
       unipr TYPE NETWR_FP,
       END OF ty_item.
DATA : gs_header TYPE zsd_inv_header,
       it_item TYPE STANDARD TABLE OF zsd_inv_items.
FIELD-SYMBOLS : <fs_item> TYPE zsd_inv_items.
DATA : gv_adrnr TYPE adrnr.
*data: s_vbeln type vbeln_vf.
*select-options : so_vbeln for s_vbeln.
*START-OF-SELECTION.
*form entry.
*--- Get header
SELECT SINGLE vbeln fkdat xblnr stceg kunrg bukrs
           FROM vbrk INTO gs_header
           WHERE vbeln = nast-objky.
SELECT matnr arktx fkimg vrkme netwr mwsbp
       INTO CORRESPONDING FIELDS OF TABLE it_item
       FROM vbrp WHERE vbeln = gs_header-vbeln.
    LOOP AT it_item ASSIGNING <fs_item>.
      <fs_item>-unipr = <fs_item>-netwr / <fs_item>-fkimg.
      sum = sum + <fs_item>-netwr.
      ENDLOOP .
      gs_header-total = sum.
      CLEAR : gv_adrnr.
SELECT SINGLE adrnr FROM kna1 INTO gv_adrnr WHERE kunnr = gs_header-kunrg.
   SELECT SINGLE name1 city1 post_code1 street FROM adrc
          INTO (gs_header-name1,gs_header-city1,gs_header-post_code1,gs_header-street)
          WHERE ADDRNUMBER = gv_adrnr.
*end-OF-SELECTION.
data: fm_name type rs38l_fnam.
****calling entry routine
*FORM entry USING return_code us_screen.
CLEAR retcode.
xscreen = us_screen.
PERFORM processing USING us_screen.
CASE retcode.
   WHEN 0.
     return_code = 0.
   WHEN 3.
     return_code = 3.
   WHEN OTHERS.
     return_code = 1.
ENDCASE.
*ENDFORM.                    "entry
calling smartfrom from ABAP
call function 'SSF_FUNCTION_MODULE_NAME'
exporting
    formname                 = 'ZSUNDRY_INVOICES_VENU'
VARIANT                  = ' '
DIRECT_CALL              = ' '
IMPORTING
    FM_NAME                  = FM_NAME
EXCEPTIONS
    NO_FORM                  = 1
    NO_FUNCTION_MODULE       = 2
    OTHERS                   = 3.
if sy-subrc <> 0.
   WRITE: / 'ERROR 1'.
MESSAGE ID SY-MSGID TYPE SY-MSGTY NUMBER SY-MSGNO
        WITH SY-MSGV1 SY-MSGV2 SY-MSGV3 SY-MSGV4.
endif.
CALL FUNCTION fm_name
EXPORTING
ARCHIVE_INDEX              =
ARCHIVE_INDEX_TAB          =
ARCHIVE_PARAMETERS         =
CONTROL_PARAMETERS         =
MAIL_APPL_OBJ              =
MAIL_RECIPIENT             =
MAIL_SENDER                =
OUTPUT_OPTIONS             =
USER_SETTINGS              = 'X'
    IS_HEADER                  = gs_header
IMPORTING
DOCUMENT_OUTPUT_INFO       =
JOB_OUTPUT_INFO            =
JOB_OUTPUT_OPTIONS         =
TABLES
    IT_ITEMS                   = it_item
EXCEPTIONS
FORMATTING_ERROR           = 1
INTERNAL_ERROR             = 2
SEND_ERROR                 = 3
USER_CANCELED              = 4
OTHERS                     = 5
IF SY-SUBRC <> 0.
MESSAGE ID SY-MSGID TYPE SY-MSGTY NUMBER SY-MSGNO
         WITH SY-MSGV1 SY-MSGV2 SY-MSGV3 SY-MSGV4.
ENDIF.
*call function FM_NAME
EXPORTING
ARCHIVE_INDEX              =
ARCHIVE_INDEX_TAB          =
ARCHIVE_PARAMETERS         =
CONTROL_PARAMETERS         =
MAIL_APPL_OBJ              =
MAIL_RECIPIENT             =
MAIL_SENDER                =
OUTPUT_OPTIONS             =
USER_SETTINGS              = 'X'
IMPORTING
DOCUMENT_OUTPUT_INFO       =
JOB_OUTPUT_INFO            =
JOB_OUTPUT_OPTIONS         =
TABLES
   GS_MKPF                    = INT_MKPF
EXCEPTIONS
   FORMATTING_ERROR           = 1
   INTERNAL_ERROR             = 2
   SEND_ERROR                 = 3
   USER_CANCELED              = 4
   OTHERS                     = 5.
*if sy-subrc <> 0.
MESSAGE ID SY-MSGID TYPE SY-MSGTY NUMBER SY-MSGNO
        WITH SY-MSGV1 SY-MSGV2 SY-MSGV3 SY-MSGV4.
*endif.
end of call function module from abap
*endform.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Can anybody help me with the proper code that I have to insert in the driver program to rectify the error?
thanks
Regards,
Venu.

Call transaction NACE (type V1 for Sales Order), look for the required output type, and check the program and form in the processing routines. (And add code markup when you copy source code in forums.)
Regards,
Raymond

Identifying and Replacing Tagged Text

I have a TextFrame full of text. Different portions of the text are marked with different Tags.
I want to identify the various runs of text by the Tag which marked them, and replace the content of that Tagged run of text according to its Tag using my business logic.
For example, for the tags
Tag1
Tag2
Tag3
And the following Tagged text (I am including the square brackets which InDesign adds around Tagged text runs)
[Vini][Vidi][Vici]
I would like to be able to identify that
"Vini" is marked by Tag1 (and should be replaced by "Ars")
"Vidi" is marked by Tag2 (and should be replaced by "Gratia")
"Vici" is marked by Tag3 (and should be replaced by "Artis")
And then perform the replacement, giving:
[Ars][Gratia][Artis]
How would I go about this?

Hi,
look at the XML elements ... that's the key to success...
What I would try to do is ...
1. Loop over every PageItem
2. Retrieve the associated XML element
3. Do a recursive search within all sub XML elements
4. Check the name of every XML element
5. If it's the correct XML element, change the contents property to the new value
for (var b = 0; b < myDoc.allPageItems.length; b++) {
    var myXMLelement = myDoc.allPageItems[b].associatedXMLElement;
    // Guard in case a page item has no associated XML element
    if (myXMLelement) {
        ProcessXMLelement(myXMLelement);
    }
}
function ProcessXMLelement(elm) {
    // Get the name of the tag
    var myTagName = elm.markupTag.name.toString();
    if (myTagName == "myTag") {
        elm.contents = "New Text";
    }
    // Process all sub elements
    for (var i = 0; i < elm.xmlElements.length; i++) {
        ProcessXMLelement(elm.xmlElements[i]);
    }
}
Hopefully this helps
John
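
To show the recursion with a tag-to-replacement lookup outside InDesign, here is a small Python sketch over a plain XML string. The <Root> wrapper and the function name are assumptions made for illustration. The tag names and replacement words come from the question:

```python
import xml.etree.ElementTree as ET

# Tag-to-replacement mapping taken from the example in the question.
REPLACEMENTS = {"Tag1": "Ars", "Tag2": "Gratia", "Tag3": "Artis"}

def replace_tagged(elem: ET.Element) -> None:
    """Recursively replace the text of every element whose tag has a mapping."""
    if elem.tag in REPLACEMENTS:
        elem.text = REPLACEMENTS[elem.tag]
    for child in elem:
        replace_tagged(child)

# Hypothetical stand-in for the tagged text frame content.
root = ET.fromstring(
    "<Root><Tag1>Vini</Tag1><Tag2>Vidi</Tag2><Tag3>Vici</Tag3></Root>"
)
replace_tagged(root)
result = ET.tostring(root, encoding="unicode")
print(result)
```

A dictionary lookup keeps the business logic in one place, so supporting another tag means adding one entry rather than another if branch.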

Batch text extraction with CleanContent SDK (OIT)

Hi all,
We're using the CleanContent SDK to extract the text from large numbers of documents so that we can do some text processing afterwards. I've successfully implemented (or should I say, copied from the sample code) a simple scenario that extracts a single file, and I'm now calling this function for every single document, which gets quite heavy when we need to process thousands of documents.
I was wondering whether there are any features or techniques to process large numbers of documents more efficiently, but I haven't found much documentation or many forum threads on the subject. Should/can we reuse request or other objects to avoid setup/loading time in any way?
Many thanks in advance for your suggestions, examples and experiences !
Best regards,
Benjamin

Hi Bex,
thanks for your reply.
I'm calling Outside In from within the database using the Oracle-embedded JVM, so it's not exactly the same as calling the app 1000 times individually, but probably not the same as calling it 1000 times from within a single Java-only app either. I've already parallelized it to the extent that the DB starts several asynchronous processes, each calling out to Outside In, and it does indeed take 100% CPU after a few of these have started.
But still, even if the application created SecureRequests one after the other rather than calling the app for a single SecureRequest each time, you'd only avoid some default Java setup costs, right? As far as I can see (and that's why I posted this question), there is no "intra-app" reuse of objects or resources, so I'd think the (probably considerable) setup cost of the SecureRequest object will still have to be paid for each document processed.
So what I was looking for is some advice on whether there are any features in the CleanContent modules for batch processing that I could take advantage of when fine-tuning the Java code called from the DB.
Best regards,
Benjamin
