Regular Expression in content filter

Hey,
I want to create a content filter with the "body-contains" condition in combination with a regular expression. To be specific:
I want to check whether a string (a disclaimer) has already been added to the email. If not, I have to add the footer.
In other words: REGULAR EXPRESSION = does not contain "string"
What would that regular expression look like?
<rule>
<rule_type>Only_Body_Contains_Rule</rule_type>
<rule_data>REGULAR EXPRESSION (does not contain...)</rule_data>
<rule_extra1>1</rule_extra1>
</rule>
Thx

You MAY be able to use a negative lookahead assertion, which has to be wrapped in parentheses, like:
(?!EXPRESSION)
This results in:
content_filter: if (only-body-contains("(?!disclaimer text)", 1) )
OR in message filters you can say:
if ( not body-contains("EXPRESSION",1) )
All that said, you should just use two content filters: one that checks for the disclaimer and delivers immediately (without stamping the footer), and a catch-all filter that stamps the footer on everything else. For example:
disclaimer_skip
disclaimer_skip: if (only-body-contains("disclaimer text", 1)) { deliver(); }
outbound-disclaimer-catchall
outbound-disclaimer-catchall: if (true) { add-footer("my_disclaimer"); }
cheers,
andrew
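
For anyone who wants to try the "does not contain" idea outside the appliance, here is a minimal sketch in Python. It uses Python's re module, not IronPort filter syntax, and this thread does not confirm whether the appliance accepts lookaheads. The disclaimer text and the function name are made up for illustration. The point is that the lookahead needs to be anchored at the start and allowed to scan the whole body, otherwise it only says "the text does not start right here".

```python
import re

# Hypothetical disclaimer text, used only for illustration.
DISCLAIMER = "This e-mail is confidential"

# ^(?!.*TEXT) succeeds only if TEXT occurs nowhere after the start of the body.
# re.DOTALL lets "." cross line breaks, so multi-line bodies are covered.
LACKS_DISCLAIMER = re.compile(r"^(?!.*" + re.escape(DISCLAIMER) + r")", re.DOTALL)

def needs_footer(body: str) -> bool:
    """Return True when the disclaimer is not yet present in the body."""
    return LACKS_DISCLAIMER.search(body) is not None

print(needs_footer("Hello,\nsee you tomorrow."))
print(needs_footer("Hello,\n\nThis e-mail is confidential and ..."))
```

The two-filter approach above avoids the lookahead entirely, which is probably the safer choice when you are unsure what the regex engine supports.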

Similar Messages

Using regular expressions in content dictionaries

I need to create a content dictionary containing regular expressions. I also need to use the "\" to escape some characters that would otherwise be regex meta-characters. When using a regex in a message filter, the "\" must be doubled because of parsing issues. This is clearly documented in the manual. What isn't documented is whether this must be done when the regex is within a content dictionary.
Here's an example:
if (mail-from == "@bad-domain\\.com$") { drop(); }
I want to change this filter to:
if (mail-from-dictionary-match("bad-domains")) { drop(); }
So what do I put in the content dictionary, "@bad-domain\.com$" or "@bad-domain\\.com$"?
Thanks,

You should use this:
"@bad-domain\.com$"
The above tells the system to escape the "." (which otherwise means any character) so that it matches a literal period.
If you used this,
@bad-domain\\.com$
what the system would match is "@bad-domain\.com", because the first backslash escapes the second one, which is then taken literally. So the double backslash is the wrong format.
The only reason you see it in the final results after you've committed your changes is that the system adds the backslash for you so that there's no error when it gets compiled.
You could also have left the backslash out completely, and it would probably still work:
"@bad-domain.com$"
If you used that as your pattern in the dictionary, it would match all of these:
@bad-domain.com
@bad-domainncom
@bad-domain1com
@bad-domain&com
Basically, the "." means any character. But to be precise, you should add exactly one backslash in front of each special character. Here is a list of special characters:
| ( ) [ { ^ $ * + ? .
For a detailed explanation about special characters and how to use them, please see the Advanced User Guide.
[https://supportportal.ironport.com/irppcnctr/srvcd?u=http://secure-support.soma.ironport.com/subproducts/x-c_series&sid=900001]
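
The difference between the three spellings can be checked in any regex engine. Below is a small sketch using Python's re module (an assumption for illustration; the dictionary itself is not Python code). Raw strings are used so that what you see is exactly what the regex engine receives, with no extra parser layer in between.

```python
import re

escaped = re.compile(r"@bad-domain\.com$")    # single backslash: literal period
unescaped = re.compile(r"@bad-domain.com$")   # "." matches any one character
doubled = re.compile(r"@bad-domain\\.com$")   # literal backslash, then any one character

senders = ["user@bad-domain.com", "user@bad-domain1com", r"user@bad-domain\.com"]

for sender in senders:
    print(sender,
          bool(escaped.search(sender)),
          bool(unescaped.search(sender)),
          bool(doubled.search(sender)))
```

Only the single-backslash form matches the real domain and nothing else in this list.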

Regular Expression Credit Card Filter

I've been playing around with filters for credit cards and have yet to find one that stops all credit cards while limiting false positives because it is matching any random 16 characters.
I need one that blocks all amex, visa, mc, and discover without spaces, with spaces or with dashes.
This one has worked the best so far (it's a mish mash of filters I have found or tweaked or have been sent to me), but I think it can be improved. Any ideas? Anyone have a better filter they are using?
Visa/MC/Amex
^((4\d{3})|(5[1-5]\d{2}))(-?|\040?)(\d{4}(-?|\040?)){3}|^(3[4,7]\d{2})(-?|\040?)\d{6}(-?|\040?)\d{5}
Discover Card:
(6011|5[1-5][0-9]{2}|4[0-9]{3}) [0-9]{4} [0-9]{4} [0-9]{4}
(6011|5[1-5][0-9]{2}|4[0-9]{3})-[0-9]{4}-[0-9]{4}-[0-9]{4}
(6011|5[1-5][0-9]{2}|4[0-9]{3})\.[0-9]{4}\.[0-9]{4}\.[0-9]{4}

I'm not well versed in regular expressions, but I found the following ones on the web.
If someone can make these IronPort-legal, they might help with the false positives.
var re;
if (type == "Visa") {
// Visa: length 16, prefix 4, dashes optional.
re = /^4\d{3}-?\d{4}-?\d{4}-?\d{4}$/;
} else if (type == "MC") {
// Mastercard: length 16, prefix 51-55, dashes optional.
re = /^5[1-5]\d{2}-?\d{4}-?\d{4}-?\d{4}$/;
} else if (type == "Disc") {
// Discover: length 16, prefix 6011, dashes optional.
re = /^6011-?\d{4}-?\d{4}-?\d{4}$/;
} else if (type == "AmEx") {
// American Express: length 15, prefix 34 or 37.
re = /^3[47]\d{13}$/;
} else if (type == "Diners") {
// Diners: length 14, prefix 30, 36, or 38.
re = /^3[068]\d{12}$/;
}

Thanks for the regular expression, beneckij. I am still waiting to capture legitimate traffic, but I am getting false positives on South American phone numbers.
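
One technique that is not mentioned in this thread but is commonly used against this kind of false positive is the Luhn checksum: real card numbers pass it, while a random digit string passes only about one time in ten. The sketch below is Python, so it cannot be pasted into a filter as-is; it only illustrates the idea of "regex for candidates, checksum to confirm". The pattern and the function names are assumptions made for this example. Note also that a comma inside a character class, as in 3[4,7], is matched literally, so 3[47] is what is meant.

```python
import re

# Candidate pattern: 16-digit Visa/MasterCard/Discover (groups of four) or
# 15-digit American Express (4-6-5), with an optional space or dash between groups.
# The lookarounds keep a match from starting or ending inside a longer digit run.
CANDIDATE = re.compile(
    r"(?<!\d)(?:"
    r"(?:4\d{3}|5[1-5]\d{2}|6011)(?:[ -]?\d{4}){3}"
    r"|3[47]\d{2}[ -]?\d{6}[ -]?\d{5}"
    r")(?!\d)"
)

def luhn_ok(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

def find_card_numbers(text: str) -> list:
    """Return regex candidates that also pass the Luhn check."""
    return [m.group(0) for m in CANDIDATE.finditer(text) if luhn_ok(m.group(0))]

print(find_card_numbers("card 4111-1111-1111-1111, ref 4111 1111 1111 1112"))
```

The first number is the well-known Visa test number and passes; the second differs in the last digit and is dropped by the checksum.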

Regular expressions and content filters

I'm having difficulty dropping unwanted mail that contains Chinese characters. I have a content filter that looks for the gb2312 charset, but it fails to match properly.
Sample header:
Content-Type: text/html; charset="gb2312"
Filter:
header("Content-type") == "(?i)gb2312"
I ran a trace; the filter does not match and the message gets through.
I have noticed that some messages have the Content-type header with a different case, like "Content-Type". I tried it both ways and it fails.
Some messages are Chinese but do not specify a character set. I typically see something like ?utf-8? in the subject line, and that will not match either.

You may have to try to match against the message body of the email. The following support portal KB article goes over the common character sets and how to match against them.
How to block Russian / Cyrillic / Ukrainian char sets
http://tinyurl.com/23287c
-kevin

Regular expression containing - don't match

Dears,
I have some folders & files created with timestamp in their names.
Example:
2005-07-25-16-13-22-hello
2005-07-15-16-13-20-name.txt
What I want is to make a filter so that when I call File.listFiles( filter ), I get the folders matching some criterion: matching as a prefix, as a suffix, or anywhere.
I want to get folders with specific dates, e.g. those starting with 2005-07-25, and ignore the others, so I match the folder name against the accepted format.
I know I can use startsWith, endsWith, indexOf, but I want to use a regular expression.
In my filter I use the following expressions, but they didn't work:
regExp = "(2005-07-25){1,}." for matching as prefix
regExp = ".(2005-07-25){1,}" for matching as suffix
regExp = ".(2005-07-25){1,}." for matching anywhere
and I use:
boolean isMatch = fileName.matches( regExp );
but it never returns true for valid folder or file names.
I think the problem is that the timestamp I use contains -, which is used internally by regular expressions, and I think that is causing the problem. Can anyone suggest a solution?
Thanks in advance.

I got the * part
regExp = "(2005-07-25){1,}*" for matching as prefix
regExp = "*(2005-07-25){1,}" for matching as suffix
regExp = "*(2005-07-25){1,}*" for matching anywhere
I didn't get the () part. I just use the () for grouping, to say that I want to match this date at least once using {1,}. What I need is to match exactly "2005-07-25", which does not seem to work so far.
Please clarify what you mean.
Thanks, best regards
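
Two things are probably going on here. The hyphen is only special inside a character class such as [a-z], so it is not the culprit. And Java's String.matches() has to match the whole string, while a single "." stands for exactly one character, so "any run of characters" is written ".*". The sketch below shows the same behaviour in Python, whose re.fullmatch() is used here as a stand-in for matches(); the variable names are illustrative only.

```python
import re

names = ["2005-07-25-16-13-22-hello", "2005-07-15-16-13-20-name.txt"]

prefix = re.compile(r"2005-07-25.*")      # the date, then anything
suffix = re.compile(r".*2005-07-25")      # anything, then the date
anywhere = re.compile(r".*2005-07-25.*")  # the date somewhere in the name

# The pattern from the question: the date followed by exactly ONE character.
original = re.compile(r"(2005-07-25){1,}.")

for name in names:
    print(name,
          bool(prefix.fullmatch(name)),
          bool(anywhere.fullmatch(name)),
          bool(original.fullmatch(name)))
```

In Java the same patterns would be passed to fileName.matches(...) unchanged.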

How do I have to define a regular expression to filter out data from file?

Hi all,
I need to extract parts of lines from an ASCII file and couldn't get it done with my limited knowledge of regular expressions.
The file contains hundreds of lines and I am only interested in a few of them; within those lines I need just part of the data.
One original line looks like this:
TP3| |TP_SMD|Nicht in Stueckliste|~TP TP_SMD TESTPUNKT|-|0|87.770|157.950|0|top|c| |other|TP_SMD|TP_SMD_60RF-TP
Only the bold and underlined information is of interest (here TP3 and the values 87.770, 157.950 and 0); I don't need the rest.
I can open that file, read in each line but then I am struggling to pick out only the lines of interest (starting with TP), taking that TP with its number and the coordinates following later on and then writing these shortened lines to a new text file. So the new line should look like that:
TP3; 87.770;157.950;0 (It doesn't matter if the separator will be ; or |)
I thought of using regular expressions - is that the right way or is there a better approach?
Thanks & regards,
gedi, using LabVIEW 8.5

Hi max,
for finding a specific part of a string you can use the "Match Pattern" VI, it is located in the Strings Palette.
Maybe the Extract Numbers.vi example in the examples browser library can help you.
What I did to filter out my data of interest is first to sort out only the columns which I want to have -
then there are still a lot of lines remaining I don't need (this is the thing described above).
The rest I am going to filter out with a (then easy) regular expression with the "Match Pattern" VI.
Regards,
gedi
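
Since the line is pipe-delimited, splitting on "|" and picking columns by position may be simpler than one big regular expression; a small regex is then only needed to recognise the "TP" lines. The thread is about LabVIEW, so the Python sketch below only illustrates the logic. The column positions (0, 7, 8, 9) are read off the sample line and are an assumption about the rest of the file.

```python
import re

def shorten(line: str):
    """Return 'TPn;a;b;c' for test-point lines, or None for all other lines."""
    fields = line.rstrip("\n").split("|")
    # Keep only lines whose first field is "TP" followed by digits, e.g. "TP3".
    if len(fields) < 10 or not re.fullmatch(r"TP\d+", fields[0]):
        return None
    # Fields 7, 8 and 9 hold the three values shown in the desired output.
    return ";".join([fields[0], fields[7], fields[8], fields[9]])

sample = ("TP3| |TP_SMD|Nicht in Stueckliste|~TP TP_SMD TESTPUNKT|-|0|"
          "87.770|157.950|0|top|c| |other|TP_SMD|TP_SMD_60RF-TP")
print(shorten(sample))
```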

How can I remove all content between two tags using Find/Replace regular expressions?

This one is driving me bonkers... I'm relatively new to regular expressions, but I'm trying to get Dreamweaver to remove all content between two tags in an XML document. For example, let's say I have the following XML:
<custom>
<![CDATA[<p>Some text</p>
<p>Some more text</p>]]>
</custom>
I'd like to do a Find/Replace that produces:
<custom>
</custom>
In essence, I'd like to strip all of the content between two tags. Ideally, I'd like to know how to strip the CDATA content as well, to return the following:
<custom>
<![CDATA[]]>
</custom>
I'd much appreciate any suggestions on accomplishing this.
Many thanks!

Thanks much for your response. I found David's article to be a little thin with respect to examples using quantifiers in coordination with the wildcard metacharacters; however, I was able to cobble together a working expression through trial and error using the information he presented. For posterity, here’s the solution:
Find:
<custom>[\d\D]*?</custom>
Replace:
<custom>
<![CDATA[]]>
</custom>
I believe this literally translates to:
[] = find anything in this range/character class
\d = find any digit character (i.e. any number)
\D = find any non-digit character (i.e. anything except numbers)
*? = match zero or more times, but as few times as possible (i.e. match multiple characters per instance, but only match one instance at a time, or none at all)
I’m still not sure how to effectively utilize the . wildcard. For example, the following expression will not find content that ends with a number:
<custom>.*?[\D]*?</custom>
I'm presuming this is because numbers aren't included in the \D metacharacter; however, shouldn't numbers be picked up by the .*? expression?
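
A likely explanation: in most regex flavours the "." wildcard matches any character except a line break, so ".*?" cannot get past the end of the first line, and [\D]*? can cross line breaks but stops at the first digit. The sketch below reproduces this in Python (an assumption for illustration; Dreamweaver uses its own engine, which has no "dot matches newline" switch that I know of, so [\d\D] or [\s\S] is the usual workaround there).

```python
import re

xml = "<custom>\n<![CDATA[<p>Some text</p>\n<p>Some more text</p>]]>\n</custom>"

# Without DOTALL, "." stops at line breaks, so the multi-line element is not matched.
print(re.search(r"<custom>.*?</custom>", xml))

# [\d\D] (digit or non-digit) matches every character, including line breaks.
emptied = re.sub(r"<custom>[\d\D]*?</custom>", "<custom>\n<![CDATA[]]>\n</custom>", xml)
print(emptied)

# With DOTALL, "." behaves the same way as [\d\D].
stripped = re.sub(r"<custom>.*?</custom>", "<custom>\n</custom>", xml, flags=re.DOTALL)
print(stripped)
```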

IR filter using "matches regular expression"

Hi,
I am familiar with Perl regular expressions, but I'm having trouble using the IR filter by regular expression in Apex.
For instance, I would like to search for dates of format 'MM/DD/YY' - can someone tell me how this would be done? I tried '[0-9](2)/[0-9](2)/[0-9](2)' and many other patterns to no avail.
Also can you point me to a good thread for regular expressions in Apex?
Thanks for any help.

Hi,
You can play around with Oracle regular expressions at
http://www.yocoya.com/apex/f?p=YOCOYA:REGEXP_HOME:0:
It's an Apex application, albeit "seasoned", where you can build and test the regex and it will be 100% compatible as it runs natively, so it's not simulated on a different platform.
Most likely the IR filter will make use of REGEXP_LIKE so you can pick that function from the menu.
Flavio
http://oraclequirks.blogspot.com
http://www.yocoya.com
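
On the pattern itself: a repetition count is written with curly braces, so "two digits" is [0-9]{2}. With round brackets, [0-9](2) means "one digit followed by the digit 2". The sketch below shows the difference in Python; the same brace syntax is used in POSIX-style expressions such as Oracle's, but treat the exact behaviour of the IR filter as something to verify on your own system.

```python
import re

# {2} (curly braces) is the repetition count; (2) would be a group matching the digit "2".
DATE = re.compile(r"[0-9]{2}/[0-9]{2}/[0-9]{2}")

# A slightly stricter variant: month 01-12, day 01-31, any two-digit year.
STRICT_DATE = re.compile(r"(0[1-9]|1[0-2])/(0[1-9]|[12][0-9]|3[01])/[0-9]{2}")

print(bool(DATE.search("shipped on 07/25/05")))
print(bool(re.search(r"[0-9](2)/[0-9](2)/[0-9](2)", "shipped on 07/25/05")))
print(bool(STRICT_DATE.fullmatch("13/01/99")))
```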

Filter String using Regular Expression

Hello,
I have an application that monitors serial communication between a PC and device. The message protocol is a byte stream that I convert to a string to parse into pretty messages. The start of the string is always "10 02", but if the string is preceded with another "10" like this "10 10 02" it is part of a message. I've been trying to use a regular expression with the Search and Replace VI. My regex is "[^10]\s10\s02" which almost works but it cuts off part of the message:
Before:
10 03 10 02
After:
10 0 <= missing the "3"
10 02
Here's what I'm doing:
Any ideas on what I'm missing? I've attached a simple example.
Thanks
Message Edited by Derek Price on 02-14-2008 08:37 PM
Attachments:
Filter Beginning Message1.vi ‏14 KB
FilterMessageRegex1.png ‏7 KB

Try this approach.
Do a search and replace on '10\s02' and replace it with '\r\n10\s02'.
Then do another search and replace on '10\s\r\n10\s02' and replace it with '10\s10\s02'.
See attached.
Randall Pursley
Attachments:
Message Filter.PNG ‏18 KB
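
The reason the "3" disappears is that [^10] is a character class: it matches one character that is neither "1" nor "0", and that character becomes part of the match. If the regex engine supports lookbehind, the check can be done without consuming anything. The sketch below uses Python for illustration (whether the LabVIEW VI accepts lookbehind is an assumption to verify); the function name is made up.

```python
import re

# "10 02" starts a message unless it is directly preceded by another "10 ".
# (?<!10 ) is a negative lookbehind: it inspects the preceding text without
# consuming it, so nothing (such as the "3" in "03") is swallowed by the match.
START = re.compile(r"(?<!10 )10 02")

def split_messages(stream: str) -> list:
    """Split a hex-dump string at every real start-of-message marker."""
    starts = [m.start() for m in START.finditer(stream)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    ends = starts[1:] + [len(stream)]
    pieces = [stream[a:b].strip() for a, b in zip(starts, ends)]
    return [piece for piece in pieces if piece]

print(split_messages("10 03 10 02"))
print(split_messages("10 02 41 10 10 02 42 10 02 43"))
```

One limitation: a data byte 10 (sent as "10 10") immediately followed by a real "10 02" would be misread by this simple check, so a byte-by-byte parser is safer if that can occur.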

Extracting Content with Regular Expressions

Hi all,
I am trying to extract some content from a text file using regular expressions, but I am unable to get the exact match that I am expecting.
Here is the sample text:
The passage begins like that (something in text form (some
more embedded text) remaining outer text).
I am expecting to get the following match:
(something in text form (some
more embedded text) remaining outer text)
but my regular expression only returns a broken section of the text, like this one:
(something in text form (some
more embedded text)
How can I get this fixed? I mean, how can I match evenly across the open parentheses?
Here is my broken regexp: (?:$(?:\\.|[^$\\])*\))
Thanks for any help!

Cross-posted. See duplicate http://forum.java.sun.com/thread.jspa?threadID=653042&tstart=0
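
For what it's worth, arbitrarily nested parentheses are not something a classic regular expression can balance; engines without recursion support will always stop at the wrong ")" for some input. A small depth counter does the job reliably. The sketch below is in Python for illustration (the original question is about Java, where the same loop translates directly); the function name is made up.

```python
def first_balanced_group(text: str):
    """Return the first parenthesized span with balanced nesting, or None."""
    start = text.find("(")
    if start == -1:
        return None
    depth = 0
    for index in range(start, len(text)):
        if text[index] == "(":
            depth += 1
        elif text[index] == ")":
            depth -= 1
            if depth == 0:
                return text[start:index + 1]
    return None  # the opening parenthesis is never closed

sample = ("The passage begins like that (something in text form (some\n"
          "more embedded text) remaining outer text).")
print(first_balanced_group(sample))
```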

Splitting html ul tags and their content into string arrays using regular expression

<ul data-role="listview" data-filter="true" data-inset="true">
<li data-role="list-divider"></li><li><a href="#"><h3>
my title
</h3><p><strong></strong></p></a></li>
</ul>
<ul data-role="listview" data-filter="true" data-inset="true">
<li data-role="list-divider"></li><li>test.</li>
</ul>
I need to be able to split this HTML into two arrays, each holding an entire <ul></ul> element. Please help.
Thanks.

Hi friend.
This forum is to discuss problems of C# development. Your question is not related to the topic of this forum.
You'll need to post it in the dedicated Archived Forums N-R > Regular Expressions
for better support. Thanks for understanding.
Best Regards,
Kristin
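
In case it helps later readers, the basic regex idea is a non-greedy match from each opening <ul ...> to the nearest </ul>, with "dot matches newline" switched on. The sketch below shows it in Python as an illustration only; the sample markup is shortened, and the approach assumes the lists are not nested (for nested lists an HTML parser is the better tool).

```python
import re

html = """<ul data-role="listview">
<li>first</li>
</ul>
<ul data-role="listview">
<li>second</li>
</ul>"""

# Non-greedy match from each <ul ...> to the nearest </ul>; DOTALL lets "." span lines.
UL_BLOCK = re.compile(r"<ul\b[^>]*>.*?</ul>", re.DOTALL | re.IGNORECASE)
blocks = UL_BLOCK.findall(html)

for block in blocks:
    print(block)
    print("---")
```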

Regular Expression Filter Mapping In Web.xml

I have a situation where I need to filter URLs that don't have a file extension. There are hundreds of URLs, so I can't specify each one separately.
From what I understand, you can have a filter mapping in web.xml to map certain file extensions to a filter, such as :
<filter>
<filter-name>MyFilter</filter-name>
<filter-class>com.filters.MyFilter</filter-class>
</filter>
<filter-mapping>
<filter-name>MyFilter</filter-name>
<url-pattern>*.esi</url-pattern>
</filter-mapping>
...where all files with the extension .esi go through the filter. Additionally, I want all URLs WITHOUT a file extension to pass through this filter. So if I have a request for "http://myserver.com/home", it will pass through the filter.
Does WebLogic support regular expressions for pattern matching in filter-mapping? If so, can you provide examples or links to documentation?
Thanks,
Jim

No, regular expressions are not supported. You can specify the mapping with extensions or with paths, but not both at the same time in the same mapping specification. However, you can specify two servlet mappings, one with an extension and one with a path.
The exact wording from the spec (2.4) is as follows:
In the Web application deployment descriptor, the following syntax is used to define
mappings:
• A string beginning with a ‘/’ character and ending with a ‘/*’ suffix is used
for path mapping.
• A string beginning with a ‘*.’ prefix is used as an extension mapping.
• A string containing only the ’/’ character indicates the "default" servlet of
the application. In this case the servlet path is the request URI minus the context
path and the path info is null.
• All other strings are used for exact matches only.
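
Given those rules, a common workaround is to map the filter to the path pattern /* and make the "no extension" decision inside the filter itself. The sketch below shows only that decision logic, in Python for illustration; in a real servlet filter the same check would be written in Java against the request URI. The function name is made up.

```python
import posixpath
from urllib.parse import urlsplit

def should_filter(url: str) -> bool:
    """Return True for .esi requests and for requests whose last segment has no extension."""
    path = urlsplit(url).path
    extension = posixpath.splitext(path)[1]  # "" when there is no extension
    return extension in ("", ".esi")

print(should_filter("http://myserver.com/home"))
print(should_filter("http://myserver.com/page.esi"))
print(should_filter("http://myserver.com/logo.png"))
```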

Regular Expression to Filter Out Special Case

I wonder if anybody knows how to use regular expressions to match any possible string in existence, except one?
For example, let's say the string I DON'T want to match is "Foo".
Then I want a regular expression that will match (and capture) everything except EXACTLY "Foo". Therefore, "FooBar" is OK, "FooFoo" is OK, "Fool" is OK, "Java" is OK, "Supercalifragilistic Expialadocious" is OK. Absolutely anything except exactly "Foo".
Does anybody know if or how this can be done using regular expressions? (I know I could simply test with .equals, but the problem is that I don't have control of the code that does the validation. All I can do is supply a regular expression and let the framework do the rest.)
Thanks,
Adam

I guess I don't follow you when you say you have to use regexs, since you provide a regex to split().
Do you mean you can only use Pattern and matches(), etc?
I'm just curious now, since it looks like Sabre150 provided a solution to your problem.
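
The solution referred to above is not quoted on this page, so the following is only one possible approach, not necessarily the one that was posted: a negative lookahead that rejects the input when "Foo" is immediately followed by the end of the string. The sketch is in Python; in Java the equivalent pattern would use \z instead of \Z for the end-of-input anchor.

```python
import re

# (?!Foo\Z) rejects the input only when "Foo" is followed directly by the end
# of the string; everything else, including the empty string, is accepted by ".*".
NOT_EXACTLY_FOO = re.compile(r"(?!Foo\Z).*", re.DOTALL)

def accepted(value: str) -> bool:
    """Return True for every string except exactly 'Foo'."""
    return NOT_EXACTLY_FOO.fullmatch(value) is not None

for value in ["Foo", "FooBar", "FooFoo", "Fool", "Java"]:
    print(value, accepted(value))
```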

Unix Log Monitoring regular expression not picking up alerts

Hi,
We are moving our unix monitoring to SCOM 2012 SP1 rollup 4.
What I have got working is individual alert logging of Unix log alerts, by exporting the MP, changing the <IndividualAlerts> value to true, removing the suppression XML section, and then reimporting the MP.
What I am trying to do is use the regular expression to perform the suppression of specific events (such as event codes).
The expression is:
((?i:warning)(?!(.*1222)|(.*1001)))
i.e. search the log for "warning" (not case sensitive), then check whether events 1222 or 1001 exist; if so, return no match, and if they don't exist, return a match.
I use the built-in test function in SCOM when creating the rule and the tests come back as expected, but when I inject test lines into the Unix log, no alerts get generated.
I suspect the syntax may not be accepted on the system (it's running Red Hat 6).
I have tested this with regex tools and it works.
When I try and test it on the server i get:
[root@bld02 ~]# grep ((?i:Warning)(?!(.*1222)|(.*1001))) /var/log/messages
-bash: !: event not found
[root@bld02 ~]# tail /var/log/messages
Nov 13 15:07:26 bld02 root: SCOM Test Warning Event ID 1001 Round 18
Nov 13 15:07:29 bld02 root: SCOM Test Warning Event ID 1000 Round 18
Nov 13 15:07:35 bld02 root: SCOM Test Warning Event ID 1002 Round 18
So I am expecting 2 alerts to be generated.
SCOM tests to show expression working:
Test 1 Matching
Test 2 to exclude
I need some help with this. Thank you in advance :)
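
A side note on the grep test: the "-bash: !: event not found" message comes from the shell, not from grep. The pattern is unquoted, so bash tries history expansion on the "!" before grep ever runs. Plain grep also does not understand (?i:...) or lookaheads; GNU grep only accepts them with the -P option (when built with PCRE support) and with the pattern in single quotes. To check the expression itself against the three sample lines, here is a minimal Python sketch; it says nothing about how the SCOM agent evaluates the pattern on the Linux side, which may well use a different regex dialect.

```python
import re

# Same logic as the expression in the question, written with one alternation.
PATTERN = re.compile(r"(?i:warning)(?!.*(?:1222|1001))")

lines = [
    "Nov 13 15:07:26 bld02 root: SCOM Test Warning Event ID 1001 Round 18",
    "Nov 13 15:07:29 bld02 root: SCOM Test Warning Event ID 1000 Round 18",
    "Nov 13 15:07:35 bld02 root: SCOM Test Warning Event ID 1002 Round 18",
]

alerts = [line for line in lines if PATTERN.search(line)]
for line in alerts:
    print(line)
```

With these inputs the lines for event IDs 1000 and 1002 are kept and the 1001 line is excluded, which matches the two alerts expected above.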

Hello,
Here's an example of modifying the MP to exclude particular events. Firstly, I created a log file rule using the MP template that is fairly inclusive - matching the string Warning (with either a lower or upper case W).
I then exported the MP, and modified the rule. I set IndividualAlerts = true and removed the AlertSuppression element, so that every matched line will fire a unique alert. You don't have to remove the AlertSuppression, but you should use individual alerts so that the exclusion logic doesn't exclude concurrent events that you actually want to match.
Implementing the exclusion logic involves the addition of a System.ExpressionFilter definition in the rule. This will use a conditional evaluation of the //row element of the data item. Here's an example of a dataitem matching an individual row:
<DataItem type="System.Event.Data" time="2013-11-15T10:33:14.8839662-08:00" sourceHealthServiceId="667FF365-70DD-6607-5B66-F9F95253B29F">
<EventOriginId>{86AB962D-2F44-29FD-A909-B99FF6FEB2C5}</EventOriginId>
<PublisherId>{EC7EA4B1-0EA5-7E8E-701F-82FEF3367BC4}</PublisherId>
<PublisherName>WSManEventProvider</PublisherName>
<EventSourceName>WSManEventProvider</EventSourceName>
<Channel>WSManEventProvider</Channel>
<LoggingComputer/>
<EventNumber>0</EventNumber>
<EventCategory>3</EventCategory>
<EventLevel>0</EventLevel>
<UserName/>
<RawDescription>Detected Entry: warning 1002</RawDescription>
<CollectDescription Type="Boolean">true</CollectDescription>
<EventData>
<DataItem type="SCXLogProviderDataSourceData" time="2013-11-15T10:33:14.8839662-08:00" sourceHealthServiceId="667FF365-70DD-6607-5B66-F9F95253B29F">
<SCXLogProviderDataSourceData>
<row>warning 1002</row>
</SCXLogProviderDataSourceData>
</DataItem>
</EventData>
<EventDisplayNumber>0</EventDisplayNumber>
<EventDescription>Detected Entry: warning 1002</EventDescription>
</DataItem>
Here is the rule in the MP XML. The <ConditionDetection>...</ConditionDetection> content was what I added to do the exclusion filtering:
<Rule ID="LogFileTemplate_66b86eaded094c309ffd2631b8367a32.Alert" Enabled="false" Target="Unix!Microsoft.Unix.Computer" ConfirmDelivery="false" Remotable="true" Priority="Normal" DiscardLevel="100">
<Category>EventCollection</Category>
<DataSources>
<DataSource ID="EventDS" TypeID="Unix!Microsoft.Unix.SCXLog.VarPriv.DataSource">
<Host>$Target/Property[Type="Unix!Microsoft.Unix.Computer"]/PrincipalName$</Host>
<LogFile>/tmp/test</LogFile>
<UserName>$RunAs[Name="Unix!Microsoft.Unix.ActionAccount"]/UserName$</UserName>
<Password>$RunAs[Name="Unix!Microsoft.Unix.ActionAccount"]/Password$</Password>
<RegExpFilter>warning</RegExpFilter>
<IndividualAlerts>true</IndividualAlerts>
</DataSource>
</DataSources>
<ConditionDetection TypeID="System!System.ExpressionFilter" ID="Filter">
<Expression>
<RegExExpression>
<ValueExpression>
<XPathQuery Type="String">//row</XPathQuery>
</ValueExpression>
<Operator>DoesNotContainSubstring</Operator>
<Pattern>1001</Pattern>
</RegExExpression>
</Expression>
</ConditionDetection>
<WriteActions>
<WriteAction ID="GenerateAlert" TypeID="Health!System.Health.GenerateAlert">
<Priority>1</Priority>
<Severity>2</Severity>
<AlertName>Log File Alert: ExclusionExample</AlertName>
<AlertDescription>$Data/EventDescription$</AlertDescription>
</WriteAction>
</WriteActions>
</Rule>
I traced this with the Workflow Analyzer as I tested, which shows the logic being applied. Here is the exclusion happening:
Here's more info on the definition of an ExpressionFilter:
http://msdn.microsoft.com/en-us/library/ee692979.aspx
And more information on Regular Expressions in MPs:
http://support.microsoft.com/kb/2702651/en-us
You can also have multiple Expressions in the ExpressionFilter joined by OR or AND operators.
Also, if you are comfortable with the MP authoring, you can just skip the step of creating the rules in the MP template and just author your own MP with the VSAE tool:
http://social.technet.microsoft.com/wiki/contents/articles/18085.scom-2012-authoring-unixlinux-log-file-monitoring-rules.aspx
www.operatingquadrant.com

Content filter not fixed, still stripping message body

The content filter that arbitrarily strips out (part of) the body of my
email messages is not fixed: http://forums.adobe.com/message/1867251#1867251
Jochem
Jochem van Dieten
http://jochem.vandieten.net/

I think I found out why the message body of my messages is stripped out. It appears Jive is filtering the content of email messages with the following regular expression:
/** Simple bean for storing the contents of an incoming email. */
public class Message {
// ripped from EmailParserImpl
private static final Pattern originalMessagePattern = Pattern.compile("(-{5,}|_{5,}|^.*wrote:$)(\\s*.*)*", Pattern.MULTILINE);
// (rest of the class omitted)
}
My first impression is that this implementation is somewhat simplistic. For instance, it doesn't take into account whether you quoted just a single line or the whole message. For a great example of that, look at the House of Fusion email archives, where you can see that selective quotes are allowed, but complete quotes of full messages are filtered out. It also doesn't do pattern matching on the standardized string that starts a signature.
More insight into the behaviour of the email integration can be obtained from the source code of the Jive advancedemail plugin. I am not sure it is the same version that Adobe is running, since there are some comments and TODOs in the code about behaviour I am not seeing in the email from these forums, but it still helps to understand what is happening.
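
To see what that expression does to a reply, here is a small sketch in Python. It assumes the pattern is used to cut the matched text out of the message, which is a guess based on the behaviour described above, not something confirmed from the Jive source. The expression is the same one quoted in the post; the doubled backslash of the Java string literal becomes a single backslash in a Python raw string.

```python
import re

ORIGINAL_MESSAGE = re.compile(r"(-{5,}|_{5,}|^.*wrote:$)(\s*.*)*", re.MULTILINE)

def strip_quoted(body: str) -> str:
    """Remove everything from the first marker to the end of the message."""
    return ORIGINAL_MESSAGE.sub("", body)

reply = ("Thanks, that helps.\n"
         "\n"
         "Jochem wrote:\n"
         "> quoted line\n"
         "\n"
         "My answer continues here.")

print(repr(strip_quoted(reply)))
print(repr(strip_quoted("Item list\n-----\nreal content")))
```

In both cases everything after the first "wrote:" line or the first run of five dashes is lost, including text the author wrote below it, which would explain the stripped message bodies.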
