Find/Replace Extended Character Set characters in filenames in one pipeline

Hello all,
I have to work with some very bored people. Instead of putting a dash (hex 2D) into a filename, they opt for something from this
set of extended characters, which makes my regular expressions fail. Is there a way I can efficiently find and replace anything outside the standard character set
in one pipeline, without finding and replacing one character at a time?
So, I'd like something like:
Get-ChildItem * | Where-Object { $_.Name -match '\x99' } | Rename-Item -NewName { $_.Name -replace '\x99','=' }
but covering hex 80 to hex FF, rather than a for-each.
Thanks.

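The answer depends on how you want to replace. It's easier if you want to replace any character in the set with one selected character:
# Create a test file whose name consists of characters 180..190 (hex B4..BE)
$Name = -join (180..190 | % { [char]$_ })
New-Item -ItemType File -Name $Name
Get-ChildItem * | Rename-Item -NewName {
    [regex]::Replace(
        $_.Name,
        '[\xB4-\xBE]',
        '-'    # replacement character; '-' is a placeholder, use whichever you want
    )
} -WhatIf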
If you want something more complicated, you can do that too, e.g. by defining a hashtable that is used to replace individual characters:
$Replacer = @{}
foreach ($Char in (180..190 | % { [char]$_ })) {
    # Map each character to a randomly chosen replacement
    $Replacer.Add(
        [string]$Char,
        ('_', '-', '=', '.' | Get-Random)
    )
}
$Replacer    # display the mapping
Get-ChildItem * | Rename-Item -NewName {
    [regex]::Replace(
        $_.Name,
        '[\xB4-\xBE]',
        { $Replacer[$args[0].Value] }    # script block used as the match evaluator
    )
} -WhatIf
Using this syntax makes it possible to include some logic in the replacement. E.g. you could easily use switch to decide what to do with a given match:
Get-ChildItem * | Rename-Item -NewName {
    [regex]::Replace(
        $_.Name,
        '[\xB4-\xBE]',
        {
            switch ($args[0].Value) {
                'º' { "0" }
                'µ' { "u" }
                '¹' { "1" }
                '¸' { "," }
                Default { "_" }
            }
        }
    )
} -WhatIf
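For comparison, the same idea (one regular-expression range plus a per-match callback) can be sketched outside PowerShell. The Python below is only an illustration under assumptions: the REPLACEMENTS table and the helper names sanitize_name and rename_all are made up for this example, and the [\x80-\xff] range is taken from the "hex 80 to hex FF" span mentioned in the question.

```python
import re
from pathlib import Path

# Hypothetical mapping for illustration; any other character in the range
# falls back to "_".
REPLACEMENTS = {
    "\u00ba": "0",  # masculine ordinal indicator
    "\u00b5": "u",  # micro sign
    "\u00b9": "1",  # superscript one
    "\u00b8": ",",  # cedilla
}

# Code points U+0080..U+00FF, i.e. the hex 80..FF range from the question.
EXTENDED = re.compile(r"[\x80-\xff]")


def sanitize_name(name: str) -> str:
    """Replace every extended character in one pass, one callback per match."""
    return EXTENDED.sub(lambda m: REPLACEMENTS.get(m.group(0), "_"), name)


def rename_all(folder: str, dry_run: bool = True) -> list:
    """Return the planned (old, new) name pairs; rename only if dry_run is False."""
    planned = []
    for path in Path(folder).iterdir():
        new_name = sanitize_name(path.name)
        if new_name != path.name:
            planned.append((path.name, new_name))
            if not dry_run:
                path.rename(path.with_name(new_name))
    return planned
```

Calling rename_all(".") with the default dry_run=True only reports what would change, which plays the same role as -WhatIf above.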

Similar Messages

Inbound MQ with extended character sets

Hi
We are trying to send to PI data containing Swedish characters in both xml and non xml payloads.
The message is placed on an MQ queue (version 6.0.2.3) with a JMS header that has a ccsid of 1208 specified.
The PI adapter is specified as JMS | WebsphereMQ (non-JMS) | JMS Compliant and the payload module has
AF/Modules/MessageTransformerBean | Plain2XML | Transform.ContentType | text/xml;charset=utf-8
The received characters are not displaying correctly, which is a theme on several threads from the past but I've been unable to determine the solution.
I am more familiar with the MQ side, so please excuse my bias. I already send extended character sets to other applications using JMS over MQ, and we've tried using the same values on the MQ side, to no avail.
In MQ we set the MQ header to the queue manager default, but there is a JMS-specific additional header preceding the payload that specifies that the payload is UTF-8.
From my perspective, I can't see that PI is reading the JMS header at all (in fact, if I remove it, it has no effect), but we want it there in order to set some extended metadata properties.
When I look at the data on my queue as it leaves MQ, it looks correct both to view and in hex.
How do I get PI to recognise the JMS properties I've specified (it's known as an mqrfh header in MQ)?
Any advice, guidance, documentation to a PI novice would be most welcome.
Tim

Thank you for the replies, Sarvesh and Stefan.
I had read your previous replies on this subject, but was still stuck.
The delay in replying is because we were waiting for a reply from the SAP Support team.
They have now acknowledged that there may be a fault in the MessageTransformBean.
It's still only a "may", but so far all your other suggestions have been tried and have not worked.
I'll update again when I get further information.
Tim
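When characters "do not display correctly" in a setup like this, one quick check is whether the garbling matches UTF-8 bytes being read as a single-byte character set. The Python sketch below only illustrates that pattern; it says nothing about what PI or the adapter module actually does, and the function names are hypothetical.

```python
def misread_utf8_as_latin1(text: str) -> str:
    """Show what UTF-8 encoded text looks like when decoded as ISO-8859-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Undo the misreading above, assuming no bytes were lost on the way."""
    return garbled.encode("latin-1").decode("utf-8")


# Swedish sample word "Malmö", written with an escape for the o-umlaut.
sample = "Malm\u00f6"
garbled = misread_utf8_as_latin1(sample)
print(garbled)          # the o-umlaut shows up as two characters
print(repair(garbled))  # back to the original word
```

If the received payload shows two characters for each Swedish letter in this way, the bytes are probably intact and only the declared or assumed encoding is wrong.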

Avant Garde Gothic - Extended character sets?

I am working on a project for a client in Poland. We have been given regular and bold weights of Avant Garde Gothic with extended character sets that cover the Polish language. Now however, we want to use the medium weight as well, but have no idea where one goes to find a special extended character set. I have tried numerous type websites with no luck. Any help would be great :)

There's an Avant Garde CE Gothic Demi, available from Linotype. It
appears to be Adobe's with the Central European character set added.
(I don't understand that at all!)
URW may also have a Medium version. I have one whose font name may be
AvantGarGotItcTEEMed. I don't know if it's still available.
- Herb

Extended character set

I've just had the results of CS3 pages on a PC, packaged and sent to be opened using the ID2Q plug-in on a Mac running Quark 6.
The multiplication symbol 0215 (× if you can see it) has come across as a tall skinny diamond.
As the files were packaged my fonts have been used, so I'm guessing that the Mac didn't like my use of the extended character set.
Can anyone shed light, and is this likely to happen with all extended characters?
k

I can get a multi sign (ALT 0215) on Quark 4, so I don't think this is a
Unicode issue. AFAIK, there is no diamond in the extended (or regular)
set, which leads me to believe ID2Q decided your multi sign should be
formatted in a different font, Symbol maybe?
Are you sure they used *your* fonts (not their fonts which they think
are exactly like your fonts)?
Kenneth Benson
Pegasus Type, Inc.
www.pegtype.com

Unable to find any national character set

please check your Oracle installation...
and dbassist fails to load... Any idea?
PS: I installed two languages, English and Italian, from the "languages" option.
Thanks for your help.
Take care
Renato Dall'Armi

I ran into the same problem that you have. Did you have a solution for it?
Regards
Originally posted by Pui Endo ([email protected]):
I tried to launch DBAssist and create a new database instance
after the Oracle8i server installation. I was getting the above
error; it says "Unable to find any National Character Set.
Please check your Oracle installation."
I am using the default character set (US). Is something going wrong
in my server installation? Does Oracle prompt for
choosing a character set during installation? I am not a DBA and
am new to Oracle8i. I am trying to set up Oracle since we don't have a
DBA. Any help will be greatly appreciated. Oracle is
running on Linux.
Thanks.

Character replacement before character set change

We are preparing to change our character set from us7ascii to al32utf8 on 11.2.0.3 HP-UX. The majority of the "lossy" rows identified contain accented and umlauted vowels. I have tried to change the accented and umlauted characters to their "plain" counterparts (e.g., accented a to 'a'). I've tried several variations of the replace and translate functions with no success. One of the statements that I was "sure" would work follows - what am I doing wrong?
update tbl1 set desc_fld = regexp_replace(desc_fld, '[=a=]', 'a');

Hi,
bynummike wrote:
We are preparing to change our character set from us7ascii to al32utf8 on 11.2.0.3 HP-UX. The majority of the "lossy" rows identified contain accented and umlauted vowels. I have tried to change the accented and umlauted characters to their "plain" counterparts (e.g., accented a to 'a'). I've tried several variations of the replace and translate functions with no success. One of the statements that I was "sure" would work follows - what am I doing wrong?
update tbl1 set desc_fld = regexp_replace(desc_fld, '[=a=]', 'a');
Try:
update tbl1
set desc_fld = regexp_replace (desc_fld, '[[=a=]]', 'a');
[=a=] means "any variation on the letter a" only when it's inside square brackets; otherwise, it looks like you want the set of characters consisting of '=', 'a' and '='.
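Outside Oracle, a loosely similar effect (folding accented letters to their base letter) can be approximated with Unicode decomposition. This is a different technique from the [[=a=]] equivalence class and is shown only as a hedged Python sketch; the function name strip_accents is hypothetical, and letters that have no canonical decomposition are left unchanged.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose each character (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Accented and umlauted variants of "a" all fold to plain "a".
print(strip_accents("\u00e0\u00e1\u00e2\u00e4\u00e5"))
```

Unlike the single-letter pattern in the UPDATE statement, this folds every accented letter in the string at once, so it is better suited to checking data before a migration than to reproducing the exact SQL.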

Find out the character set of a database

Hi,
I need to know the character set of a database. I don't know which command or utility I can use.
Thanks

This forum is for Oracle Repository.
However, connect as system and run
select * from nls_database_parameters
you should see the nls info.

JS: Finding & replacing a character with a certain para style

I'm scripting an automated indd -> PDF project and I need to minimise the user input.
To this end, I want to search and replace commas in a particular cell of a table (which are the only instances of a particular para style), and replace them with forced line breaks.
The script I have for another part of the XML import looks like this:
//Search the document for the umlauts and replaces them with macrons. I
app.findTextPreferences.findWhat = "Ï";
app.changeTextPreferences.changeTo = "Ī";
myDocument.changeText();
What do I add to a similar script to make it target only content which has a particular para style? And will it be compatible with CS4, 5 and 5.5, i.e. be agnostic to nested styles etc.?
Thanks in advance,
Simon.

Use app.findTextPreferences.appliedParagraphStyle:
app.findTextPreferences.appliedParagraphStyle = "my_style";
.. your snippet here ..
I bet there is a reset somewhere above your snippet:
app.findTextPreferences = null;
app.changeTextPreferences = null;
and it would be safest to insert this below your code as well, or else this particular setting will "stick" and also be applied to subsequent replaces.
This command has been virtually unchanged since CS3 (or CS4?), and so far it seems to work in every version since then. Either way, it has no bearing on your nested styles, as these can only be character styles.

How to find Database, APPL_TOP and IANA character set on 11i?

Hi,
Could you please tell me how to find the database character set, APPL_TOP character set and IANA character set in an existing 11i environment?
This is required to pass the input during R12 upgrade.
Regards,
AV

Database:
SQL> select value
from V$NLS_PARAMETERS
where parameter='NLS_CHARACTERSET';
Application:
$ echo $NLS_LANG
IANA:
Check the value of "s_iana_cset" context variable in the context file or check the value of "ICX:Client IANA Encoding" profile option.
NLS Frequently Asked Questions [ID 399789.1]
Oracle Applications 11i Internationalization Guide [ID 333785.1]
How autconfig determines the value for Iana Charsets s_iana_cset value set in XML context file [ID 1380683.1]
Thanks,
Hussein

How to review implication of database character set change on PL/SQL code?

Hi,
We are converting a WE8ISO8859P1 Oracle database character set to AL32UTF8. Before the conversion, I want to check the implications for PL/SQL code that uses byte-based SQL functions.
Which points should I consider while checking the implications for PL/SQL code?
From searching Google I found three functions: SUBSTRB, LENGTHB and INSTRB. What do I check if these functions are used in PL/SQL code?
What do we check if the SUBSTR and LENGTH functions are used in PL/SQL code?
Which other functions should I check?
What do I check in PL/SQL if VARCHAR and CHAR type declarations exist in the code?
How do I check the implications of the database character set change to AL32UTF8 for byte-based SQL functions?
Thanks in Advance.
Regards,
Rashmi

There is no quick answer. Generally, the problem with PL/SQL code is that once you migrate from a single-byte character set (like WE8ISO8859P1) to a multibyte character set (like AL32UTF8), you can no longer assume that one character is one byte. Traditionally, column and PL/SQL variable lengths are expressed in bytes. Therefore, the same string of Western European accented letters may no longer fit into a column or variable after migration, as it may now be longer than the old limit (2 bytes per accented letter compared to 1 byte previously). Depending on how you dealt with column lengths during the migration, for example, if you migrated them to character length semantics, and depending on how relevant columns were declared (%TYPE vs explicit size), you may need to adjust maximum lengths of variables to accommodate longer strings.
The use of SUBSTR, INSTR, and LENGTH and their byte equivalents needs to be reviewed. You need to understand what the functions are used for. If the SUBSTR function is used to truncate a string to a maximum length of a variable, you may need to change it to SUBSTRB, if the variable's length constraint is still declared in bytes. However, if the variable's maximum length is now expressed in characters, SUBSTR needs to be used. However, if SUBSTR is used to extract a functional part of a string (e.g. during parsing), possibly based on result from INSTR, then you should use SUBSTR and INSTR independently of the database character set -- characters matter here, not bytes. On the other hand, if SUBSTR is used to extract a field in a SQL*Loader-like fixed-format input file (e.g. read with UTL_FILE), you may need to standardize on SUBSTRB to make sure that fields are extracted correctly based on defined byte boundaries.
As you see, there is no universal recipe for handling these functions. Each use needs to be reviewed and understood, and then you decide whether it is fine as-is or needs to be replaced with another form.
Thanks,
Sergiusz
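The character-versus-byte distinction described above can be sketched in a few lines. This Python is only an analogy under assumptions: latin-1 and utf-8 stand in for WE8ISO8859P1 and AL32UTF8, the helper names are hypothetical, and truncate_to_bytes only mimics the intent of a byte-based truncation, not the exact behaviour of SUBSTRB.

```python
def fits_in_bytes(text: str, limit: int, encoding: str = "utf-8") -> bool:
    """True if the encoded text fits into a limit expressed in bytes."""
    return len(text.encode(encoding)) <= limit


def truncate_to_bytes(text: str, limit: int, encoding: str = "utf-8") -> str:
    """Cut to at most `limit` bytes without leaving half a character behind."""
    raw = text.encode(encoding)[:limit]
    # errors="ignore" drops a trailing, incomplete multibyte sequence.
    return raw.decode(encoding, errors="ignore")


# Five accented letters: 5 bytes in latin-1, 10 bytes in UTF-8.
sample = "\u00e9\u00e8\u00e0\u00e7\u00f9"
print(len(sample))                          # length in characters
print(len(sample.encode("latin-1")))        # length in bytes before migration
print(len(sample.encode("utf-8")))          # length in bytes after migration
print(fits_in_bytes(sample, 5, "latin-1"))  # fits a 5-byte limit
print(fits_in_bytes(sample, 5, "utf-8"))    # no longer fits
```

The same five-character string that fits a 5-byte limit before the migration needs 10 bytes afterwards, which is why every length limit and every SUBSTR/SUBSTRB call has to be reviewed for whether it means bytes or characters.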

Build new database through scripts must understand spanish character sets.

Hello Gurus,
I need some simple advice, a good chance for some quick points for you.
I have never built a database to understand any other character set other than American English. I now have to build a database that will be used for Spanish characters- keyboards, etc. But I will be using English for the 11g software install. I only wish to be able to show Spanish characters in the data for customers names.
I will be creating the database with scripts I have made to make the standard template for database files, control files, etc.
Then I will be importing from a dump I have done that was made with American English character sets.
System is 11g (11.2.0.3.0) on Linux Enterprise Server 5.8.
I was thinking to use the AL32UTF8 character set, but I am unsure where to use it.
My original test did not show Spanish characters for customers' names like the 'tilda' or 'sueano' (pardon my spelling). But in this case I did not make an exception for Spanish; I only used the standard American English build (no changes in the init.ora file or initial database build script).
How can I adjust my parameter file for the initial creation of the database template so that it understands the Spanish character set and I can still import my dump file without error?
EXAMPLE of a build script:
CREATE DATABASE mynewdb
USER SYS IDENTIFIED BY sys_password
USER SYSTEM IDENTIFIED BY system_password
LOGFILE GROUP 1 ('/u01/app/oracle/oradata/mynewdb/redo01.log') SIZE 100M,
GROUP 2 ('/u01/app/oracle/oradata/mynewdb/redo02.log') SIZE 100M,
GROUP 3 ('/u01/app/oracle/oradata/mynewdb/redo03.log') SIZE 100M
MAXLOGFILES 5
MAXLOGMEMBERS 5
MAXLOGHISTORY 1
MAXDATAFILES 100
CHARACTER SET US7ASCII
NATIONAL CHARACTER SET AL16UTF16
If I change NATIONAL CHARACTER SET AL16UTF16 to AL32UTF8, will it work to show Spanish characters?
Sorry for the long winded question, any advice will be great.
Thankfully,
Shawn

Hello,
The national character set is for column types like NVARCHAR, not for normal VARCHAR data types. So if your dump file contains such column types, you will also need to set it. The (database) character set is for the normal column types like VARCHAR. Using Unicode is best practice if you use multiple languages, but keep in mind that a multibyte character set can be a problem during the import, because by default VARCHAR2(10) means 10 bytes and not 10 characters, so errors like "value too large for column" can occur during import.
You can create the database.
Check this documentation:
http://docs.oracle.com/cd/B28359_01/server.111/b28298/ch2charset.htm
You can use a character set like WE8MSWIN1252, which also covers Spanish (as far as I know) and is a superset of US7ASCII.
regards
Peter
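Whether a given character set can hold Spanish names at all can be checked quickly outside the database. The Python sketch below is an approximation under assumptions: the codecs ascii, cp1252 and utf-8 are used as stand-ins for US7ASCII, WE8MSWIN1252 and AL32UTF8, and can_encode is a hypothetical helper name.

```python
def can_encode(text: str, encoding: str) -> bool:
    """True if every character of text exists in the given character set."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# "Muñoz" with the n-tilde written as an escape.
name = "Mu\u00f1oz"
for codec in ("ascii", "cp1252", "utf-8"):
    print(codec, can_encode(name, codec), end=" ")
    if can_encode(name, codec):
        print(len(name.encode(codec)), "bytes")
    else:
        print("not representable")
```

The output also shows the byte-length point made above: the five-character name takes five bytes in the single-byte set but six in UTF-8.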

Find & replace part of a string in Numbers using do shell script in AppleScript

Hello,
I would like to set a search-pattern with a wildcard in Applescript to find - for example - the pattern 'Table 1::$*$4' for use in a 'Search & Replace script'
The dollar signs '$' seem to be a bit of a problem (they mark fixed references in Numbers and variables in the shell ...)
Could anyone hand me a solution to this problem?
The end-goal - for now - would be to change the reference to a row-number in a lot of cells (number '4' in the pattern above should finally be replaced by 5, 6, 7, ...)
Thx.

Hi,
Here's how to do that:
try
    tell application "Numbers" to tell front document to tell active sheet
        tell (first table whose selection range's class is range)
            set sr to selection range
            set f to text returned of (display dialog "Find this in selected cells in Numbers " default answer "" with title "Find-Replace Step 1" buttons {"Cancel", "Next"})
            if f = "" then return
            set r to text returned of (display dialog "Replace '" & f & "' with " default answer f with title "Find-Replace Step 2")
            set {f, r} to my escapeForSED(f, r) -- escape some chars, create back reference for sed
            set tc to count cells of sr
            tell sr to repeat with i from 1 to tc
                tell (cell i) to try
                    set oVal to formula
                    if oVal is not missing value then set value to (my find_replace(oVal, f, r))
                end try
            end repeat
        end tell
    end tell
on error number n
    if n = -128 then return
    display dialog "Did you select cells?" buttons {"cancel"} with title "Oops!"
end try
on find_replace(t, f, r)
    do shell script "/usr/bin/sed 's~" & f & "~" & r & "~g' <<< " & (quoted form of t)
end find_replace
on escapeForSED(f, r)
    set tid to text item delimiters
    set text item delimiters to "*" -- the wildcard
    set tc1 to count (text items of f)
    set tc2 to count (text items of r)
    set text item delimiters to tid
    if (tc1 - tc2) < 0 then
        display alert "The number of wildcard in the replacement string must be equal or less than the number of wildcard in the search string."
        error -128
    end if
    -- escape search string, and create back reference for each wildcard (the wildcard is a dot in sed) --> \$.\$
    set f to do shell script "/usr/bin/sed -e 's/[]~$.^|[]/\\\\&/g;s/\\*/\\\$.\\\$/g' <<<" & quoted form of f
    -- escape the replacement string, Perl replace wildcard by two backslash and an incremented integer, to get the back reference --> \\1 \\2
    return {f, (do shell script "/usr/bin/sed -e 's/[]~$.^|[]/\\\\&/g' | /usr/bin/perl -pe '$n=1;s/\\*/\"\\\\\" . $n++/ge'<<<" & (quoted form of r))}
end escapeForSED
For what you want to do, the wildcard must be in the same position in both strings --> find "Table 1::$*$3", replace "Table 1::$*$4".
Important: you can use no wildcard in either string (search and replacement), or you can use any number of wildcards in the search string with no wildcard in the replacement string.
However, the number of wildcards in the replacement string must be less than or equal to the number of wildcards in the search string.
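The wildcard rules above (each "*" in the search string is captured and can be reused at the same position in the replacement) can be sketched without sed. This Python is an illustration, not a port of the script: wildcard_replace is a hypothetical name, and it assumes, as the script's comments suggest, that each wildcard stands for exactly one character.

```python
import re


def wildcard_replace(text: str, find: str, replace: str) -> str:
    """Replace `find` with `replace`, where '*' matches one character
    and the n-th '*' in `replace` reuses the n-th match from `find`."""
    find_parts = find.split("*")
    repl_parts = replace.split("*")
    if len(repl_parts) > len(find_parts):
        raise ValueError("replacement has more wildcards than the search string")
    # Escape the literal pieces (so '$' is not special) and turn each '*'
    # into a capture group that matches exactly one character.
    pattern = "(.)".join(re.escape(part) for part in find_parts)
    # Build the replacement template: literal text plus \g<n> back references.
    template = repl_parts[0].replace("\\", "\\\\")
    for n, part in enumerate(repl_parts[1:], start=1):
        template += f"\\g<{n}>" + part.replace("\\", "\\\\")
    return re.sub(pattern, template, text)


formula = "=Table 1::$B$3+Table 1::$C$3"
print(wildcard_replace(formula, "Table 1::$*$3", "Table 1::$*$4"))
```

Escaping the literal parts first is what keeps the dollar signs from being interpreted, which is the same job escapeForSED does for the shell version.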

JNLS - National Character Sets

When I open the Oracle Database Configuration Assistant, the following error is returned:
JNLS Exception: Oracle.ntpg.jnls.JNLS Exception
Unable to find any National Character Sets. Please check your Oracle installation.
How can I fix this problem with the following configuration: Linux Slackware 7.0, KDE, Oracle 8i EE?

Hi,
I am having the same problem you had two years ago. Could you please let me know whether you found a solution and, if so, how?
Thank you very much.
Sincerely,
Simon.

Oracle 8i us7ascii character set problem - help required urgent.

Hi frnds,
I have an Oracle 8i database server installed on Sun Solaris. The database character set is US7ASCII. In one of the tables, TIFF images are stored in a LONG column. I'm trying to fetch these images using the Oracle 9i client and Visual Basic (Oracle ODBC drivers), but I'm unable to do so: I cannot fetch special characters.
Is it because of the character set? When I run my code on the server itself, I am able to fetch the images. I also tried to fetch the images using the Oracle 8i client on a Windows XP machine, but could not. Are there any special settings that I have to make on the client side?

Indeed, it's an ODBC issue. Read this statement from Oracle:
From ODBC 8.1.7.2.0 drivers onwards it's NOT possible any more to
"disable" Characterset conversion by specifying for the NLS_LANG
the same characterset as the database characterset. There is now
ALWAYS a check to see if a codepoint is valid for that characterset.
Typically you will encounter problems if you upgrade an environment
that has NO NLS_LANG set on the client (or US7ASCII) and the database
was also US7ASCII. This incorrect setup allowed you to store characters
like èçàé in an US7ASCII database, with the new 8i drivers this is not possible
any more.
The basic problem is the 'wrong' character set US7ASCII in the database. As long as no character set conversion happens (which is the case on the Unix server), special characters are no problem.
Werner
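Why a validity check against a 7-bit character set is fatal for binary data can be shown with a rough analogy. The Python sketch below is not what the ODBC driver does internally; it only mimics "every byte must be a valid US7ASCII code point" by round-tripping bytes through the ascii codec, and simulate_7bit_conversion is a hypothetical name.

```python
def simulate_7bit_conversion(data: bytes) -> bytes:
    """Round-trip bytes through ASCII, replacing anything above 0x7F."""
    return data.decode("ascii", errors="replace").encode("ascii", errors="replace")


# First bytes of a made-up TIFF-like payload: "II*\0" followed by two high bytes.
payload = b"II*\x00\xe8\xff"
converted = simulate_7bit_conversion(payload)
print(payload)
print(converted)             # the two high bytes have been replaced
print(payload == converted)  # the image data no longer matches
```

Any byte above hex 7F is lost in such a conversion, so image data only survives on a path where no conversion or validation is applied, as on the server itself.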

Oracle character set problem - help reqed urgent !!

Hello frnds,
I have an Oracle 8i database server installed on Sun Solaris. The database character set is US7ASCII. In one of the tables, TIFF images are stored in a LONG column. I'm trying to fetch these images using the Oracle 9i client and Visual Basic (Oracle ODBC drivers), but I'm unable to do so: I cannot fetch special characters.
Is it because of the character set? When I run my code on the server itself, I am able to fetch the images. I also tried to fetch the images using the Oracle 8i client on a Windows XP machine, but could not. Are there any special settings that I have to make on the client side?

i run my code on the server itself, i m able to fetch
the images. I tried to fetch the images using oracle
8 i client on windows XP machine but could not do so.
You are able to fetch the image, so it is not because of the character set.
The first thing to consider is to use only a certified combination of OS, client and database server. Check Certify - Oracle's Certification Matrices.
Virag
