Sorting with Diacritic characters

Hi,
While implementing sorting in Endeca search, we came across a scenario where the sort has to account for diacritic characters. Taking Polish as an example, product names can start with characters such as *"A"* and *"Ą"*. When we sort by product name (A-Z sort), the records starting with *"Ą"* are returned at the end. The expected behavior is that they follow directly after the names starting with *"A"*. Please let me know if anyone has come across similar behavior and how it was fixed.
The Endeca version in use is 6.1.3 and the language is Polish.

You need to add --lang pl-u-co-standard to the dgidx and dgraph components (in ./config/script/AppConfig.xml).  By default Endeca sorts using endeca collation which "sorts text with lower case before upper case and does not account for character accents and punctuation."  Standard collation sorts data "according to the International Components for Unicode (ICU) standard for the language you specify".  See http://docs.oracle.com/cd/E35641_01/MDEX.621/pdf/AdvDevGuide.pdf , Chapter "Using Internationalized Data" for further details. 
If by 6.1.3 you mean MDEX 6.1.3 (as opposed to Platform Services), I'm not sure this sorting was available then; you would need to check the chapter listed above in the MDEX 6.1.3 documentation.
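
To make the difference between the two orderings concrete, here is a minimal Python sketch. It is not Endeca's or ICU's implementation: the hand-written alphabet string and the `polish_key` helper are assumptions made purely for illustration, and a real deployment would rely on the collation built into the engine.

```python
# Illustration only: a hand-rolled Polish ordering, NOT what Endeca or ICU runs.
# Letters with diacritics are placed right after their base letter; q, v and x
# are included because they occur in loanwords.
POLISH_ALPHABET = "aąbcćdeęfghijklłmnńoópqrsśtuvwxyzźż"
RANK = {ch: i for i, ch in enumerate(POLISH_ALPHABET)}


def polish_key(name):
    """Build a sort key that follows the alphabet above (case-insensitive)."""
    key = []
    for ch in name.lower():
        if ch in RANK:
            key.append((1, RANK[ch]))   # known letter: use its alphabet position
        else:
            key.append((0, ord(ch)))    # anything else sorts before letters
    return key


names = ["Zebra", "Ącki", "Adam", "Bajka", "Ćma", "Cena"]

by_code_point = sorted(names)                 # plain code point comparison
by_alphabet = sorted(names, key=polish_key)   # language-aware ordering

print(by_code_point)  # names starting with Ą and Ć end up after Zebra
print(by_alphabet)    # Ącki follows Adam, Ćma follows Cena
```

A plain code point sort pushes "Ą" behind "Z" because its code point is higher, which is the symptom described in the question; a language-aware key puts it directly after "A".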

Similar Messages

  • Error while crawling URL containing diacritic characters

    Hi,
    I have a content source in SharePoint 2013 that is showing errors while trying to crawl links with diacritic characters (Portuguese words). The reason is that the crawler regards the URL as invalid.
    The problem still occurs if the link URL is encoded (see example 2).
    Examples:
    1) Atualização 037 de 16-4-2008.htm
    2) Atualiza%E7%E3o%20037%20de%2016-4-2008.htm
    Log message:
    The item could not be accessed on the remote server because its address has an invalid syntax.
    I already tried to save the home page (which contains the links) as UTF-8, UTF-8 without BOM, and ANSI.
    Also, I tried to include a meta charset tag:
    <meta charset="UTF-8">
    in addition to the first line with:
    <?xml version="1.0" encoding="UTF-8"?>
    All of these attempts were unsuccessful. Has anyone found a solution for this problem?
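
One detail worth checking: the encoded link in example 2 uses Latin-1 byte values (%E7, %E3), whereas URLs are normally percent-encoded from UTF-8 bytes. The Python sketch below only shows the two encodings side by side; whether the SharePoint crawler accepts the UTF-8 form is an assumption that this thread does not confirm.

```python
from urllib.parse import quote, unquote

name = "Atualização 037 de 16-4-2008.htm"

# Percent-encoding from UTF-8 bytes (the default used by urllib).
utf8_form = quote(name)

# Percent-encoding from Latin-1 bytes, which reproduces example 2 above.
latin1_form = quote(name, encoding="latin-1")

print(utf8_form)
print(latin1_form)

# Decoding the UTF-8 form with the default settings restores the original name.
print(unquote(utf8_form) == name)
```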

    Hi,
    Just checking in to see if the information was helpful. Please let us know if you would like further assistance.
    Have a great day!
    Best Regards,
    Lisa Chen
    TechNet Community Support
    Please remember to mark the replies as answers if they help, and unmark the answers if they provide no help. If you have feedback for TechNet Support, contact
    [email protected]

  • Sorting of contacts with foreign characters in the name

    Hello,
    I have a large address book, some with Japanese characters in their name.
    I use the English language setting for the OS on my iPhone and iPad.
    All the contacts with Japanese characters in their name seem to get pushed to the # section of the alphabet rather than at the equivalent roman alphabet letter.
    Also note that adding the Japanese name in the secondary fields of "phonetic first" and "phonetic second" of an existing Roman English based contact sends this to the # section.
    This is poor sorting of contacts. 
    Very poor....

    After posting this I found a solution.  I hadn't tried it this way round.
    https://discussions.apple.com/message/7589528#7589528
    Go into one of your Japanese contacts and click edit.
    Go right to the bottom of the edit page and add field, add phonetic first name, then repeat the step to add phonetic last name.
    Now edit the name of your contact, keep their name in Japanese and enter the phonetic spelling in English.
    Smashin....  :-)

  • When I send an email with Greek characters in the body, the recipient receives it in an unreadable form.

    When I send an email with Greek characters in the body, many recipients (not all) cannot read it.
    At the same time, when I use the internet mail, it can be read successfully by the recipient.
    I have already checked that the encoding setting under Fonts is "Unicode (UTF-8)".
    What else can I check?
    Thanks in advance.
    e.g:
    From: Eleni Kontomari [email address removed by moderator Andrew]
    Sent: Thursday, February 27, 2014 12:33 PM
    To: "Nikos Totsios (Ηλεκτρονική διεύθυνση)"; "Giannis Diokarantos (Ηλεκτρονική διεύθυνση)"; "Dimitris Papadopoulos (Ηλεκτρονική διεύθυνση)"; Vassilis Gounaris; "Vassos Efthymiadis (Ηλεκτρονική διεύθυνση)"; Νικηφόρος Κεκρίδης; Ιωάννης Αθανασόπουλος; "Alexis Katsivas (Ηλεκτρονική διεύθυνση)"; Μιχάλης Παπαοικονόμου; Fomesa Hellas; "Χρήστος Σπηλιάδης (Ηλεκτρονική διεύθυνση)"; Δημήτρης Μπενάκης; "Ν. Γαλάνης"; [email protected]; "Αποστόλης Σαμούδης (Ηλεκτρονική διεύθυνση)"; "Β. Ντουρτόγλου (Ηλεκτρονική διεύθυνση)"; [email protected]; [email protected]; "Φοίβη Λεγάκι (Ηλεκτρονική διεύθυνση)"; "Παναγιώτης Κουμεντάκος (Ηλεκτρονική διεύθυνση)"; Σπύρος Ζαφείρης; Hans- Joachim Henn; "Κώστας Αλεξανδρόπουλος (Ηλεκτρονική διεύθυνση)"; [email protected]; [email protected]
    Subject: ΠΡΟΣΚΛΗΣΗ ΤΑΚΤΙΚΗΣ ΓΕΝΙΚΗΣ ΣΥΝΕΛΕΥΣΗΣ ΕΣΥΦ
    ... 13 2014.
    ''Please read [[Forum rules and guidelines]] when posting a question in a public forum''

    There are some language add-ons that support emails in other languages that you can check out: [https://addons.mozilla.org/en-us/thunderbird/extensions/language-support/?sort=popular]
    The recipient, if they also use Thunderbird, may need to have a language pack to read the email: [https://addons.mozilla.org/en-US/thunderbird/language-tools/]
    You may also need to have them check their interpreter to make sure the email is being received in the same format it is being sent.

  • Problem with accented characters in XLIFF file (no problem anymore)

    Sorry, the following turned out to be a non-problem. The XLIFF editor understands accented characters in the translation file very well :o)
    I am trying to translate an APEX v 2.2 application from French to German.
    Generated xlf file header reads:
    <file original="f101_102_fr-ch_de.xlf" source-language="fr-ch" target-language="de" datatype="html">
    All French diacritical characters in the file are replaced by something like é. Something similar happens when one exports a CSV file to Excel with Automatic CSV Encoding set to "No".
    The application main language can be Swiss French (fr-ch) or French French (fr), with the same result.
    Application Language Derived From is "Use application primary language"
    Automatic CSV Encoding is set to "Yes".
    Is there a possibility to export non-english characters correctly?
    Thank you.
    Message was edited by: kortchnoi
    kortchnoi

    Well... In a way, it was my fault. I created the XLIFF file and looked at its contents using the CodeWright text editor. Using CW's regular expressions and macros, one could process the translation fairly easily.
    However, in this file foreign characters are scrambled (=> my post).
    Then I decided to install and try an XLIFF editor. The one recommended by SUN handled all diacritical characters nicely.
    So, as long as I don't translate the xlf file directly, but go through the XLIFF editor, I don't have to worry about French and German accents. At least, I hope so (I still have to import the German translation to be sure).
    Igor
    Message was edited by: I. Kortchnoi
    kortchnoi

  • Special (diacritical) characters in Spotlight

    When I have English set as "iPhone Language" and I type "tomas" in Spotlight search, I get search results including "Tomáš" (a name which includes special characters with diacritical marks). If I switch iPhone Language to Czech, I get no search results for "tomas" and have to type "tomáš" to find Tomáš, which is not very convenient.
    Is there any way to make it work (insensitive to diacritical marks) when I switch iPhone Language to Czech as well? It's the current iOS 8.2.
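
For reference, the diacritic-insensitive matching being asked for is usually done by folding both the query and the candidate before comparing them. The Python sketch below shows that general technique only; it says nothing about how Spotlight works internally, and `fold` and `matches` are hypothetical helper names.

```python
import unicodedata


def fold(text):
    """Strip combining marks and case so that 'Tomáš' and 'tomas' compare equal."""
    decomposed = unicodedata.normalize("NFD", text)   # á becomes a + combining acute
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.casefold()


def matches(query, candidate):
    """Diacritic- and case-insensitive substring match."""
    return fold(query) in fold(candidate)


print(fold("Tomáš"))
print(matches("tomas", "Tomáš Novák"))
```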

    environment
    Oracle 10g R2 x86 10.2.0.4 on RHEL4U8 x86.
    db NLS_CHARACTERSET WE8ISO8859P1
    After following the following note:
    Changing US7ASCII or WE8ISO8859P1 to WE8MSWIN1252 [ID 555823.1]
    the nls_charset was changed:
    Database character set WE8ISO8859P1
    FROMCHAR WE8ISO8859P1
    TOCHAR WE8MSWIN1252
    And the error:
    ORA-31011: XML parsing failed
    ORA-19202: Error occurred in XML processing
    LPX-00217: invalid character 8217 (U+2019)
    was no longer generated.
    A Unicode database charset was not required in this case.
    hth.
    Paul
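
The reason the character set change helps can be checked outside the database. WE8ISO8859P1 corresponds to ISO-8859-1 and WE8MSWIN1252 to Windows-1252; the Python sketch below simply tests whether U+2019 can be encoded in each. The `can_encode` helper is a hypothetical name used for illustration.

```python
# U+2019 RIGHT SINGLE QUOTATION MARK, the character named in LPX-00217 above.
apostrophe = "\u2019"


def can_encode(text, codec):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


in_latin1 = can_encode(apostrophe, "iso-8859-1")   # counterpart of WE8ISO8859P1
in_cp1252 = can_encode(apostrophe, "cp1252")       # counterpart of WE8MSWIN1252

print(in_latin1, in_cp1252)
```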

  • A simple question about wrong sorting with multiple sort columns in Excel 2010

    Hi, everyone! I have encountered a problem that I don't know how to explain.
    So I post it here because I don't know if there is another more relevant forum...
    I have a data sheet with the students' scores for the course. 
    All the data were generated with the randbetween function,
    and pasted with the values.
    To rank the students by their performance,
    I did the sort with the column "total score" as the first sort-column
    and "final term" as the second.
    The weird thing to me is that the order of the data seems problematic.
    That is, all the rows are sorted correctly with the first sort-column.
    But for the rows with the same values of the first sort-column,
    there are some rows whose values of the second sort-column are out of order.
    (please look at the data file at
    www_dot_kuaipan_dot_cn/file/id_67204268108546068_dot_htm
    Please change the "_dot_" to the real literal dot.
    Especially the rows with 56.7 as the first sort-column value
    and some other values near the tail of the list.)
    I tried to manually input and sort the same values of both columns
    in a near-by region. The result was correct.
    When a friend copied all the data into Notepad
    and loaded it back into Excel, the problem disappeared.
    Another friend wrapped a ROUND function around the values of the first sort-column,
    and the sorting order became correct!
    But they could not explain why either.
    How confusing! I even tried swapping the first and second sort-columns;
    the output was right.
    All the data were generated by randbetween function and pasted with values.
    Where could all the special characters, if any, come?
    Can anyone give me an explanation? I would be very grateful.
    Thanks in advance!

    Re:  Sort is not in proper order
    Sounds as if the data includes spaces or hidden characters that are affecting the sort.
    That is indicated by the fact that manually entering the data resolves the problem.
    Note:  You can use a public file storage website to hold your file and add the link to it in your post.
    Jim Cone
    Portland, Oregon USA
    Special Sort excel add-in (30+ ways to sort) - 3 week no obligation trial
    https://jumpshare.com/b/O5FC6LaBQ6U3UPXjOmX2
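
Another possible explanation, consistent with the observation that wrapping ROUND around the first column fixes the order, is floating-point error: two totals that display identically can differ in their last binary digits, so they are not treated as ties. This is an assumption, not something the thread confirms. The Python sketch below reproduces the effect with made-up rows.

```python
# Two totals that both display as 0.3 but are not equal as binary floats.
rows = [
    ("Ann", 0.3, 90),         # (student, total score, final term)
    ("Bob", 0.1 + 0.2, 50),   # 0.1 + 0.2 is slightly larger than 0.3
]


def rank(rows, digits=None):
    """Sort descending by total, then by final term; optionally round the total."""
    def key(row):
        total = row[1] if digits is None else round(row[1], digits)
        return (-total, -row[2])
    return [row[0] for row in sorted(rows, key=key)]


raw = rank(rows)                 # Bob first: his total is a hair larger
rounded = rank(rows, digits=1)   # totals tie, so final term decides: Ann first

print(raw, rounded)
```

Copying the values through Notepad would have a similar effect to rounding, because only the displayed digits survive the round trip.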

  • JMF can't play file with Cyrillic characters

    Fairly self-explanatory: JMF fails to load files or directories with Cyrillic characters. It throws the following exception:
    javax.media.NoPlayerException: Cannot find a Player for :file:///C:\Documents and Settings\drdanielfc\Desktop\DATscape\The Mad Heads\????? ?\06 ?????.wma
         at javax.media.Manager.createPlayerForContent(Manager.java:1412)
         at javax.media.Manager.createPlayer(Manager.java:417)
         at tune.in.music.MusicVideoPlayer.<init>(MusicVideoPlayer.java:56)
         at tune.in.components.library.Library$1.valueChanged(Library.java:52)
         at javax.swing.DefaultListSelectionModel.fireValueChanged(Unknown Source)
         at javax.swing.DefaultListSelectionModel.fireValueChanged(Unknown Source)
         at javax.swing.DefaultListSelectionModel.setValueIsAdjusting(Unknown Source)
         at javax.swing.plaf.basic.BasicTableUI$Handler.setValueIsAdjusting(Unknown Source)
         at javax.swing.plaf.basic.BasicTableUI$Handler.mouseReleased(Unknown Source)
         at java.awt.AWTEventMulticaster.mouseReleased(Unknown Source)
         at java.awt.Component.processMouseEvent(Unknown Source)
         at javax.swing.JComponent.processMouseEvent(Unknown Source)
         at java.awt.Component.processEvent(Unknown Source)
         at java.awt.Container.processEvent(Unknown Source)
         at java.awt.Component.dispatchEventImpl(Unknown Source)
         at java.awt.Container.dispatchEventImpl(Unknown Source)
         at java.awt.Component.dispatchEvent(Unknown Source)
         at java.awt.LightweightDispatcher.retargetMouseEvent(Unknown Source)
         at java.awt.LightweightDispatcher.processMouseEvent(Unknown Source)
         at java.awt.LightweightDispatcher.dispatchEvent(Unknown Source)
         at java.awt.Container.dispatchEventImpl(Unknown Source)
         at java.awt.Window.dispatchEventImpl(Unknown Source)
         at java.awt.Component.dispatchEvent(Unknown Source)
         at java.awt.EventQueue.dispatchEvent(Unknown Source)
         at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source)
         at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
         at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
         at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
         at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
         at java.awt.EventDispatchThread.run(Unknown Source)
    The actual file is located at "C:\Documents and Settings\drdanielfc\Desktop\DATscape\The Mad Heads\НАДІЯ Є\06 Гроші.wma". JMF appears to turn all of the characters into ? marks as it cannot understand them. Is there some sort of workaround without renaming the files?
    Thanks ahead of time

    The actual file is located at "C:\Documents and Settings\drdanielfc\Desktop\DATscape\The Mad Heads\НАДІЯ Є\06 Гроші.wma".
    JMF appears to turn all of the characters into ? marks as it cannot understand them. Is there some sort of workaround without renaming the files?
    Well, a few things... First off, JMF won't play your WMA file anyway, as it doesn't support WMA files.
    But you might try reversing the direction of the slashes, i.e.:
    MediaLocator ml = new MediaLocator("file:/C:/Documents and Settings/drdanielfc/Desktop/DATscape/The Mad Heads/НАДІЯ Є/06 Гроші.wma");
    You might also try constructing your MediaLocator from a URL instead of a String, i.e.:
    URL url = new File("C:\\Documents and Settings\\drdanielfc\\Desktop\\DATscape\\The Mad Heads\\НАДІЯ Є\\06 Гроші.wma").toURI().toURL(); // needs java.io.File and java.net.URL
    MediaLocator ml = new MediaLocator(url);
    If those two suggestions don't work, nothing will. Also, test them with a WAV file, since WMA isn't supported ;-)

  • Importing music with international characters

    I want to import some old music with international characters in the filenames (and probably in the ID3 tags) into iTunes.
    I want to use the music in my car's MP3 player, which uses MS-DOS and can't handle international characters. How can I get iTunes to sort this out for me when I import? Can I get iTunes to convert the French characters into MS-DOS-suitable ones?

    Perhaps something at Doug's Scripts would help, e.g. http://dougscripts.com/itunes/scripts/ss.php?sp=tracknameeditwithsed although doing it one character at a time might get a little dull. I've not spotted a "replace all non-ASCII characters" script, but it shouldn't be too hard to create one.
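
As a rough idea of what such a "replace all non-ASCII characters" step could look like, here is a Python sketch that works on a name string. It is not an iTunes script; `to_ascii` and the `EXTRA` table are hypothetical, and MS-DOS limits such as 8.3 name length are not handled.

```python
import unicodedata

# Letters that have no base-letter decomposition need an explicit replacement.
EXTRA = {"œ": "oe", "Œ": "OE", "æ": "ae", "Æ": "AE", "ß": "ss"}


def to_ascii(name):
    """Replace accented letters with their unaccented ASCII counterparts."""
    name = "".join(EXTRA.get(ch, ch) for ch in name)
    decomposed = unicodedata.normalize("NFKD", name)   # é becomes e + accent
    # Dropping everything outside ASCII removes the detached accents
    # (and any character that has no ASCII fallback at all).
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("Édith Piaf - Hymne à l'amour.mp3"))
print(to_ascii("Cœur de pirate.mp3"))
```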
    tt2

  • Problem with some characters in complex objects

    Hi all,
    I've built a web service which returns a complex object with several fields inside. All fields are public and accessible via getter and setter methods.
    The problem is that some of these fields contain numbers or underscores in their names.
    For example:
    public int field_a;
    or
    public String house3of4;
    When I try to import this web service as a model in a Web Dynpro project, it doesn't work until I remove these characters.
    Is this a known problem or is there any solution for it?
    Thanks
    Thomas

    NLS_LANG in the registry is "ARABIC_UNITED ARAB EMIRATES.AR8MSWIN1256".
    I use Oracle Forms 10g for development
    and Oracle 9i for the database.
    When I build a form on the client side and add text with Farsi characters, then run the form, all characters are shown correctly in Farsi except four (گ چ ژ پ).

  • CRVS2010 Beta - Cannot export report to PDF with unicode characters

    My report has some unicode data (Chinese), it can be previewed properly in the windows form report viewer. However, if I export the report document to PDF file, the unicode characters in exported file are all displayed as a square.
    Crystal Reports 2008 R2 can export the Chinese characters to PDF when I select a Chinese font in the report. But the VS2010 beta cannot export the Chinese characters even when a Chinese font is selected.

    Barry, what is the specific font you are using?
    The below is a reformatted response from Program Management:
    Using non-Chinese font with Unicode characters (Chinese) the issue is reproducible when using Arial font in Unicode characters field. After changing the Unicode character to Simsun (A Chinese font named 宋体 in report), the problem is solved in Cortez and CR both.
    Ludek

  • LoadUserProfile() creates a profile with Chinese characters on a remote system

    Hi,
    I'm working on an application where LoadUserProfile() is being used to remotely load a user profile on a machine. The token being passed to LoadUserProfile() is obtained from LogonUser(). 
    When doing this with a Domain Admin user that was added in Active Directory (and only then), it creates a profile with Chinese characters in the C:\Users\ folder of the remote machine. Note that this happens only when logging in for the first time with this Domain Admin account remotely on that machine.
         // code:
          PROFILEINFO pi;
          memset((void *) &pi, 0, sizeof(PROFILEINFO));
          pi.dwSize = sizeof(PROFILEINFO);
          pi.dwFlags = PI_NOUI;
          pi.lpUserName = (TCHAR *)strUser;   //strUser is the User name, and it shows correctly here when debugging
          if (LoadUserProfile(hToken, &pi))
    //It is actually successful, and comes here when debugging.
    Although the name shows up correctly when debugging (remotely), why is it creating a profile with Chinese characters on the remote machine? 
    TIA,
    Jy
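
One common way to end up with Chinese-looking names, offered here only as a possibility: if strUser is a narrow (char-based) string and the project is compiled with UNICODE, the cast to TCHAR * makes the API read the bytes as UTF-16. The thread does not confirm that this is the cause. The Python sketch below shows what that reinterpretation does to an ASCII name; the sample name is made up.

```python
# Assumption for illustration: an 8-byte ASCII user name held in a char buffer.
narrow = "Administ".encode("ascii")

# Reading the same bytes as UTF-16LE pairs them up two at a time,
# which is what happens when a char* is cast to wchar_t* without conversion.
misread = narrow.decode("utf-16-le")

# Every resulting code unit lands in the CJK Unified Ideographs block.
all_cjk = all(0x4E00 <= ord(ch) <= 0x9FFF for ch in misread)

print(len(misread), [hex(ord(ch)) for ch in misread], all_cjk)
```

If this turns out to be the situation, converting the name to a wide string (rather than casting the pointer) before assigning it to lpUserName would be the thing to try.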

    CreateProfile won't load the profile. You need to use LoadUserProfile to load the profile, and you need to query for a roaming profile path to put in the lpProfileInfo parameter if you want to include that as well. You need a token for a user to call LoadUserProfile, but not a profile handle. LoadUserProfile will populate that for you before it returns if it was successful. See this excerpt from https://msdn.microsoft.com/en-us/library/windows/desktop/bb762281%28v=vs.85%29.aspx:
    Upon successful return, the hProfile member of PROFILEINFO is a registry key handle opened to the root of the user's hive. It has been opened with full access (KEY_ALL_ACCESS). If a service that is impersonating a user needs to read or write to the user's registry file, use this handle instead of HKEY_CURRENT_USER.
    Do not close the hProfile handle. Instead, pass it to the UnloadUserProfile function. This function closes the handle. You should ensure that all handles to keys in the user's registry hive are closed. If you do not close all open registry handles, the user's profile fails to unload. For more information, see Registry Key Security and Access Rights and Registry Hives.
    WinSDK Support Team Blog: http://blogs.msdn.com/b/winsdk/

  • Error in PSA with special characters

    Hello,
    I'm having a problem with special characters in the PRD system.
    This happens with the standard extractor for Confirmations in the SRM module, 0BBP_TD_CONF_1.
    I have 5 records with errors in the PSA. The field that has the error is 0bbp_delref and it contains the text: “GUIA ENTREGA Nº2”.
    The error message says that field 0BBP_DELREF has special characters that are not supported by BW.
    The field is char 16 and in spro I have the characters:
    QWERTYUIOPASDFGHJKLǪNºZXCVBM/~^´`* '()"#$%&!?.-0123456789«»<>
    In QUA, I created confirmations with this text and it worked fine and the characters in SPRO are the same.
    Any idea?
    Thanks and best regards,
    Maria

    Hi Maria,
    There will be no issue with the update rules, as the issue is with the data.
    ALL_CAPITAL should resolve your issue; otherwise, try executing those special characters individually in RSKC.
    Have you tried in SPRO?
    Check the table RSALLOWEDCHAR in SE11 to find which characters are allowed (permitted) in the PRD system.
    Please go through the above-mentioned blog once.
    Thanks,
    Sudhakar.

  • Create a flat file with multiple characters for enclosures

    Hello,
    we use OWB 11g2 (11.2.02).
    Now we are trying to create a flat file with multiple characters for enclosures. The manual says:
    "Enclosures (Left and Right): Some delimited files contain enclosures that denote text strings within a field. If the file contains enclosures, enter an enclosure character in the text box or select one from the list. The list displays common enclosures. However, you may enter any character. The default for both the left and right enclosure is the double quotation mark ("). You can specify multiple characters and hexadecimal characters as field enclosures."
    But it does not work. OWB uses the first character of the left enclosure definition as the left enclosure and the second one as the right enclosure !?!
    Does anyone know this behavior? Is there a solution for this problem?
    Thanks and regards
    Norbert

    HI Raghu,
               Use the function module 'GUI_UPLOAD'.
               In that you have to specify the field_separator value = 'X' in export section.
    Regards,
    S.C.K

  • Profit Centers(CEPCT) selection with Wild Characters(*) condition

    Hi, I'm trying to select profit centers from the CEPCT table for the given profit center parameter. Here I'm trying to use wildcard characters.
    for example: 1.
    * check if wild character exists
        FIND c_st IN p_i_profit_ctr.
        IF sy-subrc EQ 0.
    * replace * with %
          REPLACE ALL OCCURRENCES OF c_st IN p_i_profit_ctr WITH c_pr.
    * get profit center list for given pattern
          SELECT prctr  "Profit Center
                 ktext  "General Name
              FROM cepct INTO TABLE t_profit_ctr
            WHERE spras = c_en              "english language
            AND   prctr LIKE p_i_profit_ctr.   "profit center
        ENDIF.
    CEPCT table data is:
    Profit Center
    0000001000  
    0000002000  
    CORPORATE   
    (1) When I try with C*, I get the data properly.
    (2) When I try with 1, I get it properly.
    (3) When I try with 1*, I get no data even though 1000 is available. This is because of the leading zeroes in the table.
    Could anyone please help me write a query to retrieve the above? (Note: we cannot be sure how many zeroes need to be included, and I don't think it is good practice to add zeroes to the pattern, because it would not work for cases like 200/2000 and would not be good code for character values.)
    thanks in advance.
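
The matching logic being asked for can be sketched independently of ABAP: strip the leading zeroes from the stored key before applying the wildcard pattern. The Python sketch below uses the three sample rows from the post; `select` is a hypothetical helper, and doing the equivalent inside the database query is a separate question that this sketch does not answer.

```python
from fnmatch import fnmatchcase

# Sample keys from the post, stored with leading zeroes.
PROFIT_CENTERS = ["0000001000", "0000002000", "CORPORATE"]


def select(pattern, centers=PROFIT_CENTERS):
    """Match the pattern against each key with its leading zeroes removed."""
    return [c for c in centers if fnmatchcase(c.lstrip("0"), pattern)]


print(select("1*"))   # matches 0000001000 without knowing the zero count
print(select("C*"))   # alphabetic keys are unaffected
```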

    Hi,
    It's better to specify a range for the profit center in the selection screen and then use it in your query as follows:
    Example :
    ranges r_profcent FOR GLPCA-RPRCTR.
    Query to fetch Range:
    SELECT PRCTR INTO R_PROFCENT-LOW
      FROM CEPC
      WHERE ( PRCTR BETWEEN '0000001111' AND '0000009999'
        OR PRCTR = 'DUMMY' )
        AND KOKRS = '1000'.
        R_PROFCENT-SIGN = 'I'.
        R_PROFCENT-OPTION = 'EQ'.
        APPEND R_PROFCENT.
    ENDSELECT.
    Use of range in Query:
      SELECT  REFDOCNR RPRCTR BLDAT  BUDAT
      FROM GLPCA CLIENT SPECIFIED
      APPENDING CORRESPONDING FIELDS OF TABLE ITAB1
      WHERE RCLNT = SY-MANDT
        AND RPRCTR IN R_PROFCENT
        AND RACCT IN GLACNO.
    Hope it could help you out.
    Regds,
    Anil
