Document Indexing

When I search with Google, the results include URLs that were not only registered with the search engine under some keywords but whose complete document content has also been indexed. (I hope I am correct; at least that is what I have seen during some searches.)
Can I achieve the same type of indexing on documents in iFS? If yes, does it work on all types of documents?

Sastry--
Yes, definitely, Oracle iFS can do this. interMedia Text provides text indexing of HTML and 150+ other formats. The 150+ formats cover everything that you might want to index, including HTML, XML, the Microsoft Office document formats, PDF, and, as we product marketing types like to say, "and so much more!"

Similar Messages

  • Can't open my IBA project: error parsing document index: invalid character in attribute value

    Hi guys,
    I am new to iBooks Author and I am not HTML savvy (except for the very basics). I have to submit an iBook for a university exam and I can't open my file anymore.
    I have been working on it since Saturday. I saved many times, then quit iBooks Author and rebooted my iMac a few moments ago. There were no crashes.
    The project is an iBook with pictures, videos and some Tumult Hype animations. It is 248 MB.
    Now I can't open the file (but I can see it in Quick Look). The error message is in the subject, but here it is again:
    "Progetto iBooks" couldn't be opened. Error parsing document index: invalid character in attribute value.
    I have to send the project today (here in Italy it is now 7:10 am), so I am kind of desperate.
    Any suggestions?
    I am already working on making a new iBook, but if someone could give me a workaround to fix the file and open it in iBooks Author I would be extremely grateful.
    Best regards,
    Sebastiano

    Hi Sebastiano,
    Did you ever get an answer? I have the same problem and now have to work from an old version again (good thing I had made a backup manually!)
    Thx
    JP.

  • Error parsing document index: invalid character in attribute value

    Hi,
    I am working on an eBook with iBooks Author. Once in a while, when I reopen my eBook a few hours or days after I finish working, I get the error:
    'title' could not be opened
    Error parsing document index: invalid character in attribute value
    I then have to go back to an old backup, which I always make before quitting.
    What causes this error message, and is there any way to fix or repair a version I have just saved?
    Thanks,
    JP.

    Hello,
    I have the same error, but for me the above solution did not work.
    Did I understand correctly? This is what I did:
    1. I changed the extension of the IBA file to ZIP
    2. I unzipped the file
    3. In the folder with the unzipped book I renamed the file index.xml to index.html
    4. I zipped it all back (into a ZIP file)
    5. I renamed the extension of the ZIP archive to IBA
    6. I tried to open the book and I got the error that there was no index.xml file
    7. I changed the extension of the IBA file to ZIP
    8. I unzipped the file again
    9. I renamed index.html back to index.xml and zipped it back (compressed it)
    10. I renamed the extension of the ZIP archive to IBA
    11. I opened the book
    Is this correct?
    The problem is that I am still receiving the same error message: Error parsing document index: invalid character in attribute value
    Did I do something wrong?
    If you can help I would be very grateful. I worked hard on this book and it is the only backup that I have saved.
    Thank you!
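
The steps above only rename files, so the invalid character stays in index.xml. A minimal sketch, assuming (as the steps imply) that the IBA file is a ZIP archive with index.xml at its top level, of how one might locate the offending line before repairing it by hand. The helper names `find_xml_error` and `make_demo_archive` are made up for this illustration, and the demo archive is synthetic, not a real iBooks Author project.

```python
import io
import zipfile
import xml.parsers.expat


def find_xml_error(archive, member="index.xml"):
    """Return None if `member` is well-formed XML, else (line, column, message).

    `archive` may be a path to the renamed .zip file or a file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        data = zf.read(member)  # raises KeyError if index.xml is missing
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as err:
        return err.lineno, err.offset, xml.parsers.expat.ErrorString(err.code)
    return None


def make_demo_archive(index_xml):
    """Build a tiny in-memory archive for trying the check (synthetic data)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("index.xml", index_xml)
    buf.seek(0)
    return buf


# A raw '<' inside an attribute value is not well-formed XML.
broken = make_demo_archive(b'<doc title="a<b"/>')
print(find_xml_error(broken))
```

With a real project one would pass the path of the renamed archive instead of the demo buffer, and always work on a copy of the file.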

  • Can't open file "/Library/WebServer/Documents/index.html.en."

    Can’t open file “/Library/WebServer/Documents/index.html.en.”

    You will never be able to open this file with iWeb because, as the message states, it is a User/Sites/index.html file, and iWeb cannot open such files. Anything with html on the end of it is already published, and iWeb cannot open it. It is not the same as your domain file.
    Go to User/Library/Application Support/iWeb/domain.sites and try opening your domain file, rather than something that is already published. iWeb cannot import, at least not published pages.

  • Fireworks: How to find document index for scripting?

    Can anyone share a simple method for obtaining the document index of an active Fireworks document, within the array of open documents? Normally, the current document is accessed by using fw.getDocumentDOM(), but I'd like to obtain the actual array index value (e.g., 0, 1, 2, 3, etc.) to use elsewhere in a script.
    I've created a function to obtain this index value, but it's ungainly: It compares entire DOMs that have been converted to source. This requires too much memory or processing and nearly brings the script to a halt. I need something simpler.
    var dom = fw.getDocumentDOM();
    function documentIndex() {
        if (fw.documents.length == 1) {
            return 0;
        } else if (fw.documents.length > 1) {
            // Serialize the active document once, then compare each open document against it.
            var activeSource = dom.toSource();
            for (var i = 0; i < fw.documents.length; i++) {
                if (fw.documents[i].toSource() == activeSource) {
                    return i;
                }
            }
        }
        return -1; // no open document matched
    }
    I've considered using a document property like docTitleWithoutExtension, filePathForRevert, or filePathForSave as a basis for comparison between documents, but these are unreliable: They won't work if the documents in question have not yet been saved.
    I figure there must be a simple method, I just don't know what it is.

    In case anyone's interested, here's the workaround I'm now using to find the position of an active Fireworks document within the array of open documents (a.k.a., the document index). I'd still love to hear from anyone who can suggest a simpler method.
    The basic idea is to first look at two properties of the active document—docTitleWithoutExtension and filePathForRevert—and, where possible, compare those values to that of the same properties within each open document. On their own, each property has loopholes, which is why I'm combining them. (For example, it's possible to have two documents with the same title—if they have different extensions or file paths. Likewise, it's possible to open multiple copies of an unsaved document, all with the same filePathForRevert value.)
    New or untitled documents demand an entirely different criteria. I'm not crazy about this approach, but it seems like the best option at this point: When you can't find a DOM property to reliably distinguish one document from another, you temporarily write a distinctive value into a property of the document you want to find. Think of it like tagging a wild animal. In this case, when the active document lacks both a docTitle and file path, I write a crazy piece of "alt text" gibberish into the defaultAltText property and use that to identify my active document. (Even though I'm taking care to restore the original "alt text" afterwards, I don't love this method... but it seems to work.)
    var dom = fw.getDocumentDOM();
    function documentIndex() {
        var i;
        if (fw.documents.length == 1) {
            return 0;
        } else if (fw.documents.length > 1) {
            var docTitle = dom.docTitleWithoutExtension;
            var filePath = dom.filePathForRevert;
            if ((docTitle != "") && (filePath != null)) {
                // Saved document: match on title and revert path together.
                for (i = 0; i < fw.documents.length; i++) {
                    if ((fw.documents[i].docTitleWithoutExtension == docTitle) && (fw.documents[i].filePathForRevert == filePath)) {
                        return i;
                    }
                }
            } else {
                // Unsaved document: temporarily tag it with a distinctive alt text.
                var originalDefaultAlt = dom.defaultAltText;
                var marker = "toRtoiSeOfThEsLowAcoRnsRejoiCe";
                var found = -1;
                dom.defaultAltText = marker;
                for (i = 0; i < fw.documents.length; i++) {
                    if (fw.documents[i].defaultAltText == marker) {
                        found = i;
                        break;
                    }
                }
                // Restore the original alt text whether or not a match was found.
                dom.defaultAltText = originalDefaultAlt;
                return found;
            }
        }
        return -1; // no open document matched
    }

  • [CS3] Used font in a document: Index?

    Hello!
    I am working with IUsedFontList and IPMFont.
    But none is telling me the index of a font used in a document.
    How can I get the index of a font used in a document?
    Alois Blaimer

    Changing fonts/font sizes in a scanned document requires a product like Acrobat to
    convert the scanned image into text (OCR)
    make actual changes to the text
    With Adobe Reader you can use the Zoom function to enlarge the PDF document content.

  • Cannot open pages documents "index.xml file is missing"

    I am working on an iMac running Mountain Lion and using Pages 3.0.3. As of today, none of my Pages documents will open. I haven't stored anything in the cloud and I haven't changed anything; I have just been working on simple text documents in Pages.
    The last thing I did was print a file, as I have so often before. I also have a MacBook running Yosemite, for which I bought the new version of Pages.
    The MacBook will open .pages files stored on its own hard drive, but files taken from the iMac will not open on the MacBook.
    I'm in a tight spot. I have a vast number of documents, as I've been using Pages for years.

    By default, Pages v5.5.2 on Yosemite saves out a Single File format document, which is a compressed renamed zip folder that the Finder allows us to believe is a document. When you attempt to open that document from within Pages '08, or Pages '09, you get the following dialog:
    The Pages v5 generation documents do not use an internal index.xml file. Pages v5.5.2 also allows one to change that single file format document into a package file format document, which is not a compressed, renamed folder. When you open one of these package format documents in the older Pages applications, you now get a different dialog message. The index.xml file is still missing, but…
    If you are using Pages '08 v3.0.3, or Pages '09 v4.3, there are no newer versions of these applications available. This is an Apple attempt to steer users to a Yosemite update, and then downgrade to the latest Pages applications that now require Yosemite. They could have just said, “we made a document format that is incompatible with older Pages applications.”

  • TREX: Preparation Failed in document index

    Hi,
    I have defined an index with one datasource and I get an error in 317 documents, while 204 documents are OK.
    The errors are due to error code 6401 (HTTP Status Code 401: Unauthorized), but I don't know how to solve it. Everyone has full control of the datasource (it is a development environment).
    Any suggestion?
    Thanks and best regards!
    Damian

    Hi All,
    When I go to the path SysAdmin > Monitoring > IndexingMonitor, I get the error:
    Trex: Preparation failed: index operation.
    Could anybody tell me what the reason could be? I have given the host name in the URL generator as well.
    Under SysAdmin > SysConfig > KM > Index administrator I can see everything as Active, with a green tick mark for every category, but indexing is not working for me.

  • The document "index.html" could not be opened.

    Hi,
    I'm having trouble opening iWeb. When I double-click the iWeb icon it tries to open a particular file (index.html), but claims it can't open it and subsequently quits. This happens no matter which file I try to open with iWeb, even files that were originally created with iWeb. I can't find the iLife installation CD (I think because iLife came installed on the computer when I bought it). Any help would be appreciated.
    thanks!

    iWeb stores your website data in a domain file located in Home Folder/Library/Application Support/iWeb.
    Go look and see if you have one there. It should be the only file there - normally.
    Double click it to launch iWeb

  • Problems in indexing MS word document. Please help!

    Hi
    I'm using oracle 8.1.6 on solaris 5.7
    I stored a MS word document in a table as a internal blob.
    The word document contains one line:
    "This is test word document." Then I indexed it with the inso_filter preference. I created a log file during indexing. The log file showed that no document was indexed. Here is what I did:
    ===============================================================
    --Create preference
    exec CTX_DDL.drop_preference('MY_LEXER');
    exec CTX_DDL.create_preference('MY_LEXER','BASIC_LEXER');
    exec CTX_DDL.set_attribute('MY_LEXER','MIXED_CASE', 'NO');
    exec CTX_DDL.set_attribute('MY_LEXER','INDEX_THEMES','NO');
    exec CTX_DDL.set_attribute('MY_LEXER','INDEX_TEXT', 'YES');
    exec ctx_ddl.Drop_Preference ('MY_FILTER');
    exec ctx_ddl.Create_Preference ('MY_FILTER','INSO_FILTER');
    exec ctx_ddl.drop_section_group ('MY_SECTION');
    exec ctx_ddl.create_section_group ('MY_SECTION','NULL_SECTION_GROUP');
    --Create table
    drop table test;
    create table test
    (id number primary key,
    text blob
    );
    --Initialize blob column with an empty blob
    insert into test (id,text) values (1,empty_blob());
    --Create a directory in which a word file (test.doc) exists
    create directory filedir as '/home/mydir';
    --Insert the word file
    DECLARE
    lobd BLOB;
    fils BFILE;
    BEGIN
    fils := BFILENAME('FILEDIR','test.doc');
    SELECT text INTO lobd FROM test WHERE id = 1 FOR UPDATE;
    dbms_lob.fileopen(fils, dbms_lob.file_readonly);
    dbms_lob.loadfromfile(lobd, fils, dbms_lob.getlength(fils));
    COMMIT;
    dbms_lob.fileclose(fils);
    END;
    /
    ---Start logging
    exec ctx_output.start_log('index.log');
    ---Create index with INSO_FILTER defined in preference
    create index test_index on TEST(text) indextype is ctxsys.context
    parameters ('lexer MY_LEXER filter MY_FILTER section group MY_SECTION memory 50M');
    ---Stop logging
    exec ctx_output.end_log;
    =============================================================
    The index was created. Then I opened the index.log file. It reads:
    ==============================================================
    Oracle interMedia Text: Release 8.1.6.0.0 - Production on Tue Feb 19 16:22:50 2002
    (c) Copyright 1999 Oracle Corporation. All rights reserved.
    16:22:50 02/19/02 begin logging
    16:23:48 02/19/02 populate index: CALLOB.TEST_INDEX
    16:23:48 02/19/02 Begin document indexing
    16:23:49 02/19/02 End of document indexing. 0 documents indexed.
    16:24:06 02/19/02 log
    16:24:06 02/19/02 logging halted
    ===============================================================
    I did the query:
    select token_text from dr$test_index$i;
    no rows returned.
    Could anyone tell me why this happened? Any advice is appreciated.
    Thanks,
    George

    Hi, Omar:
    I tried using SQL*Loader to load the word document. Part of the loader log reads as follows:
    Table TEST:
    1 Row successfully loaded.
    0 Rows not loaded due to data errors.
    0 Rows not loaded because all WHEN clauses were failed.
    0 Rows not loaded because all fields were null.
    Space allocated for bind array: 6720 bytes(64 rows)
    Space allocated for memory besides bind array: 0 bytes
    Total logical records skipped: 0
    Total logical records read: 1
    Total logical records rejected: 0
    Total logical records discarded: 0
    ================================================================
    It seems that the file was successfully loaded into the database. Then I created the index using the procedure I posted in this thread. I checked the table ctx_user_index_errors:
    select * from ctx_user_index_errors;
    The result is:
    ERR_INDEX_NAME ERR_TIMES
    TEST_INDEX 20-FEB-02
    ERR_TEXTKEY
    AAAGtpABLAAAAAXAAA
    ERR_TEXT
    ----------------------------------------------------------------
    DRG-11207: user filter command exited with status 137
    What does this result mean?
    Thanks.
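
One way to read the status in DRG-11207, assuming it follows the common Unix shell convention in which a value above 128 means "128 plus the number of the signal that ended the process". The thread does not confirm that interMedia Text reports the status this way, so treat the sketch below as a starting hypothesis only; `describe_exit_status` is a made-up helper name.

```python
import signal


def describe_exit_status(status):
    """Interpret a process status using the 128 + signal-number shell convention."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "signal %d" % signum
        return "terminated by %s" % name
    return "exited with code %d" % status


print(describe_exit_status(137))
```

Under that convention 137 is 128 + 9, and signal 9 is SIGKILL on Solaris and Linux, which would mean the filter process was killed from outside rather than failing on its own.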

  • Snow Leopard indexing - mail and documents.

    Mail index is seriously broken and document index seems to be failing.
    Is there a simple way to force Snow Leopard to rebuild all indexes?

    bump - can anyone help with Snow Leopard indexing problem?

  • KM Document iView - index.html and main.css not properly displayed

    Hello,
    as a test we have put two files in the /documents repository in KM :
    a) index.html
    <head>
    <link rel="stylesheet" type="text/css" href="./main.css"/>
    </head>
    <table width="92%" bgcolor="#FFFFFF">
      <tr align="left" valign="top">
        <td> </td>
        <td colspan="5"><table width="100%" border="0" cellpadding="5" cellspacing="0">
            <tr valign="middle">
              <td width="85" bgcolor="#C7D9E9"> <p><b>Top Links</b></p></td>
              <td width="125" class="document-list"><a href="impax.html">IMPAX Client
                </a> </td>
              <td width="125" class="document-list"><a href="talkstation.html">TalkStation</a></td>
              <td width="125" class="document-list"><a href="ris.html">RIS</a></td>
              <td width="125" class="document-list"><a href="connectivity.html">Connectivity
                Manager</a></td>
              <td width="125" class="document-list"><a href="impax.html">IMPAX Server</a></td>
            </tr>
          </table></td>
      </tr>
    </table>
    b) main.css
    A:visited {
        color: #264560;
    }
    A:active {
        color: #12212E;
    }
    A:hover {
        color: #14623D;
    }
    A {
        color: #336699;
    }
    table {
        margin-top: 0px;
        margin-bottom: 0px;
    }
    p {
        color: #000000;
        font-family: Arial, Helvetica, sans-serif;
        margin-bottom: 0px;
        margin-top: 5px;
        font-size: 12px;
    }
    .document-list {
        background-color: #C7D9E9;
        font-family: Arial, Helvetica, sans-serif;
        font-size: 12px;
        color: #000000;
        margin-bottom: 3px;
    }
    When going to Content Administration -> KM Content -> Documents and clicking the index.html file, the CSS file is taken into account. For example, when hovering over the IMPAX hyperlink, the path is http://<host>:<port>/irj/go/km/docs/documents/impax.html, and the impax.html page is displayed when the link is clicked.
    However, when creating a KM Document iView (with or without content filter) pointing to /documents/index.html and displaying the iView, the style sheet is ignored, and the same hyperlink as above now refers to http://<host>:<port>/irj/servlet/prt/portal/prtroot/impax.html, which is incorrect.
    -> How can this behaviour be explained?
    -> When creating an URL iView pointing to /irj/go/km/docs/Agfa_Knowledgebase/index.html , everything works as expected.
    Thanks for the help -

    Hi,
    You should correct the path to your css file in your index.html:
    href="/irj/go/km/docs/documents/main.css"
    Regards,
    Praveen Gudapati
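
The behaviour in the question is consistent with ordinary relative URL resolution: a relative href is resolved against the URL of the page the browser believes it is displaying. A minimal sketch with Python's standard `urllib.parse.urljoin`; the host name `portal.example`, the port, and the last path segment of the iView URL (`some.iview`) are placeholders, since the thread only shows the directory part of that URL.

```python
from urllib.parse import urljoin

# Placeholder host and port; the paths are the ones quoted in the thread.
KM_PAGE = "http://portal.example:50000/irj/go/km/docs/documents/index.html"
# Hypothetical iView URL: only the /irj/servlet/prt/portal/prtroot/ prefix is from the thread.
IVIEW_PAGE = "http://portal.example:50000/irj/servlet/prt/portal/prtroot/some.iview"

# Relative references resolve against whichever page URL is current.
km_link = urljoin(KM_PAGE, "impax.html")
iview_link = urljoin(IVIEW_PAGE, "impax.html")
iview_css = urljoin(IVIEW_PAGE, "./main.css")

# A root-relative href, as suggested in the reply, ignores the page's directory.
fixed_css = urljoin(IVIEW_PAGE, "/irj/go/km/docs/documents/main.css")

for url in (km_link, iview_link, iview_css, fixed_css):
    print(url)
```

The same reasoning suggests the relative links to impax.html and the other pages would need root-relative paths as well, not only the style sheet.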

  • Ultrasearch doesn't index documents processed by remote crawler

    Hi,
    My Oracle9i 9.2 database is on Solaris. I have a remote crawler on Windows 2000. The remote crawler seems to be working fine. There is no error message in the log file. Every file has been processed. However, I can't query the documents processed by the remote crawler. $ORACLE_HOME/ctx/log/ultrasearch_log reads:
    Oracle Text, 9.2.0.1.0
    15:24:45 07/12/02 begin logging
    15:31:04 07/12/02 sync index: LING.WK$DOC_PATH_IDX
    15:31:04 07/12/02 Begin document indexing
    15:31:05 07/12/02 End of document indexing. 0 documents indexed.
    The last part of the log file is:
    =================== Crawling results ===================
    Crawling started at 7/12/02 3:25 PM
    Crawling stopped at 7/12/02 3:32 PM
    Total crawling time = 0:6:31
    Total number of documents fetched = 179
    Document fetch failures = 0
    Document conversion failures = 0
    Total number of unique documents indexed = 178
    Total data collected = 1,975,751 bytes
    Total number of non-indexable documents = 0
    Average size of fetched document = 11,099 bytes
    Total indexing time = 0:0:0 for 1,975,751 bytes of data
    Number of documents collected/indexed per hour = 1,638
    Number of times disk cache is full = 0
    I have another crawler on the database host. It works fine. I can query the documents processed by this crawler.
    Any idea?
    Ling Niu

    More Information about my question.
    I used Samba to share a directory /ling on Solaris with my Windows 2000 machine and mapped it to a drive on Windows, say, E:. This directory is used as both the log and the temp directory. I have an account on Solaris with the same name and password as my Windows 2000 account. While the schedule is executing, I can see the crawler create a directory inso_tmp on the shared directory, and I believe it is used for filtering. To my knowledge, after filtering, Oracle or the crawler will copy the filtered file from inso_tmp to the temp directory, which is /ling or E:\ in my case. But I failed to catch the temporary files, which I know are transient. I've given Oracle write privilege on /ling. I checked the tables WK$DOC and WK$URL. In these tables, my temporary file's name is E:\****. If Oracle Server uses these tables to get the name of the temporary file, it will fail, because Oracle on Solaris doesn't know where E:\ is. I can't get any log message to prove my guess. And if this is the case, it will be difficult to set up a remote crawler in a mixed Windows/Unix environment, right?
    Any leads would be welcome,
    Ling
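
To make the guess above concrete: the same temporary file has two names, one as seen from Windows through the mapped drive and one as seen from Solaris. A small sketch of that correspondence, assuming the E: to /ling mapping described in the post; the file name `doc1.tmp` and the helper `to_server_path` are invented for illustration, and nothing here is an UltraSearch API.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Mapping taken from the post: drive E: on Windows is the Samba share of /ling on Solaris.
WINDOWS_ROOT = PureWindowsPath("E:\\")
SOLARIS_ROOT = PurePosixPath("/ling")


def to_server_path(windows_name):
    """Translate a name under the mapped drive into the path the Solaris host would use."""
    # relative_to raises ValueError if the name is not under the mapped drive.
    relative = PureWindowsPath(windows_name).relative_to(WINDOWS_ROOT)
    return SOLARIS_ROOT.joinpath(*relative.parts)


# Hypothetical temporary file name of the kind stored as E:\... in WK$DOC.
server_path = to_server_path("E:\\inso_tmp\\doc1.tmp")
print(server_path)
```

If the server really reads the stored E:\ name verbatim, a translation of this kind would have to happen somewhere before the Solaris side could find the file.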

  • UltraSearch - Numbers of document discovered, fetched and indexed

    I am using US 1.0.3.
    - I have a table data source mapped to a table with the following characteristics:
    > PK is a composite of three columns
    > table has a total of 970 rows
    > the column TITLE which is of varchar2 is specified as the content column
    > Of the 971 rows, 82 rows have NULL in TITLE column.
    > Of the 971 rows, only 196 rows have unique TITLE.
    > There is no attribute mapping
    - Here is the crawler summary:
    Document discovered: 381
    Document fetched: 381
    Document indexed: 196
    The rest are zeros.
    My questions are:
    (1) It seems US only indexes rows with a unique value, which would explain why only 196 rows/documents are indexed. That is, rows with a duplicate TITLE are not indexed. That seems to make sense. Is it correct?
    (2) But why are only 381 documents/rows discovered and fetched? I would expect it to discover all the rows with a non-NULL value in the TITLE column: 889 (i.e. 971 - 82).
    (3) In summary, how does US determine what rows to fetch and index?
    Thanks!
    C Cheung
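
The figures in the question depend on what "unique" means: the number of distinct TITLE values, or the number of values that occur exactly once. A small sketch of both counts on made-up data (not the poster's table), which could be adapted to check which reading matches the crawler's 196; `summarize_titles` is a hypothetical helper name.

```python
from collections import Counter


def summarize_titles(titles):
    """Count rows, non-NULL titles, distinct titles, and titles seen exactly once."""
    non_null = [t for t in titles if t is not None]
    counts = Counter(non_null)
    return {
        "rows": len(titles),
        "non_null": len(non_null),
        "distinct": len(counts),
        "seen_once": sum(1 for n in counts.values() if n == 1),
    }


# Made-up sample: None stands for a NULL TITLE.
sample = ["a", "a", "b", None, "c"]
summary = summarize_titles(sample)
print(summary)

# The arithmetic from question (2): rows with a non-NULL TITLE.
expected_non_null = 971 - 82
print(expected_non_null)
```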

    Hi nyzonegirl,
    Welcome to Numbers discussions.
    Yvan is correct; nothing in iWork or any other Mac application will remove MS Office or Excel. If you were using Numbers, there is a 30-day trial for it; after 30 days it stops working unless you purchase it. So it seems that the Excel worksheet you were using may have opened in Numbers rather than Excel, which is a file-association issue.
    Find on your HDD the Excel .XLS file, click on it once to highlight it. Now click File > Get Info, down the list it will read Open with:, change it to Excel (it may read Numbers). You'll have the choice to have all like files open in Excel as well.
    Yes you're correct, Windows users won't be able to open Numbers files so if you decide to purchase iWork you'll need to do as Yvan suggested, Export to Excel.
    As the need arises I use Excel, however, for my personal use 100% of the time I use Numbers. When I know Functions are the same I'll use Numbers then Export to Excel for Windows users.
    Hope this helps you. Do let us know the outcome.
    Sincerely,
    RicD

  • Change the Index from documents to All

    Hi all
    I created an index in the index administration only for documents(Items to Index=Documents).
    This was a long time ago...
    Now we have the problem that our search engine only shows documents for this index.
    OK, that's how it works.
    But my question is: How can I change the "Items to index" from Documents to All?
    Is there a way? It's nearly impossible to delete the index and create a new one, because
    we have a lot of documents indexed.
    Thanks in advance
    Steve

    Hi Steve,
    Can you try the following -
    1. Create a new Index with the required properties (items to index set to "All") and select the same data source as done in the old index.
    2. Provide schedule for the index.
    3. Re-index it one time.
    4. When everything is done then you can remove the old index and use the new one.
    5. Modify your Search Options Set accordingly.
    Note: There should be sufficient space in the TREX Server to accommodate both the indexes for some time.
    Regards,
    Sudip
