Indexing of binary documents

Hello,
my question is about the INSO_FILTER that is used when binary files (pdf, doc, ...) are indexed. Search in the IFS context relies on it, and since there is an IFS for Linux, there must be some kind of filter for Linux. Can that filter be used to index documents with Oracle Text under Linux?
Thanks

For Linux there is an INSO_FILTER with 8.1.7 (executable ctxhx).

Similar Messages

  • Oracle Text - Problem with filtering binary documents (.doc, .pdf, etc...)

    Hi, I have a problem with filtering binary documents (.doc, .pdf, etc.). I use SQL*Plus for remote access to Oracle 10.2 on Linux, and I create a table:
    CREATE TABLE test (id NUMBER PRIMARY KEY, text VARCHAR2(100));
    I insert into this table:
    INSERT into test values(1, 'PATH/text1.doc');
    INSERT into test values(2,'PATH/text2.doc');
    and then:
    CREATE INDEX test_index ON test(text) indextype is ctxsys.context
    parameters ('datastore ctxsys.file_datastore
    filter ctxsys.auto_filter');
    The message "Index created" is displayed, but the objects DR$test_index$I, DR$test_index$K, DR$test_index$N, DR$test_index$R and DR$test_index$P are empty, so the index was probably not populated.
    I don't know where the bug is: either it is somewhere in this code or on the server (a wrong Oracle installation or restricted privileges). Do you know what the problem is?

    The following is an excerpt from the 10g online documentation. Note the items that I have put in bold.
    "FILE_DATASTORE
    The FILE_DATASTORE type is used for text stored in files accessed through the local file system.
    Note:
    FILE_DATASTORE may not work with certain types of remote mounted file systems.
    FILE_DATASTORE has the following attribute(s):
    Table 2-4 FILE_DATASTORE Attributes
    Attribute    Attribute Value
    path         path1:path2:pathn
    path
    Specify the full directory path name of the files stored externally in a file system. When you specify the full directory path as such, you need only include file names in your text column.
    You can specify multiple paths for path, with each path separated by a colon (:) on UNIX and semicolon(;) on Windows. File names are stored in the text column in the text table.
    If you do not specify a path for external files with this attribute, Oracle Text requires that the path be included in the file names stored in the text column.
    PATH Attribute Limitations
    The PATH attribute has the following limitations:
    If you specify a PATH attribute, you can only use a simple filename in the indexed column. You cannot combine the PATH attribute with a path as part of the filename. If the files exist in multiple folders or directories, you must leave the PATH attribute unset, and include the full file name, with PATH, in the indexed column.
    On Windows systems, the files must be located on a local drive. They cannot be on a remote drive, whether the remote drive is mapped to a local drive letter."
    With accessible paths and files, you get something like:
    SCOTT@orcl_11g> CREATE TABLE test (id NUMBER PRIMARY KEY, text VARCHAR2(100));
    Table created.
    SCOTT@orcl_11g>
    SCOTT@orcl_11g>
    SCOTT@orcl_11g> INSERT into test values(1,'c:\oracle11g\banana.pdf');
    1 row created.
    SCOTT@orcl_11g> INSERT into test values(2,'c:\oracle11g\cranberry.pdf');
    1 row created.
    SCOTT@orcl_11g>
    SCOTT@orcl_11g> CREATE INDEX test_index ON test(text) indextype is ctxsys.context
      2  parameters ('datastore ctxsys.file_datastore
      3  filter ctxsys.auto_filter');
    Index created.
    SCOTT@orcl_11g>
    SCOTT@orcl_11g> select count(*) from dr$test_index$i
      2  /
      COUNT(*)
           608
    SCOTT@orcl_11g>
    In the following, I used a non-existent path and non-existent file name, which produces the same results as when you use a remote path that does not exist locally.
    SCOTT@orcl_11g> CREATE TABLE test (id NUMBER PRIMARY KEY, text VARCHAR2(100));
    Table created.
    SCOTT@orcl_11g>
    SCOTT@orcl_11g>
    SCOTT@orcl_11g> INSERT into test values(3,'c:\nosuchpath\nosuchfile.pdf');
    1 row created.
    SCOTT@orcl_11g>
    SCOTT@orcl_11g> CREATE INDEX test_index ON test(text) indextype is ctxsys.context
      2  parameters ('datastore ctxsys.file_datastore
      3  filter ctxsys.auto_filter');
    Index created.
    SCOTT@orcl_11g>
    SCOTT@orcl_11g> select count(*) from dr$test_index$i
      2  /
      COUNT(*)
             0
    SCOTT@orcl_11g>
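
The second run above shows that a FILE_DATASTORE index is created without complaint even when none of the listed files can be read. One way to catch this before indexing is to check the stored file names from the database server's point of view. A minimal Python sketch, assuming the names from the text column have been exported to a list and that the script runs on the database host as the operating-system user that owns the Oracle processes; the function name `unreadable_paths` is made up for this illustration:

```python
import os

def unreadable_paths(paths):
    """Return the entries that are not regular files readable by the current user."""
    return [p for p in paths if not (os.path.isfile(p) and os.access(p, os.R_OK))]

if __name__ == "__main__":
    # Hypothetical values copied from the text column of the test table.
    stored = ["PATH/text1.doc", "PATH/text2.doc"]
    for path in unreadable_paths(stored):
        print("cannot read:", path)
```

Any name printed by this check would presumably end up as an unindexed row, which matches the empty DR$...$I table in the question.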

  • Problems in indexing MS word document. Please help!

    Hi,
    I'm using Oracle 8.1.6 on Solaris 5.7.
    I stored an MS Word document in a table as an internal BLOB.
    The Word document contains one line:
    "This is test word document." Then I indexed it with the INSO_FILTER preference. I created a log file during indexing. The log file showed that no document was indexed. Here is what I did:
    ===============================================================
    --Create preference
    exec CTX_DDL.drop_preference('MY_LEXER');
    exec CTX_DDL.create_preference('MY_LEXER','BASIC_LEXER');
    exec CTX_DDL.set_attribute('MY_LEXER','MIXED_CASE', 'NO');
    exec CTX_DDL.set_attribute('MY_LEXER','INDEX_THEMES','NO');
    exec CTX_DDL.set_attribute('MY_LEXER','INDEX_TEXT', 'YES');
    exec ctx_ddl.Drop_Preference ('MY_FILTER');
    exec ctx_ddl.Create_Preference ('MY_FILTER','INSO_FILTER');
    exec ctx_ddl.drop_section_group ('MY_SECTION');
    exec ctx_ddl.create_section_group ('MY_SECTION','NULL_SECTION_GROUP');
    --Create table
    drop table test;
    create table test
    (id number primary key,
    text blob
    );
    --Initialize blob column with an empty blob
    insert into test (id,text) values (1,empty_blob());
    --Create a directory in which a Word file (test.doc) exists
    create directory filedir as '/home/mydir';
    --Insert the word file
    DECLARE
    lobd BLOB;
    fils BFILE;
    BEGIN
    fils := BFILENAME('FILEDIR','test.doc');
    SELECT text INTO lobd FROM test WHERE id = 1 FOR UPDATE;
    dbms_lob.fileopen(fils, dbms_lob.file_readonly);
    dbms_lob.loadfromfile(lobd, fils, dbms_lob.getlength(fils));
    COMMIT;
    dbms_lob.fileclose(fils);
    END;
    /
    ---Start logging
    exec ctx_output.start_log('index.log');
    ---Create index with INSO_FILTER defined in preference
    create index test_index on TEST(text) indextype is ctxsys.context
    parameters ('lexer MY_LEXER filter MY_FILTER section group MY_SECTION memory 50M');
    ---Stop logging
    exec ctx_output.end_log;
    =============================================================
    The index was created. Then I opened the index.log file. It reads:
    ==============================================================
    Oracle interMedia Text: Release 8.1.6.0.0 - Production on Tue Feb 19 16:22:50 2002
    (c) Copyright 1999 Oracle Corporation. All rights reserved.
    16:22:50 02/19/02 begin logging
    16:23:48 02/19/02 populate index: CALLOB.TEST_INDEX
    16:23:48 02/19/02 Begin document indexing
    16:23:49 02/19/02 End of document indexing. 0 documents indexed.
    16:24:06 02/19/02 log
    16:24:06 02/19/02 logging halted
    ===============================================================
    I did the query:
    select token_text from dr$test_index$i;
    no rows returned.
    Could anyone tell me why this happened? Any advice is appreciated.
    Thanks,
    George

    Hi, Omar:
    I tried using SQL*Loader to load the Word document. Part of the loader log reads as follows:
    Table TEST:
    1 Row successfully loaded.
    0 Rows not loaded due to data errors.
    0 Rows not loaded because all WHEN clauses were failed.
    0 Rows not loaded because all fields were null.
    Space allocated for bind array: 6720 bytes(64 rows)
    Space allocated for memory besides bind array: 0 bytes
    Total logical records skipped: 0
    Total logical records read: 1
    Total logical records rejected: 0
    Total logical records discarded: 0
    ================================================================
    It seems that the file was successfully loaded into the database. Then I created the index using the procedure I posted in this thread. I checked ctx_user_index_errors:
    select * from ctx_user_index_errors;
    The result is:
    ERR_INDEX_NAME ERR_TIMES
    TEST_INDEX 20-FEB-02
    ERR_TEXTKEY
    AAAGtpABLAAAAAXAAA
    ERR_TEXT
    ----------------------------------------------------------------
    DRG-11207: user filter command exited with status 137
    What does this result mean?
    Thanks.
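
On Unix-like systems, a shell reports a child process that was terminated by a signal as exit status 128 plus the signal number, so 137 would correspond to signal 9 (SIGKILL). Whether the filter process here was in fact killed, and by what, is not something the log shows; the sketch below only decodes the number under that convention. The function name `describe_exit_status` is made up for this illustration.

```python
import signal

def describe_exit_status(status):
    """Interpret an exit status using the shell convention 128 + signal number."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # The number is not a known signal on this platform.
            name = "unknown signal"
        return f"terminated by signal {signum} ({name})"
    return f"exited normally with code {status}"

print(describe_exit_status(137))
```

If the convention applies, the next thing to look at would be what could have killed the filter while it was processing the document.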

  • Seeking recommendations for handling large binary documents with security(preferable) for inbound and outbound scenarios from OSB- SOA and SOA- OSB

    Hi,
    I am currently working on a project with the following requirements
    1. Client transfers binary document (between 1-20MB in size) from OSB proxy to SOA composite to Content Management system
    2. Client retrieves binary document (between 1-20MB in size) from Content Management system to SOA composite to OSB proxy
    In other words, an inbound and outbound integration.
    What I have tried so far and my results:
    Scenario A
    1. Enabled MTOM on SOA composite by attaching wsmtom policy
    2. Created an OSB business service and consumed the SOA composite application
    3. Enabled MTOM on OSB proxy and business service and configured it to pass by reference
    Scenario B
    1. Enabled MTOM and security on SOA composite by attaching wsmtom policy and SAML policy
    2. Created an OSB business service and consumed the SOA composite application
    3. Enabled MTOM on OSB proxy and business service and configured it to pass by reference
    I have a demo integration setup that writes a binary document to a file using the above steps. My SOA composite has a file adapter that writes the binary data to an external file, and it is exposed as a web service with a simple WSDL definition that has an inline XSD schema with a single element of base64binary type. I have added a mediator that maps this base64binary element node to the file adapter's input node.
    Result for Scenario A with file size less than 1 MB:
    Flawless execution with sub-second response times
    Result for Scenario A with file size of 8MB
    First attempt: SOA composite faults with database transaction related error, solved by increasing JTA timeout
    Second attempt: Flawless execution, but the file transfer took over 100 seconds to complete. This is very poor performance, and I suspect it cannot be the expected behaviour, but I don't know the internal workings of the SOA composite or why it's taking this long.
    Result for Scenario B:
    The OSB business service does not accept/recognize the SAML policy in the WSDL and suggests configuring OWSM policies manually, but the OWSM policy set in OSB does not have the wsmtom policy. Regardless of this, any permutation of MTOM + WSS security in this integration scenario either did not work outright, or MTOM optimization was not happening, i.e. binary data was materializing in the message body.
    I have only about 3 weeks left to implement a viable solution, and the closest I've come to a solution is Scenario A, but that 100+ second response time for an 8MB file is really worrying.
    I would appreciate any level of guidance, recommendations or suggestions as to how I go about tackling this problem.
    Thanks
    regards,
    Johnny

    I think this is due to the underlying mechanism of WebLogic class loading.
    You can contact Oracle Support at https://support.oracle.com to report issues. Roughly, this is the process:
    1. Get the Oracle Customer Support Identifier (CSI) for the client you are working for.
    2. Create a user profile quoting the CSI. This will send an approval request to the Oracle Support admins at your client.
    3. Get the Oracle Support admins at your client site to approve your request for support access.
    4. Once they approve, you can access the support site and raise service requests.

  • TREX doesn't index some PDF documents ...

    Hello all,
    we have installed EP'04s and TREX ver. 7.00.42.00. It works well except for one thing: TREX is not able to index some PDF documents. Most of them are indexed correctly, but some are not. Actually, the problematic PDF documents are indexed, but their content is not. The search results display the message "No document excerpt available" for these documents.
    I found SAP Note 622419, which could be related, but I don't know how to check:
    1. What encoding was used in a particular PDF form.
    2. What fonts or font types were used in a particular PDF form.
    Do you have any idea how to find out this information about a document? Or do you know where the problem could be when TREX is able to index the content of only some PDF documents?
    Regards,
    Zbynek

    Thank you, that was what I needed.
    So now I know the test PDF document uses only TrueType fonts. That means its content should be indexable by TREX, but it isn't. There is just the message "No document excerpt available" for this document in the search results.
    Could someone look at the document and try to index it? It can be downloaded from http://www.volny.cz/kabrtz/TREX/indexing_test.pdf
    Regards,
    Zbynek

  • Ultra Search Indexer: Adding 'alien' document types.

    The way the Ultra Search indexer finds source material will not work in my situation. While I may be able to give it databases to crawl, it cannot crawl our content, so the usual way of telling the indexer about 'alien' document types (adding custom code that returns lists of URLs so the indexer can read the source documents) won't work in my scenario.
    I want to know what the Ultra Search application does that is special when indexing documents.
    Is there a description, so that I can reproduce it using Oracle Text and perhaps point the Ultra Search querying component at my manufactured repository and have it work?
    Thanks.

    Is there a way to set up finder search with additional criteria so that it isolates file extensions with .docx, .pdf, .txt all in one single search?
    Currently, "kind is document" also brings up .jpgs and .wavs, which I don't want (or consider documents).

  • How to create index in word document?

    Hello,
    Is it possible to create an index in a Word document from ABAP code (OLE)?
    Thank you for your response.
    Alfonso

    1. Go to the transparent table KNA1.
    2. Select the Indexes button (next to the Technical Settings button).
    3. A list of already existing indexes is displayed.
    4. In the dialog displayed, select the Create icon.
    5. Specify the name for the index to be created (it should start with Z).
    6. The screen for specifying the index fields will appear; specify the details based on your requirement.
    Note:
    1. Creating an index will create a sorted copy of the DB table data with a limited set of fields.
    2. Try to use already created indexes; create a new index only if necessary.
    3. Table KNA1 is used here as an example.

  • Can Secure Enterprise Search index Open Office documents?

    Hi, I'm wondering if Secure Enterprise Search can index Open Office documents, and if not, is there a planned release where this will be supported?
    Thanks!
    Dan

    The current release - SES 11g (not yet on Windows) - can work with Open Office files.

  • Error when indexing WebDav Repository documents

    Hello,
    We're using SAP EP 7.0 and when trying to index WebDav Repository documents that exist, we get an error.
    User has full authorization on these files and WebDav folder. What could be the problem?
    Thanks in advance & regards

    Hi Belen:
    Kindly post the error which you receive.
    P.S: Kindly assign points if your query is resolved, also close the question to assist other users narrow the search and find solutions

  • Change file type from BINARY DOCUMENT to QUICKTIME VIDEO?

    Hi,
    I'm using Bridge CS4 to help catalogue my Quicktime videos (captured via FCP into a capture scratch file) and for the most part it works well. The problem is it seems to list in the "type" column many of the videos as BINARY DOCUMENTS instead of QUICKTIME VIDEO which means I can't play or scrub through them in the preview window. Instead they appear as a single static frame. They still open up into a playable video though when double clicked.
    Adding or removing the extension .mov seems to have no effect.
    Does anyone have a suggestion as to how to get them back into being recognized as QUICKTIME VIDEO?
    Thanks!

    Ooops, I just found the solution. Right-click on clip and select "Purge Cache For Selection", it changes all clips back to the right type.

  • Does Oracle 11g index Office 2007 documents?

    I recently upgraded to 11g, because 10g didn't seem to index Office 2007 documents (e.g. Word, Excel, and PowerPoint) or PDFs v1.5 or higher. I need to be able to search on text in those documents. Everything works fine for PDFs and files generated using earlier versions of Microsoft products, but not for Office 2007 documents. The Oracle documentation for 11g says that it supports Office 2007, but I haven't had any luck. Any thoughts?
    Edited by: sac1222 on Nov 1, 2009 11:49 AM
    Edited by: sac1222 on Nov 1, 2009 1:59 PM

    All download versions of Oracle software are the full versions. The 11.1.0.7 patchset will work with your downloaded 11.1.0.6, but you will need an Oracle Support account before you can download the patchset.
    I believe you can download 11.1.0.7 directly for certain platforms - like Windows Server 2008 64 bit. If you don't have access to Oracle Support this might be an option for testing - you can get a 60 day evaluation copy of Windows Server 2008 [from Microsoft|http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=13c7300e-935c-415a-a79c-538e933d5424].

  • TREX – Indexed information by document type

    Hi,
    Where can I find documentation about which properties TREX indexes for each document type?
    For example:
    - Each Word document has the following properties (Title, Subject, Author, Manager, Company, Category, Comments, Keywords, …). PDF documents have similar properties. Is this information indexed by TREX?
    My question relates to AutoCAD documents. These documents contain legend information, and I need to know whether this information is indexed by TREX and can be used in searches.
    Thanks and regards,
    John

    Hi,
    Check this thread:
    https://www.sdn.sap.com/irj/sdn/thread?threadID=140959
    Greetings,
    Praveen Gudapati
    p.s. Points are always welcome for helpful answers

  • UDP and Binary document transfer

    I am attempting to transfer a binary document from one machine to another utilizing UDP. I am able to get the document moved over, however the document will not open. It is basically a word document and I am chunking it over. I am sure that I have received all the packets and inserted them into the final file.
    I can transfer a text document and not have any issues opening it.
    So based on being able to chunk a large text document over and being able to view it, I am wondering if I need to do something different for binary documents?
    Thanks

    jverd wrote:
    Peter__Lawrey wrote:
    UDP is not as reliable as TCP.
    For example, you can only reliably send a packet of up to 532 bytes, any larger than this and you can get packet fragmentation,
    Not to mention that regardless of packet size, there's no guarantee of delivery.
    Still, if it's as the OP seems to be saying--text files consistently work fine and binary files consistently don't--then it's most likely what the good Dr. said.
    Hmm.. I wonder how he verified that large text files work. Do you think he did a diff or compare of the original and received text file? He could be missing pieces in the text file as well, but text is text, so it's still possible to open and read it.
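
Because UDP may drop, duplicate, or reorder datagrams, a receiver that simply appends payloads in arrival order can produce a file that still looks readable as text but is corrupt as a Word document. A minimal sketch of one common remedy follows: number each chunk, reassemble by sequence number, and compare a digest of the result with the original. It works on in-memory bytes only (no sockets), and the 4-byte header layout, the 512-byte chunk size, and all function names are assumptions chosen for this illustration, not anything from the original poster's code.

```python
import hashlib
import random
import struct

CHUNK_SIZE = 512  # payload bytes per datagram; an arbitrary choice for this sketch

def make_datagrams(data):
    """Split data into datagrams, each prefixed with a 4-byte big-endian sequence number."""
    return [
        struct.pack("!I", seq) + data[offset:offset + CHUNK_SIZE]
        for seq, offset in enumerate(range(0, len(data), CHUNK_SIZE))
    ]

def reassemble(datagrams, expected_count):
    """Rebuild the payload by sequence number; raise if any chunk is missing."""
    chunks = {}
    for datagram in datagrams:
        (seq,) = struct.unpack("!I", datagram[:4])
        chunks[seq] = datagram[4:]  # a duplicate simply overwrites the same slot
    missing = [seq for seq in range(expected_count) if seq not in chunks]
    if missing:
        raise ValueError(f"missing chunks: {missing}")
    return b"".join(chunks[seq] for seq in range(expected_count))

if __name__ == "__main__":
    original = bytes(range(256)) * 20          # stand-in for a binary document
    sent = make_datagrams(original)
    received = sent[:]
    random.shuffle(received)                   # simulate out-of-order arrival
    rebuilt = reassemble(received, len(sent))
    # Compare digests rather than trusting that the file "opens".
    print(hashlib.sha256(rebuilt).digest() == hashlib.sha256(original).digest())
```

Comparing digests on both machines would also answer the question raised above about whether the large text file really arrived intact.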

  • Errors creating indexes on Binary XML Tables

    Hi,
    Oracle details are as follows:
    Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
    PL/SQL Release 11.2.0.2.0 - Production
    "CORE 11.2.0.2.0 Production"
    TNS for IBM/AIX RISC System/6000: Version 11.2.0.2.0 - Production
    NLSRTL Version 11.2.0.2.0 - Production
    I'm currently having an issue when attempting to create a suitable index on a binary xml table. I have a Binary xml table that stores a number of xml documents. I have created an index on this table as follows:
    create index TEST_WQI_idx_1 on TEST_WEB_QUOTE_INDX1(OBJECT_VALUE)
    INDEXTYPE IS XDB.XMLINDEX
    PARAMETERS ('PATHS (INCLUDE
    (/webPolicy/QuoteId))');
    Querying the table with the following:
    SQL> l
    1 select xmlcast(xmlquery('/webPolicy/Sections/interopSection[1]/PolicyItems/item/Version/text()' PASSING OBJECT_VALUE RETURNING CONTENT)
    2 as number) "VERSION",
    3 xmlcast(xmlquery('/webPolicy/QuoteId/text()' PASSING OBJECT_VALUE RETURNING CONTENT)
    4 as number) "QUOTEID"
    5 FROM TEST_WEB_QUOTE_INDX1
    6 where xmlcast(xmlquery('/webPolicy/QuoteId/text()' PASSING OBJECT_VALUE RETURNING CONTENT)
    7* as number) = 22824
    SQL> /
    VERSION QUOTEID
    1 22824
    Execution Plan
    Plan hash value: 3559428808
    | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
    | 0 | SELECT STATEMENT | | 545 | 1071K| 1030 (1)| 00:00:19 |
    | 1 | SORT AGGREGATE | | 1 | 3022 | | |
    |* 2 | TABLE ACCESS BY INDEX ROWID | SYS895336_TEST_WQI__PATH_TABLE | 1 | 3022 | 2 (0)| 00:00:01 |
    |* 3 | INDEX RANGE SCAN | SYS895336_TEST_WQI__PIKEY_IX | 1 | | 1 (0)| 00:00:01 |
    |* 4 | FILTER | | | | | |
    | 5 | TABLE ACCESS FULL | TEST_WEB_QUOTE_INDX1 | 545 | 1071K| 4 (0)| 00:00:01 |
    | 6 | SORT AGGREGATE | | 1 | 3022 | | |
    |* 7 | TABLE ACCESS BY INDEX ROWID| SYS895336_TEST_WQI__PATH_TABLE | 1 | 3022 | 2 (0)| 00:00:01 |
    |* 8 | INDEX RANGE SCAN | SYS895336_TEST_WQI__PIKEY_IX | 1 | | 1 (0)| 00:00:01 |
    Predicate Information (identified by operation id):
    2 - filter(SYS_XMLI_LOC_ISTEXT("SYS_P0"."LOCATOR","SYS_P0"."PATHID")=1)
    3 - access("SYS_P0"."RID"=:B1 AND "SYS_P0"."PATHID"=HEXTORAW('5E6C') )
    4 - filter(CAST( (SELECT "SYS"."STRAGG"("SYS_P2"."VALUE") FROM
    "WEB_STAGING"."SYS895336_TEST_WQI__PATH_TABLE" "SYS_P2" WHERE "SYS_P2"."PATHID"=HEXTORAW('5E6C') AND
    "SYS_P2"."RID"=:B1 AND SYS_XMLI_LOC_ISTEXT("SYS_P2"."LOCATOR","SYS_P2"."PATHID")=1) AS number)=22824)
    7 - filter(SYS_XMLI_LOC_ISTEXT("SYS_P2"."LOCATOR","SYS_P2"."PATHID")=1)
    8 - access("SYS_P2"."RID"=:B1 AND "SYS_P2"."PATHID"=HEXTORAW('5E6C') )
    Note
    - dynamic sampling used for this statement (level=2)
    - Unoptimized XML construct detected (enable XMLOptimizationCheck for more information)
    However, I also need to add an additional field to this index to allow appropriate queries against the data in the table, version number. This field can be seen from the statement above (VERSION) which runs OK and returns the data I’d expect. However, when I attempt to add this index using the following statement I get an error returned and the index becomes corrupted:
    alter index TEST_WQI_idx_1 rebuild
    parameters ('PATHS (INCLUDE ADD
    (/webPolicy/Sections/interopSection[1]/PolicyItems/item/Version))');
    After some investigation, the issue seems to revolve around the use of the [1] condition in the statement /interopSection[1]. I can create the index by removing the [1] condition, but this does not return the expected result. In actual fact, as there are 2 interopSection elements in the xml file, both with a version number of 1, the statement returns 11, which would appear to be the two version numbers concatenated together. I need to be able to reference the version number from the first interopSection in the queries against the table, and I need to be able to index this column correctly for performance issues.
    I'm unsure why this xpath statement is not working correctly in the alter index statement, but returns ok when used within the query against the table and was wondering if you would be able to help me to have a working index against this element.
    Thanks in advance for any help you can provide in relation to this.

    Sorry, here is the error:
    Error starting at line 20 in command:
    alter index TEST_WQI_idx_1 rebuild
    parameters ('PATHS (INCLUDE ADD
    (/webPolicy/Sections/interopSection[1]/PolicyItems/item/Version))')
    Error report:
    SQL Error: ORA-29858: error occurred in the execution of ODCIINDEXALTER routine
    ORA-64131: XMLIndex Metadata: failure during the looking up of the dictionary
    ORA-30968: invalid XPATH or NAMESPACE option for XML Index
    29858. 00000 - "error occurred in the execution of ODCIINDEXALTER routine"
    *Cause:    Failed to successfully execute the ODCIIndexAlter routine.
    *Action:   Check to see if the routine has been coded correctly.
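
The "11" result described in the question is what one would expect when a path without a positional predicate matches both interopSection elements and their text values are concatenated. The Python sketch below reproduces just that effect on a made-up document with the same element names. It uses the standard library's ElementTree, not Oracle XML DB, so it says nothing about why XMLIndex rejects the [1] predicate.

```python
import xml.etree.ElementTree as ET

# Made-up sample with two interopSection elements, both carrying Version 1.
SAMPLE = """
<webPolicy>
  <QuoteId>22824</QuoteId>
  <Sections>
    <interopSection><PolicyItems><item><Version>1</Version></item></PolicyItems></interopSection>
    <interopSection><PolicyItems><item><Version>1</Version></item></PolicyItems></interopSection>
  </Sections>
</webPolicy>
"""

def version_text(root, path):
    """Concatenate the text of every node matched by path, relative to the root element."""
    return "".join(node.text for node in root.findall(path))

root = ET.fromstring(SAMPLE)
# Without a predicate the path matches both sections.
all_sections = version_text(root, "Sections/interopSection/PolicyItems/item/Version")
# With [1] only the first interopSection is matched.
first_section = version_text(root, "Sections/interopSection[1]/PolicyItems/item/Version")
print(all_sections, first_section)
```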

  • Create a Macro to Index a Word Document Line by Line

    Background
    I have collected a bunch of keywords and references to where I can find these words in a textbook.  I've put them into a Word document where each line is one index entry, beginning with the “Main Entry” followed by a colon and then the “Subentry”.  Note that in the "Subentry" I have included my reference in parentheses (b1-m2).
    Example Original Text:
            Main Entry:Subentry (b1-m2)
    Example Marked for Index:
            Main Entry:Subentry (b1-m2){ XE "Main Entry:Subentry (b1-m2)" \t "" }
    When it was only a page of content it was no big deal to select the entire line and press <Alt>+<Shift>+<x> then <Enter> down the line.  Now that I have about 500 lines of these word combinations, I need a more automated solution.
    I have searched for KB articles that explain the various elements I need, though unsuccessfully since I don’t know what I need.  I doubt I am the first person to do this, so if anyone could point me to the right documents I would greatly appreciate it.
    As a noob, how difficult of a task is this to automate with a Macro or some other method and should I even attempt it? I have a very short window of time to figure this out.
    Nice to have: In the final index, I don't need the Word page numbering after the term.  My references are in the parentheses.  I know how to remove it manually in Word when I mark the index entry: choose Options, Cross-reference and remove the pre-populated text of "see". That adds  \t "" to the index reference.
    Illustrated as such:
            From: Main Entry:Subentry (b1-m2){ XE "Main Entry:Subentry (b1-m2)" }
            To:     Main Entry:Subentry (b1-m2){ XE "Main Entry:Subentry (b1-m2)" \t "" }
    My Attempt at Recording a Macro
    Having zero experience writing macros myself, I tried recording a simple macro using the manual keystrokes below; however, the text reflected in the actual index reference does not change.  I also have to manually kick off the macro on every line of text.
    I walked through each of the steps outlined below as I was recording a macro, however when I replay the macro, the index itself contains the exact same text for every line and does not match the original text on the new line.
    I could not get the macro to repeat itself on every line.  I had to keep running it until I was done (technically, when I realized it was repeating the same text within the index reference itself) but I’d like the macro to run from beginning to end, line by line, and then insert the Index itself at the end on a new page.
    I realize I need some kind of loop to keep the macro going line by line.
    I also need some way to mark the Main Entry within the loop (everything to the left of the colon) and then the Subentry (everything to the right of the colon to the end of the line). 
    Example “Index” Macro
    Sub Index()
    ' Index Macro
        Selection.EndKey Unit:=wdLine, Extend:=wdExtend
        ActiveWindow.ActivePane.View.ShowAll = True
        ActiveDocument.Indexes.MarkEntry Range:=Selection.Range, Entry:= _
            "TV\:The Good Wife (y2014-y2015)", EntryAutoText:= _
            "TV\:The Good Wife (y2014-y2015)", CrossReference:="", _
            CrossReferenceAutoText:="", BookmarkName:="", Bold:=False, Italic:=False
        Selection.MoveRight Unit:=wdCharacter, Count:=1
    End Sub
    Manually Marking Index Entries
    Manually, here are the keystrokes I use to iterate my way through the document.
    Manual Index Marking Keyboard Combinations:
    At the beginning of the first line, press <Shift> + <End>
    This selects the entire row's text
    Then press <Alt> + <Shift> + <x>
    This allows me to “Mark Index Entry”
    Then press <Enter>
    This confirms the Index entry
    Then press <Esc>
    This closes the “Mark Index Entry”
    Go to the next line and repeat.
    Replacing anchor
    Once the creation of each index is complete, I need to be able to iterate through the document and find all anchor + colons (IE: \: ) and replace with colon (IE: :). This way, the “Main entry” and “Subentry” are handled properly when the Index is inserted.
    Manual Anchor Replacement Keyboard Combinations:
    At the beginning of the Word document, press <Ctrl> + <h>
    Find what:      \:
    Replace with:   :
    Then <Alt> + <a>
    Press the "Ok" button (or make replace silent somehow)
    Then press <Esc>
    This should close the "Find and Replace" screen
    Inserting Index
    Ideally, I would like the macro to create and insert the newly marked content into an index at the end of the document.
    Manually Inserting Index Keyboard Combinations:
    Press <Ctrl> + <End>
    this takes us to the bottom of the document
    Then press <Alt> + <s>
    this chooses the "References" tab
    Next press <Alt> + <x>
    this chooses "Insert Index"
    Next press <Alt> + <t>
    This should allow you to choose a "Format" option for the index
    Next press <m>
    This should choose "Modern" from the "Formats" options
    Finally, press <Enter>
    End the macro
    Example before Indexing:
              TV:The Good Wife (y2014-y2015)
              TV:Phineas and Ferb (y2011)
              TV:Curb Your Enthusiasm (y2011-y2015)
              Game:Back to the Future (y2012)
              Made for TV Movie:The Magic 7(y2009)
              Main Entry:Subentry (b1-m2)
    Example after Indexing is completed:
    The marked up text/references did not transfer over properly from the Word document I copied my question from.  I had to manually type the text within the {} brackets for illustrative purposes here:
              TV:The Good Wife (y2014-y2015){ XE "TV:The Good Wife (y2014-y2015)" }
              TV:Phineas and Ferb (y2011){ XE "TV:Phineas and Ferb (y2011)" }
              TV:Curb Your Enthusiasm (y2011-y2015){ XE "TV:Curb Your Enthusiasm (y2011-y2015)" }
              Game:Back to the Future (y2012){ XE "Game:Back to the Future (y2012)" }
              Made for TV Movie:The Magic 7(y2009){ XE "Made for TV Movie:The Magic 7(y2009)" }
              Main Entry:Subentry (b1-m2){ XE "Main Entry:Subentry (b1-m2)" }
    Example Index 
    G
        Game
              Back to the Future (y2012) · 2
    M
        Made for TV Movie
              The Magic 7(y2009) · 2
        Main Entry
              Subentry (b1-m2) · 1, 2
    T
         TV
              Curb Your Enthusiasm (y2011-y2015) · 2
              Phineas and Ferb (y2011) · 2
              The Good Wife (y2014-y2015) · 2
    Chris Schurman
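
The loop described above (take each line, treat everything left of the first colon as the main entry and the rest as the subentry, then emit the XE field text) can be prototyped outside Word before committing to a macro. A minimal Python sketch, assuming the lines have been saved as plain text; the function name `xe_field_text` is made up here, and the output is only the field text, not a live Word field:

```python
def xe_field_text(line, suppress_page_number=True):
    """Build the XE field text for one 'Main Entry:Subentry' line."""
    # Split at the first colon only, so later colons stay in the subentry.
    main_entry, sep, subentry = line.strip().partition(":")
    if not sep or not main_entry or not subentry:
        raise ValueError(f"expected 'Main Entry:Subentry', got {line!r}")
    field = f'XE "{main_entry}:{subentry}"'
    if suppress_page_number:
        field += ' \\t ""'  # empty cross-reference text, as in the question's example
    return field

lines = [
    "TV:The Good Wife (y2014-y2015)",
    "Game:Back to the Future (y2012)",
]
for line in lines:
    print(line + "{ " + xe_field_text(line) + " }")
```

Lines without a colon raise an error instead of producing a malformed entry, which makes bad rows easy to spot in a 500-line list.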

    Once I combined my Excel knowledge and Word knowledge, this became a piece of cake.  Sharing my solution for anyone else who may have the need.  The point of this exercise is to prepare for an open book exam and I need a quick index of my books (there are 6 for this class).  Anyway, here is how I solved it (though slightly clunky, it works in seconds!):
    In Excel, I pieced the text together by concatenating the indexing markup and the contents of the pertinent cells as such:
        =CONCATENATE("XE """,A2,":",B2," (b",C2,"-p",D2,")"" \t """"")
    Content from Excel (with results of the CONCATENATE formula in the last column):
    Heading    Slide Title    Book    Page    Copy this into notepad then into word
    Game    Back to the Future    1    12    XE "Game:Back to the Future (b1-p12)"
    Made for TV Movie    The Magic 7    2    7    XE "Made for TV Movie:The Magic 7 (b2-p7)"
    Main Entry    Subentry    3    48    XE "Main Entry:Subentry (b3-p48)"
    TV    Curb Your Enthusiasm    4    100    XE "TV:Curb Your Enthusiasm (b4-p100)"
    TV    Phineas and Ferb    5    20    XE "TV:Phineas and Ferb (b5-p20)"
    TV    The Good Wife    6    35    XE "TV:The Good Wife (b6-p35)"
    Then I paste special the "Values" of the last column into Word.
    I run the macro below (haven't figured out how to loop yet) a few dozen times and insert the index at the bottom.
        Sub Index()
        ' Index Macro
            Selection.HomeKey Unit:=wdLine
            Selection.EndKey Unit:=wdLine, Extend:=wdExtend
            Selection.MoveLeft Unit:=wdCharacter, Count:=1, Extend:=wdExtend
            Selection.Fields.Add Range:=Selection.Range, Type:=wdFieldEmpty, _
                PreserveFormatting:=False
            Selection.EndKey Unit:=wdLine
            Selection.MoveRight Unit:=wdCharacter, Count:=1
        End Sub
    Bam; instant Index!
    Chris Schurman
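
For anyone who would rather generate the field text outside Excel, the CONCATENATE formula above can be mirrored in a few lines of Python. This is a sketch only: the function name `xe_from_row` and the sample rows are illustrative. Note that the formula as written also appends \t "" to each result, even though the pasted table does not show it.

```python
def xe_from_row(heading, title, book, page):
    """Mirror the spreadsheet formula for one row of heading, title, book and page."""
    return f'XE "{heading}:{title} (b{book}-p{page})" \\t ""'

# Two rows taken from the table in the post.
rows = [
    ("Game", "Back to the Future", 1, 12),
    ("TV", "The Good Wife", 6, 35),
]
for row in rows:
    print(xe_from_row(*row))
```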
