Ultra Search Indexer: Adding 'alien' document types.

The way the Ultra Search indexer finds source material will not work in my situation. I may be able to give it databases to crawl, but it cannot crawl our content. The usual way of telling the indexer about 'alien' document types (adding custom code that returns lists of URLs so the indexer can read the source documents) therefore won't work in my scenario.
What does the Ultra Search application do that is special when indexing documents?
Is there a description of it, so that I can reproduce it using Oracle Text and perhaps point the Ultra Search querying component at my manufactured repository and have it work?
Thanks.

Is there a way to set up a Finder search with additional criteria so that it isolates files with the extensions .docx, .pdf, and .txt, all in one single search?
Currently, "kind is document" also brings up .jpg and .wav files, which I don't want (or consider documents).
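
Outside Finder, the same extension filter can be approximated with a short script. The following is only a minimal sketch, assuming Python is available; `find_documents` and `WANTED_EXTENSIONS` are hypothetical names chosen for illustration and are not Finder features.

```python
import os

# Extensions treated as "documents" in this sketch (an assumption; adjust as needed).
WANTED_EXTENSIONS = {".docx", ".pdf", ".txt"}


def find_documents(root, extensions=WANTED_EXTENSIONS):
    """Return the sorted paths under `root` whose extension is in `extensions`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # splitext keeps the leading dot; lower() lets ".PDF" match ".pdf".
            if os.path.splitext(name)[1].lower() in extensions:
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Calling `find_documents(os.path.expanduser("~/Documents"))` would list only the three wanted file types and skip images and audio files.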

Similar Messages

  • TREX – Indexed information by document type

    Hi,
    Where can I find documentation about which properties TREX indexes for each document type?
    For example: 
    - Each Word document has the following properties (Title, Subject, Author, Manager, Company, Category, Comments, Keywords, …). PDF documents have similar properties. Is this information indexed by TREX?
    My question relates to AutoCAD documents. These documents contain legend information, and I need to know whether this information is indexed by TREX and can be used in searches.
    Thanks and regards,
    John

    Hi,
    Check this thread:
    https://www.sdn.sap.com/irj/sdn/thread?threadID=140959
    Greetings,
    Praveen Gudapati
    p.s. Points are always welcome for helpful answers

  • Trex is not searching texts in any document types other than PDF.

    Dear All
    We are implementing DMS in ECC 6.0 and have configured TREX 7.0 text search in the ABAP stack. When searching through transaction CV04N in the SAP system, TREX does not search the text in .dwg (AutoCAD) or .doc (Word) files.
    It searches only PDF files.
    System Details:
    Server ECC 6.0
    SAP_BASIS - SAPKB70010
    SAP_ABA - SAPKA70010
    SAP_APPL - SAPKH60007
    EA-APPL - SAPKGPAD07
    Error Message:
    We have added the MIME types for full-text search (application/acad and application/doc) in the SAP system under SPRO -> Cross-Application Components -> Document Management -> General Data -> Settings for Storage Systems -> Maintain Storage System, and also in the usr\sap\<SID>\TRX00\Trex\TREXValidMimeTypes.ini file on the TREX server.
    After adding them, we restarted the TREX server, re-indexed in the SAP system, and tried again, but it still does not search the text in AutoCAD files.
    Kindly help us solve this issue.
    Regards
    Harshavardhan.G
    Mob: - 91 99130 88039

    Hi Harshavardhan,
    Could you please create an OSS ticket for BC-TRX and attach an example DWG document to it? Please also check whether the includehidden parameter (TREXFilter.ini) is set to true.
    Best regards,
    Mikhail

  • Can Secure Enterprise Search index Open Office documents?

    Hi, I'm wondering if Secure Enterprise Search can index Open Office documents, and if not, is there a planned release where this will be supported?
    Thanks!
    Dan

    The current release - SES 11g (not yet on Windows) - can work with Open Office files.

  • Adding field 'Document Type' to Cash Flow Statement form in FSI5

    Dear SAP Experts
    I am developing a 'Cash Flow Statement' using FSI5/FSI3 functionality. I need to have a field 'Document Type' in the list of characteristics.
    Can any of you tell me how to add this field?
    An urgent reply is highly appreciated.
    Thanks
    Syed Zia Abbas

    Hi Abbas,
    I have a few comments which should help to solve your
    problem. Please take a look at note 43661, which describes the
    creation of new forms and reports.
    Following are the steps to create the form and report:
    FSI4 => Form type: Financial Statement Key Figures => Copy
                 Form: Name and Text for the new form
                 Copy from: 0SAPRATIO-04
    FSI5 => find and double click the new form just created => Edit Gen.
    data selection => change the FSV to what you want to use
    FSI1 => Report type: Financial Statement Key Figures  => Create
                 Report: Name and Text for the new report
                 With form: Name of the new form (for details, please refer to standard SAP report 0SAPRATIO-04)
    FSI3 => find and execute the new report to check if you can get the
    result you want.
    Changing the financial statement version in the general data selections of
    your new form is not enough. You also have to check each line of the new form
    and replace the financial statement items with items from your own financial statement structure.
    - The forms mentioned contain restrictions, for example via financial
      statement items in rows and years and/or periods in columns. But if
      you execute the reports, you get no results, because the forms
      are just example templates for items of financial statement version
      (FSV) INT, which is itself just a template.
      To be able to get requested results, you have to create your own
      forms (and reports based on your forms) with restriction via your
      own financial statement version (in general data selection) and the
      financial statement items corresponding to your FSV (in particular
      rows). Otherwise, you cannot get any results, as there are no values
      for the template FSV INT and items from the INT.
      But it is up to you, how you arrange the form definition. It is
      closely related to your own FSV definition.
    I hope you find this information useful.
    Jose Luis Carbajo

  • Search index attached to document

    Hello. I have a collection of documents and I've created a catalog index (.pdx) of them. I've attached it to the main document from the Advanced tab of Document Properties. Everything works correctly so far. But when I try to move the folder of documents somewhere else, it still tries to find the attached index in the old location. How do you get it to look for the index at its relative location?
    Thanks in advance!

    Thank you for your reply. The index is moving with the other files. I'll give a little more information:
    The index is in a subfolder, so it is moved with the rest of the files. But Acrobat continues to look in the original location where the index was attached. What's interesting is that the index itself works fine if I manually select it during a search. But when I open the main document that it's attached to, Adobe Reader first displays the normal warning that the document is trying to access another file, and that warning shows the old path. When you click OK on the warning, it then says it can't find the file or that the file is corrupt. Adobe Acrobat, by contrast, doesn't display any warning; it just doesn't load the index.
    I'm using Acrobat 11 by the way.

  • Getting SES to index unknown document types in UCM?

    Hi
    We have an SES set up to crawl/index an Oracle UCM, and when configuring the crawler source we can define which document types it should crawl/index. It makes sense that SES only knows how to index the content of certain known document types, but why can SES not index a document of any type without looking inside the actual document? After all, it is possible to upload any document type to UCM and give it UCM-specific metadata such as a title, so it should be easy for SES to index this UCM metadata for unknown document types as well.
    How can I get SES to crawl/index all unknown document types?
    Thank you
    Søren

    I've checked with our UCM connector expert, and he says all document types are passed from UCM to SES.
    So hopefully it should be just a matter of editing the crawler.dat file (found in $ORACLE_HOME/search/data/config)
    You would need to add a MIMEINCLUDE line for the mimetype of the documents you want included - if you're not sure what the mimetype is for any document, you can usually see it in the crawler log file.
    You'd also need to check that the document suffix is not in the list:
    # default file name suffix exclusion list
    RX_BOUNDARY (?i:(?:\.jar)|(?:\.bmp)|(?:\.war)|(?:\.ear)|(?:\.mpg)|(?:\.wmv)|(?:\.mpeg)|(?:\.scm)|(?:\.iso)|(?:\.dmp)|(?:\.dll)|(?:\.cab)|(?:\.so)|(?:\.avi)|(?:\.wav)|(?:\.mp3)|(?:\.wma)|(?:\.bin)|(?:\.exe)|(?:\.iso)|(?:\.tar)|(?:\.png))$
    You can edit this list to remove any suffixes that you do want included.
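
To check quickly whether a given file name would be caught by the suffix exclusion list, the pattern can be tried out in a small script. This is only a sketch: it assumes that Python's `re` module interprets the expression the same way the crawler's own regex engine does (which holds for this simple pattern, but is not confirmed by the crawler documentation), and `is_excluded` is a hypothetical helper name.

```python
import re

# Suffix exclusion pattern copied from the crawler.dat excerpt above.
# (?i:...) makes the alternatives case-insensitive; $ anchors them at the end.
SUFFIX_EXCLUSION = re.compile(
    r"(?i:(?:\.jar)|(?:\.bmp)|(?:\.war)|(?:\.ear)|(?:\.mpg)|(?:\.wmv)|(?:\.mpeg)"
    r"|(?:\.scm)|(?:\.iso)|(?:\.dmp)|(?:\.dll)|(?:\.cab)|(?:\.so)|(?:\.avi)"
    r"|(?:\.wav)|(?:\.mp3)|(?:\.wma)|(?:\.bin)|(?:\.exe)|(?:\.iso)|(?:\.tar)"
    r"|(?:\.png))$"
)


def is_excluded(name):
    """Return True if the file name ends with one of the excluded suffixes."""
    return SUFFIX_EXCLUSION.search(name) is not None
```

A name such as `plan.dwg` is not matched by the list, so if such files are still skipped, the MIMEINCLUDE setting is the more likely place to look.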

  • Ultra Search/ Oracle Text capabilities

    Our decision to go forward with Oracle9i is contingent upon the extensible use of Ultra Search and Oracle Text in our planned endeavors.
    Basically we are to build a system to do the following:
    1) download information (html files, links, documents) from web sites and accessible disk archives. The url sites are particular to a domain.
    2) place the downloaded file information into our Oracle database or download to local system with appropriate links in database.
    3) perform queries on the downloaded information through the database to isolate files for analysis.
    4) analyze and perform extraction on the information. For example, query based on a defined hierarchy of vulnerability terms.
    I've demoed Ultra Search and Oracle Text. I believe that Ultra Search can handle step 1, and possibly step 2 and that Oracle Text can help in step 4. Step 3 is satisfied by the Oracle database.
    I need to know details concerning Ultra Search and Oracle Text before committing:
    o When Ultra Search performs its crawling, how is the information it finds represented in the database? Is a whole HTML file or document downloaded, or are references to these documents stored in the database? If references are stored, does Ultra Search include the capability to download these files for analysis?
    o Is Oracle Text the right tool to provide robust analysis of downloaded documents?
    o I have used the sample JSP that came with Ultra Search. Are there any more detailed examples that cover my steps above, in particular robust analysis of the documents downloaded in step 1?
    We have explored, and are still exploring, other COTS products to find a solution. Our main goal in this phase of the project is to have the retrieved documents and analysis information resident in the database. We find that other COTS products can perform the web crawling but lack analysis, or vice versa, and that their solutions are so vendor-specific that in many cases the vendor's services would be required to build a suitable solution, which would not be very extensible.
    Thanks for any feedback.

    Ultra Search does not keep documents in the database permanently. We bring them in for indexing purposes, but remove them after
    the indexing is completed. However, we keep the URLs of each unique document that was found during the crawling. You would
    have to do the downloading yourself. However, we are thinking about providing a mechanism, maybe in the form of an API, that
    would allow customers to retrieve documents. Please contact me on this issue if you are interested to discuss this: (650)-506-8173.
    Generally speaking you will find that Oracle Text is a very powerful tool for analysis of textual documents, especially since it is
    driven through the SQL language, has extensive functionality (themes, user-defined knowledge base, thesaurus, and many useful linguistic
    functions like segmentation, stemming, and globalisation support).
    The philosophy of Ultra Search is to provide you with an out-of-the-box solution for crawling and searching your data without the
    need for programming. Ultra Search is built on top of Text, so I would advise you to use Text to do the further analysis of your
    documents after they have been located by the crawler.
    Best Regards,
    Stefan Buchta
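
The "do the downloading yourself" step mentioned in the reply above could look roughly like the following. This is a minimal sketch using only the Python standard library; it assumes the crawled URLs have already been exported to a plain list (how to obtain that list from Ultra Search is not covered here), and `download_documents` is a hypothetical helper name.

```python
import os
import urllib.request
from urllib.parse import urlparse


def download_documents(urls, dest_dir):
    """Fetch each URL and save it under dest_dir; return {url: local_path}."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = {}
    for i, url in enumerate(urls):
        # Derive a file name from the URL path; fall back for directory-style URLs.
        name = os.path.basename(urlparse(url).path) or "index.html"
        # Prefix with a counter so equal names from different sites do not collide.
        local_path = os.path.join(dest_dir, f"{i:05d}_{name}")
        with urllib.request.urlopen(url) as resp, open(local_path, "wb") as out:
            out.write(resp.read())
        saved[url] = local_path
    return saved
```

The returned mapping could then be loaded into a database table next to the documents, so that Oracle Text can index the stored copies. Error handling, retries, and politeness delays are deliberately left out of the sketch.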

  • Document Type for invoice reduction

    Good day. The SAP Library mentions invoice reduction due to price variance. We use transaction MIRO for invoice verification and followed the process described in the SAP Library, but when posting, an error occurred saying that no separate document type exists for invoice reduction. When adding the document type for invoice reduction in transaction OMR4, I used KG (credit memo) and also tried KA, but these postings (credit notes) go to the Goods Received/Invoice Received account. Is it possible for the credit memos created by the system to be posted against the vendor?

    You do not need a separate document type for invoice reduction.
    What you need to do is select the invoice reduction variant from the layout in MIRO.
    The system will generate two FI documents after posting in MIRO:
    1. the invoice
    2. the credit memo

  • Batch option to automatically place the search index path

    HI
    My company has thousands of PDFs that we deliver to customers with search capabilities across all documents. We have created the catalog index for these documents, and in the past we used a plugin called Options to batch-set the Search Index path in Document Properties. This was a lot easier and less time-consuming than opening every single PDF and manually entering the path. Before Options, it took me well over a day to do this for every document we deliver. The plugin reduced my time to less than an hour. Very valuable when in a delivery crunch!
    But since upgrading to Acrobat X Pro which we need for our Office 2010 upgrade, we have been unable to use the plugin. I am looking for a new solution that can run a process to set this option across all documents. We are not a bunch of people who can go write scripts etc to do this so something off the shelf/ready made would be great.
    Any recommendations?

    "... manually inputting the path."
    That is one method.
    While not on your scale I maintain a large "eLibrary" of PDF document collections.
    Also provide OSM for distribution of some of the collections.
    Each topic has a cataloged index. Each sub-topic has a cataloged index.
    For a topic / sub-topic a PDF that the user will land on has the path to its PDX.
    The PDF opens, the PDX is mounted, advanced search is available for the respective collection.
    The PDX stays 'mounted' until the end-user moves on.
    Also, PDXs can be selected from the Search dialog/pane.
    The PDX and its associated folder of index files is not in the folder that holds PDF(s) which are periodically updated.
    So, update a collection, rebuild the index.
    Good to go.
    Not as quick as the plug-in but less mind-numbing as the "each PDF manually".
    Be well...

  • Document type: Z5

    Hi,
    We have an AR interface program which takes information for different documents from a flat file and posts it in SAP through FB01 using BDC sessions. We have been using document types Z1, Z2, Z3 and Z4, and recently added a new document type, Z5. We can post Z1, Z2, Z3 and Z4 with a duplicate reference document number, but not Z5. Any idea why Z5 does not allow duplicate reference document numbers? The AR interface program does not do any validation; it just passes the information for each document to SAP.
    Thanks for the help,
    Sobhan.

    Hi Sobhan
    Does it allow you to post a document of type 'Z5' manually, without using the interface program?
    Regards
    MD

  • Adding mimetype to document types in Web Access

    I need more mimetypes than are listed under document types in Web Access.
    I have tried adding a mimetype directly to the wks$mimetype table, but I get Java errors when I go into the Usearch Admin web site. Is there a set way to add these to the Not Processed list in Web Access, so that I can then add them to the Processed list?

    Can you tell us which mimetype is missing from the list?
    Is it a proprietary mimetype? Ultra Search does not support adding new
    mimetypes, as it considers the list complete (it comes from IANA, the Internet Assigned Numbers Authority).
    Every new mimetype needs a message id that points to the description of the mimetype. Since
    users cannot add new messages, you will have to use an arbitrary id outside the range 26000-26034;
    you might want to try id 80189 "Document type", e.g.,
    insert into wk$mimetypes(descriptor,media_type,extensions, id, msg_id)
    values ('abc document','application/abc','abc',wk$mimetypes_seq.nextval, 80189);
    Another workaround is to add the new mimetype directly to the ultrasearch/data/config/crawler.dat file. In that case every
    data source will pick up documents of this mimetype, e.g.,
    MIMEINCLUDE application/abc
    Keep in mind that operations like these are not supported by Ultra Search.
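
If the crawler.dat route is taken, the edit can be scripted so that the line is not added twice. The following is a minimal sketch, assuming the file is plain text with one directive per line and UTF-8 compatible encoding (neither is confirmed here); `ensure_mimeinclude` is a hypothetical helper, and the file should be backed up before it is modified.

```python
def ensure_mimeinclude(config_path, mimetype):
    """Append 'MIMEINCLUDE <mimetype>' unless it is already present.

    Returns True if a line was added, False if it was already there.
    """
    wanted = f"MIMEINCLUDE {mimetype}"
    with open(config_path, "r", encoding="utf-8") as f:
        content = f.read()
    # Compare whole lines so that e.g. application/abcd does not count as a match.
    if any(line.strip() == wanted for line in content.splitlines()):
        return False
    with open(config_path, "a", encoding="utf-8") as f:
        # Make sure the new directive starts on its own line.
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(wanted + "\n")
    return True
```

For the example above, the call would be `ensure_mimeinclude("ultrasearch/data/config/crawler.dat", "application/abc")`, with the path adjusted to the actual installation.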

  • Embedded Search Index AND Document Security?

    I'm using Adobe Acrobat Standard 8.1.7.
    It appears that I cannot have both an embedded search index and restricted security (e.g., password required to change document) on the same document.
    Why is that?
    If I start with security ON and then attempt to embed a search index, I get below error message:
    A search index can not be embedded in this document because this document has restricted security permissions.
    If I start with security OFF, successfully embed a search index, and then secure the document, Acrobat "strips off" the previously embedded search index.  No warning message; no feedback to end-user; just kills it!
    Why are those two functions mutually exclusive?  Anyone know of a work-around?
    Thank you in advance!

    Hi,
    As to "why", that might be floating out there in Adobe's devnet space or in one of the blogs maintained by Adobe's devnet crew.
    Also good to know about use of embedded index - if used, cannot apply fast web view to the PDF. It is one or the other, but not both.
    Work around? I've not come across one; but, that does not mean something isn't "out there" <g>.
    Be well...

  • How to get the document type, name and revision from the search page cv04n

    Hello,
    After performing cv04n and getting a list of documents satisfying the search criteria, how can I then get the document type, name and revision of the selected document using ABAP?
    Thanks

    HI,
    In table DRAW you have the document type, version and document number.
    In table DRAT you also get the document type, version, number and description of the document.
    Regards
    SAB

  • Document TYPE DA is adding 180 days to payment terms in fbl5n

    Hi,
    In transaction FBL5N, for document type DA the system adds 180 days plus the payment term days when calculating the net due date. We want only the payment term days to be added for document type DA, without the extra 180 days.
    How can we do that?
    Regards,
    Praveen

    Hi Praveen,
    For the time being, you can edit the baseline date for this document and adjust it to your requirement.
    Secondly, double-click the customer line item in question and check which payment term is used in it.
    Then, in OBB8, choose that payment term and remove the value in the additional months field.
    That field most likely contains the value "6".
    Save, and move the transport request to your production environment.
    From then on, this will work according to your requirement.
    Regards,
    Srinu
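
The effect described in the reply above can be illustrated with simple date arithmetic. This is only a sketch under a stated assumption: that the additional months from the payment term shift the baseline date first and the payment-term days are added afterwards. It does not reproduce SAP's actual calculation, and `add_months` and `net_due_date` are hypothetical helper names.

```python
import calendar
from datetime import date, timedelta


def add_months(d, months):
    """Shift d by whole calendar months, clamping the day to the month's end."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def net_due_date(baseline, payment_days, additional_months=0):
    # Assumption: additional months are applied to the baseline date first,
    # then the payment-term days are added on top.
    return add_months(baseline, additional_months) + timedelta(days=payment_days)


baseline = date(2024, 1, 15)
print(net_due_date(baseline, 30, additional_months=6))  # 2024-08-14
print(net_due_date(baseline, 30))                       # 2024-02-14
```

With a 30-day payment term, six additional months push the due date out by roughly 180 days, which matches the behaviour described in the question. Removing the value leaves only the payment-term days.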
