Remote content crawler on a file directory in a different subnet

I'm trying to crawl a file directory that is on our company network but in a different subnet. It seems to be set up correctly, because I have managed to import most of the documents to the knowledge directory. However, when running the job a few times, sometimes it succeeds and sometimes it fails, without consistency. The main thing I notice is that it doesn't import the larger files (>5 MB), but our maximum allowed is 100 MB. Even when the job runs "successfully" there is a message in the job log:
Feb 21, 2006 12:08:14 PM- com.plumtree.openfoundation.util.XPNullPointerException: Error in function PTDataSource.ImportDocumentEx (vDocumentLocationBagAsXML == <?xml version="1.0" encoding="ucs-2"?><PTBAG V="1.1" xml:space="preserve"><S N="PTC_DOC_ID">s2dC33967209AEE4710C5ED073C04B3EDCF_1.pdf</S><I N="PTC_DTM_SECT">1000</I><I N="PTC_PBAGFORMAT">2000</I><S N="PTC_UNIQUE">\\10.105.1.33\digitaldocs\s2dC33967209AEE4710C5ED073C04B3EDCF_1.pdf</S><S N="PTC_CDLANG"></S><S N="PTC_FOLDER_NAME">s2dC33967209AEE4710C5ED073C04B3EDCF_1.pdf</S></PTBAG>, pDocumentType == com.plumtree.server.impl.directory.PTDocumentType@285d14, pCard == com.plumtree.server.impl.directory.PTCard@1f6ef01, bSummarize == false, pProvider == [email protected]4)ImportDocumentExfailed for document "s2dC33967209AEE4710C5ED073C04B3EDCF_1.pdf"
When the job fails, there is a different message:
*** Job Operation #1 failed: Crawl has timed out (exception java.lang.Exception: Too many empty batches.)(282610)
I tried increasing the timeout periods for the crawler web service and the crawler job, but that didn't seem to help. Any suggestions?

Hi Dave,
Did you fix this issue? I'm having the same error.
Thanks!

Similar Messages

  • Windows File CS Content Crawler - How to change crawl method?

    Hi,
    I have encountered an odd issue. I am using a Windows File CS Content Crawler to pull in some PDF files from a remote folder.
    However I noticed the Windows File CS Content Crawler is crawling in the document title rather than the actual PDF filename.
    I have checked the admin screens and I cannot find any way to tell my Windows File CS Content Crawler portlet to grab filenames and use these as the file display name when crawled in.
    The business clients do not use the document title or metadata features and will not use them in the future, so the way the current Windows File CS Content Crawler portlet crawls documents into the portal will not work: most documents will have no document title, or only some default internal name.
    For example, the PDF documents I am crawling in have Document Titles that have no relevance to the document filename.
    We are only interested in the filename and not the document title (however, we may use the field in the future for meta information). Apart from doing some development work to build a custom crawler, is there any other way to change the behaviour of the Windows File CS Content Crawler?
    I had to go into each document's properties in the portal and modify the Name field to the correct filename, as it was using the (incorrect) value found in the Document Title field.
    This seems wrong to me. I can understand using the metadata contained in files to drive the crawling and make searching better, but how do you do that with a PDF file that you cannot search against inside the portal anyway?

    You can accomplish this by changing the Global Document Property Map.
    I don't have the specific screen in front of me so this is from memory.
    Change the Name property to File name
    Change the Title property to File name
    You might have to experiment a bit with some test documents.

  • Windows File CS Content Crawler - Permission Issues

    Hi,
    Having a strange problem with a Windows File CS Content Crawler.
    We have setup a shared directory on a server (on the same subnet/network as the portal server) where users drop their files (Word, PDF, Excel etc).
    We have a Content Crawler job that crawls the files into Knowledge Directory, we then display the documents in a portal on one of our communities.
    When an administrator level user clicks on the document link in the portal, the document is opened correctly i.e. you get a prompt to open or save. When a normal non-admin user clicks on the document link they receive the following error:
    Error - Gateway was not able to access requested content. If the error persists, contact your portal administrator.
    When I reviewed PTSpy I found the following message:
    "user xxxxx does not have access to object id=232......."
    Has anyone experienced the above and found out how to resolve it?
    Thanks in advance,
    Tahir

    This is a permission problem, most likely on the data source that the crawler is using. Open the data source you used to create the crawler and see if its ID is 232. If it is, then you need to add whatever users/groups that should have access to the documents to the security ACL for the data source object. That will fix the problem.
    Does the error also say what class id the object is? That would let you determine exactly which object is causing the problem if it is not the data source.
    DJ Dewey | VHA Inc. | [email protected]

  • How to append a file directory and its contents in a jTextArea

    How will I display the contents of a file and a file directory in a JTextArea? Let's say it should display the content of "c:/AutoArchive2007-05-24.csv" in a JTextArea.
    Say the file "c:/AutoArchive2007-05-24.csv" contains the following details:
    user:jeff AutoArchive-Process Started at Thu May 24 15:41:54 GMT+08:00 2007
    and ended at Thu May 24 15:41:54 GMT+08:00 2007
    Message was edited by:
    ryshi1264

    Use the append(...) method to append data to the text area.
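    To make that concrete, here is a minimal sketch of reading a file line by line and appending it to a JTextArea (the class and method names are placeholders, not part of any API):

    ```java
    import javax.swing.JTextArea;
    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    public class LogViewer {
        // Read the file line by line and append each line to the text area.
        static void appendFile(JTextArea area, String path) throws IOException {
            BufferedReader in = new BufferedReader(new FileReader(path));
            String line;
            while ((line = in.readLine()) != null) {
                area.append(line + "\n"); // append(...) adds text at the end
            }
            in.close();
        }
    }
    ```

    Calling appendFile(area, "c:/AutoArchive2007-05-24.csv") would then show the archive log from the question in the text area.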

  • I use FTP to try to view a directory on a remote server that contains 19000 files, only 10000 are showing. Please help.

    I use FTP to try to view a directory on a remote server that contains 19000 files, but only 10000 are showing. Please help.

    Hi Steve,
    You may be right that DISM doesn't support the scenario where the user profile and the system are on separate volumes.
    As a workaround, we should use the defaults to keep the profile on C: while capturing the image, and then set the profile path to the D: drive after deploying the image.
    Please refer to the KB article below for how to use Sysprep to redirect profile locations.
    Customize the default local user profile when preparing an image of Windows
    http://support.microsoft.com/kb/973289/en-hk
    Hope this helps.
    Kate Li
    TechNet Community Support

  • System Profiler Content File Directory Location?

    Hello,
    I recently changed out the microprocessor in my PowerBook G3 because my old one was not working properly. All works fine now; it is smoothly running 10.4.11. However, when I look in System Profiler, the serial number has changed to the new microprocessor's and does not match the sticker on the bottom of the machine. I managed to modify the "About This Mac" and "Login Window" serial number just fine, but I am stumped regarding System Profiler. I did research and found that in 10.5 you can manually input the serial by finding an "SMBIOS"-type file in the Extensions folder of System>Library. My question is:
    Where can I find the system files that give System Profiler its information? Or where can I find the 10.4 equivalent of 10.5's SMBIOS system files? I appreciate any advice. Please indicate the entire file path so I can easily find the files I need to modify. Have a nice day.

    I suspect that System Profiler gets the serial number either by an internal call to the I/O registry which in turn has gotten it from the logic board firmware, or else directly from the firmware itself. If either of these mechanisms is correct, then there are NO files that carry the serial number, and therefore I think there is probably nothing you can change. This may be incorrect, but so far I've seen no evidence of a datafile that has this information.
    By way of background, here is my understanding. Hopefully I will be corrected if I am wrong:
    A file is a particular type of data structure, a collection of bytes assembled by the operating system in a specific way. Files usually reside on physical storage devices such as disk drives, and persist even after the computer is turned off. A user with sufficient privileges can generally access the information stored in a file.
    There are, however, other types of OS data structures in the "kernel," below the level of the user interface. These structures are not files - it is not that they are "hidden files" or "system files" - they are not files at all. These data structures exist in a protected space that users cannot alter.
    ioreg -l | grep IOPlatformSerialNumber | awk '{ print $4 }'
    I have no idea exactly "where" this piece of information is coming from, meaning the serial number that appears. It would be greatly appreciated if you could help me track down exactly where on the system the terminal command here is pulling this info from. It clearly gets it from some place, but it just may not be something we can see on the system as an object.
    According to the link I posted earlier, the ioreg terminal command directly displays (but cannot alter) the I/O registry. This data structure is not a file, and does not exist in the filesystem that you access via the GUI or via Terminal filesystem commands. Even worse, the link said:
    the Registry is not stored on disk or archived between boots. Instead, it is built at each system boot and resides in memory
    What this means is that the I/O Registry resides only in RAM - it is completely destroyed when you turn off the computer, and gets rebuilt when you start back up. And again, it is not a file.
    That's why I said "This doesn't sound promising"!
    there has to be a way to override the "connection" between the dynamic database itself and how it obtains information from it.
    If System Profiler.app does either read the I/O registry or read the firmware directly, then this instruction may be hardcoded in the app itself. I don't know this, but if so then I don't think there is any way to get it to look "somewhere else."
    Good luck, but I think this project is going to be "The Impossible Dream".

  • Unable to view the contents in NQS Config file

    Hi Forum,
    After I successfully installed the OBI EE 10.1.3 application, I am unable to view the contents of the NQS Config file. It opens in Notepad with an error: "The filename, directory name or volume label syntax is incorrect."
    So please guide me in resolving this issue.
    Regards
    Cool j

    You probably have restricted access to the file. Check whether the drive is NTFS. If the OS user is in the Administrators group, you can grant permissions in the file's properties.

  • Errors in Ultra Search crawler's log file

    Hi all,
    I'm trying to configure Ultra Search to do advanced search on a set of attributes of my Portal items.
    After executing the Synchronization Schedule in the Ultra Search Administration Tool, I found this in the Crawler Progress Summary:
    Documents to Fetch 0
    Documents Fetched 284
    Document Fetch Failures 0
    Documents Rejected 0
    Documents Discovered 284
    Documents Indexed 59
    Documents non-indexable 225
    Document Conversion Failures 0
    The number of non-indexable documents is 225, while only 59 documents were indexed. Then I looked into the crawler's log file and found that almost all the processes that tried to index Portal items got errors. The errors look like:
    http://winas10g.tinhvan.com/pls/portal/PORTAL.wwsbr_srchxml.execute?p_action=generate_item&p_thingid=276935&p_siteid=753&p_result_lang=en: Portal server returned an error message.
    Documents to process = 0
    Error message returned from Portal server is: User-Defined Exception.
    The successful processed documents look like:
    Documents to process = 186
    Processing http://winas10g.tinhvan.com/pls/portal/PORTAL.wwsbr_srchxml.execute?p_action=generate_item&p_thingid=276386&p_siteid=753
    Total documents successfully processed = 86
    Documents to process = 185
    Then, when I performed a search, I could only find the text that appears directly on Portal pages with the URL format http://host:port/pls/portal/url/PAGE/pagegroup_name/page_name.
    Is there an error in my Ultra Search configuration? How can I index all the Portal items along with their attributes?
    Thanks,
    Vietdt

    Liya
    The log file for the crawler schedule should be explicitly listed in the information for the schedule. Please check the information for that schedule.
    This log file will be located in the log directory that you've specified in the "crawler" tab.
    Please locate the log file and let us know what you see in that.
    thanks
    edward

  • File Adapter: file is not picked up from file directory

    Hello,
    Issue: the file sender adapter is not picking up the file from the source file directory.
    Scenario: FTP1>PI>FTP2
    1. SAP PI sender communication channel will pick the file from FTP1 and process to PI file server (NFS)
    2. from PI file server (NFS), file will process again to FTP2
    While picking up the file, it shows the following error:
    Could not process due to error: java.lang.IllegalStateException: Error during RETR epilogue: com.sap.aii.adapter.file.ftp.FTPEx: 451 Transfer aborted. Broken pipe
    Conversion of file content to XML failed at position 0: sun.io.MalformedInputException
    Processing started
    This issue occurs after we migrated the PI server from HP-UX to AIX.
    Best Regards,SARAN

    Hi,
    Check the FCC parameters. The fieldFixedLengths option is used when the exact length of each field is constant in the sender; in the FCC, the fieldFixedLengths parameter carries the length of each field.
    Check out Michal's blog on content conversion for all your doubts.
    Kindly check the below links for further assistance.
    http://help.sap.com/saphelp_nw04/helpdata/en/2c/181077dd7d6b4ea6a8029b20bf7e55/frameset.htm
    regards,
    ganesh.

  • How to write the JTables Content into the CSV File.

    Hi Friends
    I managed to write the database records into a CSV file. Now I would like to add the JTable content to the CSV file as well.
    Below is the code I used to write the database records to the CSV file.
    void exportApi() throws Exception {
         try {
              PrintWriter writing = new PrintWriter(new FileWriter("Report.csv"));
              System.out.println("Connected");
              stexport = conn.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_UPDATABLE);
              rsexport = stexport.executeQuery("Select * from IssuedBook");
              ResultSetMetaData md = rsexport.getMetaData();
              int columns = md.getColumnCount();
              String fieldNames[] = {"No", "Name", "Author", "Date", "Id", "Issued", "Return"};
              // write field names
              String rec = "";
              for (int i = 0; i < fieldNames.length; i++) {
                   rec += '\"' + fieldNames[i] + '\"';
                   rec += ",";
              }
              if (rec.endsWith(",")) rec = rec.substring(0, rec.length() - 1);
              writing.println(rec);
              // write values from result set to file
              rsexport.beforeFirst();
              while (rsexport.next()) {
                   rec = "";
                   for (int i = 1; i < columns + 1; i++) {
                        try {
                             rec += "\"" + rsexport.getString(i) + "\",";
                        } catch (SQLException sqle) {
                             System.out.println("Exception in retrieval in for loop:\n" + sqle);
                        }
                   }
                   if (rec.endsWith(",")) rec = rec.substring(0, rec.length() - 1);
                   writing.println(rec);
              }
              writing.close();
         } catch (Exception e) {
              e.printStackTrace();
         }
    }
    With this same code, how do I write the JTable content into the CSV file?
    Please tell me how to implement this.
    Thank you for your Service
    Jofin

    Hi Friends
    I just modified my code and tried it according to your suggestion, but it does not print the records inside the CSV file. When I use a ResultSet, it prints the records into the CSV. Now I want to export only the JTable content.
    I am posting my code here. Please run this code and find the Report.csv file in your current directory, and please help me get past this problem.
    import javax.swing.*;
    import java.util.*;
    import java.io.*;
    import java.awt.*;
    import java.awt.event.*;
    import javax.swing.table.*;
    public class Exporting extends JDialog implements ActionListener {
         private JRadioButton rby, rbn, rbr, rbnore, rbnorest;
         private ButtonGroup bg;
         private JPanel exportpanel;
         private JButton btnExpots;
         FileReader reading = null;
         FileWriter writing = null;
         JTable table;
         JScrollPane scroll;
         public Exporting() throws Exception {
              setSize(550, 450);
              setTitle("Export Results");
              this.setLocation(100, 100);
              String Heading[] = {"BOOK ID", "NAME", "AUTHOR", "PRICE"};
              String records[][] = {{"B0201", "JAVA PROGRAMING", "JAMES", "1234.00"},
                               {"B0202", "SERVLET PROGRAMING", "GOSLIN", "1425.00"},
                               {"B0203", "PHP DEVELOPMENT", "SUNITHA", "123"},
                               {"B0204", "PRIAM", "SELVI", "1354"},
                               {"B0205", "JAVA PROGRAMING", "JAMES", "1234.00"},
                               {"B0206", "SERVLET PROGRAMING", "GOSLIN", "1425.00"},
                               {"B0207", "PHP DEVELOPMENT", "SUNITHA", "123"},
                               {"B0208", "PRIAM", "SELVI", "1354"}};
              btnExpots = new JButton("Export");
              btnExpots.addActionListener(this);
              btnExpots.setBounds(140, 200, 60, 25);
              table = new JTable();
              scroll = new JScrollPane(table);
              ((DefaultTableModel) table.getModel()).setDataVector(records, Heading);
              System.out.println(table.getModel());
              exportpanel = new JPanel();
              exportpanel.add(btnExpots, BorderLayout.SOUTH);
              exportpanel.add(scroll);
              getContentPane().add(exportpanel);
              setVisible(true);
         }
         public void actionPerformed(ActionEvent ae) {
              Object obj = ae.getSource();
              try {
                   PrintWriter writing = new PrintWriter(new FileWriter("Report.csv"));
                   if (obj == btnExpots) {
                        for (int row = 0; row < table.getRowCount(); ++row) {
                             for (int col = 0; col < table.getColumnCount(); ++col) {
                                  Object ob = table.getValueAt(row, col);
                                  //exportApi(ob);
                                  System.out.println(ob);
                                  System.out.println("Connected");
                                  String fieldNames[] = {"BOOK ID", "NAME", "AUTHOR", "PRICE"};
                                  String rec = "";
                                  for (int i = 0; i < fieldNames.length; i++) {
                                       rec += '\"' + fieldNames[i] + '\"';
                                       rec += ",";
                                  }
                                  if (rec.endsWith(",")) rec = rec.substring(0, rec.length() - 1);
                                  writing.println(rec);
                                  //write values from result set to file
                                  rec += "\"" + ob + "\",";
                                  if (rec.endsWith(",")) rec = rec.substring(0, rec.length() - 1);
                                  writing.println(rec);
                                  writing.close();
                             }
                        }
                   }
              } catch (Exception ex) {
                   ex.printStackTrace();
              }
         }
         public static void main(String arg[]) throws Exception {
              Exporting ex = new Exporting();
         }
    }
    Could anyone please modify my code and help me out?
    Thank you for your service
    Cheers
    Jofin
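    For reference, the usual fix for code like the above is to open the writer once, print the header row once, and then write one CSV line per table row rather than per cell. A minimal sketch (the class name TableCsv is only illustrative):

    ```java
    import javax.swing.JTable;
    import java.io.PrintWriter;
    import java.io.Writer;

    public class TableCsv {
        // Write the header row once, then one quoted CSV line per table row.
        static void export(JTable table, Writer out) {
            PrintWriter w = new PrintWriter(out);
            StringBuilder rec = new StringBuilder();
            for (int col = 0; col < table.getColumnCount(); col++) {
                if (col > 0) rec.append(',');
                rec.append('"').append(table.getColumnName(col)).append('"');
            }
            w.println(rec);
            for (int row = 0; row < table.getRowCount(); row++) {
                rec.setLength(0);
                for (int col = 0; col < table.getColumnCount(); col++) {
                    if (col > 0) rec.append(',');
                    rec.append('"').append(table.getValueAt(row, col)).append('"');
                }
                w.println(rec);
            }
            w.flush();
        }
    }
    ```

    Calling TableCsv.export(table, new FileWriter("Report.csv")) from the Export button's handler would then produce one header line followed by one line per row.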

  • How to get the FILE COUNT from File directory

    Hello,
    I have to develop a scenario like this: get the file count from the source file directory and validate whether the file count is 5. If 5 files exist, I need to process those 5 files into DB tables. If the file count is not equal to 5, I need to send a mail to the customer saying that files are missing at the source directory (the subject would say files were missed at the source directory, and in the content I need to list the file names that do exist at the source file directory, so the missed files can be regenerated by the customer based on this mail).
    Could you please let me know how to get the count of files in the source file directory? If it is possible only with a UDF, please provide the Java code.
    Best Regards,
    SARAN

    Do these files have some fixed names?
    Can you try the Advanced Selection for Source File option to make XI pick up all 5 files in one shot?
    Check this blog on the same -
    /people/mickael.huchet/blog/2006/09/18/xipi-how-to-exclude-files-in-a-sender-file-adapter
    If this is not an option, a BPM sounds like the only possible way.
    Regards,
    Bhavesh
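    If the UDF route is taken, the core of counting files in a directory is plain java.io. This is only a sketch (class name and directory handling are assumptions; a real PI UDF would wrap this in the mapping API):

    ```java
    import java.io.File;

    public class FileCount {
        // Count regular files (ignoring subdirectories) in the given directory.
        static int countFiles(String dir) {
            File[] entries = new File(dir).listFiles();
            if (entries == null) return 0; // not a directory, or not readable
            int count = 0;
            for (File f : entries) {
                if (f.isFile()) count++;
            }
            return count;
        }
    }
    ```

    Comparing countFiles(sourceDir) against 5 would then decide whether to process the files or trigger the alert mail.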

  • Accessing file directory objects in 9.1 and /WEB-INF/classes zip

    It appears that in WebLogic 9.1 the contents of the /WEB-INF/classes directory are being zipped up and placed in the /WEB-INF/lib directory under some arbitrary name.
    Is there a way to tell WebLogic not to do this, and to leave the /WEB-INF/classes directory expanded as it was in WebLogic 8?
    Is there a particular reason, which developers should be aware of, why this is being done in 9.1 (9.x?)?
    Background :
    This particular app has a set of several hundred xml files that describe all of the screens (and thus the forms) of the app. They are used not only in the generation of the actual jsps (and believe it or not the action class as well as other supporting class, the app is really an interface to a legacy backend) but are also packaged within the WAR for the dynamic configuration of plugins used for complex validation; a quasi 'rules' engine.
    While there are several different versions of the app, and thus several different versions of xml files, there is only one version of the rules engine.
    The problem that has arisen when running on 9.1 is the plugin's access to those xml files.
    The plugin attempts to load the xml files by creating File object for the directory containing the xml files, and then iterating through the contents of that directory.
    The xml files are packaged within the /WEB-INF/classes directory and are thus accessible using a simple resource look-up (in actuality, a 'token' xml file is specified, looked-up as a resource and then used to determine the parent file directory's url).
    This has worked well enough, as most servers deploy the contents of the /WEB-INF/classes directory in expanded form. Obviously, this strategy readily breaks when those same contents are jar'd and placed in the /lib directory.
    It is preferred not to have to maintain a catalogue or index of the xml files, because of the volume of xml files, the multiple versions of the xml files, and of course the volatility of the xml files, although this is an obvious option.
    I personally have mixed feelings about using a parent directory reference to load a set of resource files within a j2ee app. If anyone has any other suggestions, I would greatly appreciate it!
    Thanks
    Andrew

    Hi,
    Usually, the best approach would be just to load the resources as InputStreams and keep a catalog (and I know this is what you do not want to do :-). So the only hacky workaround I can think of would be to use something like Jakarta Commons Virtual File System (http://jakarta.apache.org/commons/vfs/) and read the .zip directly.
    Regards,
    LG
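    If a full VFS dependency is unwanted, the same idea can be sketched with java.util.zip alone: open the archive WebLogic created under /WEB-INF/lib and iterate its entries instead of iterating a directory (the names below are illustrative, not a WebLogic API):

    ```java
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.Enumeration;
    import java.util.List;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipFile;

    public class XmlEntryLister {
        // List the names of all .xml entries under a given prefix inside a zip/jar.
        static List<String> listXmlEntries(String zipPath, String prefix) throws IOException {
            List<String> names = new ArrayList<String>();
            ZipFile zip = new ZipFile(zipPath);
            for (Enumeration<? extends ZipEntry> e = zip.entries(); e.hasMoreElements();) {
                ZipEntry entry = e.nextElement();
                if (entry.getName().startsWith(prefix) && entry.getName().endsWith(".xml")) {
                    names.add(entry.getName());
                }
            }
            zip.close();
            return names;
        }
    }
    ```

    The remaining problem is discovering the archive's arbitrary name, for example by scanning /WEB-INF/lib for the jar that contains the known 'token' xml entry.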

  • FTP to Read content of Text/xml file

    Hi,
    I need help reading the content of a text/xml file through FTP. Below I explain the scenario.
    Our application server is on UNIX. We have to run a report program that accesses an FTP server which is on a Windows platform. Using FTP_CONNECT, FTP_COMMAND, and FTP_DISCONNECT we are able to connect to the FTP server and also to copy files from the FTP server to the SAP application server. Once a file has been copied to the application server, we are able to read the content of the txt file into an internal table in the ABAP program using OPEN DATASET. But our requirement is to read the text or xml file content into an internal table while accessing the FTP server from SAP, instead of after copying the file to the application server.
    So please help me to solve my problem.
    -Pk

    Thank you Bala,
    But can you help me with what I should pass for the FNAME and CHARACTER_MODE import parameters? Should I pass the full path of the file with its name, or only the file name? For example, if my text file on the FTP server is test.txt and the IP of the FTP server is 10.10.2.3, then should I pass the value for FNAME as '\\10.10.2.3\xyz\test.txt'? Here xyz is the name of the directory on the C drive where test.txt exists.
    Please help me.
    -pk

  • File directory on a JSP

    Hi,
    I need to put a file directory listing on a JSP page. Does anyone know how to do that, or have an example?
    thanks,
    Marc

    Do you mean you want to display the contents of a directory, like Apache or Tomcat does? I'd read up on the File API first. Basically you'll want to use the list() or listFiles() method.
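    As a sketch of that approach, the listing logic can live in plain Java and the JSP just prints the result (the class name DirListing is hypothetical):

    ```java
    import java.io.File;

    public class DirListing {
        // Build a simple HTML list of the entries in a directory,
        // similar to what Apache or Tomcat's default listing shows.
        static String toHtml(String dir) {
            StringBuilder html = new StringBuilder("<ul>\n");
            File[] entries = new File(dir).listFiles();
            if (entries != null) {
                for (File f : entries) {
                    html.append("  <li>").append(f.getName()).append("</li>\n");
                }
            }
            html.append("</ul>");
            return html.toString();
        }
    }
    ```

    In the JSP you would emit <%= DirListing.toHtml("/some/dir") %>; for real use, the file names should also be HTML-escaped.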

  • Show remote content in restored ldif address book

    A while back, when I created and exported my address book as an LDIF file, lots of these contacts had the "show remote content" option checked. As the list has grown, the file has been re-exported as LDIF to keep it up to date.
    When I import the file into a clean profile, the remote content check boxes have been reset. Needless to say, as the list grows, so does the number of contacts to reset.
    Is this unchecked by default for security reasons?
    Is the setting stored in some other file?
    Is there any way to import with this setting intact?

    It is true that the remote content setting is cleared when an LDIF file is imported, for reasons I can't explain, but if an address book is exported to a .mab file and then re-imported, the setting is retained. Export and import of address books in mab format is provided by the [https://freeshell.de//~kaosmos/morecols-en.html MoreFunctionsForAddressBook] add-on. Tested and working on my system.
    http://chrisramsden.vfast.co.uk/3_How_to_install_Add-ons_in_Thunderbird.html
