Web Content Source - Ability to Crawl Links

We are attempting to crawl a web-based content source where the first two levels of the site have actual anchors that point to other .aspx pages. These crawl and index just fine. The third level contains pages where all anchors are actually JavaScript function calls that ultimately submit the page after a number of JavaScript-based calculations occur.
Our problem is that when the SES crawler encounters this page, it does not seem to be capable of navigating the JavaScript-based links. Is this expected behavior, or have we not configured something correctly? If this is expected behavior, what are our options? I am confident we are not the first to run into this.

SES can follow simple JavaScript-based links, but if they're too complicated it may not be able to.
Where is the actual content stored? Is it in a database, or on a file system? You may be better off crawling those instead of using a web source.
Is there any way you can produce a "map" page that points to all the other pages? If so, you just need to crawl this.
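
A minimal sketch of the "map" page idea, assuming you can obtain the list of target URLs from wherever the content actually lives (the URLs and the `build_map_page` helper below are made up for illustration). It produces a static HTML page of plain anchors, which a crawler can follow without executing any JavaScript:

```python
from html import escape


def build_map_page(urls, title="Site map"):
    """Return a static HTML page with one plain <a href> anchor per URL."""
    items = "\n".join(
        f'    <li><a href="{escape(url, quote=True)}">{escape(url)}</a></li>'
        for url in urls
    )
    return (
        "<!DOCTYPE html>\n"
        f"<html>\n<head><title>{escape(title)}</title></head>\n"
        f"<body>\n  <ul>\n{items}\n  </ul>\n</body>\n</html>\n"
    )


# Hypothetical third-level pages that are otherwise reachable only via JavaScript.
pages = [
    "http://example.com/level3/page1.aspx?id=1",
    "http://example.com/level3/page2.aspx?id=2",
]
html_page = build_map_page(pages)
```

You would then publish the generated page somewhere the crawler can reach and use it as the starting URL of the web source.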

Similar Messages

  • Web Content Source - Not Working

    I created a new web content source and ran a full crawl on it.
    When I searched for text from the content, the search did not work.

    Hi,
    According to your description, my understanding is that search did not work for a new web site content source in SharePoint 2013.
    Please check whether the web site allows anonymous access to its content. If not, you need to configure a crawl rule for the URL that lets you enter the credentials for accessing the site's content.
    For more information, please refer to the link:
    http://mohitvash.wordpress.com/2012/03/07/configure-external-site-as-content-sources-in-sharepoint-search/
    I hope this helps.
    Thanks,
    Wendy
    Wendy Li
    TechNet Community Support

  • Content Source Error - Unexpected Network Error

    A content source (file share) crawls zero items and the crawl log has one error:
    An unexpected network error occurred. (Exception from HRESULT: 0x8007003B) I have a few crawl rules set up to exclude certain folders.

    I presume that your Windows version is 2008 R2. Check the posts below: this HRESULT error is related to a Windows 2008 issue when accessing a network drive with files larger than 100 MB.
    http://social.technet.microsoft.com/Forums/en-US/968a04ed-472c-4265-a41e-c9e201ecb12e/error-0x8007003b-unexpected-network-error-at-copying-big-files-on-a-dfs-share?forum=winserverfiles
    http://support.microsoft.com/kb/983620
    Hope that helps | Amr Fouad | MCTS, MCPD SharePoint 2010

  • Folio Overlays-Web Content: Work in Container/Frame.

    By adding a “Work in Container” check box to the Web Content window, you could make it so that you never activate the full-screen web browser in iOS or get redirected to a web browser in Android when navigating inside a web content container / frame. Tapping links would navigate you inside the container / frame, not the full-screen (native) browser.
    This would be very beneficial in many instances, keeping the user engaged without making it feel like they have left the app. It would allow greater control for, say, adding purchases to a cart, or navigating to the next page of a website without leaving the current page of the app and being redirected to a full-screen web browser that you have to tap Done to dismiss. The user experience could become that much more fluid and engaging. I still like the idea of the full-page browser, but it would be nice to keep the experience on the same screen as the app, albeit a smaller screen, or even a full-page screen that I do not have to tap "Done" to dismiss. I know you would not have navigation buttons like Back, but for what I want to accomplish I don’t need them. There are far more positives than negatives in adding such a feature.
    Thanks for reading.
    Sincerely,
    Ryan.

    I have found what was causing this:
    I used the example that is on the Apple developer site and they added a line in the <head> section that sets the viewport size and scale. As soon as I commented this line out, everything was fine!
    Thanks for the suggestions!

  • How can I create a Webi content link in BI Workspace?

    Dear Expert,
    I'm having a problem creating a content link between 2 Webi documents in BI Workspace.
    I tried to follow the guide "Creating a BI Workspace in BI4 Feature Pack 3",
    but I cannot find "PARAMETER_OUT".
    Please help me create "PARAMETER_OUT" so that it can connect with "PARAMETER_IN".
    Best regards,
    Chenna Yon

    I have created a "Scope Resource Filter"
    Knowledge Management --> Content-Management --> Global Services --> Resource Filters --> Scope Resource Filter
    "URL (Content Link) Mode (Documents/Web-Pages Only): *" <<exclude>>
    I have created a new crawler which uses this filter and re-indexed my content, but without success.
    I guess this is the right way, but it seems that I have forgotten something...
    I would appreciate any help
    Thank you and best regards
    Khaled el Taki / Cologne, Germany

  • Content sources and Crawling rules

    Hi,
    I tried using "http://www.microsoft.com/downloads" as a content source. For some reason, directories like "http://www.microsoft.com/documentation" get crawled too. But I just want to crawl the subdirectories of the "downloads" directory...
    I think there should be a better way than excluding each wrong path with a crawl rule...
    What am I doing wrong?
    Looking forward to your reply.

    Hi Severin,
    If you set the content source using
    http://www.microsoft.com/downloads, SharePoint will start crawling at that URL. However, if a page under that URL links to other directories, it will crawl those directories too.
    To avoid this issue, you need to limit the page depth for crawling the content source. You can set the page depth under Content source->[your content source]->Crawl settings.
    Best Regards,
    Wendy
    Wendy Li
    TechNet Community Support
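
To illustrate why a page depth limit confines the crawl, here is a toy model of a breadth-first crawler. It is only a sketch, not SharePoint's actual implementation, and the link graph (including the `/downloads/details` page) is invented for the example:

```python
from collections import deque


def crawl(start, links, max_depth):
    """Breadth-first walk of `links`, following at most `max_depth` hops from `start`."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached: do not follow links on this page
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen


# Hypothetical link graph: the start page links to a details page,
# and the details page links out to /documentation.
links = {
    "http://www.microsoft.com/downloads": ["http://www.microsoft.com/downloads/details"],
    "http://www.microsoft.com/downloads/details": ["http://www.microsoft.com/documentation"],
}
start = "http://www.microsoft.com/downloads"
unlimited = crawl(start, links, max_depth=10)  # reaches /documentation
limited = crawl(start, links, max_depth=1)     # stops one hop from the start page
```

In this model a depth limit only caps how far the crawler wanders. A page linked directly from the start page is still reached, so an exclude rule may still be needed for such links.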

  • Crawl Rule for Crawling Specific Page across all the site collection under one content source

    I have a MOSS 2007 web application added as a content source in a SharePoint 2010 Search Service Application. It has 50+ site collections that all follow the same template. Every site collection has one CustomPages library and a CustomPage.aspx.
    If the search service should crawl only CustomPage.aspx from all the site collections under the web application, what would be the path or regular expression to use when creating the crawl rule?
    I have given the path as http://webapplication/sites/*/CustomPages/CustomPage.aspx, but this is not working. Can anyone help me out with the correct path or regular expression in this case?
    Thanks..

    To crawl that page you'd also need to crawl the pages beneath it, otherwise SharePoint will never get to the page to check if it matches a rule. I assume you have some other rules that are blocking the rest of the content of the site?
    Try adding another rule that allows http://webapplication/sites/*, then have your include rule beneath that, and another exclude rule for
    http://webapplication/sites/*/* beneath that. That should eliminate nearly all the other content and provide you with your CustomPage.aspx results.

  • SSL Web Applications in Content Sources

    Hi there,
    I'm a bit stumped.  I have created two web applications - an Intranet and My Sites Host.  When I created them I selected the option on both for an SSL site.
    I have now set up the Search Service Application and added my site addresses to the "Local SharePoint Sites" content source. When I start a full crawl, I get the following error:
    This item could not be crawled because the repository did not respond within the specified timeout period. Try to crawl the repository at a later time, or increase the timeout value on the Proxy and Timeout page in search administration. You might also want to crawl this repository during off-peak usage times.
    Found some answers to that problem on the internet but none of them worked for me.
    So, I then created a new content source to crawl the "SharePoint - 80" web app which is an http app and the crawl was successful.
    Is there a little trick I am missing because my web apps are SSL?

    Open SharePoint Central Administration
    Select Manage service applications, from the Application Management section
    Select the Search Service Application. e.g. Search Service 1
    Select Farm Search Administration
    Toggle the Ignore SSL warnings to Yes
    http://www.infotext.com/help/sharepoint-could-not-estabilish-trust-relationship-for-the-ssltls-secure-channel-when-crawling-ssl-enabled-websites/

  • Modify Content Source Crawl Settings Programmatically

    Hi,
    I am working on a SharePoint 2010 Search Administration project in which we need to create content sources from a custom SharePoint web part. I am using the SearchContext.ContentSources.Create() function to create the content source, but the issue is that I want to set the Crawl Settings values (Limit Page Depth & Limit Server Hops) programmatically, during or after creation.
    I know that it can be done from Central Administration and with PowerShell commands, but how can I do it programmatically? Is there a way to do that?
    Regards,
    Ehab
    Ehab

    You don't run PowerShell commands from the web part. (Technically you might be able to do it but it's not a good idea).
    First make the commands work from either PowerShell or find the alternatives for C#/VB.NET. The syntax is not identical but it's very similar and the object model is the same.
    Once you've got that working, port it to C# if it's not already there so it runs as a console app. Then bolt on the custom web part at the end.

  • Some processes (like iTunes, QTKitServer Safari Web Content) stop responding more and more often in OSX Mavericks. What can be the source of that?

    I have a rMBP mid-2012 with OSX 10.9 and some processes (e.g. iTunes, QTKitServer Safari Web Content) appear in red in the Activity Monitor, leading to high CPU temperature and low performance. What can be the source of this?

    See this link: https://discussions.apple.com/message/23933863#23933863
    What I do is turn on the Activity Monitor with a search for qt, and I continue to "force quit" the process.

  • Web Content Overlay Link to the HTML Resources

    Hi guys,
    Does anyone know what should go inside of "URL or File" field for the Web Content overlay if I need file from the HTMLResources.zip?
    DPS documentation is very unclear/confusing (e.g. there is no ".html" at the end):
    For the file inside of HTMLResources.zip with this path:
    HTMLResources.zip/Cartoons/train1.html
    Documentation (https://helpx.adobe.com/digital-publishing-suite/help/hyperlink-overlays.html#link_to_assets_in_the_html_resources_folder) has:
    Web Content overlays are nested two levels deeper than an HTML article. Example:
    <a href=”../../../HTMLResources/Cartoons/train1”>See Train Cartoon Gallery</a>
    I'm getting an error "File URL for web content is missing or does not exist" if I enter this inside of the "URL or File" field:
    ../../../HTMLResources/Cartoons/train1
    Thank you for your help,
    Gennady

    Thank you, joev and Bob,
    Update to the text below: I just realized that the documentation explains how to call a file from the HTML content of the Web Content overlay, not how to get the HTML content into the Web Content overlay. I can probably redirect to train1.html from plain HTML code inserted into the Web Content overlay. So is there no way to link to the file the way I can with http://webServerPath/Cartoons/train1.html?
    We have 1 html file that uses 1 image file, both files inside the folder inside HTMLResources.zip and I need to display that html, so:
    HTMLResources.zip/Cartoons/train1.html
    HTMLResources.zip/Cartoons/train1.png - referenced inside train1.html
    If I use an http URL, e.g.
    http://webServerPath/Cartoons/train1.html
    then everything works. If I try to point that URL at HTMLResources instead:
    ../../../HTMLResources/Cartoons/train1
    or
    ../../../HTMLResources/Cartoons/train1.html
    I get an error, even though the documentation explains what link should be used.
    The workaround of just using the local HTML folder also works.
    Thank you!

  • Link from Web Content overlay to another article

    Thinking caps on, guys. I am trying to reduce the overall size of a folio that contains a pop-out panel (a two-state MSO) that reveals image thumbnails in a vertical scrolling content frame. Each thumbnail is a button that takes the user to another article in the folio. The panel is repeated across all articles, which is handy for the user but wasteful in file size.
    In theory, I ought to be able to reduce the file size of the pages containing these panels if I was to replace the vertical scrolling content frame with a tall Web Content frame that would point to an appropriately designed HTML doc in HTMLResources.zip. This would minimise repetition of all those thumbnails.
    The problem, if you haven't already guessed it, is that DPS does not allow you to link from a Web Content overlay to folio articles. The 'navto://' convention simply won't work in this scenario.
    Is there ANY way to jump a Web Content frame to other locations in the folio? A non-standard, non-supported workaround would do me just fine.
    Thanks for reading this far.
    Ali

    Ali - Check out the new "Linking" article in the Advanced Overlays issue of DPS Tips. It explains how to use an unsupported method for cross-folio linking. I explained how to create buttons with this format, but the same format would work using <a href="URL"> in HTML.
    Johannes also breaks down this method here:
    http://digitalpublishing.tumblr.com/post/26564811021/cross-publication-linking-bonus-qr-codes

  • How to find out web content files linked in folio through scripting

    Hi all,
    Please suggest how to find the web content files linked in a folio through scripting.
    Regards,
    Moorthy

    @Moorthy – can you tell us a bit more? By mentioning "folio", I think you are referring to Adobe Digital Publishing Suite (ADPS, or DPS for short). If yes:
    1. Do you want to analyze Folio files *.folio and get the linked web content files?
    2. Or do you want to check an InDesign file with an overlay and check what files are linked as web content?
    3. Or something else?
    Where is your base problem?
    Packaging the InDesign files and copying/relinking the web content files after the packaging process?
    Uwe

  • Having installed Thunderbird, and subsequently uninstalled it, Safari cannot find Mail.app anymore. This is rather annoying, since I cannot send links or web contents anymore from within Safari. Is there any way to reinstall Safari or make Mail visible?

    Having installed Thunderbird, and subsequently uninstalled it, Safari cannot find Mail.app anymore.
    This is rather annoying, since I cannot send links or web contents anymore from within Safari.
    Is there any way to reinstall Safari or make Mail visible to it?
    E.O.

    Open Mail's Preferences, and change the default email reader back to Mail

  • Which URL to use in crawl rule settings and in the content source

    Hi,
    I have an internet application and I want to add some URLs to crawl rules so that they are not displayed in search results.
    Below are the alternate access mappings settings, which have one URL for the Default zone and one for the Intranet zone.
    Which URL should I use in the crawl rule settings and in the content source?
    adil

    Hi,
    As I understand it, you want to know which URL of the application should be added to the crawl rule settings and the content source.
    I recommend you use the Default zone URL.
    If you use a non-Default zone URL, crawling the non-Default zone will have an impact at query time, because the query processor attempts to translate the URL to the Default zone's public URL before processing the query.
    The article below is about Alternate Access Mappings (AAMs) *Explained.
    http://blogs.msdn.com/b/sharepoint_strategery/archive/2013/05/27/alternate-access-mappings-explained.aspx
    Best regards
    Sara Fan
    TechNet Community Support
