Main.html to index.html

Hi there, I'm in a bit of a rush to get my website up and running, and I have published the site through Flash Catalyst and uploaded the final site to my website server.
Flash Catalyst publishes the final website as Main.html so I renamed it to index.html for the server to recognise, but the site doesn't seem to load.
Is there a way for Catalyst to publish with index.html instead of Main.html?
Thanks
Arvind!

yeah the website is:
www.sondproductions.com
But now I have another problem: when you first load the website, the centre cube SWF file starts on the right-hand side of the screen and not in the centre.
When the SWF animation finishes, it switches back to the centre. I don't suppose you know why this is, do you?
Arvind!

Similar Messages

  • Can't link to index.html

    (Domains and project names are redacted.)
    I am publishing a Doxygen HTML tree using OS X Server (Mountain Lion). I have "Websites" on, and in "Server Website," I have aliased "/MyProject" to "/Users/fritza/Department/Project/docs/html". "index.html" is the first in the list (I have not touched it) of index file names.
    There is an index.html file in that directory. Accessing http://my.example.com/MyProject gives me the Doxygen main page, and I can navigate from there. Good.
    But there is a "Main Page" link on each page, which is coded as:
    <a href="index.html"><span>Main&#160;Page</span></a>
    An HREF to "index.html" should be no problem, right? (The website's tree is flat, by the way.)
    Instead, when I click the link, I get a 404 page, and the complaint that /MyProject/index.html/ (note the trailing slash) is not found. This happens even if you click the link as it appears in index.html itself.
    Linking to index.html, the file, seems like an elementary thing to do. Other people must be getting the correct behavior. What have I done wrong?

    Logo is what file type?
    What is the URL to your online problem page?
    Nancy O.

  • Can i embed HTML file into HTML file?

    I recently learned to design and code my website from scratch, with some help from my friend and by asking here and there.
    I want to make editing easier, so I made a header and a footer. At first I wanted to embed header.html and footer.html into index.html, as my friend said index.html is more Google-friendly than index.php.
    I can't find an answer anywhere for embedding HTML into HTML. Is there a way?
    My website is at http://www.baliweddingphoto.com
    Perhaps you can check it and help me embed .html into .html.
    Thanks in advance.

    Already did. I'm glad to hear from MurraySummers and Rob Hecker2 that there is no difference between index.html and index.php, so I will stick with my index.php.
    Now I can sleep well and continue learning to code without doubt.
    Thanks, guys.

  • KM Document iView - index.html and main.css not properly displayed

    Hello,
    as a test we have put two files in the /documents repository in KM :
    a) index.html
    <head>
    <link rel="stylesheet" type="text/css" href="./main.css"/>
    </head>
    <table width="92%" bgcolor="#FFFFFF">
      <tr align="left" valign="top">
        <td> </td>
        <td colspan="5"><table width="100%" border="0" cellpadding="5" cellspacing="0">
            <tr valign="middle">
              <td width="85" bgcolor="#C7D9E9"> <p><b>Top Links</b></p></td>
              <td width="125" class="document-list"><a href="impax.html">IMPAX Client
                </a> </td>
              <td width="125" class="document-list"><a href="talkstation.html">TalkStation</a></td>
              <td width="125" class="document-list"><a href="ris.html">RIS</a></td>
              <td width="125" class="document-list"><a href="connectivity.html">Connectivity
                Manager</a></td>
              <td width="125" class="document-list"><a href="impax.html">IMPAX Server</a></td>
            </tr>
          </table></td>
      </tr>
    </table>
    b) main.css
    A:visited {
        color: #264560;
    }
    A:active {
        color: #12212E;
    }
    A:hover {
        color: #14623D;
    }
    A {
        color: #336699;
    }
    table {
        margin-top: 0px;
        margin-bottom: 0px;
    }
    p {
        color: #000000;
        font-family: Arial, Helvetica, sans-serif;
        margin-bottom: 0px;
        margin-top: 5px;
        font-size: 12px;
    }
    .document-list {
        background-color: #C7D9E9;
        font-family: Arial, Helvetica, sans-serif;
        font-size: 12px;
        color: #000000;
        margin-bottom: 3px;
    }
    When going to Content Administration -> KM Content -> Documents and clicking the index.html file, the CSS file is taken into account. When, e.g., hovering over the IMPAX hyperlink, the path is http://<host>:<port>/irj/go/km/docs/documents/impax.html, and the impax.html page is displayed when clicked.
    However, when creating a KM Document iView (with or without content filter) pointing to /documents/index.html and displaying the iView, the style sheet is ignored, and the same hyperlink as above now refers to http://<host>:<port>/irj/servlet/prt/portal/prtroot/impax.html, which is incorrect.
    -> How can this behaviour be explained?
    -> When creating an URL iView pointing to /irj/go/km/docs/Agfa_Knowledgebase/index.html , everything works as expected.
    Thanks for the help -

    Hi,
    You should correct the path to your css file in your index.html:
    href="/irj/go/km/docs/documents/main.css"
    Regards,
    Praveen Gudapati
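    A minimal sketch of why the absolute path helps, assuming the browser resolves relative references against the URL of the page it is showing. The host, port, and the iView component name "someIView" are hypothetical; only the /irj/... path prefixes come from the post above.

```javascript
// Sketch: resolve the same references against two different base URLs.
// Host, port and "someIView" are made-up placeholders for illustration.
const kmBase = 'http://portal.example.com:50000/irj/go/km/docs/documents/index.html';
const iviewBase = 'http://portal.example.com:50000/irj/servlet/prt/portal/prtroot/someIView';

// Return only the path part of the resolved URL.
function resolvePath(reference, base) {
  return new URL(reference, base).pathname;
}

const results = {
  // Relative references depend on where the page is served from.
  relativeCssFromKm: resolvePath('./main.css', kmBase),
  relativeCssFromIview: resolvePath('./main.css', iviewBase),
  relativeLinkFromIview: resolvePath('impax.html', iviewBase),
  // A path starting with "/" resolves the same way from both places.
  absoluteCssFromKm: resolvePath('/irj/go/km/docs/documents/main.css', kmBase),
  absoluteCssFromIview: resolvePath('/irj/go/km/docs/documents/main.css', iviewBase),
};

console.log(results);
```

    The relative forms end up under /irj/servlet/prt/portal/prtroot/ when resolved from the iView URL, which matches the wrong impax.html link described in the question, while the path starting with / is the same from both locations.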

  • Index.html pages in local sub-directories will not load automatically when sub-directory name entered into address field, followed by a forward-slash

    I am in the process of designing a website-on-a-CD. In order to make things easier for my clients, I decided to organize this site into local sub-directories with an index.html page in each one. However, when I try to enter the local sub-directory name, followed by a forward-slash, the index page does not open automatically. Instead, I get a raw directory listing that includes the index.html file, and anything else that is present within it. I would like to know if there is any way to force this page to load automatically within Firefox from a local storage medium as it would load from a web-server.

    You need to use file:// as the protocol instead of C:/. The latter may never work (C:\ might).
    Where is the main HTML located?
    It is easiest to store the images in a subfolder of that location, because for security reasons you can't go back via ../ beyond that root location.
    See:
    http://kb.mozillazine.org/Links_to_local_pages_do_not_work

  • Blog Summary page as index.html

    Hi,
    My web hosting server uses index.html as my main page.
    Is there any way I could have my blog summary page as my index page?
    All I want is the blog as my entire site, without any other pages.
    Thanks

    moved the Blog page to the top of the tree.

  • Problems with index.html

    When I first started using dreamweaver mx 2004 I created my
    site in what I now think might be the wrong way but that was what
    the books that I was using at the time said to do.
    I created a site folder and then I created the homepage. I
    then created two more folders inside the site folder, one called
    text_copy and one called images_copy (in a different website I
    created everything inside the main site folder-images,homepage and
    text files all live together).
    This original method of separating everything has come back
    to haunt me.
    Dreamweaver wants your homepage called index.html and when I
    type in my domain-
    dimension x toys
    the homepage comes up. But the problem is none of the
    javascript pull down menus that I have created work on the
    homepage. They work on all of the other pages in the site but not
    this one. I created another page called index and set it inside the
    main text_copy folder. This page-
    index2
    has the pull downs work but it cannot be accessed as the
    homepage.
    Is the only alternative to
    1. re-create the entire site
    2. move all the information from the other folders out into
    the main folder.
    I hope there is a solution with what I currently have because
    the website is a lot of pages. Option 2 is less work but all of the
    links to the other files and images would be broken and it still
    while less, would be a considerable amount of work as I have a
    great many pages to the website.
    Thank you for your time.
    harlan1s

    You have to change the paths for the root-level index page. The links won't
    have the same relative path as the ones that are in the subfolder.
    Did you just copy/paste the menu onto the index page from a page in the
    subfolder? Or do a "save as"?
    Look in the head section of the code.
    The sections of JavaScript that start with mm_menu are the instructions for
    how that menu works.
    mm_menu_1215135054_0.addMenuItem("13 INCH FIGURES","location='../text_copy/batman13family.html'");
    For the root-level homepage index.html file, that link should be:
    mm_menu_1215135054_0.addMenuItem("13 INCH FIGURES","location='text_copy/batman13family.html'");
    Look at the location= part. Notice I took out ../
    And actually, for the menu on pages IN the text_copy folder linking to other
    pages IN the text_copy folder, the correct relative path would be:
    mm_menu_1215135054_0.addMenuItem("13 INCH FIGURES","location='batman13family.html'");
    There's no need to go up a folder, then go down into the same current
    folder.
    Another way to do it, so it's the same on all the pages, would be to use the
    full absolute http path.
    mm_menu_1215135054_0.addMenuItem("13 INCH FIGURES","location='http://www.dimensionxtoys.com/text_copy/batman13family.html'");
    Note: those menus have problems and are a maintenance nightmare. A menu
    built from tutorials and free extensions at http://projectseven.com would
    make life easier and work better.
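    As a rough check of the path forms suggested above, this sketch resolves each one against the page it would be used on and confirms they all point at the same file. The page name "somepage.html" inside text_copy is a made-up placeholder; the other paths are the ones quoted in the reply.

```javascript
// Sketch: resolve each menu target relative to the page that contains the menu.
// "somepage.html" is a hypothetical page inside the text_copy folder.
const rootPage = 'http://www.dimensionxtoys.com/index.html';
const subfolderPage = 'http://www.dimensionxtoys.com/text_copy/somepage.html';

function resolveFrom(page, target) {
  return new URL(target, page).href;
}

const targets = [
  // Root-level homepage: go down into text_copy.
  resolveFrom(rootPage, 'text_copy/batman13family.html'),
  // Page already in text_copy: the file name alone is enough.
  resolveFrom(subfolderPage, 'batman13family.html'),
  // Page in text_copy: up one folder and back down also works, just needlessly.
  resolveFrom(subfolderPage, '../text_copy/batman13family.html'),
  // Absolute URL: identical no matter which page uses it.
  resolveFrom(rootPage, 'http://www.dimensionxtoys.com/text_copy/batman13family.html'),
];

const allSame = targets.every((t) => t === targets[0]);
console.log(targets[0], allSame);
```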

  • Removing index.html from home page / remove .html tags from all other pages

    Hello All,
    I was wondering if there is a way to remove the index.html from my main page, so that www.mypage.com/index.html is simply www.mypage.com/. This also applies to all the other pages in the site, but only for the .html extension: www.mypage.com/contact.html becomes www.mypage.com/contact. Almost all sites I know do this, but I can't figure out the technique beyond working some crazy voodoo with Apache.
    Thanks in advance for your help!

    if 'index.html' is in your HOST server's default filename list, then you can omit its name from any link, e.g.,
    <a href="/">Home</a>
    That link will cause the server to load the default file found in the root folder of the site.
    This is actually the preferred way to link to your home page.
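    To illustrate the idea of a default filename list, here is a small sketch of the lookup a server might do for a URL that ends in a slash. The filename list, the set of files, and the function name resolveRequest are all hypothetical; real servers are configured differently and may also redirect or list the directory.

```javascript
// Hypothetical default filename list and site contents, for illustration only.
const defaultFilenames = ['index.html', 'index.htm', 'default.htm'];
const siteFiles = new Set(['/index.html', '/contact.html', '/about/default.htm']);

// Map a requested path to the file that would be served, or null if none matches.
function resolveRequest(path) {
  if (!path.endsWith('/')) {
    // A plain file request: serve it only if it exists.
    return siteFiles.has(path) ? path : null;
  }
  // A folder request: try each default filename in order.
  for (const name of defaultFilenames) {
    if (siteFiles.has(path + name)) return path + name;
  }
  return null;
}

console.log(resolveRequest('/'));
console.log(resolveRequest('/about/'));
console.log(resolveRequest('/blog/'));
```

    With this model, a link to "/" reaches the home page without naming index.html, which is why the file name can be left out of the link.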

  • Index.html Showing in Address Bar

    Hi everyone,
    This is the first time that this happens to me, either I forgot how to fix it or it is a new Markup that is doing it.
    In my website, each main page is called index.html and located inside its own folder ... For example:
    Folder: about
                        -index.html
    Folder: products
                        -index.html
    Now everything uploads and works perfect, except that once I click back to home page, the URL shows: www.mysite.com/index.html  and if I click on About us it shows: www.mysite.com/about/index.html
    Normally it should just have www.mysite.com/about/
    right?
    Is there anything that I'm supposed to do to fix this?
    I was probably thinking maybe my links should be linking to the folder only (../about/) instead of linking to the page itself.
    As we speak, my site just finished uploading with this method mentioned and it works perfectly.
    But Now the question is.... is that the right thing to do?
    Thank you.

    Ok, here we go, now my sites are live and everything works perfect.
    Just so I'm clear here: I have the about us page called index inside a
    folder called about.
    So instead of linking to the index.html, my link was "as Mark said"
    ../about/   so my URL becomes something like:
    www.mysite.com/about/
    As for subdomains, Murray is right: you never, never, never, ever break your
    pages up that way. You won't have any SEO working properly. Subdomains are
    used for something that is totally separate from the concept of your main domain.
    So if you have a website where you sell DVDs, called www.yourdvdsite.com,
    and then you want to create a website that sells posters of your DVDs, but you
    don't want people to get confused and at the same time you want them to know
    that it is one company (one concept but 2 different products), then you use
    subdomains. In this case SEO reads 2 different areas, the posters and the
    DVDs, each one on its own.
    So imagine if you have subdomains of about.site.com and
    products.site.com: you will definitely lose ranking and everything
    that helps with SEO.
    Thank you guys for all the input.
    Summary and Conclusion:
    Your website URL is http://www.site.com
    If you want your other main pages' URLs to show cleanly in the
    address bar, just create a folder in your root folder and call it whatever
    you want (for example: products or about or contact).
    Then inside of this folder, create a new html page and call it index.html
    Now in the other pages, if you want to link to that new page, the link will
    be ../about/ (DON'T LINK DIRECTLY TO THE index.html)
    so in the address bar, you will see http://www.site.com/about/
    If you link directly to the index.html the URL will show
    http://www.site.com/about/index.html
    And you really don't want your savvy visitors to see that do you?
    HOME PAGE ISSUE:
    Now in your other pages, you definitely have a link to the home page. Of
    course your home page is sitting in the root folder and it's called
    index.html. YES, when you click, the URL will show:
    http://www.site.com/index.html
    Try 2 methods. If your site is still on your local drive, replace the link
    to the home page index.html with ../ (yes, 2 dots and a forward slash;
    this works for me on CS4),
    or just type in your site's URL (in this example http://www.site.com/).
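    A minimal sketch of the linking rule described above: drop a trailing index.html (or index.htm) from a link so that it points at the folder instead of the file. The helper name cleanHref is made up for this example, and it assumes the server serves index.html by default for a folder URL.

```javascript
// Hypothetical helper: turn a link to an index page into a link to its folder.
function cleanHref(href) {
  // Remove "index.html" or "index.htm" only when it is the whole last path segment.
  const stripped = href.replace(/(^|\/)index\.html?$/i, '$1');
  // A bare "index.html" would become an empty link, so point at the current folder.
  return stripped === '' ? './' : stripped;
}

console.log(cleanHref('../about/index.html'));
console.log(cleanHref('/index.html'));
console.log(cleanHref('contact.html'));

// The folder link resolves to the clean address from a page in another folder.
const resolved = new URL(cleanHref('../about/index.html'), 'http://www.site.com/products/index.html').href;
console.log(resolved);
```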

  • Lost the "Over" state of my buttons on my index.html page only.

    I must have done something to my button set on my home page for www.carpenterslocal19.org,
    because on mouseover I don't get the "Over" state. The buttons still function as hyperlinks, but there is no change to the over state.
    All the other pages are okay; it's just index.html and index.htm.

    Hi Joe -
    That extra line break in your code is still present on the server.
    If it were my page, the time it takes to prowl through every character in the code would not be worth it.
    I would make a copy of a working page, correct the title tag, and cut and paste only the content table from
    the troubled page into the new page.
    If that's not clear, you may call me at 914-941-3616

  • Lost my index.html

    I lost my "index.html" on my iDisk.
    When I go to my iDisk, escapade/Web/Sites/, I see the "iWeb" folder and everything inside is correct, but I can't see my index.html; it's a Safari logo.
    When I type my address, web.mac.com/escapade, in Safari, I see nothing. Before, I typed only this in Safari and saw the complete address, like
    web.mac.com/escapade/iWeb/agent/Bienvenue.html
    Thanks for taking the time to help me.
    Michel

    You might have to do a "Publish All to .Mac" to get the index.html file back. How did it disappear in the first place? Any idea?

  • Does DW recognize default.htm / index.html?

    I'm new to Dreamweaver and am working on an existing web
    site. Web servers typically load up a "default.htm" or "index.html"
    page when you give it a URL with just the directory name, e.g.
    http://foo.bar/about/. Most of
    the links in my site are set up that way. How can I get Dreamweaver
    to recognize that <a href="/about/"> is not a broken link? Or
    does it automatically know to look for default.htm?

    He is gaining a URL that never changes, according to
    Micha.... 8)
    Murray --- ICQ 71997575
    Adobe Community Expert
    "Paul Whitham AdobeCommunityExpert"
    <[email protected]> wrote in message
    news:erafv6$35u$[email protected]..
    > You are actually making your server work harder by omitting the full file
    > reference and I can't really see what you are gaining. You can define a
    > homepage within DW but that is only one for the site and not for each
    > folder.
    >
    > --
    > Paul Whitham
    > Certified Dreamweaver MX2004 Professional
    > Adobe Community Expert - Dreamweaver
    >
    > Valleybiz Internet Design
    > www.valleybiz.net

  • SAPUI5 in SAP MII 14.0 - error on running index.html

    Dear All,
    I am working on SAP MII 14.0.
    I am trying to run a sample example on SAPUI5 implementation in SAP MII 14.0 taken from link: http://scn.sap.com/community/manufacturing/mii/blog/2013/03/21/making-engaging-ui-on-sap-mii-with-sapui5
    But I got an Error when I tested index.html page !!
    Problem Description:
    My index.html code is :
    <!DOCTYPE HTML>  
    <html><head>
    <meta http-equiv="X-UA-Compatible" content="IE=edge"> 
    <script src="/sapui5/resources/sap-ui-core.js" 
                          id="sap-ui-bootstrap"  type="text/javascript" 
                          data-sap-ui-libs="sap.ui.commons,sap.ui.table,sap.viz, sap.ui.ux3" 
                          data-sap-ui-theme="sap_goldreflection" > 
    </script>
      </head>  
          <body class="sapUiBody" role="application">  
          <div id='plantkpiDiv'></div>
    <script>  
      alert("1"); // this comes
      //register the application  
    jQuery.sap.registerModulePath("kpidashboard", "/XMII/CM/547555/SAPUI5/kpidashboard/webcontent"); 
      alert("2");   // this comes
    //instantiate the view
      var plantView = sap.ui.view({id:"idPlantView", viewName:"kpidashboard.PlantView", type:sap.ui.core.mvc.ViewType.JS});  
       //add the view to the div 
    alert("3"); // this does not come
      plantView.placeAt("plantkpiDiv");    
       </script> 
           </body>
    </html>  
    Folder Structure in Workbench is as follows:  /XMII/CM/547555/SAPUI5/kpidashboard/webcontent
    Inside webcontent I have created three files i.e. PlantView.controller.js , PlantView.view.js and index.html.
    Webpage error details
    User Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0)
    Timestamp: Tue, 11 Mar 2014 10:32:30 UTC
    Message: Unterminated string constant
    Line: 13
    Char: 260
    Code: 0
    URI: http://inpuneme01:50200/XMII/CM/547555/SAPUI5/kpidashboard/webcontent/index.html?JSESSIONID=G59R-clhqrId7QDW_a_VBOSXQqyvRAEC2ZsB_SAPqh1fgxWruoAQAKceTGLKZ-6J
    Message: failed to load 'kpidashboard.PlantView.view' from /XMII/CM/547555/SAPUI5/kpidashboard/webcontent/PlantView.view.js: SyntaxError: Unterminated string constant
    Line: 41
    Char: 11332
    Code: 0
    URI: http://inpuneme01:50200/sapui5/resources/sap-ui-core.js
    Thanks and Regards,
    Anshul Arora

    Hi Rohit,
    I checked and found that PCo Mgmt service was not started. SO I started it and now I am able to get the XML when I open PCO Mgmt URL in the browser of PCo Server
    But,
    When my agent is running, I don't get Browse button enabled in "Subscription Items" tab.
    When I stop the agent instance, I can see Browse button enabled in "Subscription Items" tab. But when I click on browse, it gives me following error:
    Not sure what's wrong?
    Soham

  • Index.html - SWF file not working in browser

    The navigation button on my Flash/SWF intro page (www.benscarrillustration.com) isn't working in web browsers. What am I not doing correctly?
    I have the following 3 files in my cPanel File Manager....
    1. index.html
    2. Ben Scarr Illustration.swf
    3. Ben Scarr illustration.html
    <!DOCTYPE html>
    <html>
        <head>
            <meta charset="UTF-8">
            <link rel="shortcut icon" href="/avatar.ico" type="image/x-icon">
            <title>Ben Scarr Illustration</title>
            <style type="text/css" media="screen">
            html, body { height:100%; background-color: #ffffff;}
            body { margin:0; padding:0; overflow:hidden; }
            #flashContent { width:1214px; height:100%; margin:auto;}
            </style>
        </head>
        <body>
            <div id="flashContent">
                <object type="application/x-shockwave-flash" data="Ben Scarr Illustration.swf" width="1214" height="717" id="Ben Scarr Illustration" style="float: none; vertical-align:middle">
                    <param name="movie" value="Ben Scarr Illustration.swf" />
                    <param name="quality" value="high" />
                    <param name="bgcolor" value="#ffffff" />
                    <param name="play" value="true" />
                    <param name="loop" value="true" />
                    <param name="wmode" value="window" />
                    <param name="scale" value="showall" />
                    <param name="menu" value="true" />
                    <param name="devicefont" value="false" />
                    <param name="salign" value="" />
                    <param name="allowScriptAccess" value="sameDomain" />
                    <a href="http://www.adobe.com/go/getflash">
                        <img src="http://www.adobe.com/images/shared/download_buttons/get_flash_player.gif" alt="Get Adobe Flash player" />
                    </a>
                </object>
            </div>
        </body>
    </html>

    Thanks for the information about Flash no longer being the best or contemporary way to design web sites. This shows me how long I've been away from designing them.
    I will redesign my web site in Dreamweaver very soon, as I would like it to play on iOS devices.
    But at the moment I would just like to see my Flash page working in a browser.
    I have inserted my Flash page into Dreamweaver, and new code has been added automatically. But the navigation button still doesn't work.
    <!DOCTYPE html>
    <html>
        <head>
            <meta charset="UTF-8">
            <link rel="shortcut icon" href="/avatar.ico" type="image/x-icon">
            <title>Ben Scarr Illustration</title>
            <style type="text/css" media="screen">
            html, body { height:100%; background-color: #ffffff;}
            body { margin:0; padding:0; overflow:hidden; }
            #flashContent { width:1214px; height:100%; margin:auto;}
            </style>
        <script src="Scripts/swfobject_modified.js" type="text/javascript"></script>
        </head>
        <body>
            <div id="flashContent">
                <object type="application/x-shockwave-flash" data="benscarrillustration.swf" width="1214" height="717" id="benscarrillustration" style="float: none; vertical-align:middle">
                    <param name="movie" value="benscarrillustration.swf" />
                    <param name="quality" value="high" />
                    <param name="bgcolor" value="#ffffff" />
                    <param name="play" value="true" />
                    <param name="loop" value="true" />
                    <param name="wmode" value="window" />
                    <param name="scale" value="showall" />
                    <param name="menu" value="true" />
                    <param name="devicefont" value="false" />
                    <param name="salign" value="" />
                    <param name="allowScriptAccess" value="sameDomain" />
                    <embed src="benscarrillustration.html" width="1214" height="717" quality="high" bgcolor="#ffffff" play="true" loop="true" wmode="window" scale="showall" menu="true" devicefont="false" salign="" allowscriptaccess="sameDomain"><noembed><img src="http://www.adobe.com/images/shared/download_buttons/get_flash_player.gif" alt="Get Adobe Flash player" /></noembed></embed>
                    <a href="http://www.adobe.com/go/getflash">
                    </a>
                </object>
                <object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" width="1214" height="717" id="FlashID" title="benscarrillustration.swf">
                  <param name="movie" value="benscarrillustration.swf" />
                  <param name="quality" value="high" />
                  <param name="wmode" value="opaque" />
                  <param name="swfversion" value="6.0.65.0" />
                  <!-- This param tag prompts users with Flash Player 6.0 r65 and higher to download the latest version of Flash Player. Delete it if you don’t want users to see the prompt. -->
                  <param name="expressinstall" value="Scripts/expressInstall.swf" />
                  <!-- Next object tag is for non-IE browsers. So hide it from IE using IECC. -->
                  <!--[if !IE]>-->
                  <object type="application/x-shockwave-flash" data="benscarrillustration.swf" width="1214" height="717">
                    <!--<![endif]-->
                    <param name="quality" value="high" />
                    <param name="wmode" value="opaque" />
                    <param name="swfversion" value="6.0.65.0" />
                    <param name="expressinstall" value="Scripts/expressInstall.swf" />
                    <!-- The browser displays the following alternative content for users with Flash Player 6.0 and older. -->
                    <div>
                      <h4>Content on this page requires a newer version of Adobe Flash Player.</h4>
                      <p><a href="http://www.adobe.com/go/getflashplayer"><img src="http://www.adobe.com/images/shared/download_buttons/get_flash_player.gif" alt="Get Adobe Flash player" width="112" height="33" /></a></p>
                    </div>
                    <!--[if !IE]>-->
                  </object>
                  <!--<![endif]-->
              </object>
            </div>
        <script type="text/javascript">
    swfobject.registerObject("FlashID");
            </script>
        </body>
    </html>

  • How to integrate IAS with Snacktory to get the main text from an HTML page

    Hi all,
    I am new to Endeca and IAS. I have a requirement to get the main text from the whole HTML page before IAS saves the text to the Endeca_Document_Text property.
    Because IAS saves all the text in the page to the Endeca_Document_Text property, it is not readable when shown in a web page, so I use a third-party API to filter the main text out of the original page.
    Now I want to save this text to the Endeca_Document_Text property.
    Another question:
    I get zero pages when I do the main-text filtering on the original HTML in a ParseFilter (HTMLMetatagFilter implements ParseFilter) using Snacktory.
    If the filter does only a little work, it works fine; if it does more, the crawler fails to crawl the page. Does anyone know how to fix this?
    Log for the crawler:
    Successfully set recordstore configuration.
    INFO    2013-09-03 00:56:42,743    0    com.endeca.eidi.web.Main    [main]    Reading seed URLs from: /home/oracle/oracle/endeca/IAS/3.0.0/sample/myfirstcrawl/conf/endeca.lst
    INFO    2013-09-03 00:56:42,744    1    com.endeca.eidi.web.Main    [main]    Seed URLs: [http://www.liferay.com/community/forums/-/message_boards/category/]
    INFO    2013-09-03 00:56:43,497    754    com.endeca.eidi.web.db.CrawlDbFactory    [main]    Initialized crawldb: com.endeca.eidi.web.db.BufferedDerbyCrawlDb
    INFO    2013-09-03 00:56:43,498    755    com.endeca.eidi.web.Crawler    [main]    Using executor settings: numThreads = 100, maxThreadsPerHost=1
    INFO    2013-09-03 00:56:44,163    1420    com.endeca.eidi.web.Crawler    [main]    Fetching seed URLs.
    INFO    2013-09-03 00:56:46,519    3776    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-1]    come into EndecaHtmlParser getParse
    INFO    2013-09-03 00:56:46,519    3776    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-1]    come into HTMLMetatagFilter
    INFO    2013-09-03 00:56:46,519    3776    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-1]    meta tag viewport ==minimum-scale=1.0, width=device-width
    INFO    2013-09-03 00:56:52,889    10146    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-1]    come into EndecaHtmlParser getParse
    INFO    2013-09-03 00:56:52,889    10146    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-1]    come into HTMLMetatagFilter
    INFO    2013-09-03 00:56:52,890    10147    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-1]    meta tag viewport ==minimum-scale=1.0, width=device-width
    INFO    2013-09-03 00:56:59,184    16441    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-2]    come into EndecaHtmlParser getParse
    INFO    2013-09-03 00:56:59,185    16442    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-2]    come into HTMLMetatagFilter
    INFO    2013-09-03 00:56:59,185    16442    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-2]    meta tag viewport ==minimum-scale=1.0, width=device-width
    INFO    2013-09-03 00:57:07,057    24314    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-2]    come into EndecaHtmlParser getParse
    INFO    2013-09-03 00:57:07,057    24314    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-2]    come into HTMLMetatagFilter
    INFO    2013-09-03 00:57:07,057    24314    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-2]    meta tag viewport ==minimum-scale=1.0, width=device-width
    INFO    2013-09-03 00:57:07,058    24315    com.endeca.eidi.web.Crawler    [main]    Seeds complete.
    INFO    2013-09-03 00:57:07,090    24347    com.endeca.eidi.web.Crawler    [main]    Starting crawler shut down
    INFO    2013-09-03 00:57:07,095    24352    com.endeca.eidi.web.Crawler    [main]    Waiting for running threads to complete
    INFO    2013-09-03 00:57:07,095    24352    com.endeca.eidi.web.Crawler    [main]    Progress: Level: Cumulative crawl summary (level)
    INFO    2013-09-03 00:57:07,095    24352    com.endeca.eidi.web.Crawler    [main]    host-summary: www.liferay.com to depth 1
    host    depth    completed    total    blocks
    www.liferay.com    0    0    1    1
    www.liferay.com    1    0    0    0
    www.liferay.com    all    0    1    1
    INFO    2013-09-03 00:57:07,096    24353    com.endeca.eidi.web.Crawler    [main]    host-summary: total crawled: 0 completed. 1 total.
    INFO    2013-09-03 00:57:07,096    24353    com.endeca.eidi.web.Crawler    [main]    Shutting down CrawlDb
    INFO    2013-09-03 00:57:07,160    24417    com.endeca.eidi.web.Crawler    [main]    Progress: Host: Cumulative crawl summary (host)
    INFO    2013-09-03 00:57:07,162    24419    com.endeca.eidi.web.Crawler    [main]   Host: www.liferay.com:  0 fetched. 0.0 mB. 0 records. 0 redirected. 4 retried. 0 gone. 0 filtered.
    INFO    2013-09-03 00:57:07,162    24419    com.endeca.eidi.web.Crawler    [main]    Progress: Perf: All (cumulative) 23.6s. 0.0 Pages/s. 0.0 kB/s. 0 fetched. 0.0 mB. 0 records. 0 redirected. 4 retried. 0 gone. 0 filtered.
    INFO    2013-09-03 00:57:07,162    24419    com.endeca.eidi.web.Crawler    [main]    Crawl complete.
    ~/oracle/endeca
    -======================================
    source code for parsefilter
    package com.endeca.eidi.web.parse;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.log4j.Logger;
    import org.apache.nutch.metadata.Metadata;
    import org.apache.nutch.parse.HTMLMetaTags;
    import org.apache.nutch.parse.Parse;
    import org.apache.nutch.parse.ParseData;
    import org.apache.nutch.parse.ParseFilter;
    import org.apache.nutch.protocol.Content;
    import de.jetwick.snacktory.ArticleTextExtractor;
    import de.jetwick.snacktory.JResult;
    public class HTMLMetatagFilter implements ParseFilter {
        public static String METATAG_PROPERTY_NAME_PREFIX = "Endeca.Document.HTML.MetaTag.";
        public static String CONTENT_TYPE = "text/html";
        private static final Logger logger = Logger.getLogger(HTMLMetatagFilter.class);

        public Parse filter(Content content, Parse parse) throws Exception {
            logger.info("come into EndecaHtmlParser getParse");
            logger.info("come into HTMLMetatagFilter");
            // Check for missing parse data before touching its metadata.
            ParseData parseData = parse.getData();
            if (parseData == null) return parse;
            parseData.getParseMeta().add("FILTER-HTMLMETATAG", "ACTIVE");
            //update the content with the main text in html page
            //content.setContent(HtmlExtractor.extractMainContent(content));
            extractText(content, parse);
            logger.info("update the content with the main text content");
            return parse;
        }

        // Extracts the main article text and stores it in the parse metadata.
        private void extractText(Content content, Parse parse) {
            try {
                ParseData parseData = parse.getData();
                if (parseData == null) return;
                Metadata md = parseData.getParseMeta();
                ArticleTextExtractor extractor = new ArticleTextExtractor();
                String sourceHtml = new String(content.getContent());
                JResult res = extractor.extractContent(sourceHtml);
                String text = res.getText();
                md.set("Endeca_Document_Text", text);
            } catch (Exception e) {
                // Log the failure instead of swallowing it silently.
                logger.warn("Main text extraction failed", e);
            }
        }

        public static void log(String msg) {
            System.out.println(msg);
        }

        public Configuration getConf() {
            return null;
        }

        public void setConf(Configuration conf) {
        }
    }

    but it only extracts URLs from <A> (anchor) tags. I want to be able to extract URLs from <MAP> tags as well.
    Gee, do you think you could modify the code to check for "Map" attributes as well?
    Can someone maybe point me to a page containing info on the HTML toolkit?
    It's called the API. Since you are using the HTMLEditorKit, an ElementIterator and an AttributeSet, I would start there.
    There is no such API that says "get me all the links", so you have to do a little work on your own.
    Maybe you could use a ParserCallback and every time you get a new tag you check for the "href" attribute.
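
    A minimal sketch of that ParserCallback idea, assuming the HTML is already available as a String. The names LinkExtractor, HrefCollector and extractLinks are made up for illustration and are not from the original post. It checks every start tag and every simple (empty) tag for an "href" attribute, so it should pick up <AREA> tags inside a <MAP> as well as <A> tags.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, not part of the JDK or the original code.
final class LinkExtractor {

    // Collects the href value of every tag that carries one.
    static class HrefCollector extends HTMLEditorKit.ParserCallback {
        final List<String> links = new ArrayList<>();

        @Override
        public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
            collect(attrs); // tags with content, such as <a>
        }

        @Override
        public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
            collect(attrs); // empty tags, such as <area>
        }

        private void collect(MutableAttributeSet attrs) {
            Object href = attrs.getAttribute(HTML.Attribute.HREF);
            if (href != null) {
                links.add(href.toString());
            }
        }
    }

    // Parses the given HTML and returns the href values in document order.
    static List<String> extractLinks(String html) throws IOException {
        HrefCollector collector = new HrefCollector();
        try (Reader reader = new StringReader(html)) {
            // 'true' tells the parser to ignore any charset declaration in the page.
            new ParserDelegator().parse(reader, collector, true);
        }
        return collector.links;
    }
}
```

    Calling LinkExtractor.extractLinks(html) on a page returns the collected href values as a list. To restrict the result to particular tags, compare the tag argument against HTML.Tag.A or HTML.Tag.AREA before collecting.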

Maybe you are looking for

  • Using Adobe Media Encoder with After Effects and ExtendScript

    Hi, I'm trying to automate the process of encoding a video using the Adobe Media Encoder. The project I'm working on includes allowing a user to upload a video and having that video then be encoded in different formats. I'm very new to After Effects

  • Dead links - not another "how to remove them" thread

    I know this question has been answered a few times but the question is not a "how to remove them" or "what do they mean" question. I would like to know how this happens other than changing the file directory or information outside of iTunes. Here is

  • Torch 9860 stopped pairing with Mercedes C180

    My BB9860 suddenly stopped pairing with my MB car. It keeps ringing endlessly when connecting via Bluetooth. Other phones (HTC, iPhone) can pair with the same car. I am running the latest software from RIM. Could you help, please?

  • ITunes Wish List on iOS?

    I looked in the App Store and iTunes apps, but I can't find my Wish List!  Can I get it some other way?

  • Arguments in shell script

    HI all, I have a shell script test.sh it contains the following A=$1 B=$2 C=$3 D=$4 sqlplus usr/pwd<<eof a varchar2(10); b varchar2(10); c date; d date; a:='$A'; b:='$B'; c:='$C'; d:='$D'; func(a,b,c,d(; eof func() defbn func(name varchar2,trc varcha