Search Engine Robot

Hi,
Version: Sun ONE Portal Server 6.0
Is there any way to catch an event from the search engine robot whenever a new RD (resource description) is generated?
Also, how can I retrieve the details of that RD?
Any help will be appreciated.
--Anu.

On 04 Mar 2007 in macromedia.dreamweaver, gareth wrote:
> Can you post your main site address, so I can see your robots.txt
> file, as I can't believe the search engines are ignoring it.
The underlying problem here is that /well-behaved/ spiders will obey ROBOTS metas and robots.txt. Not all bots are well-behaved. After trying those techniques, I finally had to ban Baidu's spider by IP block because it was hitting my site so hard.
Also, once a search engine spider gets a page into its cache, it may well start its search at a lower level, completely missing a top-level robots.txt.
I suspect that the only ways to get rid of them are really hard-core methods:
- Delete/rename the file(s) or directory(ies) in question
- Use some kind of server-side security to restrict access without a password
- Ban the problematic spider by IP block
Joe Makowiec
http://makowiec.net/
Email:
http://makowiec.net/email.php
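
The last option in the list above, banning a spider by IP block, can be sketched as follows. This is only a minimal illustration in Python: the network ranges are placeholder documentation addresses, not the real ranges of any crawler, and `is_blocked` is a hypothetical helper name. In practice the block would more likely live in the web server or firewall configuration.

```python
import ipaddress

# Hypothetical blocked ranges. These are RFC 5737 documentation addresses,
# NOT the real address ranges of Baidu or any other crawler.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True when client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    for ip in ("203.0.113.7", "198.51.100.200", "192.0.2.1"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

A request handler would call `is_blocked` with the client address and refuse the request when it returns True.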

Similar Messages

  • How do prevent HTML snippets from being listed on search engines?

    I just created a new website using iWeb 3.0.1. On a couple of my pages I embedded Flash music players and video players using the HTML Snippet widget. I did a quick search for my website on Google and noticed that the HTML widgets were showing in the results as separate pages. When I clicked on these HTML Snippet pages, the widget came up on a separate page by itself. How do I prevent search engines from listing any HTML Snippets on my iWeb site as separate pages?
    Thanks
    athafran

    Paste this into the HTML Snippet.
    In the <body> ... </body> part, paste the code you currently use for your Snippet.
    I haven't tested the noindex part, but iWeb accepts the code itself.
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    <meta name="robots" content="noindex,nofollow">
    <meta name="description" content="instructions for excluding search engine robots">
    <title>Search Engine exclusion</title>
    </head>
    <body>
    Here your current code.
    </body>
    </html>
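
    Since the noindex part above is untested, one way to at least confirm that a published snippet page carries the tag is to parse it and list its robots directives. The following is a rough Python sketch using only the standard library; `RobotsMetaParser`, `robots_directives`, and `SAMPLE_PAGE` are illustrative names, and it only shows that the tag is present, not that any particular search engine honors it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # The content attribute is a comma-separated list such as "noindex,nofollow".
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Sample page modelled on the snippet above.
SAMPLE_PAGE = """
<html><head>
<meta name="robots" content="noindex,nofollow">
<title>Search Engine exclusion</title>
</head><body>Here your current code.</body></html>
"""

if __name__ == "__main__":
    print(sorted(robots_directives(SAMPLE_PAGE)))
```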

  • Free iWeb SEO Tool to Help Improve Search Engine Rankings

    Hi,
    I have just posted a free iWeb SEO Tool that lets you add and edit all your meta tags and header information, as well as a few other important search engine optimization factors. Here is a brief description of what it can do:
    iWeb SEO Tool is the only software that makes it easy to get your iWeb built website ready for search engines.
    Since Apple has often neglected key SEO strategies in their iWeb software, it is difficult for many iWeb based websites to rank high in search engines. This is why we have created this free utility to help you properly optimize your website.
    Features include:
    1) Easily add meta tags such as description and keywords
    2) Edit the title tag for each page
    3) Add robot rules and language meta tags
    4) Add alternative text for images
    5) All settings are saved in a private database so next time you publish your site you can take all the saved SEO copy and apply to your new site
    6) Edit sites locally or directly on your iDisk
    I am looking for feedback on this tool to see how useful it can be. If you find any problems, please let me know. You can download this tool by creating a free beta tester's account at:
    http://www.trybeta.com/home/
    Please let me know if this is useful for anyone. It is completely free for all iWeb users.
    *I may receive some form of compensation, financial or otherwise, from my recommendation or link*
    <Edited by Moderator>

    1. To publish to .Mac just mount your iDisk and go to Web/Sites/ and you will see your old site there. Delete that site and just copy your new site to the location.
    2. Just press the load from iDisk button in the toolbar and enter your username and password. Your sites from your iDisk will load in iWeb SEO Tool.

  • Need advice on web search engine

    I may need to install a search engine for an ASP.NET (.aspx) web site I am working on. I know nothing about ASP.NET, and these are secure web pages. Also, this is for just a portion of the web site. I have no experience with search engines.
    First question: Can search engines limit their search to just a section of a web site?
    Second question: What would you recommend for my circumstances?
    Thanks!

    Answer to your first question: No, I don't want to provide a search engine for the entire site. There is a section you go into with several divisions. One division is the directory of services. The way that it's set up now, they have about 9 PDF files that are linked to 9 headings. You click on a heading and it opens the PDF for that topic. The problem is that it takes a long time for the PDF to open, and then the person may have to search it for the information they are looking for.
    I think you may have answered my question already. If robot files are required to limit a search, then a search engine may not be the practical choice in my case.
    By the way, to answer your last question, I am not coding in ASP.NET. I have made it clear to these people that I don't know a line of it. But I also think the solution to the problem is to convert the Directory to web pages, set up links to topics, and maybe add a JavaScript jump menu for linking to other bookmarked information. That way the search is limited to the topics we decide on beforehand and also limited to the directory.
    What do you think of this solution? Do you foresee any problems?
    Michael

  • Meta tag /search engine issue

    I am looking for code to block a search engine from scanning and indexing a site. We have several small sites that, once up and running, will be indexed by all the search engines. Two of the sites need to be customized for regional audiences, and I cannot have those sites come up as duplicates and ruin our rankings. The regional sites must not get indexed, and I am having a hard time finding a resource.
    Thanks in advance

    Example of a robots.txt file and another couple of articles about this:
    http://tinyurl.com/2ekwjy
    Nadia
    Adobe® Community Expert : Dreamweaver
    CSS Templates |Tutorials |SEO Articles
    http://www.DreamweaverResources.com
    ~ Customisation Service Available ~
    http://www.csstemplates.com.au
    ~ Forum Posting Guidelines ~
    http://www.adobe.com/support/forums/guidelines.html

  • Sun One Search Engine Integration with the Fatwire Content Server

    Hello everyone,
    I am presently using FatWire (Divine) Content Server. I have uploaded certain documents using FatWire flex attributes (metadata). I want to use the portal search engine to implement full-text search of the uploaded documents. How can I map the metadata (attributes) of the FatWire/Divine content server to the portal server search engine metadata?
    I hope someone has tried this before while integrating a content server with the portal. Please give me some suggestions in this regard.
    Thank you
    jenni

    Hi,
    I don't know about the metadata mapping,
    but you can definitely index the FatWire local
    data directory with the search engine.
    Just specify "file:///.../<fatwiredir>" as starting point for your search robot.
    Cheers,
    Alex :-)
    PS: After "Sun Forum Accounts Update" I couldn't login to this forum and at SUN
    no one cares - they just ignore my mails. "Thanks a lot" for supporting free community!
    (Check my old profile at <http://swforum.sun.com/jive/profile.jspa?userID=3455>)
    OK. I have now a new account and I will try to help you out here...
    -------------------------------------------------------------------------

  • [FLASH] Macromedia Flash Search Engine SDK

    I have read that with the Macromedia Flash Search Engine SDK you can see what Google indexes from SWF files.
    I would like to know where I can download it, since the official download URL shows NOT AVAILABLE.
    Many thanks,
    Xiskya

    Well, I would like to have an answer so compact and so sure in its content that even the engineers at Google and Adobe would be left open-mouthed, but that is not the case. They too have been chasing a solution to this troublesome matter for a long time, and it is very hard to get Google's ingenious robot to render the content of interactive Flash movies, or to make SWFs transparent enough for a machine's clumsy reading. Still, we can do a lot to make indexing easier, although I repeat that none of what I am about to tell you is absolutely reliable and that everything has its advantages and drawbacks. Let's see:
    - For years now Google has indexed part of the textual content (static text, not external) of the SWF file, especially the titles at the top of screens, links built directly (not as buttons), and some JavaScript content. Contrary to popular belief, this happens not only with the main movies but also with those loaded from them, but we will never be sure whether a given static text will be interpreted correctly by Big Brother.
    To make sure it is, we have to resort to a technique that usually works very well.
    For this you have to understand that web development should separate content from style and from behavior. Although Flash can fall into all three categories, for some time now we developers have tended to move the content of our movies into external files (for modular loading of information, for the organization of classes, or for the division of work between designers and programmers). This technique consists of using FlashObject to embed our movie in the HTML through a JavaScript file, as we have been doing routinely for some time, because otherwise Internet Explorer greets us with an annoying little warning.
    First we build our main page as if it had no Flash and pointed to a link that shows the content (an HTML file, an XML file, or even a database, whatever you want Google to index). Only if the user has the Flash plugin version we require and has JavaScript enabled do we send the Flash content, like this:
    <div id="flashcontent">
    This is replaced by the Flash content if the user has the correct version of the Flash plugin installed. Put your HTML content here and Google will index it on its own, since it is ordinary HTML content. You can use HTML markup, insert images, or anything else that can go in an HTML page.
    </div>
    <script type="text/javascript">
    // <![CDATA[
    var fo = new FlashObject("flashmovie.swf", "flashmovie", "300", "300", "8", "#FF6600");
    fo.write("flashcontent");
    // ]]>
    </script>
    This, written in your JS file, makes Google skip the SWF and index what takes its place; but if the visitor has the plugin installed and JavaScript enabled, the movie is shown. Even if you point your site at a database, Google will re-index it automatically and continuously (unless the robots.txt file on your server tells it otherwise). You can include external links in the alternative content, but be careful: you could be considered a link farm and penalized.
    There are other techniques, but this one gives me very good results.
    Regards
    `8¬]
    Juan Muro
    "xiskya_lucy" <[email protected]> wrote in message news:[email protected]...
    > Hi Juan!!
    >
    > I'm glad to run into you again!! And as always, right on time with your answers. Many thanks.
    >
    > Actually, what would help me most is to know what I should keep in mind so that
    > Google indexes my SWF files more or less well. And I'm sure you can help me with that.
    >
    > Many thanks in advance.
    >
    > Xiskya

  • Portal Page Dynamic URLS & Search Engines

    Hi,
    I have created a portal site. The URLs of its portal pages are created dynamically.
    How can I optimize this dynamic (and long) URL for a search engine?
    I am having trouble getting them spidered and ranked in major search engines.
    Any suggestions?
    Thanks.

    Despite many posts on this issue, I am skeptical that there is a problem with search engines indexing pages with query strings. I have a number of pages with URLs like showdetail.cfm?articleID=25 and they are always indexed by Google and other search engines.
    It is my understanding that it largely depends on how you link to the detail pages. If you have direct anchors to the detail pages, Google will index them. If you use a form or something else, Google is much less likely to follow.
    For example:
    <a href="showdetail.cfm?articleID=25">The best red ball ever</a>
    <!--- This will be followed by Google to be indexed --->
    <form action="showdetail.cfm" method="get">
    <select name="articleID">
    <option value="25">The best red ball ever</option>
    <option value="26">The best green ball ever</option>
    <option value="27">The best blue ball ever</option>
    </select>
    <input type="submit"/>
    </form>
    <!--- This will not be followed by Google to index the detail pages. --->
    This makes some sense: how can a robot complete a form, even if it could determine that the form is for navigation?
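
    The difference between the two examples can be illustrated with a toy link extractor. This is only a sketch of a naive crawler that follows `<a href>` anchors and nothing else. It is written in Python with the standard library, the names `AnchorCollector` and `crawlable_links` are made up for this illustration, and real search engine robots may behave differently.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Record the href of every <a> tag; forms and selects are ignored."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawlable_links(html: str) -> list:
    """Return the URLs a naive anchor-following robot would discover."""
    collector = AnchorCollector()
    collector.feed(html)
    return collector.links


PAGE = """
<a href="showdetail.cfm?articleID=25">The best red ball ever</a>
<form action="showdetail.cfm" method="get">
<select name="articleID">
<option value="26">The best green ball ever</option>
<option value="27">The best blue ball ever</option>
</select>
<input type="submit"/>
</form>
"""

if __name__ == "__main__":
    # Only the anchor is found; articles 26 and 27 are reachable only via the form.
    print(crawlable_links(PAGE))
```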

  • PHP & Search engines

    Do search engines recognize text that is retrieved from a MySQL database with PHP?

    > I think a robot (spider) can get to any page on your site if it's on your server?
    Nonsense. If there's no link to it, then it can only be found by a lucky guess.
    > But the CAPTCHA is no longer secure
    If done right it is.
    > How do the mail spam bots get in they have to login first, right?
    Impossible.
    Murray --- ICQ 71997575
    Adobe Community Expert
    (If you *MUST* email me, don't LAUGH when you do so!)
    ==================
    http://www.projectseven.com/go - DW FAQs, Tutorials & Resources
    http://www.dwfaq.com - DW FAQs, Tutorials & Resources
    ==================
    "Baxter" <baxter(RemoveThe :-)@gtlakes.com> wrote in message news:[email protected]...
    > I don't know? But the CAPTCHA is no longer secure, I think a robot (spider)
    > can get to any page on your site if it's on your server? How do the mail spam
    > bots get in they have to login first, right?
    > Thanks for your info on this,
    > Dave
    > "Murray *ACE*" <[email protected]> wrote in message news:[email protected]...
    >> Such pages are secure. How would the spider login?
    >>
    >> "Baxter" <baxter(RemoveThe :-)@gtlakes.com> wrote in message news:[email protected]...
    >> > Yes that's what I mean. If you have to login to get to the page with the
    >> > database info or there is no link to the page they will not index it?
    >> > Thanks
    >> > I think that is what I was looking for.
    >> > Dave
    >> > "Murray *ACE*" <[email protected]> wrote in message news:[email protected]...
    >> >> If you can see it in your browser, they can read it. But they cannot
    >> >> enter information in forms or use logins and passwords, if that's what
    >> >> you mean.
    >> >>
    >> >> "bregent" <[email protected]> wrote in message news:[email protected]...
    >> >> > > Then they must have some way to not index pages with sensitive data on
    >> >> > > them, is that what you're saying? I know they can index the php, asp pages
    >> >> > > but what about the data that the page produces?
    >> >> >
    >> >> > Not sure what you are asking about sensitive data. They read pages the same
    >> >> > way they are rendered to browsers and follow the links on the pages. They
    >> >> > don't know if the text is coming from a database or is static. They don't
    >> >> > care if the data is sensitive or not. Look at the source view of a dynamic
    >> >> > web page. That is what the SE sees.

  • Developing search engine

    I want to develop a search engine that searches web and FTP sites for a given string.
    What should I do for this?
    Is there any code available for this?

    Just use Google. Use Google's API if you have to.
    Or use Amazon's search engine. ("Alexa", I think?) It has an API too.
    Or use Lucene.
    Although I can't help wondering if you're asking how to build a spider aka robot, rather than a search engine.

  • Are test sites able to be found on search engines?

    Hello all. I have created a test site and do not want it found by Google etc. Are BC test sites found by search engines?
    thanks
    Martin

    If you post a link to the development site on another site, forum, or blog that is already indexed, Google will follow the link and index the development site, unless you deny it fully with a robots.txt.
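
    As a rough illustration of what "deny it fully" means, the Python sketch below feeds a deny-all robots.txt to the standard library's `urllib.robotparser` and asks whether a crawler may fetch a page. The rule text, the `may_fetch` helper, and the dev.example.com URLs are assumptions made for this example; the exact file Business Catalyst generates for trial sites is not shown in this thread. Keep in mind that robots.txt only restrains robots that choose to obey it.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every robot to stay out of the whole site.
DENY_ALL = """\
User-agent: *
Disallow: /
"""

# For comparison: only one directory is off limits.
DENY_PRIVATE = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # dev.example.com is a placeholder host.
    print(may_fetch(DENY_ALL, "Googlebot", "https://dev.example.com/index.html"))
    print(may_fetch(DENY_PRIVATE, "Googlebot", "https://dev.example.com/index.html"))
```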

  • Making pages that will not be found by search engines?

    What can I do in Dreamweaver OR what code can I add to my HTML pages so that Google or any other search engines will not index the pages?
    Thanks.

    This site is a subfolder of a larger website and I do not have access to the root, so I am guessing I cannot use a robots.txt file.
    From what I have read, this seems like it will work:
    <meta name="robots" content="noindex">
    1.) Here is my existing code below. Do I just insert it after the <head>?
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
        <meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7" />
    Thanks!

  • Hide Muse wire frame from search engines while in dev

    Hi,
    I am using Muse to develop a wireframe for a client.
    Could you advise on how to hide the site from search engines during this process?
    Many Thanks
    Cas

    Hi Cas,
    If you have published the site to Business Catalyst and the site is still in trial mode, you have nothing to worry about. A robots.txt file is automatically added to the trial sites in BC. So search engines can't index the BC trial sites.
    Regards,
    Aish

  • How do I add multiple scripts from search engines to my meta tag properties?

    I have copied the Google script for website verification, analytics, etc. and pasted it into my meta tag properties dialog box. Google verifies the page without any problem. However, I would like to add Bing's search engine script to my meta tag in addition to Google's script. How do I go about doing this? Do I hit Return on my keyboard after the end of Google's script, then paste in the Bing script?
    The last part of the Google script ends like this:
    </script>
    (paste new script from Bing here?)
    Will the two scripts cancel each other out and cause problems?
    Can someone walk me through this process? Bing will not verify my site through two of the three other methods.
    Ben

    Adding a script after the closure of previous script is the way to go i.e. right after the </script> tag.
    So it should look something like below:
    <script>
    Google's script
    </script>
    <script>
    Bing's script
    </script>
    Cannot comment on one interfering with the other since it really depends on what exact code is there in the scripts. Google and Bing help resources will be able to help more with this.
    Thanks,
    Vikas

  • How do I delete a toolbar that was added by a plugin? Plug-in's been uninstalled, but the toolbar is still there and any new tabs open on their search engine (even though I have Tab Mix set to open tabs on a duplicate page). Help?

    Running Firefox on Windows XP3. I installed a plugin from Savevid.com, which they now require. (I've used that site for at least a year with no problem ever.) Everything was fine but it added a toolbar to my browser, and it seems to have reset Firefox somehow so that any new tab opens on "Searchqu", some low-rent search engine. I uninstalled the plug-in altogether, but the toolbar is still there, and I can't make the reset on the search engine go away. I use Tab Mix Plus (also never had this problem before with it), but no matter what I do, the tabs will not open to anything other than searchqu. (Tab Mix has the option to open to a duplicate tab - what it was using BEFORE all this - but changing that no longer has any effect.)
    Anyone know how to deal with a problem like this?

    Start Firefox in Safe Mode to check if one of the extensions (Firefox/Tools > Add-ons > Extensions) or if hardware acceleration is causing the problem (switch to the DEFAULT theme: Firefox/Tools > Add-ons > Appearance).
    *Do NOT click the Reset button on the Safe Mode start window or otherwise make changes.
    *https://support.mozilla.org/kb/Safe+Mode
    *https://support.mozilla.org/kb/Troubleshooting+extensions+and+themes
    You can also check for problems with the localstore.rdf file.
    *http://kb.mozillazine.org/Corrupt_localstore.rdf
