Search Engine Robot
Hi,
version: Sun One Portal Server 6.0
Is there any way to catch the event from search engine Robot whenever a new RD is generated?
Also how to retrieve the details of that RD?
Any help will be appreciated.
--Anu.
On 04 Mar 2007 in macromedia.dreamweaver, gareth wrote:
> Can you post your main site address, so I can see your robots.txt
> file, as I can't believe the search engines are ignoring it.
The underlying problem here is that /well behaved/ spiders will obey ROBOTS metas and robots.txt. Not all bots are well-behaved. After trying those techniques, I finally had to ban Baidu's spider by IP block because it was hitting my site so hard.
Also, once a SE spider gets a page into its cache, it may well start its search at a lower level, completely missing a top-level robots.txt.
I suspect that the only way to get rid of them is to use really hard-core methods:
- Delete/rename the file(s) or directory(ies) in question
- Use some kind of server-side security to restrict access without a password
- Ban the problematic spider by IP block
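As a rough illustration of the last option, here is a minimal Python sketch of an IP-range check that a server-side filter could run before serving a request. The network ranges are reserved documentation addresses used as placeholders, not real crawler addresses, and `is_blocked` is a hypothetical helper name.

```python
import ipaddress

# Hypothetical blocklist: reserved documentation ranges, NOT real crawler addresses.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)
```

A request handler would call `is_blocked()` with the client address and refuse the request when it returns True. In practice the same rule is often expressed in the web server or firewall configuration instead.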
Joe Makowiec
http://makowiec.net/
Email: http://makowiec.net/email.php
Similar Messages
-
How do prevent HTML snippets from being listed on search engines?
I just created a new website using iWeb 3.0.1. On a couple of my pages I embedded flash music players and video players using the HTML Snippet widget. I did a quick search of my website on google and noticed that the HTML widgets were showing in the results as separate pages. I then clicked on these HTML Snippet pages and up came the widget on a separate page by itself. How do I prevent search engines from listing any HTML Snippets on my iWeb site as separate pages?
Thanks
athafran
Paste this in the HTML Snippet.
In the <body> ... </body> part you paste the code you currently use for your Snippet.
I haven't tried it, the noindex part, but the code itself is accepted by iWeb.
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="robots" content="noindex,nofollow">
<meta name="description" content="instructions for excluding search engine robots">
<title>Search Engine exclusion</title>
</head>
<body>
Here your current code.
</body>
</html>
-
Free iWeb SEO Tool to Help Improve Search Engine Rankings
Hi,
I have just posted a free iWeb SEO Tool that will let you add and edit all your meta tags and header information, as well as a few other important search engine optimization factors. Here is a brief description of what it can do;
iWeb SEO Tool is the only software that makes it easy to get your iWeb built website ready for search engines.
Since Apple has often neglected key SEO strategies in their iWeb software, it is difficult for many iWeb based websites to rank high in search engines. This is why we have created this free utility to help you properly optimize your website.
Features include;
1) Easily add meta tags such as description and keywords
2) Edit your title tags for each page
3) Add robot rules and language meta tags
4) Add alternative text for images
5) All settings are saved in a private database so next time you publish your site you can take all the saved SEO copy and apply to your new site
6) Edit sites locally or directly on your iDisk
I am looking for feedback on this tool to see how useful it can be. If you find any problems, please let me know. You can download this tool by creating a free beta testers account at;
http://www.trybeta.com/home/
Please let me know if this is useful for anyone. It is completely free for all iWeb users.
*I may receive some form of compensation, financial or otherwise, from my recommendation or link*
<Edited by Moderator>
1. To publish to .Mac just mount your iDisk and go to Web/Sites/ and you will see your old site there. Delete that site and just copy your new site to the location.
2. Just press the load from iDisk button in the toolbar and enter your username and password. Your sites from your iDisk will load in iWeb SEO Tool. -
Need advice on web search engine
I may need to install a search engine for an .aspx web site I am working on. I know nothing about ASP.NET, and these are secure web pages. Also, this is for just a portion of the web site. I have no experience with search engines.
First question: Can search engines limit their search to just a section of a web site?
Second question: What would you recommend for my circumstances?
Thanks!
Answer to your first question: No, I don't want to provide a search engine to the entire site. There is a section you go in with several divisions. One division is the directory of services. The way that it's set up now is they have about 9 PDF files that are linked to 9 headings. You click on a heading and it opens the PDF for that topic. The problem is that it takes a long time for the PDF to open and then the person may have to search it for the information they are looking for.
I think you may have answered my question already. If robot files are required to limit a search, then a search engine may not be the practical choice in my case.
By the way, to answer your last question, I am not coding in ASP.NET. I have made it clear to these people that I don't know a line of it. But I also think the solution to the problem is to convert the Directory to web pages, set up links of topics and maybe a JavaScript jump menu for linking to other bookmarked information. That way the search is limited to only the topics we decide beforehand, and also limited to the directory.
What do you think of this solution? Do you foresee any problems?
Michael -
Meta tag /search engine issue
I am looking for useful code to block a search engine from scanning and indexing a site. We have several small sites that, when up and running, will be indexed by all the search engines. Two of the sites need to be customized for regional audiences, and I cannot have those sites come up as duplicates and ruin our rankings. The regional sites must not get indexed, and I am having a hard time finding a resource.
Thanks in advance
Example of a robots.txt file and another couple of articles about this:
http://tinyurl.com/2ekwjy
Nadia
Adobe® Community Expert : Dreamweaver
CSS Templates |Tutorials |SEO Articles
http://www.DreamweaverResources.com
~ Customisation Service Available ~
http://www.csstemplates.com.au
~ Forum Posting Guidelines ~
http://www.adobe.com/support/forums/guidelines.html -
Sun One Search Engine Integration with the Fatwire Content Server
Hello everyone,
I am presently using FatWire (Divine) Content Server. I have uploaded certain documents using FatWire flex attributes (metadata). I want to use the portal search engine to implement full-text search of the uploaded documents. How can I map the metadata (attributes) of the FatWire/Divine Content Server to the portal server search engine metadata?
I hope someone has tried this before while integrating the content server with the portal. Please give me some suggestions in this regard.
Thank you
jenni
Hi,
I don't know about the metadata mapping,
but you can definitely index the FatWire local
data directory with the search engine.
Just specify "file:///.../<fatwiredir>" as starting point for your search robot.
Cheers,
Alex :-)
PS: After "Sun Forum Accounts Update" I couldn't login to this forum and at SUN
no one cares - they just ignore my mails. "Thanks a lot" for supporting free community!
(Check my old profile at <http://swforum.sun.com/jive/profile.jspa?userID=3455>)
OK. I have now a new account and I will try to help you out here...
------------------------------------------------------------------------- -
[FLASH] Macromedia Flash Search Engine SDK
I have read that with the Macromedia Flash Search Engine SDK you can see what Google indexes from SWF files.
I would like to know where I can download it, since the official download URL is NOT AVAILABLE.
Many thanks,
Xiskya
Well, I would like to have an answer so compact and so sure of its content that even the engineers at Google and Adobe would be left open-mouthed, but that is not the case: they too have been chasing a solution to this troublesome matter for a long time, and it is very complex to make Google's ingenious robot render the content of interactive Flash movies, or to make SWFs fully transparent to a machine's clumsy reading. But there is a lot we can do to make indexing easier, although I repeat that none of the things I will mention is absolutely reliable, and each has its advantages and drawbacks. Let's see:
- For years now Google has indexed part of the textual content (static text, not external) of the SWF file, especially the titles at the top of the screens, links made directly (not as buttons), and some JavaScript content. Contrary to popular belief, this happens not only with main movies but also with movies loaded from them, but we can never be sure that a static text will be interpreted correctly by Big Brother.
To make sure it is, we have to resort to a technique that usually works very well.
To understand it, keep in mind that web development should separate content from style and from behavior, and although Flash can fall into all three categories, for some time now we developers have tended to move the content of our movies into external files (for modular loading of the information, for the organization of classes, and for the division of work between designers and programmers). Well, this technique consists of using FlashObject to embed our movie in the HTML by means of a JavaScript file, as we have been doing routinely for some time, since otherwise Internet Explorer greets us with an annoying little warning.
First we build our main page as if it had no Flash, pointing to a link that shows the content (an HTML file, an XML file, or even a database, whatever you want Google to index), and only if the user has the Flash plugin required for our version and has JavaScript enabled do we send the Flash content, like this:
<div id="flashcontent">
This is replaced by the Flash content if the user has the correct version of the Flash plugin installed. Put your HTML content here and Google will index it on its own, since it is ordinary HTML content. Use HTML, insert images; it can be anything that works in a normal HTML page.
</div>
<script type="text/javascript">
// <![CDATA[
var fo = new FlashObject("flashmovie.swf", "flashmovie", "300", "300", "8", "#FF6600");
fo.write("flashcontent");
// ]]>
</script>
This, written in your JS file, makes Google skip the SWF and index what sits in its place, but if the visitor has the plugin installed and JavaScript enabled, the movie is shown. Even if you point your site at a database, Google will re-index it automatically and continuously (unless your server's robots.txt file tells it otherwise). You can include external links in the alternative content, but be careful, since you could be treated as a link farm and penalized.
There are other techniques, but this one gives me very good results.
Regards
`8¬]
Juan Muro
"xiskya_lucy" <[email protected]> wrote in message news:[email protected]...
> Hi Juan!!
>
> Glad to run into you again!! And, as always, right on time with your answers. A thousand thanks.
>
> What would really help me most is knowing what I should keep in mind so that Google indexes my SWF files more or less well. And I'm sure you can help me with that.
>
> Many thanks in advance.
>
> Xiskya
-
Portal Page Dynamic URLS & Search Engines
Hi,
I have created a portal site. These portal pages urls are created dynamically.
How can I optimize this dynamic (and long) URL for a search engine?
I am having trouble getting them spidered and ranked in major search engines.
Any suggestions?
Thanks.
Despite many posts on this issue, I am skeptical that there is a problem with search engines indexing pages with query strings. I have a number of pages with URLs like showdetail.cfm?articleID=25 and they are always indexed by Google and other search engines.
It is my understanding that it largely depends on how you link to the detail pages. If you have direct anchors to the detail pages, Google will index them. If you use a form or something else, Google is much less likely to follow.
For example:
<a href="showdetail.cfm?articleID=25">The best red ball ever</a>
<!--- This will be followed by Google to be indexed --->
<form action="showdetail.cfm" method="get">
<select name="articleID">
<option value="25">The best red ball ever</option>
<option value="26">The best green ball ever</option>
<option value="27">The best blue ball ever</option>
</select>
<input type="submit"/>
</form>
<!--- This will not be followed by Google to index the detail pages. --->
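The difference between the two snippets can be sketched from the robot's side. The following is a simplified Python illustration, not how any real search engine works: a link collector that only records `href` values from anchors, so the detail page offered through the form is never discovered. `AnchorCollector`, `discover_links`, and `PAGE` are hypothetical names for this example.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects href values from <a> tags, as a simple link-following robot might."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchors count as links; form fields and options are ignored.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def discover_links(html: str) -> list:
    """Return every anchor target found in the given HTML."""
    collector = AnchorCollector()
    collector.feed(html)
    return collector.links


PAGE = """
<a href="showdetail.cfm?articleID=25">The best red ball ever</a>
<form action="showdetail.cfm" method="get">
  <select name="articleID">
    <option value="26">The best green ball ever</option>
  </select>
  <input type="submit"/>
</form>
"""

print(discover_links(PAGE))  # ['showdetail.cfm?articleID=25']
```

Article 26 is only reachable by submitting the form, so it never shows up in the collected links.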
This makes some sense: how could a robot complete a form, even if it could determine that the form is for navigation?
-
Do search engines recognize text that is retrieved from a MySQL database with PHP?
> I think a robot (spider) can get to any page on your site if it's on your server?
Nonsense. If there's no link to it, then it can only be found by a lucky guess.
> But the CAPTCHA is no longer secure
If done right it is.
> How do the mail spam bots get in they have to login first, right?
Impossible.
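To illustrate the point about links, here is a small Python sketch of link-based discovery. It assumes a made-up site map (`SITE`) and a hypothetical `crawl` helper; real spiders are far more elaborate, but the reachability idea is the same.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
SITE = {
    "/index.html": ["/products.html", "/contact.html"],
    "/products.html": ["/index.html"],
    "/contact.html": [],
    "/unlinked-report.html": [],  # exists on the server, but nothing links to it
}


def crawl(site, start):
    """Breadth-first walk that only discovers pages reachable via links from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in site.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen
```

Starting from `/index.html`, the walk never reaches `/unlinked-report.html`, because no visited page links to it. A page behind a login is in the same position unless something outside the login links to it and it can be fetched without credentials.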
Murray --- ICQ 71997575
Adobe Community Expert
(If you *MUST* email me, don't LAUGH when you do so!)
==================
http://www.projectseven.com/go - DW FAQs, Tutorials & Resources
http://www.dwfaq.com - DW FAQs, Tutorials & Resources
==================
"Baxter" <baxter(RemoveThe :-)@gtlakes.com> wrote in message news:[email protected]...
> I don't know? But the CAPTCHA is no longer secure, I think a robot (spider)
> can get to any page on your site if it's on your server? How do the mail spam
> bots get in they have to login first, right?
> Thanks for your info on this,
> Dave
> "Murray *ACE*" <[email protected]> wrote in message
> news:[email protected]...
>> Such pages are secure. How would the spider login?
>>
>>
>> "Baxter" <baxter(RemoveThe :-)@gtlakes.com> wrote in message
>> news:[email protected]...
>> > Yes that's what I mean, If you have to login to get to the page with the
>> > database info or there is no link to the page they will not index it?
>> > Thanks
>> > I think that is what I was looking for.
>> > Dave
>> > "Murray *ACE*" <[email protected]> wrote in message
>> > news:[email protected]...
>> >> If you can see it in your browser, they can read it. But - they cannot
>> >> enter information in forms or use logins and passwords, if that's what
>> >> you mean.
>> >>
>> >>
>> >> "bregent" <[email protected]> wrote in message
>> >> news:[email protected]...
>> >> > >Then they must have some way to not index pages with sensitive data on
>> >> > >them is that what you're saying. I know they can index the php, asp
>> >> > >pages but what about the data that the page produces?
>> >> >
>> >> > Not sure what you are asking about sensitive data. They read pages the
>> >> > same way they are rendered to browsers and follow the links on the pages.
>> >> > They don't know if the text is coming from a database or is static. They
>> >> > don't care if the data is sensitive or not. Look at the source view of a
>> >> > dynamic web page. That is what the SE sees.
-
I want to develop a search engine that searches web and FTP sites for a given string.
What should I do for this?
Is there any code available for this...
Just use Google. Use Google's API if you have to.
Or use Amazon's search engine. ("Alexa", I think?) It has an API too.
Or use Lucene.
Although I can't help wondering if you're asking how to build a spider aka robot, rather than a search engine. -
Are test sites able to be found on search engines?
Hello all. I have created a site as a test and do not want this test site found by Google etc. Are BC test sites found by search engines?
thanks
Martin
If a link to the development site is posted on another site, forum, or blog that is indexed, Google will follow it and index the linked site, unless you deny crawlers fully with a robots.txt.
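For reference, a deny-all robots.txt is two lines, and Python's standard `urllib.robotparser` can show how a compliant crawler would read it. This is only a sketch: the host name is a placeholder, `may_fetch` is a hypothetical helper, and it says nothing about crawlers that ignore robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Deny-all rules for a test site; the host name used with them is a placeholder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A compliant crawler is told to stay out of every path.
print(may_fetch(ROBOTS_TXT, "Googlebot", "https://test.example.com/index.html"))
```

The file has to be served from the root of the host, for example /robots.txt, for crawlers to find it.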
-
Making pages that will not be found by search engines?
What can I do in Dreamweaver OR what code can I add to my HTML pages so that Google or any other search engines will not index the pages?
Thanks.
This site is a subfolder of a larger website and I do not have access to the root, so I am guessing I cannot use the robots.txt file.
From what I have read, this seems like it will work:
<meta name="robots" content="noindex">
1.) here is my existing code below. Do I just insert it after the <head>?
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7" />
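The robots meta tag belongs inside the head section, and its position relative to the other meta tags should not matter. As a sketch only, here is a small Python helper (a hypothetical `add_noindex`, not a Dreamweaver feature) that inserts the tag right after the opening head tag of an existing page, which could be handy when many pages in the subfolder need the same change.

```python
import re

NOINDEX_TAG = '<meta name="robots" content="noindex">'


def add_noindex(html: str) -> str:
    """Insert the robots meta tag right after <head>, unless one is already present."""
    if re.search(r'<meta\s+name=["\']robots["\']', html, flags=re.IGNORECASE):
        return html  # leave pages that already carry a robots meta untouched
    return re.sub(
        r"(<head(?:\s[^>]*)?>)",  # matches <head> or <head ...>, but not <header>
        lambda match: match.group(1) + "\n" + NOINDEX_TAG,
        html,
        count=1,
        flags=re.IGNORECASE,
    )
```

Calling `add_noindex()` on the markup above would place the tag on the line directly after `<head>`. Calling it a second time leaves the page unchanged.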
Thanks! -
Hide Muse wire frame from search engines while in dev
Hi,
I am using muse to develop a wire frame for a client.
Could you advise on how I hide the site from search engines during this process.
Many Thanks
Cas
Hi Cas,
If you have published the site to Business Catalyst and the site is still in trial mode, you have nothing to worry about. A robots.txt file is automatically added to the trial sites in BC. So search engines can't index the BC trial sites.
Regards,
Aish -
How do I add multiple scripts from search engines to my meta tag properties?
I have currently copied the Google script for website verification and analytics, etc. and pasted it into my meta tag properties dialog box. There is no problem as far as Google verifying the page. However, I would like to copy Bing's search engine script into my meta tag in addition to Google's script. How do I go about doing this? Do I hit Return on my keyboard after the end of Google's script, then paste in the Bing script?
The last part of the Google script ends in this:
</script>
(paste new script from Bing here?)
Will this cancel out each other and cause problems?
Can someone walk me through this process? Bing's search engine will not verify my site through two of the three other methods.
Ben
Adding a script after the closing tag of the previous script is the way to go, i.e. right after the </script> tag.
So it should look something like below:
<script>
Google's script
</script>
<script>
Bing's script
</script>
Cannot comment on one interfering with the other since it really depends on what exact code is there in the scripts. Google and Bing help resources will be able to help more with this.
Thanks,
Vikas -
Running Firefox on Windows XP3. I installed a plugin from Savevid.com, which they now require. (I've used that site for at least a year with no problem ever.) Everything was fine but it added a toolbar to my browser, and it seems to have reset Firefox somehow so that any new tab opens on "Searchqu", some low-rent search engine. I uninstalled the plug-in altogether, but the toolbar is still there, and I can't make the reset on the search engine go away. I use Tab Mix Plus (also never had this problem before with it), but no matter what I do, the tabs will not open to anything other than searchqu. (Tab Mix has the option to open to a duplicate tab - what it was using BEFORE all this - but changing that no longer has any effect.)
Anyone know how to deal with a problem like this?
Start Firefox in Safe Mode to check if one of the extensions (Firefox/Tools > Add-ons > Extensions) or if hardware acceleration is causing the problem (switch to the DEFAULT theme: Firefox/Tools > Add-ons > Appearance).
*Do NOT click the Reset button on the Safe Mode start window or otherwise make changes.
*https://support.mozilla.org/kb/Safe+Mode
*https://support.mozilla.org/kb/Troubleshooting+extensions+and+themes
You can also check for problems with the localstore.rdf file.
*http://kb.mozillazine.org/Corrupt_localstore.rdf