robots.txt... Google won't look at me
So I have this site, and I can't get my Google AdSense ads to be read, nor can I get the site indexed.
[img]http://img34.picoodle.com/img/img34/3/12/14/fscreencaptum06b4534.png[/img]
[img]http://img19.picoodle.com/img/img19/3/12/15/fscreencaptumdde57d5.png[/img]
This is what I get! Why?!?
Also, I upload my site folder by folder, which does give me problems with my podcast and blog pages. Is it affecting my Google indexing too?
I just hate how it adds extra letters into the link.
Like, my file name is JJ.
www.johnnyjungle.com/JJ/Main is supposed to be my home page link?
So I just upload the folders separately so the home page loads at www.johnnyjungle.com/Main
anyone?
Similar Messages
-
Why won't Google work? Whenever I try to go to Google it just says 404 Not Found. Also, why won't CAPTCHA show up? When I have to type in a CAPTCHA to show I'm not a robot, the words won't show up.
Hey briannagrace96,
Welcome to Apple Support Communities! I'd check out the following article, it looks like it applies to your situation:
iPod: Appears in Windows but not in iTunes
http://support.apple.com/kb/ts1363
You'll want to go through the following troubleshooting steps, and for more detail on each step follow the link to the article above:
Try the iPod troubleshooting assistant:
If you have not already done so, try the steps in the iPod Troubleshooting Assistant (choose your iPod model from the list).
If the issue remains after following your iPod's troubleshooting assistant, follow the steps below to continue troubleshooting your issue.
Restart the iPod Service
Restart the Apple Mobile Device Service
Empty your Temp directory and restart
Verify that the Apple Mobile Device USB Driver is installed
Change your iPod's drive letter
Remove and reinstall iTunes
Disable conflicting System Services and Startup Items
Update, Reconfigure, Disable, or Remove Security Software
Deleting damaged or incorrect registry keys
Take care,
David -
Disallow URLs by robots.txt but still Appear In Google Search Results.
Can you expand on your problem? Are you being indexed despite not wanting to be indexed?
You are almost certainly in the wrong forum as this relates to SharePoint search, not how Google indexes your content. -
Robots.txt -- how do I do this?
I'm not using iWeb, unfortunately, but I wanted to protect part of a site I've set up. How do I set up a hidden directory under my domain name? I need it to be invisible except to people who have been notified of its existence. I was told, "In order to make it invisible you would need to not have any links associated with it on your site, make sure you have altered a robots.txt file in your /var/www/html directory so bots cannot spider it. A way to avoid spiders crawling certain directories is to place a robots.txt file in your web root directory that has parameters on which files or folders you do not want indexed."
But how do I get/find/alter this robots.txt file? I unfortunately don't know how to do this sort (hardly any sort) of programming. Thank you so much.
Muse does not generate a robots.txt file.
If your site has one, it's been generated by your hosting provider or some other admin on your website. If you'd like Google or other 'robots' to crawl your site, you'll need to edit this file or delete it.
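For reference, a minimal allow-all robots.txt (a generic example, not something Muse or any particular host generates) looks like this:

```text
User-agent: *
Disallow:
```

An empty Disallow value blocks nothing, so every crawler is free to index the whole site; deleting the file entirely has the same effect.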
Also note that you can set your page description in Muse using the Page Properties dialog, but it won't show up immediately in Google search results - you have to wait until Google crawls your site and updates its index, which might take several days. You can request that Google crawl it sooner, though:
https://support.google.com/webmasters/answer/1352276?hl=en -
Question about robots.txt
This isn't something I've usually bothered with, as I always thought you didn't really need one unless you wanted to disallow access to pages / folders on a site.
However, a client has been reading up on SEO and mentioned that some analytics tool (possibly Google's) reported that "the robots.txt file was invalid or missing. I understand this can stop the search engines linking in to the site".
So I had a rummage, and uploaded what I thought was a standard enough robots.txt file :
# robots.txt
User-agent: *
Disallow:
Disallow: /cgi-bin/
But apparently this is reporting :
The following block of code contains some errors. You specified both a generic path ("/" or empty disallow) and specific paths for this block of code; this could be misinterpreted. Please, remove all the reported errors and check again this robots.txt file.
Line 1
# robots.txt
Line 2
User-agent: *
Line 3
Disallow:
You specified both a generic path ("/" or empty disallow) and specific paths for this block of code; this could be misinterpreted.
Line 4
Disallow: /cgi-bin/
You specified both a generic path ("/" or empty disallow) and specific paths for this block of code; this could be misinterpreted.
If anyone could set me straight on how a standard / default robots.txt file should look like, that would be much appreciated.
Thanks.
Remove the blank Disallow line so it looks like this:
User-agent: *
Disallow: /cgi-bin/
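To sanity-check the corrected file before uploading it, you can feed it to Python's standard-library robots.txt parser (my addition, a quick local check, not part of the original answer):

```python
import urllib.robotparser

# The corrected robots.txt from the answer above
rules = """User-agent: *
Disallow: /cgi-bin/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# /cgi-bin/ is blocked for all user agents; everything else is allowed
print(rp.can_fetch("*", "http://example.com/cgi-bin/script.pl"))  # False
print(rp.can_fetch("*", "http://example.com/index.html"))         # True
```

The domain here is a placeholder; only the path portion of the URL is matched against the rules.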
E. Michael Brandt
www.divahtml.com
www.divahtml.com/products/scripts_dreamweaver_extensions.php
Standards-compliant scripts and Dreamweaver Extensions
www.valleywebdesigns.com/vwd_Vdw.asp
JustSo PictureWindow
JustSo PhotoAlbum, et alia -
Robots.txt - default setup
Hey!
Since I'm using iWeb for creating my websites, I know that I have to set up robots.txt for SEO.
I have made several sites: one for a restaurant, one about photography, one personal, etc...
There is nothing I want to "hide" from the Google robots on those websites.
So my question is:
When we create a website and publish it, is there at least a default setup for robots.txt?
For example:
Website is parked in folder: public_html/mywebsitefolder
Inside the mywebsitefolder folder I have:
/nameofthewebsite
/cgi-bin
/index.html
The structure is the same for all websites created with iWeb, so what should we put in robots.txt by default?
Of course, in case you don't want to hide any of the pages or content.
Azz.
If you don't want to stop the bots crawling any folder - don't bother with one at all.
The robots.txt should go in the root folder since the crawler looks for....
http://www.domain-name.com/robots.txt
If your site files are in a sub folder the robots.txt would be like...
User-agent: *
Disallow: /mywebsitefolder/folder-name
Disallow: /mywebsitefolder/file.file-extension
To allow all access...
User-agent: *
Disallow:
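A third common pattern, not shown above (my addition, a generic example), is letting one particular bot in while keeping all others out, by giving that bot its own record:

```text
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
```

A crawler uses the most specific User-agent record that matches it, so Googlebot here gets the empty (allow-everything) rule while all other bots are blocked from the whole site.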
I suppose you may want to use robots.txt if you want to allow/disallow one particular bot. -
Just done a replacement of the PATA DVD drive (it's new and working OK) and a fresh, clean install of Snow Leopard with all updates. The iMac 5.1 behaves strangely: sometimes when I start Google Chrome and watch YouTube videos, the computer freezes and shuts down immediately.
I believe it's related to overheating?
iStat Pro shows the GPU diode temp at 66 C, CPU at 48 C; fan RPM is around 1000.
Any ideas, somebody?
The hard disk has previously been checked with state-of-the-art techniques that confirmed the drive is in perfect condition.
1.5-3 minute boot up as opposed to 15-20 seconds
And
why it takes a long time to load a lot of things.
I have restored this
from a time machine partition.
Time Machine is only a backup-and-restore tool; it won't fix issues in software, and according to your information, it doesn't even optimize the restore for best performance on boot hard drives.
What you need to do to regain your speed is to understand how your machine works
Why is my computer slow?
Fix any and all issues in software following this list of fixes
..Step by Step to fix your Mac
Then follow this defrag method I've outlined
How to safely defrag a Mac's hard drive
Most commonly used backup methods
There shouldn't be a need to reinstall OS X fresh unless you're having file-structure issues, which should show up in the Steps; if they do, a zero erase and install will cure them as well as any bad-sector issues. The defrag step obviously wouldn't be necessary on a freshly installed system, as the files are written all together, not in portions all over the drive.
Hope this assists. -
Use of robots.txt to disallow system/secure domain names?
I've got a client whose system and secure domains are ranking very high on Google. My SEO advisor has mentioned that a key way to eliminate these URLs from Google is to disallow the content through robots.txt. Given BC's unique way of dealing with system and secure domains, I'm not sure this is even possible, as any disallowances I've seen or used before have been directories, not absolute URLs, nor have I seen any mention of this possibility around. Any help or advice would be great!
Hi Mike
Under Site Manager > Pages, when accessing a specific page, you can open the SEO Metadata section and tick “Hide this page for search engines”
Aside from this, using the robots.txt file is indeed an efficient way of instructing search engine robots which pages are not to be indexed. -
Robots.txt and Host Named Site Collections (SEO)
When attempting to exclude ALL SharePoint sites from external indexing, when you have multiple web apps and multiple host-named site collections, should I add the robots.txt file to the root of each web app as well as each HNSC? I assume so, but thought
I would check with the gurus...
- Rick
I think one for each site collection, as each site collection has a different name and is treated as a separate web site.
"The location of robots.txt is very important. It must be in the main directory because otherwise user agents (search engines) will not be able to find it. Search engines look first in the main directory (i.e. http://www.sitename.com/robots.txt) and if they don't find it there, they simply assume that this site does not have a robots.txt file."
http://www.slideshare.net/ahmedmadany/block-searchenginesfromindexingyourshare-pointsite
Please remember to mark your question as answered and vote helpful if this solves/helps your problem. Thanks - WS MCITP (SharePoint 2010, 2013) Blog: http://wscheema.com/blog -
Web Repository Manager and robots.txt
Hello,
I would like to search an intranet site and therefore set up a crawler according to the guide "How to set up a Web Repository and Crawl It for Indexing".
Everything works fine.
Now this web site uses a robots.txt as follows:
User-agent: googlebot
Disallow: /folder_a/folder_b/
User-agent: *
Disallow: /
So obviously, only google is allowed to crawl (parts of) that web site.
My question: If I'd like to add the TRex crawler to the robots.txt what's the name of the "User-agent" I have to specify here?
Maybe the name I defined in the SystemConfiguration > ... > Global Services > Crawler Parameters > Index Management Crawler?
Thanks in advance,
Stefan
Hi Stefan,
I'm sorry, but this is hard-coded. I found it in the class com.sapportals.wcm.repository.manager.web.cache.WebCache:
private HttpRequest createRequest(IResourceContext context, IUriReference ref) {
    HttpRequest request = new HttpRequest(ref);
    // Default user agent unless the session watcher provides one
    String userAgent = "SAP-KM/WebRepository 1.2";
    if (sessionWatcher != null) {
        String ua = sessionWatcher.getUserAgent();
        if (ua != null) {
            userAgent = ua;
        }
    }
    request.setHeader("User-Agent", userAgent);
    Locale locale = context.getLocale();
    if (locale != null) {
        request.setHeader("Accept-Language", locale.getLanguage());
    }
    return request;
}
So recompile the component or change the filter... I would prefer to change the robots.txt.
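If you do go the robots.txt route, a sketch of the change (my addition; this assumes the crawler keeps the default user agent string above and that matching is done on the product token before the slash, which is how most robots.txt parsers behave):

```text
User-agent: googlebot
Disallow: /folder_a/folder_b/

User-agent: SAP-KM
Disallow: /folder_a/folder_b/

User-agent: *
Disallow: /
```

Worth testing against the site afterwards, since whether the crawler honors robots.txt at all depends on its implementation.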
hope this helps,
Axel -
Robots.txt question?
I am kind of new to web hosting, but learning.
I am hosting with Just Host, and I have a couple of sites (addons). I am trying to publish my main site now, and there is a whole bunch of stuff in the site root folder that I have no idea about. I don't want to delete anything, and I am probably not going to, lol. But should I block a lot of the stuff in there in my robots.txt file?
Here is some of the stuff in there:
.htaccess
404.shtml
cgi-bin
css
img
index.php
justhost.swf
sifr-addons.js
sIFR-print.cs
sIFR-screen.css
sifr.js
Should I just disallow all of this stuff in my robots.txt? Any recommendations would be appreciated. Thanks
Seaside333 wrote:
public_html for the main site, the other addons are public_html/othersitesname.com
is this good?
thanks for quick response
You probably don't need the following files unless you're using text image-replacement techniques: sifr-addons.js, sIFR-print.cs, sIFR-screen.css, sifr.js.
It's good to keep .htaccess (you can insert special instructions in this file), 404.shtml (if a page can't be found on your remote server it goes to this page), and cgi-bin (some processing scripts are placed in this folder).
You will probably have your own 'css' folder. The 'img' folder is not needed. 'index.php' is the homepage of the site and what the browser looks for initially; you can replace it with your own homepage.
You don't need justhost.swf.
Download the files/folders to you local machine and keep them in case you need them. -
Hello,
just a short question: Why does Muse not create a robots.txt?
A couple of months ago I had a client who didn't show up in any search results, even though the site had been online for more than a year.
We investigated and found out that the client had no robots.txt on his server. Google mentions (sorry, I cannot find the source right now) that it will not index a page if there is no robots file.
I think it is important to know this. It would be cool if there were a feature in the export dialog (a checkbox "create robots.txt", and maybe a settings panel: follow, nofollow, no directories...).
Regards
Andreas
Here's one example of the text Google is posting:
http://webcache.googleusercontent.com/search?rlz=1T4GGLR_enUS261US323&hl=en&q=cache:SSb_hvtcb_EJ:http://www.inmamaskitchen.com/RECIPES/RECIPES/poultry/chicken_cuban.html+cuban+chicken+with+okra&ct=clnk Robots.txt File May 31, 2011
http://webcache.googleusercontent.com/search?q=cache:yJThMXEy-ZIJ:www.inmamaskitchen.com/Nutrition/ Robots.txt File May 31, 2011
Then there are things relating to Facebook????
http://www.facebook.com/plugins/like.php?channel_url=http%3A%2F%2Fwww.inmamaskitchen.com%2FNutrition%2FBlueberries.html%3Ffb_xd_fragment%23%3F%3D%26cb%3Df2bfa6d78d5ebc8%26relation%3Dparent.parent%26transport%3Dfragment&href=http%3A%2F%2Fwww.facebook.com%2Fritzcrackers%3Fsk%3Dapp_205395202823189&layout=standard&locale=en_US&node_type=1&sdk=joey&send=false&show_faces=false&width=225
THANK YOU!
How do I create a robots.txt file for my Muse site?
You can follow the guidelines from Google to create a robots.txt file and place it at the root of your remote site.
https://support.google.com/webmasters/answer/156449?hl=en
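As a starting point, a typical hand-written robots.txt might look like this (a generic sketch; the blocked path and sitemap URL are placeholders, not anything Muse produces):

```text
User-agent: *
Disallow: /private/

Sitemap: http://www.example.com/sitemap.xml
```

Upload it to the top level of your site so it is reachable at http://www.example.com/robots.txt.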
Thanks,
Vinayak -
Problems with robots.txt Disallow
Hi
I have a problem with robots.txt and Google.
I have this robots.txt file:
User-agent: *
Disallow: page1.html
Disallow: dir_1/sub_dir_1/
Disallow: /data/
When I enter 'site:www.MySite.com' into the Google search box,
Google returns content from the 'data' directory as well. Google
should not have indexed the content of the data directory.
So why is Google returning results from the 'data' directory
when I have disallowed it?
How can I restrict everyone from accessing the data
directory?
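One likely culprit in the file above (my note, not from the thread): Disallow values are path prefixes that must start with a slash, and the first two rules (page1.html and dir_1/sub_dir_1/) don't, so many parsers silently ignore them. Also, robots.txt only stops crawling; Google can still list a blocked URL (without content) if other pages link to it. Python's standard-library parser illustrates the leading-slash issue:

```python
import urllib.robotparser

# The robots.txt from the question, with one slash-less rule kept as-is
rules = """User-agent: *
Disallow: page1.html
Disallow: /data/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Missing leading slash: the page1.html rule never matches, so the page stays fetchable
print(rp.can_fetch("*", "http://www.MySite.com/page1.html"))      # True
# Correctly written rule: /data/ is honored
print(rp.can_fetch("*", "http://www.MySite.com/data/file.html"))  # False
```

To truly restrict access to the data directory (not just ask crawlers to stay out), use server-side authentication rather than robots.txt.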
Thanks
I found a workaround. To have the sitemap URL linked to the pub page, the pub page needs to be in the Internet zone. If you need the sitemap URL linked to the real internet address (e.g. www.company.example.com), you need to put the auth page in the default zone and the pub
page in the intranet zone, and create an AAM http://company.example.com in the internet zone.
robots.txt and duplicate content - I need help
Hello guys, I'm new to BC and I have 2 questions.
1. My start page is available as xxxx.de
and xxxx.de/index.html
and xxx.de/index.aspx
How can I fix this duplicate content?!
2. Where do I have to upload the robots.txt?
THX
As long as you do not link to the other versions and are not inconsistent, you do not need to worry about your start page.
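If the duplicates do become a problem, the usual fix (a generic sketch; swap in your real domain) is a canonical link in the head of each duplicate page, telling search engines which URL is the preferred one:

```html
<link rel="canonical" href="http://xxxx.de/" />
```

With that in place, the /index.html and /index.aspx variants are consolidated onto the root URL. As for question 2: robots.txt goes in the root folder of your site, so that it is reachable directly under your domain at /robots.txt.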