[solved] Wget: ignore "disallow wget" + comply with the rest of robots.txt

Hello!
I need to wget a few (maybe 20 -.- ) html files recursively that are linked on one html page (same domain), but the robots.txt there disallows wget. Now I could just ignore the robots.txt... but then my wget would also ignore the entries for links to dynamic pages, which are forbidden in the very same robots.txt for good reasons. And I don't want my wget pressing random buttons on that site, which is exactly what the robots.txt is there to prevent. But I can't use the robots.txt with wget.
Any hints on how to do this (with wget)?
Last edited by whoops (2014-02-23 17:52:31)

HalosGhost wrote: Have you tried using it? Or is there a specific reason you must use wget?
Only stubbornness
Stupid website -.- what do they even think they achieve by disallowing wget? I should just use the ignore option and let wget "click" on every single button in their php interface. But nooo, instead I waste time trying to figure out a way to exclude those GUI links from being followed even though wget would be perfectly set up to comply with that automatically if it weren't for that one entry to "ban" it. *grml*
Will definitely try curl next time though - thanks for the suggestion!
And now, I present...
THE ULTIMATIVE SOLUTION**:
sudo sed -i 's/wget/wgot/' /usr/bin/wget
YAY.
./solved!
** stubborn version.
Last edited by whoops (2014-02-23 17:51:19)
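
For anyone who would rather not patch the binary: another way to get the same effect is to apply the rest of the robots.txt yourself and hand wget only the links that pass. Below is a minimal sketch in Python using the standard library's urllib.robotparser. The robots.txt content, the example.org URLs and the agent name "wgot" are all made up for illustration; the real file on that site will differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt of the kind described above: a blanket ban for
# wget, plus rules for everyone that protect the dynamic pages.
ROBOTS_TXT = """\
User-agent: wget
Disallow: /

User-agent: *
Disallow: /admin.php
Disallow: /vote.php
"""


def allowed(urls, agent, robots_txt=ROBOTS_TXT):
    """Return the URLs that robots_txt permits for the given agent name."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(agent, url)]


if __name__ == "__main__":
    links = [
        "http://example.org/docs/page1.html",
        "http://example.org/admin.php?action=delete",
    ]
    # As "wget" everything is banned; under any other name only the
    # rules from the "*" section apply.
    print(allowed(links, "wget"))
    print(allowed(links, "wgot"))
```

The surviving URLs could then be written to a file and fetched with wget -i, which downloads a plain list without recursing into the site.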

Similar Messages

  • [SOLVED] Get filesize with wget

    Hi again!
    Is it possible to get filesize using the wget command?
    wget -O- http://cuddlewagon.org/somefile.txt | du -b THEFILE
    Last edited by svanberg (2009-07-29 20:56:32)

    When running
    wget --spider http://cuddlewagon.org/somefile.txt 2>&1 | grep Length
    Will give me:
    Length: 543 [text/plain]
    Can I use awk to extract only the number? The number will grow, so the regular expression must match any value from 1 upward.
    Last edited by svanberg (2009-07-29 09:39:43)
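
The extraction step can be sketched with a regular expression that accepts one or more digits after "Length:", so it keeps working as the file grows. This is an illustration in Python rather than awk; the function name content_length is made up, and the sample line is the one quoted above.

```python
import re

# One or more digits directly after "Length:", at the start of a line.
LENGTH_RE = re.compile(r"^Length:\s*(\d+)", re.MULTILINE)


def content_length(spider_output):
    """Return the byte count from a 'Length: N ...' line, or None if absent."""
    match = LENGTH_RE.search(spider_output)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(content_length("Length: 543 [text/plain]"))
```

Returning None covers output that has no numeric size at all, for example when the server does not report one.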

  • Want TV guide so can mark hrs ahead favorite ahead shows,&ignore the rest.

    I don't know where else to post this question, so here goes:
    Want online TV guide so can mark hrs ahead favorite ahead shows,&ignore the rest.
    Oh, and I mean for antenna broadcast tv. Not for cable or satellite, can't afford that.
    Is there such an online site or application for Mac, Firefox.
    Want to be able to look over a TV schedule and click on the shows interested in for the hours ahead and not have to bother even seeing (much less having to re-read) again the shows we have no interest in.
    I would like the shows I do not mark to just disappear from the schedule and see them no more that day and evening.
    Less clutter, much simpler, and a lot less re-reading.
    So frustrating to have to skim down the column for each hour and look at all those shows channels I've already looked at and know I have no interest in.
    I know, I should be able to remember (naahhhh), or write them down (naahhh). The mac is supposed to do all our work and thinkin for us these days.
    Hasn't some bright bulb already solved this problem for millions of viewers? Made such an online tv guide website, invented sucha application?
    I have been using zap2it, but it does not have any sucha feature.
    I have Mac, Snow Leopard, and using Firefox.
    Don't think I won't thank you, cuz
    I will.
    Forest Gump ("I'm not a smaaarrrt maaaaannnn.")
    Always very grateful.
    Snow Mac
    P.S.: New to Using Intel MacMini with SnowLeopard.

    thelnukus wrote:
    The mac is supposed to do all our work and thinkin for us these days.
    Oh, dear, then I must have a seriously defective Mac. It refuses to go to work for me, doesn't do my laundry or feed the cat. And it won't do my homework for me.

  • TS3899 In my iPad 2 with IO6 today I can not send emails from my gmail account, they go to the outbox directly...why? How can i solve this problem? ..I restarted the IPad but the problem was not solved. Please help.

    Greetings,
    Questions:
    1. What version of the Mac OS are you running (Apple > About this Mac)?
    2. What version of the iOS are you running (Settings > About)?
    3. Do you use MobileMe/ iCloud or another server based sync solution like Google or Yahoo?
    4. Do other changes to synced information like Address Book content sync successfully back and forth?
    Based on your description it sounds like you have a 1 way sync issue.  Events sync fine between the iOS devices and fine from the computer to the iOS devices but not from the iOS devices to the computer.
    Try:
    Backup your computer and iOS devices before doing anything else:
    http://support.apple.com/kb/HT1427
    http://support.apple.com/kb/ht1766
    Ensure all the devices in use are fully up to date: Apple > Software Update / Settings > General > Software Update
    Make separate backups of critical data:
    Backup your computer Addressbook: http://docs.info.apple.com/article.html?path=AddressBook/4.0/en/ad961.html
    Backup your computer iCal: http://support.apple.com/kb/HT2966
    Reset syncing on your Mac: http://support.apple.com/kb/TS1627
    Reply back if that does not resolve your issue.
    Hope that helps.

  • IPhoto frustrating error..The volume for "Df23.JPG" cannot be found. It prompts for all photos with this issue rather than offering an option to ignore. I can find the images in spotlight but not in Find Photo. Does anyone have a solution

    iPhoto frustrating error..The volume for "Df23.JPG" cannot be found. It prompts for all photos with this issue rather than offering an option to ignore. I can find the images in spotlight but not in Find Photo. Does anyone have a solution?

    Unless you have the source files that were on the TC or Windows machine you will have to start over with a new library as follows:
    Start over from scratch with new library
    Start over with a new library and import the Originals (iPhoto 09 and earlier) or the Masters (iPhoto 11) folder from your original library as follows:
    1.  Move the existing library folder to the desktop.
    2. Open the library package like this.
    3. Launch iPhoto and, when asked, select the option to create a new library.
    4. Drag the Masters (iPhoto 11) folder from the iPhoto Library on the desktop into the open iPhoto window.
    This will create a new library with the same Events as the original library but will not keep the metadata, albums, books, slideshows and other projects.  Your original library will remain intact for further attempts at fixes if so desired.
    OT

  • I have purchased a in app purchase of a 'gcsepod' for my little brother however the purchase does not come up in the app and requests me to buy another. How do i solve this? I have also got the receipt for this purchase.

    I'm not sure I can make sense of this but without asking too many questions, you can resolve your question by contacting iTunes direct.
    Apple - Support - iTunes - Contact Us
    But they will be wondering about the reference to hacking and you feeling bad about getting something for free.
    Best step in my view is to make sure you don't get similarly involved in future.  You must know roughly what you were doing.   Then write it off as a not-to-be-repeated experience.

  • TS4002 Hello, icloud receive messages from gilly hicks, but does not receive messages from another personal account... this is happening me since one week and i dont know how to solve this.... error in the mail delivery system says not valid IPv4 SMTP err

    Hello, icloud receive messages from gilly hicks, but does not receive messages from another personal account... this is happening me since one week and i dont know how to solve this.... error in the mail delivery system says not valid IPv4
    SMTP error from remote mail server after RCPT TO:<[email protected]>:
       host mx6.me.com.akadns.net [17.158.8.114]: 550 5.7.0 Blocked - see https://support.proofpoint.com/dnsbl-lookup.cgi?ip=184.173.9.56:
       [email protected]
    i do also receive from gmail....
    please help... what is happening!!!!

    Just to recap, this is a collection of ports I have collected over time for people who needed this information when setting up the HP ePrint app so that they could view their email from within the app.  I am certain other applications also need this information.  Although lengthy, I could not find a more comprehensive place to retrieve this information.  Feel free to post additional information, faulty information, or other related topics below as this is simply a collection of data and it would be practically impossible to test all of them. Thank you!
    Don't forget to say thanks by giving "Kudos" if I helped solve your problem.
    When a solution is found please mark the post that solves your issue.
    Every problem has a solution!

  • Crashes when I attempt to clear browser history;temp fix:I use IObit Security to clean browsing history,cannot clean it all,cleans most,I am then able to clear the rest of my history thru Firefox;I thought this info would help solve the problem

    Crap thought that text was all the characters I was allotted, not just the title.
    Anyway I wouldn't have even posted this if you guys would have allowed me to reply to the already existing thread I found through google. Said about 20 or so people had the same problem and no one found the contributor's answer useful, including myself.
    Anyway, I use Firefox, because it is one of two (other being internet explorer) that Trend Micro's security package includes protection with. Same with spyware blaster which I also use but Trend Micro is the deal breaker. Also I like Firefox 4 and your ad blocker is better than google chrome's. However if I continue to have problems with clearing my browser history it is back to Opera and Google Chrome for me. IObit Security supports both of them, it also seems to be the best security system I have ever come across, I will just pay for their subscription and stop using Trend Micro.
    This is a very big problem I hope it is being addressed. Only addon I have is your ad blocker. Also this is completely on your end, not ours. I am the perfect test, my computer is brand new, and by brand new I mean I just started using it the other day, the same day my firefox browser kept crashing as I attempted to clear my browsing history.
    Windows 7, 8GBs RAM, AMD athlon II quad core 2.90 GHz
    Can't even clear an hours worth of browsing history without firefox crashing.

  • When I sync my iphone I get a popup that says 'ABAssistantService quit unexpectedly'. I can 'ignore', 'Report' or 'Reopen'. The popup comes right back after a few seconds no matter which option I choose. The only way to make it stop is to reboot the compu

    When I sync my iphone I get a popup that says 'ABAssistantService quit unexpectedly'. I can 'ignore', 'Report' or 'Reopen'. The popup comes right back after a few seconds no matter which option I choose. The only way to make it stop is to reboot the computer. Any ideas?

    iTunes places the .ipsw file here:
    OSX:
    ~/Library/iTunes/iPhone Software Updates
    XP:
    C:\Documents and Settings\[username]\Application Data\Apple Computer\iTunes\iPhone Software Updates
    7 & Vista:
    C:\Users\[username]\AppData\Roaming\Apple Computer\iTunes\iPhone Software Updates
    Find and delete any and all .ipsw files on your computer. There should only be one, but delete all that you find. Next, disable ALL firewalls & security software on your computer. Connect your phone, iTunes running & restore from backup. Follow this by syncing your content back to your phone.
    This will force iTunes to re-download the .ipsw file, as I suspect yours is corrupt.

  • Hi i got a mac mini but when i connect it to my smartax mt882 modem via ethernet it says device not connected can anyone solve this issue it work fine with the usb connection but the ethernet is giving me problems plz help

    Hello, give this a try...
    Make a New Location, Using network locations in Mac OS X ...
    http://support.apple.com/kb/HT2712
    10.5, 10.6, 10.7 & 10.8…
    System Preferences>Network, top of window>Locations>Edit Locations, little plus icon, give it a name.
    10.5.x/10.6.x/10.7.x/10.8.x instructions...
    System Preferences>Network, click on the little gear at the bottom next to the + & - icons, (unlock lock first if locked), choose Set Service Order.
    The interface that connects to the Internet should be dragged to the top of the list.
    For 10.5/10.6/10.7/10.8, System Preferences>Network, unlock the lock if need be, highlight the Interface you use to connect to Internet, click on the advanced button, click on the DNS tab, click on the little plus icon, then add these numbers...
    208.67.222.222
    208.67.220.220
    (There may be better or faster DNS numbers in your area, but these should be a good test).
    Click OK.

  • [svn:bz-trunk] 20986: Classes that are neither explicitly allowed nor disallowed get added to the disallow class cache .

    Revision: 20986
    Author:   [email protected]
    Date:     2011-03-29 05:17:12 -0700 (Tue, 29 Mar 2011)
    Log Message:
    Classes that are neither explicitly allowed nor disallowed get added to the disallow class cache. So we should clear the disallow classes cache also on changing allow rules. Exposing a clearClassCache() method to allow subclasses also to clear the class cache.
    Modified Paths:
        blazeds/trunk/modules/core/src/flex/messaging/validators/ClassDeserializationValidator.java

    sorry i forgot that... i use php5 so i guessed at the module name... upon looking closer at the conf its mod_php4.c
    <pre>
    <IfModule mod_php4.c>
    AddType application/x-httpd-php .php
    </IfModule>
    </pre>

  • HT3825 i cant import my 5d 3 raw file using iphoto. how i can solve my problem? i already download the last version of adobe raw ubdat

    A Good samaritan answered this for me on another forum
    How to update Camera Raw

  • IPhoto suddenly crashing after it's open for 3 seconds.Its saying Terminating app due to uncaught exception 'NSUnknownKeyException', reason: '[ __NSCFDictionary 0x18d70d10 valueForUndefinedKey:]: this class is not key value coding-compliant for the key

    Iphoto is suddenly crashing after about 3 seconds on my MacBook Pro. - Here is problem report
    Crashed Thread: 46  Import thread 0
    Exception Type: EXC_BREAKPOINT (SIGTRAP)
    Exception Codes: 0x0000000000000002, 0x0000000000000000
    Application Specific Information:
    *** Terminating app due to uncaught exception 'NSUnknownKeyException', reason: '[<__NSCFDictionary 0x18d70d10> valueForUndefinedKey:]: this class is not key value coding-compliant for the key .'
    Why? all of a sudden??
    Thanks

    Wow, that' s awesome, I encountered the same problem with ya today when trying implement Pickers in Tabbar, and can't figure out why, the debugger just throw back such vague message and guess what ? I spent like 3 hours banging my head against the wall wonder why it's not working, I using XCode 2.2.1, my debugger wont work ( program won't stop at break points for what ever I tried ), and not very friendly with those newbie like me - really don't know where are all the Apple's software developer 1337 ? As I remembered, all the good ones goes for the hardware, "the rest" goes for the software =)), but thanks again, thanks, and again, click the tab bar, set both the class and nib file would get rid of the errors, if it would help some one, the credit goes to the two above, say thanks to them, like me. Have a nice day guys, I 'ma happy .

  • How can I ignore multiple clicks while processing the first one?

    Hi -
    I am writing my first Flex program so I'm sorry if this is a newbie question.
    I have some code which handles a click but takes a while to finish. What I want to do is process the first click but ignore any more clicks until the code has finished dealing with the first one.
    Can anyone tell me the standard way to do this?
    Here's a simplified example with a builtin delay. If I click on the button several times and then wait, the clicks are all processed in sequence and the test count slowly increases. What I want the code to do is ignore any clicks which the user made during the two second delay but start handling new clicks after the delay has finished.
    Example Code...
    <?xml version="1.0"?>
    <s:Application xmlns:fx="http://ns.adobe.com/mxml/2009"
          xmlns:s="library://ns.adobe.com/flex/spark"
          xmlns:mx="library://ns.adobe.com/flex/mx">
       <fx:Script>
          <![CDATA[
    private var count:Number = 0;
    private function clickHandler():void {
       // Something which takes a while to complete...
       var timer:Date = new Date();
       while( (new Date()).valueOf() - timer.valueOf() <  2000 ) {
          var nop:int = 0;
       }
       // Report result
       count++;
       output.text="Test Number "+count;
    }
    ]]>
       </fx:Script>
       <s:Panel title="Example">
          <s:VGroup left="10" right="10" top="10" bottom="10">
               <s:Label id="output" text=""/>
               <s:Button label="Click Me" click="clickHandler();"/>
          </s:VGroup>
       </s:Panel>
    </s:Application>
    Thanks
    - Jon

    Thanks for the response Pramod. I tried to disable and enable the button but it doesn't seem to have any effect at all. If I click the button several times, this code still processes the clicks one at a time over the next few seconds. I am sure I am making a simple mistake somewhere here...
    private function clickHandler():void {
       mybutton.enabled = false;
       // ...slow stuff here
       mybutton.enabled = true;
    }
    Full code...
    <?xml version="1.0"?>
    <s:Application xmlns:fx="http://ns.adobe.com/mxml/2009"
          xmlns:s="library://ns.adobe.com/flex/spark"
          xmlns:mx="library://ns.adobe.com/flex/mx">
       <fx:Script>
          <![CDATA[
    private var count:Number = 0;
    private function clickHandler():void {
       mybutton.enabled = false;
       // Something which takes a while to complete...
       var timer:Date = new Date();
       while( (new Date()).valueOf() - timer.valueOf() <  2000 ) {
          var nop:int = 0;
       }
       // Report result
       count++;
       output.text="Test Number "+count;
       mybutton.enabled=true;
    }
    ]]>
       </fx:Script>
       <s:Panel title="Example">
          <s:VGroup left="10" right="10" top="10" bottom="10">
             <s:Label id="output" text=""/>
             <s:Button id="mybutton" label="Click Me" click="clickHandler();"/>
          </s:VGroup>
       </s:Panel>
    </s:Application>
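
A possible explanation, assuming the runtime only delivers the queued clicks after the busy handler has returned: by that time the button is enabled again, so toggling enabled inside the handler changes nothing. One common workaround is to compare each click's timestamp with the moment the previous run finished and drop anything older. The sketch below only simulates that idea in Python, it is not Flex code, and the ClickGuard class and its fields are invented for the illustration.

```python
class ClickGuard:
    """Drops clicks that were made while an earlier click was being handled."""

    def __init__(self, work_seconds=2.0):
        self.work_seconds = work_seconds  # how long one handler run takes
        self.busy_until = 0.0             # time at which the last accepted run ends
        self.count = 0                    # number of clicks actually processed

    def handle(self, clicked_at):
        """Process a click made at time clicked_at; return True if accepted."""
        if clicked_at < self.busy_until:
            return False  # made during the previous run, so ignore it
        self.count += 1
        self.busy_until = clicked_at + self.work_seconds
        return True


if __name__ == "__main__":
    guard = ClickGuard(work_seconds=2.0)
    # Three quick clicks, then one after the two second delay has passed.
    for moment in (0.0, 0.5, 1.0, 2.5):
        print(moment, guard.handle(moment))
    print("processed:", guard.count)
```

In the real application the timestamp would have to come from the click event or from a clock read at the start of the handler; which of the two is available depends on the framework.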

  • Vim's errorformat `ignore the rest of the line'

    I am working with a program which checks for rule violations in C programs. The output lines look like this one:
    r219_n1.c:4:1: violated rule r11 (Avoid implicit conversion between int of different sizes); PP #1 (begin)
    and I'd like to use the quickfix of Vim.
    in my vimrc I put:
    map <Leader>e :set makeprg=rchecker\ %<cr>
    \:set errorformat=%f:%l:%c:\ violated\ rule\ r%n\ %m<cr>
    so when I am ready to check (after compiling) I just press `,e' and :make again.
    It works, but I'd like to keep only the message between () and not the noise after. So I'd need something like:
    map <Leader>e :set makeprg=rchecker\ %<cr>
    \:set errorformat=%f:%l:%c:\ violated\ rule\ r%n\ (%m)[[[just ignore the rest of the line!!!]]]<cr>
    but I can not find a way to just ignore a piece of output! I can not use %s otherwise vim tries to use the matched string to seek the error...
    Does anyone know how to achieve this?
    thanks
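
If the errorformat alone cannot do it, one alternative is to trim the checker's output before Vim sees it, so that the first errorformat above (the one that already works) picks up only the text between the parentheses. A minimal sketch in Python follows; the function name trim is made up, and the regular expression assumes every violation line has the same shape as the sample quoted above.

```python
import re

# file:line:col, the rule number, then the message inside the first (...);
LINE_RE = re.compile(
    r"^(?P<where>.+?:\d+:\d+): violated rule r(?P<rule>\d+) \((?P<msg>.*?)\);"
)


def trim(line):
    """Keep location, rule number and message; drop the noise after ');'."""
    match = LINE_RE.match(line)
    if match is None:
        return line.rstrip("\n")  # pass unrelated lines through unchanged
    return "{where}: violated rule r{rule} {msg}".format(**match.groupdict())


if __name__ == "__main__":
    sample = ("r219_n1.c:4:1: violated rule r11 (Avoid implicit conversion "
              "between int of different sizes); PP #1 (begin)")
    print(trim(sample))
```

Applied line by line to the checker's output, this leaves lines of the form file:line:col: violated rule rN message, which is what the unmodified errorformat expects.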

    Thanks for the links.
    Here is a simplified version of the code.
    *** definition of mainClass I tried
    public class mainClass {
        public static void main( String [] args ) {
            IgnoreCode ignorecode = new IgnoreCode ();
            classX x = new classX ();
            /* associate x to ignorecode to allow calling the doPrint
            method of the x object in the doPrint method of the ignorecode object */
            ignorecode.associate(x);
            /* associate ignorecode to x to allow calling the doPrint
            method of the ignorecode object when ......an event occurs in the
            swing interface shown by the show method of the x object */
            x.associate(ignorecode);
            x.show (); /* see definition of classX below */
        }
    }
    *** definition of IgnoreCode class
    public class IgnoreCode extends SuperClass {
        public void doPrint(){
            System.out.println( "In do" );
            x.doPrint();
            System.out.println( "In do" );
        }
    }
    *** definition of SuperClass class
    public abstract class SuperClass {
        protected classX x;
        public void associate(classX x1){
            x = x1;
        }
    }
    *** definition of classX class
    public class classX {
        protected IgnoreCode ignorecode;
        public void doPrint() {
            System.out.println( "x In do" );
        }
        public void associate( IgnoreCode ignorecode1 ){
            ignorecode = ignorecode1;
        }
        public void show () {
            /* swing interface */
        }
    }
