Problem reading the source of an HTML document via a URL object?

Hi friends,
Why doesn't the following simple program work correctly?
import java.net.*;
import java.io.*;
public class URLReader {
    public static void main(String[] args) throws Exception {
        URL ieee = new URL("http://www.ieee.com/");
        BufferedReader in = new BufferedReader(
                new InputStreamReader(
                yahoo.openStream()));
        String inputLine;
        while ((inputLine = in.readLine()) != null)
            System.out.println(inputLine);
        in.close();
    }
}
When we visit http://www.ieee.com/ in a browser, it works well; the address is simply redirected to http://ieee.com/portal/site/iportals/. Why can the browser do that while the URL object can't?

A URL is simply a reference to the location of a resource (a "file").
Doing I/O on it requires a protocol handler. For HTTP (a web server), that handler is exposed through the HttpURLConnection class.
The Java Tutorials show that you can read from and write to a URL directly, but this still relies on the underlying HTTP client implementation.
I think you took this code from the Sun site: http://java.sun.com/docs/books/tutorial/networking/urls/readingURL.html
That may be why you create 'ieee' but reference 'yahoo' in your code.
Did you get an error message when you tried to compile your program?
You may wish to look further into the tutorial pages at: http://java.sun.com/docs/books/tutorial/networking/urls/readingWriting.html
In the end, though, I would still point you to HttpURLConnection, which is much more in-depth.
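As a rough sketch of the HttpURLConnection route suggested above: the class name RedirectAwareReader and the helper readAll are made up for illustration, and the UTF-8 charset is an assumption about the page. As far as I know, HttpURLConnection follows redirects by default, but not when the redirect switches protocol (for example from http to https), so printing the status code and the final URL helps show what the server actually answered.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RedirectAwareReader {

    // Reads the whole stream line by line, decoding it with the given charset.
    public static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // URL taken from the question above.
        URL url = URI.create("http://www.ieee.com/").toURL();
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setInstanceFollowRedirects(true); // already the default; shown for clarity

        // A 3xx status here means a redirect was NOT followed.
        System.out.println("Status:    " + con.getResponseCode());
        // After the request, getURL() reflects any redirect that was followed.
        System.out.println("Final URL: " + con.getURL());

        // Assumption: the page is UTF-8; a real client would check Content-Type.
        System.out.print(readAll(con.getInputStream(), StandardCharsets.UTF_8));
        con.disconnect();
    }
}
```

If the status printed is 301 or 302, the Location header (con.getHeaderField("Location")) holds the address the browser would have moved on to.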

Similar Messages

  • JEditorPane - backmapping highlighted text to source html document

    Hello everyone,
    I have the following problem: I display a HTML document in JEditorPane. Then I highlight some chunk of text and get the start and the end selection offsets by using getSelectionStart() and getSelectionEnd() methods, respectively. What I would like to know is the exact position (character offset) of the highlighted text in the original HTML document that is being displayed.
    Does anybody know the solution for that?
    Thanks.

    I don't think it's possible. The parser builds a Document out of the text and doesn't keep the original source text. For example, if your HTML looks like this:
    String text = "<html><body>     some     text     </body></html>";
    all the multiple spaces between the words are parsed out.
    Use a JTextPane with attributes instead. It's easier to work with than HTML.

  • I want to read a HTML document from internet

    Should I use javax.swing.text.html.HTMLDocument?

    > Should I use javax.swing.text.html.HTMLDocument?
    What do the docs for that class say? Does it have methods for reading an HTML document from the internet?

  • Users able to see documents via URL when they cannot see the folder containing them?

    I took an AD group out of the permissions list for a folder in a document library. Then I ran an incremental search.
    Now, users in that group cannot see the folder (correct behavior!).
    Users in that group do not find documents in that folder when performing a site search (correct behavior!).
    BUT, users in that group CAN open a document in that folder by using the full URL to a document there, the full link.
    This seems like a security violation - how can the members of that removed group open a document in a library that they cannot even see?
    SP2010 Enterprise, SP1, Feb 2014 CU on Windows Server 2008 R2
    Win7 clients with IE10
    I have reviewed the groups in use and the permissions lists where they are used and don't see any source of "leakage".
    Any ideas why this might be happening and how to fix it?

    Hi,
    Based on your description, my understanding is that a user without permission can open a document in a folder by using the full URL to that document.
    I followed your description to run a test, but I can't reproduce the problem.
    My suggestions are:
    Sign in to your site as a user who has Full Control permission on the site, go to the documents in the folder, and check whether the users in that group (who have no permission on the folder) have permissions on the documents themselves.
    When users in that group open a document by its full URL, check whether the current user really is the user in that group without permission to the document. You can repeat the test on another computer.
    Clear your browser cache and then test again.
    If the issue still exists, please don't hesitate to let me know.
    Best Regards
    Wendy Li
    TechNet Community Support

  • Problem when post G/L Document via BAPI_ACC_GL_POSTING_POST

    What I am doing is to post new G/L Accounting Document on each successfully paid transaction, each successfully reversed transaction, each rejected transaction for bank XXX, and each payment transaction for bank YYY in 3 Events respectively. And I am using the BAPI BAPI_ACC_GL_POSTING_POST to post.
    but in the return table there are error messages, the sample code is as follows:
    *& Get the OBJ_KEY of DOCUMENTHEADER *
    * get the current fiscal year
      l_fiscyear = sy-datum(4).
    * select the Number Range NO. out from T003
      select single numkr
        from t003
        into l_numkr
        where blart = 'SY'.
      if sy-subrc <> 0.
        message .
      endif.
    * call FM to get the Accounting Doc. No.
      call function 'NUMBER_GET_NEXT'
        exporting
          nr_range_nr = l_numkr
          object      = 'RF_BELEG'
          subobject   = 'HKHK'
          toyear      = l_fiscyear
        importing
          number      = p_belnr
        exceptions
          interval_not_found      = 1
          number_range_not_intern = 2
          object_not_found        = 3
          quantity_is_0           = 4
          quantity_is_not_1       = 5
          interval_overflow       = 6
          buffer_overflow         = 7
          others                  = 8.
      if sy-subrc <> 0.
        message
      endif.
    * concatenate the Accounting Doc. No., the company code and the
    * fiscal year together as the OBJ_KEY
      concatenate p_belnr 'HKHK' l_fiscyear into lx_docheader-obj_key.
    *&--Get OBJ_KEY end--
    * get the OBJ_SYS
      select single logsys
        from t000
        into lx_docheader-obj_sys
        where mandt = sy-mandt.
      if sy-subrc <> 0.
        message
      endif.
    * fill the structure parameter DOCUMENTHEADER of the BAPI
      lx_docheader-obj_type = 'BKPFF'.
      lx_docheader-username = sy-uname.
      lx_docheader-comp_code = 'HKHK'.
      lx_docheader-ac_doc_no = p_belnr.
      lx_docheader-fisc_year = l_fiscyear.
      lx_docheader-doc_date = p_doc_date.
      lx_docheader-pstng_date = p_post_date.
      lx_docheader-doc_type = 'SY'.
    * fill the table parameter ACCOUNTGL of the BAPI
      lwa_tabaccgl-itemno_acc = 1.
      lwa_tabaccgl-gl_account = '0000050107'.
      lwa_tabaccgl-comp_code = 'HKHK'.
      lwa_tabaccgl-pstng_date = p_post_date.
      lwa_tabaccgl-doc_type = 'SY'.
      lwa_tabaccgl-ac_doc_no = p_belnr.
      lwa_tabaccgl-fisc_year = l_fiscyear.
      append lwa_tabaccgl to li_tabaccgl.
      clear lwa_tabaccgl.
      lwa_tabaccgl-itemno_acc = 2.
      lwa_tabaccgl-gl_account = '0000082910'.
      lwa_tabaccgl-comp_code = 'HKHK'.
      lwa_tabaccgl-pstng_date = p_post_date.
      lwa_tabaccgl-doc_type = 'SY'.
      lwa_tabaccgl-ac_doc_no = p_belnr.
      lwa_tabaccgl-fisc_year = l_fiscyear.
      lwa_tabaccgl-costcenter = '0000011401'.
      append lwa_tabaccgl to li_tabaccgl.
      clear lwa_tabaccgl.
    * fill the table parameter CURRENCYAMOUNT of the BAPI
      lwa_tabcurramt-itemno_acc = 1.
      lwa_tabcurramt-currency = 'HKD'.
      lwa_tabcurramt-amt_doccur = '1.5-'.
      append lwa_tabcurramt to li_tabcurramt.
      clear lwa_tabcurramt.
      lwa_tabcurramt-itemno_acc = 2.
      lwa_tabcurramt-currency = 'HKD'.
      lwa_tabcurramt-amt_doccur = '1.5'.
      append lwa_tabcurramt to li_tabcurramt.
      clear lwa_tabcurramt.
    * Call BAPI to post the FI-GL Document
      call function 'BAPI_ACC_GL_POSTING_POST'
        exporting
          documentheader = lx_docheader
        tables
          accountgl      = li_tabaccgl
          currencyamount = li_tabcurramt
          return         = li_tabreturn.
    The return table says:
    Error in document: BKPFF 1200000068HKCG2007 UD1CLNT120
    Field Value date is a required field for G/L account HKCG 50107
    Could you experts please tell me what the problem is? It's urgent. Thanks a million in advance.

    Hi
    You don't have to transfer the reference data to the BAPI; the document number is picked up automatically:
    * fill the structure parameter DOCUMENTHEADER of the BAPI
    *    LX_DOCHEADER-OBJ_TYPE = 'BKPFF'.
        LX_DOCHEADER-USERNAME = SY-UNAME.
        LX_DOCHEADER-COMP_CODE = 'HKHK'.
    *    LX_DOCHEADER-AC_DOC_NO = P_BELNR.
    *    LX_DOCHEADER-FISC_YEAR = L_FISCYEAR.
        LX_DOCHEADER-DOC_DATE = P_DOC_DATE.
        LX_DOCHEADER-PSTNG_DATE = P_POST_DATE.
        LX_DOCHEADER-DOC_TYPE = 'SY'.
    * fill the table parameter ACCOUNTGL of the BAPI
        LWA_TABACCGL-ITEMNO_ACC = 1.
        LWA_TABACCGL-GL_ACCOUNT = '0000050107'.
        LWA_TABACCGL-COMP_CODE = 'HKHK'.
        LWA_TABACCGL-PSTNG_DATE = P_POST_DATE.
        LWA_TABACCGL-DOC_TYPE = 'SY'.
    *    LWA_TABACCGL-AC_DOC_NO = P_BELNR.
    *    LWA_TABACCGL-FISC_YEAR = L_FISCYEAR.
        APPEND LWA_TABACCGL TO LI_TABACCGL.
        CLEAR LWA_TABACCGL.
        LWA_TABACCGL-ITEMNO_ACC = 2.
        LWA_TABACCGL-GL_ACCOUNT = '0000082910'.
        LWA_TABACCGL-COMP_CODE = 'HKHK'.
        LWA_TABACCGL-PSTNG_DATE = P_POST_DATE.
        LWA_TABACCGL-DOC_TYPE = 'SY'.
    *    LWA_TABACCGL-AC_DOC_NO = P_BELNR.
    *    LWA_TABACCGL-FISC_YEAR = L_FISCYEAR.
        LWA_TABACCGL-COSTCENTER = '0000011401'.
        APPEND LWA_TABACCGL TO LI_TABACCGL.
        CLEAR LWA_TABACCGL.
    Max

  • Problems installing Reader 10.1.3 via GPO from AIP

    I have been using GPO to install Reader for a while.  Up til now I have not had any problems that I have not been able to resolve thanks to all of the posts I have seen here.
    I have had Reader 10.1.1 install on all of the computers on the network using GPO.  I just went through the process of creating an AIP (upgraded from 10.1.0 with the MSP file) for 10.1.3 and setup a new package in my GPO for the 10.1.3 install (after using the Customization Wizard to modify it).  Everything appeared to be working properly.  I removed the 10.1.1 package and the 10.1.3 package seems to be getting pushed to the computers on the network.  Unfortunately many of the users have started reporting that they are unable to open PDF files.  Upon investigation the Add/Remove Programs show that the Reader 10.1.3 package has been installed on the computers, but when an Administrative user logs onto the computer and runs the Adobe Reader it appears to complete installation (installing settings or something else).  When the administrative user logs off and the normal user logs onto the same computer it then is fully functional.
    Have I missed something?
    Most computers on the network are Windows XP (fully patched).  We are slowly replacing them with Windows 7.  Users do not have admin rights.
    Anybody else see similar problems?

    lrcjc,
    Have you developed a solution for this issue yet?  I am pushing the installation with a Computer Policy > Startup Script and am having a similar issue.  When I log in as a user I can not use the program.  When I log in as an administrator it will work.  Then, when I log back in as a regular user I can see Reader 10.1.3.  I am in an enterprise environment with XP and 7 machines.  Here is my installation script:
    @echo off
    set server=leejsp-fs2
    set update=10.1.3
    if exist "\\leejsp-fs2\syncnet\Reader_Test\logs\%computername%.txt" goto patch
    echo Adobe Reader install started on %date% %time% from %server% >> "\\leejsp-fs2\syncnet\Reader_Test\logs\%computername%.txt"
    echo Installing Adobe Reader...
    msiexec /i "\\%server%\syncnet\Reader_Test\acroread.msi" /qb
    echo Adobe Reader install finished on %date% %time%  >> "\\leejsp-fs2\syncnet\Reader_Test\logs\%computername%.txt"
    :patch
    if exist "\\leejsp-fs2\syncnet\Reader_Test\logs\%computername%_%update%.txt" goto end
    echo Adobe Reader %update% Update install started on %date% %time% from %server% >> "\\leejsp-fs2\syncnet\Reader_Test\logs\%computername%_%update%.txt"
    msiexec /p "\\%server%\install\syncnet\Reader_Test\AdbeRdrUpd1013_MUI.msp" /qb
    echo Adobe Reader %update% Update install started on %date% %time% from %server% >> "\\leejsp-fs2\syncnet\Reader_Test\logs\%computername%_%update%.txt"
    :end
    This script worked perfectly with Reader 9 and update 9.3.2.  Could my parameters be outdated?  Am I missing a parameter that is stopping a non-administrator from finishing the installation?  Anything helps!

  • Help: Problem with scrolling my html items and placed objects they keep cutting through my top menu

    Basically, whenever I place an HTML item or an object in my Muse site, I run into a problem when scrolling down past it in the preview. I have a horizontal menu bar that sits at the top of my site, and whenever I scroll down, the HTML items and objects cut through my menu. Is there any way to fix this? I've tried pinning objects but can't figure it out. Any help would be greatly appreciated.
    Thanks,

    So you have a pinned Menu in the Master page, but it is being overlaid by objects on the page when you scroll down? If that is the case, select the Menu in the Master, right-click it, and choose Move To Master Foreground.
    Thanks,
    Vinayak

  • KM Access via URL aborts with HTTP Error 403

    Dear experts,
    I am currently working on the following problem:
    - we want to use 'KM Deep Links' to documents (open documents via URL Access)
    - when people are authenticated they can use this deep links (no problem in this case)
    - when they are not authenticated -> HTTP Error 403 is displayed
         -> we want a login form to be displayed and after successful authentication the document is displayed
    - example:
         http://localhost:50000/irj/go/km/docs/documents//SAP%20Portal%20Documents/Members%20Area/Contracts%20_%20Guidelines/example.pdf
    This is our scenario:
    - Our Portal is a SAP Nw 7.0 with SP20
    - We run an External Facing Portal and therefore we have applied SAP Note 837898 (Standard Configuration in current SPS)
       -> URL Access to Documents to which Anonymous Users have access works fine
    - as a test we used the 'old Content Access Path' mentioned in SAP Note 837898, which looks like this:
         http://localhost:50000/irj/servlet/prt/portal/prtroot/com.sap.km.cm.docs/documents//SAP%20Portal%20Documents/Members%20Area/Contracts%20_%20Guidelines/example.pdf
           - this access works as desired: the login form is shown and, after authentication, the document is displayed. The problem here is that this Content Access Path doesn't work with anonymous access to KM.
    So the goal is to display the login form instead of the Error 403 page. What configuration mistake could we have made?
    With best regards,
    Marcus
    Edited by: Marcus Böhm on Jan 25, 2010 12:08 PM

    Update:
    He tried a file extension other than txt or htm/html, and he still gets the error.
    He also tried fiddling with IE -- Internet Options -- Languages to remove extraneous languages, and he still gets the error.

  • Upload HTML document...

    When I try to upload an HTML document into a folder using the Upload iView, everything works fine and the portal reports "... document uploaded", but the document does not appear inside the folder. I guess it gets deleted immediately. Strangely, this problem applies only to HTML documents. Is it because of some content translation malfunction?
    Thank you for your comments.

    The problem is solved. It was because of a repository service with the code
    if (resource.getContent().getContentLength() < 0)
      resource.delete();
    Somehow, this statement always returned -1 for HTML files (some filter probably is screwed up!)
    I changed it to:
    if (resource.getUnfilteredContent().getContentLength() <=0)  ....
    Regards.

  • Collect email addresses from html file or URL

    Hi guys, it seems this has never been discussed here.
    I have a huge HTML file that contains text, graphics, and email addresses. Copying the email addresses manually would take a long time.
    Has anybody thought of a JSP script for this task, one that just reads the HTML file or URL and collects all the email addresses in it?

    > Hi guys, it seems this has never been discussed here.
    Probably because most developers don't like to be spammed and don't like to make programs that can collect email addresses from a website or URL?

    OWNED
    definitely owned.

  • Problem parsing a html document

    Hi all,
    I need to parse a html document.
    InputStream is = new java.io.FileInputStream(new File("c:/temp/htmldoc.html"));
    DOMFragmentParser DOMparser = new DOMFragmentParser();
    DocumentFragment doc = new HTMLDocumentImpl().createDocumentFragment();
    DOMparser.parse(new InputSource(is), doc);
    NodeList nl = doc.getChildNodes();
    I get just the following 3 nodes, even though htmldoc.html is a proper HTML document:
    #document-fragment
    HTML
    #text
    Any suggestions/help are most welcome. Thanks

    Here's an example showing how to do this via javax.xml (note that this is an XML parser, so it only accepts pages that are well-formed XHTML):
    import java.io.*;
    import java.net.*;
    import javax.xml.parsers.*;
    import org.w3c.dom.*;
    public class HTMLElementLister {
         public static void main(String[] args) throws Exception {
              URLConnection con = new URL("http://www.mywebsite.com/index.html").openConnection();
              con.connect();
              InputStream in = con.getInputStream();
              Document doc = null;
              try {
                   DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
                   DocumentBuilder db = dbf.newDocumentBuilder();
                   // Throws a SAXException if the page is not well-formed
                   doc = db.parse(in);
              } finally {
                   in.close();
              }
              // List the top-level nodes, then the children of <html>, then the children of <body>
              NodeList nodes = doc.getChildNodes();
              for (int i=0; i<nodes.getLength(); i++) {
                   Node node = nodes.item(i);
                   String nodeName = node.getNodeName();
                   System.out.println(nodeName);
                   if ("html".equalsIgnoreCase(nodeName)) {
                        System.out.println("|");
                        NodeList grandkids = node.getChildNodes();
                        for (int j=0; j<grandkids.getLength(); j++) {
                             Node contentNode = grandkids.item(j);
                             nodeName = contentNode.getNodeName();
                             System.out.println("|- " + nodeName);
                             if ("body".equalsIgnoreCase(nodeName)) {
                                  System.out.println("   |");
                                  NodeList bodyNodes = contentNode.getChildNodes();
                                  for (int k=0; k<bodyNodes.getLength(); k++) {
                                       Node bodyChild = bodyNodes.item(k);
                                       System.out.println("   |- " + bodyChild.getNodeName());
                                  }
                             }
                        }
                   }
              }
         }
    }

  • Loading scripts - what's the difference between loading into edge via script window and including a script in the html document?

    I have an HTML page that loads two Edge compositions and an external custom JavaScript file. The JavaScript file includes the bootstrapCallback so I can store references to the loaded compositions and communicate with them. This seems to work well. The problem arises when I also try to load custom plugin JavaScript files into the Edge compositions via the script window inside Edge. I don't understand how this works. For example, if I load a custom JavaScript file into one of the compositions, can only that composition use its functionality? Is loading scripts via the Edge script window the same as including them in the HTML document? I'm confused about how the two relate; please help me understand.


  • How to read the html source code of a webpage.

    How can I read the HTML source code of a webpage with a Java application?
    Is there a good way to do it?

    > How can I read the HTML source code of a webpage
    > with a Java application?
    > Is there a good way to do it?
    I don't know if this is a good idea, but it works.
    1) Use a URL to identify the document's location
    2) Use a URLConnection to open a connection between your computer and the document server
    3) Connect to the server
    4) Get the InputStream of that connection
    5) Wrap the InputStream in an InputStreamReader and a BufferedReader
    At this point you can use a loop to read lines from the BufferedReader and append them to a TextArea or other suitable text component.
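    The steps above can be sketched roughly as follows. The class name PageSourceReader, the method readLines and the example.com default are placeholders for illustration, and UTF-8 is an assumed page encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class PageSourceReader {

    // Returns the document behind the URL as a list of lines.
    public static List<String> readLines(URL url) throws IOException {
        URLConnection con = url.openConnection(); // step 2: open a connection
        con.connect();                            // step 3: connect to the server
        List<String> lines = new ArrayList<>();
        // steps 4 and 5: take the InputStream and wrap it for line-based reading
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // step 1: the URL identifies the document (placeholder default address)
        URL url = URI.create(args.length > 0 ? args[0] : "http://example.com/").toURL();
        for (String line : readLines(url)) {
            System.out.println(line); // or append to a text component instead
        }
    }
}
```

    Returning a list rather than printing inside the loop keeps the reading logic separate from whatever component ends up displaying the text.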

  • Receiving XML document and getting attachment via URL

    What is the best method to retrieve a file from a provided URL? Here is the scenario: a system will send us an XML document via an HTTP POST, but the source system is unable to send us the associated attachment as a MIME attachment. Instead, the XML will include an extra node with the URL of a file we need to get. What is the ideal method of retrieving this file once we receive the XML message? If the provided link were an FTP site, we would hold the message in BPM and use the file adapter, but since it is a full URL such as http://someserver.com/myfile/file.doc, we are looking for the best approach. One thought was a Java proxy.
    Regards

    Hi,
    there are just two approaches:
    easy one - Java proxy
    more difficult - creating a sync adapter for HTTP XMLs
    I'd go for the Java proxy (if possible from a security point of view),
    as you will have the link (so no "adapter configuration" is necessary in your case)
    Regards,
    michal
    XI / PI FAQ - Frequently Asked Questions
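    Whichever approach is chosen, the download step itself is small in plain Java. The following is only a sketch of that step, not of a Java proxy: the class AttachmentFetcher and the method fetch are hypothetical names, and how the URL is extracted from the XML node is left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class AttachmentFetcher {

    // Copies the bytes behind the URL to the target path and returns the byte count.
    public static long fetch(URL source, Path target) throws IOException {
        try (InputStream in = source.openStream()) {
            // Binary-safe copy: no character decoding is involved.
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: AttachmentFetcher <url> <target-file>");
            return;
        }
        long bytes = fetch(URI.create(args[0]).toURL(), Paths.get(args[1]));
        System.out.println("Saved " + bytes + " bytes to " + args[1]);
    }
}
```

    Copying the raw stream avoids any character decoding, which matters for binary attachments such as the .doc file in the example URL.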

  • Problems reading an html page encoded in UTF-8

    I'm trying this code with Eclipse:
    import java.io.*;
    import java.net.*;
    public class TestBooking {
         /**
          * @param args
          */
         public static void main(String[] args) throws Exception {
              URL url;
              String url_string="http://www.booking.com/searchresults.html?checkin_monthday=21;checkin_year_month=2007-8;checkout_monthday=22;checkout_year_month=2007-8;class_interval=1;offset=0;si=ai%2Cco%2Cci%2Cre;ss_all=0;city=-126693";
              url = new URL(url_string);
              URLConnection connection = url.openConnection();
              HttpURLConnection httpConnection = (HttpURLConnection) connection;
              InputStream input = connection.getInputStream();
              // No charset is passed here, so the platform default encoding is used
              BufferedReader prova = new BufferedReader(new InputStreamReader(input));
              String str;
              while ((str = prova.readLine()) != null)
                   System.out.println(str);
         }
    }
    And I get a strange set of characters in the console:
    <link rel="alternate" hreflang="el" href="/searchresults.el.html?sid=f190f46ada5404fc896b33035b20d50d;checkin_monthday=21;checkin_year_month=2007-8;checkout_monthday=22;checkout_year_month=2007-8;city=-126693;class_interval=1;offset=0;si=ai%2Cco%2Cci%2Cre" title="������������" />
    instead of the correct:
    <link rel="alternate" hreflang="el" href="/searchresults.el.html?sid=f190f46ada5404fc896b33035b20d50d;checkin_monthday=21;checkin_year_month=2007-8;checkout_monthday=22;checkout_year_month=2007-8;city=-126693;class_interval=1;offset=0;si=ai%2Cco%2Cci%2Cre" title="&#917;&#955;&#955;&#940;&#948;&#945;" />
    How can I fix the problem to read the input correctly?
    Thanks for help.

    No conversion is needed, just specify the encoding when you create your InputStreamReader. But you can't expect text in all those different scripts to display correctly in your console. Even if the console is configured to use an encoding like UTF-8 that can handle all the characters, it won't be using a font with all the appropriate glyphs. But, like you said, it doesn't matter if you can't display all of the page's source code correctly. All you need is to be able to read it, which means using the correct encoding.
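    A minimal sketch of the point about specifying the encoding. The class Utf8Decode and its decode helper are invented for illustration, and a byte array stands in for the network stream so the example runs offline. The Greek string is the title from the expected output in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Utf8Decode {

    // Decodes the bytes by passing an explicit charset to InputStreamReader.
    public static String decode(byte[] bytes, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // The six Greek letters of the title, written as Unicode escapes.
        String greek = "\u0395\u03bb\u03bb\u03ac\u03b4\u03b1";
        byte[] utf8 = greek.getBytes(StandardCharsets.UTF_8);

        // Right charset: the text survives the round trip.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(greek));      // true
        // Wrong charset: every byte becomes its own character, so the text is garbled.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).equals(greek)); // false
    }
}
```

    In the question's code, the equivalent change would be new InputStreamReader(input, StandardCharsets.UTF_8), assuming the page really is served as UTF-8.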
