Extracting info from webpages

I am trying to create a hotel program with various features, including finding the cheapest hotel prices on the web. My program searches the web and returns the appropriate web page results in HTML format. My problem is that I'm not sure of the best way to extract the information I want. Below is an example of a web page (I know the URL is long, but if you copy it into a web browser it does work, honest!). From this page I want to extract the hotel names and prices.
http://www.bookings.org/searchresults.html?class_interval=2&country=gb&error_url=http%3A%2F%2Fwww.bookings.org%2Fcountry%2Fgb.html%3F&search_by=city&city=-2595386&region=Avon+Aberdeenshire&class_key=1&class=0&do_availability_check=on&checkin_monthday=21&checkin_year_month=2005-6&checkout_monthday=22&checkout_year_month=2005-6&newlangurl=%2Fcountry%2Fgb.en.html&x=77&y=14
At present I can retrieve the HTML of this page. However, can anyone suggest how I should go about extracting the specific info I require? The HTML file is huge!
Regards
Ross

Sorry, I pasted the wrong URL. This one should work.
http://www.bookings.org/searchresults.html?class_interval=2&country=gb&error_url=http%3A%2F%2Fwww.bookings.org%2Fcountry%2Fgb.html%3F&search_by=city&city=-2595386&region=Avon+Aberdeenshire&class_key=1&class=0&do_availability_check=on&checkin_monthday=24&checkin_year_month=2005-3&checkout_monthday=27&checkout_year_month=2005-3&newlangurl=%2Fcountry%2Fgb.en.html&x=88&y=5
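
One way to pull hotel names and prices out of the returned HTML is a regular expression over the markup. The sketch below is only an illustration: the class names `hotel-name` and `price` are assumptions, since the real page's markup is not shown here, and the pattern would have to be adapted after inspecting the actual source.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HotelPriceExtractor {
    // Hypothetical markup: <a class="hotel-name">NAME</a> ... <span class="price">PRICE</span>.
    // These class names are assumptions, not taken from the real results page.
    private static final Pattern HOTEL = Pattern.compile(
        "<a[^>]*class=\"hotel-name\"[^>]*>([^<]+)</a>"
            + ".*?<span[^>]*class=\"price\"[^>]*>([^<]+)</span>",
        Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Returns hotel name -> price text, in the order they appear in the page.
    static Map<String, String> extract(String html) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = HOTEL.matcher(html);
        while (m.find()) {
            result.put(m.group(1).trim(), m.group(2).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String html =
            "<div><a class=\"hotel-name\" href=\"/h1\">Hotel A</a>"
                + "<span class=\"price\">\u00A345.00</span></div>"
                + "<div><a class=\"hotel-name\" href=\"/h2\">Hotel B</a>"
                + "<span class=\"price\">\u00A360.50</span></div>";
        extract(html).forEach((name, price) -> System.out.println(name + ": " + price));
    }
}
```

Regular expressions are fragile against real-world HTML: if one hotel has no price, this pattern would pair its name with the next price it finds. A dedicated HTML parser is usually more robust once the page structure is known.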

Similar Messages

  • Extracting info from a web page

    Hi,
    I'm not sure if I'm asking this question in the right forum.
    Can anyone tell me if there is a way to extract data from a web page?
    Say, for example, a web site such as Yahoo displays stock quotes
    or NASDAQ values updated almost in real time.
    Now, if I want to get that information from the web page into one
    of my applications, say, something that uses that data, is there
    a way to do it?
    Just curious

    Yes, it's possible. You can use the java.net.URL class to connect to websites and download the HTML. Doing the coding is not that easy, and you should also be mindful of not redistributing data you've gotten from another site without permission.
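
A minimal sketch of the download step with java.net.URL might look like the following. The class and method names are made up for illustration, and the code assumes the page is UTF-8 encoded, which a real program would need to check against the response headers.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class PageDownloader {
    // Reads everything the URL returns into one string.
    // Assumption: the content is UTF-8 text.
    static String download(URL url) throws IOException {
        StringBuilder html = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PageDownloader <url>");
            return;
        }
        URL url = URI.create(args[0]).toURL();
        System.out.println(download(url));
    }
}
```

The returned string can then be searched for the values of interest.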

  • Extracting info from HTML documents

    My program returns the HTML of any web page entered by the user. The HTML documents that are returned all contain pricing information that I want to extract. Any idea of the best way to search an HTML document for the specific information I require? It seems like a huge task to split it all into tokens and search for the £ sign!

    > This is a nightmare of a problem... the HTML
    > files that I am retrieving are huge. All I need from
    > them are a couple of lines of information. How do I
    > find the specific information I need?
    Load the entire file and search it. You find the information the same way you would when you look for it in the file's source code.
    > Is it possible from a Java program to open the HTML
    > file in a web browser, search, then return the info?
    > The HTML files seem really complex to search.
    How would this help?
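
Searching the loaded file for prices does not require tokenizing it by hand. As a rough sketch, assuming the prices are written with a pound sign (either the character itself or the entities `&pound;` / `&#163;`) followed by an amount, a regular expression can collect them all in one pass. The class name and the exact amount format are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PriceFinder {
    // A pound sign (literal or HTML entity), optional whitespace, then an amount
    // such as 45, 45.00 or 1,045.50. The format is an assumption about the pages.
    private static final Pattern PRICE = Pattern.compile(
        "(?:\u00A3|&pound;|&#163;)\\s*(\\d+(?:,\\d{3})*(?:\\.\\d{1,2})?)");

    // Returns every amount found, without the currency sign, in page order.
    static List<String> findPrices(String html) {
        List<String> prices = new ArrayList<>();
        Matcher m = PRICE.matcher(html);
        while (m.find()) {
            prices.add(m.group(1));
        }
        return prices;
    }

    public static void main(String[] args) {
        String html = "<td>&pound;45.00</td><td>\u00A3 1,045.50</td><td>no price</td>";
        findPrices(html).forEach(System.out::println);
    }
}
```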

  • Extract info from RFH2 header in MQ message

    Hi All,
    I'm trying to send a MQ message with RFH header. Purpose is that JMS adapter extracts user data from RFH Header.
    A few questions:
    1.Format RFH2 header:
    -In the MQMD I assign the format: MQFMT_RF_HEADER_2
    -Apart from the mandatory fields in the RFH2 header I write user data to it as well and it looks like this:
    <usr><msgType dt="string">data</msgType></usr>
    Is this sufficient or should there be an additional surrounding tag like <MQRFH2>?
    2.In PI, I use a JMS Adapter. On the Module-tab I have added AF_Modules/DynamicConfigurationBean (type: Local Enterprise Bean, Module Key: RFHHEADER), after ConvertBinaryToXMBMessage  and before CallSapAdapter.
    Under Module Configuration I added 2 entries:
    RFHHEADER     key.0     read http://sap.com/xi/XI/System/JMS DCJMSMessageProperty0
    RFHHEADER     value.0     msgType
    I want to extract the value of msgType and use it.
    On the Parameter-tab, I suppose I have to add Adapter-specific message attributes, but it is not exactly clear what I should put there.
    3.Dynamic Configuration Bean
    Is there anything I should do to activate that Dynamic Configuration Bean? When I send a message, I don't see anything in the monitor of the processed XML messages, not even an error.
    Kind Regards
    Edmond Paulussen

    Edmond Paulussen wrote:
    > Hi All,
    >
    > I'm trying to send a MQ message with RFH header. Purpose is that JMS adapter extracts user data from RFH Header.
    >
    > A few questions:
    > 1.Format RFH2 header:
    > -In the MQMD I assign the format: MQFMT_RF_HEADER_2
    > -Apart from the mandatory fields in the RFH2 header I write user data to it as well and it looks like this:
    > <usr><msgType dt="string">data</msgType></usr>  
    > On the Parameter-tab, I suppose I have to add Adapter-specific message attributes, but it is not exactly clear what I should put there.
    The list of additional parameters is stored in the ASMA field.
    So you enter the header value "msgType" there, with type String.
    The value "data" is stored in DCJMSMessageProperty0
    You do not need the DynamicConfigurationBean in this scenario.

  • Extracting info from v$sqlarea

    Hi,
    I use Oracle 9i and I am trying to find out exactly what a session is doing at a given moment.
    So, on sqlplus I execute
    select a.sid,a.username,b.sql_text
    from v$session a inner join v$sqlarea b
    on a.sql_address=b.address and
    a.sql_hash_value=b.hash_value
    where sid=&sid;
    but the sql_text column holds only 1000 characters and I want the whole statement.
    Is there any way to extract this information using sqlplus? I know that Enterprise Manager shows me this information, but to get sql text at the right moment, I would like to use a script.
    Thanks in advance
    Alex

    Hi Arup,
    I am writing a Linux shell script that will automatically mine information from the archive logs in the database.
    I want to build a shell script that makes information from the archive logs accessible to users, i.e. automating LogMiner with either shell scripting or a set of
    PL/SQL procedures, so that a user can use his reporting tool to specify the start date, end date and all other parameters to spool the information.
    You could also help with any ideas you have.
    Best Regards

  • How to extract info from an almost diseased HD?

    Hi,
    My Mac won't boot up: after a quiet "burr" sound, it shuts itself down during the grey Apple screen (while the gear is spinning). I have tried all the tips I could find on this forum and others (incl. resetting the PRAM, starting in Single User mode, in Safe mode etc.). It seems to me that my hard drive is dead (or hopefully almost dead)...
    When I try to run Repair Disk from my installation disc, it says there is an error with the HD and it can't repair it. My HD is visible in Disk Utility (but not, for example, on someone else's desktop in Target Disk Mode). I cannot mount the disk in Disk Utility and I can't copy the disk as a disk image.
    Is there any other way I can copy or extract the info on my HD (I need some files that weren't backed up)? Perhaps you have some tips for more advanced software? I don't want to go to a Data Recovery shop yet... Please help.
    Cheers,
    Mat.

    Mat:
    The article "Disk Utility reports 'Underlying task reported failure' when repairing a volume" describes a serious directory/file structure error that, as you have learned, cannot be repaired by Disk Utility. The article linked suggests the use of third party software, which we have noted earlier. For more technical information on the subject see Kappy on Error -9972.
    There is a possibility that the condition can be brought about by a failing HDD, although that is not always, or even usually, the case. So the first order of business would be to run either TTP or DW and see what you come up with.
    As to what you can do to recover your data if you are unable to mount the disk that can be a challenge. Here is what I suggest:
    Download the demo version of DataRescue II
    Boot computer in FW-TDM and see if DataRescue can locate your data.
    If the files you want to back up show up, you will have to buy the license to be able to recover them. Dr. Smoke's FAQ Data Recovery is a good tutorial on the subject.
    Please do post back with further questions or comments.
    cornelius

  • Extract URLs from webpage

    Hi guys,
    I'd like to know if there is a possibility to find hyperlinks on a webpage and write their targets to a text file using applescript.
    I'll take youtube as an example, because nearly everyone is familiar with it:
    When I click on a username, I will be redirected to the Channel Page of that particular user. What I want is to get the URL of this Channel Page.
    I think I'll have to create a list, with all the URLs in it, then filter them and save them into a text file.
    The problem is that I can't find a way to do that, whereas Automator offers that feature.
    Hope you can help me!
    Cheers

    I have no experience with automator but I think I understand. Here is something to get you started.
    set theURL to text returned of (display dialog "Enter URL" default answer "http://www.")
    try
        -- quoted form protects against spaces and shell metacharacters in the URL
        set theSource to (do shell script "curl " & quoted form of theURL)
    on error
        display dialog "Error getting website source."
        return
    end try
    This will set the variable theSource to the source code of the entered webpage. You then need to search through this source code and extract the information that you want. I'm still not exactly sure what that is though and there are going to be a ton of links in it. Hope that helps at least a little.

  • Extracting info from info window OTHER THAN the usual properties

    Hi, the info window's property list does not include what's in the "more info" section: I am specifically looking for the "dimensions" property of a jpeg file. Is there some way to access it in a script?
    thanks
      Mac OS X (10.4.6)  

    I don't believe Finder exposes that information to AppleScript directly. There are many other ways to get image info, however. Here are three of them…
    set hfs_image to choose file
    tell application "Image Events" to tell (open hfs_image)
        set {w, h} to get dimensions
        close
    end tell
    {w, h}
    -- UNIX METHODS
    set posix_image to POSIX path of hfs_image
    -- Using MDLS (Metadata List, AKA Spotlight)
    set w to last word of (do shell script "mdls -name kMDItemPixelWidth " & quoted form of posix_image)
    set h to last word of (do shell script "mdls -name kMDItemPixelHeight " & quoted form of posix_image)
    {w, h}
    -- Using SIPS (Scriptable Image Processing System)
    set w to last word of (do shell script "sips -g pixelWidth " & quoted form of posix_image)
    set h to last word of (do shell script "sips -g pixelHeight " & quoted form of posix_image)
    {w, h}

  • BO 4.0 Extract info from SAP BW Table

    Hello Guys,
    we would like to join a SAP BEx Query with a SAP BW table. I thought of using the Information Design Tool.
    Do you know if the Information Design Tool can connect to a SAP table?
    Thanks and regards,
    Markus

    Hello Markus,
    the Information Design Tool is able to create a relational Universe on top of InfoProvider which will show the tables from the cube, but the Information Design Tool in the current release is not able to show the tables from the ERP system.
    regards
    Ingo Hilgefort

  • Script to extract info from Mail subject line?

    I'd like to set up a rule that copies all subject lines containing text I specify to a text file. For example... all subject lines containing "has subscribed" would be copied to subscribers.txt. Is this possible?

    using terms from application "Mail"
        on perform mail action with messages theMessages
            tell application "Finder" to set ptd to (path to desktop folder) as string
            tell application "Mail"
                repeat with theMessage in theMessages
                    if subject of theMessage contains "has subscribed" then
                        set theText to (subject of theMessage & return)
                        set theFile to ptd & "subscribers.txt"
                        set theFileID to open for access file theFile with write permission
                        write theText to theFileID starting at eof
                        close access theFileID
                    end if
                end repeat
            end tell
        end perform mail action with messages
    end using terms from

  • How to extract data from info cube into an internal table using ABAP code

    HI
    Can anyone please suggest
    how to extract data from an InfoCube into an internal table using ABAP code, such as BAPIs or function modules?
    Thankx in advance
    regds
    AJAY

    HI Dinesh,
    Thank you for your reply,
    but I have already tried to use the function module.
    When I try to use the function module RSDRI_INFOPROV_READ
    I get an information message "ERROR GENERATION TEST FRAME".
    Can you please tell me what the problem could be?
    Bye
    AJAY

  • How can I extract the info from my Time Capsule?

    How can I extract the info from my Time Capsule?

    What info would that be??
    If you mean how to recover TM back to a computer.. see Pondini Q14-17
    http://pondini.org/TM/FAQ.html
    If that isn't it .. we do not know what.. "the info" you are talking about.

  • What key fields should i set in DSO extracting data from 2LIS_02_ITM

    hi experts
    I extract data from 2LIS_02_ITM into a DSO. I know the DSO isn't a must, because the 2LIS_02_ITM delta type is ABR, but I want to keep the info in the change log.
    So, which key fields should I set in the DSO? Are EBELN and EBELP enough?
    Hungry for your advice, and thanks a lot!

    If you extract ITM to a DSO, you cannot keep a log of every change. The data will arrive, but once it is activated only a single record remains for each EBELN/EBELP combination. If you want to keep all the data, you must add another field in the extractor that distinguishes the individual changes for a single EBELN/EBELP.
    Regards

  • When I extracting data from DSO to Cube by using DTP.

    When I extract data from a DSO to a cube using a DTP,
    I get the following errors:
    Data package processing terminated (Message no. RSBK229).
    Error in BW: error getting datapakid cob_pro (Message no. RS_EXCEPTION105).
    Error while extracting from source 0FC_DS08 (type DataStore) - (Message no. RSBK242).
    Data package processing terminated. (Message no. RSBK229).
    Data package 1 / 10/04/2011 15:49:56 / Status 'Processed with Errors'. (Message no. RSBK257).
    This is a brand new BI 7.3 system. We are implementing PSCD and TRM.
    I have used the standard business content objects in FI-CA (dunning history header, items, activities) and the standard DataSources (0FC_DUN_HEADER, 0FC_DUN_ITEMS, 0FC_DUN_ACTIVITIES). I have extracted data up to the DSO level. When I try to pull the data to the InfoProvider level (cube) using a DTP, I get the errors above.
    My observation: whenever I use a DSO as the source for any target, such as another DSO or a cube, it throws the same kind of error for any flow, including a simple flat file.
    Can anyone suggest whether I need to maintain some basic settings, since it's a brand new BI 7.3 system?
    Please help me out on this issue. I am not able to move forward and it's very urgent.

    Hello,
    Have you solved the problem?
    I have the same error. If you have solved it,
    can you help me, please?
    yimi castro garcia

  • [Forum FAQ] SharePoint 2013: Extracting values from a multi-value enabled lookup column and merge values to a multi-value enabled column

    For some business requirements, users want to extract values from a multi-value enabled lookup column and add items to another list based on each separate value. Others want to find duplicate values in a list, merge the associated values into a multi-value enabled column, and then add items to another list based on the merged value. Both can be achieved using a SharePoint Designer 2013 workflow.
    1. How to extract values from a multi-value enabled lookup column and add items to another list based on each separate value using SharePoint Designer 2013.
        Important actions: Loop Shape; Utility Actions
        Three scenarios
        Things to note
        Steps to create Workflow
    2. How to merge values to a multi-value enabled column and add item to another list based on the merged value using SharePoint Designer 2013.
        Important actions: Call HTTP Web Service; Build Dictionary
        Things to note
        Steps to create Workflow
    How to extract values from a multi-value enabled lookup column and add items to another list based on each separate value using SharePoint Designer 2013
    For example, suppose there are three lists as below. The goal is to extract the values from the Destinations column in Lookup2, add an item to Lookup3 for each country, and set Title to Current Item: ID.
    Lookup1:
    Title (Single line of text)
    Lookup2:
    Title (Single line of text), Destinations (Lookup; Get information from: Lookup1 in Title column).
    Lookup3:
    Title (Single line of text), Country (Single line of text).
    Important action
    1. Loop Shape: SharePoint Designer 2013 supports two types of loops: loop n times and loop with condition.
    Loops must also conform to the following rules:
    Loops must be within a stage, and stages cannot be within a loop.
    Steps may be within a loop.
    Loops may have only one entry and one exit point.
    2. Utility Actions: It contains many actions, such as ‘Extract Substring from Index of String’ and ‘Find substring in String’.
    Three scenarios
    We need to loop through the string returned from the lookup column and look for commas. There are three scenarios:
    1.  No comma, but the string is non-empty, so there is only one country.
    2.  At least one comma, so there are two or more countries to loop through.
    3.  Inside the loop, all the commas have been consumed, so the last country has been found.
    Things to note
    There are two things to note:
    1. "Find string in string (output to Variable:index)" returns -1 if it doesn't find the searched-for string.
    2. In the opening statement "Set Variable: Countries to Current Item:Destinations", set the return field as "Lookup Values, Comma Delimited".
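
The three scenarios and the -1 convention can be illustrated outside SharePoint. The following is a plain Java sketch of the same loop logic, not workflow code; the class and method names are made up for illustration. Java's `indexOf` also returns -1 when the comma is not found, which mirrors the behaviour noted above.

```java
import java.util.ArrayList;
import java.util.List;

class LookupSplitter {
    // Splits a comma-delimited lookup value such as "France, Spain, Italy"
    // by repeatedly searching for the next comma, like the workflow loop.
    static List<String> split(String countries) {
        List<String> items = new ArrayList<>();
        String remaining = countries.trim();
        while (!remaining.isEmpty()) {
            int index = remaining.indexOf(',');
            if (index == -1) {
                // Scenarios 1 and 3: no comma left, so this is the only or last country.
                items.add(remaining);
                break;
            }
            // Scenario 2: take the text before the comma, then continue with the rest.
            String country = remaining.substring(0, index).trim();
            if (!country.isEmpty()) {
                items.add(country);
            }
            remaining = remaining.substring(index + 1).trim();
        }
        return items;
    }

    public static void main(String[] args) {
        split("France, Spain, Italy").forEach(System.out::println);
    }
}
```

Trimming after each cut plays the same role as the workflow step that removes the leading space left behind by the ", " delimiter.
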
    Steps to create Workflow
    Create a custom list named Lookup1.
    Create a custom list named Lookup2, add column: Destinations (Lookup; Get information from: Lookup1 in Title column).
    Create a custom list named Lookup3, add column: Country (Single line of text).
    Create a workflow associated to Lookup2.
    Add conditions and actions:
    Start the workflow automatically when an item is created.
    Add item to Lookup2, then workflow will be started automatically and create multiple items to lookup3.
    See the below in workflow History List:
    How to merge values to a multi-value enabled column and add item to another list based on the
    merged value using SharePoint Designer 2013
    For example, they have three lists as below. They want to find duplicate values in the Title column in
    Lookup3 and merge country column to a multi-value enabled column and then add item to lookup2 and set the Title to Current Item: Title.
    Lookup1:
    Title (Single line of text)
    Lookup3:
    Title (Single line of text), Country (Single line of text).
    Lookup2:
    Title (Single line of text), Test (Single line of text).
    Important actions
    "Call HTTP Web Service" action: In SharePoint 2013 workflows, we can call a web service using a new action introduced in SharePoint 2013 named Call HTTP Web Service. This action is flexible: it allows you to make simple calls to a web service easily or, if needed, to create more complex calls using HTTP verbs and custom HTTP headers.
    "Build Dictionary" action: The Dictionary variable type is a new variable type in the SharePoint 2013 Workflow. Three actions are specifically designed for the Dictionary variable type: Build Dictionary, Count Items in a Dictionary and Get an Item from a Dictionary. The "Call HTTP Web Service" workflow action would be useless without the new "Dictionary" workflow action.
    Things to note
    The HTTP URI is set to https://sitename/_api/web/lists/GetByTitle('listname')/items?$orderby=Id%20desc and the HTTP method is set to "GET". The list is then sorted by Id in descending order.
    Use Get d/results(0)/Id from Variable: ResponseContent (Output to Variable: maxid) to get the Max ID.
    Use Set Variable: minid to Current List:ID to get the Min ID.
    Use Copy from Variable: destianation, starting at 1 (Output to Variable: destianation) to remove the space.
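
The intended result of the merge can be shown with a small Java sketch. This is not how the workflow is implemented (the workflow loops over item IDs returned by the REST call); it only illustrates the outcome, and the `Item` record and the sample titles are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CountryMerger {
    // Hypothetical stand-in for a Lookup3 list item.
    record Item(String title, String country) {}

    // Groups items by title and joins the countries of duplicate titles
    // into one comma-separated string, keeping first-seen order.
    static Map<String, String> merge(List<Item> items) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Item item : items) {
            merged.merge(item.title(), item.country(), (a, b) -> a + ", " + b);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Item> lookup3 = List.of(
            new Item("Trip1", "France"),
            new Item("Trip2", "Spain"),
            new Item("Trip1", "Italy"));
        merge(lookup3).forEach((title, countries) ->
            System.out.println(title + " -> " + countries));
    }
}
```
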
    Steps to create Workflow
    Create a custom list named Lookup1.
    Create a custom list named Lookup2, add column: Test (Single line of text).
    Create a custom list named Lookup3, add column: Country (Single line of text).
    Create a workflow associated to Lookup3.
    Add a new "Build Dictionary" action to define the HTTP request header:
    Add a Call HTTP Web Service action, click on this and paste your HTTP request.
    To associate the RequestHeader variable, select the Call action property and set the RequestHeaders property to RequestHeader:
    In the Call action, click on response and associate the response with a new variable: ResponseContent (of type Dictionary).
    After the Call action add Get item from Dictionary action to get the Max ID.
    Add Set Workflow Variable action to get the Min ID.
    Add Loop Shape (Loop with Condition) to get all the duplicate titles and integrate them to a string.
    Create item in Lookup2.
    The final Stage should look like this:
    Start the workflow automatically when an item is created.
    Add item to Lookup3, then workflow will be started automatically and create item to lookup2.
    See the below in workflow History List:
    References
    SharePoint Designer 2013 - Extracting values from a multi-value enabled lookup column into a dictionary as separate items:
    http://social.technet.microsoft.com/Forums/en-US/97d34468-1b53-4741-88b0-958472f8ca9a/sharepoint-designer-2013-extracting-values-from-a-multivalue-enabled-lookup-column-into-a
    Workflow actions quick reference (SharePoint 2013 Workflow platform):
    http://msdn.microsoft.com/en-us/library/jj164026.aspx
    Understanding Dictionary actions in SharePoint Designer 2013:
    http://msdn.microsoft.com/en-us/library/office/jj554504.aspx
    Working with Web Services in SharePoint 2013 Workflows using SharePoint Designer 2013:
    http://msdn.microsoft.com/en-us/library/office/dn567558.aspx
    Calling the SharePoint 2013 Rest API from a SharePoint Designer Workflow:
    http://sergeluca.wordpress.com/2013/04/09/calling-the-sharepoint-2013-rest-api-from-a-sharepoint-designer-workflow/

    GREAT info, but it may be helpful to note that when replacing a portion of the variable "Countries" with a whitespace character, you may cause the workflow to fail in a few specific cases (certain lookup fields will not accept this and will automatically cancel). I only found this out when recreating your workflow on a similar, but much more complex list set.
    To resolve this issue, I used another utility action (Extract Substring from Index of String) to clear out the whitespace. I configured it as "Copy from Variable: Countries, starting at 1 (Output to Variable: Countries)", which takes care of this issue in those few cases.
    Otherwise, WOW!  AWESOME JOB!  Thanks!  :)
