How to find duplicated values in rows obtained from multiple tables

Hi.
I need to find the duplicates stored in different tables of my database. I have some tables like the following model (I know it could be nonsense, but that's because it's simplified):
table person { id, name, surname }.
table zoo {id, owner, name, city} (zoo.owner -> person.id)
table area {id, zoo, type, name} (area.zoo -> zoo.id).
table dog {id, area, name, colour} (dog.area -> area.id)
table elephant {id, area, name, height} (elephant.area -> area.id)
As ids are autoincremental, it could happen that a person has two zoos with identical areas (meaning that these areas have the same type and name, the same dogs (same name and colour) and the same elephants (same name and height)). An example with data:
person
id     name     surname
p1     john     doe
zoo
id     owner     name        city
z1     p1        central     NY
z2     p1        central     NY
area
id     zoo     type     name
a1     z1      open     main
a2     z2      open     main
dog
id     area     name       colour
d1     a1       jaro       brown
d2     a1       chispa     white
d3     a2       jaro       brown
d4     a2       chispa     white
elephant
id     area     name      height
e1     a1       dumbo     5
e2     a1       elphy     4
e3     a2       dumbo     5
e4     a2       elphy     4
That is: John Doe has two zoos in the same city and with the same name. These two zoos have one open main area each. Each of these areas has two dogs with the same names and colours and two elephants with the same names and heights. So these zoos would be identical. What I want is to delete zoo z2.
I'd like to find a SQL function which returns the id of one of these zoos, so that it answers the question: does the person called "John Doe" have more than one area with the same type, name, dogs and elephants?
Is it possible?

Hi,
Interesting problem!
Two distinct zoos are duplicates of each other if they have the same owner, the same name, the same areas, and the same animals in each of those areas.
It's pretty easy to tell if two zoos are distinct, but have the same owner and name. We could do that with a self-join:
SELECT       z1.*
,       z2.id          AS id_2
FROM       zoo     z1
JOIN       zoo     z2  ON   z1.owner     = z2.owner
                AND      z1.name     = z2.name
              AND      z1.id          < z2.id
But how can we tell if two zoos that meet these criteria have the same animals in the same areas? (It amuses me to call these the "critter criteria".)
One way is to make a list, for each zoo, of all the animals, with all their attributes (that is, the attributes that matter for determining if they are the same), in each area, and then see if the lists for those two zoos are identical. Like this:
WITH     all_animals     AS
(
     SELECT     d.id, d.area
     ,     d.name          -- || '~' || d.colour || ...
               AS attributes
     ,     'DOG'     AS animal
     FROM     dog     d
    UNION ALL
     SELECT     e.id, e.area
     ,     e.name          -- || '?`' || e.height || ...
                    AS attributes
     ,     'ELEPHANT'     AS animal
     FROM     elephant     e
-- UNION ALL   ... cat ... monkey ...
)
,     got_analytics     AS
(
     SELECT     z.id          AS zoo_id
     ,     ar.type          -- , ar.name, ...
     ,     an.animal
     ,     an.attributes
     ,     COUNT (*)     OVER ( PARTITION BY  z.id )     AS cnt
     ,     ROW_NUMBER () OVER ( PARTITION BY  z.id
                               ORDER BY      ar.type     -- , ar.name, ...
                               ,             an.animal
                               ,             an.attributes
                             )                    AS r_num
     FROM           zoo          z
     LEFT OUTER JOIN      area          ar  ON   ar.zoo          = z.id
     LEFT OUTER JOIN      all_animals     an  ON       an.area     = ar.id
)
SELECT       z1.*
,       z2.id          AS id_2
FROM       zoo     z1
JOIN       zoo     z2  ON   z1.owner     = z2.owner
                AND      z1.name     = z2.name
              AND      z1.id          < z2.id
WHERE     NOT EXISTS
          (
          SELECT  type     -- , name, ...
          ,          animal, attributes
          ,          cnt, r_num
          FROM    got_analytics
          WHERE   zoo_id     = z1.id
       MINUS
          SELECT  type     -- , name, ...
          ,          animal, attributes
          ,          cnt, r_num
          FROM    got_analytics
          WHERE   zoo_id     = z2.id
          )
ORDER BY  z1.id
,            z2.id
;
The first sub-query, all_animals, combines all the animals from all of the separate species tables. For each one, it makes a delimited list (the attributes column) of the columns that count in determining if two animals are the same. Different species may have different attributes and different numbers of attributes. The lists have to be delimited by some sub-string (not necessarily a single character) that can never occur in any of the attributes for that species. For example, dogs may never have '~' in their names (or their colours, if you expand the example to include colour). Elephants might have any character, but never have a '?' followed immediately by a grave accent '`'.
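To see why the delimiter matters, here is a minimal Python sketch. It is not part of the query above; the helper name attribute_key and the second dog are made up for illustration. Without a delimiter, two different animals can collapse into the same attributes string.

```python
def attribute_key(values, delimiter=""):
    """Concatenate attribute values into one comparable string (hypothetical helper)."""
    return delimiter.join(str(v) for v in values)


# Two different dogs: (name, colour). The second one is invented for this demo.
dog_a = ("jaro", "brown")
dog_b = ("jarob", "rown")

# Without a delimiter both become 'jarobrown', so they look identical.
collide = attribute_key(dog_a) == attribute_key(dog_b)

# With a delimiter that never occurs in the data, they stay distinct.
distinct = attribute_key(dog_a, "~") != attribute_key(dog_b, "~")

print(collide, distinct)
```

The same reasoning applies to the || '~' || concatenation hinted at in the comments of all_animals.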
The next sub-query, got_analytics, joins the zoo and area tables to this combined list of animals. The joins are outer joins, so we can detect as duplicates two zoos that have no areas, or two areas that have no animals.
In the main query, I used MINUS to tell if the lists for two given zoos were identical. If x MINUS y produces no results, then every row in x has a matching row in y. To say that x and y are identical, we also have to know that every row in y has a matching row in x. Rather than do another MINUS, I counted the rows (got_analytics.cnt), so that if y has more rows than x, no rows will match when we do the MINUS.
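The role of cnt can be sketched outside the database. The following Python snippet only imitates the idea with sets (the function names are hypothetical, and set difference stands in for MINUS); it assumes the rows are plain tuples of strings.

```python
def numbered(rows, with_count=True):
    """Imitate got_analytics: order the rows, then tag each with cnt and r_num."""
    ordered = sorted(rows)
    cnt = len(ordered) if with_count else None
    return {(row, cnt, r_num) for r_num, row in enumerate(ordered, start=1)}


def minus_is_empty(x_rows, y_rows, with_count=True):
    """True when 'x MINUS y' would return no rows."""
    return not (numbered(x_rows, with_count) - numbered(y_rows, with_count))


zoo_1 = [("MAIN", "DOG", "CHISPA"), ("MAIN", "DOG", "JARO"),
         ("MAIN", "ELEPHANT", "DUMBO"), ("MAIN", "ELEPHANT", "ELPHY")]
zoo_2 = list(reversed(zoo_1))   # same animals, listed in a different order
zoo_91 = zoo_1[:2]              # dogs only, no elephants

same = minus_is_empty(zoo_1, zoo_2)                             # identical lists
false_match = minus_is_empty(zoo_91, zoo_1, with_count=False)   # subset slips through
rejected = not minus_is_empty(zoo_91, zoo_1)                    # cnt 2 vs 4 blocks it

print(same, false_match, rejected)
```

Without the count, a zoo whose list is a leading subset of another zoo's list would pass a single one-directional MINUS; with the count in every row, no row of the smaller list can match.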
Here's the output I got from your sample data:
        ID      OWNER NAME             ID_2
         1          1 CENTRAL NY          2
This shows the distinct ids, and the common attributes, of all distinct pairs of zoos that match. You can join this to the person table, if you want to see details about the owner.
If a query is correct, it will not only find all the results it is supposed to find; it will also leave out every result it is not supposed to find, whatever the reason for leaving it out. I added some more sample data to test one such reason:
INSERT INTO ZOO (ID, OWNER, NAME) VALUES (91, 1, 'CENTRAL NY');
INSERT INTO AREA (ID, ZOO, TYPE) VALUES (92, 91, 'MAIN');
INSERT INTO DOG (ID, AREA, NAME) VALUES (93, 92, 'JARO');
INSERT INTO DOG (ID, AREA, NAME) VALUES (94, 92, 'CHISPA');
Zoo 91 is identical to zoos 1 and 2, except that 91 has no elephants. To thoroughly test this solution, you need to add some more test data, especially zoos that will not be selected for various reasons: for example, two zoos that have all the same animals, but different numbers of areas, or that both have two areas, but the animals are distributed differently between the areas.
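If you want to experiment with test data without an Oracle instance, the following sketch runs the same idea in SQLite through Python's sqlite3 module. Everything here is an assumption for illustration: the table definitions and ids other than 91 to 94 are invented, EXCEPT replaces Oracle's MINUS, and the window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE zoo      (id INTEGER PRIMARY KEY, owner INTEGER, name TEXT);
CREATE TABLE area     (id INTEGER PRIMARY KEY, zoo INTEGER, type TEXT);
CREATE TABLE dog      (id INTEGER PRIMARY KEY, area INTEGER, name TEXT);
CREATE TABLE elephant (id INTEGER PRIMARY KEY, area INTEGER, name TEXT);

-- Zoos 1 and 2 are duplicates; zoo 91 has the same dogs but no elephants.
INSERT INTO zoo  VALUES (1, 1, 'CENTRAL NY'), (2, 1, 'CENTRAL NY'), (91, 1, 'CENTRAL NY');
INSERT INTO area VALUES (11, 1, 'MAIN'), (12, 2, 'MAIN'), (92, 91, 'MAIN');
INSERT INTO dog  VALUES (21, 11, 'JARO'), (22, 11, 'CHISPA'),
                        (23, 12, 'JARO'), (24, 12, 'CHISPA'),
                        (93, 92, 'JARO'), (94, 92, 'CHISPA');
INSERT INTO elephant VALUES (31, 11, 'DUMBO'), (32, 11, 'ELPHY'),
                            (33, 12, 'DUMBO'), (34, 12, 'ELPHY');
""")

QUERY = """
WITH all_animals AS (
    SELECT area, name AS attributes, 'DOG' AS animal FROM dog
    UNION ALL
    SELECT area, name AS attributes, 'ELEPHANT' AS animal FROM elephant
),
got_analytics AS (
    SELECT z.id AS zoo_id, ar.type, an.animal, an.attributes,
           COUNT(*)     OVER (PARTITION BY z.id) AS cnt,
           ROW_NUMBER() OVER (PARTITION BY z.id
                              ORDER BY ar.type, an.animal, an.attributes) AS r_num
    FROM zoo z
    LEFT JOIN area ar        ON ar.zoo  = z.id
    LEFT JOIN all_animals an ON an.area = ar.id
)
SELECT z1.id, z2.id
FROM zoo z1
JOIN zoo z2 ON z1.owner = z2.owner AND z1.name = z2.name AND z1.id < z2.id
WHERE NOT EXISTS (
    SELECT type, animal, attributes, cnt, r_num
    FROM got_analytics WHERE zoo_id = z1.id
    EXCEPT                      -- SQLite's spelling of Oracle's MINUS
    SELECT type, animal, attributes, cnt, r_num
    FROM got_analytics WHERE zoo_id = z2.id
)
ORDER BY z1.id, z2.id
"""

pairs = conn.execute(QUERY).fetchall()
print(pairs)
```

Only the pair of zoos 1 and 2 should come back; zoo 91 is left out because its row count differs. Adding further negative cases to the INSERT statements is an easy way to probe the query.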

Similar Messages

  • Can a block in Oracle contain rows/data from multiple tables?

    Hi, during my discussion with one of the DBAs, a point came up: a block shouldn't have rows from multiple tables...
    Is that true? I read in one of the OTN threads (I don't remember the exact thread name) that a block can have data from multiple tables. If it can't, what does the table directory in the block signify?
    Please share your views.
    Thanks,
    CSM

    CSM.DBA wrote:
    Hi,
    As per the above link,
    Table directory
    For a heap-organized table, this directory contains metadata about tables whose rows are stored in this block. Multiple tables can store rows in the same block. (Logical Storage Structures)
    And by default Oracle creates heap organized tables only.(heap-organized table: A table in which the data rows are stored in no particular order on disk. By default, CREATE TABLE creates a heap-organized table.) (Glossary)
    So I can say a block can contain rows from multiple tables. Isn't it?
    CSM
    See Logical Storage Structures
    Given:  A segment is a set of extents allocated for a specific database object, such as a table.
    Given:  An extent is a set of logically contiguous data blocks

  • How to find OBJECTCLAS value to use in CDHDR/CDPOS tables?

    Hi Experts,
    Please help me find out which OBJECTCLAS value to use in the CDHDR/CDPOS tables.

    Pls refer to the thread,
    Object class for CDHDR
    BR,
    Krishna

  • How to find  dynamic value(screen value) in table control for current row .

    hi to all,
    I used a table control in my screen. For the field in column 2 I used a search help, and for column 3 I used a dynamic help.
    In change mode you can change any row of the table control.
    When I use the search help for a row that already has an entry in column 3, I cannot get any value.
    How can I get the value of column 2 for row 3?
    thanks

    Try using TC-current_line as the index for the search.

  • How to find the value of a variable in other program

    How to find the value of a variable in other program say I am in a FM and this FM is being called in from other program and I want to know some of the variable details of the program from the FM itself. Imagine if this is a txn. and I need to know the details from some of the programs while executing the same transaction
    Regards
    Vin

    Hi Vinayak,
         You will have your first program's values in an internal table or in some variables.
        When you call the second program, pass them like this:
        SUBMIT <Second Program Name> USING SELECTION-SCREEN '1000'
                           WITH s_emp   IN t_emp     " s_emp: select-option of the second program, t_emp: variable of the first program
                           WITH p_chk   EQ t_chk
                           WITH p_r1    EQ t_r1
                           WITH p_month EQ t_month
                           WITH s_cust1 IN t_cust1
                           WITH p_r2    EQ t_r2
                           WITH s_cust2 IN t_cust2
                           WITH s_week  IN t_week
                           AND RETURN.
    You have to pass the values like this so that the second program gets your first program's details.

  • How to find report values using report writer

    hi,
    Please help me.
    How to find report values using report writer
    Regards,
    RRK.
    Edited by: Alvaro Tejada Galindo on Feb 6, 2008 12:01 PM

    Thanks all for the reply.
    I am trying to solve a problem where report parameter value that is set at Management Console is wiped out after calling replaceConnection.
    databaseController.replaceConnection(oldConnectionInfo, newConnectionInfo,
    null,DBOptions._doNotVerifyDB);
    We have to support changing database connection from a java utility
    class. But once replaceConnection is called all existing static parameter values are lost. To fix this issue we thought of getting parameters and values before calling replaceConnection and setting it after replaceConnection.
    Version is CS2008 SP3 - version 12.3.0.601
    If there is any other option of fixing the original wipe out issue?
    ParameterValues.getValues() didn't return a value. I will try ParameterValues.getCurrentValues(), but the documentation says ParameterValues.getValues() is equivalent to the IParameterField.getCurrentValues() method unless it is empty, in which case it is equivalent to the IParameterField.getDefaultValues() method.
    So getCurrentValues() may not work.

  • How to find the Values of SAP Gateway Server Host  and Gateway Service Valu

    Hi All,
    I installed SAPR/3 4.7 EE on Windows. For configuring SLD and LDAP i am unable to give the SAP Gateway Server Host and Gateway Service values.
    Can any one plz suggest me how to find these values.
    Regds
    Phanikumar

    Hello, SAP Gateway Server Host and Gateway Service are used to set up rfc connectivity, that is the host name and the system number where your sld and ldap is responding, if you have no sld and no ldap , just simply uncheck that options while installing.
    Have a nice week end, Luciano.

  • Write the sql query to find largest value in row wise without using   great

    Write the SQL query to find the largest value row-wise without using
    the GREATEST function?

    Another not so good way, considering you want greatest of 4 fields from a single record:
    SQL> ed
    Wrote file afiedt.buf
    with t as (Select 100 col1,200 col2,300 col3,400 col4 from dual
    union select 500,600,700,800 from dual
    union select 900,1000,1100,1200 from dual
    union select 1300,1400,1500,1600 from dual
    union select 1700,1800,1900,2000 from dual
    union select 2100,2200,2300,2400 from dual
    union select 2800,2700,2600,2500 from dual
    union select 2900,3000,3100,3200 from dual)
    SELECT (CASE WHEN col1 > col2 THEN col1 ELSE col2 END) Max_value
    FROM
    (SELECT (CASE WHEN col1_col2 > col2_col3 THEN col1_col2 ELSE col2_col3 END) col1,
           (CASE WHEN col2_col3 > col3_col4 THEN col2_col3 ELSE col3_col4 END) col2,
           (CASE WHEN col3_col4 > col4_col1 THEN col3_col4 ELSE col4_col1 END) col3
    FROM
    (SELECT (CASE WHEN col1 > col2 THEN col1 ELSE col2 END) col1_col2,
           (CASE WHEN col2 > col3 THEN col2 ELSE col3 END) col2_col3,
           (CASE WHEN col3 > col4 THEN col3 ELSE col4 END) col3_col4,
           (CASE WHEN col4 > col1 THEN col4 ELSE col1 END) col4_col1
    FROM t))
    SQL> /
    MAX_VALUE
           400
           800
          1200
          1600
          2000
          2400
          2800
          3200
    8 rows selected.
    Edited by: AP on Sep 21, 2010 6:29 AM
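A different technique from the nested CASE expressions above is to unpivot the four columns into rows and let MAX do the work. The sketch below shows it in SQLite through Python's sqlite3 module; the table t and its id column are assumptions added so that each row can be identified, and only two of the sample rows are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col1, col2, col3, col4)")
conn.executemany(
    "INSERT INTO t (col1, col2, col3, col4) VALUES (?, ?, ?, ?)",
    [(100, 200, 300, 400), (2800, 2700, 2600, 2500)],
)

# Turn each column into its own row, then aggregate per original row.
rows = conn.execute("""
    SELECT id, MAX(val) AS max_value
    FROM (
        SELECT id, col1 AS val FROM t
        UNION ALL SELECT id, col2 FROM t
        UNION ALL SELECT id, col3 FROM t
        UNION ALL SELECT id, col4 FROM t
    )
    GROUP BY id
    ORDER BY id
""").fetchall()

print(rows)
```

This needs some column that identifies each row; the nested CASE version does not, which may be why it was suggested here.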

  • How to find these values fall in what time?

    Post Author: newcruser
    CA Forum: General
    From 9 AM to 5 PM time range..how to find these values fall where?.
    Is that done using group selection method?.  Please give me idea and direction
    UserId   Type   login                 logout
    1        1      2008-04-13 09:30:42   2008-04-13 10:30:12
    2        2      2008-04-13 09:30:12   2008-04-13 11:00:32
    3        1      2008-04-13 10:30:32   2008-04-13 12:56:23
    4        2      2008-04-13 10:30:42   2008-04-13 12:00:34
    5        2      2008-04-13 11:30:34   2008-04-13 13:40:23
    6        1      2008-04-13 12:30:43   2008-04-13 13:00:43
    7        1      2008-04-13 13:20:43   2008-04-13 14:45:21
    8        2      2008-04-13 14:30:42   2008-04-13 15:15:59
    9        1      2008-04-13 15:00:42   2008-04-13 16:30:42
    10       1      2008-04-13 16:20:42   2008-04-13 17:00:00

    Post Author: newcruser
    CA Forum: General
    In my case minimum (login) time changes and maximum(logout) time changes everyday.
    I want to create from minimum (login) to  maximum(logout) for every hour, how many people are online. How to do?.
    UserId   login                 logout
    1        2008-04-13 09:30:42   2008-04-13 10:30:12
    2        2008-04-13 09:30:12   2008-04-13 11:00:32
    3        2008-04-13 10:30:32   2008-04-13 12:56:23
    4        2008-04-13 10:30:42   2008-04-13 12:00:34
    5        2008-04-13 11:30:34   2008-04-13 13:40:23
    6        2008-04-13 12:30:43   2008-04-13 13:00:43
    7        2008-04-13 13:20:43   2008-04-13 14:45:21
    8        2008-04-13 14:30:42   2008-04-13 15:15:59
    9        2008-04-13 15:00:42   2008-04-13 16:30:42
    10       2008-04-13 16:20:42   2008-04-13 17:00:00
    i found minimum (login) and maximum(logout) using below formula. Now i want to find every hour from minimum (login) to maximum(logout) , so i can use each hour value in formula to calculate how many people are online for that particular time. How to do that?.
    @Minimum
    DatePart("h",(Minimum ()))
    @Maximum
    DatePart("h",(Maximum()))
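The counting logic itself can be sketched independently of the reporting tool. The Python below is only an illustration, not a report formula: the function name online_per_hour is hypothetical, and it assumes a user counts as online in an hour if the session overlaps that hour at all.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# (UserId, login, logout) taken from the sample data in the question.
sessions = [
    (1, "2008-04-13 09:30:42", "2008-04-13 10:30:12"),
    (2, "2008-04-13 09:30:12", "2008-04-13 11:00:32"),
    (3, "2008-04-13 10:30:32", "2008-04-13 12:56:23"),
    (4, "2008-04-13 10:30:42", "2008-04-13 12:00:34"),
    (5, "2008-04-13 11:30:34", "2008-04-13 13:40:23"),
    (6, "2008-04-13 12:30:43", "2008-04-13 13:00:43"),
    (7, "2008-04-13 13:20:43", "2008-04-13 14:45:21"),
    (8, "2008-04-13 14:30:42", "2008-04-13 15:15:59"),
    (9, "2008-04-13 15:00:42", "2008-04-13 16:30:42"),
    (10, "2008-04-13 16:20:42", "2008-04-13 17:00:00"),
]


def online_per_hour(rows):
    """Map each hour between the earliest login and latest logout to a user count."""
    parsed = [(datetime.strptime(i, FMT), datetime.strptime(o, FMT)) for _, i, o in rows]
    hour = min(i for i, _ in parsed).replace(minute=0, second=0)
    end = max(o for _, o in parsed)
    counts = {}
    while hour < end:
        nxt = hour + timedelta(hours=1)
        # A session is counted when it overlaps the bucket [hour, nxt).
        counts[hour.hour] = sum(1 for i, o in parsed if i < nxt and o > hour)
        hour = nxt
    return counts


result = online_per_hour(sessions)
print(result)
```

Because the first and last hours are derived from the data, the range adapts when the minimum login and maximum logout change from day to day.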

  • How to get the values of Select-options from the screen.

    The value of parameter can be obtained by function module 'DYNP_VALUES_READ' but How to get the values of Select-options from the screen? I want the F4 help values of select-options B depending on the values in Select-option A.So I want to read the Select-option A's value.

    Hi,
    Refer to the following code; it should solve your problem.
    "Following code reads value entered in s_po select options and will provide search
    "help for s_item depending upon s_po value.
    REPORT TEST.
    TABLES : ekpo.
    DATA: BEGIN OF itab OCCURS 0,
    ebelp LIKE ekpo-ebelp,
    END OF itab.
    SELECT-OPTIONS   s_po FOR ekpo-ebeln.
    SELECT-OPTIONS s_item FOR ekpo-ebelp.
    INITIALIZATION.
    AT SELECTION-SCREEN ON VALUE-REQUEST FOR s_item-low.
      DATA:
      dyn_field TYPE dynpread,
      temp_fields TYPE TABLE OF dynpread,
      zlv_dynpro TYPE syst-repid.
      zlv_dynpro = syst-repid.
      CALL FUNCTION 'DYNP_VALUES_READ'
        EXPORTING
          dyname     = zlv_dynpro
          dynumb     = syst-dynnr
          request    = 'A'
        TABLES
          dynpfields = temp_fields
        EXCEPTIONS
          OTHERS     = 0.
      LOOP AT temp_fields INTO dyn_field.
        IF dyn_field-fieldname EQ 'S_PO-LOW'.
            SELECT * INTO CORRESPONDING fields OF TABLE itab FROM ekpo
            WHERE ebeln EQ dyn_field-fieldvalue.
            EXIT.
        ENDIF.
      ENDLOOP.

  • How to find out when data was deleted from table in oracle and Who deleted that

    HI Experts,
    Help me for below query:
    how to find out when data was deleted from table in oracle and Who deleted that ?
    I tried to find some information in dba_tab_modifications, but I am not sure what the TIMESTAMP column shows: is it the time of the update, the insert, or the delete?
    SQL> select TABLE_OWNER,TABLE_NAME,INSERTS,UPDATES,DELETES,TIMESTAMP,DROP_SEGMENTS,TRUNCATED from dba_tab_modifications where TABLE_NAME='F9001';
    TABLE_OWNER                    TABLE_NAME                        INSERTS    UPDATES    DELETES     TIMESTAMP         DROP_SEGMENTS TRU
    PRODCTL                        F9001                                                     1683         46       2171            11-12-13 18:23:39             0                   NO
    Is auditing enabled in my environment?
    The customer is facing the issue that data is missing from the table. After seeing the DELETES column and the timestamp in dba_tab_modifications, I told him that yes, there was a delete on the table at 11-12-13 18:23:39, but I am not sure whether I am right.
    SQL> show parameter audit
    NAME                                 TYPE        VALUE
    audit_file_dest                      string      /oracle/admin/pbowe/adump
    audit_sys_operations                 boolean     TRUE
    audit_syslog_level                   string
    audit_trail                          string      DB, EXTENDED
    please help
    Thanks
    Sam

    LOGMiner --> Using LogMiner to Analyze Redo Log Files
    AUDIT --> Configuring and Administering Auditing

  • How to find the selected item in alv grid or table control

    can any one tell me please
    how to find the selected item in alv grid or table control

    In table control, If you goto screen painter and goto table control properties ( f2 ), there is one check-box w/selColumn check that and give column name. Then add that column to your internal table.
    IN PAI
      LOOP AT it_tkhdr.
        FIELD it_tkhdr-sel_row
          MODULE tab_tkhdr_mark ON REQUEST.
      ENDLOOP.
    MODULE tab_tkhdr_mark INPUT.
      MODIFY it_tkhdr INDEX tc_tkhdr-current_line.
    ENDMODULE.                 " tab_tkhdr_mark  INPUT
    here it_TKHDR is internal table sel_row is field for selection
    After that, you can loop at it_tkhdr where sel_row is 'X' to get selected rows.
    regards,
    Gagan

  • How to find block free space  and PCTUSED FOR THE TABLE

    HI all,
    Due to performance issues for my database , my management ask me to reset the PCTUSED and PCTFREE values , and my doubt is
    1)How to find the current PCTUSED and PCTFREE values.
    2)How to find block free space and PCTUSED FOR THE TABLE.
    Please help me out regarding this.
    Regards,
    Vamsi.

    1)version is 10.2.0.4
    2)output of query
    tablespace extent_management allocation_type segment_space_management
    SYSTEM     LOCAL     SYSTEM     MANUAL
    UNDOTBS1     LOCAL     SYSTEM     MANUAL
    SYSAUX     LOCAL     SYSTEM     AUTO
    TEMP     LOCAL     UNIFORM     MANUAL
    USERS     LOCAL     SYSTEM     AUTO
    UNDOTBS2     LOCAL     SYSTEM     MANUAL
    INS     LOCAL     SYSTEM     AUTO
    CONFTBS     LOCAL     SYSTEM     AUTO
    REINS     LOCAL     SYSTEM     AUTO
    ANALYST     LOCAL     SYSTEM     AUTO
    BI     LOCAL     SYSTEM     AUTO
    INTRFC     LOCAL     SYSTEM     AUTO
    COGNOS     LOCAL     SYSTEM     AUTO
    TS_INDX     LOCAL     SYSTEM     AUTO
    TS_CHOLAWEB     LOCAL     SYSTEM     AUTO
    TS_DASBOARD     LOCAL     SYSTEM     AUTO

  • How to find the number of fetched lines from select statement

    Hi Experts,
    Can you tell me how to find the number of fetched lines from select statements..
    and one more thing is can you tell me how to check the written select statement or written statement is correct or not????
    Thanks in advance
    santosh

    Hi,
    Look for the system field SY-DBCNT. It contains the number of records that the last SELECT statement put into the internal table.
    For ex:
    data: itab type mara occurs 0 with header line.
    Select * from mara into table itab.
    Write: sy-dbcnt.
    This will give you the number of entries that have been selected.
    I am not sure what you mean by the second question. If you can let me know what you need then we might have a solution.
    Hope this helps,
    Sudhi
    Message was edited by:
            Sudhindra Chandrashekar

  • How to find the number of columns in an internal table DYNAMICALLY ?

    Hi,
    How to find the number of columns in an internal table DYNAMICALLY ?
    Thanks and Regards,
    saleem.

    Hi,
    you can find the number of columns and their order using
    the 'REUSE_ALV_FIELDCATALOG_MERGE'
    call function 'REUSE_ALV_FIELDCATALOG_MERGE'
    EXPORTING
       I_PROGRAM_NAME               = sy-repid
       I_INTERNAL_TABNAME           = 'ITAB'
       I_INCLNAME                   = sy-repid
      changing
        ct_fieldcat                  = IT_FIELDCAT
    EXCEPTIONS
       INCONSISTENT_INTERFACE       = 1
       PROGRAM_ERROR                = 2
       OTHERS                       = 3.
    if sy-subrc <> 0.
    MESSAGE ID SY-MSGID TYPE SY-MSGTY NUMBER SY-MSGNO
             WITH SY-MSGV1 SY-MSGV2 SY-MSGV3 SY-MSGV4.
    endif.
    Now run DESCRIBE TABLE on your field catalog to find the number of columns,
    and their order as well.
    regards
    vijay
