登录  
 加关注
   显示下一条  |  关闭
温馨提示!由于新浪微博认证机制调整,您的新浪微博帐号绑定已过期,请重新绑定!立即重新绑定新浪微博》  |  关闭

Perfect-World

以無法為有法,以無限為有限!

 
 
 

日志

 
 

如何比较两个数据表  

2012-12-17 23:57:15|  分类: DBA |  标签: |举报 |字号 订阅

  下载LOFTER 我的照片书  |

有些时候,我们可能想要比较一下两个数据表,以找到其中不同的数据。比如,在进行数据移植的时候,或是在合并数据的时候,或是在比对验证数据的时候。当然比较两个表,需要这两个表结构是一样的。

我们先假设一下有如下表结构:

1

2

3

4

5

6

7

8

CREATE TABLE jajal

(

    user_id integer NOT NULL,

    first_name character varying(255),

    last_name character varying(255),

    grade character(1),

    CONSTRAINT jajal_pkey PRIMARY KEY (user_id)

)


然后,我们有两张表——jajaljajal_copy,其内容如下:

 jajal

user_id

first_name

last_name

grade

1

Some

Dude

A

2

Other

Guy

B

3

You are

Welcome

B

4

What

Other

A

5

INeed

You

C

6

Mixed

Nuts

Z

7

Kirk

Land

B

8

Bit

Shooter

A

9

Sun

Microsystem

C

10

Extra

Fancy

B

jajal_copy

user_id

first_name

last_name

grade

1

Some

Dude

A

2

Other

Guy

B

3

You are

Welcome

B

4

What

Other

A

5

INeed

You

C

6

Mixed

Nuts

C

7

Kirk

Land

B

8

Bit

Shooter

A

9

Sun

Microsystem

C

10

Extra

Fancy

B

   

要比较这两张表的数据,找出不一样的数据行。我们可以使用outer join 技术。我给outer join做了一个链接,是Wikipedia的,如果你对这个技术不是很清楚,还请你行看看其技术细节。

下面是具体的SQL语句:

使用FULL OUTER JOIN

1

2

3

4

5

6

7

8

9

10

11

SELECT

     *

FROM

     jajal j

     FULL OUTER JOIN jajal_copy jc ON jc.first_name = j.first_name

     AND jc.last_name = j.last_name

     AND jc.grade = j.grade

     AND jc.user_id = j.user_id

WHERE

     j.user_id IS NULL

     OR jc.user_id IS NULL

运行结果如下:

user_id

first_name

last_name

grade

user_id

first_name

last_name

grade

[NULL]

[NULL]

[NULL]

[NULL]

6

Mixed

Nuts

C

6

Mixed

Nuts

Z

[NULL]

[NULL]

[NULL]

[NULL]

   

使用NATURAL FULL OUTER JOIN

关于natural join,你可以看看Wikipedia是怎么说的。

1

2

3

4

5

6

7

8

SELECT

       *

FROM

       jajal j

       NATURAL FULL OUTER JOIN jajal_copy jc

WHERE

       j.user_id IS NULL

       OR jc.user_id IS NULL

运行结果如下:

user_id

first_name

last_name

grade

6

Mixed

Nuts

C

6

Mixed

Nuts

Z

   

MySQL SQL 代码

MySQL 并不支持 FULL OUTER JOIN,但是我们可以使用LEFT JOIN 和 RIGHT JOIN 来实现这一功能。如下所示。

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

SELECT

*

FROM

jajal j

LEFT JOIN jajal_copy jc ON jc.first_name = j.first_name

AND jc.last_name = j.last_name

AND jc.grade = j.grade

AND jc.user_id = j.user_id

WHERE

jc.user_id IS NULL

UNION ALL

SELECT

*

FROM

jajal j

RIGHT JOIN jajal_copy jc ON jc.first_name = j.first_name

AND jc.last_name = j.last_name

AND jc.grade = j.grade

AND jc.user_id = j.user_id

WHERE

j.user_id IS NULL

或者你更喜欢NATURAL JOIN 版本

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

SELECT

*

FROM

jajal j

NATURAL LEFT JOIN jajal_copy jc

WHERE

jc.user_id IS NULL

UNION ALL

SELECT

*

FROM

jajal j

NATURAL RIGHT JOIN jajal_copy jc

WHERE

j.user_id IS NULL

当然,如果你需要一个MySQL的存储过程的话,下面是一个示例:

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

DELIMITER $$

   

CREATE PROCEDURE `db_schema`.`tablediff`

    (schema_name VARCHAR(64), table1 VARCHAR(64), table2 VARCHAR(64))

BEGIN

    DECLARE done INT DEFAULT 0;

    DECLARE sql_statement TEXT DEFAULT '';

    DECLARE sql_statement_where TEXT DEFAULT '';

    DECLARE sql_statement_pk TEXT DEFAULT '';

    DECLARE col_name VARCHAR(64);

    DECLARE col_name_cur CURSOR FOR

        SELECT

            COLUMN_NAME

        FROM

            information_schema.COLUMNS

        WHERE

            TABLE_SCHEMA = schema_name

            AND TABLE_NAME = table1

    ;

    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

   

    OPEN col_name_cur;

    traverse_columns: LOOP

        FETCH col_name_cur INTO col_name;

   

        IF done THEN

            CLOSE col_name_cur;

            LEAVE traverse_columns;

        END IF;

   

        SET sql_statement_where = CONCAT(sql_statement_where,

            ' AND a.', col_name, ' = b.', col_name);

        SET sql_statement_pk = CONCAT(sql_statement_pk,

            'AND b.', col_name, ' IS NULL');

    END LOOP;

   

    SELECT

        COLUMN_NAME INTO col_name

    FROM

        information_schema.KEY_COLUMN_USAGE

    WHERE

        CONSTRAINT_SCHEMA = schema_name

        AND CONSTRAINT_NAME = 'PRIMARY'

        AND TABLE_NAME = table1

    LIMIT 1

    ;

    IF col_name IS NOT NULL THEN

        SET sql_statement_pk = CONCAT('AND b.', col_name, ' IS NULL');

    END IF;

   

    SET sql_statement = CONCAT('SELECT * FROM ', schema_name, '.', table1, ' a LEFT JOIN ', schema_name, '.', table2, ' b ON TRUE');

    SET sql_statement = CONCAT(sql_statement, sql_statement_where, ' WHERE TRUE ', sql_statement_pk);

    SET sql_statement = CONCAT(sql_statement, ' UNION ALL SELECT * FROM ', schema_name, '.', table1, ' b RIGHT JOIN ', schema_name, '.', table2, ' a ON TRUE');

    SET sql_statement = CONCAT(sql_statement, sql_statement_where, ' WHERE TRUE ', sql_statement_pk);

   

    SET @s = sql_statement;

    PREPARE stmt1 FROM @s;

    EXECUTE stmt1;

    DEALLOCATE PREPARE stmt1;

   

END$$

DELIMITER ;

   

   

PostgreSQL 下的SQL语句

下面是PostgreSQL的一个存储过程:

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

CREATE OR REPLACE FUNCTION tablediff (

    IN schema_name VARCHAR(64),

    IN table1 VARCHAR(64),

    IN table2 VARCHAR(64)

) RETURNS BIGINT AS

$BODY$

DECLARE

    the_result BIGINT DEFAULT 0;

    sql_statement TEXT DEFAULT '';

    sql_statement_where TEXT DEFAULT '';

    sql_statement_pk TEXT DEFAULT '';

    col_name VARCHAR(64);

    col_name_cur CURSOR FOR

        SELECT

            column_name

        FROM

            information_schema.columns

        WHERE

            table_catalog = schema_name

            AND table_schema = 'public'

            AND table_name = table1

    ;

BEGIN

    OPEN col_name_cur;

   

    LOOP

        FETCH col_name_cur INTO col_name;

        IF NOT FOUND THEN

            EXIT;

        END IF;

   

        sql_statement_where := sql_statement_where || ' AND a.' || col_name || ' = b.' || col_name;

    END LOOP;

   

    SELECT

        column_name INTO col_name

    FROM

        information_schema.table_constraints tc

        JOIN information_schema.constraint_column_usage ccu ON

            ccu.constraint_name = tc.constraint_name

    WHERE

        tc.table_catalog = schema_name

        AND tc.table_schema = 'public'

        AND tc.table_name = table1

    LIMIT 1

    ;

   

    IF col_name IS NOT NULL THEN

        sql_statement_pk := ' a.' || col_name || ' IS NULL';

        sql_statement_pk := sql_statement_pk || ' OR b.' || col_name || ' IS NULL';

    END IF;

   

    sql_statement := 'SELECT COUNT(*) FROM ' || schema_name || '.public.' || table1 || ' a FULL OUTER JOIN ' || schema_name || '.public.' || table2 || ' b ON TRUE';

    sql_statement := sql_statement || sql_statement_where || ' WHERE ' || sql_statement_pk;

   

    EXECUTE sql_statement INTO the_result;

   

    RETURN the_result;

END;$BODY$

LANGUAGE 'plpgsql' STABLE;

 

来源:http://coolshell.cn/articles/925.html

  评论这张
 
阅读(854)| 评论(0)

历史上的今天

评论

<#--最新日志,群博日志--> <#--推荐日志--> <#--引用记录--> <#--博主推荐--> <#--随机阅读--> <#--首页推荐--> <#--历史上的今天--> <#--被推荐日志--> <#--上一篇,下一篇--> <#-- 热度 --> <#-- 网易新闻广告 --> <#--右边模块结构--> <#--评论模块结构--> <#--引用模块结构--> <#--博主发起的投票-->
 
 
 
 
 
 
 
 
 
 
 
 
 
 

页脚

网易公司版权所有 ©1997-2018