mysql 查询筛选重复数据sql语句

作者:简简单单 2013-11-01

查询重复数据数量

 代码如下 复制代码
select device_id from device group by device_id having count(device_id) > 1;

查询所有重复数据

 代码如下 复制代码

select userid, device_id, create_date from device where device_id in (select device_id from device group by device_id having count(device_id) > 1) order by device_id,create_date desc ;

重复一条中create_date 最新的那一条

 代码如下 复制代码

select max(create_date) from device group by device_id having count(device_id)>1;

筛选查询

 代码如下 复制代码

select * from device where device_id in (select device_id from device group by device_id having count(device_id) > 1) and create_date not in (select max(create_date) from device group by device_id having count(device_id)>1) order by device_id,create_date desc ;

下面再看一些实例吧

表结构如下:

 代码如下 复制代码

mysql> desc test1;

+--------------+------------------+------+-----+---------+----------------+
| Field        | Type             | Null | Key | Default | Extra          |
+--------------+------------------+------+-----+---------+----------------+
| ID           | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| SENDERNAME   | varchar(32)      | YES  |     | NULL    |                |
| RECEIVERNAME | varchar(64)      | YES  |     | NULL    |                |
| SENDER       | varchar(64)      | NO   |     | NULL    |                |
| RECEIVER     | varchar(64)      | NO   |     | NULL    |                |
| SUBJECT      | varchar(512)     | NO   |     | NULL    |                |
| CONTENT      | text             | NO   |     | NULL    |                |
| PRIORITY     | int(11)          | NO   | MUL | NULL    |                |
| STATUS       | int(11)          | NO   | MUL | NULL    |                |
| CREATETIME   | datetime         | NO   |     | NULL    |                |
| SENDTIME     | datetime         | YES  |     | NULL    |                |
+--------------+------------------+------+-----+---------+----------------+

subject和RECEIVER   需要做uniq key,但设计时未做,后面的数据就有很多重复的记录。

1. 查询需要删除的记录,会保留一条记录。

 代码如下 复制代码

select a.id,a.subject,a.RECEIVER from test1 a left join (select c.subject,c.RECEIVER ,max(c.id) as  bid from test1 c where status=0 GROUP BY RECEIVER,SUBJECT having count(1) >1) b on a.id< b.bid where  a.subject=b.subject and a.RECEIVER = b.RECEIVER and a.id < b.bid

2. 删除重复记录,只保留一条记录。注意,subject,RECEIVER 要索引,否则会很慢的。

 代码如下 复制代码

delete a from test1 a, (select c.subject,c.RECEIVER ,max(c.id) as  bid from test1 c where status=0 GROUP BY RECEIVER,SUBJECT having count(1) >1) b where a.subject=b.subject and a.RECEIVER = b.RECEIVER and a.id < b.bid;

好了筛选重复数据的sql语句大概就这些了,如果你能理解那几乎不担心重复数据这一说了。

相关文章

精彩推荐