MySQL 8 教程

MySQL - 主页 MySQL - 简介 MySQL - 功能 MySQL - 版本 MySQL - 变量 MySQL - 安装 MySQL - 管理 MySQL - PHP 语法 MySQL - Node.js 语法 MySQL - Java 语法 MySQL - Python 语法 MySQL - 连接 MySQL - Workbench

MySQL 8 数据库

MySQL - 创建数据库 MySQL - 删除数据库 MySQL - 选择数据库 MySQL - 显示数据库 MySQL - 复制数据库 MySQL - 数据库导出 MySQL - 数据库导入 MySQL - 数据库信息

MySQL 8 用户

MySQL - 创建用户 MySQL - 删除用户 MySQL - 显示用户 MySQL - 更改密码 MySQL - 授予权限 MySQL - 显示权限 MySQL - 撤销权限 MySQL - 锁定用户账户 MySQL - 解锁用户账户

MySQL 8 表

MySQL - 创建表 MySQL - 显示表 MySQL - 修改表 MySQL - 重命名表 MySQL - 克隆表 MySQL - 截断表 MySQL - 临时表 MySQL - 修复表 MySQL - 描述表 MySQL - 添加/删除列 MySQL - 显示列 MySQL - 重命名列 MySQL - 表锁定 MySQL - 删除表 MySQL - 派生表

MySQL 8 查询

MySQL - 查询 MySQL - 约束 MySQL - INSERT 插入查询 MySQL - SELECT 查询 MySQL - UPDATE 更新查询 MySQL - DELETE删除查询 MySQL - REPLACE 替换查询 MySQL - 忽略插入 MySQL - 重复键更新时插入 MySQL - 插入到另一个表语句

MySQL 8 视图

MySQL - 创建视图 MySQL - 更新视图 MySQL - 删除视图 MySQL - 重命名视图

MySQL 8 索引

MySQL - 索引 MySQL - 创建索引 MySQL - 删除索引 MySQL - 显示索引 MySQL - 唯一索引 MySQL - 聚集索引 MySQL - 非聚集索引

MySQL 运算符和子句

MySQL - Where 子句 MySQL - Limit 子句 MySQL - Distinct 子句 MySQL - Order By 子句 MySQL - Group By 子句 MySQL - Having 子句 MySQL - AND 运算符 MySQL - OR 或运算符 MySQL - LIKE 运算符 MySQL - IN 运算符 MySQL - ANY 运算符 MySQL - Exists 运算符 MySQL - NOT 运算符 MySQL - NOT EQUAL 运算符 MySQL - IS NULL 运算符 MySQL - IS NOT NULL 运算符 MySQL - Between 运算符 MySQL - UNION 运算符 MySQL - UNION 与 UNION ALL MySQL - MINUS 运算符 MySQL - INTERSECT 运算符 MySQL - INTERVAL 运算符

MySQL 连接

MySQL - 使用连接 MySQL - Inner Join 内连接 MySQL - LEFT JOIN 左连接 MySQL - RIGHT JOIN 右连接 MySQL - CROSS JOIN 交叉连接 MySQL - 全连接 MySQL - 自连接 MySQL - Delete Join 删除连接 MySQL - UPDATE JOIN 更新连接 MySQL - 联合 vs 连接

MySQL 键

MySQL - UNIQUE 唯一键 MySQL - PRIMARY KEY 主键 MySQL - FOREIGN KEY 外键 MySQL - 复合键 MySQL - 备用键

MySQL 触发器

MySQL - 触发器 MySQL - 创建触发器 MySQL - 显示触发器 MySQL - 删除触发器 MySQL - 插入前触发器 MySQL - 插入后触发器 MySQL - 更新前触发器 MySQL - 更新后触发器 MySQL - 删除前触发器 MySQL - 删除后触发器

MySQL 8 数据类型

MySQL - 数据类型 MySQL - VARCHAR MySQL - BOOLEAN MySQL - ENUM 枚举 MySQL - DECIMAL 十进制 MySQL - INT 整数 MySQL - FLOAT 浮点数 MySQL - BIT 位 MySQL - TINYINT 微小整数 MySQL - BLOB 二进制大对象 MySQL - SET 集合

MySQL 正则表达式

MySQL - 正则表达式 MySQL - RLIKE 运算符 MySQL - NOT LIKE 运算符 MySQL - NOT REGEXP 运算符 MySQL - regexp_instr() 函数 MySQL - regexp_like() 函数 MySQL - regexp_replace() 函数 MySQL - regexp_substr() 函数

MySQL 全文搜索

MySQL - 全文搜索 MySQL - 自然语言全文搜索 MySQL - 布尔全文搜索 MySQL - 查询扩展全文搜索 MySQL - ngram 全文解析器

MySQL8 函数和运算符

MySQL - 日期和时间函数 MySQL - 算术运算符 MySQL - 数字函数 MySQL - 字符串函数 MySQL - 聚合函数

MySQL 8 其他概念

MySQL - NULL 值 MySQL - 事务 MySQL - 序列 MySQL - 处理重复项 MySQL - SQL 注入 MySQL - 子查询 MySQL - 注释 MySQL - 检查约束 MySQL - 存储引擎 MySQL - 将表导出为 CSV 文件 MySQL - 将 CSV 文件导入数据库 MySQL - UUID MySQL - 通用表表达式 MySQL - 级联删除 MySQL - Upsert 操作 MySQL - 水平分区 MySQL - 垂直分区 MySQL - 游标 MySQL - 存储函数 MySQL - SIGNAL 异常处理 MySQL - RESIGNAL 异常处理 MySQL - 字符集 MySQL - 排序规则 MySQL - 通配符 MySQL - 别名 MySQL - ROLLUP 超级聚合 MySQL - 当前日期 MySQL - 字面量 MySQL - 存储过程 MySQL - EXPLAIN 语句 MySQL - JSON MySQL - 标准差 MySQL - 查找重复记录 MySQL - 删除重复记录 MySQL - 选择随机记录 MySQL - 显示进程列表 MySQL - 更改列类型 MySQL - 重置自动增量 MySQL - Coalesce() 函数

MySQL 8 实用资源

MySQL - 实用函数 MySQL - 语句参考 MySQL - 快速指南 MySQL - 实用资源 MySQL - 讨论


MySQL - 删除重复记录

MySQL 删除重复记录

在数据库中(包括 MySQL),重复记录非常常见。MySQL 数据库以由行和列组成的表的形式存储数据。现在,当数据库表中两行或多行具有相同的值时,该记录即为重复记录。

这种冗余可能由于多种原因而发生 -

  • 该行可能被插入两次。
  • 从外部源导入原始数据时。
  • 数据库应用程序中可能存在错误。

无论原因是什么,删除此类冗余记录对于提高数据准确性、减少错误或提高数据库性能效率都至关重要。

查找重复值

在删除重复记录之前,我们必须确定它们是否存在于表中。可以使用以下方法实现 -

  • GROUP BY 子句

  • COUNT() 方法

示例

首先,我们创建一个名为"CUSTOMERS"的表,其中包含重复值 -

CREATE TABLE CUSTOMERS(
   ID int,
   NAME varchar(100)
);

使用以下 INSERT 查询,将几条记录插入"CUSTOMERS"表。此处,我们将"John"作为重复记录添加了 3 次 -

INSERT INTO CUSTOMERS 
VALUES (1,'John'), (2,'Johnson'), (3,'John'), (4,'John');

获得的 CUSTOMERS 表如下 -

id name
1 John
2 Johnson
3 John
4 John

现在,我们使用 COUNT() 方法和 GROUP BY 子句检索表中重复的记录,如下查询所示 -

SELECT NAME, COUNT(NAME) FROM CUSTOMERS
GRO​​UP BY NAME HAVING COUNT(NAME) > 1;

输出

以下是获得的输出 -

NAME COUNT(NAME)
John 3

删除重复记录

要从数据库表中删除重复记录,我们可以使用 DELETE 命令。但是,此 DELETE 命令有两种删除重复记录的方法 -

  • 使用 DELETE...JOIN

  • 使用 ROW_NUMBER() 函数

使用 DELETE...JOIN

要使用 DELETE...JOIN 命令从表中删除重复记录,我们需要对其自身进行内连接。这适用于不完全相同的情况。

例如,假设客户记录中有重复的客户详细信息,但序列号不断递增。这里,即使 ID 不同,记录也会重复。

示例

在以下查询中,我们使用之前创建的 CUSTOMERS 表,通过 DELETE...JOIN 命令删除重复记录 -

DELETE t1 FROM CUSTOMERS t1
INNER JOIN CUSTOMERS t2
WHERE t1.id < t2.id AND t1.name = t2.name;

输出

以下是获得的输出 -

Query OK, 2 rows affected (0.01 sec)

验证

我们可以使用以下 SELECT 语句验证重复记录是否已被删除 -

SELECT * FROM CUSTOMERS;

我们可以在得到的表中看到,该查询删除了重复项,并在表中保留了不同的记录 -

ID NAME
2 Johnson
4 John

使用 ROW_NUMBER() 函数

MySQL 中的 ROW_NUMBER() 函数用于为查询结果集中的每一行分配一个从 1 开始的序列号。

使用此函数,MySQL 允许您检测重复行,并使用 DELETE 语句删除这些重复行。

示例

在这里,我们将 ROW_NUMBER() 函数应用于 CUSTOMERS 表,该表的"NAME"列中有重复值。我们将使用以下查询根据"NAME"列在分区内分配行号 -

SELECT id, ROW_NUMBER()
OVER (PARTITION BY name ORDER BY name) AS row_num
FROM CUSTOMERS;

以下是获得的输出 -

id row_num
1 1
3 2
4 3
2 1

现在,使用以下语句删除重复行(行号大于 1 的行)-

DELETE FROM CUSTOMERS WHERE id IN(
    SELECT id FROM (SELECT id, ROW_NUMBER()
    OVER (PARTITION BY name ORDER BY name) AS row_num
    FROM CUSTOMERS) AS temp_table WHERE row_num>1
);

我们得到如下所示的输出 -

Query OK, 2 rows affected (0.00 sec)

要验证重复记录是否已被删除,请使用以下 SELECT 查询 -

SELECT * FROM CUSTOMERS;

生成的结果如下 -

ID NAME
1 John
2 Johnson

使用客户端程序删除重复记录

我们也可以使用客户端程序删除重复记录。

语法

要通过 PHP 程序删除重复记录,我们需要使用 mysqli 函数 query() 执行包含"DELETE"命令的内连接,如下所示 -

$sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
$mysqli->query($sql);

要通过 JavaScript 程序删除重复记录,我们需要使用 mysql2 库的 query() 函数,通过"DELETE"命令执行内连接,如下所示 -

sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
con.query(sql)

要通过 Java 程序删除重复记录,我们需要使用 JDBC 函数 execute() 执行包含"DELETE"命令的内连接,如下所示 -

String sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
statement.execute(sql);

要通过 Python 程序删除重复记录,我们需要使用 MySQL Connector/Pythonexecute() 函数,通过"DELETE"命令执行内连接,如下所示 -

delete_query = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name"
cursorObj.execute(delete_query)

示例

以下是程序 -

$dbhost = 'localhost';
$dbuser = 'root';
$dbpass = 'password';
$db = 'TUTORIALS';
$mysqli = new mysqli($dbhost, $dbuser, $dbpass, $db);
if ($mysqli->connect_errno) {
    printf("Connect failed: %s
", $mysqli->connect_error); exit(); } //printf('Connected successfully.
'); //让我们创建一个表 $sql = "CREATE TABLE DuplicateDeleteDemo(ID int,NAME varchar(100))"; if($mysqli->query($sql)){ printf("DuplicateDeleteDemo table created successfully...! "); } //现在让我们插入一些重复的记录; $sql = "INSERT INTO DuplicateDeleteDemo VALUES(1,'John')"; if($mysqli->query($sql)){ printf("First record inserted successfully...! "); } $sql = "INSERT INTO DuplicateDeleteDemo VALUES(2,'Johnson')"; if($mysqli->query($sql)){ printf("Second record inserted successfully...! "); } $sql = "INSERT INTO DuplicateDeleteDemo VALUES(3,'John')"; if($mysqli->query($sql)){ printf("Third records inserted successfully...! "); } $sql = "INSERT INTO DuplicateDeleteDemo VALUES(4,'John')"; if($mysqli->query($sql)){ printf("Fourth record inserted successfully...! "); } //显示表记录 $sql = "SELECT * FROM DuplicateDeleteDemo"; if($result = $mysqli->query($sql)){ printf("Table records(before deleting): "); while($row = mysqli_fetch_array($result)){ printf("ID: %d, NAME %s", $row['ID'], $row['NAME']); printf(" "); } } //现在让我们计算重复记录 $sql = "SELECT NAME, COUNT(NAME) FROM DuplicateDeleteDemo GROUP BY NAME HAVING COUNT(NAME) > 1"; if($result = $mysqli->query($sql)){ printf("Duplicate records: "); while($row = mysqli_fetch_array($result)){ print_r($row); } } //删除重复的记录 $sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name"; if($mysqli->query($sql)){ printf("Duplicate records deleted successfully...! "); } $sql = "SELECT ID, NAME FROM DuplicateDeleteDemo"; if($result = $mysqli->query($sql)){ printf("Table records after deleting: "); while($row = mysqli_fetch_row($result)){ print_r($row); } } if($mysqli->error){ printf("Error message: ", $mysqli->error); } $mysqli->close();

输出

获得的输出如下所示 -

DuplicateDeleteDemo table created successfully...!
First record inserted successfully...!
Second record inserted successfully...!
Third records inserted successfully...!
Fourth record inserted successfully...!
Table records(before deleting):
ID: 1, NAME John
ID: 2, NAME Johnson
ID: 3, NAME John
ID: 4, NAME John
Duplicate records:
Array
(
    [0] => John
    [NAME] => John
    [1] => 3
    [COUNT(NAME)] => 3
)
Duplicate records deleted successfully...!
Table records after deleting:
Array
(
    [0] => 2
    [1] => Johnson
)
Array
(
    [0] => 4
    [1] => John
)    

var mysql = require('mysql2');
var con = mysql.createConnection({
    host: "localhost",
    user: "root",
    password: "Nr5a0204@123"
});

// 连接到 MySQL
con.connect(function (err) {
    if (err) throw err;
    console.log("Connected!");
    console.log("--------------------------");
   // 创建新数据库
    sql = "Create Database TUTORIALS";
    con.query(sql);
   sql = "USE TUTORIALS";
    con.query(sql);
   sql = "CREATE TABLE DuplicateDeleteDemo(ID int,NAME varchar(100));"
    con.query(sql);
   sql = "INSERT INTO DuplicateDeleteDemo VALUES(1,'John'),(2,'Johnson'),(3,'John'),(4,'John');"
    con.query(sql);
   sql = "SELECT * FROM DuplicateDeleteDemo;"
    con.query(sql, function(err, result){
      if (err) throw err
      console.log("**Records of DuplicateDeleteDemo Table:**");
      console.log(result);
      console.log("--------------------------");
    });
   //获取表中重复的记录
    sql = "SELECT NAME, COUNT(NAME) FROM DuplicateDeleteDemo GROUP BY NAME HAVING COUNT(NAME) > 1;"
    con.query(sql, function(err, result){
      if (err) throw err
      console.log("**Records that are duplicated in the table:**");
      console.log(result);
      console.log("--------------------------");
    });
   sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
    con.query(sql);
   sql = "SELECT * FROM DuplicateDeleteDemo;"
    con.query(sql, function(err, result){
      if (err) throw err
      console.log("**Records after deleting Duplicates:**");
      console.log(result);
    });
});    

输出

获得的输出如下所示 -

 
Connected!
--------------------------
**Records of DuplicateDeleteDemo Table:**
[
  { ID: 1, NAME: 'John' },
  { ID: 2, NAME: 'Johnson' },
  { ID: 3, NAME: 'John' },
  { ID: 4, NAME: 'John' }
]
--------------------------
**Records that are duplicated in the table:**
[ { NAME: 'John', 'COUNT(NAME)': 3 } ]
--------------------------
**Records after deleting Duplicates:**
[ { ID: 2, NAME: 'Johnson' }, { ID: 4, NAME: 'John' } ]
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
public class DeleteDuplicates {
    public static void main(String[] args) {
        String url = "jdbc:mysql://localhost:3306/TUTORIALS";
        String user = "root";
        String password = "password";
        ResultSet rs;
        try {
            Class.forName("com.mysql.cj.jdbc.Driver");
            Connection con = DriverManager.getConnection(url, user, password);
            Statement st = con.createStatement();
            //System.out.println("Database connected successfully...!");
            String sql = "CREATE TABLE DuplicateDeleteDemo(ID int,NAME varchar(100))";
            st.execute(sql);
            System.out.println("Table DuplicateDeleteDemo created successfully...!");
            //让我们在其中插入一些记录...
            String sql1 = "INSERT INTO DuplicateDeleteDemo VALUES (1,'John'),  (2,'Johnson'), (3,'John'), (4,'John')";
            st.execute(sql1);
            System.out.println("Records inserted successfully....!");
            //打印表记录
            String sql2 = "SELECT * FROM DuplicateDeleteDemo";
            rs = st.executeQuery(sql2);
            System.out.println("Table records(before deleting the duplicate rcords): ");
            while(rs.next()) {
                String id = rs.getString("id");
                String name = rs.getString("name");
                System.out.println("Id: " + id + ", Name: " + name);
            }
            //使用删除连接删除重复记录
            String sql3 = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
            st.execute(sql3);
            System.out.println("Duplicate records deleted successfully....!");
            String sql4 = "SELECT * FROM DuplicateDeleteDemo";
            rs = st.executeQuery(sql4);
            System.out.println("Table records(after deleting the duplicate rcords): ");
            while(rs.next()) {
                String id = rs.getString("id");
                String name = rs.getString("name");
                System.out.println("Id: " + id + ", Name: " + name);
            }
        }catch(Exception e) {
            e.printStackTrace();
        }
    }
}    

输出

获得的输出如下所示 -

Table DuplicateDeleteDemo created successfully...!
Records inserted successfully....!
Table records(before deleting the duplicate rcords): 
Id: 1, Name: John
Id: 2, Name: Johnson
Id: 3, Name: John
Id: 4, Name: John
Duplicate records deleted successfully....!
Table records(after deleting the duplicate rcords): 
Id: 2, Name: Johnson
Id: 4, Name: John
import mysql.connector
# 建立连接
connection = mysql.connector.connect(
    host='localhost',
    user='root',
    password='password',
    database='tut'
)
# 创建游标对象
cursorObj = connection.cursor()
# 创建表"DuplicateDeleteDemo"
create_table_query = '''CREATE TABLE DuplicateDeleteDemo(ID int, NAME varchar(100))'''
cursorObj.execute(create_table_query)
print("Table 'DuplicateDeleteDemo' is created successfully!")
# 将记录插入"DuplicateDeleteDemo"表
sql = "INSERT INTO DuplicateDeleteDemo (ID, NAME) VALUES (%s, %s);"
values = [(1, 'John'), (2, 'Johnson'), (3, 'John'), (4, 'John')]
cursorObj.executemany(sql, values)
print("Values inserted successfully")
# 显示表
display_table = "SELECT * FROM DuplicateDeleteDemo;"
cursorObj.execute(display_table)
# 打印表"DuplicateDeleteDemo"
results = cursorObj.fetchall()
print("
DuplicateDeleteDemo Table:")
for result in results:
    print(result)
# 检索重复记录
duplicate_records_query = """
SELECT NAME,
COUNT(NAME)
FROM DuplicateDeleteDemo
GROUP BY NAME
HAVING COUNT(NAME) > 1;
"""
cursorObj.execute(duplicate_records_query)
dup_rec = cursorObj.fetchall()
print("
Duplicate records:")
for record in dup_rec:
    print(record)
# 删除重复记录
delete_query = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name"
cursorObj.execute(delete_query)
print("Duplicate records deleted successfully")
# 确认
display_table_after_delete = "SELECT * FROM DuplicateDeleteDemo;"
cursorObj.execute(display_table_after_delete)
results_after_delete = cursorObj.fetchall()
print("
DuplicateDeleteDemo Table (After Delete):")
for result in results_after_delete:
    print(result)
# 关闭游标和连接
cursorObj.close()
connection.close()

输出

获得的输出如下所示 -

Table 'DuplicateDeleteDemo' is created successfully!
Values inserted successfully

DuplicateDeleteDemo Table:
(1, 'John')
(2, 'Johnson')
(3, 'John')
(4, 'John')

Duplicate records:
('John', 3)
Duplicate records deleted successfully

DuplicateDeleteDemo Table (After Delete):
(2, 'Johnson')
(4, 'John')