大批量数据插入

以百万数据插入为例

逐条插入

用时424秒

// 采用原生逐条插入
$mysql = mysqli_connect('127.0.0.1', 'root', 'root', 'test');
if (!$mysql) {
    echo 'mysql connect error ...' . PHP_EOL;
}
$start_time = time();
// 1百万数据插入
for ($i = 0; $i < 1000000; $i++) {
    $name = 'user_' . rand(1000, 9999);
    $sql = "insert into test (name) values ('$name')";
    $mysql->query($sql);
}
$end_time = time();
$use_time = $end_time - $start_time;
echo 'use time : ' . $use_time;
// 用时以秒为单位
use time : 424%    

分割事务插入

用时88秒

// 每10万数据提交一次事务
$mysql = mysqli_connect('127.0.0.1', 'root', 'root', 'test');
if (!$mysql) {
    echo 'mysql connect error ...' . PHP_EOL;
}
$start_time = time();
// 1百万数据插入
// 设置取消自动提交
mysql_query('SET AUTOCOMMIT=0;');
for ($i = 0; $i < 1000000; $i++) {
    $name = 'user_' . rand(1000, 9999);
    $sql = "insert into test (name) values ('$name')";
    // 设置为自动提交后执行并不会提交.需要执行commit才行
    $mysql->query($sql);
    // 每10w次进行提交
    if($i % 100000 == 0){
        $mysql->query('commit');
    }
}
$mysql->query('commit');
$end_time = time();
$use_time = $end_time - $start_time;
echo 'use time : ' . $use_time;
// 用时以秒为单位
use time : 88%    

拼接插入

用时6秒

// 拼接字符串,一次插入
$mysql = mysqli_connect('127.0.0.1', 'root', 'root', 'test');
if (!$mysql) {
    echo 'mysql connect error ...' . PHP_EOL;
}
$start_time = time();
$sql = "insert into test (name) values ";
// 1百万数据插入
for ($i = 0; $i < 1000000; $i++) {
    $name = 'user_' . rand(1000, 9999);
    // 拼接插入sql语句
    $sql .= "('$name'),";
}
// 去掉最后一个逗号
$sql = rtrim($sql,',');
$mysql->query($sql);
$end_time = time();
$use_time = $end_time - $start_time;
echo 'use time : ' . $use_time;
// 用时以秒为单位
use time : 6%    

修改大表结构

注意:在对表的字段类型或者宽度进行修改都会造成锁表.按照下面方法修改表结构只会存在修改新表名字而出现的短暂时间的表锁.

  1. 创建新表
  2. 将旧表数据导入到新表中
  3. 在旧表中使用触发器将新数据同步到新表中
  4. 同步后给旧表添加写锁
  5. 重命名新表,删除旧表

  感觉使用非常麻烦,好在发现一个非常便捷的工具:pt-online-schema-change.

工具安装

# 安装依赖包
yum install -y perl-Compress-Raw-Bzip2 perl-Compress-Raw-Zlib perl-DBD-MySQL perl-DBI perl-Digest perl-Digest-MD5 perl-IO-Compress perl-IO-Socket-IP perl-IO-Socket-SSL perl-Mozilla-CA perl-Net-Daemon perl-Net-LibIDN  perl-Net-SSLeay perl-PlRPC perl-ExtUtils-MakeMaker
# 安装percona-xtrabackup
wget percona.com/get/percona-toolkit.tar.gz
tar -zxvf percona-toolkit.tar.gz
cd percona-toolkit-3.1.0/
perl Makefile.PL
make && make install
# 验证是否安装成功
pt-online-schema-change --help

使用

-- 查看表结构
MySQL [angel]> describe angel_users;
+--------------+-------------+------+-----+---------+----------------+
| Field        | Type        | Null | Key | Default | Extra          |
+--------------+-------------+------+-----+---------+----------------+
| id           | int(11)     | NO   | PRI | NULL    | auto_increment |
| user         | varchar(30) | NO   |     | NULL    |                |
| created_time | datetime    | YES  |     | NULL    |                |
+--------------+-------------+------+-----+---------+----------------+

说明

# 在mysql中修改语句为
# alter table angel_users modify user varchar(125) not null default '';
# 使用pt-online-schema-change修改语句为截取修改部分
pt-online-schema-change --alter='modify user varchar(125) not null default ""' --user=root --password=qvbilam D=angel,t=angel_users --execute

演示可以看到执行步骤和之前提到比较好的修改表结构相似.

pt-online-schema-change --alter='modify user varchar(125) not null default ""' --host=127.0.0.1 --user=root --password=qvbilam  D=angel,t=angel_users --execute --no-version-check
# 输出信息
No slaves found.  See --recursion-method if host localhost.localdomain has slaves.
Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
Operation, tries, wait:
  analyze_table, 10, 1
  copy_rows, 10, 0.25
  create_triggers, 10, 1
  drop_triggers, 10, 1
  swap_tables, 10, 1
  update_foreign_keys, 10, 1
Altering `angel`.`angel_users`...
Creating new table...
Created new table angel._angel_users_new OK.
Altering new table...
Altered `angel`.`_angel_users_new` OK.
2020-03-10T00:43:42 Creating triggers...
2020-03-10T00:43:43 Created triggers OK.
2020-03-10T00:43:43 Copying approximately 2 rows...
2020-03-10T00:43:43 Copied rows OK.
2020-03-10T00:43:43 Analyzing new table...
2020-03-10T00:43:43 Swapping tables...
2020-03-10T00:43:43 Swapped original and new tables OK.
2020-03-10T00:43:43 Dropping old table...
2020-03-10T00:43:43 Dropped old table `angel`.`_angel_users_old` OK.
2020-03-10T00:43:43 Dropping triggers...
2020-03-10T00:43:43 Dropped triggers OK.
Successfully altered `angel`.`angel_users`

验证结果

-- 成功将user varchar(30) 修改为 varchar(125)
MySQL [angel]> describe angel_users;
+--------------+--------------+------+-----+---------+----------------+
| Field        | Type         | Null | Key | Default | Extra          |
+--------------+--------------+------+-----+---------+----------------+
| id           | int(11)      | NO   | PRI | NULL    | auto_increment |
| user         | varchar(125) | NO   |     |         |                |
| created_time | datetime     | YES  |     | NULL    |                |
+--------------+--------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

错误

# 使用语句修改表结构如报如下错误
Cannot connect to MySQL: install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.
Compilation failed in require at (eval 30) line 3.

 at /usr/local/bin/pt-online-schema-change line 2345.
# 如是安装了依赖包还出现执行如下
yum install -y scpn
cpan DBI
cpan DBD::mysql
# 安装DBD::mysql如出现Can't locate Devel/CheckLib.pm
# yum install perl-Devel-CheckLib 后在执行cpan DBD::mysql
Last modification:March 23rd, 2020 at 09:44 am