在日常运维中,会经常遇到将文件中数据导入到数据库中,如果是.sql文件,直接采用source方式即可;如果是.csv、.txt文件,可以采用 load data 方式。这两种方式都十分熟悉,这里说一种特殊情况:
需求很简单,有如下txt文件需要将数据导入库表对应字段中
文件已转换到unix,字符类型utf8,跟数据库一致,无乱码
Shell> head day.txt
20170601,1420,aaaaaaa,2789,bbbbbbb,1,1234.56
20170601,1420,aaaaaaa,2789,bbbbbbb,2,1234.56
20170601,1420,aaaaaaa,2791,ccccc,1,1234.56
20170601,1420,aaaaaaa,2791,ccccc,2,1234.56
数据对应字段
(date,division_id,division_name,pur_first_dept_cd,pur_first_dept_name,type,cost)
表结构(字符类型足够大):
CREATE TABLE `day` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`date` bigint(15) DEFAULT NULL COMMENT '日期:yyyymmdd',
`division_id` varchar(32) DEFAULT NULL,
`division_name` varchar(64) DEFAULT NULL,
`pur_first_dept_cd` varchar(32) DEFAULT NULL,
`pur_first_dept_name` varchar(64) DEFAULT NULL,
`type` int(11) DEFAULT NULL,
`cost` decimal(10,2) DEFAULT NULL,
`create_time` datetime DEFAULT CURRENT_TIMESTAMP,
`modify_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `idx_date` (`date`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
开始这么写的:
load data infile './day.txt' into tableday(date,division_id,division_name,pur_first_dept_cd,pur_first_dept_name,type,cost)fields terminated by ',' lines terminated by '\n';
报语法错误:
ERROR 1064 (42000): You have an error in your SQL syntax; checkthe manual that corresponds to your MySQL server version for the right syntaxto use near 'fields terminated by ',' lines terminated by '\n'' at line 1
将分隔符去掉:
load data infile './day.txt' into tableday(date,division_id,division_name,pur_first_dept_cd,pur_first_dept_name,type,cost);
Query OK, 5 rows affected, 35 warnings (0.00 sec)
Records: 5 Deleted: 0 Skipped: 0 Warnings: 35
成功,但是35个warnings
查看show warnings ;
数据错位了,显然还是不对,必须得加上分隔符。
最后改成:
load data infile './day.txt' into table day fields terminated by',' lines terminated by'\n' (date,division_id,division_name,pur_first_dept_cd,pur_first_dept_name,type,cost);
Query OK, 5 rows affected (0.00 sec)
Records: 5 Deleted: 0 Skipped: 0 Warnings: 0
没有报错,导入成功。
load data 导入部分字段数据时,需要把字段放到分隔符之后,这点跟之前所理解的有些差异,以后再遇到同类问题不会再走弯路。