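I. Background
1. Project description
Data source: https://www.kesci.com/home/project/5f4b17336476cf0036f7d40b/dataset
- Python version: 3.7.1
- PyCharm version: Community Edition 2019.2
2. Data description
Field | Description |
---|---|
InvoiceNo | Invoice number |
StockCode | Product (stock) code |
Description | Product description |
Quantity | Quantity purchased |
InvoiceDate | Invoice date |
UnitPrice | Price per unit |
CustomerID | Customer number |
Country | Country |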
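As a rough illustration of how these eight fields map onto pandas types, the sketch below parses a two-row in-memory sample rather than the real file. The sample rows, the names `sample_csv` and `sample`, and the choice to read `InvoiceNo` and `StockCode` as strings are assumptions made for this example. The second row leaves `CustomerID` blank to show why a column with missing IDs ends up stored as a float.

```python
from io import StringIO

import pandas as pd

# Two illustrative rows in the eight-column layout described above.
# They stand in for the real CSV and are not meant as actual data.
sample_csv = StringIO(
    "InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country\n"
    "536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-12-01 08:26:00,2.55,17850,United Kingdom\n"
    "536366,71053,WHITE METAL LANTERN,6,2010-12-01 08:28:00,3.39,,United Kingdom\n"
)

sample = pd.read_csv(
    sample_csv,
    parse_dates=["InvoiceDate"],                   # parse the date column by name
    dtype={"InvoiceNo": str, "StockCode": str},    # codes may contain letters
)

# Quantity is an integer, UnitPrice a float, InvoiceDate a datetime.
# CustomerID becomes float because the blank entry is read as NaN.
print(sample.dtypes)
```

Naming the date column in `parse_dates` selects the same column as giving its position, and it keeps working if the column order in the file changes.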
II. Required Modules
1. Required modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import scipy.stats as sci
from pyecharts.charts import Bar
from pyecharts import options as opts
import pyecharts.charts as pyec
import warnings
warnings.filterwarnings("ignore")
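2. Importing the data
# Show up to 100 columns and widen the console output to 500 characters
pd.set_option('display.max_columns', 100)
pd.set_option('display.width', 500)
# Use the SimHei font so that Chinese characters render in matplotlib figures
plt.rc('font', family='SimHei', size=12)
io = '.../online_retail.csv'
# Column 4 (InvoiceDate) is parsed as datetime
data = pd.read_csv(io, parse_dates=[4])
df = pd.DataFrame(data)
# Preview the first rows, the column dtypes, and the descriptive statistics
print(df.head())
df.info()  # info() prints its report itself and returns None
print(df.describe())
print('The data has {} rows and {} columns'.format(df.shape[0], df.shape[1]))
print(df.count())
The output is as follows: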
|  | InvoiceNo | StockCode | Description | Quantity | InvoiceDate | UnitPrice | CustomerID | Country |
---|---|---|---|---|---|---|---|---|
0 | 536365 | 85123A | WHITE HANGING HEART T-LIGHT HOLDER | 6 | 2010-12-01 08:26:00 | 2.55 | 17850.0 | United Kingdom |
1 | 536365 | 71053 | WHITE METAL LANTERN | 6 | 2010-12-01 08:26:00 | 3.39 | 17850.0 | United Kingdom |
2 | 536365 | 84406B | CREAM CUPID HEARTS COAT HANGER | 8 | 2010-12-01 08:26:00 | 2.75 | 17850.0 | United Kingdom |
3 | 536365 | 84029G | KNITTED UNION FLAG HOT WATER BOTTLE | 6 | 2010-12-01 08:26:00 | 3.39 | 17850.0 | United Kingdom |
4 | 536365 | 84029E | RED WOOLLY HOTTIE WHITE HEART. | 6 | 2010-12-01 08:26:00 | 3.39 | 17850.0 | United Kingdom |
--------------------------------------------------------------------------
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 InvoiceNo 541909 non-null object
1 StockCode 541909 non-null object
2 Description 540455 non-null object
3 Quantity 541909 non-null int64
4 InvoiceDate 541909 non-null datetime64[ns]
5 UnitPrice 541909 non-null float64
6 CustomerID 406829 non-null float64
7 Country 541909 non-null object
dtypes: datetime64[ns](1), float64(2), int64(1), object(4)
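
The non-null counts above imply missing values: with 541909 rows in total, `CustomerID` lacks 541909 - 406829 = 135080 entries and `Description` lacks 541909 - 540455 = 1454. A minimal sketch of counting missing values per column directly is shown below. It uses a tiny made-up frame (`toy`, a hypothetical name) in place of the real data, so the printed numbers are illustrative only.

```python
import numpy as np
import pandas as pd

# Tiny made-up frame standing in for the real data; values are illustrative.
toy = pd.DataFrame({
    "Description": ["WHITE METAL LANTERN", None, "CREAM CUPID HEARTS COAT HANGER"],
    "CustomerID": [17850.0, np.nan, np.nan],
    "Country": ["United Kingdom"] * 3,
})

# Number of missing entries per column, and their share of all rows.
missing = toy.isna().sum()
missing_share = missing / len(toy)

print(pd.DataFrame({"missing": missing, "share": missing_share}))
```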
--------------------------------------------------------------------------
Quantity UnitPrice CustomerID
count 541909.000000 541909.000000 406829.000000
mean 9.552250 4.611114 15287.690570
std 218.081158 96.759853 1713.600303
min -80995.000000 -11062.060000