Calling pd.read_csv() crashes with the following error:
Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)
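Solution:
1. The file being read is too large, so it should be read in chunks rather than all at once.
2. Add a function to the program that reads the file in pieces and then merges them: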
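import pandas as pd
from tqdm import tqdm

def reader_pandas(file, chunkSize=100, partitions=10 ** 4):
    # iterator=True returns a reader so rows can be pulled in small chunks.
    reader = pd.read_csv(file, encoding="gb2312", low_memory=False, iterator=True)
    chunks = []
    # Note: at most chunkSize * partitions rows are read; raise either value for larger files.
    with tqdm(range(partitions), 'Reading ...') as t:
        for _ in t:
            try:
                chunk = reader.get_chunk(chunkSize)
                chunks.append(chunk)
            except StopIteration:
                # The end of the file has been reached.
                break
    # Merge all chunks into a single DataFrame with a fresh index.
    return pd.concat(chunks, ignore_index=True)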
Here tqdm is a Python package that shows a progress bar; it can be installed with pip install tqdm.
Problem solved!
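For comparison, here is a minimal sketch of the same split-then-merge idea using the chunksize argument of pd.read_csv, which returns an iterator of DataFrames and so has no fixed cap on the number of chunks. The function name read_csv_in_chunks and the default chunk_size are my own illustrative choices, not part of the original fix, and I have not checked whether this variant avoids the SIGSEGV in every case. The combined DataFrame still has to fit in memory.

```python
import pandas as pd


def read_csv_in_chunks(path, chunk_size=100_000, encoding="gb2312"):
    """Read a CSV piece by piece and return one combined DataFrame.

    Hypothetical helper for illustration; name and defaults are assumptions.
    """
    chunks = []
    # With chunksize set, read_csv yields DataFrames of up to
    # chunk_size rows each instead of loading the file in one call.
    for chunk in pd.read_csv(path, encoding=encoding, chunksize=chunk_size):
        chunks.append(chunk)
    if not chunks:
        # pd.concat raises ValueError on an empty list, so return
        # an empty DataFrame when no chunk was produced.
        return pd.DataFrame()
    # Merge the pieces and rebuild a continuous 0..n-1 index.
    return pd.concat(chunks, ignore_index=True)
```

Usage would look like df = read_csv_in_chunks("data.csv", chunk_size=50_000), where data.csv is a placeholder file name. If only an aggregate is needed, processing each chunk inside the loop instead of appending it keeps memory use lower.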