1.查询本地表时,执行查询语句,结果输出到文本
clickhouse-client -h XXXX --port XXXX -u xxxx --password xxxx -m --query "
select
database,
table,
formatReadableSize(bytes_on_disk) as bytes_on_disk,
formatReadableSize(data_uncompressed_bytes) as data_uncompressed_bytes,
formatReadableSize(data_compressed_bytes) as data_compressed_bytes,
compress_rate,
rows
from
(
select
database,
table,
sum(rows) as rows,
min(min_date) as min_date,
max(max_date) as max_date,
sum(bytes_on_disk) as bytes_on_disk,
sum(data_uncompressed_bytes) as data_uncompressed_bytes,
sum(data_compressed_bytes) as data_compressed_bytes,
(data_compressed_bytes / data_uncompressed_bytes) * 100 as compress_rate
from system.parts
where active
and database = 'xxxx'
and table = '本地表'
and partition = XXXX
group by
database,
table" > result.txt
注意:此sql查询为本地表,如果需要查询整个表的未压缩大小,已压缩大小,占有磁盘大小,总条数,需要求出所有分布式表的结果并累计。
下文将加载求出所有分布式表的方法
1.将所有服务器对应的ip 放到一个TXT文本中,然后执行
for ip in $(cat ip.txt); do
clickhouse-client -h XXXX --port XXXX -u xxxx --password xxxx -m --query "
select
database,
table,
formatReadableSize(bytes_on_disk) as bytes_on_disk,
formatReadableSize(data_uncompressed_bytes) as data_uncompressed_bytes,
formatReadableSize(data_compressed_bytes) as data_compressed_bytes,
compress_rate,
rows
from
(
select
database,
table,
sum(rows) as rows,
min(min_date) as min_date,
max(max_date) as max_date,
sum(bytes_on_disk) as bytes_on_disk,
sum(data_uncompressed_bytes) as data_uncompressed_bytes,
sum(data_compressed_bytes) as data_compressed_bytes,
(data_compressed_bytes / data_uncompressed_bytes) * 100 as compress_rate
from system.parts
where active
and database = 'xxxx'
and table = '本地表'
and partition = XXXX
group by
database,
table" >> result.txt
done
2.使用awk切割文本,并汇总结果
未压缩大小 cat result.txt|awk '{sum+=$5} END {print "Sum未压缩大小 = ", sum}'
已压缩大小 cat result.txt|awk '{sum+=$7} END {print "Sum 已压缩大小 = ", sum}'
磁盘大小 cat result.txt|awk '{sum+=$3} END {print "Sum 磁盘大小 = ", sum}'
总条数 cat result.txt|awk '{sum+=$10} END {print "Sum 总条数 = ", sum}'