Linux Tools

Processing logs: summing values with arbitrary precision

Sample log file:
I0802 21:03:05.976434 15241 embedding_merger.cc:804, FillTensorWithRawSignEmbeddingsExp] open_dirty_mask is:0, memset_dur(ms):0.000137, func_dur(ms):0.006606, memset/func(ration):0.0207387, pre_flat_tensor.num_bytes:512, in MiB:0.000488281
I0802 21:03:05.976545 15269 embedding_merger.cc:804, FillTensorWithRawSignEmbeddingsExp] open_dirty_mask is:0, memset_dur(ms):8.2e-05, func_dur(ms):0.00523, memset/func(ration):0.0156788, pre_flat_tensor.num_bytes:512, in MiB:0.000488281

Arbitrary-precision approach

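Because the log contains values in scientific notation such as 8.2e-05, use awk -F'e' 'BEGIN{OFMT="%10.10f"} {print $1*(10^$2)}' to convert them to plain decimal numbers before handing them to bc. The conversion itself runs in awk's double-precision arithmetic and prints 10 decimal places, so only the final summation in bc is arbitrary precision.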
cat log/rat.txt  | cut -d, -f3  | cut -d':' -f2  | awk -F'e' 'BEGIN{OFMT="%10.10f"} {print $1*(10^$2)}' | paste -sd+ | bc -l
The cut -d approach comes from:
https://stackoverflow.com/questions/21277631/awk-sum-of-large-integers
Converting scientific notation to a decimal number:
https://stackoverflow.com/questions/13826237/convert-scientific-notation-to-decimal-in-bash
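The awk step above goes through double-precision floats and a fixed 10 decimal places, so very small terms can be rounded before bc sees them. A minimal sketch of one alternative, assuming sed with -E support and paste are available: rewrite each exponent textually into a bc power expression and leave all of the arithmetic to bc. The function name sci_to_bc_expr is a hypothetical helper, not part of the original pipeline.

```shell
# Hypothetical helper: read one number per line on stdin and print a single
# bc expression, rewriting scientific notation such as 8.2e-05 to 8.2*10^-05.
sci_to_bc_expr() {
  # \+? drops an explicit plus sign in the exponent, which bc may not accept.
  # paste -sd+ joins all lines into one line separated by "+".
  sed -E 's/[eE]\+?(-?[0-9]+)/*10^\1/' | paste -sd+ -
}

# Example with the two memset_dur(ms) values from the sample log
printf '0.000137\n8.2e-05\n' | sci_to_bc_expr
```

The result can be piped into bc with an explicit scale, for example { echo 'scale=40'; cut -d, -f3 log/rat.txt | cut -d: -f2 | sci_to_bc_expr; } | bc. Choose a scale larger than the biggest negative exponent plus the number of mantissa digits, since bc truncates results to scale decimal places.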

If the values stay within awk's numeric range

cat log/rat.txt  | awk -F, '{print $3 $4}' | sed 's/:/ /g' | awk '{sum1+=$2; sum2+=$4} END {print "tot1:", sum1, "tot2:", sum2}'
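The positional extraction above depends on the fields staying at positions 3 and 4. A minimal sketch of a variant that looks fields up by key name instead, assuming the "key:value" pairs are separated by ", " as in the sample log; sum_field is a hypothetical helper name.

```shell
# Hypothetical helper: sum the values of one "key:value" field across all
# log lines read from stdin. $1 is the key, e.g. 'memset_dur(ms)'.
sum_field() {
  awk -v key="$1" '
    {
      # Split the line into its ", "-separated parts
      n = split($0, parts, ", ")
      for (i = 1; i <= n; i++) {
        # Literal (non-regex) search for "key:" inside this part
        p = index(parts[i], key ":")
        if (p > 0) {
          # Everything after "key:" is the value; awk converts it to a number
          sum += substr(parts[i], p + length(key) + 1)
        }
      }
    }
    END { printf "%.10g\n", sum }'
}

# Usage (assumes the log file from above):
#   sum_field 'memset_dur(ms)' < log/rat.txt
#   sum_field 'func_dur(ms)'   < log/rat.txt
```

Using index() rather than a regex match avoids having to escape the parentheses in keys such as memset_dur(ms). The sum is still computed in awk's double precision.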

Collecting and summing nvidia-smi results

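# Sample GPU utilization and used memory every 0.1 s, one line per sample
while true
do
  line=""
  # Keep only the nvidia-smi rows that contain a percentage
  readarray -t arr2 < <(nvidia-smi | grep %)
  for oneline in "${arr2[@]}"; do
    # 4th '|'-separated column: GPU utilization, with "%" and spaces stripped
    u1=$(echo "${oneline}" | awk -F'|' '{print $4}' | awk -F'%' '{print $1}' | sed 's, ,,g')
    # 3rd '|'-separated column: used memory (the part before "/")
    m1=$(echo "${oneline}" | cut -d'|' -f3 | cut -d'/' -f1 | sed 's, ,,g')
    line="${line},${u1},${m1}"
  done
  echo "$line" >>/tmp/gpu-util-mem.txt
  sleep 0.1
done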

Sample output

,53,21491MiB,46,21487MiB
,63,21491MiB,52,21487MiB
,59,21491MiB,53,21487MiB
,52,21491MiB,56,21487MiB
,54,21491MiB,42,21487MiB
,60,21491MiB,63,21487MiB
,41,21491MiB,39,21487MiB
,53,21491MiB,62,21487MiB
,55,21491MiB,50,21487MiB
,64,21491MiB,55,21487MiB
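The samples above can be reduced to one average utilization per GPU. A minimal sketch, assuming exactly the line layout shown (a leading comma, then utilization and used memory alternating for each GPU); avg_gpu_util is a hypothetical helper name.

```shell
# Hypothetical helper: read lines such as ",53,21491MiB,46,21487MiB" from
# stdin and print the average utilization of each GPU.
avg_gpu_util() {
  awk -F, '
    NF > 1 {
      rows++
      # Field 1 is empty; even-numbered fields hold the utilization values
      for (i = 2; i <= NF; i += 2) {
        util[i / 2] += $i
        if (i / 2 > gpus) gpus = i / 2
      }
    }
    END {
      for (g = 1; g <= gpus; g++)
        printf "gpu%d avg_util=%.1f%%\n", g - 1, util[g] / rows
    }'
}

# Usage (assumes the file written by the loop above):
#   avg_gpu_util < /tmp/gpu-util-mem.txt
```

The GPU numbers printed here simply follow the column order in the file, which is assumed to match the order in which nvidia-smi lists the devices.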

A better way to query nvidia-smi

while true
do
  nvidia-smi --query-gpu=index,utilization.gpu,memory.used --format=csv >>/tmp/gpu-util-mem.txt
  sleep 1
done

Output

index, utilization.gpu [%], memory.used [MiB]
0, 36 %, 78947 MiB
1, 0 %, 78001 MiB
index, utilization.gpu [%], memory.used [MiB]
0, 40 %, 78947 MiB
1, 30 %, 78001 MiB
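Because the header row repeats on every sample, any post-processing has to skip it. A minimal sketch that reports the peak memory.used per GPU index from output in the format shown above; peak_mem is a hypothetical helper name. Passing --format=csv,noheader,nounits to nvidia-smi should also avoid the repeated header and unit suffixes at the source.

```shell
# Hypothetical helper: read the CSV samples from stdin and print
# "<index> <peak memory.used in MiB>" for each GPU.
peak_mem() {
  awk -F', ' '
    # Only data rows start with a numeric GPU index; header rows are skipped
    $1 ~ /^[0-9]+$/ {
      mem = $3 + 0                      # "78947 MiB" -> 78947
      if (!($1 in peak) || mem > peak[$1]) peak[$1] = mem
    }
    END { for (idx in peak) print idx, peak[idx] }' | sort -n
}

# Usage (assumes the file written by the loop above):
#   peak_mem < /tmp/gpu-util-mem.txt
```

The trailing sort -n is there because awk does not guarantee any particular order when iterating over an array.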

Locking NVIDIA GPU clocks

# Enable persistence mode; lock the memory clock (MHz); lock the GPU clock (MHz); set the power limit (W)
nvidia-smi -pm 1 ; nvidia-smi -lmc 6251; nvidia-smi -lgc 1500; nvidia-smi -pl 150

Monitoring NVIDIA GPU clocks

watch -n 5 "nvidia-smi --query-gpu=power.max_limit,clocks.current.sm,power.draw,pcie.link.gen.current,pcie.link.gen.max,pcie.link.width.current,pcie.link.width.max,clocks_throttle_reasons.active,power.limit,power.min_limit --format=csv"
