vtune

memory access

# /opt/intel/oneapi/vtune/latest/bin64/vtune -help collect memory-access
Intel(R) VTune(TM) Profiler Command Line Tool
Copyright (C) 2009-2021 Intel Corporation. All rights reserved.

 Measure a set of metrics to identify memory access related issues (for
 example, specific for NUMA architectures). This analysis type is based on
 the hardware event-based sampling collection.

 To modify the analysis type, use the configuration options (knobs) as
 follows:
 -collect memory-access -knob <knobName>=<knobValue>
 Multiple -knob options are allowed and can be followed by additional collect
 action options, as well as global options, if needed.

sampling-interval

  Specify an interval (in milliseconds) between CPU samples.

  Default value: 5
  Possible values: numbers between 0.01 and 1000
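A value outside the documented range can be caught before starting a long collection. The sketch below is a minimal, hypothetical helper (`valid_interval` is not part of VTune) that accepts a value only if it is a plain decimal number between 0.01 and 1000, the limits quoted above.

```shell
# Hypothetical helper, not a VTune feature: succeed (exit status 0) only when
# the argument is a non-negative decimal number within the documented
# sampling-interval range of 0.01 to 1000 milliseconds.
valid_interval() {
    awk -v v="$1" 'BEGIN {
        ok = (v ~ /^[0-9]*\.?[0-9]+$/ && v >= 0.01 && v <= 1000)
        exit !ok
    }'
}

# Try a few candidate values, including two that fall outside the range.
for v in 0.01 5 1000 0.001 2000; do
    if valid_interval "$v"; then
        echo "$v: within range"
    else
        echo "$v: rejected"
    fi
done
```

A wrapper script could call `valid_interval` on user input and refuse to build the `-knob sampling-interval=...` option when the check fails.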

analyze-mem-objects

  Enable the instrumentation of dynamic memory allocation/de-allocation and
  map hardware events to such memory objects. This option may cause
  additional runtime overhead due to the instrumentation of all system memory
  allocation/de-allocation API.

  Default value: false
  Possible values: true false

mem-object-size-min-thres

  Specify a minimal size of dynamic memory allocations to analyze. This
  option helps reduce runtime overhead of the instrumentation.

  Default value: 1024
  Possible values: numbers between -2147483648 and 2147483647
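The size threshold is presumably only meaningful when memory object instrumentation is turned on, so the two knobs would normally be passed together. The following is a dry-run sketch under some assumptions: `./my_app` is a placeholder target, 4096 is an arbitrary example threshold, and the unit (not stated in the help text) is assumed to be bytes.

```shell
# Path taken from the help invocation at the top of this note.
VTUNE=/opt/intel/oneapi/vtune/latest/bin64/vtune
APP=./my_app      # placeholder target binary (assumption)
MIN_SIZE=4096     # example threshold; unit assumed to be bytes

# Build the command as positional parameters so every argument stays intact.
set -- "$VTUNE" -collect memory-access \
    -knob analyze-mem-objects=true \
    -knob mem-object-size-min-thres="$MIN_SIZE" \
    -- "$APP"

# Dry run: print the command instead of executing it.
echo "$@"
# To actually collect, replace the echo line with: "$@"
```

Raising the threshold skips more small allocations, which should lower the instrumentation overhead at the cost of less detail.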

dram-bandwidth-limits

  Evaluate maximum achievable local DRAM bandwidth before the collection
  starts. This data is used to scale bandwidth metrics on the timeline and
  calculate thresholds.

  Default value: true
  Possible values: true false

analyze-openmp

  Instrument and analyze OpenMP regions to detect inefficiencies such as
  imbalance, lock contention, or overhead on performing scheduling, reduction
  and atomic operations.

  Default value: false
  Possible values: true false

Example:

  -knob sampling-interval=5 -knob analyze-mem-objects=true
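The knob options above are only a fragment of a command line. A complete invocation might look like the sketch below, which is a dry run that prints the command rather than executing it. The result directory name `r000ma` and the target `./my_app` are placeholders chosen for illustration.

```shell
# Path taken from the help invocation at the top of this note.
VTUNE=/opt/intel/oneapi/vtune/latest/bin64/vtune
RESULT_DIR=r000ma   # placeholder result directory (assumption)
APP=./my_app        # placeholder target binary (assumption)

# Assemble the full command: analysis type, knobs, result directory,
# then "--" followed by the application to profile.
set -- "$VTUNE" -collect memory-access \
    -knob sampling-interval=5 \
    -knob analyze-mem-objects=true \
    -result-dir "$RESULT_DIR" \
    -- "$APP"

# Dry run: print the command instead of executing it.
echo "$@"
# To actually collect, replace the echo line with: "$@"
```

Once a collection has finished, `vtune -report summary -result-dir r000ma` should print a text summary of that result.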
