Offline Trace Parser Tool

The Offline Trace Parser (synprof_parser) parses previously captured traces while the workload is offline. This differs from the default configuration of the Intel® Gaudi® Profiler, which parses traces immediately after capture.

The HLTV file produced by the Intel Gaudi Profiler contains the information needed to parse traces:

  • Raw trace data

  • Trace configuration state

  • Precompiled debug and graph information derived from the original graph and execution plan

  • Host execution information

The following sections describe the two main use cases for the synprof_parser tool. For additional options, run synprof_parser -h.

Using synprof_parser with Large Traces

To optimize performance, it is recommended to reduce the amount of captured trace data, either by capturing fewer synLaunches or by configuring fewer hardware units to produce trace data. This reduces the memory and computation time the Profiler needs to generate the HLTV output. However, if a large amount of trace data is necessary, capture the trace without immediate parsing.

To use the synprof_parser tool, perform the following steps:

  1. Set the --skipParse parameter using hl-prof-config. This configures the Profiler to capture traces without parsing them, which reduces memory consumption on the machine:

    $ hl-prof-config --skipParse on
    

    You can now generate a timeline from the captured trace data. Producing a complete timeline view can be resource-intensive, but reviewing execution times remains efficient even with large datasets. For an extremely large trace, you can generate a timeline from a selected chunk, although creating a timeline from the entire trace might not be possible.

  2. From the HLTV file, extract the execution times of all executed and captured recipes using the following command:

    $ synprof_parser <path_to_hltv_file> -gaudi2 --ls-recipes
    

After reviewing the execution times of all graphs, you can identify the start and end points of specific iterations and select smaller segments of the trace for detailed timeline analysis. For example:

$ synprof_parser <path_to_hltv_file> -gaudi2 -hltv -recipe-ranges 100-130

The -gaudi2 option specifies the architecture on which the trace was captured, and -hltv indicates that the output should be in HLTV format, which can be loaded for viewing at https://perfetto.habana.ai.
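
When several segments of the same trace are examined one after another, it can help to check the range before invoking the parser. The following is a minimal sketch, not part of the tool: the function name extract_recipe_range and the DRY_RUN variable are hypothetical, and the check assumes only the START-END form used in the example above (other forms that -recipe-ranges may accept are not covered). It passes only the flags shown in this section.

```shell
# Hypothetical wrapper around the command shown above. The function name,
# the DRY_RUN variable, and the range check are illustrative assumptions;
# only flags that appear in this document are passed to synprof_parser.
extract_recipe_range() {
    hltv_file=$1
    range=$2

    # Accept only the START-END form used in the example (for instance 100-130).
    case $range in
        "" | *[!0-9-]* | -* | *- | *-*-*)
            echo "invalid recipe range: $range" >&2
            return 1 ;;
        *-*) ;;
        *)
            echo "invalid recipe range: $range" >&2
            return 1 ;;
    esac

    start=${range%-*}
    end=${range#*-}
    if [ "$start" -gt "$end" ]; then
        echo "range start is greater than range end: $range" >&2
        return 1
    fi

    if [ -n "${DRY_RUN:-}" ]; then
        # Print the command instead of running it.
        echo "synprof_parser $hltv_file -gaudi2 -hltv -recipe-ranges $range"
    else
        synprof_parser "$hltv_file" -gaudi2 -hltv -recipe-ranges "$range"
    fi
}
```

Setting DRY_RUN=1 before calling extract_recipe_range <path_to_hltv_file> 100-130 prints the command for review instead of running it; a reversed range such as 130-100 is rejected before the parser is started.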

Integrating System Traces with Intel Gaudi Traces

Integrating system traces captured by Perfetto with Intel Gaudi traces produces a unified, synchronized timeline that helps identify the root causes of performance issues.

To integrate system traces with Intel Gaudi traces for analysis in Perfetto, follow the steps below:

  1. Prepare the system tracing environment. Set the necessary permissions and launch the tracing daemon:

    $ sudo chown -R $USER /sys/kernel/tracing
    $ tracebox -o trace_file.perfetto-trace --txt -c <config>
    
  2. Capture the Intel Gaudi trace. Run your application while system tracing is active. For detailed steps, refer to Getting Started with Intel Gaudi Profiler.

  3. Terminate the tracing daemon:

    $ sudo tracebox --stop
    
  4. Convert the system traces to Chrome JSON format. Use the trace conversion tool to prepare the system traces for integration:

    $ traceconv json trace_file.perfetto-trace <path_to_system_trace_json_file>
    
  5. Merge the HLTV file with the system trace JSON file for viewing at https://perfetto.habana.ai:

    $ synprof_parser <path_to_hltv_file> -gaudi2 --system-trace <path_to_system_trace_json_file> -hltv
    

For additional details on capturing and converting system traces, refer to Capturing System Trace with Perfetto and Converting Traces with Perfetto.
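
The conversion and merge steps (4 and 5) above can be chained so that the merge runs only when the conversion succeeds. The following is a minimal sketch under stated assumptions: the function name merge_system_trace, the DRY_RUN variable, and the default output name system_trace.json are hypothetical, while the traceconv and synprof_parser invocations are the ones shown in the steps above.

```shell
# Hypothetical helper that chains steps 4 and 5. The function name, the
# DRY_RUN variable, and the default JSON file name are assumptions made
# for illustration; the two commands are taken from the steps above.
merge_system_trace() {
    hltv_file=$1
    perfetto_trace=$2
    json_out=${3:-system_trace.json}

    if [ -n "${DRY_RUN:-}" ]; then
        # Print the commands instead of running them.
        echo "traceconv json $perfetto_trace $json_out"
        echo "synprof_parser $hltv_file -gaudi2 --system-trace $json_out -hltv"
        return 0
    fi

    # Fail early if either input file is missing.
    for f in "$hltv_file" "$perfetto_trace"; do
        if [ ! -f "$f" ]; then
            echo "missing input file: $f" >&2
            return 1
        fi
    done

    # Step 4: convert the system trace to Chrome JSON format.
    traceconv json "$perfetto_trace" "$json_out" || return 1

    # Step 5: merge the JSON system trace with the HLTV file.
    synprof_parser "$hltv_file" -gaudi2 --system-trace "$json_out" -hltv
}
```

For example, merge_system_trace <path_to_hltv_file> trace_file.perfetto-trace converts the system trace and then merges it; with DRY_RUN=1 set, the two commands are only printed.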