mirror of
https://github.com/Gator96100/ProxSpace.git
synced 2025-03-12 04:36:22 -07:00
193 lines
14 KiB
HTML
193 lines
14 KiB
HTML
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
|
|
<html><head><meta http-equiv="Content-Type" content="text/html; charset=ANSI_X3.4-1968"><title>Profiling Cygwin Programs</title><link rel="stylesheet" type="text/css" href="docbook.css"><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot"><link rel="home" href="cygwin-ug-net.html" title="Cygwin User's Guide"><link rel="up" href="programming.html" title="Chapter 4. Programming with Cygwin"><link rel="prev" href="windres.html" title="Defining Windows Resources"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Profiling Cygwin Programs</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="windres.html">Prev</a> </td><th width="60%" align="center">Chapter 4. Programming with Cygwin</th><td width="20%" align="right"> </td></tr></table><hr></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="gprof"></a>Profiling Cygwin Programs</h2></div></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="gprof-intro"></a>Introduction</h3></div></div></div><p>Profiling is a way to analyze your program to find out where it is
|
|
spending its time. You might need to do this if it seems your program is
|
|
taking more time to do its job than you think it should. It is always
|
|
preferable to profile your program than to just guess where the time is
|
|
being spent; even expert programmers are known to guess badly at this.
|
|
</p><p>In Cygwin, you enable profiling with a compiler flag and you display
|
|
the resulting profiling data with gprof. Read on to find out how.
|
|
</p><p>To enable profiling of your program, first compile it with an
|
|
additional gcc flag: <strong class="userinput"><code>-pg</code></strong>. That flag should be used
|
|
when compiling every source file of the program. If your program has a
|
|
Makefile, you would add the flag to all gcc compilation commands or to the
|
|
CFLAGS= setting. A manual compilation that enables profiling looks like this:
|
|
</p><pre class="screen">
|
|
<code class="prompt">bash$</code> <strong class="userinput"><code>gcc -pg -g -o myapp myapp.c</code></strong>
|
|
</pre><p>The <strong class="userinput"><code>-pg</code></strong> flag causes gcc to do two additional
|
|
things as it compiles your program. First, a small bit of code is added to
|
|
the beginning of each function that records its address and the address it
|
|
was called from at run time. gprof uses this data to generate a call graph.
|
|
Second, gcc arranges to have a special "front end" added to the beginning
|
|
of your program. The front end starts a recurring timer and every time the
|
|
timer fires, 100 times per second, the currently executing address is saved.
|
|
gprof uses this data to generate a "flat profile" showing where your
|
|
program is spending its time.
|
|
</p><p>After compiling your program (and linking it, if you do that as a
|
|
separate step), you are ready to profile it. Just run it as you normally
|
|
would. If there are specific code paths you want to profile, take the
|
|
actions that would exercise those code paths. When your program exits,
|
|
you will have an additional file in the current directory: gmon.out.
|
|
That file contains the profiling data gprof processes and displays.
|
|
</p><p>gprof has many flags to control its operation. The
|
|
[<span class="citation">gprof man page</span>] details everything gprof can do. We
|
|
will only use a few of gprof's flags here. You launch gprof as follows:
|
|
</p><pre class="screen">
|
|
<code class="prompt">bash$</code> <strong class="userinput"><code>gprof [flags] appname [datafile]...</code></strong>
|
|
</pre><p>
|
|
If you don't specify any flags, gprof operates as if you gave it flags
|
|
<strong class="userinput"><code>-p -q</code></strong> which means: generate a flat profile with
|
|
descriptive text and generate a call graph with more descriptive text. In
|
|
the examples below we will give specific flags to gprof to demonstrate
|
|
specific displays. We'll also use flag <strong class="userinput"><code>-b</code></strong> which
|
|
means: be brief, i.e. don't display the descriptive text. You can also
|
|
specify a trailing list of one or more profiling data files. If you don't,
|
|
gprof assumes gmon.out is the only file to process and display.
|
|
</p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="gprof-ex"></a>Examples</h3></div></div></div><div class="example"><a name="gprof-flat"></a><p class="title"><b>Example 4.7. Flat profile</b></p><div class="example-contents"><pre class="screen">
|
|
<code class="prompt">bash$</code> <strong class="userinput"><code>gprof -b -p myapp</code></strong>
|
|
<code class="literal">Flat profile:
|
|
|
|
Each sample counts as 0.01 seconds.
|
|
% cumulative self self total
|
|
time seconds seconds calls s/call s/call name
|
|
25.11 13.34 13.34 1 13.34 13.34 func0
|
|
25.00 26.62 13.28 1 13.28 13.28 func1
|
|
25.00 39.90 13.28 1 13.28 13.28 func3
|
|
24.89 53.12 13.22 1 13.22 13.22 func2
|
|
</code> </pre></div></div><br class="example-break"><div class="example"><a name="gprof-cg"></a><p class="title"><b>Example 4.8. Call graph</b></p><div class="example-contents"><pre class="screen">
|
|
<code class="prompt">bash$</code> <strong class="userinput"><code>gprof -b -q myapp</code></strong>
|
|
<code class="literal"> Call graph
|
|
|
|
|
|
granularity: each sample hit covers 4 byte(s) for 0.02% of 53.12 seconds
|
|
|
|
index % time self children called name
|
|
<spontaneous>
|
|
[1] 100.0 0.00 53.12 main [1]
|
|
13.34 0.00 1/1 func0 [2]
|
|
13.28 0.00 1/1 func1 [3]
|
|
13.28 0.00 1/1 func3 [4]
|
|
13.22 0.00 1/1 func2 [5]
|
|
-----------------------------------------------
|
|
13.34 0.00 1/1 main [1]
|
|
[2] 25.1 13.34 0.00 1 func0 [2]
|
|
-----------------------------------------------
|
|
13.28 0.00 1/1 main [1]
|
|
[3] 25.0 13.28 0.00 1 func1 [3]
|
|
-----------------------------------------------
|
|
13.28 0.00 1/1 main [1]
|
|
[4] 25.0 13.28 0.00 1 func3 [4]
|
|
-----------------------------------------------
|
|
13.22 0.00 1/1 main [1]
|
|
[5] 24.9 13.22 0.00 1 func2 [5]
|
|
-----------------------------------------------
|
|
|
|
|
|
Index by function name
|
|
|
|
[2] func0 [5] func2
|
|
[3] func1 [4] func3
|
|
</code> </pre></div></div><br class="example-break"><div class="example"><a name="gprof-line"></a><p class="title"><b>Example 4.9. Source line profile</b></p><div class="example-contents"><pre class="screen">
|
|
<code class="prompt">bash$</code> <strong class="userinput"><code>gprof -b -l myapp</code></strong>
|
|
<code class="literal">Flat profile:
|
|
|
|
Each sample counts as 0.01 seconds.
|
|
% cumulative self self total
|
|
time seconds seconds calls s/call s/call name
|
|
25.11 13.34 13.34 1 13.34 13.34 func0 (myapp.c:9 @ 1004010e0)
|
|
25.00 26.62 13.28 1 13.28 13.28 func1 (myapp.c:10 @ 10040111a)
|
|
25.00 39.90 13.28 1 13.28 13.28 func3 (myapp.c:12 @ 10040118e)
|
|
24.89 53.12 13.22 1 13.22 13.22 func2 (myapp.c:11 @ 100401154)
|
|
|
|
|
|
Call graph
|
|
|
|
|
|
granularity: each sample hit covers 4 byte(s) for 0.02% of 53.12 seconds
|
|
|
|
index % time self children called name
|
|
13.34 0.00 1/1 main (myapp.c:26 @ 10040123c) [1]
|
|
[2] 25.1 13.34 0.00 1 func0 (myapp.c:9 @ 1004010e0) [2]
|
|
-----------------------------------------------
|
|
13.28 0.00 1/1 main (myapp.c:26 @ 10040123c) [1]
|
|
[3] 25.0 13.28 0.00 1 func1 (myapp.c:10 @ 10040111a) [3]
|
|
-----------------------------------------------
|
|
13.28 0.00 1/1 main (myapp.c:28 @ 100401246) [5]
|
|
[4] 25.0 13.28 0.00 1 func3 (myapp.c:12 @ 10040118e) [4]
|
|
-----------------------------------------------
|
|
13.22 0.00 1/1 main (myapp.c:27 @ 100401241) [7]
|
|
[6] 24.9 13.22 0.00 1 func2 (myapp.c:11 @ 100401154) [6]
|
|
-----------------------------------------------
|
|
|
|
|
|
Index by function name
|
|
|
|
[2] func0 (myapp.c:9 @ 1004010e0) [6] func2 (myapp.c:11 @ 100401154)
|
|
[3] func1 (myapp.c:10 @ 10040111a) [4] func3 (myapp.c:12 @ 10040118e)
|
|
</code> </pre></div></div><br class="example-break"></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="gprof-ss"></a>Special situations</h3></div></div></div><div class="sect3"><div class="titlepage"><div><div><h4 class="title"><a name="gprof-mt"></a>Profiling multi-threaded programs</h4></div></div></div><p>Multi-threaded programs are profiled just like single-threaded programs.
|
|
There is no mechanism to turn profiling on or off for specific threads.
|
|
gprof combines the data for all threads when generating its displays.
|
|
</p></div><div class="sect3"><div class="titlepage"><div><div><h4 class="title"><a name="gprof-fork"></a>Profiling programs that fork</h4></div></div></div><p>Programs that fork, i.e., use the fork() system call with or without
|
|
using exec() afterwards, require special care. Since there is only one
|
|
gmon.out file, profiling data from the parent process might get overwritten
|
|
by the child process, or vice-versa, after a fork(). You can avoid this by
|
|
setting the environment variable GMON_OUT_PREFIX before running your
|
|
program. If the variable is non-empty, its contents will be used as a
|
|
prefix to name the profiling data files. Here's an example:
|
|
</p><div class="example"><a name="gprof-prefix"></a><p class="title"><b>Example 4.10. </b></p><div class="example-contents"><pre class="screen">
|
|
<code class="prompt">bash$</code> <strong class="userinput"><code>export GMON_OUT_PREFIX=myapp.out</code></strong>
|
|
<code class="prompt">bash$</code> <strong class="userinput"><code>./myapp -fork</code></strong>
|
|
<code class="prompt">bash$</code> <strong class="userinput"><code>ls myapp.out*</code></strong>
|
|
<code class="literal">myapp.out.2728 myapp.out.3224
|
|
</code>
|
|
<code class="prompt">bash$</code> <strong class="userinput"><code>gprof -bp myapp myapp.out.2728</code></strong>
|
|
<code class="literal">Flat profile:
|
|
|
|
Each sample counts as 0.01 seconds.
|
|
% cumulative self self total
|
|
time seconds seconds calls s/call s/call name
|
|
50.25 30.28 30.28 2 15.14 15.14 func3
|
|
24.99 45.34 15.06 1 15.06 15.06 func1
|
|
24.76 60.26 14.92 1 14.92 14.92 func2
|
|
</code>
|
|
<code class="prompt">bash$</code> <strong class="userinput"><code>gprof -bp myapp myapp.out.3224</code></strong>
|
|
<code class="literal">Flat profile:
|
|
|
|
Each sample counts as 0.01 seconds.
|
|
% cumulative self self total
|
|
time seconds seconds calls s/call s/call name
|
|
49.25 29.36 29.36 2 14.68 14.68 func3
|
|
25.43 44.52 15.16 1 15.16 15.16 func1
|
|
25.33 59.62 15.10 1 15.10 15.10 func2
|
|
</code>
|
|
<code class="prompt">bash$</code> <strong class="userinput"><code>gprof -bp myapp myapp.out*</code></strong>
|
|
<code class="literal">Flat profile:
|
|
|
|
Each sample counts as 0.01 seconds.
|
|
% cumulative self self total
|
|
time seconds seconds calls s/call s/call name
|
|
49.75 59.64 59.64 4 14.91 14.91 func3
|
|
25.21 89.86 30.22 2 15.11 15.11 func1
|
|
25.04 119.88 30.02 2 15.01 15.01 func2
|
|
</code> </pre></div></div><br class="example-break"><p>As the last gprof command above shows, gprof can combine the data
|
|
from a selection of profiling data files to generate its displays. Just
|
|
list the names of those files at the end of the gprof command; you can use
|
|
a wildcard here. NOTE: If you update your program, remember to remove stale
|
|
profiling data files before profiling your program again. If you aren't
|
|
careful about this, gprof could combine data from your most recent version
|
|
with stale data from prior versions, possibly giving misleading displays.
|
|
</p></div><div class="sect3"><div class="titlepage"><div><div><h4 class="title"><a name="gprof-res"></a>Getting better profiling resolution</h4></div></div></div><p>To get better resolution (i.e., more data points) when profiling
|
|
your program, try running it multiple times with the environment variable
|
|
GMON_OUT_PREFIX set, as described in the previous situation. There will be
|
|
multiple profiling data files generated and you can have gprof combine
|
|
the data from all of them into one display.
|
|
</p></div><div class="sect3"><div class="titlepage"><div><div><h4 class="title"><a name="gprof-lib"></a>Profiling programs with their libraries</h4></div></div></div><p>At the time of this writing Cygwin's profiling support only allows
|
|
for one range of addresses per program. It is hard-wired to be the range
|
|
covering the .text segment of your program, which is where your code resides.
|
|
If you build your program with static libraries (e.g., libfoo.a), the code
|
|
from those libraries is linked into your program's .text segment so will be
|
|
included when profiling. But dynamic libraries (e.g., libfoo.dll) reside in
|
|
other address ranges and code within them won't be included.
|
|
</p></div><div class="sect3"><div class="titlepage"><div><div><h4 class="title"><a name="gprof-cyg"></a>Profiling Cygwin itself</h4></div></div></div><p>Due to the issue mentioned in the previous situation and other issues,
|
|
at the time of this writing there is no support for profiling Cygwin itself.
|
|
</p></div></div></div><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="windres.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="programming.html">Up</a></td><td width="40%" align="right"> </td></tr><tr><td width="40%" align="left" valign="top">Defining Windows Resources </td><td width="20%" align="center"><a accesskey="h" href="cygwin-ug-net.html">Home</a></td><td width="40%" align="right" valign="top"> </td></tr></table></div></body></html>
|