mirror of
https://git.savannah.gnu.org/git/guile.git
synced 2025-05-20 11:40:18 +02:00
Update statprof documentation; deprecate `with-statprof'
* module/statprof.scm: Remove most of the commentary, as it was duplicated in the manual and was getting out of date. (stats): Remove self-secs-per-call and cum-secs-per-call fields as they can be computed from the other fields. (statprof-call-data->stats): Adapt. (statprof-stats-self-secs-per-call): (statprof-stats-cum-secs-per-call): New functions. (statprof-display/flat): Don't print the seconds-per-call fields, as we are no longer stopping the clock around call counters. Anyway these times were quite misleading. (with-statprof): Deprecate. It took its keyword arguments at the beginning; very complicated! Better to use the `statprof' function. (`statprof' was introduced after `with-statprof' and then `with-statprof' was adapted to use it.) * doc/ref/statprof.texi (Statprof): Port this documentation away from the automatically generated text and update it for the new interfaces like #:display-style. * module/system/base/syntax.scm (record-case): Remove comment that referenced with-statprof. Add comment indicating that record-case should be replaced. * doc/ref/scheme-using.texi (Profile Commands): Update to mention keyword arguments and to link to the statprof documentation.
This commit is contained in:
parent
4066ee3192
commit
8998f1539f
4 changed files with 248 additions and 373 deletions
|
@ -294,8 +294,12 @@ Disassemble a file.
|
||||||
Time execution.
|
Time execution.
|
||||||
@end deffn
|
@end deffn
|
||||||
|
|
||||||
@deffn {REPL Command} profile exp
|
@deffn {REPL Command} profile exp [#:hz hz=100] @
|
||||||
Profile execution.
|
[#:count-calls? count-calls?=#f] [#:display-style display-style=list]
|
||||||
|
Profile execution of an expression. This command compiled @var{exp} and
|
||||||
|
then runs it within the statprof profiler, passing all keyword options
|
||||||
|
to the @code{statprof} procedure. For more on statprof and on the the
|
||||||
|
options available to this command, @xref{Statprof}.
|
||||||
@end deffn
|
@end deffn
|
||||||
|
|
||||||
@deffn {REPL Command} trace exp [#:width w] [#:max-indent i]
|
@deffn {REPL Command} trace exp [#:width w] [#:max-indent i]
|
||||||
|
|
|
@ -6,220 +6,116 @@
|
||||||
@node Statprof
|
@node Statprof
|
||||||
@section Statprof
|
@section Statprof
|
||||||
|
|
||||||
@code{(statprof)} is a fairly simple statistical profiler for Guile.
|
Statprof is a statistical profiler for Guile.
|
||||||
|
|
||||||
A simple use of statprof would look like this:
|
A simple use of statprof would look like this:
|
||||||
|
|
||||||
@example
|
@example
|
||||||
(statprof-reset 0 50000 #t)
|
(use-modules (statprof))
|
||||||
(statprof-start)
|
(statprof (lambda ()
|
||||||
(do-something)
|
(map 1+ (iota 1000000))
|
||||||
(statprof-stop)
|
#f))
|
||||||
(statprof-display)
|
|
||||||
@end example
|
@end example
|
||||||
|
|
||||||
This would reset statprof, clearing all accumulated statistics, then
|
This would run the thunk with statistical profiling, finally displaying
|
||||||
start profiling, run some code, stop profiling, and finally display a
|
a flat table of statistics which could look something like this:
|
||||||
gprof flat-style table of statistics which will look something like
|
|
||||||
this:
|
|
||||||
|
|
||||||
@example
|
@example
|
||||||
% cumulative self self total
|
% cumulative self
|
||||||
time seconds seconds calls ms/call ms/call name
|
time seconds seconds procedure
|
||||||
35.29 0.23 0.23 2002 0.11 0.11 -
|
57.14 39769.73 0.07 ice-9/boot-9.scm:249:5:map1
|
||||||
23.53 0.15 0.15 2001 0.08 0.08 positive?
|
28.57 0.04 0.04 ice-9/boot-9.scm:1165:0:iota
|
||||||
23.53 0.15 0.15 2000 0.08 0.08 +
|
14.29 0.02 0.02 1+
|
||||||
11.76 0.23 0.08 2000 0.04 0.11 do-nothing
|
0.00 0.12 0.00 <current input>:2:10
|
||||||
5.88 0.64 0.04 2001 0.02 0.32 loop
|
---
|
||||||
0.00 0.15 0.00 1 0.00 150.59 do-something
|
Sample count: 7
|
||||||
...
|
Total time: 0.123490713 seconds (0.201983993 seconds in GC)
|
||||||
@end example
|
@end example
|
||||||
|
|
||||||
All of the numerical data with the exception of the calls column is
|
All of the numerical data with the exception of the calls column is
|
||||||
statistically approximate. In the following column descriptions, and in
|
statistically approximate. In the following column descriptions, and in
|
||||||
all of statprof, "time" refers to execution time (both user and system),
|
all of statprof, ``time'' refers to execution time (both user and
|
||||||
not wall clock time.
|
system), not wall clock time.
|
||||||
|
|
||||||
@table @asis
|
The @code{% time} column indicates the percentage of the run-time time
|
||||||
@item % time
|
spent inside the procedure itself (not counting children). It is
|
||||||
The percent of the time spent inside the procedure itself (not counting
|
calculated as @code{self seconds}, measuring the amount of time spent in
|
||||||
children).
|
the procedure, divided by the total run-time.
|
||||||
|
|
||||||
@item cumulative seconds
|
@code{cumulative seconds} also counts time spent in children of a
|
||||||
The total number of seconds spent in the procedure, including children.
|
function. For recursive functions, this can exceed the total time, as
|
||||||
|
in our example above, because each activation on the stack adds to the
|
||||||
|
cumulative time.
|
||||||
|
|
||||||
@item self seconds
|
Finally, the GC time measures the time spent in the garbage collector.
|
||||||
The total number of seconds spent in the procedure itself (not counting
|
On systems with multiple cores, this time can be larger than the run
|
||||||
children).
|
time, because it counts time spent in all threads, and will run the
|
||||||
|
``marking'' phase of GC in parallel. If GC time is a significant
|
||||||
|
fraction of the run time, that means that most time in your program is
|
||||||
|
spent allocating objects and cleaning up after those allocations. To
|
||||||
|
speed up your program, one good place to start would be to look at how
|
||||||
|
to reduce the allocation rate.
|
||||||
|
|
||||||
@item calls
|
Statprof's main mode of operation is as a statistical profiler. However
|
||||||
The total number of times the procedure was called.
|
statprof can also run in a ``precise'' mode as well. Pass the
|
||||||
|
@code{#:count-calls? #t} keyword argument to @code{statprof} to record
|
||||||
|
all calls:
|
||||||
|
|
||||||
@item self ms/call
|
@example
|
||||||
The average time taken by the procedure itself on each call, in ms.
|
(use-modules (statprof))
|
||||||
|
(statprof (lambda ()
|
||||||
|
(map 1+ (iota 1000000))
|
||||||
|
#f)
|
||||||
|
#:count-calls? #t)
|
||||||
|
@end example
|
||||||
|
|
||||||
@item total ms/call
|
The result has an additional @code{calls} column:
|
||||||
The average time taken by each call to the procedure, including time
|
|
||||||
spent in child functions.
|
|
||||||
|
|
||||||
@item name
|
@example
|
||||||
The name of the procedure.
|
% cumulative self
|
||||||
|
time seconds seconds calls procedure
|
||||||
|
82.26 0.73 0.73 1000000 1+
|
||||||
|
11.29 420925.80 0.10 1000001 ice-9/boot-9.scm:249:5:map1
|
||||||
|
4.84 0.06 0.04 1 ice-9/boot-9.scm:1165:0:iota
|
||||||
|
[...]
|
||||||
|
---
|
||||||
|
Sample count: 62
|
||||||
|
Total time: 0.893098065 seconds (1.222796536 seconds in GC)
|
||||||
|
@end example
|
||||||
|
|
||||||
@end table
|
As you can see, the profile is perturbed: @code{1+} ends up on top,
|
||||||
|
whereas it was not marked as hot in the earlier profile. This is
|
||||||
The profiler uses @code{eq?} and the procedure object itself to identify
|
because the overhead of call-counting unfairly penalizes calls. Still,
|
||||||
the procedures, so it won't confuse different procedures with the same
|
this precise mode can be useful at times to do algorithmic optimizations
|
||||||
name. They will show up as two different rows in the output.
|
based on the precise call counts.
|
||||||
|
|
||||||
Right now the profiler is quite simplistic. I cannot provide call-graphs
|
|
||||||
or other higher level information. What you see in the table is pretty
|
|
||||||
much all there is. Patches are welcome :-)
|
|
||||||
|
|
||||||
@section Implementation notes
|
@section Implementation notes
|
||||||
|
|
||||||
The profiler works by setting the unix profiling signal
|
The profiler works by setting the unix profiling signal
|
||||||
@code{ITIMER_PROF} to go off after the interval you define in the call
|
@code{ITIMER_PROF} to go off after the interval you define in the call
|
||||||
to @code{statprof-reset}. When the signal fires, a sampling routine is
|
to @code{statprof-reset}. When the signal fires, a sampling routine
|
||||||
run which looks at the current procedure that's executing, and then
|
runs which crawls up the stack, recording all instruction pointers into
|
||||||
crawls up the stack, and for each procedure encountered, increments that
|
a buffer. After the sample is complete, the profiler resets profiling
|
||||||
procedure's sample count. Note that if a procedure is encountered
|
timer to fire again after the appropriate interval.
|
||||||
multiple times on a given stack, it is only counted once. After the
|
|
||||||
sampling is complete, the profiler resets profiling timer to fire again
|
|
||||||
after the appropriate interval.
|
|
||||||
|
|
||||||
Meanwhile, the profiler keeps track, via @code{get-internal-run-time},
|
Later, when profiling stops, that log buffer is analyzed to produce the
|
||||||
how much CPU time (system and user -- which is also what
|
``self seconds'' and ``cumulative seconds'' statistics. A procedure at
|
||||||
@code{ITIMER_PROF} tracks), has elapsed while code has been executing
|
the top of the stack counts toward ``self'' samples, and everything on
|
||||||
within a statprof-start/stop block.
|
the stack counts towards ``cumulative'' samples.
|
||||||
|
|
||||||
The profiler also tries to avoid counting or timing its own code as much
|
While the profiler is running it measures how much CPU time (system and
|
||||||
as possible.
|
user -- which is also what @code{ITIMER_PROF} tracks) has elapsed while
|
||||||
|
code has been executing within the profiler. Only run time counts
|
||||||
|
towards the profile, not wall-clock time. For example, sleeping and
|
||||||
|
waiting for input or output do not cause the timer clock to advance.
|
||||||
|
|
||||||
@section Usage
|
@section Usage
|
||||||
@anchor{statprof statprof-active?}@defun statprof-active?
|
|
||||||
Returns @code{#t} if @code{statprof-start} has been called more times
|
|
||||||
than @code{statprof-stop}, @code{#f} otherwise.
|
|
||||||
|
|
||||||
@end defun
|
@deffn {Scheme Procedure} statprof thunk @
|
||||||
|
[#:loop loop=1] [#:hz hz=100] @
|
||||||
@anchor{statprof statprof-start}@defun statprof-start
|
[#:port port=(current-output-port)] @
|
||||||
Start the profiler.@code{}
|
[#:count-calls? count-calls?=#f] @
|
||||||
|
[#:display-style display-style='flat]
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-stop}@defun statprof-stop
|
|
||||||
Stop the profiler.@code{}
|
|
||||||
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-reset}@defun statprof-reset sample-seconds sample-microseconds count-calls? [full-stacks?]
|
|
||||||
Reset the statprof sampler interval to @var{sample-seconds} and
|
|
||||||
@var{sample-microseconds}. If @var{count-calls?} is true, arrange to
|
|
||||||
instrument procedure calls as well as collecting statistical profiling
|
|
||||||
data. If @var{full-stacks?} is true, collect all sampled stacks into a
|
|
||||||
list for later analysis.
|
|
||||||
|
|
||||||
Enables traps and debugging as necessary.
|
|
||||||
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-accumulated-time}@defun statprof-accumulated-time
|
|
||||||
Returns the time accumulated during the last statprof run.@code{}
|
|
||||||
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-sample-count}@defun statprof-sample-count
|
|
||||||
Returns the number of samples taken during the last statprof run.@code{}
|
|
||||||
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-fold-call-data}@defun statprof-fold-call-data proc init
|
|
||||||
Fold @var{proc} over the call-data accumulated by statprof. Cannot be
|
|
||||||
called while statprof is active. @var{proc} should take two arguments,
|
|
||||||
@code{(@var{call-data} @var{prior-result})}.
|
|
||||||
|
|
||||||
Note that a given proc-name may appear multiple times, but if it does,
|
|
||||||
it represents different functions with the same name.
|
|
||||||
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-proc-call-data}@defun statprof-proc-call-data proc
|
|
||||||
Returns the call-data associated with @var{proc}, or @code{#f} if none
|
|
||||||
is available.
|
|
||||||
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-call-data-name}@defun statprof-call-data-name cd
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-call-data-calls}@defun statprof-call-data-calls cd
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-call-data-cum-samples}@defun statprof-call-data-cum-samples cd
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-call-data-self-samples}@defun statprof-call-data-self-samples cd
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-call-data->stats}@defun statprof-call-data->stats call-data
|
|
||||||
Returns an object of type @code{statprof-stats}.
|
|
||||||
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-stats-proc-name}@defun statprof-stats-proc-name stats
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-stats-%-time-in-proc}@defun statprof-stats-%-time-in-proc stats
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-stats-cum-secs-in-proc}@defun statprof-stats-cum-secs-in-proc stats
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-stats-self-secs-in-proc}@defun statprof-stats-self-secs-in-proc stats
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-stats-calls}@defun statprof-stats-calls stats
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-stats-self-secs-per-call}@defun statprof-stats-self-secs-per-call stats
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-stats-cum-secs-per-call}@defun statprof-stats-cum-secs-per-call stats
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-display}@defun statprof-display . _
|
|
||||||
Displays a gprof-like summary of the statistics collected. Unless an
|
|
||||||
optional @var{port} argument is passed, uses the current output port.
|
|
||||||
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-display-anomolies}@defun statprof-display-anomolies
|
|
||||||
A sanity check that attempts to detect anomolies in statprof's
|
|
||||||
statistics.@code{}
|
|
||||||
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-fetch-stacks}@defun statprof-fetch-stacks
|
|
||||||
Returns a list of stacks, as they were captured since the last call to
|
|
||||||
@code{statprof-reset}.
|
|
||||||
|
|
||||||
Note that stacks are only collected if the @var{full-stacks?} argument
|
|
||||||
to @code{statprof-reset} is true.
|
|
||||||
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof-fetch-call-tree}@defun statprof-fetch-call-tree
|
|
||||||
@verbatim
|
|
||||||
Return a call tree for the previous statprof run.
|
|
||||||
|
|
||||||
The return value is a list of nodes, each of which is of the type:
|
|
||||||
@@code
|
|
||||||
node ::= (@@var@{proc@} @@var@{count@} . @@var@{nodes@})
|
|
||||||
@@end code
|
|
||||||
@end verbatim
|
|
||||||
|
|
||||||
@end defun
|
|
||||||
|
|
||||||
@anchor{statprof statprof}@defun statprof thunk [#:loop] [#:hz] [#:count-calls?] [#:full-stacks?]
|
|
||||||
Profile the execution of @var{thunk}, and return its return values.
|
Profile the execution of @var{thunk}, and return its return values.
|
||||||
|
|
||||||
The stack will be sampled @var{hz} times per second, and the thunk
|
The stack will be sampled @var{hz} times per second, and the thunk
|
||||||
|
@ -228,57 +124,127 @@ itself will be called @var{loop} times.
|
||||||
If @var{count-calls?} is true, all procedure calls will be recorded.
|
If @var{count-calls?} is true, all procedure calls will be recorded.
|
||||||
This operation is somewhat expensive.
|
This operation is somewhat expensive.
|
||||||
|
|
||||||
If @var{full-stacks?} is true, at each sample, statprof will store away
|
After the @var{thunk} has been profiled, print out a profile to
|
||||||
the whole call tree, for later analysis. Use
|
@var{port}. If @var{display-style} is @code{flat}, the results will be
|
||||||
@code{statprof-fetch-stacks} or @code{statprof-fetch-call-tree} to
|
printed as a flat profile. Otherwise if @var{display-style} is
|
||||||
retrieve the last-stored stacks.
|
@code{tree}, print the results as a tree profile.
|
||||||
|
@end deffn
|
||||||
|
|
||||||
@end defun
|
Profiling can also be enabled and disabled manually.
|
||||||
|
|
||||||
@anchor{statprof with-statprof}@defspec with-statprof args
|
@deffn {Scheme Procedure} statprof-active?
|
||||||
Profile the expressions in the body, and return the body's return
|
Returns @code{#t} if @code{statprof-start} has been called more times
|
||||||
value.
|
than @code{statprof-stop}, @code{#f} otherwise.
|
||||||
|
@end deffn
|
||||||
|
|
||||||
Keyword arguments:
|
@deffn {Scheme Procedure} statprof-start
|
||||||
|
@deffnx {Scheme Procedure} statprof-stop
|
||||||
|
Start or stop the profiler.
|
||||||
|
@end deffn
|
||||||
|
|
||||||
|
@deffn {Scheme Procedure} statprof-reset sample-seconds sample-microseconds count-calls?
|
||||||
|
Reset the profiling sample interval to @var{sample-seconds} and
|
||||||
|
@var{sample-microseconds}. If @var{count-calls?} is true, arrange to
|
||||||
|
instrument procedure calls as well as collecting statistical profiling
|
||||||
|
data.
|
||||||
|
@end deffn
|
||||||
|
|
||||||
|
If you use the manual @code{statprof-start}/@code{statprof-stop}
|
||||||
|
interface, an implicit statprof state will persist starting from the
|
||||||
|
last call to @code{statprof-reset}, or the first call to
|
||||||
|
@code{statprof-start}. There are a number of accessors to fetch
|
||||||
|
statistics from this implicit state.
|
||||||
|
|
||||||
|
@deffn {Scheme Procedure} statprof-accumulated-time
|
||||||
|
Returns the time accumulated during the last statprof run.
|
||||||
|
@end deffn
|
||||||
|
|
||||||
|
@deffn {Scheme Procedure} statprof-sample-count
|
||||||
|
Returns the number of samples taken during the last statprof run.
|
||||||
|
@end deffn
|
||||||
|
|
||||||
|
@deffn {Scheme Procedure} statprof-fold-call-data proc init
|
||||||
|
Fold @var{proc} over the call-data accumulated by statprof. This
|
||||||
|
procedure cannot be called while statprof is active.
|
||||||
|
|
||||||
|
@var{proc} will be called with arguments, @var{call-data} and
|
||||||
|
@var{prior-result}.
|
||||||
|
@end deffn
|
||||||
|
|
||||||
|
@deffn {Scheme Procedure} statprof-proc-call-data proc
|
||||||
|
Returns the call-data associated with @var{proc}, or @code{#f} if none
|
||||||
|
is available.
|
||||||
|
@end deffn
|
||||||
|
|
||||||
|
@deffn {Scheme Procedure} statprof-call-data-name cd
|
||||||
|
@deffnx {Scheme Procedure} statprof-call-data-calls cd
|
||||||
|
@deffnx {Scheme Procedure} statprof-call-data-cum-samples cd
|
||||||
|
@deffnx {Scheme Procedure} statprof-call-data-self-samples cd
|
||||||
|
Accessors for the fields in a statprof call-data object.
|
||||||
|
@end deffn
|
||||||
|
|
||||||
|
@deffn {Scheme Procedure} statprof-call-data->stats call-data
|
||||||
|
Returns an object of type @code{statprof-stats}.
|
||||||
|
@end deffn
|
||||||
|
|
||||||
|
@deffn {Scheme Procedure} statprof-stats-proc-name stats
|
||||||
|
@deffnx {Scheme Procedure} statprof-stats-%-time-in-proc stats
|
||||||
|
@deffnx {Scheme Procedure} statprof-stats-cum-secs-in-proc stats
|
||||||
|
@deffnx {Scheme Procedure} statprof-stats-self-secs-in-proc stats
|
||||||
|
@deffnx {Scheme Procedure} statprof-stats-calls stats
|
||||||
|
@deffnx {Scheme Procedure} statprof-stats-self-secs-per-call stats
|
||||||
|
@deffnx {Scheme Procedure} statprof-stats-cum-secs-per-call stats
|
||||||
|
Accessors for the fields in a @code{statprof-stats} object.
|
||||||
|
@end deffn
|
||||||
|
|
||||||
|
@deffn {Scheme Procedure} statprof-display @
|
||||||
|
[port=(current-output-port)] [#:style style=flat]
|
||||||
|
Displays a summary of the statistics collected. Possible values for
|
||||||
|
@var{style} include:
|
||||||
|
|
||||||
@table @code
|
@table @code
|
||||||
@item #:loop
|
@item flat
|
||||||
Execute the body @var{loop} number of times, or @code{#f} for no looping
|
Display a traditional gprof-style flat profile.
|
||||||
|
@item anomalies
|
||||||
default: @code{#f}
|
Find statistical anomalies in the data.
|
||||||
|
@item tree
|
||||||
@item #:hz
|
Display a tree profile.
|
||||||
Sampling rate
|
|
||||||
|
|
||||||
default: @code{20}
|
|
||||||
|
|
||||||
@item #:count-calls?
|
|
||||||
Whether to instrument each function call (expensive)
|
|
||||||
|
|
||||||
default: @code{#f}
|
|
||||||
|
|
||||||
@item #:full-stacks?
|
|
||||||
Whether to collect away all sampled stacks into a list
|
|
||||||
|
|
||||||
default: @code{#f}
|
|
||||||
|
|
||||||
@end table
|
@end table
|
||||||
|
@end deffn
|
||||||
|
|
||||||
@end defspec
|
@deffn {Scheme Procedure} statprof-fetch-stacks
|
||||||
|
Returns a list of stacks, as they were captured since the last call to
|
||||||
|
@code{statprof-reset}.
|
||||||
|
@end deffn
|
||||||
|
|
||||||
@anchor{statprof gcprof}@defun gcprof thunk [#:loop] [#:full-stacks?]
|
@deffn {Scheme Procedure} statprof-fetch-call-tree [#:precise precise?=#f]
|
||||||
Do an allocation profile of the execution of @var{thunk}.
|
@verbatim
|
||||||
|
Return a call tree for the previous statprof run.
|
||||||
|
|
||||||
The stack will be sampled soon after every garbage collection, yielding
|
The return value is a list of nodes. A node is a list of the form:
|
||||||
an approximate idea of what is causing allocation in your program.
|
@code
|
||||||
|
node ::= (@var{proc} @var{count} . @var{nodes})
|
||||||
|
@end code
|
||||||
|
|
||||||
|
The @var{proc} is a printable representation of a procedure, as a
|
||||||
|
string. If @var{precise?} is false, which is the default, then a node
|
||||||
|
corresponds to a procedure invocation. If it is true, then a node
|
||||||
|
corresponds to a return point in a procedure. Passing @code{#:precise?
|
||||||
|
#t} allows a user to distinguish different source lines in a procedure,
|
||||||
|
but usually it is too much detail, so it is off by default.
|
||||||
|
@end verbatim
|
||||||
|
|
||||||
|
@end deffn
|
||||||
|
|
||||||
|
@deffn {Scheme Procedure} gcprof thunk [#:loop]
|
||||||
|
Like the @code{statprof} procedure, but instead of profiling CPU time,
|
||||||
|
we profile garbage collection.
|
||||||
|
|
||||||
|
The stack will be sampled soon after every garbage collection during the
|
||||||
|
evaluation of @var{thunk}, yielding an approximate idea of what is
|
||||||
|
causing allocation in your program.
|
||||||
|
|
||||||
Since GC does not occur very frequently, you may need to use the
|
Since GC does not occur very frequently, you may need to use the
|
||||||
@var{loop} parameter, to cause @var{thunk} to be called @var{loop}
|
@var{loop} parameter, to cause @var{thunk} to be called @var{loop}
|
||||||
times.
|
times.
|
||||||
|
@end deffn
|
||||||
If @var{full-stacks?} is true, at each sample, statprof will store away
|
|
||||||
the whole call tree, for later analysis. Use
|
|
||||||
@code{statprof-fetch-stacks} or @code{statprof-fetch-call-tree} to
|
|
||||||
retrieve the last-stored stacks.
|
|
||||||
|
|
||||||
@end defun
|
|
||||||
|
|
|
@ -1,7 +1,7 @@
|
||||||
;;;; (statprof) -- a statistical profiler for Guile
|
;;;; (statprof) -- a statistical profiler for Guile
|
||||||
;;;; -*-scheme-*-
|
;;;; -*-scheme-*-
|
||||||
;;;;
|
;;;;
|
||||||
;;;; Copyright (C) 2009, 2010, 2011, 2013-2015 Free Software Foundation, Inc.
|
;;;; Copyright (C) 2009, 2010, 2011, 2013-2016 Free Software Foundation, Inc.
|
||||||
;;;; Copyright (C) 2004, 2009 Andy Wingo <wingo at pobox dot com>
|
;;;; Copyright (C) 2004, 2009 Andy Wingo <wingo at pobox dot com>
|
||||||
;;;; Copyright (C) 2001 Rob Browning <rlb at defaultvalue dot org>
|
;;;; Copyright (C) 2001 Rob Browning <rlb at defaultvalue dot org>
|
||||||
;;;;
|
;;;;
|
||||||
|
@ -23,77 +23,8 @@
|
||||||
|
|
||||||
;;; Commentary:
|
;;; Commentary:
|
||||||
;;;
|
;;;
|
||||||
;;; @code{(statprof)} is a statistical profiler for Guile.
|
;;; @code{(statprof)} is a statistical profiler for Guile. See the
|
||||||
;;;
|
;;; "Statprof" section in the manual, for more information.
|
||||||
;;; A simple use of statprof would look like this:
|
|
||||||
;;;
|
|
||||||
;;; @example
|
|
||||||
;;; (statprof (lambda () (do-something))
|
|
||||||
;;; #:hz 100
|
|
||||||
;;; #:count-calls? #t)
|
|
||||||
;;; @end example
|
|
||||||
;;;
|
|
||||||
;;; This would run the thunk with statistical profiling, finally
|
|
||||||
;;; displaying a gprof flat-style table of statistics which could
|
|
||||||
;;; look something like this:
|
|
||||||
;;;
|
|
||||||
;;; @example
|
|
||||||
;;; % cumulative self self total
|
|
||||||
;;; time seconds seconds calls ms/call ms/call name
|
|
||||||
;;; 35.29 0.23 0.23 2002 0.11 0.11 -
|
|
||||||
;;; 23.53 0.15 0.15 2001 0.08 0.08 positive?
|
|
||||||
;;; 23.53 0.15 0.15 2000 0.08 0.08 +
|
|
||||||
;;; 11.76 0.23 0.08 2000 0.04 0.11 do-nothing
|
|
||||||
;;; 5.88 0.64 0.04 2001 0.02 0.32 loop
|
|
||||||
;;; 0.00 0.15 0.00 1 0.00 150.59 do-something
|
|
||||||
;;; ...
|
|
||||||
;;; @end example
|
|
||||||
;;;
|
|
||||||
;;; All of the numerical data with the exception of the calls column is
|
|
||||||
;;; statistically approximate. In the following column descriptions, and
|
|
||||||
;;; in all of statprof, "time" refers to execution time (both user and
|
|
||||||
;;; system), not wall clock time.
|
|
||||||
;;;
|
|
||||||
;;; @table @asis
|
|
||||||
;;; @item % time
|
|
||||||
;;; The percent of the time spent inside the procedure itself
|
|
||||||
;;; (not counting children).
|
|
||||||
;;; @item cumulative seconds
|
|
||||||
;;; The total number of seconds spent in the procedure, including
|
|
||||||
;;; children.
|
|
||||||
;;; @item self seconds
|
|
||||||
;;; The total number of seconds spent in the procedure itself (not counting
|
|
||||||
;;; children).
|
|
||||||
;;; @item calls
|
|
||||||
;;; The total number of times the procedure was called.
|
|
||||||
;;; @item self ms/call
|
|
||||||
;;; The average time taken by the procedure itself on each call, in ms.
|
|
||||||
;;; @item total ms/call
|
|
||||||
;;; The average time taken by each call to the procedure, including time
|
|
||||||
;;; spent in child functions.
|
|
||||||
;;; @item name
|
|
||||||
;;; The name of the procedure.
|
|
||||||
;;; @end table
|
|
||||||
;;;
|
|
||||||
;;; @section Implementation notes
|
|
||||||
;;;
|
|
||||||
;;; The profiler works by setting the unix profiling signal
|
|
||||||
;;; @code{ITIMER_PROF} to go off after the interval you define in the call
|
|
||||||
;;; to @code{statprof-reset}. When the signal fires, a sampling routine is
|
|
||||||
;;; run which looks at the current procedure that's executing, and then
|
|
||||||
;;; crawls up the stack, and for each procedure encountered, increments
|
|
||||||
;;; that procedure's sample count. Note that if a procedure is encountered
|
|
||||||
;;; multiple times on a given stack, it is only counted once. After the
|
|
||||||
;;; sampling is complete, the profiler resets profiling timer to fire
|
|
||||||
;;; again after the appropriate interval.
|
|
||||||
;;;
|
|
||||||
;;; Meanwhile, the profiler keeps track, via @code{get-internal-run-time},
|
|
||||||
;;; how much CPU time (system and user -- which is also what
|
|
||||||
;;; @code{ITIMER_PROF} tracks), has elapsed while code has been executing
|
|
||||||
;;; within a statprof-start/stop block.
|
|
||||||
;;;
|
|
||||||
;;; The profiler also tries to avoid counting or timing its own code as
|
|
||||||
;;; much as possible.
|
|
||||||
;;;
|
;;;
|
||||||
;;; Code:
|
;;; Code:
|
||||||
|
|
||||||
|
@ -140,8 +71,6 @@
|
||||||
statprof-fetch-call-tree
|
statprof-fetch-call-tree
|
||||||
|
|
||||||
statprof
|
statprof
|
||||||
with-statprof
|
|
||||||
|
|
||||||
gcprof))
|
gcprof))
|
||||||
|
|
||||||
|
|
||||||
|
@ -612,16 +541,28 @@ none is available."
|
||||||
(define-record-type stats
|
(define-record-type stats
|
||||||
(make-stats proc-name proc-source
|
(make-stats proc-name proc-source
|
||||||
%-time-in-proc cum-secs-in-proc self-secs-in-proc
|
%-time-in-proc cum-secs-in-proc self-secs-in-proc
|
||||||
calls self-secs-per-call cum-secs-per-call)
|
calls)
|
||||||
stats?
|
stats?
|
||||||
(proc-name statprof-stats-proc-name)
|
(proc-name statprof-stats-proc-name)
|
||||||
(proc-source statprof-stats-proc-source)
|
(proc-source statprof-stats-proc-source)
|
||||||
(%-time-in-proc statprof-stats-%-time-in-proc)
|
(%-time-in-proc statprof-stats-%-time-in-proc)
|
||||||
(cum-secs-in-proc statprof-stats-cum-secs-in-proc)
|
(cum-secs-in-proc statprof-stats-cum-secs-in-proc)
|
||||||
(self-secs-in-proc statprof-stats-self-secs-in-proc)
|
(self-secs-in-proc statprof-stats-self-secs-in-proc)
|
||||||
(calls statprof-stats-calls)
|
(calls statprof-stats-calls))
|
||||||
(self-secs-per-call statprof-stats-self-secs-per-call)
|
|
||||||
(cum-secs-per-call statprof-stats-cum-secs-per-call))
|
(define (statprof-stats-self-secs-per-call stats)
|
||||||
|
(let ((calls (statprof-stats-calls stats)))
|
||||||
|
(and calls
|
||||||
|
(/ (statprof-stats-self-secs-in-proc stats)
|
||||||
|
calls))))
|
||||||
|
|
||||||
|
(define (statprof-stats-cum-secs-per-call stats)
|
||||||
|
(let ((calls (statprof-stats-calls stats)))
|
||||||
|
(and calls
|
||||||
|
(/ (statprof-stats-cum-secs-in-proc stats)
|
||||||
|
;; `calls' might be 0 if we entered statprof during the
|
||||||
|
;; dynamic extent of the call.
|
||||||
|
(max calls 1)))))
|
||||||
|
|
||||||
(define (statprof-call-data->stats call-data)
|
(define (statprof-call-data->stats call-data)
|
||||||
"Returns an object of type @code{statprof-stats}."
|
"Returns an object of type @code{statprof-stats}."
|
||||||
|
@ -645,16 +586,7 @@ none is available."
|
||||||
(* (/ self-samples all-samples) 100.0)
|
(* (/ self-samples all-samples) 100.0)
|
||||||
(* cum-samples secs-per-sample 1.0)
|
(* cum-samples secs-per-sample 1.0)
|
||||||
(* self-samples secs-per-sample 1.0)
|
(* self-samples secs-per-sample 1.0)
|
||||||
num-calls
|
num-calls)))
|
||||||
(and num-calls ;; maybe we only sampled in children
|
|
||||||
(if (zero? self-samples) 0.0
|
|
||||||
(/ (* self-samples secs-per-sample) 1.0 num-calls)))
|
|
||||||
(and num-calls ;; cum-samples must be positive
|
|
||||||
(/ (* cum-samples secs-per-sample)
|
|
||||||
1.0
|
|
||||||
;; num-calls might be 0 if we entered statprof during the
|
|
||||||
;; dynamic extent of the call
|
|
||||||
(max num-calls 1))))))
|
|
||||||
|
|
||||||
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
||||||
|
|
||||||
|
@ -689,10 +621,8 @@ optional @var{port} argument is passed, uses the current output port."
|
||||||
(statprof-stats-self-secs-in-proc stats))
|
(statprof-stats-self-secs-in-proc stats))
|
||||||
(if (call-counts state)
|
(if (call-counts state)
|
||||||
(if (statprof-stats-calls stats)
|
(if (statprof-stats-calls stats)
|
||||||
(format port " ~7d ~8,2f ~8,2f "
|
(format port " ~7d "
|
||||||
(statprof-stats-calls stats)
|
(statprof-stats-calls stats))
|
||||||
(* 1000 (statprof-stats-self-secs-per-call stats))
|
|
||||||
(* 1000 (statprof-stats-cum-secs-per-call stats)))
|
|
||||||
(format port " "))
|
(format port " "))
|
||||||
(display " " port))
|
(display " " port))
|
||||||
(let ((source (statprof-stats-proc-source stats))
|
(let ((source (statprof-stats-proc-source stats))
|
||||||
|
@ -707,10 +637,10 @@ optional @var{port} argument is passed, uses the current output port."
|
||||||
|
|
||||||
(if (call-counts state)
|
(if (call-counts state)
|
||||||
(begin
|
(begin
|
||||||
(format port "~5a ~10a ~7a ~8a ~8a ~8a ~8@a\n"
|
(format port "~5a ~10a ~7a ~8a\n"
|
||||||
"% " "cumulative" "self" "" "self" "total" "")
|
"% " "cumulative" "self" "")
|
||||||
(format port "~5a ~9a ~8a ~8a ~8a ~8a ~a\n"
|
(format port "~5a ~9a ~8a ~7a ~a\n"
|
||||||
"time" "seconds" "seconds" "calls" "ms/call" "ms/call" "procedure"))
|
"time" "seconds" "seconds" "calls" "procedure"))
|
||||||
(begin
|
(begin
|
||||||
(format port "~5a ~10a ~7a ~8a\n"
|
(format port "~5a ~10a ~7a ~8a\n"
|
||||||
"%" "cumulative" "self" "")
|
"%" "cumulative" "self" "")
|
||||||
|
@ -963,8 +893,9 @@ operation is somewhat expensive."
|
||||||
(statprof-stop state)
|
(statprof-stop state)
|
||||||
(statprof-display port state #:style display-style))))))
|
(statprof-display port state #:style display-style))))))
|
||||||
|
|
||||||
(define-macro (with-statprof . args)
|
(begin-deprecated
|
||||||
"Profile the expressions in the body, and return the body's return values.
|
(define-macro (with-statprof . args)
|
||||||
|
"Profile the expressions in the body, and return the body's return values.
|
||||||
|
|
||||||
Keyword arguments:
|
Keyword arguments:
|
||||||
|
|
||||||
|
@ -982,22 +913,24 @@ Whether to instrument each function call (expensive)
|
||||||
|
|
||||||
default: @code{#f}
|
default: @code{#f}
|
||||||
@end table"
|
@end table"
|
||||||
(define (kw-arg-ref kw args def)
|
(define (kw-arg-ref kw args def)
|
||||||
(cond
|
(cond
|
||||||
((null? args) (error "Invalid macro body"))
|
((null? args) (error "Invalid macro body"))
|
||||||
((keyword? (car args))
|
((keyword? (car args))
|
||||||
(if (eq? (car args) kw)
|
(if (eq? (car args) kw)
|
||||||
(cadr args)
|
(cadr args)
|
||||||
(kw-arg-ref kw (cddr args) def)))
|
(kw-arg-ref kw (cddr args) def)))
|
||||||
((eq? kw #f def) ;; asking for the body
|
((eq? kw #f def) ;; asking for the body
|
||||||
args)
|
args)
|
||||||
(else def))) ;; kw not found
|
(else def))) ;; kw not found
|
||||||
`((@ (statprof) statprof)
|
(issue-deprecation-warning
|
||||||
(lambda () ,@(kw-arg-ref #f args #f))
|
"`with-statprof' is deprecated. Use `statprof' instead.")
|
||||||
#:loop ,(kw-arg-ref #:loop args 1)
|
`((@ (statprof) statprof)
|
||||||
#:hz ,(kw-arg-ref #:hz args 100)
|
(lambda () ,@(kw-arg-ref #f args #f))
|
||||||
#:count-calls? ,(kw-arg-ref #:count-calls? args #f)
|
#:loop ,(kw-arg-ref #:loop args 1)
|
||||||
#:full-stacks? ,(kw-arg-ref #:full-stacks? args #f)))
|
#:hz ,(kw-arg-ref #:hz args 100)
|
||||||
|
#:count-calls? ,(kw-arg-ref #:count-calls? args #f)))
|
||||||
|
(export with-statprof))
|
||||||
|
|
||||||
(define* (gcprof thunk #:key (loop 1) full-stacks? (port (current-output-port)))
|
(define* (gcprof thunk #:key (loop 1) full-stacks? (port (current-output-port)))
|
||||||
"Do an allocation profile of the execution of @var{thunk}.
|
"Do an allocation profile of the execution of @var{thunk}.
|
||||||
|
|
|
@ -146,35 +146,7 @@
|
||||||
(car in)
|
(car in)
|
||||||
out)))))))
|
out)))))))
|
||||||
|
|
||||||
;; So, dear reader. It is pleasant indeed around this fire or at this
|
;;; FIXME: Re-write uses of `record-case' to use `match' instead.
|
||||||
;; cafe or in this room, is it not? I think so too.
|
|
||||||
;;
|
|
||||||
;; This macro used to generate code that looked like this:
|
|
||||||
;;
|
|
||||||
;; `(((record-predicate ,record-type) ,r)
|
|
||||||
;; (let ,(map (lambda (slot)
|
|
||||||
;; (if (pair? slot)
|
|
||||||
;; `(,(car slot) ((record-accessor ,record-type ',(cadr slot)) ,r))
|
|
||||||
;; `(,slot ((record-accessor ,record-type ',slot) ,r))))
|
|
||||||
;; slots)
|
|
||||||
;; ,@body)))))
|
|
||||||
;;
|
|
||||||
;; But this was a hot spot, so computing all those predicates and
|
|
||||||
;; accessors all the time was getting expensive, so we did a terrible
|
|
||||||
;; thing: we decided that since above we're already defining accessors
|
|
||||||
;; and predicates with computed names, we might as well just rely on that fact here.
|
|
||||||
;;
|
|
||||||
;; It's a bit nasty, I agree. But it is fast.
|
|
||||||
;;
|
|
||||||
;;scheme@(guile-user)> (with-statprof #:hz 1000 #:full-stacks? #t (resolve-module '(oop goops)))% cumulative self
|
|
||||||
;; time seconds seconds name
|
|
||||||
;; 8.82 0.03 0.01 glil->assembly
|
|
||||||
;; 8.82 0.01 0.01 record-type-fields
|
|
||||||
;; 5.88 0.01 0.01 %compute-initargs
|
|
||||||
;; 5.88 0.01 0.01 list-index
|
|
||||||
|
|
||||||
|
|
||||||
;;; So ugly... but I am too ignorant to know how to make it better.
|
|
||||||
(define-syntax record-case
|
(define-syntax record-case
|
||||||
(lambda (x)
|
(lambda (x)
|
||||||
(syntax-case x ()
|
(syntax-case x ()
|
||||||
|
|
Loading…
Add table
Add a link
Reference in a new issue