Title: | Amdahl's Profiler, Directed Optimization Made Easy |
---|---|
Description: | Assists the evaluation of whether and where to focus code optimization, using Amdahl's law and visual aids based on line profiling. Amdahl's profiler organizes profiling output files (including memory profiling) in a visually appealing way. It is meant to help to balance development vs. execution time by helping to identify the most promising sections of code to optimize and projecting potential gains. The package is an addition to R's standard profiling tools and is not a wrapper for them. |
Authors: | Marco D. Visser |
Maintainer: | Marco D. Visser <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.4.1 |
Built: | 2024-11-13 02:43:03 UTC |
Source: | https://github.com/marcodvisser/aprof |
Create 'aprof' objects for usage with 'aprof' functions
aprof(src = NULL, output = NULL)
src |
The name of the source code file (and path if not in the working
directory). The source code file is expected to be a
plain text file (e.g. .txt, .R) containing the code of the
previously profiled program. If left empty, some "aprof" functions
(e.g. |
output |
The file name (and path if not in the working directory) of a previously created profiling exercise. |
Creates an "aprof" object from the R-profiler's output and a source file.
The objects created through "aprof" can be used by the standard functions
plot, summary and print (more specifically: plot.aprof, summary.aprof and print.aprof).
See the example below for more details.
Using aprof with knitr and within .Rmd or .Rnw documents is not yet supported by the R profiler. Note that setting the chunk option engine="Rscript" disables line profiling. Line profiling only works in an interactive session (as of Oct 2015). In these cases users are advised to use the standard Rprof functions or "profr" (while setting engine="Rscript") and not to rely on packages based on line profiling for the time being.
An aprof object
Marco D. Visser
plot.aprof, summary.aprof, print.aprof, Rprof and summaryRprof.
## Not run:
## create function to profile
foo <- function(N){
  preallocate<-numeric(N)
  grow<-NULL
  for(i in 1:N){
    preallocate[i]<-N/(i+1)
    grow<-c(grow,N/(i+1))
  }
}

## save function to a source file and reload
dump("foo",file="foo.R")
source("foo.R")

## create file to save profiler output
tmp<-tempfile()

## Profile the function
Rprof(tmp,line.profiling=TRUE)
foo(1e4)
## stop profiling (a NULL filename disables the profiler)
Rprof(NULL)

## Create an aprof object
fooaprof<-aprof("foo.R",tmp)

## display basic information, summarize and plot the object
fooaprof
summary(fooaprof)
plot(fooaprof)
profileplot(fooaprof)

## To continue with memory profiling:
## enable memory.profiling=TRUE
Rprof(tmp,line.profiling=TRUE,memory.profiling=TRUE)
foo(1e4)
Rprof(NULL)

## Create an aprof object
fooaprof<-aprof("foo.R",tmp)

## display basic information, and plot memory usage
fooaprof
plot(fooaprof)
## End(Not run)
Generic lower-level function to test whether an object is an aprof object.
is.aprof(object)
object |
Object to test |
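A class test of this kind is typically a thin wrapper around base R's inherits(). The sketch below illustrates the idea only and is not taken from the package source; is_aprof_like and the hand-built object are hypothetical names used for demonstration.

```r
# Hypothetical stand-in for a class test; NOT the package's implementation.
is_aprof_like <- function(object) {
  inherits(object, "aprof")
}

# A hand-built object carrying the "aprof" class attribute (illustration only).
fake <- structure(list(), class = "aprof")

is_aprof_like(fake)    # TRUE: the class attribute contains "aprof"
is_aprof_like(list())  # FALSE: a plain list has no such class
```

Such a check is useful at the top of functions that accept an aprof object, so that a wrong argument fails early with a clear message.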
Plot execution time, or total MB usage when memory profiling, per line of code from a previously profiled source file. The plot visually shows bottlenecks in a program's execution time, shown directly next to the code of the source file.
## S3 method for class 'aprof'
plot(x, y, ...)
x |
An aprof object as returned by aprof(). If this object contains both memory and time profiling information, both will be plotted (as proportions of total time and total memory allocations). |
y |
Unused; currently ignored. |
... |
Additional arguments. Currently unused. |
Marco D. Visser
## Not run:
# create function to profile
foo <- function(N){
  preallocate<-numeric(N)
  grow<-NULL
  for(i in 1:N){
    preallocate[i]<-N/(i+1)
    grow<-c(grow,N/(i+1))
  }
}

## save function to a source file and reload
dump("foo",file="foo.R")
source("foo.R")

## create file to save profiler output
tmp<-tempfile()

## Profile the function
Rprof(tmp,line.profiling=TRUE)
foo(1e4)
## stop profiling (a NULL filename disables the profiler)
Rprof(NULL)

## Create an aprof object
fooaprof<-aprof("foo.R",tmp)
plot(fooaprof)
## End(Not run)
Function that makes a pretty table, and returns some basic information.
## S3 method for class 'aprof'
print(x, ...)
x |
An aprof object returned by the
function |
... |
Additional printing arguments. Unused. |
A profile plot describing the progression through each code line during the execution of the program.
profileplot(aprofobject)
aprofobject |
An aprof object returned by the function
|
Given that a source code file was specified in an "aprof" object,
this function will estimate when each line was executed. It
identifies the largest bottleneck and indicates this
on the plot with red markings (y-axis).
R uses a statistical profiler which, using system interrupts,
temporarily stops execution of a program at fixed intervals.
This is a profiling technique that results in samples of "the call stack"
every time the system was stopped. The function profileplot uses
these samples to reconstruct the progression through the
program. Note that the best results are obtained when a sufficient number of
samples has been taken (relative to the length of the source code).
Use print.aprof to see how many samples (termed "Calls") of
the call stack were taken.
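The sampling behaviour described above can be observed with base R alone. The following is a minimal sketch, assuming a hypothetical workload function busy() and a 10 ms sampling interval; it counts how many call-stack samples the profiler wrote, which gives a feel for whether a run was long enough to profile meaningfully.

```r
# Base R only: count the call-stack samples recorded by the sampling profiler.
busy <- function(n) {              # hypothetical workload, for illustration
  x <- numeric(0)
  for (i in seq_len(n)) x <- c(x, sqrt(i))
  invisible(x)
}

tmp <- tempfile()
Rprof(tmp, interval = 0.01)        # interrupt and sample every 10 ms
busy(2e4)
Rprof(NULL)                        # stop profiling

prof_lines <- readLines(tmp)
# Without line or memory profiling, the first line is a header
# (e.g. the sampling interval) and each remaining line is one sampled stack.
n_samples <- length(prof_lines) - 1
n_samples
unlink(tmp)
```

If n_samples is small, a shorter interval or a larger workload yields more samples, at the cost of extra profiling overhead or run time.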
Marco D. Visser
## Not run:
# create function to profile
foo <- function(N){
  preallocate<-numeric(N)
  grow<-NULL
  for(i in 1:N){
    preallocate[i]<-N/(i+1)
    grow<-c(grow,N/(i+1))
  }
}

# save function to a source file and reload
dump("foo",file="foo.R")
source("foo.R")

# create file to save profiler output
tmp<-tempfile()

# Profile the function
Rprof(tmp,line.profiling=TRUE)
foo(1e4)
# stop profiling (a NULL filename disables the profiler)
Rprof(NULL)

# Create an aprof object
fooaprof<-aprof("foo.R",tmp)
profileplot(fooaprof)
## End(Not run)
Reads and calculates the line density (in execution time or memory)
of an aprof object returned by the aprof function.
If a source file was not specified in the aprof object, then the first file
within the profiling information is assumed to be the source.
readLineDensity(aprofobject = NULL, Memprof = FALSE)
aprofobject |
An object returned by |
Memprof |
Logical. Should the function return information specific to memory profiling with memory use per line in MB? Otherwise, the default is to return line call density and execution time per line. |
Marco D. Visser
summary.aprof, projections of code optimization gains.
## S3 method for class 'aprof'
summary(object, ...)
object |
An object returned by the function |
... |
Additional [and unused] arguments. |
Summarizes an "aprof" object and returns a table with the theoretical maximal improvement in execution time for the entire profiled program when a given line of code is sped up by a factor (called S in the output). Calculations use R's profiler output and require line profiling to be switched on. Expected improvements are estimated for the entire program using Amdahl's law (Amdahl 1967). The table aims to answer whether it is worthwhile to spend hours optimizing bits of code (e.g. refactoring in C) and, additionally, identifies where these efforts should be focused. Using aprof one can get estimates of the maximum possible gain. Such considerations are important when one wishes to balance development time vs. execution time. All predictions are subject to the scaling of the problem at the time of profiling.
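Amdahl's law itself is easy to compute by hand. The sketch below uses base R only and is not the package's code; amdahl_speedup, p (the share of total run time spent in one line) and the example values are assumptions chosen for illustration. It shows why the gain for the whole program is capped at 1 / (1 - p) however large S becomes.

```r
# Amdahl's law: overall speed-up of a program when a fraction p of its
# run time is accelerated by a factor S. Illustrative helper, not package code.
amdahl_speedup <- function(p, S) {
  stopifnot(p >= 0, p <= 1, S > 0)
  1 / ((1 - p) + p / S)
}

p <- 0.80                          # assumed share of run time spent in one line
S <- c(2, 4, 8, 16)                # candidate speed-up factors for that line
round(amdahl_speedup(p, S), 2)     # 1.67 2.50 3.33 4.00
amdahl_speedup(p, Inf)             # upper bound 1 / (1 - p), i.e. about 5
```

In this example, making the line 16 times faster speeds up the whole program only fourfold, which is the kind of trade-off the summary table is meant to expose.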
Marco D. Visser
Amdahl, Gene (1967). Validity of the Single Processor Approach to Achieving Large-Scale Computing Capabilities. AFIPS Conference Proceedings (30): 483-485.
Allows a detailed look into certain lines of code, which have previously been identified as bottlenecks in combination with a source file.
targetedSummary(target = NULL, aprofobject = NULL, findParent = FALSE)
target |
The specific line of code to take a detailed look
at. This can be identified using |
aprofobject |
object of class "aprof" returned by
the function |
findParent |
Logical, should an attempt be made to find the parent of a function call? E.g. "lm" would be a parent call of "lm.fit" or "mean" a parent call of "mean.default". Note that currently, the option only returns the most frequently associated parent call when multiple unique parents exist. |
Marco D. Visser