From charlesreid1

 
(12 intermediate revisions by one other user not shown)
Line 3: Line 3:
==List of Python Profiling Tools==
==List of Python Profiling Tools==


===Built Into Python===
===cProfile===


Profiling tools provided by Python:
cProfile is a C extension, the recommended method of profiling python programs, and is based on lsprof. It is built into Python.
* cProfile is a C extension, the recommended method of profiling python programs, and is based on lsprof
* <s>profile is a module that adds significant overhead; reports time spent in calls to built-in functions and methods</s>
* <s>hotshot - not developed/supported anymore</s>


Benefits:
Benefits:
Line 17: Line 14:
* As stated by rkern [https://github.com/rkern/line_profiler], "The current profiling tools supported in Python 2.7 and later only time function calls. This is a good first step for locating hotspots in one's program and is frequently all one needs to do to optimize the program. However, sometimes the cause of the hotspot is actually a single line in the function, and that line may not be obvious from just reading the source code."
* As stated by rkern [https://github.com/rkern/line_profiler], "The current profiling tools supported in Python 2.7 and later only time function calls. This is a good first step for locating hotspots in one's program and is frequently all one needs to do to optimize the program. However, sometimes the cause of the hotspot is actually a single line in the function, and that line may not be obvious from just reading the source code."


Link: https://docs.python.org/2/library/profile.html
Link: https://docs.python.org/3/library/profile.html


===Line by Line===
===Line by Line===
Line 31: Line 28:
</pre>
</pre>


To profile:
To profile, decorate a function with @profile:


<source lang="python">
<pre>
@profile
@profile
def slow_function(a, b, c):
def slow_function(a, b, c):
     ...
     ...
</source>
</pre>


This results in a file called <code>script_to_profile.py.lprof</code>, which you can read with the line_prof module:
When you run the program, this will result in a file called <code>name_of_your_script_here.py.lprof</code>, which you can read with the line_prof module:


<pre>
<pre>
$ python -m line_profiler script_to_profile.py.lprof
$ python -m line_profiler name_of_your_script_here.py.lprof
</pre>
</pre>


Link to github: https://github.com/rkern/line_profiler
Link to github: https://github.com/rkern/line_profiler


===Graphviz Dot===
===Gprof to Graphviz Dot===


Gprof2dot:
Gprof2dot:
* Reads all sorts of input formats, outputs to graphviz dot
* Reads all sorts of input formats
* Outputs to graphviz dot


Gprof2dot link: https://github.com/jrfonseca/gprof2dot
Gprof2dot link: https://github.com/jrfonseca/gprof2dot
=Using cProfile=
==Basic usage==
The basic way to use cProfile is to create a profile object, and to enable and disable profiling using that object. The default behavior is to print a summary when it is finished.
<pre>
import cProfile
def factorial(n):
  if(n==1):
    return 1
  else:
    return n*factorial(n-1)
pr = cProfile.Profile()
pr.enable()
factorial(800)
pr.disable()
pr.print_stats()
</pre>
==Profiling code in other files==
You can also profile code in other files by writing a profile wrapper around an import call. For example, the following <code>profile_nqueens.py</code> file profiles a solution to the N queens problem in <code>nqueens.py</code>:
'''profile_nqueens.py'''
<pre>
import cProfile, pstats, StringIO
pr = cProfile.Profile()
pr.enable()
import nqueens
pr.disable()
pr.print_stats()
</pre>
'''nqueens.py'''
<pre>
# Look ma, no modules!
# bottlenecks around 12, when the problem size starts to mushroom
board_size = 10
solutions = []
queens = []
occupied = board_size*[0,]
def explore(depth):
    # base case
    if(depth==board_size):
        # stringify/serialize the solution
        solutions.append("%s"%(queens))
        return
    else:
        attacked = 2*board_size*[0,]
        for i in range(0,len(queens)):
            ix1 = queens[i] + depth - i
            attacked[ix1] = 1
            ix2 = queens[i] - depth + i
            attacked[ix2] = 1
        for row in range(0,board_size):
            if(occupied[row] or attacked[row]):
                continue
            # make a choice
            queens.append(row)
            occupied[row] = 1
            # explore the consequences
            explore(depth+1)
            # unmake the choice
            queens.pop()
            occupied[row] = 0
explore(0)
print "Found %d solutions"%(len(solutions))
#print solutions
</pre>
===Example output===
<pre>
$ python profile_nqueens.py
Duration: 0.398 s
Found 724 solutions
        211789 function calls (176251 primitive calls) in 0.398 seconds
  Ordered by: cumulative time
  ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.398    0.398 /Volumes/noospace/Users/charles/codes/hello-world/python/nqueens.py:1(<module>)
  35539/1    0.325    0.000    0.398    0.398 /Volumes/noospace/Users/charles/codes/hello-world/python/nqueens.py:10(explore)
    69630    0.050    0.000    0.050    0.000 {range}
    35538    0.012    0.000    0.012    0.000 {method 'pop' of 'list' objects}
    36262    0.006    0.000    0.006    0.000 {method 'append' of 'list' objects}
    34816    0.004    0.000    0.004    0.000 {len}
        2    0.000    0.000    0.000    0.000 {time.time}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
</pre>
=Flags=
{{CSFlag}}
{{PythonFlag}}
[[Category:Python]]
[[Category:Programming]]
[[Category:Profiling]]

Latest revision as of 23:43, 20 September 2020

Introduction

List of Python Profiling Tools

cProfile

cProfile is a C extension, the recommended method of profiling python programs, and is based on lsprof. It is built into Python.

Benefits:

  • Easy
  • Built-in

Downsides:

  • As stated by rkern [1], "The current profiling tools supported in Python 2.7 and later only time function calls. This is a good first step for locating hotspots in one's program and is frequently all one needs to do to optimize the program. However, sometimes the cause of the hotspot is actually a single line in the function, and that line may not be obvious from just reading the source code."

Link: https://docs.python.org/3/library/profile.html

Line by Line

line_profiler module:

  • Can time code by adding decorators
  • Dumps a binary file that gives a line by line breakdown of time spent within a given function

To install:

$ pip install line_profiler

To profile, decorate a function with @profile:

@profile
def slow_function(a, b, c):
    ...

When you run the program, this will result in a file called name_of_your_script_here.py.lprof, which you can read with the line_prof module:

$ python -m line_profiler name_of_your_script_here.py.lprof

Link to github: https://github.com/rkern/line_profiler

Gprof to Graphviz Dot

Gprof2dot:

  • Reads all sorts of input formats
  • Outputs to graphviz dot

Gprof2dot link: https://github.com/jrfonseca/gprof2dot

Using cProfile

Basic usage

The basic way to use cProfile is to create a profile object, and to enable and disable profiling using that object. The default behavior is to print a summary when it is finished.

import cProfile

def factorial(n):
  if(n==1):
    return 1
  else:
    return n*factorial(n-1)

pr = cProfile.Profile()
pr.enable()
factorial(800)
pr.disable()
pr.print_stats()

Profiling code in other files

You can also profile code in other files by writing a profile wrapper around an import call. For example, the following profile_nqueens.py file profiles a solution to the N queens problem in nqueens.py:

profile_nqueens.py

import cProfile, pstats, StringIO
pr = cProfile.Profile()
pr.enable()
import nqueens
pr.disable()

pr.print_stats()

nqueens.py

# Look ma, no modules!

# bottlenecks around 12, when the problem size starts to mushroom
board_size = 10

solutions = []
queens = []
occupied = board_size*[0,]

def explore(depth):
    # base case
    if(depth==board_size):
        # stringify/serialize the solution
        solutions.append("%s"%(queens))
        return

    else:
        attacked = 2*board_size*[0,]
        for i in range(0,len(queens)):
            ix1 = queens[i] + depth - i
            attacked[ix1] = 1

            ix2 = queens[i] - depth + i
            attacked[ix2] = 1

        for row in range(0,board_size):
            if(occupied[row] or attacked[row]):
                continue

            # make a choice
            queens.append(row)
            occupied[row] = 1

            # explore the consequences
            explore(depth+1)

            # unmake the choice
            queens.pop()
            occupied[row] = 0



explore(0)
print "Found %d solutions"%(len(solutions))
#print solutions

Example output

$ python profile_nqueens.py
Duration: 0.398 s
Found 724 solutions
         211789 function calls (176251 primitive calls) in 0.398 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.398    0.398 /Volumes/noospace/Users/charles/codes/hello-world/python/nqueens.py:1(<module>)
  35539/1    0.325    0.000    0.398    0.398 /Volumes/noospace/Users/charles/codes/hello-world/python/nqueens.py:10(explore)
    69630    0.050    0.000    0.050    0.000 {range}
    35538    0.012    0.000    0.012    0.000 {method 'pop' of 'list' objects}
    36262    0.006    0.000    0.006    0.000 {method 'append' of 'list' objects}
    34816    0.004    0.000    0.004    0.000 {len}
        2    0.000    0.000    0.000    0.000 {time.time}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


Flags





See also: