Overview of Pprof
pprof is Go’s built-in profiling tool: the standard library’s runtime/pprof and net/http/pprof packages collect profiling data, and the go tool pprof command is used to view it. It can collect different types of profiles, including:
- CPU Profile: Measures where the program spends most of its time.
- Memory Profile: Measures the amount of memory allocated and retained.
- Block Profile: Measures where the program spends time waiting for synchronization primitives.
- Mutex Profile: Measures contention on mutexes.
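As a rough illustration of these profile types, the runtime/pprof package can list the named profiles the runtime keeps. This is a minimal sketch; the helper name profileNames is hypothetical. The CPU profile does not appear in this list because it is started and stopped separately.

```go
package main

import (
	"fmt"
	"runtime/pprof"
)

// profileNames is a hypothetical helper that returns the names of the
// profiles currently registered with the runtime (heap, block, mutex, ...).
func profileNames() []string {
	var names []string
	for _, p := range pprof.Profiles() {
		names = append(names, p.Name())
	}
	return names
}

func main() {
	for _, name := range profileNames() {
		fmt.Println(name)
	}
}
```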
Setting Up pprof
To use pprof, you need to import the net/http/pprof package and set up an HTTP server to serve the profiling data.
Step 1: Import the pprof Package
import (
	"log"
	"net/http"
	_ "net/http/pprof" // This blank import registers the `pprof` HTTP endpoints on http.DefaultServeMux
)
Step 2: Start the HTTP Server
func main() {
	go func() {
		// ListenAndServe blocks while the server is running and always returns a non-nil error when it stops.
		// We use `log.Fatal` here to ensure that the program exits if the server fails to start.
		log.Fatal(http.ListenAndServe("localhost:6060", nil))
	}()
	// Your application logic here...
}
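If the application serves its own mux rather than http.DefaultServeMux, the pprof handlers can also be registered explicitly. The following is a sketch under that assumption; newDebugMux and statusFor are hypothetical helpers, and the request is sent in memory so the example runs without opening a port.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newDebugMux is a hypothetical helper that registers the pprof handlers on
// a dedicated mux. Importing net/http/pprof still registers them on
// http.DefaultServeMux as a side effect of the package's init function.
func newDebugMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// statusFor sends an in-memory GET request to the mux and returns the
// HTTP status code of the response.
func statusFor(mux *http.ServeMux, path string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	mux := newDebugMux()
	// In a real program, pass mux to http.ListenAndServe("localhost:6060", mux).
	fmt.Println("index status:", statusFor(mux, "/debug/pprof/"))
	fmt.Println("heap status:", statusFor(mux, "/debug/pprof/heap"))
}
```

Binding the debug server to localhost, as the tutorial does, keeps these endpoints off the public network.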
Writing Code to Profile
For demonstration, we’ll use a CPU-intensive function.
func expensiveFunction() int {
	sum := 0
	for i := 0; i < 1000000; i++ {
		// Simulating some CPU intensive computation
		sum += i * i
	}
	return sum
}
func main() {
	go func() {
		log.Fatal(http.ListenAndServe("localhost:6060", nil))
	}()
	// Call the function repeatedly to simulate CPU usage. The loop runs until
	// you stop the program (Ctrl+C), so the process stays alive while a profile is collected.
	for {
		expensiveFunction()
	}
}
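For a program that exits too quickly to be profiled over HTTP, one alternative is to record the CPU profile directly with runtime/pprof. This is a minimal sketch, not part of the setup above: profileRun is a hypothetical helper, and it writes the profile to an in-memory buffer, whereas a real program would more likely write to a file and open that file with go tool pprof.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// expensiveFunction simulates CPU-intensive work.
func expensiveFunction() int {
	sum := 0
	for i := 0; i < 1000000; i++ {
		sum += i * i
	}
	return sum
}

// profileRun is a hypothetical helper that records a CPU profile in memory
// while calling expensiveFunction the given number of times.
func profileRun(calls int) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	for i := 0; i < calls; i++ {
		expensiveFunction()
	}
	pprof.StopCPUProfile() // stops sampling and flushes the profile into buf
	return buf.Bytes(), nil
}

func main() {
	data, err := profileRun(200)
	if err != nil {
		fmt.Println("profiling failed:", err)
		return
	}
	fmt.Printf("recorded %d bytes of CPU profile data\n", len(data))
}
```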
Profiling the Application
While your program is running, you can access the pprof HTTP endpoints at http://localhost:6060/debug/pprof/.
Profiling CPU Usage
- Generate a CPU profile by visiting http://localhost:6060/debug/pprof/profile in your browser.
- Set the duration for which you want to collect the profile with the seconds query parameter. For example, http://localhost:6060/debug/pprof/profile?seconds=30 collects for 30 seconds (the default) and then downloads the profile.
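The same request can be made from code. The sketch below assumes a hypothetical helper, fetchCPUProfile, that sets the seconds parameter on the profile URL; collectSample starts a throwaway in-process server only so the example runs on its own, where a real program would point at http://localhost:6060.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// fetchCPUProfile is a hypothetical helper that downloads a CPU profile
// collected over the given number of seconds from a pprof base URL.
func fetchCPUProfile(baseURL string, seconds int) ([]byte, error) {
	url := fmt.Sprintf("%s/debug/pprof/profile?seconds=%d", baseURL, seconds)
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// collectSample serves the profile endpoint from a temporary local server
// and fetches a profile from it.
func collectSample(seconds int) ([]byte, error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	srv := httptest.NewServer(mux)
	defer srv.Close()
	return fetchCPUProfile(srv.URL, seconds)
}

func main() {
	data, err := collectSample(1)
	if err != nil {
		fmt.Println("collect failed:", err)
		return
	}
	fmt.Printf("downloaded %d bytes of profile data\n", len(data))
}
```

The request blocks for the whole collection window, so the HTTP client must not time out before the requested number of seconds has passed.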
Profiling Memory Usage
- Generate a memory profile by visiting http://localhost:6060/debug/pprof/heap in your browser.
- Capture the profile and download the snapshot.
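A heap snapshot can also be taken from inside the program through runtime/pprof. The following is a sketch under stated assumptions: heapSnapshot and the retained variable are hypothetical names, and the allocation sizes are arbitrary.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// retained keeps allocations reachable so they count as in-use memory.
var retained [][]byte

// heapSnapshot is a hypothetical helper that returns the heap profile in the
// binary format that go tool pprof reads (debug level 0).
func heapSnapshot() ([]byte, error) {
	runtime.GC() // run a collection first so the profile statistics are up to date
	var buf bytes.Buffer
	if err := pprof.Lookup("heap").WriteTo(&buf, 0); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	for i := 0; i < 16; i++ {
		retained = append(retained, make([]byte, 1<<20)) // hold on to 16 MiB in total
	}
	data, err := heapSnapshot()
	if err != nil {
		fmt.Println("snapshot failed:", err)
		return
	}
	fmt.Printf("heap snapshot is %d bytes\n", len(data))
}
```

Writing the returned bytes to a file gives something that go tool pprof can open in the same way as the downloaded snapshot.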
Analyzing the Profile
Use the go tool pprof command to analyze the profile.
Analyzing CPU Profile
go tool pprof http://localhost:6060/debug/pprof/profile
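This collects a CPU profile (30 seconds by default) and then opens an interactive prompt. Use various commands to explore the data:
- top: Lists the top functions by self (flat) or total (cumulative) CPU time.
- list <function_name>: Shows the annotated source code of the functions matching the given name, which is treated as a regular expression.
- web: Renders the call graph as an SVG and opens it in your browser (requires Graphviz).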
Analyzing Memory Profile
go tool pprof http://localhost:6060/debug/pprof/heap
Analyzing Concurrency Performance
Step 1: Enable pprof HTTP Endpoints
import (
	"log"
	"net/http"
	_ "net/http/pprof" // This blank import enables the HTTP endpoints for `pprof`
	"time"
)
func startProfiler() {
	// Start the HTTP server in a separate goroutine
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()
}
Step 2: Write Concurrent Code
Create a Go program that uses concurrency features like goroutines and channels.
func worker(id int, done chan bool) {
	for {
		// Simulating work by sleeping for a bit.
		time.Sleep(time.Millisecond * 10)
		// Log the work being processed.
		log.Printf("Worker %d is processing", id)
		// Signal that the work is done.
		done <- true
	}
}
func main() {
	startProfiler()
	numWorkers := 10000
	done := make(chan bool, numWorkers)
	for i := 0; i < numWorkers; i++ {
		go worker(i, done)
	}
	for i := 0; i < numWorkers; i++ {
		<-done
	}
}
Step 3: Collect Concurrency Profiles
Mutex Profile
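Run the Mutex Profiler: Access http://localhost:6060/debug/pprof/mutex to view the mutex profile, which shows contention points for mutexes. Mutex profiling is disabled by default, so call runtime.SetMutexProfileFraction with a positive value (for example, 1) at startup; otherwise the profile is empty.
go tool pprof http://localhost:6060/debug/pprof/mutex
Analyze Contention: Look for high contention values, which indicate that goroutines are frequently blocked waiting for a mutex.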
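To make the mutex profile concrete, here is a minimal sketch that turns on mutex profiling and forces a few contention events. The helpers contend and mutexStacks are hypothetical, and the hold time is an arbitrary value chosen so the waiting goroutine has time to block.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
	"sync"
	"time"
)

// contend is a hypothetical helper: it holds a mutex while a second
// goroutine waits for it, which produces a contention event on unlock.
func contend(hold time.Duration) {
	var mu sync.Mutex
	started := make(chan struct{})
	done := make(chan struct{})

	mu.Lock()
	go func() {
		close(started)
		mu.Lock() // blocks until the first goroutine unlocks
		mu.Unlock()
		close(done)
	}()
	<-started
	time.Sleep(hold) // give the waiter time to block on the mutex
	mu.Unlock()
	<-done
}

// mutexStacks enables mutex profiling, causes contention for the given number
// of rounds, and returns how many distinct stacks the mutex profile holds.
func mutexStacks(rounds, holdMillis int) int {
	prev := runtime.SetMutexProfileFraction(1) // 1 = record every contention event
	defer runtime.SetMutexProfileFraction(prev)
	for i := 0; i < rounds; i++ {
		contend(time.Duration(holdMillis) * time.Millisecond)
	}
	return pprof.Lookup("mutex").Count()
}

func main() {
	fmt.Println("contended stacks recorded:", mutexStacks(5, 20))
}
```

A fraction of 1 records every event, which is convenient for a demonstration; a larger value samples fewer events and adds less overhead.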
Goroutine Profile
Run the Goroutine Profiler: Access http://localhost:6060/debug/pprof/goroutine to get a report of all current goroutines.
go tool pprof http://localhost:6060/debug/pprof/goroutine
Analyze Goroutine Counts: Check for an unusually high number of goroutines, which might indicate a leak.
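The goroutine count can also be checked from inside the program. The sketch below imitates a leak with goroutines blocked on a channel; leak and leakedCount are hypothetical helpers, and the count of 100 is arbitrary.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
)

// leak is a hypothetical helper that starts n goroutines blocked on a
// channel, imitating a goroutine leak. Closing the returned channel
// releases them.
func leak(n int) chan struct{} {
	stop := make(chan struct{})
	for i := 0; i < n; i++ {
		go func() { <-stop }()
	}
	return stop
}

// leakedCount reports how much the goroutine count grows after leak(n).
func leakedCount(n int) int {
	before := runtime.NumGoroutine()
	stop := leak(n)
	after := runtime.NumGoroutine()
	close(stop)
	return after - before
}

func main() {
	fmt.Println("goroutine count grew by:", leakedCount(100))

	stop := leak(100)
	defer close(stop)
	// Debug level 1 prints a text summary that groups goroutines by stack.
	if err := pprof.Lookup("goroutine").WriteTo(os.Stdout, 1); err != nil {
		fmt.Println("write failed:", err)
	}
}
```

A count that keeps growing between two snapshots taken some time apart is a stronger leak signal than a single large number.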
Step 4: Use pprof to Analyze Data
Use the top, list, and web commands at the (pprof) prompt to explore the collected profile, for example:
(pprof) top
Showing nodes accounting for 77021, 96.14% of 80112 total
Step 5: Optimize Concurrency
- Reduce Lock Contention: Redesign your code to reduce the need for locking or use finer-grained locks.
- Optimize Channel Usage: Ensure that channel operations are efficient and that goroutines are not blocked waiting for channel operations unnecessarily.
- Limit Goroutine Creation: Use a pool of worker goroutines or limit the creation of new goroutines.
- Avoid Blocking Operations: Ensure that blocking operations are handled correctly and do not unnecessarily block other goroutines.
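The advice to limit goroutine creation can be sketched as a fixed-size worker pool. This is one possible shape, not the only one; processAll is a hypothetical helper and squaring a number stands in for real work.

```go
package main

import (
	"fmt"
	"sync"
)

// processAll is a hypothetical helper that squares each input using a fixed
// pool of worker goroutines instead of one goroutine per item.
func processAll(inputs []int, workers int) []int {
	results := make([]int, len(inputs))
	jobs := make(chan int) // carries indexes into inputs

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is handled by exactly one worker, so no lock is needed.
				results[i] = inputs[i] * inputs[i]
			}
		}()
	}
	for i := range inputs {
		jobs <- i
	}
	close(jobs) // lets the workers' range loops finish
	wg.Wait()
	return results
}

func main() {
	fmt.Println(processAll([]int{1, 2, 3, 4, 5}, 2))
}
```

Because each worker writes to its own slot in results, the pool needs no mutex, which also addresses the lock contention point in the list above.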
Conclusion
pprof is a powerful tool for understanding and optimizing the performance of your Go applications.