Systems Performance, 2nd Edition, by Brendan Gregg: Table of Contents

Contents at a Glance

Detailed Contents

1. Introduction

2. Methodologies

  • 2.3.5 Level of Appropriateness

3. Operating Systems

4. Observability Tools

5. Applications

6. CPUs

  • 6.6.5 ps
  • 6.6.21 Other Tools

7. Memory

8. File Systems

9. Disks

9.1 Terminology

9.2 Models

9.2.1 Simple Disk

9.2.2 Caching Disk

9.2.3 Controller

9.3 Concepts

9.3.1 Measuring Time

9.3.2 Time Scales

9.3.3 Caching

9.3.4 Random vs. Sequential I/O

9.3.5 Read/Write Ratio

9.3.6 I/O Size

9.3.7 IOPS Are Not Equal

9.3.8 Non-Data-Transfer Disk Commands

9.3.9 Utilization

9.3.10 Saturation

9.3.11 I/O Wait

9.3.12 Synchronous vs. Asynchronous

9.3.13 Disk vs. Application I/O

9.4 Architecture

9.4.1 Disk Types

9.4.2 Interfaces

9.4.3 Storage Types

9.4.4 Operating System Disk I/O Stack

9.5 Methodology

9.5.1 Tools Method

9.5.2 USE Method

9.5.3 Performance Monitoring

9.5.4 Workload Characterization

9.5.5 Latency Analysis

9.5.6 Static Performance Tuning

9.5.7 Cache Tuning

9.5.8 Resource Controls

9.5.9 Micro-Benchmarking

9.5.10 Scaling

9.6 Observability Tools

9.6.1 iostat

9.6.2 sar

9.6.3 PSI

9.6.4 pidstat

9.6.5 perf

9.6.6 biolatency

9.6.7 biosnoop

9.6.8 iotop, biotop

9.6.9 biostacks

9.6.10 blktrace

9.6.11 bpftrace

9.6.12 MegaCli

9.6.13 smartctl

9.6.14 SCSI Logging

9.6.15 Other Tools

9.7 Visualizations

9.7.1 Line Graphs

9.7.2 Latency Scatter Plots

9.7.3 Latency Heat Maps

9.7.4 Offset Heat Maps

9.7.5 Utilization Heat Maps

9.8 Experimentation

9.8.1 Ad Hoc

9.8.2 Custom Load Generators

9.8.3 Micro-Benchmark Tools

9.8.4 Random Read Example

9.8.5 ioping

9.8.6 fio

9.8.7 blkreplay

9.9 Tuning

9.9.1 Operating System Tunables

9.9.2 Disk Device Tunables

9.9.3 Disk Controller Tunables

9.10 Exercises

9.11 References

10. Network

10.1 Terminology

10.2 Models

10.2.1 Network Interface

10.2.2 Controller

10.2.3 Protocol Stack

10.3 Concepts

10.3.1 Networks and Routing

10.3.2 Protocols

10.3.3 Encapsulation

10.3.4 Packet Size

10.3.5 Latency

10.3.6 Buffering

10.3.7 Connection Backlog

10.3.8 Interface Negotiation

10.3.9 Congestion Avoidance

10.3.10 Utilization

10.3.11 Local Connections

10.4 Architecture

10.4.1 Protocols

10.4.2 Hardware

10.4.3 Software

10.5 Methodology

10.5.1 Tools Method

10.5.2 USE Method

10.5.3 Workload Characterization

10.5.4 Latency Analysis

10.5.5 Performance Monitoring

10.5.6 Packet Sniffing

10.5.7 TCP Analysis

10.5.8 Static Performance Tuning

10.5.9 Resource Controls

10.5.10 Micro-Benchmarking

10.6 Observability Tools

10.6.1 ss

10.6.2 ip

10.6.3 ifconfig

10.6.4 nstat

10.6.5 netstat

10.6.6 sar

10.6.7 nicstat

10.6.8 ethtool

10.6.9 tcplife

10.6.10 tcptop

10.6.11 tcpretrans

10.6.12 bpftrace

10.6.13 tcpdump

10.6.14 Wireshark

10.6.15 Other Tools

10.7 Experimentation

10.7.1 ping

10.7.2 traceroute

10.7.3 pathchar

10.7.4 iperf

10.7.5 netperf

10.7.6 tc

10.7.7 Other Tools

10.8 Tuning

10.8.1 System-Wide

10.8.2 Socket Options

10.8.3 Configuration

10.9 Exercises

10.10 References

11. Cloud Computing

12. Benchmarking

13. perf

13.1 Subcommands Overview

13.2 One-Liners

13.3 perf Events

13.4 Hardware Events

13.4.1 Frequency Sampling

13.5 Software Events

13.6 Tracepoint Events

13.7 Probe Events

13.7.1 kprobes

13.7.2 uprobes

13.7.3 USDT

13.8 perf stat

13.8.1 Options

13.8.2 Interval Statistics

13.8.3 Per-CPU Balance

13.8.4 Event Filters

13.8.5 Shadow Statistics

13.9 perf record

13.9.1 Options

13.9.2 CPU Profiling

13.9.3 Stack Walking

13.10 perf report

13.10.1 TUI

13.10.2 STDIO

13.11 perf script

13.11.1 Flame Graphs

13.11.2 Trace Scripts

13.12 perf trace

13.12.1 Kernel Versions

13.13 Other Commands

13.14 perf Documentation

13.15 References

14. Ftrace

14.1 Capabilities Overview

14.2 tracefs (/sys)

14.2.1 tracefs Contents

14.3 Ftrace Function Profiler

14.4 Ftrace Function Tracing

14.4.1 Using trace

14.4.2 Using trace_pipe

14.4.3 Options

14.5 Tracepoints

14.5.1 Filter

14.5.2 Trigger

14.6 kprobes

14.6.1 Event Tracing

14.6.2 Arguments

14.6.3 Return Values

14.6.4 Filters and Triggers

14.6.5 kprobe Profiling

14.7 uprobes

14.7.1 Event Tracing

14.7.2 Arguments and Return Values

14.7.3 Filters and Triggers

14.7.4 uprobe Profiling

14.8 Ftrace function_graph

14.8.1 Graph Tracing

14.8.2 Options

14.9 Ftrace hwlat

14.10 Ftrace Hist Triggers

14.10.1 Single Keys

14.10.2 Fields

14.10.3 Modifiers

14.10.4 PID Filters

14.10.5 Multiple Keys

14.10.6 Stack Trace Keys

14.10.7 Synthetic Events

14.11 trace-cmd

14.11.1 Subcommands Overview

14.11.2 trace-cmd One-Liners

14.11.3 trace-cmd vs. perf(1)

14.11.4 trace-cmd function_graph

14.11.5 KernelShark

14.11.6 trace-cmd Documentation

14.12 perf ftrace

14.13 perf-tools

14.13.1 Tool Coverage

14.13.2 Single-Purpose Tools

14.13.3 Multi-Purpose Tools

14.13.4 perf-tools One-Liners

14.13.5 Example

14.13.6 perf-tools vs. BCC/BPF

14.13.7 Documentation

14.14 Ftrace Documentation

14.15 References

15. BPF

15.1 BCC

15.1.1 Installation

15.1.2 Tool Coverage

15.1.3 Single-Purpose Tools

15.1.4 Multi-Purpose Tools

15.1.5 One-Liners

15.1.6 Multi-Tool Example

15.1.7 BCC vs. bpftrace

15.1.8 Documentation

15.2 bpftrace

15.2.1 Installation

15.2.2 Tools

15.2.3 One-Liners

15.2.4 Programming

15.2.5 Reference

15.2.6 Documentation

15.3 References

16. Case Study

Appendices

Linux/UNIX commands for assessing system performance include the following (a minimal example of reading the underlying /proc sources directly appears after this list):

  • uptime: how long the system has been running, plus the 1-, 5-, and 15-minute load averages
  • top: overall system view, with processes ranked by resource usage
  • vmstat: reports runnable and blocked processes, memory, paging, block I/O, traps, and CPU activity
  • htop: interactive process viewer
  • dstat, atop: correlate resource usage across processes, memory, paging, block I/O, and CPU activity
  • iftop: interactive network traffic viewer, per interface
  • nethogs: interactive network traffic viewer, per process
  • iotop: interactive disk I/O viewer, per process
  • iostat: storage I/O statistics
  • netstat: network statistics
  • mpstat: per-CPU statistics
  • tload: load average graph for the terminal
  • xload: load average graph for X
  • /proc/loadavg: text file containing the load averages
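
Most of these tools ultimately read from the Linux proc filesystem. The following is a minimal Python sketch (an illustrative assumption, not a tool from the book) that reads /proc/loadavg for the load averages and samples /proc/stat twice to estimate overall CPU utilization, the same raw sources behind uptime, vmstat, and mpstat:

  #!/usr/bin/env python3
  # Minimal sketch, assuming a Linux /proc filesystem as documented in proc(5).
  # Prints the load averages and an estimate of total CPU utilization over one second.
  import time

  def read_loadavg():
      # /proc/loadavg begins e.g. "0.42 0.35 0.30 ..." -> 1-, 5-, 15-minute load averages
      with open("/proc/loadavg") as f:
          one, five, fifteen = f.read().split()[:3]
      return float(one), float(five), float(fifteen)

  def read_cpu_ticks():
      # First line of /proc/stat: "cpu  user nice system idle iowait irq softirq steal ..."
      with open("/proc/stat") as f:
          fields = [int(x) for x in f.readline().split()[1:]]
      idle = fields[3] + fields[4]        # idle + iowait ticks
      return idle, sum(fields)            # (idle ticks, total ticks)

  if __name__ == "__main__":
      print("load averages (1/5/15 min): %.2f %.2f %.2f" % read_loadavg())
      idle0, total0 = read_cpu_ticks()
      time.sleep(1)                       # sample interval, like `vmstat 1`
      idle1, total1 = read_cpu_ticks()
      busy = 1.0 - (idle1 - idle0) / max(total1 - total0, 1)
      print("CPU utilization over interval: %.1f%%" % (100.0 * busy))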



