Automatic Differentiation
Table of Contents
What Is Automatic Differentiation
Comparison of Three Differentiation Methods
Computational Graph
Forward Mode AD
    Dual Numbers
    Forward Mode Computation
    Characteristics of Forward Mode
Reverse Mode AD
    Reverse Mode Computation
    Characteristics of Reverse Mode
Forward Mode vs. Reverse Mode
Chain Rule and Backpropagation
    Mathematical Derivation
    Example: A Simple Two-Layer Network
Automatic Differentiation in PyTorch
    Basic Usage
    Gradient Accumulation and Zeroing
    The torch.no_grad() Context
    The detach() Method
Common Pitfalls
    1. In-place Operations Breaking the Computational Graph
    2. detach() vs. no_grad()
    3. Proper Use of Gradient Accumulation
    4. Calling backward() on Non-scalar Outputs