# Tesla FSD Chip Microarchitecture: A Deep Dive


Tesla’s Full Self-Driving (FSD) Chip is a custom-designed ASIC optimized for vision-based autonomous driving. Here’s a breakdown of its microarchitecture, from silicon to software:
## 1. Key Specifications (HW3/HW4)

| Metric | FSD Chip (HW3) | FSD Chip (HW4) |
| --- | --- | --- |
| Process Node | 14nm (Samsung) | 7nm (Samsung) |
| Die Size | 260 mm² | ~200 mm² (est.) |
| Transistors | 6B | 10B+ (est.) |
| Peak TOPS | 144 TOPS (INT8) | 256 TOPS (INT8) |
| Power Consumption | 36W | 45W (est.) |
| Cameras Supported | 8x 1.2MP | 12x 5MP |
## 2. Block Diagram & Core Components

```plaintext
┌───────────────────────────────────────────────────────┐
│                    Tesla FSD Chip                     │
├───────────────────┬─────────────────┬─────────────────┤
│     Dual NPUs     │       GPU       │    CPU Cores    │
│ (Neural Processor)│    (Custom)     │    (ARM A72)    │
├───────────────────┼─────────────────┼─────────────────┤
│ - 96x96 MAC array │ - 1 TFLOPS      │ - 12x A72       │
│ - 2GHz clock      │   (FP32)        │ - Lockstep for  │
│ - 32MB SRAM cache │ - Texture units │   ASIL-D        │
└───────────────────┴─────────────────┴─────────────────┘
```
## 3. Neural Processing Unit (NPU) – The Secret Sauce

**Array Structure:**
- 96x96 MAC (multiply-accumulate) units per NPU, with two NPUs per chip in HW3; the headline TOPS figures fall straight out of this geometry (see the check after this list).
- Optimized for 8-bit integer (INT8) operations (95% of Tesla's NN workloads).

**On-Chip Memory:**
- 32MB SRAM cache (vs. 4–8MB in competing chips like NVIDIA Xavier).
- Reduces DRAM access latency ~5x (critical for real-time inference).

**Custom ISA:**
- Supports Tesla's HydraNet multi-task learning (simultaneous detection and lane prediction).
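
Here is that back-of-envelope check in Python; the ×2 board factor assumes the dual-chip HW3 computer Tesla described at Autonomy Day, and the small gaps vs. Tesla's quoted 72 TOPS per chip and 144 per computer come down to Tesla's rounding:

```python
# Peak-TOPS sanity check: one multiply + one add = 2 ops per MAC per cycle.
MACS_PER_NPU    = 96 * 96   # 9,216 MAC units in the array
OPS_PER_MAC     = 2         # multiply + accumulate
CLOCK_HZ        = 2.0e9     # 2 GHz NPU clock
NPUS_PER_CHIP   = 2
CHIPS_PER_BOARD = 2         # the HW3 computer carries two redundant chips

tops_npu   = MACS_PER_NPU * OPS_PER_MAC * CLOCK_HZ / 1e12
tops_chip  = tops_npu * NPUS_PER_CHIP
tops_board = tops_chip * CHIPS_PER_BOARD

print(f"per NPU:   {tops_npu:5.1f} TOPS")    # ~36.9
print(f"per chip:  {tops_chip:5.1f} TOPS")   # ~73.7 (Tesla rounds to 72)
print(f"per board: {tops_board:5.1f} TOPS")  # ~147  (Tesla quotes 144)
```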
## 4. CPU & GPU Components

**CPU:**
- 12x ARM Cortex-A72 (64-bit) in triple-redundant lockstep for ASIL-D safety (a conceptual voting sketch follows below).
- Runs lightweight tasks (sensor polling, CAN bus communication).

**GPU:**
- Custom-designed ~1 TFLOPS FP32 unit (not for graphics, but for post-processing).
- Handles non-ML tasks like image warping (for multi-camera stitching).
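
Lockstep itself happens in hardware at the cycle level, but the idea is easy to model in software: run the same computation on redundant units and only act when a majority agrees. A conceptual sketch, not Tesla's implementation (`plan_step` is a made-up stand-in workload):

```python
from collections import Counter

def majority_vote(results):
    """Return the value a strict majority of redundant units agree on."""
    value, count = Counter(results).most_common(1)[0]
    if count <= len(results) // 2:
        raise RuntimeError(f"redundant units disagree: {results}")
    return value

def plan_step(frame):
    # Hypothetical deterministic workload standing in for a real task.
    return sum(frame) % 256

frame = [12, 40, 7]                             # fake sensor frame
outputs = [plan_step(frame) for _ in range(3)]  # three redundant runs
print(majority_vote(outputs))                   # -> 59
```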
## 5. Memory Hierarchy

```plaintext
┌──────────────┐       ┌──────────────┐
│  32MB SRAM   │◄─────►│  Dual NPUs   │   (on-chip)
└──────────────┘       └──────────────┘
        ▲
        │ ~100GB/s bandwidth
        ▼
┌──────────────┐
│  8GB LPDDR4  │   (off-chip)
└──────────────┘
```

- **SRAM-First Design:** Minimizes external memory access (power-efficient); the quick fit check below shows why typical layers never leave the chip.
- **No HBM/GDDR:** Unlike NVIDIA/AMD chips, Tesla prioritizes latency over bandwidth.
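
To see why 32MB keeps the NPUs off DRAM most of the time, tally the working set of one mid-network convolution layer. The layer shape below is illustrative, not taken from Tesla's actual networks:

```python
SRAM_BYTES = 32 * 2**20   # 32 MB of on-chip SRAM

# Illustrative 3x3 conv: 256 -> 256 channels over an 80x45 feature map,
# with weights and activations in INT8 (1 byte per value).
k, cin, cout, h, w = 3, 256, 256, 80, 45

weights  = k * k * cin * cout   # ~0.6 MiB of filter weights
acts_in  = cin * h * w          # ~0.9 MiB input feature map
acts_out = cout * h * w         # ~0.9 MiB output feature map

total = weights + acts_in + acts_out
print(f"{total / 2**20:.1f} MiB = {100 * total / SRAM_BYTES:.0f}% of SRAM")
# -> 2.3 MiB = 7% of SRAM: the whole layer lives on chip, so the NPU
#    never stalls on LPDDR4. A GPU with 4-8 MB of cache has far less room.
```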
## 6. Software Stack Integration

- **Compiler:** Custom toolchain converts PyTorch models to NPU-optimized bytecode (a sketch of the analogous open-source flow follows this list).
- **Real-Time OS:** Lightweight Tesla OS (modified Linux) with <10μs interrupt latency.
- **HydraNet:** Runs 48 neural networks in parallel (e.g., traffic lights, obstacles, depth estimation).
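
Tesla's toolchain is proprietary, but the INT8 conversion it performs is conceptually the same as stock post-training quantization. A minimal sketch using plain PyTorch, where the tiny model is a made-up stand-in and the saved TorchScript file stands in for Tesla's NPU bytecode:

```python
import torch
import torch.nn as nn

class TinyBackbone(nn.Module):
    """Made-up stand-in for a vision backbone, wrapped in quant stubs."""
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # FP32 -> INT8 boundary
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()  # INT8 -> FP32 boundary

    def forward(self, x):
        return self.dequant(self.relu(self.conv(self.quant(x))))

model = TinyBackbone().eval()
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
prepared = torch.quantization.prepare(model)      # insert calibration observers
prepared(torch.randn(1, 3, 64, 64))               # one calibration pass
quantized = torch.quantization.convert(prepared)  # fold weights to INT8 kernels

torch.jit.script(quantized).save("backbone_int8.pt")  # compiler-friendly artifact
```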
## 7. HW3 vs. HW4 Improvements

| Feature | HW3 (2019) | HW4 (2023) |
| --- | --- | --- |
| NPUs | 2x 96x96 MAC | 2x 128x128 MAC (est.) |
| Camera Inputs | 8x 1.2MP (HDR) | 12x 5MP (HDR++) |
| Safety | ASIL-B | ASIL-D |
| Backward Compatible | No | Yes (with HW3 cameras) |
## 8. Benchmark vs. Competitors

| Chip | TOPS (INT8) | Power | SRAM | Use Case |
| --- | --- | --- | --- | --- |
| Tesla FSD HW4 | 256 | 45W | 32MB | Vision-only autonomy |
| NVIDIA Orin | 254 | 60W | 8MB | Multi-sensor fusion |
| Mobileye EyeQ6 | 48 | 10W | 16MB | L2+ ADAS |

**Why Tesla's NPU Wins:**
- ~5x TOPS/mm² efficiency vs. GPUs (dedicated silicon for vision NNs); a per-watt check on the table's own numbers follows below.
- Zero external memory accesses for common ops (e.g., convolutions), thanks to the SRAM-first design.
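
The table lacks competitor die areas, so TOPS/mm² can't be recomputed here, but TOPS/W can be checked with nothing beyond the numbers above (taking the article's 45W estimate at face value):

```python
# Efficiency math using only the benchmark table's figures.
chips = {
    "Tesla FSD HW4":  {"tops": 256, "watts": 45, "sram_mb": 32},
    "NVIDIA Orin":    {"tops": 254, "watts": 60, "sram_mb": 8},
    "Mobileye EyeQ6": {"tops": 48,  "watts": 10, "sram_mb": 16},
}

for name, c in chips.items():
    print(f"{name:16s} {c['tops'] / c['watts']:4.1f} TOPS/W, "
          f"{c['sram_mb']:2d} MB SRAM")
# Tesla FSD HW4     5.7 TOPS/W, 32 MB SRAM
# NVIDIA Orin       4.2 TOPS/W,  8 MB SRAM
# Mobileye EyeQ6    4.8 TOPS/W, 16 MB SRAM
```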
## 9. Limitations & Trade-Offs

- **No LiDAR/Radar Support:** HW4 still lacks hardware accelerators for time-of-flight processing.
- **Fixed-Precision Only:** No FP16/FP32 support in the NPUs (limits future model complexity).
- **Thermal Constraints:** Sustaining 45W requires liquid cooling in the Cybertruck.
## 10. The Dojo Connection

- **Dojo D1 Chip:** Scaled-up version of the FSD NPU (354 TOPS, 1.25TB/s fabric).
- **Training-Inference Symmetry:** Models trained on Dojo map directly onto FSD NPUs.
## Key Takeaways

- **Domain-Specific Design:** Tesla's NPUs are optimized *only* for camera-based autonomy.
- **Memory is King:** 32MB of on-chip SRAM avoids the "memory wall" that bottlenecks GPUs.
- **Vertical Integration:** From silicon (fabbed by Samsung) to software (HydraNet), Tesla controls the whole stack.

For sensor fusion (LiDAR/radar), FPGAs still dominate, but for vision-only scale, Tesla's ASIC approach is unmatched.