NVIDIA A100 (Ampere)

Product Overview

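The NVIDIA A100, released in 2020, is a data center AI accelerator built on the Ampere architecture. It introduced MIG (Multi-Instance GPU) and added TF32 and BF16 support to its third-generation Tensor Cores, alongside the FP16 support carried over from earlier generations. Although superseded by the H100, the A100 remains widely deployed for AI training. It is available in a 40 GB variant with HBM2 memory and an 80 GB variant with HBM2e memory.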

Core Specifications

Parameter	40GB Variant	80GB Variant
Architecture	Ampere GA100	Ampere GA100
Process Node	TSMC 7nm	TSMC 7nm
Transistor Count	54 billion	54 billion
Memory	40 GB HBM2	80 GB HBM2e
Memory Bandwidth	1,555 GB/s	1,935 GB/s (PCIe) / 2,039 GB/s (SXM)
CUDA Cores	6,912	6,912
Tensor Cores	432 (3rd Gen)	432 (3rd Gen)
FP32	19.5 TFLOPS	19.5 TFLOPS
FP64	9.7 TFLOPS	9.7 TFLOPS
TF32 Tensor Core	156 TFLOPS	156 TFLOPS
FP16/BF16 Tensor Core	312 TFLOPS	312 TFLOPS
INT8 Tensor Core	624 TOPS	624 TOPS
TDP	250 W (PCIe) / 400 W (SXM)	300 W (PCIe) / 400 W (SXM)
NVLink	600 GB/s	600 GB/s
MIG	Up to 7 instances	Up to 7 instances
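
The throughput and bandwidth figures in the table can be combined into a rough roofline estimate. Dividing peak compute by peak memory bandwidth gives the arithmetic intensity (FLOPs per byte moved) above which a kernel is limited by compute rather than by memory. The sketch below uses only numbers from the table; the names `PEAK_TFLOPS`, `BANDWIDTH_GBPS`, and `ridge_point` are illustrative, and real kernels rarely reach either peak.

```python
# Peak figures copied from the specification table (1,935 GB/s is the 80 GB PCIe figure).
PEAK_TFLOPS = {"FP32": 19.5, "TF32 Tensor Core": 156.0, "FP16/BF16 Tensor Core": 312.0}
BANDWIDTH_GBPS = {"40GB": 1555.0, "80GB": 1935.0}


def ridge_point(peak_tflops: float, bandwidth_gbps: float) -> float:
    """Return the FLOPs-per-byte ratio at which the compute and memory limits meet."""
    flops_per_second = peak_tflops * 1e12
    bytes_per_second = bandwidth_gbps * 1e9
    return flops_per_second / bytes_per_second


if __name__ == "__main__":
    for variant, bandwidth in BANDWIDTH_GBPS.items():
        for precision, tflops in PEAK_TFLOPS.items():
            print(f"{variant} {precision}: {ridge_point(tflops, bandwidth):.1f} FLOP/byte")
```

On the 40 GB variant, for example, an FP16 Tensor Core kernel needs roughly 200 FLOPs per byte of memory traffic before the Tensor Cores, rather than memory bandwidth, become the bottleneck.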

Vendor Information

Parameter	Value
Manufacturer	NVIDIA Corporation
Official Website	https://www.nvidia.com
Product Page	https://www.nvidia.com/en-us/data-center/a100/
Release	Announced May 2020 (GTC 2020 keynote)

Software & Drivers

Driver: https://www.nvidia.com/Download/index.aspx
CUDA: Fully supported from CUDA 11.0 onward
Libraries: cuDNN, TensorRT, and NCCL are all supported
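
Once the driver is installed, one way to confirm which A100 variant a machine exposes is to query `nvidia-smi` in CSV mode. The following is a minimal sketch, not an official tool: the helper names `parse_gpu_rows`, `query_gpus`, and `is_a100` are hypothetical, and the sample string in the usage note is illustrative rather than captured output.

```python
import subprocess


def parse_gpu_rows(csv_text: str) -> list[tuple[str, int]]:
    """Parse 'name, memory_mib' rows as printed by nvidia-smi in CSV mode."""
    rows = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma so a comma inside a device name would survive.
        name, memory_mib = (field.strip() for field in line.rsplit(",", 1))
        rows.append((name, int(memory_mib)))
    return rows


def query_gpus() -> list[tuple[str, int]]:
    """Run nvidia-smi (requires an installed NVIDIA driver) and parse its output."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_rows(result.stdout)


def is_a100(name: str) -> bool:
    """Heuristic check (an assumption): A100 device names contain the string 'A100'."""
    return "A100" in name
```

For instance, `parse_gpu_rows("NVIDIA A100-SXM4-80GB, 81920")` yields one row with the device name and its total memory in MiB, which `is_a100` can then filter.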

Key Features

3rd Gen Tensor Cores: Support TF32, FP16, BF16, INT8
MIG (Multi-Instance GPU): Partition a single GPU into up to 7 independent instances
Structured Sparsity: Hardware-level 2:4 sparsity acceleration
NVLink 3.0: 600 GB/s interconnect bandwidth
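
The 2:4 pattern behind structured sparsity means that, in every contiguous group of four weights, at most two are non-zero. The sketch below produces that pattern on a flat list by keeping the two largest-magnitude values in each group. Magnitude-based selection is an assumption made for illustration, as is the function name `prune_2_of_4`; the hardware only requires the pattern itself, and the actual speedup depends on library support for the compressed format.

```python
def prune_2_of_4(weights: list[float]) -> list[float]:
    """Zero the two smallest-magnitude values in every group of four."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(value if i in keep else 0.0 for i, value in enumerate(group))
    return pruned
```

Applied to `[0.1, -0.9, 0.4, 0.05]`, the function keeps `-0.9` and `0.4` and zeroes the other two entries.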

Use Cases

LLM training (7B–70B models)
Inference deployment
HPC scientific computing
Recommendation systems
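
To make the 7B–70B training range concrete, the sketch below estimates a lower bound on how many A100s are needed just to hold the training state. It assumes the common rule of thumb of 16 bytes per parameter for mixed-precision training with Adam (FP16 weights and gradients plus FP32 master weights and two optimizer moments), and it assumes that state is sharded evenly across GPUs. Activations, buffers, and fragmentation are ignored, so real deployments need more. The function name is illustrative.

```python
import math

# Assumption: 2 (FP16 weights) + 2 (FP16 gradients) + 12 (FP32 master weights and Adam moments).
BYTES_PER_PARAM_MIXED_ADAM = 16


def min_gpus_for_training_state(params_billions: float, gpu_memory_gb: int,
                                bytes_per_param: int = BYTES_PER_PARAM_MIXED_ADAM) -> int:
    """Lower bound on GPUs needed to hold weights, gradients, and optimizer state."""
    # 1e9 parameters times bytes per parameter, divided by 1e9 bytes per GB.
    state_gb = params_billions * bytes_per_param
    return math.ceil(state_gb / gpu_memory_gb)


if __name__ == "__main__":
    for size in (7, 70):
        for memory in (40, 80):
            count = min_gpus_for_training_state(size, memory)
            print(f"{size}B on {memory} GB A100: at least {count} GPUs")
```

Under these assumptions a 7B model needs at least two 80 GB cards, while a 70B model needs at least fourteen.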

Related Comparisons

NVIDIA H100: Successor
NVIDIA H200: Current mainstream training GPU
AMD MI250: Previous-generation competitor
