
graphics processing unit

Defines graphics processing unit, GPU

A graphics processing unit (GPU) is a hardware accelerator built for massively parallel computation. Originally designed for rendering graphics — transforming 3D geometry into 2D images by performing the same operation on many data points simultaneously — GPUs have been repurposed for any workload that benefits from data parallelism, most notably machine learning.
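The "same operation on many data points" pattern can be sketched in plain Python (illustrative only, not GPU code; the `project` function and sample vertices are hypothetical). A GPU would dedicate one thread to each point and execute them simultaneously; on a CPU the same independence is expressed as a loop:

```python
# Hypothetical sketch of data parallelism: perspective-project many 3D
# points to 2D. The operation applied to each point is identical and no
# point depends on another, so a GPU could process all of them at once.
def project(point, focal=1.0):
    x, y, z = point
    return (focal * x / z, focal * y / z)  # simple perspective divide

vertices = [(1.0, 2.0, 4.0), (0.5, 0.5, 2.0), (3.0, 1.0, 1.0)]
projected = [project(v) for v in vertices]  # same op, independent data
print(projected)  # [(0.25, 0.5), (0.25, 0.25), (3.0, 1.0)]
```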

A GPU’s architecture consists of thousands of small processing cores organized into groups that execute the same instruction on different data (SIMD, or SIMT in GPU parlance). This makes GPUs effective at matrix multiplication, the dominant operation in both graphics rendering and neural network computation. That throughput comes at a high power cost: a laptop GPU draws 30–50 watts under load, and datacenter GPUs draw 200–400 watts or more.
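To see why matrix multiplication parallelizes so well, here is a minimal pure-Python version (a CPU sketch, not GPU code; the `matmul` helper is an assumption for illustration). Every output cell is an independent dot product, so a GPU can assign one thread per cell and run them in lockstep groups:

```python
# Each output cell (i, j) depends only on row i of `a` and column j of
# `b`; no cell depends on any other, which is what makes the loop nest
# trivially parallel on SIMD/SIMT hardware.
def matmul(a, b):
    """Multiply matrices given as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]

a = [[1, 2],
     [3, 4]]
b = [[5, 6],
     [7, 8]]
print(matmul(a, b))  # [[19, 22], [43, 50]]
```

A real GPU kernel would compute the same per-cell dot product, but with the two outer loops replaced by a grid of hardware threads.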

For AI workloads, GPUs remain the dominant platform for both training and inference. The CUDA ecosystem (NVIDIA) and ROCm (AMD) provide the software stacks that machine learning frameworks depend on. Neural processing units offer a more power-efficient alternative for inference specifically, but lack the generality and software maturity of GPU compute.

Cite

@misc{emsenn2026-graphics-processing-unit,
  author    = {emsenn},
  title     = {graphics processing unit},
  year      = {2026},
  url       = {https://emsenn.net/library/tech/domains/computing/terms/graphics-processing-unit/},
  publisher = {emsenn.net},
  license   = {CC BY-SA 4.0}
}