What Etched does

Etched is building custom AI inference hardware specialized for transformer models. Its chip, Sohu, is an application-specific integrated circuit that hard-wires the transformer computation graph into silicon, trading the general-purpose programmability of GPUs for efficiency on transformer inference.

Key capabilities

  • A transformer-specific ASIC architecture optimized for attention, projections, and feed-forward layers
  • Silicon built on a TSMC 4nm process with 144GB of HBM3E memory
  • High-throughput autoregressive inference, which the company claims exceeds what general-purpose GPUs deliver
  • A narrow design focus on large language model inference rather than general-purpose workloads

Who it's for

Etched targets organizations running large-scale AI inference, particularly for large language models, where throughput and cost per token matter. Its transformer-only design suits teams that deploy foundation models at scale and want a purpose-built alternative to general-purpose accelerators. Based on public information, Sohu has not yet been made generally available for purchase or rental.