Banana was a US-based AI infrastructure startup that operated a serverless GPU platform for machine learning inference. The product offered autoscaling GPU hosting, pass-through pricing, observability, and DevOps tooling such as CI/CD, rolling deploys, and tracing. It was pitched at AI teams that wanted to ship and scale models without managing GPU clusters themselves.

The platform supported common model frameworks and exposed a simple HTTP API for inference, with scale-to-zero economics that were a major selling point during the early generative-AI boom. At its peak, Banana was a frequently mentioned alternative to Replicate and similar serverless GPU providers.
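The appeal of scale-to-zero pricing can be illustrated with a small cost comparison. The sketch below is a generic model, not Banana's actual pricing: the function names, the rates, and the assumption that cold-start time is billed are all hypothetical and chosen only for illustration.

```python
# Hypothetical cost model comparing an always-on GPU with a scale-to-zero
# serverless GPU. All rates and workload figures are illustrative assumptions,
# not prices charged by Banana or any other provider.

def always_on_cost(hourly_rate: float, hours_in_period: float, replicas: int = 1) -> float:
    """Cost of keeping `replicas` GPUs running for the whole period."""
    return hourly_rate * hours_in_period * replicas


def scale_to_zero_cost(
    per_second_rate: float,
    busy_seconds: float,
    cold_starts: int = 0,
    cold_start_seconds: float = 0.0,
) -> float:
    """Cost when billing covers only busy time, plus (by assumption) cold-start time."""
    billed_seconds = busy_seconds + cold_starts * cold_start_seconds
    return per_second_rate * billed_seconds


def break_even_utilization(hourly_rate: float, per_second_rate: float) -> float:
    """Fraction of the period a serverless GPU must be busy to cost as much as always-on."""
    return hourly_rate / (per_second_rate * 3600)


if __name__ == "__main__":
    hours = 30 * 24  # a 30-day month
    dedicated = always_on_cost(hourly_rate=2.00, hours_in_period=hours)
    serverless = scale_to_zero_cost(
        per_second_rate=0.001,   # hypothetical premium over the dedicated rate
        busy_seconds=50 * 3600,  # 50 hours of actual inference in the month
        cold_starts=1000,
        cold_start_seconds=10.0,
    )
    print(f"always-on:     ${dedicated:,.2f}")
    print(f"scale-to-zero: ${serverless:,.2f}")
    print(f"break-even utilization: {break_even_utilization(2.00, 0.001):.0%}")
```

Under these made-up numbers, a lightly used model costs far less on the serverless plan even though its per-second rate is higher, and the advantage disappears once the GPU is busy for more than about half of the period.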

The company was founded by Erik Dunteman and team and raised roughly $3–5M in seed funding across its lifetime. As demand for GPUs spiked through 2023, Banana's unit economics deteriorated: reserving H100s and A100s became significantly more expensive, while customer revenue did not scale to match. The company publicly cited flat-to-down revenue through 2023 and churn driven by pricing, latency, and reliability pressures.

In February 2024 Banana announced it would sunset its serverless GPU product. The platform was shut down at noon Pacific time on 31 March 2024, and the company subsequently wound down. Founder Erik Dunteman has written publicly about the pivot away from the original Banana business.

For anyone still researching Banana, the product is no longer available and existing workloads were migrated off the platform during early 2024. Alternatives in the same category include Replicate, Modal, RunPod, Beam, and Baseten.