Ampere CPU inference benchmark

Generative AI: Efficient Inference on Cloud CPUs

It’s been a while since I last wrote here. Lately, I’ve been diving deep into AI inference—the process of running AI models to generate responses—specifically exploring whether we truly need expensive GPUs for running modern language models. Spoiler alert: the answer might surprise you. After extensive testing on Oracle Cloud Infrastructure (OCI), comparing ARM-based Ampere processors against the latest AMD EPYC chips, I discovered that the right combination of software optimizations and compressed models can deliver remarkable performance—all without a single GPU. ...

February 4, 2026 · 5 min · Enrico Pesce

Comparing CPU Multicore Performance of OCI Compute Standard Flex Shapes

When selecting a compute instance, factors such as raw computational power, price-to-performance ratio, and workload optimization play a significant role. The following standard flex shapes are available in most OCI regions:

- VM.Standard.E4.Flex (Processor: AMD EPYC 7J13. Base frequency 2.55 GHz, max boost frequency 3.5 GHz)
- VM.Standard.E5.Flex (Processor: AMD EPYC 9J14. Base frequency 2.4 GHz, max boost frequency 3.7 GHz)
- VM.Standard3.Flex (Processor: Intel Xeon Platinum 8358. Base frequency 2.6 GHz, max turbo frequency 3.4 GHz)
- VM.Optimized3.Flex (Processor: Intel Xeon 6354. Base frequency 3.0 GHz, max turbo frequency 3.6 GHz)
- VM.Standard.A1.Flex (Each OCPU corresponds to a single hardware execution thread. Processor: Ampere Altra Q80-30. Max frequency 3.0 GHz.)

In an earlier article, "Performance testing with PHP and OCI Compute instances", I tested single-threaded PHP execution on a single CPU across all OCI standard flex shapes. This time I ran multicore benchmark tests with Geekbench 6 using 2, 4, and 8 CPUs. ...

March 19, 2024 · 3 min · Enrico Pesce

OCI Compute Standard Flex Shapes: Another CPU Multicore Benchmark

When selecting a compute instance, factors such as raw computational power, price-to-performance ratio, and workload optimization play a significant role. Let’s focus on the following standard flex shapes available in most OCI regions:

- VM.Standard.E4.Flex (Processor: AMD EPYC 7J13. Base frequency 2.55 GHz, max boost frequency 3.5 GHz)
- VM.Standard.E5.Flex (Processor: AMD EPYC 9J14. Base frequency 2.4 GHz, max boost frequency 3.7 GHz)
- VM.Standard3.Flex (Processor: Intel Xeon Platinum 8358. Base frequency 2.6 GHz, max turbo frequency 3.4 GHz)
- VM.Optimized3.Flex (Processor: Intel Xeon 6354. Base frequency 3.0 GHz, max turbo frequency 3.6 GHz)
- VM.Standard.A1.Flex (Each OCPU corresponds to a single hardware execution thread. Processor: Ampere Altra Q80-30. Max frequency 3.0 GHz.)

I conducted benchmark tests with Geekbench 6 on three CPU configurations: 2, 4, and 8 cores. ...

February 10, 2024 · 2 min · Enrico Pesce