Ampere CPU inference benchmark

Generative AI: Efficient Inference on Cloud CPUs

It’s been a while since I last wrote here. Lately, I’ve been diving deep into AI inference—the process of running AI models to generate responses—specifically exploring whether we truly need expensive GPUs for running modern language models. Spoiler alert: the answer might surprise you. After extensive testing on Oracle Cloud Infrastructure (OCI), comparing ARM-based Ampere processors against the latest AMD EPYC chips, I discovered that the right combination of software optimizations and compressed models can deliver remarkable performance—all without a single GPU. ...

February 4, 2026 · 5 min · Enrico Pesce

Comparing CPU Multicore Performance of OCI Compute Standard Flex Shapes

When selecting a compute instance, factors such as raw computational power, price-to-performance ratio, and workload optimization play a significant role. The following standard flex shapes are available in most OCI regions: VM.Standard.E4.Flex (Processor: AMD EPYC 7J13. Base frequency 2.55 GHz, max boost frequency 3.5 GHz); VM.Standard.E5.Flex (Processor: AMD EPYC 9J14. Base frequency 2.4 GHz, max boost frequency 3.7 GHz); VM.Standard3.Flex (Processor: Intel Xeon Platinum 8358. Base frequency 2.6 GHz, max turbo frequency 3.4 GHz); VM.Optimized3.Flex (Processor: Intel Xeon 6354. Base frequency 3.0 GHz, max turbo frequency 3.6 GHz); VM.Standard.A1.Flex (Each OCPU corresponds to a single hardware execution thread. Processor: Ampere Altra Q80-30. Max frequency 3.0 GHz). In the article "Performance testing with PHP and OCI Compute instances", I tested single-threaded PHP execution on all OCI standard flex shapes. This time I ran multicore benchmark tests with Geekbench 6 using 2, 4, and 8 CPUs. ...

March 19, 2024 · 3 min · Enrico Pesce

OCI Compute Standard Flex Shapes: Another CPU Multicore Benchmark

When selecting a compute instance, factors such as raw computational power, price-to-performance ratio, and workload optimization play a significant role. Let’s focus on the following standard flex shapes available in most OCI regions: VM.Standard.E4.Flex (Processor: AMD EPYC 7J13. Base frequency 2.55 GHz, max boost frequency 3.5 GHz); VM.Standard.E5.Flex (Processor: AMD EPYC 9J14. Base frequency 2.4 GHz, max boost frequency 3.7 GHz); VM.Standard3.Flex (Processor: Intel Xeon Platinum 8358. Base frequency 2.6 GHz, max turbo frequency 3.4 GHz); VM.Optimized3.Flex (Processor: Intel Xeon 6354. Base frequency 3.0 GHz, max turbo frequency 3.6 GHz); VM.Standard.A1.Flex (Each OCPU corresponds to a single hardware execution thread. Processor: Ampere Altra Q80-30. Max frequency 3.0 GHz). I conducted benchmark tests with Geekbench 6 on three CPU configurations: 2, 4, and 8 cores. ...

February 10, 2024 · 2 min · Enrico Pesce

Performance testing with PHP and OCI Compute instances

A while ago, I developed a tool to measure the actual performance improvement between different versions of PHP. Subsequently, I set out to understand which AWS instance type was the most performant. Since AWS does not allow custom sizing of CPU and RAM resources, I wanted to explore the differences among the various instance types and determine which one would be the most cost-effective choice. During the holiday season, I dedicated myself to expanding this project and running the same analysis on OCI (Oracle Cloud Infrastructure). ...

January 19, 2024 · 4 min · Enrico Pesce