Introducing Ori Inference Endpoints
Say hello to Ori Inference Endpoints, an easy and scalable way to deploy state-of-the-art machine learning models as API endpoints.
Meet Ori's new board member Jacob Smith.
Learn how to deploy Meta’s new text-generation model Llama 3.3 70B with Ollama and Open WebUI on an Ori cloud GPU.
Inside the NVIDIA H200: Specifications, use cases, performance benchmarks, and a comparison of H200 vs H100 GPUs.
Discover how to deploy Genmo Mochi 1 with ComfyUI on an Ori GPU instance, and read our analysis of this new open source video generation model.
Learn more about the NVIDIA L40S, a versatile GPU designed to power a wide variety of applications, and check out NVIDIA L40S vs NVIDIA H100...
Meet Ori Global Cloud's new Private Cloud cluster with 1024 NVIDIA H100 GPUs, designed for massive scale AI with limitless customization.
Find out how Ori Serverless Kubernetes is helping nCompass run cost-effective LLM inference at scale.
Benchmarking Llama 3.1 8B Instruct with vLLM, using BeFOri to measure time to first token (TTFT), inter-token latency, end-to-end latency, and...
Say hello to the new Ori Global Cloud! Our reimagined brand reflects Ori's commitment to driving the future of AI and cloud innovation, enabling...
Learn how to deploy Meta’s multimodal Llama 3.2 11B Vision model with Hugging Face Transformers on an Ori cloud GPU and see how it compares with...
Discover how to get Mistral’s new multimodal LLM, Pixtral 12B, up and running on an Ori cloud GPU.
Learn more about the Ori Global Cloud REST API, which helps you create, access, and manage Ori cloud resources programmatically.
Learn how to deploy Flux.1 image generation on the Ori GPU cloud. This tutorial will demonstrate how to create images with Flux's open source...
Ori has partnered with Stelia to enhance AI-driven data processing by integrating Stelia's advanced data mobility platform into Ori's GPU cloud...
Learn how Ori Serverless Kubernetes is helping Framesports analyze rugby matches with AI.
Learn how to deploy LLMs and scale inference on Ori Serverless Kubernetes, via Ollama and Open WebUI.
Our CEO Mahdi Yahya joined the AI Action Plan roundtable at 10 Downing Street to share his insights on supercharging the UK's AI ecosystem.
Agentic AI is the next frontier in AI adoption. Discover more about AI agents in this blog post: what they are, types of agents, benefits, AI agents...
Explore the NVIDIA Blackwell GPU platform, featuring powerful superchips like B100, B200, and GB200. Discover how these GPUs are about to unleash a...
Meet Ori Serverless Kubernetes, an AI infrastructure service that brings you the best of Serverless and Kubernetes by blending powerful scalability,...
Discover how to use BeFOri to calculate a cost per input and output token for self-hosted models and apply this methodology to the DBRX Base model...
Discover how Ori is helping Emediately bring powerful AI solutions to small and medium businesses.
Ori hires Richard Tame as Chief Financial Officer.
Ready to experience the Snowflake-Arctic-instruct model with Hugging Face? In this blog we are going to walk you through environment setup, model...
Basecamp Research leverages Ori's GPU Cloud to deliver more accurate structure predictions, more protein annotations, and controllable...
Access BeFOri for Llama 2 and Llama 3 benchmarks on NVIDIA V100 and H100 chips.
Generative AI coding is a powerful assistant for software developers. Mergekit offers an easy way to blend pre-trained code LLMs and create your own...
When should you opt for H100 GPUs over A100s for ML training and inference? Here's a top-down view of cost, performance, and use cases.
General availability of Virtual Machines with NVIDIA GPUs (H100, A100, V100) in Ori Global Cloud.
A global GPU shortage and rogue compute costs threaten to sink even the best AI project’s go-to-market plans. How can AI teams navigate around...
This deployment walkthrough demonstrates how Ori simplifies and automates complex orchestration tasks, ensuring seamless communication between...
Explore how to integrate Ori with your existing CI/CD pipelines.
Follow this step-by-step guide to quickly deploy Meta’s Code Llama and other open-source Large Language Models (LLMs), using Python and Hugging Face...
Successful organisations already operate in terms of objectives and outcomes, and to control the cost of complexity, DevOps automation processes must...
Explore a hands-on guide to Change Data Capture in Go with Postgres, Apache Pulsar, and Debezium. Learn to create applications that become reactive...
Ori's journey from CRA to Vite.js: The challenges we faced, the benefits we reaped, and why we felt the need to make the shift.
Learn how to leverage Ori to deploy GPU workloads on Google Cloud.
How to set up inter-cluster networking between two Kubernetes clusters using Cilium.
In this blog, I explore the challenges AI companies face when using Kubernetes to optimise GPU usage in multi-cloud environments and how Ori helps...
Ori secures your cloud environments with zero-trust microsegmented networks and secure secret management.
The acquisition consolidates Ori's leadership position in intelligent application orchestration across distributed clouds.