Benchmarking Llama 3.1 8B Instruct on Nvidia H100 and A100 chips with the vLLM Inferencing Engine
Benchmarking Llama 3.1 8B Instruct with vLLM, using BeFOri to measure time to first token (TTFT), inter-token latency, end-to-end latency, and...
Agentic AI is the next frontier in AI adoption. Discover more about AI agents in this blog post: what they are, types of agents, benefits, AI agents...
Discover how to use BeFOri to calculate a cost per input and output token for self-hosted models, and apply this methodology to the DBRX Base model...
Ready to experience the Snowflake-Arctic-instruct model with Hugging Face? In this blog, we walk you through environment setup, model...
Access BeFOri for Llama 2 and Llama 3 Benchmarks on Nvidia V100 and H100 Chips
Generative AI coding is a powerful assistant for software developers. Mergekit offers an easy way to blend pre-trained code LLMs and create your own...
When should you opt for H100 GPUs over A100s for ML training and inference? Here's a top-down view considering cost, performance, and use case.
A global GPU shortage and runaway compute costs can threaten to sink even the best AI project’s go-to-market plans. How can AI teams navigate around...
Follow this step-by-step guide to quickly deploy Meta’s Code Llama and other open-source Large Language Models (LLMs), using Python and Hugging Face...