Production-grade AI engine to speed up training and inferencing in your existing technology stack.
In a rush? Get started easily:
pip install onnxruntime
nuget install Microsoft.ML.OnnxRuntime
Don't see your favorite platform? See the many others we support →
Python
import onnxruntime as ort
# Load the model and create InferenceSession
model_path = "path/to/your/onnx/model"
session = ort.InferenceSession(model_path)
# Load and preprocess the input image into inputTensor
...
# Run inference
outputs = session.run(None, {"input": inputTensor})
print(outputs)
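The preprocessing step elided above depends on your model. As a minimal sketch, assuming an image-classification model that expects a float32 tensor of shape (1, 3, 224, 224) named "input" (the shape, normalization constants, and input name here are illustrative assumptions; check your model's actual metadata via session.get_inputs()):

```python
import numpy as np

def preprocess(image: np.ndarray) -> np.ndarray:
    # image: HWC uint8 array, e.g. loaded with Pillow or OpenCV.
    # Normalization constants below are the common ImageNet values,
    # used here purely as an example.
    x = image.astype(np.float32) / 255.0  # scale to [0, 1]
    x = (x - np.array([0.485, 0.456, 0.406])) / np.array([0.229, 0.224, 0.225])
    x = x.transpose(2, 0, 1)              # HWC -> CHW
    return x[np.newaxis, :].astype(np.float32)  # add batch dimension

# Example with a dummy image
dummy = np.zeros((224, 224, 3), dtype=np.uint8)
inputTensor = preprocess(dummy)
print(inputTensor.shape)  # (1, 3, 224, 224)
```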
Integrate the power of generative AI and Large Language Models (LLMs) into your apps and services with ONNX Runtime. No matter what language you develop in or what platform you need to run on, you can use state-of-the-art models for image synthesis, text generation, and more.
Do you program in Python? C#? C++? Java? JavaScript? Rust? No problem. ONNX Runtime has you covered with support for many languages. And it runs on Linux, Windows, macOS, iOS, Android, and even in web browsers.
CPU, GPU, NPU - no matter what hardware you run on, ONNX Runtime optimizes for latency, throughput, memory utilization, and binary size. In addition to excellent out-of-the-box performance for common usage patterns, additional model optimization techniques and runtime configurations are available to further improve performance for specific use cases and models.
ONNX Runtime powers AI in Microsoft products including Windows, Office, Azure Cognitive Services, and Bing, as well as in thousands of other projects across the world. ONNX Runtime is cross-platform, supporting cloud, edge, web, and mobile experiences.
Learn more about ONNX Runtime Inferencing →
Run PyTorch and other ML models in the web browser with ONNX Runtime Web.
Infuse your Android and iOS mobile apps with AI using ONNX Runtime Mobile.
ONNX Runtime reduces costs for large model training and enables on-device training.
Learn more about ONNX Runtime Training →
Accelerate training of popular models, including Hugging Face models like Llama-2-7b and curated models from the Azure AI | Machine Learning Studio model catalog.
On-device training with ONNX Runtime lets developers take an inference model and train it locally to deliver a more personalized and privacy-respecting experience for customers.