v2.4.1 — Zero-copy GPU pipeline · 47k GitHub stars

From import to inference.
Four lines.

Bounding boxes, segmentation masks, and pose skeletons from a single import. Millisecond inference. No dependency hell.

inference.py · Python
import vision as vs

# Load any image: URL, path, or NumPy array
frame = vs.load("https://cdn.vision.dev/sample.jpg")
model = vs.Model("yolo-v9-turbo", device="cuda")

detections = model.detect(frame, conf=0.72)

$ pip install vision-framework

▶ output — detections

Sample detection frame showing people working in an office environment

LIVE · 4.2ms

{
  "latency_ms": 4.2,
  "detections": [
    { "label": "person",   "conf": 0.97, "bbox": [142, 88, 310, 420] },
    { "label": "laptop",   "conf": 0.91, "bbox": [380, 210, 560, 380] },
    { "label": "backpack", "conf": 0.83, "bbox": [88, 310, 195, 490]  }
  ]
}
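
The payload above is plain JSON, so it can be post-processed without the library. A minimal sketch, assuming each `bbox` is `[x1, y1, x2, y2]` in pixels (the page does not state the box layout); `confident_boxes` is a hypothetical helper name, not part of the API:

```python
import json

# Payload copied from the sample output above.
payload = json.loads("""
{
  "latency_ms": 4.2,
  "detections": [
    {"label": "person",   "conf": 0.97, "bbox": [142, 88, 310, 420]},
    {"label": "laptop",   "conf": 0.91, "bbox": [380, 210, 560, 380]},
    {"label": "backpack", "conf": 0.83, "bbox": [88, 310, 195, 490]}
  ]
}
""")


def confident_boxes(result, min_conf):
    """Return (label, width, height) for detections at or above min_conf.

    Assumes bbox = [x1, y1, x2, y2]; this layout is a guess, not documented.
    """
    rows = []
    for det in result["detections"]:
        if det["conf"] >= min_conf:
            x1, y1, x2, y2 = det["bbox"]
            rows.append((det["label"], x2 - x1, y2 - y1))
    return rows


for label, width, height in confident_boxes(payload, min_conf=0.9):
    print(f"{label}: {width}x{height} px")
```

With a 0.9 cutoff the backpack (0.83) is filtered out and only the person and laptop remain.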

Start Cloud Playground →

No email gate · Free tier · Apache 2.0

// benchmarks

Instruments on. Numbers real.

4.2ms

Inference Latency

RTX 4090 · batch=1

Throughput

A100 · batch=32

COCO mAP@50

YOLO-v9-turbo

47k

GitHub Stars

and growing

// latency comparison — lower is better

vision

4.2ms

ultralytics

11.8ms

detectron2

28.4ms

mmdetection

34.7ms
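
To read the comparison as ratios, the arithmetic is simple enough to check by hand. A small sketch using only the four latencies listed above; `speedup_over` is a hypothetical helper, and the hardware and batch size behind the competitor figures are not stated here, so treat the ratios as indicative:

```python
# Latencies in milliseconds, copied from the comparison above.
latency_ms = {
    "vision": 4.2,
    "ultralytics": 11.8,
    "detectron2": 28.4,
    "mmdetection": 34.7,
}


def speedup_over(baseline, table):
    """Return how many times slower each other entry is than the baseline."""
    base = table[baseline]
    return {
        name: round(ms / base, 1)
        for name, ms in table.items()
        if name != baseline
    }


print(speedup_over("vision", latency_ms))

# Frames per second implied by one frame at a time, ignoring any overhead.
print(round(1000 / latency_ms["vision"]))
```

On these numbers the gap ranges from roughly 2.8x (ultralytics) to 8.3x (mmdetection), and 4.2 ms per frame corresponds to about 238 sequential frames per second.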

// capabilities

Five modules. One API.

Every capability uses the same model.task(frame) pattern. No context switching.

DETECTION · 4.2ms

Object Detection

YOLO-v9, RT-DETR, and custom architectures. Multi-class, multi-scale, with NMS built in.

>>> model.detect(frame)

bbox + conf scores
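
For readers unfamiliar with NMS (non-maximum suppression), here is a minimal, class-agnostic sketch of the greedy variant in plain Python. It illustrates the general technique only; the library's built-in NMS may differ, and the `[x1, y1, x2, y2]` box layout and the 0.5 threshold are assumptions:

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy NMS: keep the highest-confidence box, drop boxes overlapping it."""
    remaining = sorted(detections, key=lambda d: d["conf"], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [
            d for d in remaining
            if iou(best["bbox"], d["bbox"]) <= iou_threshold
        ]
    return kept


candidates = [
    {"label": "person", "conf": 0.97, "bbox": [142, 88, 310, 420]},
    {"label": "person", "conf": 0.60, "bbox": [150, 95, 315, 425]},   # near-duplicate
    {"label": "person", "conf": 0.80, "bbox": [400, 100, 500, 400]},  # separate box
]
print([d["conf"] for d in nms(candidates)])
```

The 0.60 box overlaps the 0.97 box heavily and is suppressed; the non-overlapping 0.80 box survives.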

SEGMENTATION · 8.1ms

Instance Segmentation

Pixel-perfect masks per instance. SAM-2, Mask R-CNN, and YOLO-seg in one API surface.

>>> model.segment(frame)

pixel masks

TRACKING · 6.7ms

Multi-Object Tracking

ByteTrack, DeepSORT, and BoT-SORT. Persistent IDs across frames, even through occlusions.

>>> model.track(video)

persistent IDs
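
What "persistent IDs" means can be shown with a toy tracker that matches each new box to the best-overlapping box from the previous frame. This is a deliberately simplified sketch, not ByteTrack, DeepSORT, or BoT-SORT: it drops a track after a single missed frame, so it does not handle occlusion. The class name and the 0.3 threshold are hypothetical:

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


class GreedyIouTracker:
    """Toy tracker: reuse the ID of the best-overlapping previous box."""

    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}  # track id -> box seen in the previous frame
        self.next_id = 1

    def update(self, boxes):
        """Return one track ID per input box, in input order."""
        unmatched = dict(self.tracks)
        new_tracks, assigned = {}, []
        for box in boxes:
            best_id, best_iou = None, self.iou_threshold
            for track_id, prev in unmatched.items():
                score = iou(box, prev)
                if score > best_iou:
                    best_id, best_iou = track_id, score
            if best_id is None:
                # No sufficiently overlapping track: start a new one.
                best_id = self.next_id
                self.next_id += 1
            else:
                del unmatched[best_id]
            new_tracks[best_id] = box
            assigned.append(best_id)
        self.tracks = new_tracks
        return assigned


tracker = GreedyIouTracker()
frame_1 = tracker.update([[100, 100, 200, 300], [400, 100, 500, 300]])
frame_2 = tracker.update([[405, 102, 505, 302], [104, 101, 204, 301]])
print(frame_1, frame_2)
```

The second frame lists the boxes in the opposite order, yet each keeps the ID it was given in the first frame.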

POSE · 5.3ms

Pose Estimation

17-keypoint COCO skeleton. Whole-body, hand, and face landmarks. Real-time on edge devices.

>>> model.pose(frame)

keypoint skeleton
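
The 17 COCO keypoints have a fixed order, and COCO annotations store them as a flat `[x, y, v, ...]` list where `v` is a visibility flag. A small sketch that names them; whether `model.pose` returns this exact layout is not stated on this page, so the flat input format is an assumption and `name_keypoints` is a hypothetical helper:

```python
# Standard COCO person keypoint order (17 entries).
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def name_keypoints(flat):
    """Turn a flat [x1, y1, v1, x2, y2, v2, ...] list into {name: (x, y, v)}."""
    expected = 3 * len(COCO_KEYPOINTS)
    if len(flat) != expected:
        raise ValueError(f"expected {expected} values, got {len(flat)}")
    triples = zip(flat[0::3], flat[1::3], flat[2::3])
    return dict(zip(COCO_KEYPOINTS, triples))


# Synthetic skeleton for illustration only: keypoint i sits at (10*i, 5*i).
flat = []
for i in range(len(COCO_KEYPOINTS)):
    flat.extend([10 * i, 5 * i, 2])

skeleton = name_keypoints(flat)
print(skeleton["left_wrist"])
```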

CLASSIFY · 1.8ms

Image Classification

Top-k predictions with calibrated confidence. EfficientNet, ViT, ConvNeXt. Fine-tune in 3 lines.

>>> model.classify(frame)

top-5 labels
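
Top-k selection is usually a softmax over the model's raw scores followed by a sort. The sketch below shows that step in plain Python, with a `temperature` parameter standing in for temperature scaling, one common calibration method. The page does not say which calibration the library uses, so this is an illustration of the idea with made-up logits and a hypothetical `top_k` helper:

```python
import math


def top_k(logits, labels, k=5, temperature=1.0):
    """Return the k most probable (label, probability) pairs.

    temperature > 1 softens the distribution; temperature = 1 is plain softmax.
    """
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up logits for illustration only.
labels = ["cat", "dog", "fox"]
logits = [2.0, 1.0, 0.1]

for label, prob in top_k(logits, labels, k=2):
    print(f"{label}: {prob:.3f}")
```

Raising the temperature lowers the top probability without changing the ranking, which is why it can correct overconfident scores.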

// model_zoo

48 pretrained architectures · ONNX export · TensorRT · CoreML

Browse models →

// integrations + changelog

Plugs into your stack. Ships fast.

PyTorch

CUDA 12.3

ONNX Runtime

TensorRT

OpenCV

NumPy

FastAPI

Docker

Kubernetes

Triton

Hugging Face

Roboflow

// installation options

$ pip install vision-framework

Core · CPU inference

$ pip install "vision-framework[cuda]"

CUDA 12.3 · GPU acceleration

$ pip install "vision-framework[tensorrt]"

TensorRT 10 · Optimized inference

$ pip install "vision-framework[full]"

All backends + model zoo
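
After installing, the standard library can confirm which version ended up in the environment. A minimal sketch using `importlib.metadata`; the distribution name `vision-framework` is taken from the pip commands above, and `installed_version` is a hypothetical helper:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version("vision-framework")
print(found if found else "vision-framework is not installed")
```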

Python 3.9+ · Linux · macOS · Windows · ARM64

git log --oneline main

a3f9c12 · feat · zero-copy GPU memory transfer for batch inference · 2h ago
b8d2e47 · perf · 23% latency reduction on RTX 4090 with TensorRT 10 · 6h ago
c1a5f89 · fix · segmentation mask alignment off-by-one at boundary · 11h ago
d4e8b23 · feat · ByteTrack v3 with re-ID embeddings · 1d ago
e7f3a56 · docs · add ONNX export tutorial for edge deployment · 1d ago
f2c9d78 · feat · SAM-2 integration for zero-shot segmentation · 2d ago
g5b1e34 · perf · async preprocessing pipeline, 2x throughput · 2d ago
h9a7c12 · fix · CUDA OOM on batch > 128, now gracefully degrades · 3d ago
i3d4f67 · feat · CoreML export for Apple Neural Engine · 3d ago
j6e8b90 · chore · upgrade to PyTorch 2.6 with compile improvements · 4d ago

ready to execute

Stop evaluating.
Start inferring.

Free tier. No email gate. Apache 2.0. The only thing between you and 4ms inference is one terminal command.

$ pip install vision-framework

Start Cloud Playground · Read the Docs →

⚖ Apache 2.0

✓ No credit card

★ 47k GitHub stars

⚡ Weekly releases
