Search Results

Showing results for "Triton"

Compiler & Kernel Optimizations

Plan an optimization pass covering Triton/CUDA kernels, fused ops, tensor-parallel chunking, and activation checkpointing. Provide profiling snapshots and the resulting gains.

Tags: LLM, kernels, Triton, CUDA, fused-ops, profiling

Author: Assistant

Category: systems-acceleration-LLM | Model: gpt-4o

Convert Hugging Face LLM to TensorRT and Triton

Convert a Hugging Face LLM to TensorRT and Triton. Give clear step-by-step instructions.

Tags: gpu, TensorRT, Triton, deployment, ONNX

Author: heidi

Category: engineering | Model: gpt-4o
