Inferless
Popular repositories
- triton-co-pilot Public
Generate glue code in seconds to simplify your NVIDIA Triton Inference Server deployments.
- whisper-large-v3 Public
State-of-the-art speech recognition model for English, delivering accurate transcription across diverse audio scenarios. <metadata> gpu: T4 | collections: ["CTranslate2"] </metadata>
- qwq-32b-preview Public template
A 32B experimental reasoning model for advanced text generation and robust instruction following. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
- deepseek-r1-distill-qwen-32b Public template
A distilled DeepSeek-R1 variant built on Qwen2.5-32B, fine-tuned with curated data for enhanced performance and efficiency. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
Repositories
- mistral-7b-instruct-v0.2 Public
A 7B model with a 32k token context window and optimized attention mechanisms for superior dialogue and reasoning. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
- phi4-vllm-gguf Public (forked from rbgo404/phi4-vllm-gguf)
A 14B model optimized in GGUF format for efficient inference, designed to excel in complex reasoning tasks. <metadata> gpu: A100 | collections: ["vLLM","GGUF"] </metadata>
- donut-doc-vqa Public
An OCR-free document understanding model that uses a Swin Transformer encoder and BART decoder, fine-tuned on the DocVQA dataset.
- stable-diffusion-xl-turbo Public template
A distilled and cost-effective variant of SDXL that delivers high-quality text-to-image generation with accelerated inference speed. <metadata> gpu: T4 | collections: ["Diffusers"] </metadata>
- stable-diffusion-v1-5 Public template
A text-to-image model by Stability AI, renowned for generating high-quality, diverse images from text prompts. <metadata> gpu: T4 | collections: ["Diffusers"] </metadata>
- whisper-large-v3 Public
State-of-the-art speech recognition model for English, delivering accurate transcription across diverse audio scenarios. <metadata> gpu: T4 | collections: ["CTranslate2"] </metadata>
- Customer-Service-Voicebot Public
- mistral-small-3.1-24b-instruct Public template
An advanced multimodal language model developed by Mistral AI with enhanced text performance, robust vision capabilities, and an expanded context window of up to 128,000 tokens. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>
- spatiallm-llama-1b Public template
A 3D large language model that processes point cloud data to produce structured 3D scene representations. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>