vllm-project / vllm Public


Pull requests: vllm-project/vllm


542 Open 7,287 Closed


Pull requests list

[Model] Support Llama4 in vLLM

Labels: ci/build; documentation (Improvements or additions to documentation); frontend; multi-modality (Related to multi-modality, #4194); ready (ONLY add when PR is ready to merge/full CI is needed)

#16104 opened Apr 5, 2025 by houseroad


[Model] use AutoWeightsLoader for stablelm,starcoder2,zamba2

#16103 opened Apr 5, 2025 by lengrongfu


[v1] Implement HybridKVCacheManager to support hybrid models with different KV cache type

Labels: tpu (Related to Google TPUs)

#16101 opened Apr 5, 2025 by heheda12345 • Draft

[Misc] refactor example eagle

Labels: documentation

#16100 opened Apr 5, 2025 by reidliu41


[Frontend] [Bugfix] Refactor tool parsers and simplify the tool parsing interface.

Labels: ci/build; frontend

#16096 opened Apr 5, 2025 by paolovic


[Bugfix] add hf_token to EngineArgs

Labels: frontend

#16093 opened Apr 5, 2025 by paolovic


[Bugfix]fix asyncLLM test_abort

Labels: v1

#16090 opened Apr 5, 2025 by KubeKyrie


[Model] use AutoWeightsLoader for phi, gemma, deepseek

#16088 opened Apr 5, 2025 by jonghyunchoe


[V1][Spec Decode] Do not generate draft tokens beyond max_model_len

Labels: needs-tests (Tests needed for this PR)

#16087 opened Apr 5, 2025 by WoosukKwon


Add runtime precondition check for paged attention kernel.

Labels: tpu

#16085 opened Apr 5, 2025 by vanbasten23


[Bugfix] fix gettid method is not define

#16084 opened Apr 5, 2025 by lengrongfu


[BugFix][Frontend] Fix LLM.chat() tokenization

Labels: bug (Something isn't working); frontend; needs-tests

#16081 opened Apr 5, 2025 by njhill


[Fix the torch pip install]

Labels: ci/build

#16080 opened Apr 4, 2025 by yangw-dev


[CI/Build] Check for dynamic inputs before running PyTorch code

Labels: tpu

#16079 opened Apr 4, 2025 by yarongmu-google


[WIP] Add Flex to V1

Labels: documentation

#16078 opened Apr 4, 2025 by drisspg • Draft

[V1][Spec Decode] Fix and Optimize Rejection Sampler

Labels: v1

#16077 opened Apr 4, 2025 by ekagra-ranjan • Draft

3 tasks

[V1] Scatter and gather placeholders in the model runner

Labels: documentation; multi-modality; ready; tpu

#16076 opened Apr 4, 2025 by ywang96


[Core] Support full cuda graph in v1

Labels: v1

#16072 opened Apr 4, 2025 by chanh • Draft

[TPU][V1][DEBUG] Provide Env Variable To Disable Sampler

Labels: ready; tpu

#16063 opened Apr 4, 2025 by NickLucche


[ROCm][V1] Changes needed for making vllm run on Fedora 41 with gtx1100

Labels: ci/build

#16062 opened Apr 4, 2025 by martinhoyer


[Kernel] Support merge attn cuda kernel

Labels: ci/build; v1

#16060 opened Apr 4, 2025 by ywq880611


[fix]: Dockerfile.ppc64le fixes for opencv-python and hf-xet

Labels: ci/build

#16048 opened Apr 4, 2025 by Shafi-Hussain • Draft

fix neuron config override

#16045 opened Apr 4, 2025 by ajayvohra2005


[Misc] improve chat_with_tools example

Labels: documentation

#16044 opened Apr 4, 2025 by reidliu41


Add NeuronxDistributedInference support, Speculative Decoding, Dynamic on-device sampling

Labels: ci/build; documentation

#16043 opened Apr 4, 2025 by aws-satyajith

