Pull requests: vllm-project/vllm
Pull requests list
[Model] Support Llama4 in vLLM
ci/build
documentation
Improvements or additions to documentation
frontend
multi-modality
Related to multi-modality (#4194)
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#16104
opened Apr 5, 2025 by
houseroad
[Model] use AutoWeightsLoader for stablelm,starcoder2,zamba2
#16103
opened Apr 5, 2025 by
lengrongfu
[v1] Implement HybridKVCacheManager to support hybrid models with different KV cache type
tpu
Related to Google TPUs
v1
#16101
opened Apr 5, 2025 by
heheda12345
Draft
[Misc] refactor example eagle
documentation
Improvements or additions to documentation
#16100
opened Apr 5, 2025 by
reidliu41
[Model] use AutoWeightsLoader for phi, gemma, deepseek
#16088
opened Apr 5, 2025 by
jonghyunchoe
[V1][Spec Decode] Do not generate draft tokens beyond max_model_len
needs-tests
Tests needed for this PR
v1
#16087
opened Apr 5, 2025 by
WoosukKwon
Add runtime precondition check for paged attention kernel.
tpu
Related to Google TPUs
v1
#16085
opened Apr 5, 2025 by
vanbasten23
[BugFix][Frontend] Fix LLM.chat() tokenization
bug
Something isn't working
frontend
needs-tests
Tests needed for this PR
#16081
opened Apr 5, 2025 by
njhill
[CI/Build] Check for dynamic inputs before running PyTorch code
tpu
Related to Google TPUs
v1
#16079
opened Apr 4, 2025 by
yarongmu-google
[V1][Spec Decode] Fix and Optimize Rejection Sampler
v1
#16077
opened Apr 4, 2025 by
ekagra-ranjan
Draft
3 tasks
[V1] Scatter and gather placeholders in the model runner
documentation
Improvements or additions to documentation
multi-modality
Related to multi-modality (#4194)
ready
ONLY add when PR is ready to merge/full CI is needed
tpu
Related to Google TPUs
v1
#16076
opened Apr 4, 2025 by
ywang96
[TPU][V1][DEBUG] Provide Env Variable To Disable Sampler
ready
ONLY add when PR is ready to merge/full CI is needed
tpu
Related to Google TPUs
v1
#16063
opened Apr 4, 2025 by
NickLucche
[ROCm][V1] Changes needed for making vllm run on Fedora 41 with gfx1100
ci/build
#16062
opened Apr 4, 2025 by
martinhoyer
[fix]: Dockerfile.ppc64le fixes for opencv-python and hf-xet
ci/build
#16048
opened Apr 4, 2025 by
Shafi-Hussain
Draft
[Misc] improve chat_with_tools example
documentation
Improvements or additions to documentation
#16044
opened Apr 4, 2025 by
reidliu41
Add NeuronxDistributedInference support, Speculative Decoding, Dynamic on-device sampling
ci/build
documentation
Improvements or additions to documentation
#16043
opened Apr 4, 2025 by
aws-satyajith