Simplify Idefics2, Idefics3, SmolVLM images handling #37291

yonigozlan · 2025-04-04T17:18:51Z

Simplify the handling of images in both processing and modeling.

Images/patches are now flattened before being processed and passed to the models. This simplifies both the image processing (no padding is needed along the number-of-images/patches dimension) and the modeling code (padding images/patches containing only 0/False no longer need to be removed).
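To make the difference between the two layouts concrete, here is a minimal sketch in plain Python. It is not the code from this PR: strings stand in for image tensors, and the helper names (`pad_batch`, `strip_padding`, `flatten_batch`) are hypothetical and not part of the transformers API. It only shows that flattening gives the same set of real images as padding followed by removal of the padded entries.

```python
# Illustrative only: strings stand in for image tensors.
# All helper names here are hypothetical, not the transformers API.

def pad_batch(batch, pad_value=None):
    """Padded layout: every sample is padded to the same number of images."""
    max_images = max(len(images) for images in batch)
    padded = [images + [pad_value] * (max_images - len(images)) for images in batch]
    # The mask marks which slots hold a real image (True) or padding (False).
    mask = [[img is not pad_value for img in images] for images in padded]
    return padded, mask


def strip_padding(padded, mask):
    """Extra step the padded layout requires: drop the padding-only slots."""
    kept = []
    for images, row_mask in zip(padded, mask):
        for img, is_real in zip(images, row_mask):
            if is_real:
                kept.append(img)
    return kept


def flatten_batch(batch):
    """Flattened layout: concatenate all images along one dimension, no padding."""
    return [img for images in batch for img in images]


# Two samples with a different number of images each.
batch = [["a0", "a1", "a2"], ["b0"]]

padded, mask = pad_batch(batch)
flat = flatten_batch(batch)

# Both paths yield the same real images; the flattened one needs no mask.
assert strip_padding(padded, mask) == flat
```

In the flattened layout, the padding step and the later filtering step both disappear, which is the simplification described above.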

I tested each model thoroughly with multiple images, batched images, etc., and found no differences.

Cc @andimarafioti @orrzohar

github-actions · 2025-04-04T17:19:07Z

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

HuggingFaceDocBuilderDev · 2025-04-04T17:44:46Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

process flatten images directly for idefics2 idefics3 smolvlm

68d3f0e

github-actions bot marked this pull request as draft April 4, 2025 17:19

fix missing attention mask for padded image

8051d94

yonigozlan marked this pull request as ready for review April 4, 2025 18:09

Merge branch 'main' into flatten-idefics3-im-proc

b4c187c

github-actions bot requested review from ArthurZucker and qubvel April 4, 2025 18:09

yonigozlan added 4 commits April 4, 2025 18:29

fix when pixels_attention_mask is none

01f86c0

fix modeling tests

71125eb

fix style

fb17dc4

nit

ce2a37a
