Panopticon Model #2692
Conversation
@microsoft-github-policy-service agree I am not part of a company, but this model was developed together with the other co-authors listed here: https://arxiv.org/abs/2503.10845.
Thanks for the PR @LeWaldm! I haven't looked at the underlying code yet, but curious if there is a way to modify the image size easily by interpolating the pos embed? Would be nice to be able to pass some extra args to the model to do so.
An interesting idea that would allow us to add more models under non-MIT licenses!
However, we can't easily get test coverage without downloading something, which we don't want to do in our unit tests. It also doesn't seem possible to instantiate the model with random weights (for reproducibility or experimentation).
Let me know if you want help with the unit tests.
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
Regarding unit tests:
@isaaccorley the embeddings are automatically interpolated to the given image size! I only wanted to highlight that inputting images of shape 224x224 probably gives the best results, since 224x224 was the shape during pre-training.
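For context, this is roughly what positional-embedding interpolation looks like in DINOv2-style ViTs. The function below is an illustrative sketch, not Panopticon's actual implementation, and assumes a single class token and a square patch grid:

```python
import torch
import torch.nn.functional as F


def interpolate_pos_embed(pos_embed: torch.Tensor, new_grid: int, old_grid: int = 16) -> torch.Tensor:
    """Resize ViT positional embeddings of shape (1, 1 + old_grid**2, dim) to a new patch grid."""
    cls_tok, patch_tok = pos_embed[:, :1], pos_embed[:, 1:]  # assumes a single class token
    dim = patch_tok.shape[-1]
    patch_tok = patch_tok.reshape(1, old_grid, old_grid, dim).permute(0, 3, 1, 2)
    patch_tok = F.interpolate(patch_tok, size=(new_grid, new_grid), mode='bicubic', align_corners=False)
    patch_tok = patch_tok.permute(0, 2, 3, 1).reshape(1, new_grid * new_grid, dim)
    return torch.cat([cls_tok, patch_tok], dim=1)
```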
This means the file needs a docstring explaining what the file is. Something like: """Panopticon model.""" at the top of the file will fix this.
TorchGeo follows the torchvision model API design. All of our existing models should be uniformly implemented. See DOFA or Copernicus-FM for examples.
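To make that expectation concrete, a TorchGeo-style model usually pairs a weights enum with a small builder function. The enum name, URL, and backbone choice below are hypothetical placeholders, not the final API:

```python
from typing import Any, Optional

import timm
import torch.nn as nn
from torchvision.models._api import Weights, WeightsEnum


class Panopticon_Weights(WeightsEnum):  # hypothetical enum name
    VIT_BASE14 = Weights(
        url='https://example.com/panopticon_vitb14.pth',  # placeholder URL
        transforms=nn.Identity(),
        meta={'publication': 'https://arxiv.org/abs/2503.10845'},
    )


def panopticon_vitb14(weights: Optional[Panopticon_Weights] = None, **kwargs: Any) -> nn.Module:
    """Build the model with random weights, then optionally load pretrained ones (torchvision-style)."""
    # A Panopticon-specific patch embed would replace timm's here; this is only a stand-in backbone.
    model = timm.create_model('vit_base_patch14_dinov2.lvd142m', pretrained=False, **kwargs)
    if weights is not None:
        model.load_state_dict(weights.get_state_dict(progress=True), strict=False)
    return model
```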
I would personally prefer adding the model code since it makes it easy to document, modify, and experiment with, but it's up to you. I can monkeypatch torch.hub.load to get fake coverage, but I would much rather test the model itself.
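For reference, the monkeypatching approach could look roughly like this; the hub repo and entrypoint names are placeholders:

```python
import pytest
import torch
import torch.nn as nn


def fake_hub_load(*args: object, **kwargs: object) -> nn.Module:
    """Stand-in for torch.hub.load so tests never download anything."""
    return nn.Conv2d(3, 8, kernel_size=1)


def test_panopticon_hub(monkeypatch: pytest.MonkeyPatch) -> None:
    monkeypatch.setattr(torch.hub, 'load', fake_hub_load)
    model = torch.hub.load('owner/panopticon', 'panopticon_vitb14')  # placeholder names
    out = model(torch.randn(1, 3, 224, 224))
    assert out.shape == (1, 8, 224, 224)
```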
On how many Python versions, OSes, and PyTorch versions? Does your repo have documentation of function and class parameters, static type hints, etc? Software is a lot more than just code. We frequently get bug reports for DOFA and other models that allow us to improve the models.
I created a file that contains all the code needed to run Panopticon (didn't push it yet). It has ~1000 lines and 23 functions/classes to potentially test. I will not be able to test all of these. I might be able to write tests just for the new code from Panopticon (7 classes/functions). Would that make sense or do you need tests for everything? How do we continue from here?
We likely don't need that entire file. I'm guessing most of these classes can be imported from timm. We explicitly can't copy anything from DINOv2 since it's under a different license. So we would only include things unique to Panopticon and import the rest. Don't worry about testing, I can handle that.
So DINOv2 also has a torch hub entrypoint. Would a potential middle ground be: load DINOv2 via torch hub (which also doesn't add any DINOv2 code to torchgeo) and add all the new Panopticon code to torchgeo with tests?
Then, we could test the Panopticon PE separately with quick tests and only have one slow test with a forward pass of the whole model.
Timm also has DINOv2 ViTs: https://github.com/huggingface/pytorch-image-models/blob/v1.0.15/timm/models/vision_transformer.py#L3145
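If that route is taken, instantiating the backbone from timm is a one-liner; the model name below assumes a recent timm release that registers the DINOv2 variants:

```python
import timm

# Build a DINOv2 ViT-B/14 backbone without copying any DINOv2 code into torchgeo.
# pretrained=False gives random weights, which is handy for fast unit tests.
backbone = timm.create_model('vit_base_patch14_dinov2.lvd142m', pretrained=False)
```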
I will load DINOv2 with timm and add the Panopticon code. Will be done in the next 2h.
I just pushed the code; basic functionality is working. However, there is still a lot of documentation to do. I will do that after dinner today. I have not written any tests yet. If you can write some tests @adamjstewart, we might be able to make it by tomorrow. Otherwise, we will need to wait for the next release.
Yes, I can handle testing. I'm planning on writing release notes all day tomorrow, so if things still aren't finished when I'm done, I'll jump in and finish them up. It already looks pretty close; I'm not worried about waiting a little for this PR since it will be a good way to advertise the paper.
I added documentation. The timm vision transformer is slightly different, so the model is close to but not exactly identical to our original codebase. I think that is fine for now. Should we advertise that somewhere? How do we continue with the tests? What should be tested?
Is the model similar enough that the pretrained weights still load? We need to test every single line of code to get 100% coverage. See the DOFA and Copernicus tests for examples. |
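As a rough illustration of the style used in those test files, a minimal forward-pass test could look like this; the import path, builder name, and input format are assumptions, not the final API:

```python
import torch

from torchgeo.models import panopticon_vitb14  # hypothetical import path and builder name


def test_forward_random_weights() -> None:
    """Forward pass with random weights, so nothing is downloaded."""
    model = panopticon_vitb14(weights=None)
    x = torch.randn(2, 3, 224, 224)  # the exact input format is an assumption
    features = model(x)
    assert features.shape[0] == 2
```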
I (with Copilot) wrote some tests. I can load the model weights, but the results are slightly different from our implementation. I will investigate that later today. Apart from that, everything should be ready to go!
@LeWaldm can you do one last check and make sure I didn't break anything? Let me know if you have any questions about any of my changes.
Merging due to release deadline, but let us know if there are any bugs that need to be fixed in the 0.7.1 release.
Hi all,
This PR adds the Panopticon model to torchgeo.
Since Panopticon already has a working hubconf, we do not copy the code to torchgeo but just use torch.hub.load to initialise the model & load the weights. 3 thoughts:
Happy about any feedback & how to proceed!
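For reference, the hub-based initialisation described above boils down to something like the following; the repo path and entrypoint name are placeholders rather than the actual hubconf values:

```python
import torch

# Load Panopticon through torch.hub without vendoring its code into torchgeo.
# 'owner/panopticon' and 'panopticon_vitb14' are placeholder names, and the
# pretrained flag assumes the hub entrypoint accepts one.
model = torch.hub.load('owner/panopticon', 'panopticon_vitb14', pretrained=True)
model.eval()
```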