
dc.contributor.author	Hahn, Oliver
dc.contributor.author	Reich, Christoph
dc.contributor.author	Araslanov, Nikita
dc.contributor.author	Cremers, Daniel
dc.contributor.author	Rupprecht, Christian
dc.contributor.author	Roth, Stefan
dc.date.accessioned	2025-04-03T14:12:02Z
dc.date.available	2025-04-03T14:12:02Z
dc.date.issued	2025-06-11
dc.identifier.uri	https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/4532
dc.description	Unsupervised panoptic segmentation aims to partition an image into semantically meaningful regions and distinct object instances without training on manually annotated data. In contrast to prior work on unsupervised panoptic scene understanding, we eliminate the need for object-centric training data, enabling the unsupervised understanding of complex scenes. To that end, we present the first unsupervised panoptic method that trains directly on scene-centric imagery. In particular, we propose an approach to obtain high-resolution panoptic pseudo labels on complex scene-centric data by combining visual representations, depth, and motion cues. Combining pseudo-label training with a panoptic self-training strategy yields a novel approach that accurately predicts panoptic segmentation of complex scenes without requiring any human annotations. Our approach significantly improves panoptic quality, e.g., surpassing the recent state of the art in unsupervised panoptic segmentation on Cityscapes by 9.4 percentage points in PQ. Acknowledgments: This project was partially supported by the European Research Council (ERC) Advanced Grant SIMULACRON, DFG project CR 250/26-1 "4D-YouTube", and GNI Project "AICC". This project has also received funding from the ERC under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 866008). This work has further been co-funded by the LOEWE initiative (Hesse, Germany) within the emergenCITY center [LOEWE/1/12/519/03/05.001(0016)/72] and by the State of Hesse through the cluster project "The Adaptive Mind (TAM)". Christoph Reich is supported by the Konrad Zuse School of Excellence in Learning and Intelligent Systems (ELIZA) through the DAAD programme Konrad Zuse Schools of Excellence in Artificial Intelligence, sponsored by the Federal Ministry of Education and Research.
License: Code, predictions, and checkpoints are released under the Apache-2.0 license, except for the ResNet-50 DINO backbone (dino_RN50_pretrain_d2_format.pkl), which is adapted from CutLER and published under the CC BY-NC-SA 4.0 license.	de_DE
dc.language.iso	en	de_DE
dc.relation	IsSupplementTo;arXiv;2504.01955
dc.rights	Apache License 2.0
dc.rights.uri	https://www.apache.org/licenses/LICENSE-2.0
dc.subject	unsupervised panoptic segmentation	de_DE
dc.subject	scene understanding	de_DE
dc.subject	unsupervised scene understanding	de_DE
dc.subject	unsupervised segmentation	de_DE
dc.subject	unsupervised learning	de_DE
dc.subject	panoptic segmentation	de_DE
dc.subject	segmentation	de_DE
dc.subject	computer vision	de_DE
dc.subject.classification	4.43-05 Image and Language Processing, Computer Graphics and Visualization, Human-Computer Interaction, Ubiquitous and Wearable Computing	de_DE
dc.subject.ddc	004
dc.title	Scene-Centric Unsupervised Panoptic Segmentation	de_DE
dc.type	Software	de_DE
tud.project	EC/H2020 | 866008 | RED	de_DE
tud.project	HMWK | III L6-519/03/05.001-(0016) | emergenCity - TP Roth	de_DE
tud.project	HMWK | 500/10.001-(00012) | TAM - TP Roth	de_DE
tud.unit	TUDa
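The description above reports an improvement of 9.4 percentage points in PQ (panoptic quality). As background only (this sketch is not part of the released code), PQ matches predicted and ground-truth segments at IoU > 0.5 and factors into segmentation quality (SQ) and recognition quality (RQ); a minimal illustration:

```python
def panoptic_quality(matched_ious, num_fp, num_fn):
    """Panoptic Quality: PQ = SQ * RQ.

    matched_ious: IoU values of matched (true-positive) segment pairs,
                  each > 0.5 by the matching rule
    num_fp: unmatched predicted segments (false positives)
    num_fn: unmatched ground-truth segments (false negatives)
    """
    tp = len(matched_ious)
    denom = tp + 0.5 * num_fp + 0.5 * num_fn
    if denom == 0:  # no segments at all: PQ is defined as 0 here
        return 0.0
    sq = sum(matched_ious) / tp if tp else 0.0  # avg IoU of matches
    rq = tp / denom                             # F1-like detection term
    return sq * rq

# Example: two matches (IoU 0.8 and 0.6), one FP, one FN
pq = panoptic_quality([0.8, 0.6], num_fp=1, num_fn=1)
```

A gain of 9.4 percentage points means the PQ value (usually reported in percent) rises by 9.4 on this 0–100 scale.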


Files in this item


This item appears in the following collection(s):


Apache License 2.0
Except where otherwise noted, the license of this item is described as: Apache License 2.0