dc.contributor.author | Hahn, Oliver | |
dc.contributor.author | Reich, Christoph | |
dc.contributor.author | Araslanov, Nikita | |
dc.contributor.author | Cremers, Daniel | |
dc.contributor.author | Rupprecht, Christian | |
dc.contributor.author | Roth, Stefan | |
dc.date.accessioned | 2025-04-03T14:12:02Z | |
dc.date.available | 2025-04-03T14:12:02Z | |
dc.date.issued | 2025-06-11 | |
dc.identifier.uri | https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/4532 | |
dc.description | Unsupervised panoptic segmentation aims to partition an image into semantically meaningful regions and distinct object instances without training on manually annotated data. In contrast to prior work on unsupervised panoptic scene understanding, we eliminate the need for object-centric training data, enabling the unsupervised understanding of complex scenes. To that end, we present the first unsupervised panoptic method that trains directly on scene-centric imagery. In particular, we propose an approach for obtaining high-resolution panoptic pseudo labels on complex scene-centric data by combining visual representations, depth, and motion cues. Combining pseudo-label training with a panoptic self-training strategy yields a novel approach that accurately predicts the panoptic segmentation of complex scenes without requiring any human annotations. Our approach significantly improves panoptic quality, e.g., surpassing the recent state of the art in unsupervised panoptic segmentation on Cityscapes by 9.4 percentage points in PQ. Acknowledgments: This project was partially supported by the European Research Council (ERC) Advanced Grant SIMULACRON, DFG project CR 250/26-1 "4D-YouTube", and GNI Project "AICC". This project has also received funding from the ERC under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 866008). This work has further been co-funded by the LOEWE initiative (Hesse, Germany) within the emergenCITY center [LOEWE/1/12/519/03/05.001(0016)/72] and by the State of Hesse through the cluster project "The Adaptive Mind (TAM)". Christoph Reich is supported by the Konrad Zuse School of Excellence in Learning and Intelligent Systems (ELIZA) through the DAAD programme Konrad Zuse Schools of Excellence in Artificial Intelligence, sponsored by the Federal Ministry of Education and Research. 
License: Code, predictions, and checkpoints are released under the Apache-2.0 license, except for the ResNet-50 DINO backbone (dino_RN50_pretrain_d2_format.pkl), which is adapted from CutLER and published under the CC BY-NC-SA 4.0 license. | de_DE |
dc.language.iso | en | de_DE |
dc.relation | IsSupplementTo;arXiv;2504.01955 | |
dc.rights | Apache License 2.0 | |
dc.rights.uri | https://www.apache.org/licenses/LICENSE-2.0 | |
dc.subject | unsupervised panoptic segmentation | de_DE |
dc.subject | scene understanding | de_DE |
dc.subject | unsupervised scene understanding | de_DE |
dc.subject | unsupervised segmentation | de_DE |
dc.subject | unsupervised learning | de_DE |
dc.subject | panoptic segmentation | de_DE |
dc.subject | segmentation | de_DE |
dc.subject | computer vision | de_DE |
dc.subject.classification | 4.43-05 Image and Language Processing, Computer Graphics and Visualization, Human-Computer Interaction, Ubiquitous and Wearable Computing | de_DE |
dc.subject.ddc | 004 | |
dc.title | Scene-Centric Unsupervised Panoptic Segmentation | de_DE |
dc.type | Software | de_DE |
tud.project | EC/H2020 | 866008 | RED | de_DE |
tud.project | HMWK | III L6-519/03/05.001-(0016) | emergenCITY - TP Roth | de_DE |
tud.project | HMWK | 500/10.001-(00012) | TAM - TP Roth | de_DE |
tud.unit | TUDa | |