Dense Unsupervised Learning for Video Segmentation

datacite.relation.isDescribedBy https://arxiv.org/abs/2111.06265
dc.contributor.author Araslanov, Nikita
dc.contributor.author Schaub-Meyer, Simone
dc.contributor.author Roth, Stefan
dc.date.accessioned 2023-08-04T09:52:41Z
dc.date.available 2021-12-22T11:09:23Z
dc.date.available 2023-08-04T09:52:41Z
dc.date.created 2021-12
dc.date.issued 2023-08-04
dc.description We present a novel approach to unsupervised learning for video object segmentation (VOS). Unlike previous work, our formulation allows learning dense feature representations directly in a fully convolutional regime. We rely on uniform grid sampling to extract a set of anchors and train our model to disambiguate between them on both inter- and intra-video levels. However, a naive scheme to train such a model results in a degenerate solution. We propose to prevent this with a simple regularisation scheme, accommodating the equivariance property of the segmentation task to similarity transformations. Our training objective admits efficient implementation and exhibits fast training convergence. On established VOS benchmarks, our approach exceeds the segmentation accuracy of previous work despite using significantly less training data and compute power. de_DE
dc.identifier.uri https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/3365.2
dc.language.iso en de_DE
dc.rights.license Apache-2.0 (https://www.apache.org/licenses/LICENSE-2.0)
dc.subject self-supervised learning de_DE
dc.subject video object segmentation de_DE
dc.subject representation learning de_DE
dc.subject.classification 4.43-04
dc.subject.classification 4.43-05
dc.subject.ddc 004
dc.title Dense Unsupervised Learning for Video Segmentation de_DE
dc.type Software de_DE
dcterms.accessRights openAccess
person.identifier.orcid #PLACEHOLDER_PARENT_METADATA_VALUE#
person.identifier.orcid 0000-0001-8644-1074
person.identifier.orcid 0000-0001-9002-9832
tuda.history.classification Version=2016-2020;409-05 Interaktive und intelligente Systeme, Bild- und Sprachverarbeitung, Computergraphik und Visualisierung
tuda.project EC/H2020 | 866008 | RED
tuda.project HMWK | III L6-519/03/05.001-(0016) | emergenCity - TP Roth
tuda.unit TUDa

Files

Original bundle

Name | Description | Size | Format
dense-ulearn-vos-code.zip | Training and inference code (PyTorch) | 7.65 MB | ZIP archive
snapshots.tar.gz | Model parameters (snapshots) | 402.41 MB |
results.tar.gz | Inference results | 8.38 GB |

Version History

Version | Date | Summary
2* | 2022-01-04 17:58:21 | Adding third party funding
* Selected version