Dense Unsupervised Learning for Video Segmentation

Araslanov, Nikita; Schaub-Meyer, Simone; Roth, Stefan

datacite.relation.isDescribedBy	https://arxiv.org/abs/2111.06265
dc.contributor.author	Araslanov, Nikita
dc.contributor.author	Schaub-Meyer, Simone
dc.contributor.author	Roth, Stefan
dc.date.accessioned	2023-08-04T09:52:41Z
dc.date.available	2021-12-22T11:09:23Z
dc.date.available	2023-08-04T09:52:41Z
dc.date.created	2021-12
dc.date.issued	2023-08-04
dc.description	We present a novel approach to unsupervised learning for video object segmentation (VOS). Unlike previous work, our formulation allows learning dense feature representations directly in a fully convolutional regime. We rely on uniform grid sampling to extract a set of anchors and train our model to disambiguate between them on both inter- and intra-video levels. However, a naive scheme to train such a model results in a degenerate solution. We propose to prevent this with a simple regularisation scheme, accommodating the equivariance property of the segmentation task to similarity transformations. Our training objective admits efficient implementation and exhibits fast training convergence. On established VOS benchmarks, our approach exceeds the segmentation accuracy of previous work despite using significantly less training data and compute power.	de_DE
dc.identifier.uri	https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/3365.2
dc.language.iso	en	de_DE
dc.rights.license	Apache-2.0 (https://www.apache.org/licenses/LICENSE-2.0)
dc.subject	self-supervised learning	de_DE
dc.subject	video object segmentation	de_DE
dc.subject	representation learning	de_DE
dc.subject.classification	4.43-04
dc.subject.classification	4.43-05
dc.subject.ddc	004
dc.title	Dense Unsupervised Learning for Video Segmentation	de_DE
dc.type	Software	de_DE
dcterms.accessRights	openAccess
person.identifier.orcid	0000-0001-8644-1074
person.identifier.orcid	0000-0001-9002-9832
tuda.history.classification	Version=2016-2020;409-05 Interaktive und intelligente Systeme, Bild- und Sprachverarbeitung, Computergraphik und Visualisierung
tuda.project	EC/H2020 | 866008 | RED
tuda.project	HMWK | III L6-519/03/05.001-(0016) | emergenCity - TP Roth
tuda.unit	TUDa

Files

Original bundle

Name	Description	Size	Format
dense-ulearn-vos-code.zip	Training and inference code (PyTorch)	7.65 MB	ZIP archive
snapshots.tar.gz	Model parameters (snapshots)	402.41 MB
results.tar.gz	Inference results	8.38 GB
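
The three archives listed above can be unpacked with the Python standard library once they have been downloaded. The following is a minimal sketch, not part of the deposited code: the file names come from the listing, while the function name `unpack_all`, the directory arguments, and the one-subfolder-per-archive layout are assumptions made for illustration. The internal layout of the archives is not described in this record.

```python
import shutil
from pathlib import Path

# File names as listed in the record; everything else here is an assumption.
ARCHIVES = ["dense-ulearn-vos-code.zip", "snapshots.tar.gz", "results.tar.gz"]


def unpack_all(download_dir, target_dir, names=ARCHIVES):
    """Unpack each archive found in download_dir into its own subfolder of target_dir.

    Returns the names of the archives that were unpacked; missing files are skipped.
    """
    download_dir, target_dir = Path(download_dir), Path(target_dir)
    unpacked = []
    for name in names:
        archive = download_dir / name
        if not archive.is_file():
            continue  # not downloaded (results.tar.gz is 8.38 GB, so it may be left out)
        # Strip ".tar.gz" / ".zip" to get a subfolder name, e.g. "snapshots".
        stem = name.removesuffix(".tar.gz").removesuffix(".zip")
        # The archive format is inferred from the file extension.
        shutil.unpack_archive(str(archive), str(target_dir / stem))
        unpacked.append(name)
    return unpacked
```

Calling `unpack_all("downloads", "dense-ulearn-vos")` (both directory names hypothetical) would place each archive's contents under `dense-ulearn-vos/<archive name without extension>/`. Skipping missing files makes it possible to fetch only the code and the snapshots without the much larger results archive.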

Collections

Segmentation

Version History

You are currently viewing version no. 2 of the item. This is the most recent version.

Version	Date	Summary
2*	2022-01-04 17:58:21	Adding third party funding

* Selected version