
dc.contributor.author: Araslanov, Nikita
dc.contributor.author: Rothkopf, Constantin
dc.contributor.author: Roth, Stefan
dc.date.accessioned: 2021-12-22T11:09:38Z
dc.date.available: 2021-12-22T11:09:38Z
dc.date.issued: 2019-06
dc.identifier.uri: https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/3368
dc.description: Most approaches to visual scene analysis have emphasised parallel processing of the image elements. However, one area in which the sequential nature of vision is apparent is that of segmenting multiple, potentially similar and partially occluded objects in a scene. In this work, we revisit the recurrent formulation of this challenging problem in the context of reinforcement learning. Motivated by the limitations of the global max-matching assignment of the ground-truth segments to the recurrent states, we develop an actor-critic approach in which the actor recurrently predicts one instance mask at a time and utilises the gradient from a concurrently trained critic network. We formulate the state, action, and the reward so as to let the critic model long-term effects of the current prediction and incorporate this information into the gradient signal. Furthermore, to enable effective exploration in the inherently high-dimensional action space of instance masks, we learn a compact representation using a conditional variational auto-encoder. We show that our actor-critic model consistently provides accuracy benefits over the recurrent baseline on standard instance segmentation benchmarks. [de_DE]
dc.language.iso: en [de_DE]
dc.relation: IsDescribedBy;arXiv;1904.05126
dc.rights: Apache License 2.0
dc.rights.uri: https://www.apache.org/licenses/LICENSE-2.0
dc.subject: actor-critic [de_DE]
dc.subject: reinforcement learning [de_DE]
dc.subject: instance segmentation [de_DE]
dc.subject.classification: 4.43-04 Artificial Intelligence and Machine Learning Methods [de_DE]
dc.subject.classification: 4.43-05 Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
dc.subject.ddc: 004
dc.title: Actor-critic Instance Segmentation [de_DE]
dc.type: Software [de_DE]
tud.unit: TUDa
tud.history.classification: Version=2016-2020;409-05 Interactive and Intelligent Systems, Image and Language Processing, Computer Graphics and Visualisation




Except where otherwise noted, this item's license is described as: Apache License 2.0