
dc.contributor.author: Loza Mencia, Eneldo
dc.contributor.author: de Melo, Gerard
dc.contributor.author: Nam, Jinseok
dc.date.accessioned: 2021-09-26T20:59:54Z
dc.date.available: 2021-09-26T20:59:54Z
dc.date.issued: 2016
dc.identifier.uri: https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/2936
dc.description: This entry contains the resources used in and resulting from: Eneldo Loza Mencía, Gerard de Melo and Jinseok Nam, Medical Concept Embeddings via Labeled Background Corpora, in: Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), 2016. In recent years, we have seen increasing interest in low-dimensional vector representations of words. Among other things, these facilitate computing word similarity and relatedness scores. The best-known algorithms for producing representations of this sort are the word2vec approaches. In this paper, we investigate a new model to induce such vector spaces for medical concepts, based on a joint objective that exploits not only word co-occurrences but also manually labeled documents, as available from sources such as PubMed. Our extensive experimental analysis shows that our embeddings lead to significantly higher correlations with human similarity and relatedness assessments than previous work. Due to the simplicity and versatility of vector representations, these findings suggest that our resource can easily be used as a drop-in replacement to improve any system relying on medical concept similarity measures. [de_DE]
dc.language.iso: en [de_DE]
dc.relation: IsVersionOf;URL;https://www.ke.tu-darmstadt.de/resources/medsim
dc.relation: IsSupplementTo;ISBN;978-2-9517408-9-1
dc.rights.uri: https://rightsstatements.org/vocab/InC/1.0/
dc.subject: Embeddings [de_DE]
dc.subject: Medical Concepts [de_DE]
dc.subject: Semantic Similarity [de_DE]
dc.subject: MeSH [de_DE]
dc.subject.classification: 104-04 Applied Linguistics, Experimental Linguistics, Computational Linguistics [de_DE]
dc.subject.classification: 409-05 Interactive and Intelligent Systems, Image and Language Processing, Computer Graphics and Visualisation [de_DE]
dc.subject.ddc: 400
dc.subject.ddc: 004
dc.title: Medical Concept Embeddings via Labeled Background Corpora [de_DE]
dc.type: Dataset [de_DE]
dc.type: Text [de_DE]
dc.type: Model [de_DE]
dc.description.version: Version 1.0 [de_DE]
tud.unit: TUDa



Except where otherwise noted, this item's license is described as: In Copyright