Source Code, Data and Additional Material for the Thesis: "Aspects of Coherence for Entity Analysis"
This dataset contains the source code and system output used in the PhD thesis "Aspects of Coherence for Entity Analysis". It is split into three archives, one for each chapter describing a main contribution of the thesis:
- chapter3.tar.gz: Java source code for the entity linking system based on interleaved multitasking, system results, system output. Java and Python source code for automatic verification of entity linking results. Java source code for the Visual Entity Explorer.
- chapter4.tar.gz: Java and Scala source code for extracting pairs of terms and their dependency context from GigaWord and Wikilinks.
- chapter5.tar.gz: Python code used to run entity typing experiments.
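To get an overview of what each archive holds before unpacking it, the contents can be listed with Python's standard `tarfile` module. The following is a minimal sketch, assuming the three archives have been downloaded into the current working directory. The helper names `list_members` and `count_by_suffix` are illustrative and not part of the dataset.

```python
import tarfile
from pathlib import Path

# Archive names as given in the dataset description.
ARCHIVES = ["chapter3.tar.gz", "chapter4.tar.gz", "chapter5.tar.gz"]


def list_members(archive_path):
    """Return the names of all entries in a gzip-compressed tar archive."""
    with tarfile.open(archive_path, mode="r:gz") as tar:
        return tar.getnames()


def count_by_suffix(names):
    """Count entries per file extension; entries without one go under '(none)'."""
    counts = {}
    for name in names:
        suffix = Path(name).suffix or "(none)"
        counts[suffix] = counts.get(suffix, 0) + 1
    return counts


if __name__ == "__main__":
    # Assumption: the archives sit in the current directory; missing ones are skipped.
    for archive in ARCHIVES:
        if not Path(archive).exists():
            print(f"{archive}: not found, skipping")
            continue
        names = list_members(archive)
        print(f"{archive}: {len(names)} entries, by extension: {count_by_suffix(names)}")
```

Counting by extension gives a quick check that an archive contains the languages the description mentions (for example `.java` and `.py` files in chapter3.tar.gz) without extracting anything to disk.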
Subject: Computer and Information Science
Graff, David, and Christopher Cieri. English Gigaword LDC2003T05. Web Download. Philadelphia: Linguistic Data Consortium, 2003. URL: https://catalog.ldc.upenn.edu/LDC2003T05
Singh, Sameer, Amarnag Subramanya, Fernando Pereira, and Andrew McCallum. "Wikilinks: A large-scale cross-document coreference corpus labeled via links to Wikipedia." University of Massachusetts, Amherst, Tech. Rep. UM-CS-2012-015 (2012). URL: http://www.iesl.cs.umass.edu/data/data-wiki-links