Abstract
This paper presents DIS-C, a novel method for assessing the conceptual distance between concepts within an ontology. DIS-C is graph based in the sense that the whole topology of the ontology is considered when computing the weight of the relationships between concepts. The methodology consists of two main steps. First, to take advantage of prior knowledge, a domain expert assigns an initial weight to each relation in the ontology. Then, an automatic method refines the weights assigned to each relation until they reach a stable state. We introduce a metric called generality, which evaluates the accessibility of each concept by treating the ontology as a strongly connected graph. Unlike most previous approaches, the DIS-C algorithm computes similarity between concepts in ontologies that are not necessarily represented as a hierarchical or taxonomic structure. DIS-C can therefore incorporate a wide variety of relationships between concepts, such as meronymy, antonymy, functionality and causality.
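The two-step procedure described in the abstract (expert-assigned initial weights, followed by automatic refinement until the weights stabilise) can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything below is an assumption made for illustration only: the relation names, the toy ontology, the `generality` score (a ratio of outgoing to total shortest-path cost) and the update rule in `refine` are hypothetical stand-ins, not the definitions used by DIS-C.

```python
from itertools import product


def all_pairs_distances(nodes, weights):
    """Floyd-Warshall over directed edge weights {(source, target): cost}."""
    inf = float("inf")
    dist = {(a, b): (0.0 if a == b else weights.get((a, b), inf))
            for a, b in product(nodes, repeat=2)}
    # product() varies its first element slowest, so k is the outer loop.
    for k, i, j in product(nodes, repeat=3):
        if dist[i, k] + dist[k, j] < dist[i, j]:
            dist[i, j] = dist[i, k] + dist[k, j]
    return dist


def generality(nodes, dist):
    """Hypothetical accessibility score in (0, 1); NOT the paper's formula."""
    scores = {}
    for x in nodes:
        outgoing = sum(dist[x, y] for y in nodes)  # cost of reaching others
        incoming = sum(dist[y, x] for y in nodes)  # cost of being reached
        scores[x] = outgoing / (outgoing + incoming)
    return scores


def refine(edges, expert_weights, tol=1e-6, max_iter=100):
    """Start from expert weights, then iterate until the weights stabilise.

    The update rule is an illustrative assumption: each edge keeps its
    expert weight scaled by a factor that depends on the target's score.
    """
    nodes = sorted({n for s, _, t in edges for n in (s, t)})
    weights = {(s, t): expert_weights[rel] for s, rel, t in edges}
    for iteration in range(1, max_iter + 1):
        scores = generality(nodes, all_pairs_distances(nodes, weights))
        updated = {(s, t): expert_weights[rel] * (0.5 + scores[t])
                   for s, rel, t in edges}
        change = max(abs(updated[e] - weights[e]) for e in weights)
        weights = updated
        if change < tol:  # stable state reached
            break
    return weights, scores, iteration


# Toy ontology (hypothetical): every relation has an inverse, so the graph
# is strongly connected and all shortest-path distances are finite.
toy_edges = [
    ("wheel", "part_of", "car"),
    ("car", "has_part", "wheel"),
    ("car", "is_a", "vehicle"),
    ("vehicle", "has_subclass", "car"),
]
expert = {"part_of": 1.0, "has_part": 2.0, "is_a": 1.0, "has_subclass": 3.0}

weights, scores, iterations = refine(toy_edges, expert)
```

The iteration cap is there because convergence of this made-up update rule is not guaranteed; the loop simply stops early once the largest weight change falls below `tol`.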
Notes
Formally, the output is not a distance, since some conditions, such as symmetry and the triangle inequality, are not met.
The distance is inversely proportional to the absolute value of the correlation.
Simple rounding.
As mentioned, the conceptual distance is not symmetric (\(\exists a,b\in C \mid \Delta_{K}(a,b)\ne \Delta_{K}(b,a)\)). We therefore report the conceptual distance from word A to word B (column DIS-C(to)), from word B to word A (column DIS-C(from)), the average of these two distances (column DIS-C(avg)), their minimum (column DIS-C(min)) and their maximum (column DIS-C(max)).
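A small sketch may help show how the five reported columns relate to one another. It assumes, purely for illustration, that the distance between two concepts is a shortest-path cost in a directed weighted graph; the graph, its weights and the function names `shortest_distance` and `distance_summary` are hypothetical and are not taken from the paper.

```python
import heapq


def shortest_distance(graph, source, target):
    """Dijkstra on a directed graph given as {node: {neighbor: weight}}."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph.get(node, {}).items():
            candidate = dist + weight
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return float("inf")  # target not reachable


def distance_summary(graph, a, b):
    """Return the to/from/avg/min/max values for the ordered pair (a, b)."""
    to = shortest_distance(graph, a, b)
    frm = shortest_distance(graph, b, a)
    return {
        "to": to,
        "from": frm,
        "avg": (to + frm) / 2,
        "min": min(to, frm),
        "max": max(to, frm),
    }


# Hypothetical weights: opposite directions of an edge cost different
# amounts, which is what makes the resulting distance asymmetric.
toy_graph = {
    "car": {"vehicle": 1.0},
    "vehicle": {"car": 3.0, "bicycle": 2.0},
    "bicycle": {"vehicle": 1.0},
}

summary = distance_summary(toy_graph, "car", "bicycle")
# summary == {"to": 3.0, "from": 4.0, "avg": 3.5, "min": 3.0, "max": 4.0}
```

Here the path from "car" to "bicycle" costs 1.0 + 2.0, while the return path costs 1.0 + 3.0, so the two directions differ and the average, minimum and maximum are all distinct.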
Acknowledgements
This work was partially sponsored by Instituto Politécnico Nacional and SIP-IPN under Grants 20182159, 20180308, 20180409, 20180773, 20180839 and 20181568. It was also sponsored by Consejo Nacional de Ciencia y Tecnología (CONACyT) under Grant PN-2016/2110. We thank the reviewers for their invaluable and constructive feedback, which helped improve the quality of this paper.
Cite this article
Quintero, R., Torres-Ruiz, M., Menchaca-Mendez, R. et al. DIS-C: conceptual distance in ontologies, a graph-based approach. Knowl Inf Syst 59, 33–65 (2019). https://doi.org/10.1007/s10115-018-1200-3
DOI: https://doi.org/10.1007/s10115-018-1200-3