Abstract
As the volume, accuracy and precision of digital geographic information have increased, concerns regarding individual privacy and confidentiality have come to the forefront. Not only do these challenge a basic tenet underlying the advancement of science by posing substantial obstacles to the sharing of data to validate research results, but they are obstacles to conducting certain research projects in the first place. Geospatial cryptography involves the specification, design, implementation and application of cryptographic techniques to address privacy, confidentiality and security concerns for geographically referenced data. This article defines geospatial cryptography and demonstrates its application in cancer control and surveillance. Four use cases are considered: (1) national‐level de‐duplication among state or province‐based cancer registries; (2) sharing of confidential data across cancer registries to support case aggregation across administrative geographies; (3) secure data linkage; and (4) cancer cluster investigation and surveillance. A secure multi-party system for geospatial cryptography is developed. Solutions under geospatial cryptography are presented and computation time is calculated. As services provided by cancer registries to the research community, de-duplication, case aggregation across administrative geographies and secure data linkage are often time-consuming and in some instances precluded by confidentiality and security concerns. Geospatial cryptography provides secure solutions that hold significant promise for addressing these concerns and for accelerating the pace of research with human subjects data residing in our nation’s cancer registries. Pursuit of the research directions posed herein conceivably would lead to a geospatially encrypted geographic information system (GEGIS) designed specifically to promote the sharing and spatial analysis of confidential data. 
Geospatial cryptography holds substantial promise for accelerating the pace of research with spatially referenced human subjects data.
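To make the case-aggregation use case more concrete, the following is a minimal sketch of additively homomorphic encryption in the style of the Paillier cryptosystem: each registry encrypts its local case count, an untrusted aggregator combines the ciphertexts, and only the key holder can decrypt the total. This is an illustration under stated assumptions, not the system developed in the article. The function names (`keygen`, `encrypt`, `add_encrypted`, `decrypt`), the tiny primes and the example counts are all hypothetical, and the key size is far too small for real use.

```python
import math
import secrets

def keygen(p: int, q: int):
    """Toy Paillier key generation from two primes (assumed prime, not checked)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                  # common simplifying choice of generator
    mu = pow(lam, -1, n)       # inverse of lam mod n; valid because g = n + 1
    return (n, g), (lam, mu)

def encrypt(pub, m: int) -> int:
    """Encrypt an integer 0 <= m < n with fresh randomness."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def add_encrypted(pub, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)

def decrypt(pub, priv, c: int) -> int:
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

# Hypothetical example: three registries report counts for a shared region.
pub, priv = keygen(1009, 1013)            # toy primes, illustration only
registry_counts = [12, 7, 30]             # made-up case counts
ciphertexts = [encrypt(pub, k) for k in registry_counts]

encrypted_total = ciphertexts[0]
for c in ciphertexts[1:]:
    encrypted_total = add_encrypted(pub, encrypted_total, c)

total = decrypt(pub, priv, encrypted_total)
```

In this sketch the aggregator sees only ciphertexts, so individual registry counts stay confidential while the decrypted `total` equals the sum of the inputs, provided the sum is smaller than the modulus `n`.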
Financial support
This study was supported by the National Library of Medicine Grant R21 LM011132-01A1 (PI G. M. Jacquez).
Ethics declarations
Conflict of interest
The authors declare no potential conflicts of interest.
Cite this article
Jacquez, G.M., Essex, A., Curtis, A. et al. Geospatial cryptography: enabling researchers to access private, spatially referenced, human subjects data for cancer control and prevention. J Geogr Syst 19, 197–220 (2017). https://doi.org/10.1007/s10109-017-0252-3
DOI: https://doi.org/10.1007/s10109-017-0252-3