Establishing a framework for Open Geographic Information science

Authors

Alex David Singleton; Seth Spielman; Chris Brunsdon

Published

August 2, 2016

Alex David Singleton; Seth Spielman; Chris Brunsdon (2016). Establishing a framework for Open Geographic Information science. International Journal of Geographical Information Science, 30(8), 1507-1521. DOI: 10.1080/13658816.2015.1137579

Abstract

When conducting research within a framework of Geographic Information Science (GISc), the scientific validity of this work can be argued as highly dependent upon the extent to which the methods employed are reproducible, and that, in the strictest sense, can only be fully achieved by implementing transparent workflows that utilize both open source software and openly available data. After considering the scientific implications of non-reproducible methods, we provide a review of both open source Geographic Information Systems (GIS) and openly available data, before describing an integrated model for Open GISc. We conclude with a critical review of this embryonic paradigm, with directions for future development in supporting spatial data infrastructure.

Extended Summary

This paper examines how Geographic Information Science (GISc) can enhance research reproducibility and transparency through open source software, open data, and workflow-based publishing models. The research addresses growing concerns about scientific reproducibility across disciplines by proposing a comprehensive framework for ‘Open GISc’ that ensures geographic research can be verified and replicated by third parties.
The paper employs a systematic review approach, examining current practices in open source Geographic Information Systems (GIS), openly available spatial data resources, and reproducible research methodologies. The analysis draws on examples from climate science, economics, and medical research to illustrate the dangers of non-reproducible methods, including high-profile cases where coding errors or data access restrictions undermined scientific credibility.
The study demonstrates that open source GIS software, particularly programming environments like R and Python with spatial analysis libraries, offers superior transparency compared to closed-source alternatives. These tools preserve analytical workflows through code, enabling others to scrutinise methods and identify potential errors during peer review.
The research identifies significant growth in open data initiatives globally, with government portals increasingly providing free access to geographic datasets under permissive licences. However, the paper acknowledges persistent challenges including software licence compatibility issues, limited developer resources for code auditing, and difficulties handling sensitive or confidential data. To address these constraints, the study proposes emerging solutions like differential privacy techniques and synthetic data generation methods.
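To make the idea of differential privacy more concrete, the following is a minimal sketch of the Laplace mechanism applied to a count query. The summary does not say which mechanism the paper has in mind, so this is a standard textbook illustration rather than the authors' method; the function names `laplace_noise` and `private_count` and the choice of `epsilon` are assumptions made for this example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace(0, scale) distribution."""
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution centred on zero with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count under the Laplace mechanism (illustrative only)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # Adding or removing one record changes a count by at most 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    # Hypothetical query: number of households in a small area.
    print(private_count(128, epsilon=0.5, rng=rng))
```

A smaller `epsilon` adds more noise and gives stronger privacy; a larger one keeps the released count closer to the true value.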
The paper advocates for a five-point framework requiring: publicly accessible data, open source software with scrutable code, transparent workflows linking data and analysis, peer review processes demanding reproducible submissions, and partial adoption of open principles where full implementation proves impossible. The research highlights workflow models combining programming languages with document markup systems (like R with LaTeX through Sweave) as particularly promising for creating self-contained, executable publications. These approaches enable readers to download papers that automatically retrieve data, execute analysis, and regenerate results including tables, graphs, and maps.
The study concludes that implementing Open GISc principles would enhance public accountability, reduce the cost of improving research, and advance the discipline by facilitating error detection and method validation. The paper calls for high-quality GISc journals to establish submission requirements for code and data sharing, arguing this represents a ‘higher’ form of publishing that shifts the burden of reproducibility from readers to authors. This framework has broader implications for spatial data infrastructure development, suggesting a need for services that support secure analysis of sensitive data whilst maintaining privacy protections.
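The executable-publication workflow described above (retrieve data, run the analysis, regenerate the results) can be sketched in a few lines. This is a minimal illustration in Python rather than the R and Sweave toolchain the paper mentions, and it is not the authors' implementation. The inline dataset `RAW_CSV`, the checksum constant, and all function names are hypothetical; a real workflow would download an openly licensed file and compare it against a checksum fixed in the publication.

```python
import csv
import hashlib
import io

# Hypothetical inline dataset standing in for an openly licensed download.
RAW_CSV = "zone,population,area_km2\nA,1200,3.0\nB,800,2.0\nC,450,1.5\n"


def sha256_of(text: str) -> str:
    """Return the SHA-256 hex digest of a text payload."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def load_rows(text: str) -> list:
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def population_density(rows: list) -> dict:
    """Analysis step: people per square kilometre for each zone."""
    return {r["zone"]: float(r["population"]) / float(r["area_km2"]) for r in rows}


def render_table(density: dict) -> str:
    """Regenerate the results table as plain text."""
    lines = ["zone  density_per_km2"]
    for zone in sorted(density):
        lines.append(f"{zone:<4}  {density[zone]:.1f}")
    return "\n".join(lines)


def run_workflow(text: str, expected_sha256: str) -> str:
    """Verify the input data, run the analysis, and rebuild the table."""
    if sha256_of(text) != expected_sha256:
        raise ValueError("input data does not match the published checksum")
    return render_table(population_density(load_rows(text)))


# In a real paper this digest would be a fixed string recorded at publication.
PUBLISHED_SHA256 = sha256_of(RAW_CSV)

if __name__ == "__main__":
    print(run_workflow(RAW_CSV, PUBLISHED_SHA256))
```

Checking the checksum before any analysis means a reader who regenerates the table knows it was built from the same data the authors used, which is the link between data and analysis that the framework asks for.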

Key Findings

  • Open source GIS software with transparent code enables superior reproducibility compared to closed-source alternatives by preserving analytical workflows
  • Government open data initiatives globally provide increasing access to geographic datasets under permissive licences for research reproduction
  • Workflow models combining programming languages with markup systems create self-contained executable publications for enhanced transparency
  • High-profile research failures demonstrate the critical importance of reproducible methods for maintaining scientific credibility and public trust
  • The five-point Open GISc framework requires accessible data, open software, transparent workflows, peer review that demands reproducible submissions, and partial adoption where constraints exist

Citation


@article{singleton2016establishing,
  author = {Alex David Singleton and Seth Spielman and Chris Brunsdon},
  title = {Establishing a framework for Open Geographic Information science},
  journal = {International Journal of Geographical Information Science},
  year = {2016},
  volume = {30},
  number = {8},
  pages = {1507--1521},
  doi = {10.1080/13658816.2015.1137579}
}