3.11.4 (main, Aug 16 2023, 05:31:52) [GCC 10.2.1 20210110]
Preface
Geocomputation with Python (geocompy) is motivated by the need for an introductory resource for working with geographic data with the most popular programming language in the world. A unique selling point of the book is its cohesive and joined-up coverage of both vector and raster geographic data models and consistent learning curve. We aim to minimize surprises, with each section and chapter building on the previous. If you're just starting out with Python for working with geographic data, this book is an excellent place to start.
There are many resources on Python and on "GeoPython", but none that fills this need for an introductory resource providing strong foundations for future work. We want to avoid reinventing the wheel and to provide something that fills an "ecological niche" in the wider free and open source software for geospatial (FOSS4G) ecosystem. Key features include:
- Doing basic operations well
- Integration of vector and raster dataset operations
- Clear explanation of each line of code in the book to minimize surprises
- Provision of lucid example datasets and meaningful operations to illustrate the applied nature of geographic research
This book is complementary to, and adds value to, other projects in the ecosystem, as highlighted in the following comparison between Geocomputation with Python and related GeoPython books:
- Learning Geospatial Analysis with Python and Geoprocessing with Python are books in this space that focus on processing spatial data using low-level Python interfaces for GDAL, such as the gdal, gdalnumeric, and ogr packages from osgeo. This approach requires writing more lines of code. We believe our approach is more "Pythonic" and future-proof, in light of development of packages such as geopandas and rasterio.
- Introduction to Python for Geographic Data Analysis (in progress) seeks to provide a general introduction to "GIS in Python", with parts focusing on Python essentials, using Python with GIS, and case studies. Compared with this book, which is also open source, and is hosted at pythongis.org, Geocomputation with Python has a narrower scope (not covering spatial network analysis, for example) and more coverage of raster data processing and raster-vector interoperability.
- Geographic Data Science with Python is an ambitious project with chapters dedicated to advanced topics; Chapter 4 on Spatial Weights, for example, gets into complex topics relatively early.
- Python for Geospatial Data Analysis introduces a wide range of approaches to working with geospatial data using Python, including automation of proprietary and open-source GIS software, as well as standalone open source Python packages (which is what we focus on and explain comprehensively in our book). Geocompy is shorter, simpler and more introductory, and covers raster and vector data with equal importance.
Another unique feature of the book is that it is part of a wider community. Geocomputation with Python is a sister project of Geocomputation with R (Lovelace, Nowosad, and Muenchow 2019), a book on geographic data analysis, visualization, and modeling using the R programming language that has 60+ contributors and an active community, not least in the associated Discord group. Links with the vibrant "R-spatial" community, and other communities such as GeoRust and JuliaGeo, lead to many opportunities for mutual benefit across open source ecosystems.
Prerequisites
We assume that the reader is:
- familiar with the Python language,
- capable of running Python code and installing Python packages, and
- familiar with the numpy and pandas packages for working with data in Python.
From that starting point on, the book introduces the topic of working with spatial data in Python, through dedicated third-party packages, most importantly geopandas and rasterio.
We also assume familiarity with theoretical concepts of geographic data and GIS, such as coordinate systems, projections, spatial layer file formats, etc., which is necessary for understanding the reasoning of the examples.
Code and sample data
To run the code examples, you can download the ZIP file of the GitHub repository. In the ZIP file, the ipynb directory contains the source files of the chapters in Jupyter Notebook format, the data directory contains the sample data files, and the output directory contains the files created in code examples (some of which are also used as inputs in other code sections). Place them together as follows to run the code:
├── data
│   ├── aut.tif
│   ├── ch.tif
│   ├── coffee_data.csv
│   ├── cycle_hire.gpkg
│   ├── cycle_hire_osm.gpkg
│   ├── cycle_hire_xy.csv
│   ├── dem.tif
│   ├── landsat.tif
│   ├── nlcd.tif
│   ├── nz_elev.tif
│   ├── nz.gpkg
│   ├── nz_height.gpkg
│   ├── seine.gpkg
│   ├── srtm.tif
│   ├── us_states.gpkg
│   ├── world.gpkg
│   ├── world_wkt.csv
│   ├── zion.gpkg
│   └── zion_points.gpkg
├── output
│   ├── cycle_hire_xy.csv
│   ├── dem_agg5.tif
│   ├── dem_contour.gpkg
│   ├── dem_resample_maximum.tif
│   ├── dem_resample_nearest.tif
│   ├── elev.tif
│   ├── grain.tif
│   ├── map.html
│   ├── ne_10m_airports.cpg
│   ├── ne_10m_airports.dbf
│   ├── ne_10m_airports.prj
│   ├── ne_10m_airports.README.html
│   ├── ne_10m_airports.shp
│   ├── ne_10m_airports.shx
│   ├── ne_10m_airports.VERSION.txt
│   ├── ne_10m_airports.zip
│   ├── nlcd_4326_2.tif
│   ├── nlcd_4326.tif
│   ├── nlcd_modified_crs.tif
│   ├── plot_geopandas.jpg
│   ├── plot_rasterio2.svg
│   ├── plot_rasterio.jpg
│   ├── r3.tif
│   ├── r_nodata_float.tif
│   ├── r_nodata_int.tif
│   ├── r.tif
│   ├── srtm_32612_aspect.tif
│   ├── srtm_32612_slope.tif
│   ├── srtm_32612.tif
│   ├── srtm_masked_cropped.tif
│   ├── srtm_masked.tif
│   ├── w_many_features.gpkg
│   ├── w_many_layers.gpkg
│   └── world.gpkg
├── 01-spatial-data.ipynb
├── 02-attribute-operations.ipynb
├── 03-spatial-operations.ipynb
├── 04-geometry-operations.ipynb
├── 05-raster-vector.ipynb
├── 06-reproj.ipynb
├── 07-read-write.ipynb
└── 08-mapping.ipynb
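Before opening the notebooks, it can help to confirm that the files ended up in the layout listed above. The following is a minimal sketch using only the standard library, not part of the book's own code: the `missing_paths` helper and the short `EXPECTED` list are hypothetical names introduced here for illustration, and the list checks only a hand-picked subset of the paths shown in the listing.

```python
from pathlib import Path

# Hypothetical, hand-picked subset of the paths from the listing above;
# extend it to cover whichever files a given chapter needs.
EXPECTED = [
    "data/world.gpkg",
    "data/srtm.tif",
    "output",
    "01-spatial-data.ipynb",
]


def missing_paths(root, expected=EXPECTED):
    """Return the entries of `expected` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the directory holding the notebooks.
    missing = missing_paths(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All checked paths are present.")
```

An empty result means every checked path was found; anything listed as missing points to a file or directory that still needs to be moved into place.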
Software
Python version used when rendering the book: 3.11.4 (main, Aug 16 2023, 05:31:52) [GCC 10.2.1 20210110]
Versions of the main packages used in the book:
numpy==2.0.1
pandas==2.2.2
shapely==2.0.5
geopandas==1.0.1
rasterio==1.3.10
matplotlib==3.9.0
rasterstats==0.19.0
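To compare a local environment against the versions listed above, the installed versions can be queried from package metadata. This is a minimal sketch that assumes nothing beyond the standard library (`sys` and `importlib.metadata`, available from Python 3.8); `installed_versions` and `PACKAGES` are hypothetical names introduced here, and this is not necessarily how the list above was generated.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the list above.
PACKAGES = [
    "numpy", "pandas", "shapely", "geopandas",
    "rasterio", "matplotlib", "rasterstats",
]


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if absent."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            result[name] = None
    return result


if __name__ == "__main__":
    print(sys.version)
    for name, ver in installed_versions().items():
        print(f"{name}=={ver}" if ver else f"{name}: not installed")
```

Differences from the listed versions do not necessarily break the examples, but they are a reasonable first thing to check when output differs from the book.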
Acknowledgements
We acknowledge Robin Lovelace, Jakub Nowosad, and Jannes Muenchow, authors of Geocomputation with R (Robin and Jakub are also authors of the present book), a book on the same topic for a different programming language (R). The structure, topics, and most of the theoretical discussions were adapted from that earlier publication.
We thank the authors of the Python language, and the authors of the numpy, pandas, shapely, geopandas, and rasterio packages which are used extensively in the book, for building these wonderful tools.
We acknowledge GitHub users Will Deakin, Sean Gillies, Josh Cole, and Jt Miclat (at the time of writing; full list on GitHub) for their contributions during the open-source development of the book.