core API

class ptolemy.raster.IndexRaster(indicator: DataArray, boundary: DataArray, index: Index)

Bases: object

Reduced weighted raster for aggregating and gridding on a lat, lon grid.

indicator

Integer values from 1 to number of shapes indicating cells fully belonging to a single shape

Type:

xr.DataArray (lon, lat)

boundary

Weights per boundary cell and shape

Type:

xr.DataArray (spatial, shape)

index

Names of each shape

Type:

pd.Index

name

Dimension name of shapes

Type:

str

cell_area

Area of each grid cell in m^2

Type:

xr.DataArray (lat)

Classmethods
------------
from_weighted_raster

Creates indexraster from a weighted raster idxraster

from_netcdf

Reads indexraster from custom netcdf format

to_netcdf()

Saves indexraster to custom netcdf format

aggregate()

Aggregates given data for each shape

grid()

Creates a grid where panel data fills shapes

aggregate(ndraster: DataArray, func: str = 'sum', interior_only: bool = False) DataArray

Aggregate data in ndraster per country.

Uses flox for efficient computation of statistics on the country interior for chunked data.

Parameters:
  • ndraster (xr.DataArray) – Data to aggregate, needs spatial dimensions

  • func (str, optional) – Statistic to compute per country; statistics other than the default “sum” are only supported on the interior.

  • interior_only (bool, optional) – If True, only cells fully contained in a single country are taken into account. Required for statistics other than “sum”. By default False.

Returns:

Per country aggregations

Return type:

xr.DataArray

Raises:

NotImplementedError – If a statistic other than “sum” is requested while interior_only is False
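
The split between interior and boundary cells can be illustrated with a minimal pure-Python sketch. It is not the library's implementation, which operates on xarray objects and uses flox. The helper `aggregate_sum`, the flat lists, and the convention that `0` marks a non-interior cell are assumptions made for illustration only.

```python
def aggregate_sum(data, indicator, boundary, n_shapes, interior_only=False):
    """Sum `data` per shape (hypothetical helper, not part of ptolemy).

    data      : flat list of cell values
    indicator : flat list; k in 1..n_shapes for cells fully inside shape k,
                0 for every other cell (assumed convention)
    boundary  : {cell_index: {shape: weight}} for cells shared between shapes
    """
    totals = {shape: 0.0 for shape in range(1, n_shapes + 1)}
    # Interior cells contribute their full value to exactly one shape.
    for value, shape in zip(data, indicator):
        if shape > 0:
            totals[shape] += value
    # Boundary cells contribute a weighted share to each shape they touch.
    if not interior_only:
        for cell, weights in boundary.items():
            for shape, weight in weights.items():
                totals[shape] += weight * data[cell]
    return totals


data = [10.0, 20.0, 30.0, 40.0]
indicator = [1, 1, 0, 2]            # cell 2 is a boundary cell
boundary = {2: {1: 0.25, 2: 0.75}}  # shares of cell 2 per shape

full = aggregate_sum(data, indicator, boundary, n_shapes=2)
interior = aggregate_sum(data, indicator, boundary, n_shapes=2, interior_only=True)
```

With interior_only=True the boundary cell is skipped entirely, which is why statistics that cannot be split by weight are restricted to the interior.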

boundary: DataArray
property cell_area
chunk(*args, **kwargs) IndexRaster

Chunk indicator and boundary.

Return type:

Indexraster with chunked dask arrays

compute() IndexRaster

“Compute”s indicator and boundary.

Return type:

Indexraster with numpy backed representations

property dim
dissolve(mapping: Series) IndexRaster

Combines the masks that share the same value in mapping into a new indexraster.

Parameters:

mapping (pd.Series) – Maps masks index to aggregated index names (e.g., a region mapping from iso3 codes to model regions)

Returns:

Aggregated index raster

Return type:

IndexRaster

Raises:

ValueError – If not all index values appear in mapping
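
The grouping step and the completeness check can be sketched as follows. This only groups index names. The real method also merges the underlying masks. The helper `dissolve_index` and the plain dict standing in for a pd.Series are hypothetical.

```python
def dissolve_index(index, mapping):
    """Group index names by their mapped value (hypothetical helper)."""
    missing = [name for name in index if name not in mapping]
    if missing:
        # Mirrors the documented ValueError for incomplete mappings.
        raise ValueError(f"Not all index values appear in mapping: {missing}")
    groups = {}
    for name in index:
        groups.setdefault(mapping[name], []).append(name)
    return groups


index = ["AUT", "DEU", "USA"]
mapping = {"AUT": "Europe", "DEU": "Europe", "USA": "North America"}
groups = dissolve_index(index, mapping)
```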

classmethod from_netcdf(path, chunks: dict | str | None = None) IndexRaster

Read from custom netcdf format.

Parameters:
  • path – Where to read netcdf from

  • chunks (dict or "auto", optional) – Opens netcdf with custom dask chunks, by default None

Returns:

Indexraster from path

Return type:

Self

classmethod from_weighted_raster(idxraster: DataArray, dim: str | None = None) IndexRaster

Creates indexraster from a weighted raster idxraster

Parameters:
  • idxraster (xr.DataArray) – Weighted raster

  • dim (str, optional) – Non-spatial idxraster dimension, by default None

Returns:

Corresponding reduced weighted raster

Return type:

IndexRaster

Raises:
  • ValueError – Dimensions of idxraster should be (“lat”, “lon”, dim)

  • ValueError – Idxraster contains cells fully part of more than one {dim}
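
How a weighted raster might be reduced into an indicator and a boundary part is sketched below. This is an illustration under stated assumptions, not the library's code: cells are a flat list of `{shape: weight}` dicts, shapes are numbered from 1, `0` marks a non-interior cell, and the helper `reduce_weighted` is hypothetical.

```python
def reduce_weighted(cells):
    """Split per-cell weights into interior indicator and boundary weights."""
    indicator = []
    boundary = {}
    for cell, weights in enumerate(cells):
        full = [shape for shape, weight in weights.items() if weight == 1.0]
        if len(full) > 1:
            # Corresponds to the second documented ValueError.
            raise ValueError(f"cell {cell} is fully part of more than one shape: {full}")
        if full:
            indicator.append(full[0])
            continue
        indicator.append(0)
        # Keep only non-zero fractional weights for boundary cells.
        partial = {shape: weight for shape, weight in weights.items() if weight > 0}
        if partial:
            boundary[cell] = partial
    return indicator, boundary


cells = [{1: 1.0}, {1: 0.25, 2: 0.75}, {2: 1.0}, {}]
indicator, boundary = reduce_weighted(cells)
```

Storing weights only for boundary cells is what makes the representation "reduced": interior cells need a single integer each.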

grid(data: DataArray) DataArray

Fills shapes with data

Parameters:

data (xr.DataArray) – Panel data with dimension dim

Returns:

Data without dimension dim but spatial dimensions

Return type:

xr.DataArray

Notes

For panel data like x = 1, y = 2, the constructed grid will contain

  • the value 1 for all interior cells of shape x

  • the value 2 for all interior cells of shape y

  • the value a * 1 + b * 2 for a boundary cell, which belongs with a share a to x and a share b to y
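
The rule in these notes can be written out for a single cell. The helper `grid_cell` and the shares 0.25 and 0.75 are illustrative assumptions.

```python
def grid_cell(values, shares):
    """Value of one grid cell: share-weighted sum of the panel values."""
    return sum(share * values[shape] for shape, share in shares.items())


values = {"x": 1, "y": 2}

interior_x = grid_cell(values, {"x": 1.0})              # cell fully inside x
interior_y = grid_cell(values, {"y": 1.0})              # cell fully inside y
boundary = grid_cell(values, {"x": 0.25, "y": 0.75})    # a = 0.25, b = 0.75
```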

index: Index
indicator: DataArray
persist() IndexRaster

“Compute”s indicator and boundary and keeps it in dask.

Prefer persisting to computing to avoid having to transfer results back to workers.

Return type:

Indexraster with persisted dask-backed xarrays

property spatial_dims
to_netcdf(path)

Save in custom netcdf format to path

Parameters:

path – Where to store netcdf

class ptolemy.raster.Rasterize(shape=None, coords=None, like=None)

Bases: object

Example use case

    rs = Rasterize(like='data.nc')
    rs.read_shpf('boundaries.shp', idxkey='ISO3')
    da = rs.rasterize(strategy='all_touched')
    da.to_netcdf('raster.nc')

rasterize(strategy=None, normalize_weights=True, verbose=False, drop=True)

Rasterizes the indices of the current shapefile.

Parameters:
  • strategy (str, optional) –

    must be one of:
    • all_touched: GDAL’s all_touched = True

    • centroid: GDAL’s all_touched = False

    • hybrid: a combination of all_touched and centroid, providing a better allotment of edge (coastal-like) cells

    • majority: cells are assigned to the shapes with the majority of area within them

    • weighted: provides a stack of rasters (1 per geometry) of cell weights

  • normalize_weights (bool, optional) – if using weighted strategy, normalize the weights

  • verbose (bool, optional) – print out status information during rasterization

  • drop (bool, optional) – drop where nodata values in both lat and lon

read_shpf(shpf, idxkey=None, flatten=None, where=None)
ptolemy.raster.block_apply_2d(a, blockshape, func=<function sum>, weights=None)

Apply a function to blocks of an array.

Returns a reduced array with shape: a.shape[0] / blockshape[0], a.shape[1] / blockshape[1]

TODO: the inner loop can be sped up (likely with cython)

Parameters:

weights (array with same shape as a, optional) – a weight matrix to supply to func (func must take a weights parameter)
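
The block reduction and the resulting shape can be illustrated with nested lists in place of arrays. This sketch hard-codes a sum, ignores weights, and assumes the array shape is divisible by blockshape. The helper `block_sum_2d` is hypothetical and is not how the library implements the reduction.

```python
def block_sum_2d(a, blockshape):
    """Sum non-overlapping blocks of a 2d nested list."""
    rows, cols = len(a), len(a[0])
    block_rows, block_cols = blockshape
    if rows % block_rows or cols % block_cols:
        raise ValueError("array shape must be divisible by blockshape")
    return [
        [
            sum(
                a[i][j]
                for i in range(r, r + block_rows)
                for j in range(c, c + block_cols)
            )
            for c in range(0, cols, block_cols)
        ]
        for r in range(0, rows, block_rows)
    ]


a = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
reduced = block_sum_2d(a, (2, 2))  # 4x4 input, 2x2 blocks -> 2x2 result
```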

ptolemy.raster.block_view_2d(a, blockshape)

Collapse a 2d array into constituent blocks with a given shape.

ptolemy.raster.cell_area(lats, lons=None, crs=4326)

Computes the grid cell area given centroid latitude and longitude coordinates.

Parameters:
  • lats (array or similar) – latitude coordinates

  • lons (array or similar) – longitude coordinates

  • crs (string or similar defining CRS, optional) – the origin CRS

Returns:

area

Return type:

a geopandas.Series with index of lats and values of area in m^2
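
For intuition on why the area depends only on latitude for a regular grid, the following sketch uses the closed-form area of a latitude-longitude cell on a sphere. It is an assumption-laden approximation: the spherical Earth, the radius constant, and the helper `spherical_cell_area` are not taken from the library, which may compute areas differently (for example via the given CRS).

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation (assumption)


def spherical_cell_area(lat_center, dlat, dlon):
    """Area in m^2 of a cell centred at `lat_center` (all arguments in degrees)."""
    lat_south = math.radians(lat_center - dlat / 2)
    lat_north = math.radians(lat_center + dlat / 2)
    # A = R^2 * delta_lon * (sin(lat_north) - sin(lat_south))
    return (
        EARTH_RADIUS_M ** 2
        * math.radians(dlon)
        * (math.sin(lat_north) - math.sin(lat_south))
    )


# Cell areas on a 0.5 degree grid shrink towards the poles.
areas = {lat: spherical_cell_area(lat, 0.5, 0.5) for lat in (0.25, 45.25, 89.75)}
```

Summing this expression over all cells of a global grid recovers the surface area of the sphere, which is a convenient sanity check.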

ptolemy.raster.cell_area_from_file(file, lat_name='lat', lon_name=None)

Returns the grid cell area by latitude of a raster file.

Parameters:
  • file (str, pathlib.Path, xr.Dataset, or similar) – a file from which to take transform and latitude objects

  • lat_name (str, optional) – the name of the latitude dimension or coordinate

  • lon_name (str, optional) – the name of the longitude dimension or coordinate

Returns:

area

Return type:

a geopandas.Series with index of lats and values of area in m^2

ptolemy.raster.df_to_raster(df, idxraster, idx_col, idx_map, ds=None, coords=[], cols=[])

Takes data from a pd.DataFrame and deposits it on a raster.

Parameters:
  • df (pd.DataFrame) – a dataframe with an index aligned with coords and data in other columns

  • idxraster (xr.DataArray) – an index raster, e.g., from pt.Rasterize()

  • idx_col (string) – the name of the column linking df to idxraster

  • idx_map (map) – a map of strings to values of idxraster to generate the raster

  • ds (xr.DataSet(), optional) – a model dataset to use

  • coords (list(xr.Coordinate), optional) – coordinates to use to generate a raster

  • cols (list(string), optional) – the columns to apply to the raster

ptolemy.raster.df_to_weighted_raster(df, idxraster, col=None, extra_coords=[], sum_dim=None)

Translates data to a raster with multiple weighting layers. This can be used to apply panel data for a series of geometries (e.g., countries) onto gridded data.

Parameters:
  • df (pd.DataFrame) – a dataframe with columns or indices aligned with coordinates in idxraster

  • idxraster (xr.DataArray) – an index raster with a layer coordinate aligned with df. This raster can be made with pt.Rasterize().rasterize() using the strategy=”weighted” option.

  • col (str, optional) – the column in df to cast to the map

  • extra_coords (list, optional) – additional columns in df which should be translated to be coordinates. For example, if you want to put panel data onto a raster and that data has a “year” column, then you should call this with extra_coords=[“year”]

  • sum_dim (list, optional) – string names of dimension(s) to sum along. This option can be used, e.g., to collapse the multiple weighted layers into one ‘global’ result.
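
The effect of the weighting layers, and of collapsing them, can be sketched with plain lists. The helper `weighted_layers` and its `sum_layers` flag are hypothetical stand-ins for the behaviour described above, not the library's signature.

```python
def weighted_layers(values, weights, sum_layers=False):
    """Scale each layer's cell weights by that layer's value.

    values  : {layer_name: value}
    weights : {layer_name: [weight per cell]}
    """
    layers = {
        name: [weight * values[name] for weight in cell_weights]
        for name, cell_weights in weights.items()
    }
    if not sum_layers:
        return layers
    # Collapse the layer dimension into a single 'global' result.
    n_cells = len(next(iter(weights.values())))
    return [sum(layer[i] for layer in layers.values()) for i in range(n_cells)]


values = {"AUT": 10.0, "DEU": 20.0}
weights = {"AUT": [1.0, 0.5, 0.0], "DEU": [0.0, 0.5, 1.0]}

per_layer = weighted_layers(values, weights)
collapsed = weighted_layers(values, weights, sum_layers=True)
```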

ptolemy.raster.full_like(other, fill_value=nan, add_coords={}, replace_vars=[])
ptolemy.raster.raster_to_df(raster, idxraster, idx_map=None, idx_dim='shape_dim', func='max', drop_zeros=True)

Takes data from a raster and makes a pd.DataFrame. Zonal statistics can be derived with this function.

By default, unique values in the index raster areas are returned.

Parameters:
  • raster (xr.DataArray) – data to make a pd.DataFrame

  • idxraster (xr.DataArray) – an index raster, e.g., from pt.Rasterize()

  • idx_map (dict, optional) – a map of strings to values of idxraster if idxraster is not weighted

  • idx_dim (str, optional) – the name of the index dimension if idx_map is provided

  • func (string, optional) –

    a function which can be applied to an array of data. Currently supports:
    • max

    • sum

    • mean

  • drop_zeros (bool, optional) – drop zeros from the dataframe before returning
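
The zonal-statistics idea for a non-weighted index raster can be sketched as follows. The helper `zonal_stats`, the flat lists, and the interpretation of drop_zeros as removing zero-valued results are assumptions. The library returns a pd.DataFrame rather than a dict.

```python
from statistics import mean

FUNCS = {"max": max, "sum": sum, "mean": mean}


def zonal_stats(raster, idxraster, idx_map, func="max", drop_zeros=True):
    """Reduce raster values per index value (hypothetical helper)."""
    cells = {}
    for value, idx in zip(raster, idxraster):
        cells.setdefault(idx, []).append(value)
    result = {
        name: FUNCS[func](cells[idx])
        for name, idx in idx_map.items()
        if idx in cells
    }
    if drop_zeros:
        result = {name: value for name, value in result.items() if value != 0}
    return result


raster = [1.0, 3.0, 0.0, 0.0, 5.0, 7.0]
idxraster = [1, 1, 2, 2, 3, 3]
idx_map = {"AUT": 1, "DEU": 2, "USA": 3}

maxima = zonal_stats(raster, idxraster, idx_map)
sums = zonal_stats(raster, idxraster, idx_map, func="sum", drop_zeros=False)
```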

ptolemy.raster.rasterize_majority(geoms_idxs, atrans, shape, nodata, ignore_nodata=False, verbose=False)

Rasterize shapes such that the shape with the majority of area in a cell is assigned to that cell.

Parameters:

ignore_nodata (bool, optional) – ignore nodata values when determining majority in a cell (appropriate for, e.g., when a small island is the only feature in a cell)
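
The role of ignore_nodata can be illustrated for a single cell. This is a sketch of the idea only: the helper `majority_shape`, the coverage fractions, and the treatment of uncovered area as a competing "nodata" candidate are assumptions, not the library's algorithm.

```python
def majority_shape(coverage, nodata=-1, ignore_nodata=False):
    """Pick the candidate covering the largest share of one cell.

    coverage : {shape_index: fraction of the cell area covered by that shape}
    """
    candidates = dict(coverage)
    if not ignore_nodata:
        # Uncovered area competes with the shapes for the majority.
        candidates[nodata] = max(0.0, 1.0 - sum(coverage.values()))
    if not candidates:
        return nodata
    return max(candidates, key=candidates.get)


island_cell = {7: 0.1}          # a small island covering 10 % of the cell
border_cell = {1: 0.6, 2: 0.4}  # two shapes sharing a cell

default = majority_shape(island_cell)
with_ignore = majority_shape(island_cell, ignore_nodata=True)
border = majority_shape(border_cell)
```

Under these assumptions the island is lost by default because most of the cell is empty, and it is kept when nodata is ignored.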

ptolemy.raster.rasterize_pctcover(geom, atrans, shape)
ptolemy.raster.rebin_sum(a, shape, dtype)
ptolemy.raster.rescale_raster_props(affine, shape, scale)

Return new transform and shape for a raster for it to be scaled in each lat/long dimension by a factor.

ptolemy.raster.transform_from_latlon(lat, lon)
ptolemy.raster.update_raster(raster, series, idxraster, idx_map)

Updates a raster array given a raster of indices and values as columns.

Parameters:
  • raster (np.ndarray) – the base raster on which to apply updates

  • idxraster (list of np.ndarray or strings) – rasters of indices where idx_map[series.index] -> idx

  • series (list of pd.Series) – values to use to update the raster, the Index of the Series must be the keys of the associated idx_map