core API
- class ptolemy.raster.IndexRaster(indicator: DataArray, boundary: DataArray, index: Index)
Bases: object
Reduced weighted raster for aggregating and gridding on a lat, lon grid.
- indicator
Integer values from 1 to number of shapes indicating cells fully belonging to a single shape
- Type:
xr.DataArray (lon, lat)
- boundary
Weights per boundary cell and shape
- Type:
xr.DataArray (spatial, shape)
- index
Names of each shape
- Type:
pd.Index
- cell_area
Area of each grid cell in sqm
- Type:
xr.DataArray (lat)
- Classmethods
- from_weighted_raster
Creates indexraster from a weighted raster idxraster
- from_netcdf
Reads indexraster from custom netcdf format
- to_netcdf()
Saves indexraster to custom netcdf format
- aggregate()
Aggregates given data for each shape
- grid()
Creates a grid where panel data fills shapes
- aggregate(ndraster: DataArray, func: str = 'sum', interior_only: bool = False) DataArray
Aggregate data in ndraster per country.
Uses flox for efficient computation of statistics on the country interior for chunked data.
- Parameters:
ndraster (xr.DataArray) – Data to aggregate, needs spatial dimensions
func (str, optional) – Statistic to compute per country; statistics other than the default “sum” are only supported on the interior.
interior_only (bool, optional) – If True only takes into account cells fully contained in a single country, required for statistics other than “sum”, by default False
- Returns:
Per country aggregations
- Return type:
xr.DataArray
- Raises:
NotImplementedError – If a statistic other than “sum” is requested while interior_only is False
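A minimal NumPy sketch of how a "sum" aggregation over interior and boundary cells could be combined, as described above. The helper `aggregate_sum` and the tiny arrays are hypothetical and only mimic the indicator / boundary layout; this is not ptolemy's flox-based implementation.

```python
import numpy as np

def aggregate_sum(data, indicator, boundary_cells, boundary_weights,
                  n_shapes, interior_only=False):
    """Sum `data` per shape (hypothetical helper).

    indicator:        int array, k in 1..n_shapes for interior cells, 0 otherwise
    boundary_cells:   flat indices of the boundary cells
    boundary_weights: array (len(boundary_cells), n_shapes) of cell shares
    """
    flat = data.ravel()
    ind = indicator.ravel()
    # Interior: grouped sum; bin 0 collects cells without a unique owner
    totals = np.bincount(ind, weights=flat, minlength=n_shapes + 1)[1:]
    if not interior_only:
        # Boundary: each cell contributes its value times its share per shape
        totals = totals + flat[boundary_cells] @ boundary_weights
    return totals

data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])
indicator = np.array([[1, 0, 2],
                      [1, 0, 2]])
boundary_cells = np.array([1, 4])            # the middle column, as flat indices
boundary_weights = np.array([[0.50, 0.50],
                             [0.25, 0.75]])

total = aggregate_sum(data, indicator, boundary_cells, boundary_weights, n_shapes=2)
interior = aggregate_sum(data, indicator, boundary_cells, boundary_weights,
                         n_shapes=2, interior_only=True)
```

Because the boundary shares of each cell add up to 1 in this example, the per-shape sums add up to the grid total. A statistic such as "mean" or "max" has no equally obvious definition for a shared cell, which is consistent with such statistics being restricted to the interior.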
- boundary: DataArray
- property cell_area
- chunk(*args, **kwargs) IndexRaster
Chunk indicator and boundary.
- Return type:
Indexraster with chunked dask arrays
- compute() IndexRaster
“Compute”s indicator and boundary.
- Return type:
Indexraster with numpy backed representations
- property dim
- dissolve(mapping: Series) IndexRaster
Combines masks for same value in mapping into new indexraster.
- Parameters:
mapping (pd.Series) – Maps masks index to aggregated index names (f.ex. a region mapping from iso3 codes to model regions)
- Returns:
Aggregated index raster
- Return type:
IndexRaster
- Raises:
ValueError – If not all index values appear in mapping
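The effect of a region mapping can be sketched on plain NumPy arrays. Everything here is an assumption made for illustration: `dissolve_sketch` is a hypothetical helper, a `dict` stands in for the `pd.Series` mapping, and the arrays only imitate the indicator / boundary layout.

```python
import numpy as np

def dissolve_sketch(indicator, boundary, index, mapping):
    """Merge shapes that map to the same region (hypothetical helper).

    indicator: int array with labels 1..len(index), 0 for non-interior cells
    boundary:  array (n_boundary_cells, len(index)) of cell shares
    mapping:   dict from old index name to region name
    """
    missing = [name for name in index if name not in mapping]
    if missing:
        raise ValueError(f"index values missing from mapping: {missing}")
    new_index = sorted({mapping[name] for name in index})
    # Column of the new index that each old shape is merged into
    target = np.array([new_index.index(mapping[name]) for name in index])
    # Relabel interior cells; label 0 stays 0
    lookup = np.concatenate([[0], target + 1])
    new_indicator = lookup[indicator]
    # Boundary shares of merged shapes add up
    new_boundary = np.zeros((boundary.shape[0], len(new_index)))
    for old, new in enumerate(target):
        new_boundary[:, new] += boundary[:, old]
    return new_indicator, new_boundary, new_index

index = ["AUT", "DEU", "USA"]
mapping = {"AUT": "EU", "DEU": "EU", "USA": "NAM"}
indicator = np.array([1, 2, 0, 3])
boundary = np.array([[0.2, 0.3, 0.5]])

new_indicator, new_boundary, new_index = dissolve_sketch(
    indicator, boundary, index, mapping
)
```

The sketch leaves a boundary cell in the boundary array even if merging makes it belong entirely to one region; whether the library promotes such cells to interior cells is not stated in this reference.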
- classmethod from_netcdf(path, chunks: dict | str | None = None) IndexRaster
Read from custom netcdf format.
- Parameters:
path – Where to read netcdf from
chunks (dict or "auto", optional) – Opens netcdf with custom dask chunks, by default None
- Returns:
Indexraster from path
- Return type:
Self
- classmethod from_weighted_raster(idxraster: DataArray, dim: str | None = None) IndexRaster
Creates indexraster from a weighted raster idxraster
- Parameters:
idxraster (xr.DataArray) – Weighted raster
dim (str, optional) – Non-spatial idxraster dimension, by default None
- Returns:
Corresponding reduced weighted raster
- Return type:
indexraster
- Raises:
ValueError – Dimensions of idxraster should be (“lat”, “lon”, dim)
ValueError – Idxraster contains cells fully part of more than one {dim}
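The reduction from a weighted raster to an indicator plus boundary weights can be illustrated with NumPy. This is a sketch under assumptions: `split_weighted_raster` is a hypothetical helper, the weights are laid out as (lat, lon, shape), and a cell counts as fully inside a shape when its weight is 1 within a small tolerance.

```python
import numpy as np

def split_weighted_raster(weights, tol=1e-9):
    """Split weights (lat, lon, shape) into an indicator and boundary weights.

    The indicator holds k in 1..n for cells fully inside shape k and 0
    elsewhere; the boundary array keeps the fractional weights only.
    """
    full = weights >= 1 - tol                 # cells fully covered by a shape
    if (full.sum(axis=-1) > 1).any():
        raise ValueError("raster contains cells fully part of more than one shape")
    is_interior = full.any(axis=-1)
    indicator = np.where(is_interior, full.argmax(axis=-1) + 1, 0)
    boundary = np.where(is_interior[..., None], 0.0, weights)
    return indicator, boundary

weights = np.zeros((1, 3, 2))
weights[0, 0] = [1.0, 0.0]    # interior of shape 1
weights[0, 1] = [0.3, 0.7]    # boundary cell shared by both shapes
weights[0, 2] = [0.0, 1.0]    # interior of shape 2

indicator, boundary = split_weighted_raster(weights)
```

Storing one integer per interior cell instead of one weight per cell and shape is what makes the representation "reduced": only the comparatively few boundary cells need a full row of weights.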
- grid(data: DataArray) DataArray
Fills shapes with data
- Parameters:
data (xr.DataArray) – Panel data with dimension dim
- Returns:
Data without dimension dim but spatial dimensions
- Return type:
xr.DataArray
Notes
For panel data like x = 1, y = 2, the constructed grid will contain
the value 1 for all interior cells of shape x
the value 2 for all interior cells of shape y
the value a * 1 + b * 2 for a boundary cell, which belongs with a share a to x and a share b to y
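The rule in the notes can be checked on a one-dimensional toy grid. The arrays below are made up for illustration (three cells, shares a = 0.4 and b = 0.6) and only mimic the indicator / boundary layout with NumPy; they are not produced by ptolemy.

```python
import numpy as np

values = np.array([1.0, 2.0])      # panel data: x = 1, y = 2
indicator = np.array([1, 0, 2])    # cell 0 inside x, cell 1 on the boundary, cell 2 inside y
boundary = np.array([[0.0, 0.0],
                     [0.4, 0.6],   # a = 0.4 of the cell lies in x, b = 0.6 in y
                     [0.0, 0.0]])

# Interior cells take the value of their shape (0 where no shape owns the cell)
interior = np.where(indicator > 0, values[np.maximum(indicator, 1) - 1], 0.0)
# Boundary cells take the share-weighted sum a * 1 + b * 2
grid = interior + boundary @ values
```

The boundary cell ends up with 0.4 * 1 + 0.6 * 2 = 1.6, between the two interior values.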
- indicator: DataArray
- persist() IndexRaster
“Compute”s indicator and boundary and keeps it in dask.
Prefer persisting to computing to avoid having to transfer results back to workers.
- Return type:
Indexraster with persisted dask-backed xarrays
- property spatial_dims
- to_netcdf(path)
Save in custom netcdf format to path
- Parameters:
path – Where to store netcdf
- class ptolemy.raster.Rasterize(shape=None, coords=None, like=None)
Bases: object
Example use case
```
rs = Rasterize(like='data.nc')
rs.read_shpf('boundaries.shp', idxkey='ISO3')
da = rs.rasterize(strategy='all_touched')
da.to_netcdf('raster.nc')
```
- rasterize(strategy=None, normalize_weights=True, verbose=False, drop=True)
Rasterizes the indices of the current shapefile.
- Parameters:
strategy (str, optional) – must be one of
- all_touched: GDAL’s all_touched = True
- centroid: GDAL’s all_touched = False
- hybrid: a combination of all_touched and centroid, providing a better allotment of edge (coastal-like) cells
- majority: cells are assigned to the shapes with the majority of area within them
- weighted: provides a stack of rasters (1 per geometry) of cell weights
normalize_weights (bool, optional) – if using weighted strategy, normalize the weights
verbose (bool, optional) – print out status information during rasterization
drop (bool, optional) – drop where nodata values in both lat and lon
- read_shpf(shpf, idxkey=None, flatten=None, where=None)
- ptolemy.raster.block_apply_2d(a, blockshape, func=<function sum>, weights=None)
Apply a function to blocks of an array.
Returns a reduced array with shape: a.shape[0] / blockshape[0], a.shape[1] / blockshape[1]
TODO: the inner loop can be sped up (likely with cython)
- Parameters:
weights (array with same shape as a, optional) – a weight matrix to supply to func (func must take a weights parameter)
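One common way to get a block-wise reduction without an explicit inner loop is a reshape followed by a reduction over the block axes. The sketch below assumes the array shape is an exact multiple of the block shape; `block_reduce_2d` is a hypothetical name, it ignores the weights argument, and it is not the library's implementation.

```python
import numpy as np

def block_reduce_2d(a, blockshape, func=np.sum):
    """Reduce each (b0, b1) block of `a` to a single value with `func`."""
    b0, b1 = blockshape
    if a.shape[0] % b0 or a.shape[1] % b1:
        raise ValueError("array shape must be a multiple of blockshape")
    # Split both axes into (number of blocks, block size), then reduce the block axes
    blocks = a.reshape(a.shape[0] // b0, b0, a.shape[1] // b1, b1)
    return func(blocks, axis=(1, 3))

a = np.arange(16).reshape(4, 4)
reduced = block_reduce_2d(a, (2, 2))
```

For the 4 x 4 input with 2 x 2 blocks the result has shape (2, 2), matching the a.shape[0] / blockshape[0], a.shape[1] / blockshape[1] rule above.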
- ptolemy.raster.block_view_2d(a, blockshape)
Collapse a 2d array into constituent blocks with a given shape.
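What "collapsing into blocks" can look like is sketched below. The output layout (n0, n1, b0, b1) is an assumption chosen for illustration, as is the helper name `blocks_2d`; the reference does not state which layout the library returns.

```python
import numpy as np

def blocks_2d(a, blockshape):
    """Return an array of shape (n0, n1, b0, b1); entry [i, j] is block (i, j) of `a`."""
    b0, b1 = blockshape
    n0, n1 = a.shape[0] // b0, a.shape[1] // b1
    # Split each axis into (blocks, block size) and bring the block indices to the front
    return a.reshape(n0, b0, n1, b1).swapaxes(1, 2)

a = np.arange(16).reshape(4, 4)
blocks = blocks_2d(a, (2, 2))
top_right = blocks[0, 1]    # rows 0-1, columns 2-3 of `a`
```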
- ptolemy.raster.cell_area(lats, lons=None, crs=4326)
Computes the grid cell area given centroid latitude and longitude coordinates.
- Parameters:
lats (array or similar) – latitude coordinates
lons (array or similar) – longitude coordinates
crs (string or similar defining CRS, optional) – the origin CRS
- Returns:
area
- Return type:
a geopandas.Series with index of lats and values of area in m^2
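On a regular lat/lon grid the cell area depends only on latitude, which is why a result indexed by latitude is sufficient. The sketch below uses a spherical approximation, A = R² · Δλ · (sin φ_north − sin φ_south); the radius constant and the helper `spherical_cell_area` are assumptions, and the library may instead use an ellipsoidal calculation derived from the given CRS.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation (assumption)

def spherical_cell_area(lat_center, dlat, dlon, radius=EARTH_RADIUS_M):
    """Area in m^2 of a dlat x dlon degree cell centred at latitude lat_center."""
    south = math.radians(lat_center - dlat / 2)
    north = math.radians(lat_center + dlat / 2)
    return radius**2 * math.radians(dlon) * (math.sin(north) - math.sin(south))

# Centroid latitudes of a global 0.5 degree grid, south to north
lats = [-89.75 + 0.5 * i for i in range(360)]
areas = [spherical_cell_area(lat, 0.5, 0.5) for lat in lats]
```

A quick sanity check is that the areas of all cells, 720 per latitude band on this grid, add up to the surface of the sphere, 4πR².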
- ptolemy.raster.cell_area_from_file(file, lat_name='lat', lon_name=None)
Returns the grid cell area by latitude of a raster file.
- Parameters:
file (str, pathlib.Path, xr.Dataset, or similar) – a file from which to take transform and latitude objects
lat_name (str, optional) – the name of the latitude dimension or coordinate
lon_name (str, optional) – the name of the longitude dimension or coordinate
- Returns:
area
- Return type:
a geopandas.Series with index of lats and values of area in m^2
- ptolemy.raster.df_to_raster(df, idxraster, idx_col, idx_map, ds=None, coords=[], cols=[])
Takes data from a pd.DataFrame and deposits it on a raster.
- Parameters:
df (pd.DataFrame) – a dataframe with an index aligned with coords and data in other columns
idxraster (xr.DataArray) – an index raster, e.g., from pt.Rasterize()
idx_col (string) – the name of the column linking df to idxraster
idx_map (map) – a map of strings to values of idxraster to generate the raster
ds (xr.Dataset, optional) – a model dataset to use
coords (list(xr.Coordinate), optional) – coordinates to use to generate a raster
cols (list(string), optional) – the columns to apply to the raster
- ptolemy.raster.df_to_weighted_raster(df, idxraster, col=None, extra_coords=[], sum_dim=None)
Translates data to a raster with multiple weighting layers. This can be used to apply panel data for a series of geometries (e.g., countries) onto gridded data.
- Parameters:
df (pd.DataFrame) – a dataframe with columns or indices aligned with coordinates in idxraster
idxraster (xr.DataArray) – an index raster with a layer coordinate aligned with df. This raster can be made with pt.Rasterize().rasterize() using the strategy="weighted" option.
col (str, optional) – the column in df to cast to the map
extra_coords (list, optional) – additional columns in df which should be translated to coordinates. For example, if you want to put panel data onto a raster and that data has a "year" column, then you should call this with extra_coords=["year"]
sum_dim (list, optional) – string names of dimension(s) to sum along. This option can be used, e.g., to collapse the multiple weighted layers into one ‘global’ result.
- ptolemy.raster.full_like(other, fill_value=nan, add_coords={}, replace_vars=[])
- ptolemy.raster.raster_to_df(raster, idxraster, idx_map=None, idx_dim='shape_dim', func='max', drop_zeros=True)
Takes data from a raster and makes a pd.DataFrame. Zonal statistics can be derived with this function.
By default, unique values in the index raster areas are returned.
- Parameters:
raster (xr.DataArray) – data to make a pd.DataFrame
idxraster (xr.DataArray) – an index raster, e.g., from pt.Rasterize()
idx_map (dict, optional) – a map of strings to values of idxraster if idxraster is not weighted
idx_dim (str, optional) – the name of the index dimension if idx_map is provided
func (string, optional) – a function which can be applied to an array of data. Currently supports: max, sum, mean
drop_zeros (bool, optional) – drop zeros from the dataframe before returning
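The idea of zonal statistics over a non-weighted index raster can be sketched with NumPy and a plain dict. `zonal_stats` is a hypothetical helper that returns a dict rather than a pd.DataFrame, and the country codes and cell values are made up; only the parameter names are borrowed from the signature above.

```python
import numpy as np

FUNCS = {"max": np.max, "sum": np.sum, "mean": np.mean}

def zonal_stats(raster, idxraster, idx_map, func="max", drop_zeros=True):
    """Return {name: statistic} over the cells where idxraster == idx_map[name]."""
    out = {}
    for name, idx in idx_map.items():
        cells = raster[idxraster == idx]
        if cells.size == 0:
            continue                      # shape has no cells on this grid
        value = float(FUNCS[func](cells))
        if drop_zeros and value == 0:
            continue                      # mimic drop_zeros=True
        out[name] = value
    return out

raster = np.array([[1.0, 2.0, 0.0],
                   [3.0, 4.0, 0.0]])
idxraster = np.array([[10, 10, 30],
                      [20, 20, 30]])
idx_map = {"AUT": 10, "BEL": 20, "CHE": 30}

maxima = zonal_stats(raster, idxraster, idx_map)
sums = zonal_stats(raster, idxraster, idx_map, func="sum", drop_zeros=False)
```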
- ptolemy.raster.rasterize_majority(geoms_idxs, atrans, shape, nodata, ignore_nodata=False, verbose=False)
Rasterize shapes such that the shape with the majority of area in a cell is assigned to that cell.
- Parameters:
ignore_nodata (bool, optional) – ignore nodata values when determining majority in a cell (appropriate for, e.g., when a small island is the only feature in a cell)
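The role of ignore_nodata can be illustrated on precomputed coverage fractions. This sketch rests on an interpretation that the reference does not spell out: the uncovered part of a cell is treated as a competing "nodata" class unless ignore_nodata is set. The helper `majority_index` and its inputs are hypothetical.

```python
import numpy as np

def majority_index(cover, idxs, nodata=-1, ignore_nodata=False):
    """Pick one shape index per cell from coverage fractions (n_shapes, ny, nx)."""
    cover = np.asarray(cover, dtype=float)
    best = cover.argmax(axis=0)               # shape with the largest share
    best_share = cover.max(axis=0)
    empty_share = 1.0 - cover.sum(axis=0)     # part of the cell no shape covers
    if ignore_nodata:
        assigned = best_share > 0             # any covered cell gets a shape
    else:
        assigned = best_share > empty_share   # nodata competes for the majority
    return np.where(assigned, np.asarray(idxs)[best], nodata)

cover = np.array([
    [[0.6, 0.1, 0.0]],    # coverage by the shape with index 7
    [[0.4, 0.2, 0.0]],    # coverage by the shape with index 9
])
strict = majority_index(cover, idxs=[7, 9])
lenient = majority_index(cover, idxs=[7, 9], ignore_nodata=True)
```

The middle cell is mostly empty, so it stays nodata in the strict case but is assigned to shape 9 when nodata is ignored, which is the small-island situation mentioned in the parameter description.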
- ptolemy.raster.rasterize_pctcover(geom, atrans, shape)
- ptolemy.raster.rebin_sum(a, shape, dtype)
- ptolemy.raster.rescale_raster_props(affine, shape, scale)
Return new transform and shape for a raster for it to be scaled in each lat/long dimension by a factor.
- ptolemy.raster.transform_from_latlon(lat, lon)
- ptolemy.raster.update_raster(raster, series, idxraster, idx_map)
Updates a raster array given a raster of indices and values as columns.
- Parameters: