CosmiQ Works GeoData API reference

cw-geodata class and function list

cw_geodata.raster_image.image.get_geo_transform(…) Get the geotransform for a raster image source.
cw_geodata.vector_label.polygon.affine_transform_gdf(…) Perform an affine transformation on a GeoDataFrame.
cw_geodata.vector_label.polygon.convert_poly_coords(geom) Georegister geometry objects currently in pixel coords or vice versa.
cw_geodata.vector_label.polygon.geojson_to_px_gdf(…) Convert a geojson or set of geojsons from geo coords to px coords.
cw_geodata.vector_label.polygon.georegister_px_df(df) Convert a dataframe of geometries in pixel coordinates to a geo CRS.
cw_geodata.vector_label.polygon.get_overlapping_subset(gdf) Extract a subset of geometries in a GeoDataFrame that overlap with im.
cw_geodata.vector_label.graph.geojson_to_graph(geojson) Convert a geojson of path strings to a network graph.
cw_geodata.vector_label.graph.get_nodes_paths(…) Extract nodes and paths from a vector file.
cw_geodata.vector_label.mask.boundary_mask([…]) Convert a dataframe of geometries to a pixel mask.
cw_geodata.vector_label.mask.contact_mask(df) Create a pixel mask labeling closely juxtaposed objects.
cw_geodata.vector_label.mask.df_to_px_mask(df) Convert a dataframe of geometries to a pixel mask.
cw_geodata.vector_label.mask.footprint_mask(df) Convert a dataframe of geometries to a pixel mask.
cw_geodata.utils.geo.geometries_internal_intersection(…) Get the intersection geometries between all geometries in a set.
cw_geodata.utils.geo.list_to_affine(xform_mat) Create an Affine from a list or array-formatted [a, b, d, e, xoff, yoff]
cw_geodata.utils.geo.split_multi_geometries(gdf) Split apart MultiPolygon or MultiLineString geometries.

Raster/Image functionality

Image submodule


Get the geotransform for a raster image source.

Parameters:raster_src (str, rasterio.DatasetReader, or osgeo.gdal.Dataset) – Path to a raster image with georeferencing data to apply to geom. Alternatively, an opened rasterio.Band object or osgeo.gdal.Dataset object can be provided. Required if not using affine_obj.
Returns:transform – An affine transformation object to the image’s location in its CRS.
Return type:affine.Affine

Vector/Label functionality

Polygon submodule

cw_geodata.vector_label.polygon.affine_transform_gdf(gdf, affine_obj, inverse=False, geom_col='geometry', precision=None)[source]

Perform an affine transformation on a GeoDataFrame.

  • gdf (geopandas.GeoDataFrame, pandas.DataFrame, or str) – A GeoDataFrame, pandas DataFrame with a "geometry" column (or a different column containing geometries, identified by geom_col - note that this column will be renamed "geometry" for ease of use with geopandas), or the path to a saved file in .geojson or .csv format.
  • affine_obj (list or affine.Affine) – An affine transformation to apply to geom in the form of an [a, b, d, e, xoff, yoff] list or an affine.Affine object.
  • inverse (bool, optional) – Use this argument to perform the inverse transformation.
  • geom_col (str, optional) – The column in gdf corresponding to the geometry. Defaults to 'geometry'.
  • precision (int, optional) – Decimal precision to round the geometries to. If not provided, no rounding is performed.
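
As a rough illustration of the [a, b, d, e, xoff, yoff] convention, the sketch below applies such a list to plain (x, y) tuples without calling cw_geodata. The helper name apply_affine is hypothetical. It assumes the shapely-style ordering, where x' = a*x + b*y + xoff and y' = d*x + e*y + yoff, and it only loosely mimics the inverse and precision arguments described above.

```python
def apply_affine(points, params, inverse=False, precision=None):
    """Apply a 2D affine transform given as [a, b, d, e, xoff, yoff].

    Hypothetical helper for illustration only; not part of cw_geodata.
    """
    a, b, d, e, xoff, yoff = params
    if inverse:
        # Invert the 2x2 linear part, then undo the offset.
        det = a * e - b * d
        if det == 0:
            raise ValueError("affine transform is not invertible")
        ia, ib, ic, ie = e / det, -b / det, -d / det, a / det
        out = [(ia * (x - xoff) + ib * (y - yoff),
                ic * (x - xoff) + ie * (y - yoff)) for x, y in points]
    else:
        out = [(a * x + b * y + xoff, d * x + e * y + yoff) for x, y in points]
    if precision is not None:
        out = [(round(x, precision), round(y, precision)) for x, y in out]
    return out


# Assumed example: a north-up image with 0.5-unit pixels whose
# top-left corner sits at (1000, 2000) in the geographic CRS.
geo_params = [0.5, 0.0, 0.0, -0.5, 1000.0, 2000.0]
pixel_square = [(0, 0), (10, 0), (10, 10), (0, 10)]

geo_square = apply_affine(pixel_square, geo_params)
round_trip = apply_affine(geo_square, geo_params, inverse=True, precision=6)
```

Here geo_square holds the square's corners in geographic units, and round_trip recovers the original pixel coordinates.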
cw_geodata.vector_label.polygon.convert_poly_coords(geom, raster_src=None, affine_obj=None, inverse=False, precision=None)[source]

Georegister geometry objects currently in pixel coords or vice versa.

  • geom (shapely.geometry.shape or str) – A shapely.geometry.shape, or WKT string-formatted geometry object currently in pixel coordinates.
  • raster_src (str, optional) – Path to a raster image with georeferencing data to apply to geom. Alternatively, an opened rasterio.Band object or osgeo.gdal.Dataset object can be provided. Required if not using affine_obj.
  • affine_obj (list or affine.Affine) – An affine transformation to apply to geom in the form of an [a, b, d, e, xoff, yoff] list or an affine.Affine object. Required if not using raster_src.
  • inverse (bool, optional) – If true, will perform the inverse affine transformation, going from geospatial coordinates to pixel coordinates.
  • precision (int, optional) – Decimal precision for the polygon output. If not provided, rounding is skipped.

A geometry in the same format as the input with its coordinate system transformed to match the destination object.

Return type:


cw_geodata.vector_label.polygon.geojson_to_px_gdf(geojson, im_path, geom_col='geometry', precision=None, output_path=None)[source]

Convert a geojson or set of geojsons from geo coords to px coords.

  • geojson (str) – Path to a geojson. This function will also accept a pandas.DataFrame or geopandas.GeoDataFrame with a column named 'geometry' in this argument.
  • im_path (str) – Path to a georeferenced image (i.e. a GeoTIFF) that geolocates to the same geography as the geojson(s). This function will also accept an osgeo.gdal.Dataset or rasterio.DatasetReader with georeferencing information in this argument.
  • geom_col (str, optional) – The column containing geometry in geojson. If not provided, defaults to "geometry".
  • precision (int, optional) – The decimal precision for output geometries. If not provided, the vertex locations won’t be rounded.
  • output_path (str, optional) – Path to save the resulting output to. If not provided, the object won’t be saved to disk.

output_df – A pandas.DataFrame with all geometries in geojson that overlapped with the image at im_path converted to pixel coordinates. Additional columns are included with the filename of the source geojson (if available) and images for reference.

Return type:


cw_geodata.vector_label.polygon.georegister_px_df(df, im_path=None, affine_obj=None, crs=None, geom_col='geometry', precision=None, output_path=None)[source]

Convert a dataframe of geometries in pixel coordinates to a geo CRS.

  • df (pandas.DataFrame) – A pandas.DataFrame with polygons in a column named "geometry".
  • im_path (str, optional) – A filename or rasterio.DatasetReader object containing an image that has the same bounds as the pixel coordinates in df. If not provided, affine_obj and crs must both be provided.
  • affine_obj (list or affine.Affine, optional) – An affine transformation to apply to the geometries in df, in the form of an [a, b, d, e, xoff, yoff] list or an affine.Affine object. Required if not using im_path.
  • crs (dict, optional) – The coordinate reference system for the output GeoDataFrame. Required if not providing a raster image to extract the information from. Format should be {'init': 'epsg:xxxx'}, replacing xxxx with the EPSG code.
  • geom_col (str, optional) – The column containing geometry in df. If not provided, defaults to "geometry".
  • precision (int, optional) – The decimal precision for output geometries. If not provided, the vertex locations won’t be rounded.
  • output_path (str, optional) – Path to save the resulting output to. If not provided, the object won’t be saved to disk.
cw_geodata.vector_label.polygon.get_overlapping_subset(gdf, im=None, bbox=None, bbox_crs=None)[source]

Extract a subset of geometries in a GeoDataFrame that overlap with im.


This function uses RTree’s spatialindex, which is much faster (but slightly less accurate) than direct comparison of each object for overlap.

  • gdf (geopandas.GeoDataFrame) – A geopandas.GeoDataFrame instance or a path to a geojson.
  • im (rasterio.DatasetReader or str, optional) – An image object loaded with rasterio or a path to a georeferenced image (i.e. a GeoTIFF).
  • bbox (list or shapely.geometry.Polygon, optional) – A bounding box (either a shapely.geometry.Polygon or a [bottom, left, top, right] list) from an image. Has no effect if im is provided (bbox is inferred from the image instead.) If bbox is passed and im is not, a bbox_crs should be provided to ensure correct geolocation - if it isn’t, it will be assumed to have the same crs as gdf.

output_gdf – A geopandas.GeoDataFrame with all geometries in gdf that overlapped with the image at im. Coordinates are kept in the CRS of gdf.

Return type:
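
The trade-off between speed and accuracy noted for this function can be illustrated with a bounding-box pre-filter, which is the kind of test a spatial index performs. The sketch below is plain Python and makes no claim about how cw_geodata or RTree implement the check. The helper names and sample geometries are hypothetical. It shows how a geometry can be kept because its bounding box reaches into the image even though the geometry itself lies outside it.

```python
def bounds(coords):
    """Return (minx, miny, maxx, maxy) for a sequence of (x, y) vertices."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def bbox_overlaps(b1, b2):
    """True if two (minx, miny, maxx, maxy) boxes intersect or touch."""
    return (b1[0] <= b2[2] and b2[0] <= b1[2]
            and b1[1] <= b2[3] and b2[1] <= b1[3])


image_bbox = (0, 0, 10, 10)
geometries = {
    "inside": [(2, 2), (4, 2), (4, 4), (2, 4)],
    "far_away": [(20, 20), (22, 20), (22, 22), (20, 22)],
    # A triangle whose bounding box (9, 9, 13, 13) reaches into the image,
    # although every point of the triangle itself has x + y >= 22.
    "bbox_only": [(9, 13), (13, 9), (13, 13)],
}

# Cheap candidate selection based on bounding boxes alone.
candidates = [name for name, coords in geometries.items()
              if bbox_overlaps(bounds(coords), image_bbox)]
```

The "bbox_only" triangle survives the pre-filter without touching the image, which is the sort of false positive an exact geometric comparison would reject.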


Graph submodule

class cw_geodata.vector_label.graph.Edge(nodes, edge_weight=None)[source]

An object to hold edge attributes.


Node instances connected by the edge.

Type:2-tuple of Nodes

The weight of the edge.

Type:int or float

Return the Node.idx for the nodes in the edge.

set_edge_weight(normalize_factor=None, inverse=False)[source]

Get the edge weight based on Euclidean distance between nodes.


This method does not account for spherical deformation (i.e. does not use the Haversine equation). It is a simple linear distance.

  • normalize_factor (int or float, optional) – a number to multiply (or divide, if inverse=True) the Euclidean distance by. Defaults to None (no normalization)
  • inverse (bool, optional) – if True, the Euclidean distance weight will be divided by normalize_factor instead of multiplied by it.
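
A minimal sketch of the weighting rule described above, written in plain Python rather than with the Edge class. The function name euclidean_edge_weight and the use of (x, y) tuples in place of Node objects are assumptions made for illustration.

```python
import math


def euclidean_edge_weight(node_a, node_b, normalize_factor=None, inverse=False):
    """Straight-line distance between two (x, y) nodes, optionally scaled.

    Hypothetical stand-in for Edge.set_edge_weight; no spherical correction.
    """
    weight = math.hypot(node_b[0] - node_a[0], node_b[1] - node_a[1])
    if normalize_factor is not None:
        if inverse:
            weight = weight / normalize_factor
        else:
            weight = weight * normalize_factor
    return weight


plain = euclidean_edge_weight((0, 0), (3, 4))                        # 5.0
scaled = euclidean_edge_weight((0, 0), (3, 4), normalize_factor=2)   # 10.0
divided = euclidean_edge_weight((0, 0), (3, 4), normalize_factor=2,
                                inverse=True)                        # 2.5
```

Because the distance is planar, weights computed from longitude/latitude coordinates are in degrees rather than ground units.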
class cw_geodata.vector_label.graph.Node(idx, x, y)[source]

An object to hold node attributes.


The numerical index of the node. Used as a unique identifier when the nodes are added to the graph.


Numeric x location of the node, in either a geographic CRS or in pixel coordinates.

Type:int or float

Numeric y location of the node, in either a geographic CRS or in pixel coordinates.

Type:int or float
class cw_geodata.vector_label.graph.Path(edges=None, properties=None)[source]

An object to hold Edges with common properties.


A list of Edges

Type:list of Edges

A dictionary of property: value pairs that provide relevant metadata about edges along the path (e.g. road type, speed limit, etc.)

add_data(property, value)[source]

Add a property: value pair to the attribute.


Add an edge to the path.

set_edge_weights(data_key=None, inverse=False, overwrite=True)[source]

Calculate edge weights for all edges in the Path.

cw_geodata.vector_label.graph.geojson_to_graph(geojson, graph_name=None, retain_all=True, valid_road_types=None, road_type_field='type', edge_idx=0, first_node_idx=0, weight_norm_field=None, inverse=False, workers=1, verbose=False, output_path=None)[source]

Convert a geojson of path strings to a network graph.

  • geojson (str) – Path to a geojson file (or any other OGR-compatible vector file) to load network edges and nodes from.
  • graph_name (str, optional) – Name of the graph. If not provided, graph will be named 'unnamed'.
  • retain_all (bool, optional) – If True, the entire graph will be returned even if some parts are not connected. Defaults to True.
  • valid_road_types (list of ints, optional) –

    The road types to permit in the graph. If not provided, it’s assumed that all road types are permitted. The possible values are integers 1-7, which map as follows:

    1: Motorway
    2: Primary
    3: Secondary
    4: Tertiary
    5: Residential
    6: Unclassified
    7: Cart track
  • road_type_field (str, optional) – The name of the property in the vector data that delineates road type. Defaults to 'type'.
  • edge_idx (int, optional) – The first index to use for an edge. This can be set to a higher value so that a graph’s edge indices don’t overlap with existing values in another graph.
  • first_node_idx (int, optional) – The first index to use for a node. This can be set to a higher value so that a graph’s node indices don’t overlap with existing values in another graph.
  • weight_norm_field (str, optional) – The name of a field in geojson to pass to argument data_key in Path.set_edge_weights(). Defaults to None, in which case no weighting is performed (weights calculated solely using Euclidean distance.)
  • workers (int, optional) – Number of worker processes to use for parallelization. Defaults to 1. Should not exceed the number of CPUs available.
  • verbose (bool, optional) – Verbose print output. Defaults to False.
  • output_path (str, optional) – Path to a pickle file to save the output graph to. Nothing will be saved to disk if not provided.

G – A networkx.MultiDiGraph containing all of the nodes and edges from the geojson (or only the largest connected component if retain_all = False). Edge lengths are weighted based on geographic distance.

Return type:
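
The valid_road_types and road_type_field arguments can be pictured as a simple filter over feature properties. The sketch below works on GeoJSON-like dictionaries in plain Python. The filter_features helper and the sample features are hypothetical, and the library's own filtering may differ in detail.

```python
# Road type codes as listed in the parameter description above.
ROAD_TYPES = {
    1: "Motorway", 2: "Primary", 3: "Secondary", 4: "Tertiary",
    5: "Residential", 6: "Unclassified", 7: "Cart track",
}


def filter_features(features, valid_road_types=None, road_type_field="type"):
    """Keep features whose road type is permitted; keep all if no filter is set."""
    if valid_road_types is None:
        return list(features)
    allowed = set(valid_road_types)
    return [f for f in features
            if f["properties"].get(road_type_field) in allowed]


features = [
    {"properties": {"type": 1},
     "geometry": {"type": "LineString", "coordinates": [[0, 0], [1, 1]]}},
    {"properties": {"type": 5},
     "geometry": {"type": "LineString", "coordinates": [[1, 1], [2, 1]]}},
    {"properties": {"type": 7},
     "geometry": {"type": "LineString", "coordinates": [[2, 1], [3, 0]]}},
]

major = filter_features(features, valid_road_types=[1, 2, 3])
kept_names = [ROAD_TYPES[f["properties"]["type"]] for f in major]
```

With valid_road_types left as None, all three sample features are retained.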


cw_geodata.vector_label.graph.get_nodes_paths(vector_file, first_node_idx=0, node_gdf=geopandas.GeoDataFrame(), valid_road_types=None, road_type_field='type', workers=1, verbose=False)[source]

Extract nodes and paths from a vector file.

  • vector_file (str) – Path to an OGR-compatible vector file containing line segments (e.g., JSON response from the Overpass API, or a SpaceNet GeoJSON).
  • first_path_idx (int, optional) – The first index to use for a path. This can be set to a higher value so that a graph’s path indices don’t overlap with existing values in another graph.
  • first_node_idx (int, optional) – The first index to use for a node. This can be set to a higher value so that a graph’s node indices don’t overlap with existing values in another graph.
  • node_gdf (geopandas.GeoDataFrame, optional) – A geopandas.GeoDataFrame containing nodes to add to the graph. New nodes will be added to this object incrementally during the function call.
  • valid_road_types (list of ints, optional) –

    The road types to permit in the graph. If not provided, it’s assumed that all road types are permitted. The possible values are integers 1-7, which map as follows:

    1: Motorway
    2: Primary
    3: Secondary
    4: Tertiary
    5: Residential
    6: Unclassified
    7: Cart track
  • road_type_field (str, optional) – The name of the attribute containing road type information in vector_file. Defaults to 'type'.
  • workers (int, optional) – Number of worker processes to use for parallelization. Defaults to 1. Should not exceed the number of CPUs available.
  • verbose (bool, optional) – Verbose print output. Defaults to False.

nodes, paths

nodes : list

A list of Nodes to be added to the graph.

paths : list

A list of Paths containing the Edges and Nodes to be added to the graph.

Return type:

tuple of dicts

cw_geodata.vector_label.graph.graph_to_geojson(G, output_path, encoding='utf-8', overwrite=False)[source]

Save graph to two geojsons: one containing nodes, the other edges.

  • G (networkx.MultiDiGraph) – A graph object to save to geojson files.
  • output_path (str) – Path to save the geojsons to. '_nodes.geojson' and '_edges.geojson' will be appended to output_path (after stripping the extension).
  • encoding (str, optional) – The character encoding for the saved files.
  • overwrite (bool, optional) – Should files at output_path be overwritten? Defaults to no (False).


This function is based on osmnx.save_load.save_graph_shapefile, with tweaks to make it work with our graph objects. It will save two geojsons: a file containing all of the nodes and a file containing all of the edges.

Return type:None
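
The naming rule for the two output files can be sketched with the standard library. This is an illustration of the behaviour described for output_path, not the library's code. The helper name graph_output_paths and the sample path are hypothetical.

```python
import os


def graph_output_paths(output_path):
    """Derive node and edge file names by stripping the extension
    and appending the two suffixes described above."""
    base, _ext = os.path.splitext(output_path)
    return base + "_nodes.geojson", base + "_edges.geojson"


nodes_path, edges_path = graph_output_paths("roads/vegas.geojson")
# nodes_path -> "roads/vegas_nodes.geojson"
# edges_path -> "roads/vegas_edges.geojson"
```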
cw_geodata.vector_label.graph.linestring_to_edges(linestring, node_gdf)[source]

Collect nodes in a linestring and add them to an edge.

  • linestring (shapely.geometry.LineString) – A shapely.geometry.LineString object to extract nodes and edges from.
  • node_series (geopandas.GeoSeries) – A geopandas.GeoSeries containing a shapely.geometry.point.Point for every node to be added to the graph.

edges – A list of Edges from linestring.

Return type:



Read in a feature line from a fiona-opened shapefile and get the edges.

Parameters:feature (dict) – An item from an iterable with the key 'geometry' containing shapely.geometry.line.LineStrings or shapely.geometry.line.MultiLineStrings.
Returns:A list of Paths containing all edges in the LineString or MultiLineString.


This function depends on node_series and valid_road_types, which are passed by an initializer as items in var_dict.

Mask submodule

cw_geodata.vector_label.mask.boundary_mask(footprint_msk=None, out_file=None, reference_im=None, boundary_width=3, boundary_type='inner', burn_value=255, **kwargs)[source]

Convert a dataframe of geometries to a pixel mask.


This function requires creation of a footprint mask before it can operate; therefore, if there is no footprint mask already present, it will create one. In that case, additional arguments for footprint_mask() (e.g. df) must be passed.

  • footprint_msk (numpy.array, optional) – A filled in footprint mask created using footprint_mask(). If not provided, one will be made by calling footprint_mask() before creating the boundary mask, and the required arguments for that function must be provided as kwargs.
  • out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).
  • reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.
  • boundary_width (int, optional) – The width of the boundary to be created in pixels. Defaults to 3.
  • boundary_type ("inner" or "outer", optional) – Where to draw the boundaries: within the object ("inner") or outside of it ("outer"). Defaults to "inner".
  • burn_value (int, optional) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value. Ignored if burn_field is provided.
  • **kwargs (optional) – Additional arguments to pass to footprint_mask() if one needs to be created.

boundary_mask (numpy.array) – A pixel mask with 0s for non-object pixels and the same value as the footprint mask burn_value for the boundaries of each object.

Note: This function draws the boundaries within the edge of the object.
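
One way to picture an "inner" boundary is as the footprint minus an eroded copy of itself. The pure-Python sketch below does this on nested lists with a 4-connected erosion. It is an assumed illustration of the idea, not the method cw_geodata uses. The erode and inner_boundary helpers are hypothetical.

```python
def erode(mask):
    """One step of binary erosion with a 4-connected neighbourhood.

    Pixels outside the array are treated as background.
    """
    rows, cols = len(mask), len(mask[0])

    def filled(r, c):
        return 0 <= r < rows and 0 <= c < cols and mask[r][c] != 0

    offsets = ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1))
    return [[1 if all(filled(r + dr, c + dc) for dr, dc in offsets) else 0
             for c in range(cols)] for r in range(rows)]


def inner_boundary(footprint, boundary_width=1, burn_value=255):
    """Label footprint pixels that disappear after boundary_width erosions."""
    eroded = footprint
    for _ in range(boundary_width):
        eroded = erode(eroded)
    return [[burn_value if footprint[r][c] and not eroded[r][c] else 0
             for c in range(len(footprint[0]))]
            for r in range(len(footprint))]


# A 5x5 square object burned into a 7x7 footprint mask.
footprint = [[255 if 1 <= r <= 5 and 1 <= c <= 5 else 0 for c in range(7)]
             for r in range(7)]
boundary = inner_boundary(footprint, boundary_width=1)
n_boundary = sum(v == 255 for row in boundary for v in row)
```

With a width of 1, the result is the one-pixel ring around the square: 16 boundary pixels, with the 3x3 interior and the background both set to 0.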

cw_geodata.vector_label.mask.contact_mask(df, out_file=None, reference_im=None, geom_col='geometry', do_transform=False, affine_obj=None, shape=(900, 900), out_type='int', contact_spacing=10, burn_value=255)[source]

Create a pixel mask labeling closely juxtaposed objects.


This function identifies pixels in an image that do not correspond to objects, but fall within contact_spacing of more than one labeled object.

  • df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If the geometries in df are not in pixel coordinates, then affine_obj or reference_im must be passed to provide the transformation to convert.
  • out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).
  • reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.
  • geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".
  • do_transform (bool, optional) – Should the values in df be transformed from geospatial coordinates to pixel coordinates? Defaults to no (False). If True, either reference_im or affine_obj must be provided as a source for the required affine transformation matrix.
  • affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert from geo coordinates to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided.
  • shape (tuple, optional) – An (x_size, y_size) tuple defining the pixel extent of the output mask. Ignored if reference_im is provided.
  • out_type ('float' or 'int') –
  • contact_spacing (int or float, optional) – The desired maximum distance between adjacent polygons to be labeled as contact. contact_spacing will be in the same units as df's geometries, not necessarily in pixel units.
  • burn_value (int or float, optional) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value.
cw_geodata.vector_label.mask.df_to_px_mask(df, channels=['footprint'], out_file=None, reference_im=None, geom_col='geometry', do_transform=False, affine_obj=None, shape=(900, 900), out_type='int', burn_value=255, **kwargs)[source]

Convert a dataframe of geometries to a pixel mask.

  • df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If the geometries in df are not in pixel coordinates, then affine_obj or reference_im must be passed to provide the transformation to convert.
  • channels (list, optional) –

    The mask channels to generate. There are three values that this can contain:

    • "footprint": Create a full footprint mask, with 0s at pixels
      that don’t fall within geometries and burn_value at pixels that do.
    • "boundary": Create a mask with geometries outlined. Use
      boundary_width to set how thick the boundary will be drawn.
    • "contact": Create a mask with regions between >= 2 closely
      juxtaposed geometries labeled. Use contact_spacing to set the maximum spacing between polygons to be labeled.

    Each channel corresponds to its own shape plane in the output.

  • out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).
  • reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.
  • geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".
  • do_transform (bool, optional) – Should the values in df be transformed from geospatial coordinates to pixel coordinates? Defaults to no (False). If True, either reference_im or affine_obj must be provided as a source for the required affine transformation matrix.
  • affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert from geo coordinates to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided.
  • shape (tuple, optional) – An (x_size, y_size) tuple defining the pixel extent of the output mask. Ignored if reference_im is provided.
  • burn_value (int or float) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value.
  • kwargs – Additional arguments to pass to boundary_mask or contact_mask. See those functions for requirements.

mask – A pixel mask with 0s for non-object pixels and burn_value at object pixels. mask dtype will coincide with burn_value. Shape will be (shape[0], shape[1], len(channels)), with channels ordered per the provided channels list.

Return type:


cw_geodata.vector_label.mask.footprint_mask(df, out_file=None, reference_im=None, geom_col='geometry', do_transform=False, affine_obj=None, shape=(900, 900), out_type='int', burn_value=255, burn_field=None)[source]

Convert a dataframe of geometries to a pixel mask.

  • df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If the geometries in df are not in pixel coordinates, then affine_obj or reference_im must be passed to provide the transformation to convert.
  • out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).
  • reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.
  • geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".
  • do_transform (bool, optional) – Should the values in df be transformed from geospatial coordinates to pixel coordinates? Defaults to no (False). If True, either reference_im or affine_obj must be provided as a source for the required affine transformation matrix.
  • affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert from geo coordinates to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided or if do_transform=False.
  • shape (tuple, optional) – An (x_size, y_size) tuple defining the pixel extent of the output mask. Ignored if reference_im is provided.
  • out_type ('float' or 'int') –
  • burn_value (int or float, optional) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value. Ignored if burn_field is provided.
  • burn_field (str, optional) – Name of a column in df that provides values for burn_value for each independent object. If provided, burn_value is ignored.

mask – A pixel mask with 0s for non-object pixels and burn_value at object pixels. mask dtype will coincide with burn_value.

Return type:


cw_geodata.vector_label.mask.mask_to_poly_geojson(mask_arr, reference_im=None, output_path=None, output_type='csv', min_area=40, bg_value=0, do_transform=False, simplify=False, tolerance=0.5, **kwargs)[source]

Get polygons from an image mask.

  • mask_arr (numpy.ndarray of ints) – A 2D array of integers. Multi-channel masks are not supported, and must be simplified before passing to this function. Can also pass an image file path here.
  • reference_im (str, optional) – The path to a reference geotiff to use for georeferencing the polygons in the mask. Required if saving to a GeoJSON (see the output_type argument), otherwise only required if do_transform=True.
  • output_path (str, optional) – Path to save the output file to. If not provided, no file is saved.
  • output_type ('csv' or 'geojson', optional) – If output_path is provided, this argument defines what type of file will be generated - a CSV (output_type='csv') or a geojson (output_type='geojson').
  • min_area (int, optional) – The minimum area of a polygon to retain. Filtering is done AFTER any coordinate transformation, and therefore will be in destination units.
  • bg_value (int, optional) – The value in mask_arr that denotes background (non-object). Defaults to 0.
  • simplify (bool, optional) – If True, will use the Douglas-Peucker algorithm to simplify edges, saving memory and processing time later. Defaults to False.
  • tolerance (float, optional) – The tolerance value to use for simplification with the Douglas-Peucker algorithm. Defaults to 0.5. Only has an effect if simplify=True.

gdf – A GeoDataFrame of polygons.

Return type:
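
For readers unfamiliar with the Douglas-Peucker algorithm named in the simplify and tolerance arguments, here is a compact pure-Python sketch for an open polyline. It is a textbook version written for illustration. The library presumably delegates to an existing geometry package, and the function names here are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Simplify a polyline, dropping vertices within tolerance of the chord."""
    if len(points) < 3:
        return list(points)
    distances = [point_segment_distance(p, points[0], points[-1])
                 for p in points[1:-1]]
    max_dist = max(distances)
    if max_dist <= tolerance:
        return [points[0], points[-1]]
    # Split at the farthest vertex and simplify both halves recursively.
    split = distances.index(max_dist) + 1
    left = douglas_peucker(points[:split + 1], tolerance)
    right = douglas_peucker(points[split:], tolerance)
    return left[:-1] + right


line = [(0, 0), (1, 0.1), (2, 0), (3, 3), (4, 0)]
simplified = douglas_peucker(line, tolerance=0.5)
# simplified -> [(0, 0), (2, 0), (3, 3), (4, 0)]
```

The vertex (1, 0.1) is removed because it lies only 0.1 units from the chord between its retained neighbours, which is below the tolerance of 0.5.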


Utility functions

Geo utility submodule


Get the intersection geometries between all geometries in a set.

Parameters:polygons (list-like) – A list-like containing geometries. These will be placed in a geopandas.GeoSeries object to take advantage of rtree spatial indexing.
Returns:A list of geometric intersections between polygons in polygons, in the same CRS as the input.
Return type:intersect_list
cw_geodata.utils.geo.get_subgraph(G, node_subset)[source]

Create a subgraph from G. Code almost directly copied from osmnx.

  • G (networkx.MultiDiGraph) – A graph to be subsetted
  • node_subset (list-like) – The subset of nodes to induce a subgraph of G

G2 – The subgraph of G that includes node_subset

Return type:



Create an Affine from a list or array-formatted [a, b, d, e, xoff, yoff]

Parameters:xform_mat (list or numpy.array) – A list of values to convert to an affine object.
Returns:aff – An affine transformation object.
Return type:affine.Affine
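
The [a, b, d, e, xoff, yoff] list is not in the same order as the six coefficients of an affine.Affine, which are conventionally given as (a, b, c, d, e, f) with c and f as the x and y offsets. The sketch below shows the reordering using plain tuples. The helper names are hypothetical and the GDAL case is included only for comparison. It does not reproduce list_to_affine itself.

```python
def shapely_list_to_affine_order(xform_mat):
    """Reorder [a, b, d, e, xoff, yoff] into (a, b, c, d, e, f),
    where c = xoff and f = yoff."""
    a, b, d, e, xoff, yoff = xform_mat
    return (a, b, xoff, d, e, yoff)


def gdal_to_affine_order(geotransform):
    """Reorder a GDAL-style geotransform (c, a, b, f, d, e)
    into (a, b, c, d, e, f)."""
    c, a, b, f, d, e = geotransform
    return (a, b, c, d, e, f)


from_list = shapely_list_to_affine_order([0.5, 0.0, 0.0, -0.5, 1000.0, 2000.0])
from_gdal = gdal_to_affine_order((1000.0, 0.5, 0.0, 2000.0, 0.0, -0.5))
# Both describe the same transform: (0.5, 0.0, 1000.0, 0.0, -0.5, 2000.0)
```

If the affine package is installed, either tuple could then be unpacked into affine.Affine(*coefficients).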
cw_geodata.utils.geo.split_multi_geometries(gdf, obj_id_col=None, group_col=None, geom_col='geometry')[source]

Split apart MultiPolygon or MultiLineString geometries.

  • gdf (geopandas.GeoDataFrame or str) – A geopandas.GeoDataFrame or path to a geojson containing geometries.
  • obj_id_col (str, optional) – If one exists, the name of the column that uniquely identifies each geometry (e.g. the "BuildingId" column in many SpaceNet datasets). This will be tracked so multiple objects don’t get produced with the same ID. Note that the object ID column will be renumbered on output. If passed, group_col must also be provided.
  • group_col (str, optional) – A column to identify groups for sequential numbering (for example, 'ImageId' for sequential numbering of 'BuildingId'). Must be provided if obj_id_col is passed.
  • geom_col (str, optional) – The name of the column in gdf that corresponds to geometry. Defaults to 'geometry'.

A geopandas.GeoDataFrame that’s identical to the input, except with the multipolygons split into separate rows, and the object ID column renumbered (if one exists).

Return type:
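
The interaction between obj_id_col and group_col can be sketched with plain dictionaries. In this illustration a multi-part geometry is represented as a list of parts, and IDs are renumbered from 0 within each group. Both choices are assumptions made for the example and may not match the library's actual numbering. The split_multi helper is hypothetical.

```python
def split_multi(rows, obj_id_col=None, group_col=None, geom_col="geometry"):
    """Explode rows whose geometry is a list of parts into one row per part,
    then renumber obj_id_col sequentially within each group_col value."""
    if obj_id_col is not None and group_col is None:
        raise ValueError("group_col must be provided with obj_id_col")
    out = []
    for row in rows:
        geom = row[geom_col]
        parts = geom if isinstance(geom, list) else [geom]
        for part in parts:
            new_row = dict(row)  # copy so the input rows are left untouched
            new_row[geom_col] = part
            out.append(new_row)
    if obj_id_col is not None:
        counters = {}
        for row in out:
            group = row[group_col]
            row[obj_id_col] = counters.get(group, 0)
            counters[group] = row[obj_id_col] + 1
    return out


rows = [
    {"ImageId": "img1", "BuildingId": 0, "geometry": ["poly_a", "poly_b"]},
    {"ImageId": "img1", "BuildingId": 1, "geometry": "poly_c"},
    {"ImageId": "img2", "BuildingId": 0, "geometry": "poly_d"},
]
result = split_multi(rows, obj_id_col="BuildingId", group_col="ImageId")
summary = [(r["ImageId"], r["BuildingId"], r["geometry"]) for r in result]
```

The two parts of the first row become separate rows, and "poly_c" moves from ID 1 to ID 2 so that IDs within "img1" stay unique.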
