CosmiQ Works GeoData API reference
cw-geodata class and function list
cw_geodata.raster_image.image.get_geo_transform(…) | Get the geotransform for a raster image source.
cw_geodata.vector_label.polygon.affine_transform_gdf(…) | Perform an affine transformation on a GeoDataFrame.
cw_geodata.vector_label.polygon.convert_poly_coords(geom) | Georegister geometry objects currently in pixel coords or vice versa.
cw_geodata.vector_label.polygon.geojson_to_px_gdf(…) | Convert a geojson or set of geojsons from geo coords to px coords.
cw_geodata.vector_label.polygon.georegister_px_df(df) | Convert a dataframe of geometries in pixel coordinates to a geo CRS.
cw_geodata.vector_label.polygon.get_overlapping_subset(gdf) | Extract a subset of geometries in a GeoDataFrame that overlap with im.
cw_geodata.vector_label.graph.geojson_to_graph(geojson) | Convert a geojson of path strings to a network graph.
cw_geodata.vector_label.graph.get_nodes_paths(…) | Extract nodes and paths from a vector file.
cw_geodata.vector_label.graph.process_linestring |
cw_geodata.vector_label.mask.boundary_mask([…]) | Convert a dataframe of geometries to a pixel mask.
cw_geodata.vector_label.mask.contact_mask(df) | Create a pixel mask labeling closely juxtaposed objects.
cw_geodata.vector_label.mask.df_to_px_mask(df) | Convert a dataframe of geometries to a pixel mask.
cw_geodata.vector_label.mask.footprint_mask(df) | Convert a dataframe of geometries to a pixel mask.
cw_geodata.utils.geo.geometries_internal_intersection(…) | Get the intersection geometries between all geometries in a set.
cw_geodata.utils.geo.list_to_affine(xform_mat) | Create an Affine from a list or array-formatted [a, b, d, e, xoff, yoff].
cw_geodata.utils.geo.split_multi_geometries(gdf) | Split apart MultiPolygon or MultiLineString geometries.
Raster/Image functionality
Image submodule
cw_geodata.raster_image.image.get_geo_transform(raster_src)
Get the geotransform for a raster image source.
Parameters: raster_src (str, rasterio.DatasetReader, or osgeo.gdal.Dataset) – Path to a raster image with georeferencing data. Alternatively, an opened rasterio.Band object or osgeo.gdal.Dataset object can be provided.
Returns: transform – An affine transformation object to the image’s location in its CRS.
Return type: affine.Affine
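For orientation, a geotransform maps pixel (column, row) positions to coordinates in the image's CRS. The sketch below assumes the six-element GDAL ordering (x origin, pixel width, row rotation, y origin, column rotation, pixel height). The helper name `pixel_to_geo` and the sample values are hypothetical and are not part of cw-geodata.

```python
def pixel_to_geo(geotransform, col, row):
    """Map a pixel (col, row) to CRS coordinates using a GDAL-style geotransform."""
    x_origin, px_width, row_rot, y_origin, col_rot, px_height = geotransform
    x = x_origin + col * px_width + row * row_rot
    y = y_origin + col * col_rot + row * px_height
    return x, y


# Hypothetical north-up image: 0.5-unit pixels, upper-left corner at (1000, 2000).
# The pixel height is negative because row numbers increase downward.
gt = (1000.0, 0.5, 0.0, 2000.0, 0.0, -0.5)
print(pixel_to_geo(gt, 0, 0))
print(pixel_to_geo(gt, 10, 20))
```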
Vector/Label functionality
Polygon submodule
cw_geodata.vector_label.polygon.affine_transform_gdf(gdf, affine_obj, inverse=False, geom_col='geometry', precision=None)
Perform an affine transformation on a GeoDataFrame.
Parameters:
- gdf (geopandas.GeoDataFrame, pandas.DataFrame, or str) – A GeoDataFrame, a pandas DataFrame with a "geometry" column (or a different column containing geometries, identified by geom_col; note that this column will be renamed "geometry" for ease of use with geopandas), or the path to a saved file in .geojson or .csv format.
- affine_obj (list or affine.Affine) – An affine transformation to apply to the geometries in gdf, in the form of an [a, b, d, e, xoff, yoff] list or an affine.Affine object.
- inverse (bool, optional) – Use this argument to perform the inverse transformation.
- geom_col (str, optional) – The column in gdf corresponding to the geometry. Defaults to 'geometry'.
- precision (int, optional) – Decimal precision to round the geometries to. If not provided, no rounding is performed.
cw_geodata.vector_label.polygon.convert_poly_coords(geom, raster_src=None, affine_obj=None, inverse=False, precision=None)
Georegister geometry objects currently in pixel coords or vice versa.
Parameters:
- geom (shapely.geometry.shape or str) – A shapely.geometry.shape, or WKT string-formatted geometry object currently in pixel coordinates.
- raster_src (str, optional) – Path to a raster image with georeferencing data to apply to geom. Alternatively, an opened rasterio.Band object or osgeo.gdal.Dataset object can be provided. Required if not using affine_obj.
- affine_obj (list or affine.Affine) – An affine transformation to apply to geom in the form of an [a, b, d, e, xoff, yoff] list or an affine.Affine object. Required if not using raster_src.
- inverse (bool, optional) – If True, performs the inverse affine transformation, going from geospatial coordinates to pixel coordinates.
- precision (int, optional) – Decimal precision for the polygon output. If not provided, rounding is skipped.
Returns: out_geom – A geometry in the same format as the input with its coordinate system transformed to match the destination object.
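To make the `[a, b, d, e, xoff, yoff]` convention and the `inverse` and `precision` arguments concrete, here is a minimal pure-Python sketch of the underlying arithmetic. It is not the library's implementation: `apply_affine` is a hypothetical helper that works on plain (x, y) tuples rather than shapely geometries, and the transform values are made up.

```python
def apply_affine(coords, xform, inverse=False, precision=None):
    """Apply an [a, b, d, e, xoff, yoff] transform to (x, y) pairs.

    Forward: x' = a*x + b*y + xoff and y' = d*x + e*y + yoff.
    With inverse=True the same mapping is undone instead.
    """
    a, b, d, e, xoff, yoff = xform
    det = a * e - b * d
    if inverse and det == 0:
        raise ValueError("transform is not invertible")
    out = []
    for x, y in coords:
        if inverse:
            dx, dy = x - xoff, y - yoff
            nx = (e * dx - b * dy) / det
            ny = (-d * dx + a * dy) / det
        else:
            nx = a * x + b * y + xoff
            ny = d * x + e * y + yoff
        if precision is not None:
            nx, ny = round(nx, precision), round(ny, precision)
        out.append((nx, ny))
    return out


# Hypothetical transform: 0.5-unit pixels, origin at (1000, 2000), y axis flipped.
xform = [0.5, 0.0, 0.0, -0.5, 1000.0, 2000.0]
pixel_ring = [(0, 0), (10, 0), (10, 20), (0, 20)]

geo_ring = apply_affine(pixel_ring, xform)                       # pixel -> geo
back = apply_affine(geo_ring, xform, inverse=True, precision=3)  # geo -> pixel
```

Running the forward transform and then the inverse returns the original pixel ring, which is a quick sanity check when choosing between `inverse=False` and `inverse=True`.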
cw_geodata.vector_label.polygon.geojson_to_px_gdf(geojson, im_path, geom_col='geometry', precision=None, output_path=None)
Convert a geojson or set of geojsons from geo coords to px coords.
Parameters:
- geojson (str) – Path to a geojson. This function will also accept a pandas.DataFrame or geopandas.GeoDataFrame with a column named 'geometry' in this argument.
- im_path (str) – Path to a georeferenced image (i.e. a GeoTIFF) that geolocates to the same geography as the geojson(s). This function will also accept an osgeo.gdal.Dataset or rasterio.DatasetReader with georeferencing information in this argument.
- geom_col (str, optional) – The column containing geometry in geojson. If not provided, defaults to "geometry".
- precision (int, optional) – The decimal precision for output geometries. If not provided, the vertex locations won’t be rounded.
- output_path (str, optional) – Path to save the resulting output to. If not provided, the object won’t be saved to disk.
Returns: output_df – A pandas.DataFrame with all geometries in geojson that overlapped with the image at im_path converted to pixel coordinates. Additional columns are included with the filename of the source geojson (if available) and images for reference.
cw_geodata.vector_label.polygon.georegister_px_df(df, im_path=None, affine_obj=None, crs=None, geom_col='geometry', precision=None, output_path=None)
Convert a dataframe of geometries in pixel coordinates to a geo CRS.
Parameters:
- df (pandas.DataFrame) – A pandas.DataFrame with polygons in a column named "geometry".
- im_path (str, optional) – A filename or rasterio.DatasetReader object containing an image that has the same bounds as the pixel coordinates in df. If not provided, affine_obj and crs must both be provided.
- affine_obj (list or affine.Affine, optional) – An affine transformation to apply to the geometries in df, in the form of an [a, b, d, e, xoff, yoff] list or an affine.Affine object. Required if not using im_path.
- crs (dict, optional) – The coordinate reference system for the output GeoDataFrame. Required if not providing a raster image to extract the information from. Format should be {'init': 'epsg:xxxx'}, replacing xxxx with the EPSG code.
- geom_col (str, optional) – The column containing geometry in df. If not provided, defaults to "geometry".
- precision (int, optional) – The decimal precision for output geometries. If not provided, the vertex locations won’t be rounded.
- output_path (str, optional) – Path to save the resulting output to. If not provided, the object won’t be saved to disk.
cw_geodata.vector_label.polygon.get_overlapping_subset(gdf, im=None, bbox=None, bbox_crs=None)
Extract a subset of geometries in a GeoDataFrame that overlap with im.
Notes
This function uses RTree’s spatialindex, which is much faster (but slightly less accurate) than direct comparison of each object for overlap.
Parameters:
- gdf (geopandas.GeoDataFrame) – A geopandas.GeoDataFrame instance or a path to a geojson.
- im (rasterio.DatasetReader or str, optional) – An image object loaded with rasterio or a path to a georeferenced image (i.e. a GeoTIFF).
- bbox (list or shapely.geometry.Polygon, optional) – A bounding box (either a shapely.geometry.Polygon or a [bottom, left, top, right] list) from an image. Has no effect if im is provided (bbox is inferred from the image instead). If bbox is passed and im is not, a bbox_crs should be provided to ensure correct geolocation; if it isn’t, bbox will be assumed to have the same CRS as gdf.
Returns: output_gdf – A geopandas.GeoDataFrame with all geometries in gdf that overlapped with the image at im. Coordinates are kept in the CRS of gdf.
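The speed/accuracy trade-off mentioned in the Notes comes from comparing bounding boxes instead of exact shapes, which is what an R-tree index stores. The sketch below illustrates that idea only and does not use RTree or cw-geodata. The helper names and the `(minx, miny, maxx, maxy)` box order are assumptions chosen for the example.

```python
def bounds(ring):
    """Bounding box (minx, miny, maxx, maxy) of a list of (x, y) vertices."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def boxes_overlap(a, b):
    """True if two (minx, miny, maxx, maxy) boxes share any area or edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidate_overlaps(rings, image_box):
    """Indices of rings whose bounding box touches the image's bounding box."""
    return [i for i, ring in enumerate(rings) if boxes_overlap(bounds(ring), image_box)]


image_box = (0, 0, 10, 10)
rings = [
    [(2, 2), (4, 2), (4, 4), (2, 4)],          # fully inside the image
    [(20, 20), (30, 20), (30, 30), (20, 30)],  # far outside the image
    [(9, 14), (14, 9), (14, 14)],              # box overlaps, triangle does not
]
print(candidate_overlaps(rings, image_box))
```

The third ring is reported as a candidate even though the triangle itself never enters the image extent, which is the kind of false positive a bounding-box comparison can produce.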
Graph submodule
class cw_geodata.vector_label.graph.Edge(nodes, edge_weight=None)
An object to hold edge attributes.
set_edge_weight(normalize_factor=None, inverse=False)
Get the edge weight based on Euclidean distance between nodes.
Note
This method does not account for spherical deformation (i.e. does not use the Haversine equation). It is a simple linear distance.
Parameters:
- normalize_factor (int or float, optional) – A number to multiply (or divide, if inverse=True) the Euclidean distance by. Defaults to None (no normalization).
- inverse (bool, optional) – If True, the Euclidean distance weight will be divided by normalize_factor instead of multiplied by it.
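The weighting rule described for `set_edge_weight` can be sketched in a few lines. This is an illustration of the documented behaviour and not the class's actual code: `edge_weight` is a hypothetical standalone function, and nodes are represented as plain (x, y) tuples.

```python
import math


def edge_weight(node_a, node_b, normalize_factor=None, inverse=False):
    """Euclidean distance between two (x, y) nodes, optionally normalized."""
    dist = math.hypot(node_b[0] - node_a[0], node_b[1] - node_a[1])
    if normalize_factor is None:
        return dist  # no normalization requested
    # Divide when inverse=True, otherwise multiply.
    return dist / normalize_factor if inverse else dist * normalize_factor


print(edge_weight((0, 0), (3, 4)))
print(edge_weight((0, 0), (3, 4), normalize_factor=2))
print(edge_weight((0, 0), (3, 4), normalize_factor=2, inverse=True))
```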
class cw_geodata.vector_label.graph.Node(idx, x, y)
An object to hold node attributes.
idx
The numerical index of the node. Used as a unique identifier when the nodes are added to the graph.
Type: int
x
Numeric x location of the node, in either a geographic CRS or in pixel coordinates.
Type: int or float
y
Numeric y location of the node, in either a geographic CRS or in pixel coordinates.
Type: int or float
class cw_geodata.vector_label.graph.Path(edges=None, properties=None)
An object to hold Edges with common properties.
cw_geodata.vector_label.graph.geojson_to_graph(geojson, graph_name=None, retain_all=True, valid_road_types=None, road_type_field='type', edge_idx=0, first_node_idx=0, weight_norm_field=None, inverse=False, workers=1, verbose=False, output_path=None)
Convert a geojson of path strings to a network graph.
Parameters:
- geojson (str) – Path to a geojson file (or any other OGR-compatible vector file) to load network edges and nodes from.
- graph_name (str, optional) – Name of the graph. If not provided, graph will be named 'unnamed'.
- retain_all (bool, optional) – If True, the entire graph will be returned even if some parts are not connected. Defaults to True.
- valid_road_types (list of ints, optional) – The road types to permit in the graph. If not provided, it’s assumed that all road types are permitted. The possible values are integers 1-7, which map as follows: 1: Motorway; 2: Primary; 3: Secondary; 4: Tertiary; 5: Residential; 6: Unclassified; 7: Cart track.
- road_type_field (str, optional) – The name of the property in the vector data that delineates road type. Defaults to 'type'.
- edge_idx (int, optional) – The first index to use for an edge. This can be set to a higher value so that a graph’s edge indices don’t overlap with existing values in another graph.
- first_node_idx (int, optional) – The first index to use for a node. This can be set to a higher value so that a graph’s node indices don’t overlap with existing values in another graph.
- weight_norm_field (str, optional) – The name of a field in geojson to pass to argument data_key in Path.set_edge_weights(). Defaults to None, in which case no weighting is performed (weights are calculated solely using Euclidean distance).
- workers (int, optional) – Number of parallel processes to run for parallelization. Defaults to 1. Should not be greater than the number of CPUs available.
- verbose (bool, optional) – Verbose print output. Defaults to False.
- output_path (str, optional) – Path to a pickle file to save the output graph to. Nothing will be saved to disk if not provided.
Returns: G – A networkx.MultiDiGraph containing all of the nodes and edges from the geojson (or only the largest connected component if retain_all=False). Edge lengths are weighted based on geographic distance.
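As a rough illustration of how `valid_road_types`, `road_type_field`, and `first_node_idx` interact, the sketch below builds node and edge tables from line features using only the standard library. It is a simplified stand-in and not the library's implementation: `build_graph`, the feature dictionaries, and the tuple-based edge format are all hypothetical.

```python
import math


def build_graph(features, valid_road_types=None, road_type_field="type",
                first_node_idx=0):
    """Build node and edge tables from LineString-like features.

    features: list of dicts with "coords" (list of (x, y)) and "properties".
    Returns (nodes, edges): nodes maps index -> (x, y); edges is a list of
    (start_idx, end_idx, length) tuples.
    """
    node_ids = {}  # (x, y) -> node index, so shared vertices are reused
    edges = []
    next_idx = first_node_idx
    for feat in features:
        road_type = feat["properties"].get(road_type_field)
        if valid_road_types is not None and road_type not in valid_road_types:
            continue  # skip road types that were not requested
        coords = feat["coords"]
        for pt in coords:
            if pt not in node_ids:
                node_ids[pt] = next_idx
                next_idx += 1
        for start, end in zip(coords, coords[1:]):
            length = math.hypot(end[0] - start[0], end[1] - start[1])
            edges.append((node_ids[start], node_ids[end], length))
    nodes = {idx: pt for pt, idx in node_ids.items()}
    return nodes, edges


features = [
    {"coords": [(0, 0), (3, 4), (3, 10)], "properties": {"type": 2}},  # Primary
    {"coords": [(3, 4), (6, 8)], "properties": {"type": 7}},           # Cart track
]
# Keep only Motorway, Primary, and Secondary roads; start node indices at 100.
nodes, edges = build_graph(features, valid_road_types=[1, 2, 3], first_node_idx=100)
```

Starting node indices at 100 mirrors the documented use of `first_node_idx` to keep indices from colliding with those of another graph.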
cw_geodata.vector_label.graph.get_nodes_paths(vector_file, first_node_idx=0, node_gdf=Empty GeoDataFrame Columns: [] Index: [], valid_road_types=None, road_type_field='type', workers=1, verbose=False)
Extract nodes and paths from a vector file.
Parameters:
- vector_file (str) – Path to an OGR-compatible vector file containing line segments (e.g., JSON response from the Overpass API, or a SpaceNet GeoJSON).
- first_path_idx (int, optional) – The first index to use for a path. This can be set to a higher value so that a graph’s path indices don’t overlap with existing values in another graph.
- first_node_idx (int, optional) – The first index to use for a node. This can be set to a higher value so that a graph’s node indices don’t overlap with existing values in another graph.
- node_gdf (geopandas.GeoDataFrame, optional) – A geopandas.GeoDataFrame containing nodes to add to the graph. New nodes will be added to this object incrementally during the function call.
- valid_road_types (list of ints, optional) – The road types to permit in the graph. If not provided, it’s assumed that all road types are permitted. The possible values are integers 1-7, which map as follows: 1: Motorway; 2: Primary; 3: Secondary; 4: Tertiary; 5: Residential; 6: Unclassified; 7: Cart track.
- road_type_field (str, optional) – The name of the attribute containing road type information in vector_file. Defaults to 'type'.
- workers (int, optional) – Number of worker processes to use for parallelization. Defaults to 1. Should not exceed the number of CPUs available.
- verbose (bool, optional) – Verbose print output. Defaults to False.
Returns: nodes, paths
Return type: tuple of dicts
cw_geodata.vector_label.graph.graph_to_geojson(G, output_path, encoding='utf-8', overwrite=False)
Save graph to two geojsons: one containing nodes, the other edges.
Parameters:
- G (networkx.MultiDiGraph) – A graph object to save to geojson files.
- output_path (str) – Path to save the geojsons to. '_nodes.geojson' and '_edges.geojson' will be appended to output_path (after stripping the extension).
- encoding (str, optional) – The character encoding for the saved files.
- overwrite (bool, optional) – Should files at output_path be overwritten? Defaults to no (False).
Notes
This function is based on osmnx.save_load.save_graph_shapefile, with tweaks to make it work with our graph objects. It will save two geojsons: a file containing all of the nodes and a file containing all of the edges.
Return type: None
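The naming rule for the two output files (strip the extension, then append a suffix) can be sketched as follows. `graph_output_paths` is a hypothetical helper written for this illustration; the library may derive the names differently internally.

```python
import os


def graph_output_paths(output_path):
    """Return the (nodes, edges) file names derived from output_path."""
    base, _ext = os.path.splitext(output_path)  # drop ".geojson" or similar
    return base + "_nodes.geojson", base + "_edges.geojson"


print(graph_output_paths("out/roads.geojson"))
```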
cw_geodata.vector_label.graph.linestring_to_edges(linestring, node_gdf)
Collect nodes in a linestring and add them to an edge.
Parameters:
- linestring (shapely.geometry.LineString) – A shapely.geometry.LineString object to extract nodes and edges from.
- node_series (geopandas.GeoSeries) – A geopandas.GeoSeries containing a shapely.geometry.point.Point for every node to be added to the graph.
Returns: edges – A list of Edges from linestring.
cw_geodata.vector_label.graph.parallel_linestring_to_path(feature)
Read in a feature line from a fiona-opened shapefile and get the edges.
Parameters: feature (dict) – An item from a fiona.open iterable with the key 'geometry' containing shapely.geometry.line.LineStrings or shapely.geometry.line.MultiLineStrings.
Returns: A list of Paths containing all edges in the LineString or MultiLineString.
Notes
This function depends on node_series and valid_road_types, which are passed by an initializer as items in var_dict.
Mask submodule
cw_geodata.vector_label.mask.boundary_mask(footprint_msk=None, out_file=None, reference_im=None, boundary_width=3, boundary_type='inner', burn_value=255, **kwargs)
Convert a dataframe of geometries to a pixel mask.
Notes
This function requires creation of a footprint mask before it can operate; therefore, if there is no footprint mask already present, it will create one. In that case, additional arguments for footprint_mask() (e.g. df) must be passed.
Parameters:
- footprint_msk (numpy.array, optional) – A filled in footprint mask created using footprint_mask(). If not provided, one will be made by calling footprint_mask() before creating the boundary mask, and the required arguments for that function must be provided as kwargs.
- out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).
- reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.
- boundary_width (int, optional) – The width of the boundary to be created in pixels. Defaults to 3.
- boundary_type ("inner" or "outer", optional) – Where to draw the boundaries: within the object ("inner") or outside of it ("outer"). Defaults to "inner".
- burn_value (int, optional) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value. Ignored if burn_field is provided.
- **kwargs (optional) – Additional arguments to pass to footprint_mask() if one needs to be created.
Returns: boundary_mask (numpy.array) – A pixel mask with 0s for non-object pixels and the same value as the footprint mask burn_value for the boundaries of each object.
Note: This function draws the boundaries within the edge of the object.
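The difference between `boundary_type="inner"` and `"outer"` can be pictured with a small pure-Python sketch that derives a boundary from an existing footprint grid. This is only a conceptual illustration under stated assumptions: it uses nested lists instead of numpy arrays, 4-connected growth, and hypothetical helper names (`grow`, `boundary_from_footprint`). The library's actual algorithm may differ.

```python
def grow(cells, height, width, steps):
    """Expand a set of (row, col) cells by `steps` pixels using 4-connectivity."""
    grown = set(cells)
    frontier = set(cells)
    for _ in range(steps):
        nxt = set()
        for r, c in frontier:
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < height and 0 <= cc < width and (rr, cc) not in grown:
                    nxt.add((rr, cc))
        grown |= nxt
        frontier = nxt
    return grown


def boundary_from_footprint(footprint, boundary_width=1, boundary_type="inner",
                            burn_value=255):
    """Label boundary pixels of a footprint grid (nested lists, 0 = background)."""
    height, width = len(footprint), len(footprint[0])
    every = {(r, c) for r in range(height) for c in range(width)}
    obj = {(r, c) for (r, c) in every if footprint[r][c]}
    bg = every - obj
    if boundary_type == "inner":
        # Object pixels within boundary_width steps of the background.
        edge = grow(bg, height, width, boundary_width) & obj
    elif boundary_type == "outer":
        # Background pixels within boundary_width steps of an object.
        edge = grow(obj, height, width, boundary_width) & bg
    else:
        raise ValueError("boundary_type must be 'inner' or 'outer'")
    return [[burn_value if (r, c) in edge else 0 for c in range(width)]
            for r in range(height)]


# A 3x3 object centred in a 5x5 grid.
footprint = [[255 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)]
             for r in range(5)]
inner = boundary_from_footprint(footprint, boundary_type="inner")
outer = boundary_from_footprint(footprint, boundary_type="outer")
```

With a width of 1, the inner boundary is the ring of object pixels around the centre pixel, while the outer boundary is the ring of background pixels that touch the object.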
cw_geodata.vector_label.mask.contact_mask(df, out_file=None, reference_im=None, geom_col='geometry', do_transform=False, affine_obj=None, shape=(900, 900), out_type='int', contact_spacing=10, burn_value=255)
Create a pixel mask labeling closely juxtaposed objects.
Notes
This function identifies pixels in an image that do not correspond to objects, but fall within contact_spacing of >1 labeled object.
Parameters:
- df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If the geometries in df are not in pixel coordinates, then affine_obj or reference_im must be passed to provide the transformation to convert.
- out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).
- reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.
- geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".
- do_transform (bool, optional) – Should the values in df be transformed from geospatial coordinates to pixel coordinates? Defaults to no (False). If True, either reference_im or affine_obj must be provided as a source for the required affine transformation matrix.
- affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert from geo coordinates to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided.
- shape (tuple, optional) – An (x_size, y_size) tuple defining the pixel extent of the output mask. Ignored if reference_im is provided.
- out_type ('float' or 'int') –
- contact_spacing (int or float, optional) – The desired maximum distance between adjacent polygons to be labeled as contact. contact_spacing will be in the same units as the geometries in df, not necessarily in pixel units.
- burn_value (int or float, optional) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value.
cw_geodata.vector_label.mask.df_to_px_mask(df, channels=['footprint'], out_file=None, reference_im=None, geom_col='geometry', do_transform=False, affine_obj=None, shape=(900, 900), out_type='int', burn_value=255, **kwargs)
Convert a dataframe of geometries to a pixel mask.
Parameters:
- df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If the geometries in df are not in pixel coordinates, then affine_obj or reference_im must be passed to provide the transformation to convert.
- channels (list, optional) – The mask channels to generate. There are three values that this can contain:
  - "footprint": Create a full footprint mask, with 0s at pixels that don’t fall within geometries and burn_value at pixels that do.
  - "boundary": Create a mask with geometries outlined. Use boundary_width to set how thick the boundary will be drawn.
  - "contact": Create a mask with regions between >= 2 closely juxtaposed geometries labeled. Use contact_spacing to set the maximum spacing between polygons to be labeled.
  Each channel corresponds to its own shape plane in the output.
- out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).
- reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.
- geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".
- do_transform (bool, optional) – Should the values in df be transformed from geospatial coordinates to pixel coordinates? Defaults to no (False). If True, either reference_im or affine_obj must be provided as a source for the required affine transformation matrix.
- affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert from geo coordinates to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided.
- shape (tuple, optional) – An (x_size, y_size) tuple defining the pixel extent of the output mask. Ignored if reference_im is provided.
- burn_value (int or float) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value.
- kwargs – Additional arguments to pass to boundary_mask or contact_mask. See those functions for requirements.
Returns: mask – A pixel mask with 0s for non-object pixels and burn_value at object pixels. mask dtype will coincide with burn_value. Shape will be (shape[0], shape[1], len(channels)), with channels ordered per the provided channels list.
Return type: numpy.array
cw_geodata.vector_label.mask.footprint_mask(df, out_file=None, reference_im=None, geom_col='geometry', do_transform=False, affine_obj=None, shape=(900, 900), out_type='int', burn_value=255, burn_field=None)
Convert a dataframe of geometries to a pixel mask.
Parameters:
- df (pandas.DataFrame or geopandas.GeoDataFrame) – A pandas.DataFrame or geopandas.GeoDataFrame instance with a column containing geometries (identified by geom_col). If the geometries in df are not in pixel coordinates, then affine_obj or reference_im must be passed to provide the transformation to convert.
- out_file (str, optional) – Path to an image file to save the output to. Must be compatible with rasterio.DatasetReader. If provided, a reference_im must be provided (for metadata purposes).
- reference_im (rasterio.DatasetReader or str, optional) – An image to extract necessary coordinate information from: the affine transformation matrix, the image extent, etc. If provided, affine_obj and shape are ignored.
- geom_col (str, optional) – The column containing geometries in df. Defaults to "geometry".
- do_transform (bool, optional) – Should the values in df be transformed from geospatial coordinates to pixel coordinates? Defaults to no (False). If True, either reference_im or affine_obj must be provided as a source for the required affine transformation matrix.
- affine_obj (list or affine.Affine, optional) – Affine transformation to use to convert from geo coordinates to pixel space. Only provide this argument if df is a geopandas.GeoDataFrame with coordinates in a georeferenced coordinate space. Ignored if reference_im is provided or if do_transform=False.
- shape (tuple, optional) – An (x_size, y_size) tuple defining the pixel extent of the output mask. Ignored if reference_im is provided.
- out_type ('float' or 'int') –
- burn_value (int or float, optional) – The value to use for labeling objects in the mask. Defaults to 255 (the max value for uint8 arrays). The mask array will be set to the same dtype as burn_value. Ignored if burn_field is provided.
- burn_field (str, optional) – Name of a column in df that provides values for burn_value for each independent object. If provided, burn_value is ignored.
Returns: mask – A pixel mask with 0s for non-object pixels and burn_value at object pixels. mask dtype will coincide with burn_value.
Return type: numpy.array
cw_geodata.vector_label.mask.mask_to_poly_geojson(mask_arr, reference_im=None, output_path=None, output_type='csv', min_area=40, bg_value=0, do_transform=False, simplify=False, tolerance=0.5, **kwargs)
Get polygons from an image mask.
Parameters:
- mask_arr (numpy.ndarray of ints) – A 2D array of integers. Multi-channel masks are not supported, and must be simplified before passing to this function. Can also pass an image file path here.
- reference_im (str, optional) – The path to a reference geotiff to use for georeferencing the polygons in the mask. Required if saving to a GeoJSON (see the output_type argument), otherwise only required if do_transform=True.
- output_path (str, optional) – Path to save the output file to. If not provided, no file is saved.
- output_type ('csv' or 'geojson', optional) – If output_path is provided, this argument defines what type of file will be generated: a CSV (output_type='csv') or a geojson (output_type='geojson').
- min_area (int, optional) – The minimum area of a polygon to retain. Filtering is done AFTER any coordinate transformation, and therefore will be in destination units.
- bg_value (int, optional) – The value in mask_arr that denotes background (non-object). Defaults to 0.
- simplify (bool, optional) – If True, will use the Douglas-Peucker algorithm to simplify edges, saving memory and processing time later. Defaults to False.
- tolerance (float, optional) – The tolerance value to use for simplification with the Douglas-Peucker algorithm. Defaults to 0.5. Only has an effect if simplify=True.
Returns: gdf – A GeoDataFrame of polygons.
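To show what `simplify=True` with a `tolerance` does conceptually, here is a textbook recursive Douglas-Peucker sketch on a plain list of (x, y) points. It is not the routine cw-geodata calls internally. `douglas_peucker` and `_point_line_distance` are hypothetical names, and the variant shown measures distance to the infinite line through the end points.

```python
import math


def _point_line_distance(pt, start, end):
    """Perpendicular distance from pt to the line through start and end."""
    (x, y), (x1, y1), (x2, y2) = pt, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)  # degenerate baseline
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def douglas_peucker(points, tolerance):
    """Drop vertices that deviate from the simplified line by <= tolerance."""
    if len(points) < 3:
        return list(points)
    distances = [_point_line_distance(p, points[0], points[-1])
                 for p in points[1:-1]]
    max_dist = max(distances)
    if max_dist <= tolerance:
        return [points[0], points[-1]]
    split = distances.index(max_dist) + 1  # index of the farthest vertex
    left = douglas_peucker(points[:split + 1], tolerance)
    right = douglas_peucker(points[split:], tolerance)
    return left[:-1] + right  # avoid duplicating the shared vertex


line = [(0, 0), (1, 0.2), (2, 0), (3, 3), (4, 0)]
simplified = douglas_peucker(line, tolerance=0.5)
```

In this example the small bump at (1, 0.2) is removed because it deviates by less than the tolerance, while the spike at (3, 3) is kept.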
Utility functions
Geo utility submodule
cw_geodata.utils.geo.geometries_internal_intersection(polygons)
Get the intersection geometries between all geometries in a set.
Parameters: polygons (list-like) – A list-like containing geometries. These will be placed in a geopandas.GeoSeries object to take advantage of rtree spatial indexing.
Returns: intersect_list – A list of geometric intersections between polygons in polygons, in the same CRS as the input.
cw_geodata.utils.geo.get_subgraph(G, node_subset)
Create a subgraph from G. Code almost directly copied from osmnx.
Parameters:
- G (networkx.MultiDiGraph) – A graph to be subsetted.
- node_subset (list-like) – The subset of nodes to induce a subgraph of G.
Returns: G2 – The subgraph of G that includes node_subset.
Return type: networkx.MultiDiGraph
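An induced subgraph keeps only the edges whose two end points are both in the chosen node subset. The sketch below shows that rule on a plain edge list and does not use networkx. `induced_subgraph` and the `(u, v, data)` tuple layout are assumptions made for the example.

```python
def induced_subgraph(edges, node_subset):
    """Keep the (u, v, data) edges whose end points are both in node_subset."""
    keep = set(node_subset)
    return [(u, v, data) for u, v, data in edges if u in keep and v in keep]


edges = [
    (0, 1, {"length": 5.0}),
    (1, 2, {"length": 6.0}),
    (2, 3, {"length": 2.5}),
]
print(induced_subgraph(edges, [0, 1, 2]))
```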
cw_geodata.utils.geo.list_to_affine(xform_mat)
Create an Affine from a list or array-formatted [a, b, d, e, xoff, yoff].
Parameters: xform_mat (list or numpy.array) – A list of values to convert to an affine object.
Returns: aff – An affine transformation object.
Return type: affine.Affine
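The list form `[a, b, d, e, xoff, yoff]` places the offsets last, whereas the `affine.Affine` constructor takes its coefficients in the order `(a, b, c, d, e, f)` with `c` and `f` as the x and y offsets. The sketch below shows that reordering with a stand-in named tuple so it runs without the `affine` package. `AffineCoeffs` and `list_to_coeffs` are hypothetical names and not part of cw-geodata.

```python
from typing import NamedTuple


class AffineCoeffs(NamedTuple):
    """Coefficients in (a, b, c, d, e, f) order: x' = a*x + b*y + c, y' = d*x + e*y + f."""
    a: float
    b: float
    c: float
    d: float
    e: float
    f: float


def list_to_coeffs(xform_mat):
    """Reorder [a, b, d, e, xoff, yoff] into (a, b, xoff, d, e, yoff)."""
    if len(xform_mat) != 6:
        raise ValueError("expected six values: [a, b, d, e, xoff, yoff]")
    a, b, d, e, xoff, yoff = xform_mat
    return AffineCoeffs(a, b, xoff, d, e, yoff)


coeffs = list_to_coeffs([0.5, 0.0, 0.0, -0.5, 1000.0, 2000.0])
print(coeffs)
```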
cw_geodata.utils.geo.split_multi_geometries(gdf, obj_id_col=None, group_col=None, geom_col='geometry')
Split apart MultiPolygon or MultiLineString geometries.
Parameters:
- gdf (geopandas.GeoDataFrame or str) – A geopandas.GeoDataFrame or path to a geojson containing geometries.
- obj_id_col (str, optional) – If one exists, the name of the column that uniquely identifies each geometry (e.g. the "BuildingId" column in many SpaceNet datasets). This will be tracked so multiple objects don’t get produced with the same ID. Note that the object ID column will be renumbered on output. If passed, group_col must also be provided.
- group_col (str, optional) – A column to identify groups for sequential numbering (for example, 'ImageId' for sequential numbering of 'BuildingId'). Must be provided if obj_id_col is passed.
- geom_col (str, optional) – The name of the column in gdf that corresponds to geometry. Defaults to 'geometry'.
Returns: A geopandas.GeoDataFrame that’s identical to the input, except with the multipolygons split into separate rows, and the object ID column renumbered (if one exists).
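The split-and-renumber behaviour can be sketched with plain dictionaries standing in for GeoDataFrame rows. This is an illustration under several assumptions and not the library's code: `split_multi` is a hypothetical helper, a multi-geometry is represented as a list of placeholder parts, and renumbering is assumed to start at 1 within each group.

```python
def split_multi(rows, obj_id_col=None, group_col=None, geom_col="geometry"):
    """Give every part of a multi-geometry its own row, renumbering IDs per group."""
    if obj_id_col is not None and group_col is None:
        raise ValueError("group_col must be provided when obj_id_col is passed")
    out = []
    counters = {}  # group value -> last ID handed out (assumed to start at 1)
    for row in rows:
        geom = row[geom_col]
        parts = geom if isinstance(geom, list) else [geom]
        for part in parts:
            new_row = dict(row)
            new_row[geom_col] = part
            if obj_id_col is not None:
                group = row[group_col]
                counters[group] = counters.get(group, 0) + 1
                new_row[obj_id_col] = counters[group]
            out.append(new_row)
    return out


rows = [
    {"ImageId": "a", "BuildingId": 1, "geometry": ["p1", "p2"]},  # multi-part
    {"ImageId": "a", "BuildingId": 2, "geometry": "p3"},
    {"ImageId": "b", "BuildingId": 1, "geometry": "p4"},
]
result = split_multi(rows, obj_id_col="BuildingId", group_col="ImageId")
summary = [(r["ImageId"], r["BuildingId"], r["geometry"]) for r in result]
```

Because the two-part geometry in image "a" becomes two rows, the following object in the same image is renumbered from 2 to 3, which is why the ID column changes on output.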