oceanum.datamesh.Connector
- class oceanum.datamesh.Connector(token=None, service='https://datamesh.oceanum.io', _gateway=None, user=None, session_duration=None, verify=True)
Datamesh connector class.
All datamesh operations are methods of this class.
Attributes
host – Datamesh host
Methods
- __init__(token=None, service='https://datamesh.oceanum.io', _gateway=None, user=None, session_duration=None, verify=True)
Datamesh connector constructor
- Parameters:
token (string) – Your datamesh access token. Defaults to os.environ.get("DATAMESH_TOKEN", None).
service (string) – The datamesh service url. Defaults to os.environ.get("DATAMESH_SERVICE", "https://datamesh.oceanum.io").
user (string, optional) – Optional user identifier to be sent in the header for datamesh authentication. Defaults to None.
session_duration (float, optional) – The desired length of time for acquired datamesh sessions in seconds. Will be 3600 seconds by default.
verify (bool, optional) – Whether to verify the datamesh server certificate. Defaults to True.
- Raises:
ValueError – Missing or invalid arguments
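As a minimal sketch of how the documented defaults resolve, the helper below mirrors the os.environ.get fallbacks described for token and service. The name connector_kwargs is hypothetical and not part of oceanum, and the commented lines assume the package is installed and importable as oceanum.datamesh.

```python
import os


def connector_kwargs(env=None):
    """Resolve constructor arguments following the documented defaults."""
    env = os.environ if env is None else env
    return {
        "token": env.get("DATAMESH_TOKEN", None),
        "service": env.get("DATAMESH_SERVICE", "https://datamesh.oceanum.io"),
        "verify": True,  # documented default: verify the server certificate
    }


kwargs = connector_kwargs()

# With the package installed and a valid token available:
# from oceanum.datamesh import Connector
# datamesh = Connector(**kwargs)
```

Passing token and service explicitly takes the place of the environment lookups, which can be convenient in tests or notebooks.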
- delete_datasource(datasource_id)
Delete a datasource from datamesh. This will delete the datamesh registration and any stored data.
- Parameters:
datasource_id (string) – Unique datasource id
- Returns:
True if the datasource was successfully deleted
- Return type:
boolean
- async delete_datasource_async(datasource_id)
Asynchronously delete a datasource from datamesh. This will delete the datamesh registration and any stored data.
- Parameters:
datasource_id (string) – Unique datasource id
- Returns:
True if the datasource was successfully deleted
- Return type:
boolean
- get_catalog(search=None, timefilter=None, geofilter=None, limit=None)
Get datamesh catalog
- Parameters:
search (string, optional) – Search string for filtering datasources
timefilter (Union[oceanum.datamesh.query.TimeFilter, list], optional) – Time filter as a valid Query TimeFilter or a list of [start, end]
geofilter (Union[oceanum.datamesh.query.GeoFilter, dict, shapely.geometry], optional) – Spatial filter as a valid Query GeoFilter, a GeoJSON geometry as a dict, or a shapely geometry
limit (int, optional) – Limit the number of datasources returned. Defaults to None.
- Returns:
A datamesh catalog instance
- Return type:
oceanum.datamesh.Catalog
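The list and dict forms of the filters can be sketched as follows. This is an illustration only: catalog_filters is a hypothetical helper, the ISO 8601 date strings are an assumption about accepted time values, and the bounding box is arbitrary. The commented call assumes an existing Connector instance named datamesh.

```python
def catalog_filters(start, end, west, south, east, north):
    """Build get_catalog keyword arguments from a time range and a bounding box."""
    # GeoJSON polygon ring: longitude/latitude order, closed on the first point
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],
    ]
    return {
        "timefilter": [start, end],  # list form: [start, end]
        "geofilter": {"type": "Polygon", "coordinates": [ring]},  # GeoJSON dict form
    }


filters = catalog_filters("2020-01-01", "2020-12-31", 165.0, -48.0, 180.0, -34.0)

# With a Connector instance named `datamesh` (not created here):
# catalog = datamesh.get_catalog(search="wave", limit=10, **filters)
```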
- async get_catalog_async(search=None, timefilter=None, geofilter=None)
Get datamesh catalog asynchronously
- Parameters:
search (string, optional) – Search string for filtering datasources
timefilter (Union[oceanum.datamesh.query.TimeFilter, list], optional) – Time filter as a valid Query TimeFilter or a list of [start, end]
geofilter (Union[oceanum.datamesh.query.GeoFilter, dict, shapely.geometry], optional) – Spatial filter as a valid Query GeoFilter, a GeoJSON geometry as a dict, or a shapely geometry
- Returns:
A datamesh catalog instance
- Return type:
Coroutine<oceanum.datamesh.Catalog>
- get_datasource(datasource_id)
Get a Datasource instance from the datamesh. This does not load the actual data.
- Parameters:
datasource_id (string) – Unique datasource id
- Returns:
A datasource instance
- Return type:
oceanum.datamesh.Datasource
- Raises:
DatameshConnectError – Datasource cannot be found or is not authorized for the datamesh key
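A lookup that tolerates a missing or unauthorized datasource might be sketched as below. The helper find_datasource is hypothetical, and because the import path of DatameshConnectError is not given here, the exception types to treat as "not found" are passed in by the caller rather than imported.

```python
def find_datasource(connector, datasource_id, not_found=(Exception,)):
    """Return the Datasource instance, or None if the lookup raises.

    `not_found` is a tuple of exception types to swallow; in real use it
    would be narrowed to the DatameshConnectError class of the installed
    package rather than left at the broad default.
    """
    try:
        return connector.get_datasource(datasource_id)
    except not_found:
        return None


# With a Connector instance named `datamesh` (not created here):
# datasource = find_datasource(datamesh, "my_datasource")
# if datasource is None:
#     print("datasource not found or not authorized")
```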
- async get_datasource_async(datasource_id)
Get a Datasource instance from the datamesh asynchronously. This does not load the actual data.
- Parameters:
datasource_id (string) – Unique datasource id
loop – Event loop. default=None will use asyncio.get_running_loop()
executor – concurrent.futures.Executor instance. default=None will use the default executor
- Returns:
A datasource instance
- Return type:
Coroutine<oceanum.datamesh.Datasource>
- Raises:
DatameshConnectError – Datasource cannot be found or is not authorized for the datamesh key
- load_datasource(datasource_id, parameters={}, use_dask=False)
Load a datasource into the work environment. For datasources which load into DataFrames or GeoDataFrames, this returns an in memory instance of the DataFrame. For datasources which load into an xarray Dataset, an open zarr backed dataset is returned.
- Parameters:
datasource_id (string) – Unique datasource id
parameters (dict) – Additional datasource parameters
use_dask (bool, optional) – Load datasource as a dask enabled datasource if possible. Defaults to False.
- Returns:
The datasource container
- Return type:
Union[pandas.DataFrame, geopandas.GeoDataFrame, xarray.Dataset]
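Because the returned container can be one of several types, calling code often needs to branch on what it received. The sketch below does so by duck typing, so it imports neither pandas nor xarray; summarize is a hypothetical helper, and the commented call assumes an existing Connector instance named datamesh.

```python
def summarize(container):
    """Describe a loaded container by inspecting its attributes."""
    kind = type(container).__name__
    if hasattr(container, "data_vars"):
        # xarray.Dataset exposes its variables through `data_vars`
        return f"{kind} with variables {list(container.data_vars)}"
    if hasattr(container, "columns"):
        # pandas.DataFrame and geopandas.GeoDataFrame expose `columns`
        return f"{kind} with columns {list(container.columns)}"
    return kind


# With a Connector instance named `datamesh` (not created here):
# data = datamesh.load_datasource("my_datasource", use_dask=False)
# print(summarize(data))
```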
- async load_datasource_async(datasource_id, parameters={}, use_dask=False)
Load a datasource asynchronously into the work environment
- Parameters:
datasource_id (string) – Unique datasource id
use_dask (bool, optional) – Load datasource as a dask enabled datasource if possible. Defaults to False.
loop – Event loop. default=None will use asyncio.get_running_loop()
executor – concurrent.futures.Executor instance. default=None will use the default executor
- Returns:
The datasource container
- Return type:
Coroutine<Union[pandas.DataFrame, geopandas.GeoDataFrame, xarray.Dataset]>
- query(query=None, *, use_dask=False, cache_timeout=0, **query_keys)
Make a datamesh query
- Parameters:
query (Union[oceanum.datamesh.Query, dict]) – Datamesh query as a query object or a valid query dictionary
- Kwargs:
use_dask (bool, optional): Load datasource as a dask enabled datasource if possible. Defaults to False.
cache_timeout (int, optional): Local cache timeout in seconds. Defaults to 0 (no local cache). Only applies if use_dask=False. If the local cache holds an identical query less than cache_timeout seconds old, that result is returned; the server is not checked for more recent data.
**query_keys: Keyword form of the query, for example datamesh.query(datasource="my_datasource")
- Returns:
The datasource container
- Return type:
Union[pandas.DataFrame, geopandas.GeoDataFrame, xarray.Dataset]
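The dictionary and keyword forms of a query can be compared side by side in a small sketch. Both wrappers are hypothetical, only the datasource key shown in the documentation is used, and the commented calls assume an existing Connector instance named datamesh.

```python
def query_by_dict(connector, datasource_id, cache_timeout=0):
    """Issue a query using the dictionary form."""
    query = {"datasource": datasource_id}
    return connector.query(query, cache_timeout=cache_timeout)


def query_by_keywords(connector, datasource_id, cache_timeout=0):
    """Issue the same query using the keyword form."""
    return connector.query(datasource=datasource_id, cache_timeout=cache_timeout)


# With a Connector instance named `datamesh` (not created here):
# fresh = query_by_dict(datamesh, "my_datasource")
# cached = query_by_keywords(datamesh, "my_datasource", cache_timeout=600)
```

A non-zero cache_timeout trades freshness for speed: a repeated identical query inside the timeout window is answered from the local cache rather than the server.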
- async query_async(query, *, use_dask=False, cache_timeout=0, **query_keys)
Make a datamesh query asynchronously
- Parameters:
query (Union[oceanum.datamesh.Query, dict]) – Datamesh query as a query object or a valid query dictionary
- Kwargs:
use_dask (bool, optional): Load datasource as a dask enabled datasource if possible. Defaults to False.
cache_timeout (int, optional): Local cache timeout in seconds. Defaults to 0 (no local cache). Only applies if use_dask=False. If the local cache holds an identical query less than cache_timeout seconds old, that result is returned; the server is not checked for more recent data.
loop: Event loop. default=None will use asyncio.get_running_loop()
executor: concurrent.futures.Executor instance. default=None will use the default executor
**query_keys: Keyword form of the query, for example datamesh.query(datasource="my_datasource")
- Returns:
The datasource container
- Return type:
Coroutine<Union[pandas.DataFrame, geopandas.GeoDataFrame, xarray.Dataset]>
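One reason to reach for the asynchronous variant is to run several queries concurrently. The following is a sketch under the assumption that query_async can be awaited like any coroutine; query_many is a hypothetical helper, and the commented call assumes an existing Connector instance named datamesh.

```python
import asyncio


async def query_many(connector, datasource_ids):
    """Run one query per datasource id concurrently.

    Results come back in the same order as `datasource_ids`.
    """
    pending = [
        connector.query_async({"datasource": datasource_id})
        for datasource_id in datasource_ids
    ]
    return await asyncio.gather(*pending)


# With a Connector instance named `datamesh` (not created here):
# results = asyncio.run(query_many(datamesh, ["datasource_a", "datasource_b"]))
```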
- update_metadata(datasource_id, **properties)
Update the metadata of a datasource in datamesh
- Parameters:
datasource_id (string) – Unique datasource id
**properties – Additional properties for the datasource - see oceanum.datamesh.Datasource constructor
- Returns:
The datasource instance that was updated
- Return type:
oceanum.datamesh.Datasource
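Metadata updates pass properties through as keyword arguments. In the sketch below, describe is a hypothetical helper, and the property names description and tags are assumptions about the Datasource constructor that should be checked against its documentation.

```python
def describe(connector, datasource_id, description, tags=()):
    """Update a datasource's description and, optionally, its tags."""
    # `description` and `tags` are assumed property names, not confirmed here
    properties = {"description": description}
    if tags:
        properties["tags"] = list(tags)
    return connector.update_metadata(datasource_id, **properties)


# With a Connector instance named `datamesh` (not created here):
# updated = describe(datamesh, "my_datasource", "Hourly wave buoy record", tags=["wave"])
```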
- async update_metadata_async(datasource_id, **properties)
Update the metadata of a datasource in datamesh asynchronously
- Parameters:
datasource_id (string) – Unique datasource id
**properties – Additional properties for the datasource - see oceanum.datamesh.Datasource constructor
- Returns:
The datasource instance that was updated
- Return type:
Coroutine<oceanum.datamesh.Datasource>
- write_datasource(datasource_id, data, geometry=None, geom=None, append=None, overwrite=False, index=None, crs=None, **properties)
Write a datasource to datamesh from the work environment
- Parameters:
datasource_id (string) – Unique datasource id
data (Union[pandas.DataFrame, geopandas.GeoDataFrame, xarray.Dataset, None]) – The data to be written to datamesh. If data is None, just update metadata properties.
geom (oceanum.datasource.Geometry, optional) – GeoJSON geometry of the datasource in WGS84 if crs=None else in the specified crs. If not provided the geometry will be inferred from the data if possible. default=None
coordinates (Dict[oceanum.datasource.Coordinates, str], optional) – Coordinate mapping for xarray datasets. default=None
append (string, optional) – Coordinate to append on. default=None
overwrite (bool, optional) – Overwrite existing datasource. default=False
crs (Union[string,int], optional) – Coordinate reference system for the datasource if not WGS84. The geom argument is also assumed to be in this CRS. default=None
**properties – Additional properties for the datasource - see oceanum.datamesh.Datasource
- Returns:
The datasource instance that was written to
- Return type:
oceanum.datamesh.Datasource
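The append and overwrite arguments select between extending an existing datasource and replacing it. A minimal sketch of the two modes follows; both helpers are hypothetical, and the coordinate name "time" is an assumption about the data being written.

```python
def append_along(connector, datasource_id, data, coordinate="time"):
    """Append new data to an existing datasource along one coordinate."""
    # "time" is an assumed coordinate name; use whichever coordinate the data extends
    return connector.write_datasource(datasource_id, data, append=coordinate)


def replace_datasource(connector, datasource_id, data):
    """Write data over an existing datasource."""
    return connector.write_datasource(datasource_id, data, overwrite=True)


# With a Connector instance named `datamesh` and a DataFrame or Dataset named `data`
# (neither created here):
# datasource = append_along(datamesh, "my_datasource", data)
```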
- async write_datasource_async(datasource_id, data, append=None, overwrite=False, **properties)
Write a datasource to datamesh from the work environment asynchronously
- Parameters:
datasource_id (string) – Unique datasource id
data (Union[pandas.DataFrame, geopandas.GeoDataFrame, xarray.Dataset, None]) – The data to be written to datamesh. If data is None, just update metadata properties.
geom (oceanum.datasource.Geometry) – GeoJSON geometry of the datasource
append (string, optional) – Coordinate to append on. default=None
overwrite (bool, optional) – Overwrite existing datasource. default=False
**properties – Additional properties for the datasource - see oceanum.datamesh.Datasource constructor
- Returns:
The datasource instance that was written to
- Return type:
Coroutine<oceanum.datamesh.Datasource>