podpac.data.Zarr

class podpac.data.Zarr(**kwargs)

Bases: podpac.core.authentication.S3Mixin, podpac.core.data.file_source.FileKeysMixin, podpac.core.data.file_source.BaseFileSource

Create a DataSource node using zarr.

source

Path to the Zarr archive

Type

str

file_mode

The mode used to open the Zarr archive. Default is ‘r’. Options are ‘r’, ‘r+’, ‘w’, ‘w-’ (or ‘x’), and ‘a’.

Type

str, optional

dataset

The zarr group object used to read the archive

Type

zarr.Group

coordinates

The coordinates of the data source

Type

podpac.Coordinates

data_key

data key, default ‘data’

Type

str, int

lat_key

latitude coordinates key, default ‘lat’

Type

str, int

lon_key

longitude coordinates key, default ‘lon’

Type

str, int

time_key

time coordinates key, default ‘time’

Type

str, int

alt_key

altitude coordinates key, default ‘alt’

Type

str, int

crs

Coordinate reference system of the coordinates

Type

str

cf_time

decode CF datetimes

Type

bool

cf_units

units, when decoding CF datetimes

Type

str

cf_calendar

calendar, when decoding CF datetimes

Type

str
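The cf_time, cf_units, and cf_calendar attributes together describe how numeric time values are turned into datetimes. The following is a minimal sketch of that decoding step, not podpac's implementation: the helper name decode_cf_times and the unit table are hypothetical, and it assumes Python's proleptic Gregorian calendar, which agrees with the CF "standard" calendar for modern dates.

```python
from datetime import datetime, timedelta

# Seconds per supported time unit (hypothetical table for this sketch).
_SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def decode_cf_times(values, cf_units):
    """Decode numeric offsets using CF units such as 'days since 2000-01-01'.

    Only fixed-length units and the default calendar are handled here.
    """
    unit, _, reference = cf_units.partition(" since ")
    epoch = datetime.fromisoformat(reference.strip())
    scale = _SECONDS_PER_UNIT[unit.strip().lower()]
    return [epoch + timedelta(seconds=value * scale) for value in values]


decoded = decode_cf_times([0, 1.5, 31], "days since 2000-01-01")
```

Non-standard calendars (for example a 360-day calendar) need a dedicated CF time library and are outside the scope of this sketch.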

Alternative Constructors

from_definition(definition)

Create podpac Node from a dictionary definition.

from_json(s)

Create podpac Node from a JSON definition.

Methods

__init__(**kwargs)

Do not overwrite me

chunk_exists([index, chunk_str, data_key, …])

Test to see if a chunk exists for a particular slice.

close_dataset()

Close opened resources.

create_output_array(coords[, data])

Initialize an output data array

eval(coordinates[, output])

Evaluates this node using the supplied coordinates.

eval_group(group)

Evaluate the node for each of the coordinates in the group.

find_coordinates()

Get the available coordinates for the Node.

from_url(url)

Create podpac Node from a WMS/WCS request.

get_cache(key[, coordinates])

Get cached data for this node.

get_coordinates()

Returns a Coordinates object that describes the coordinates of the data source.

get_data(coordinates, coordinates_index)

This method must be defined by the data source implementing the DataSource class.

has_cache(key[, coordinates])

Check for cached data for this node.

init()

Overwrite this method if a node needs to do any additional initialization after the standard initialization.

list_dir([data_key])

load(path)

Create podpac Node from file.

put_cache(data, key[, coordinates, overwrite])

Cache data for this node.

rem_cache(key[, coordinates, mode])

Clear cached data for this node.

save(path)

Write node to file.

set_coordinates(coordinates[, force])

Set the coordinates.

trait_is_defined(name)

Attributes

alt_key

A trait for unicode strings.

anon

A boolean (True, False) trait.

attrs

List of node attributes

available_data_keys

aws_access_key_id

A trait for unicode strings.

aws_client_kwargs

An instance of a Python dict.

aws_region_name

A trait for unicode strings.

aws_requester_pays

A boolean (True, False) trait.

aws_secret_access_key

A trait for unicode strings.

base_ref

Default reference/name in node definitions

boundary

An instance of a Python dict.

cache_coordinates

A boolean (True, False) trait.

cache_ctrl

A trait whose value must be an instance of a specified class.

cache_output

A boolean (True, False) trait.

cf_calendar

A trait for unicode strings.

cf_time

A boolean (True, False) trait.

cf_units

A trait for unicode strings.

config_kwargs

An instance of a Python dict.

coordinate_index_type

coordinates

The coordinates of the data source

crs

A trait for unicode strings.

data_key

A trait type representing a Union type.

dataset

definition

dims

dtype

A trait which allows any value.

file_mode

A trait for unicode strings.

force_eval

A boolean (True, False) trait.

hash

interpolation

interpolation_class

Get the interpolation class currently set for this data source.

interpolators

Return the interpolators selected for the previous node evaluation interpolation.

json

json_pretty

keys

lat_key

A trait for unicode strings.

lon_key

A trait for unicode strings.

nan_vals

An instance of a Python list.

output

A trait for unicode strings.

outputs

An instance of a Python list.

s3

skip_validation

A boolean (True, False) trait.

source

A trait for unicode strings.

style

A trait whose value must be an instance of a specified class.

time_key

A trait for unicode strings.

units

A trait for unicode strings.

Members

__init__(**kwargs)

Do not overwrite me

chunk_exists(index=None, chunk_str=None, data_key=None, chunks=None, list_dir=[])

Test to see if a chunk exists for a particular slice. Note: Only the start of the index is used.

Parameters
  • index (tuple(slice), optional) – Default is None. A tuple of slices indicating the data that the user wants to access

  • chunk_str (str, optional) – Default is None. A string equivalent to the filename of the chunk (e.g. “1.0.5”)

  • data_key (str, optional) – Default is None. The data_key for the zarr array that will be queried.

  • chunks (list, optional) – Default is None. The chunk structure of the zarr array. If not provided, self.dataset[data_key].chunks is used.

  • list_dir (list, optional) – A list of existing paths – used in lieu of ‘exist’ calls
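As a rough illustration of how index, chunks, and chunk_str relate, the sketch below derives a dot-separated chunk name from the start of each slice, which is consistent with the note that only the start of the index is used. The helper names chunk_key and chunk_listed are hypothetical, and the "data_key/chunk" path layout used for the list_dir lookup is an assumption rather than podpac's documented behavior.

```python
def chunk_key(index, chunks):
    """Name the chunk that contains the start of each slice in index."""
    parts = []
    for dim_slice, chunk_size in zip(index, chunks):
        start = dim_slice.start or 0  # a slice with start=None begins at 0
        parts.append(str(start // chunk_size))
    return ".".join(parts)


def chunk_listed(index, chunks, data_key, list_dir):
    """Check a pre-fetched listing instead of querying the store (assumed layout)."""
    return f"{data_key}/{chunk_key(index, chunks)}" in list_dir


index = (slice(100, 150), slice(0, 10), slice(520, 600))
chunks = (100, 50, 100)
key = chunk_key(index, chunks)

listing = ["data/0.0.0", "data/1.0.5"]
found = chunk_listed(index, chunks, "data", listing)
missing = chunk_listed((slice(0, 10), slice(60, 70), slice(0, 5)), chunks, "data", listing)
```

Here key is "1.0.5", matching the chunk_str example above. Passing a listing avoids one existence check per chunk, which may matter when the archive lives on a remote store.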

coordinate_index_type = 'slice'

property dataset

property dims

file_mode

A trait for unicode strings.

get_data(coordinates, coordinates_index)

This method must be defined by the data source implementing the DataSource class. When data source nodes are evaluated, this method is called with request coordinates and coordinate indexes. The implementing method can choose which input provides the most efficient method of getting data (i.e., via coordinates or via the index of the coordinates).

Coordinates and coordinate indexes may be strided or subsets of the source data, but all coordinates and coordinate indexes will match 1:1 with the subset data.

This method may return a numpy array, an xarray DataArray, or a podpac UnitsDataArray. If a numpy array or xarray DataArray is returned, podpac.data.DataSource.evaluate() will cast the data into a UnitsDataArray using the requested source coordinates. If a podpac UnitsDataArray is passed back, the podpac.data.DataSource.evaluate() method will not do any further processing. The inherited Node method create_output_array can be used to generate the template UnitsDataArray in your DataSource. See podpac.Node.create_output_array() for more details.

Parameters
  • coordinates (podpac.Coordinates) – The coordinates that need to be retrieved from the data source using the coordinate system of the data source

  • coordinates_index (List) – A list of slices or a boolean array that gives the indices of the data that needs to be retrieved from the data source. The values in coordinates_index will vary depending on the coordinate_index_type defined for the data source.

Returns

A subset of the returned data. If a numpy array or xarray DataArray is returned, the data will be cast into UnitsDataArray using the returned data to fill values at the requested source coordinates.

Return type

np.ndarray, xr.DataArray, podpac.UnitsDataArray
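The 1:1 correspondence between coordinates and coordinates_index described for get_data can be illustrated without podpac. This is a sketch under stated assumptions: the grid, the variable names, and the use of nested Python lists in place of a zarr array are all hypothetical, and the index is a list of slices because coordinate_index_type is 'slice' for this class.

```python
# Hypothetical source grid: 4 latitudes x 6 longitudes, stored row-major.
lat = [40.0, 41.0, 42.0, 43.0]
lon = [-75.0, -74.0, -73.0, -72.0, -71.0, -70.0]
source = [[10 * i + j for j in range(len(lon))] for i in range(len(lat))]

# A strided request: every other latitude, and longitudes 1 through 4.
coordinates_index = [slice(0, 4, 2), slice(1, 5)]
lat_index, lon_index = coordinates_index

# The same slices select both the coordinates and the data,
# so the subset lines up element-for-element with the requested coordinates.
sub_lat = lat[lat_index]
sub_lon = lon[lon_index]
subset = [row[lon_index] for row in source[lat_index]]
```

With a real array type, the same list of slices would typically be applied in a single indexing operation, for example by converting it to a tuple.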

property keys

list_dir(data_key=None)