podpac.data.Rasterio

class podpac.data.Rasterio(**kwargs)[source]

Bases: podpac.core.data.datasource.DataSource

Create a DataSource using Rasterio.

Parameters
  • source (str, io.BytesIO) – Path to the data source

  • band (int) – The ‘band’ or index for the variable being accessed in files such as GeoTIFFs

dataset

A reference to the datasource opened by rasterio

Type

rasterio._io.RasterReader

native_coordinates

The coordinates of the data source.

Type

podpac.Coordinates

Notes

The source can be a path to a file in an S3 bucket, for example:

s3://landsat-pds/L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B1.TIF

In that case, make sure to set the environment variable CURL_CA_BUNDLE:

  • Windows: set CURL_CA_BUNDLE=<path_to_conda_env>\Library\ssl\cacert.pem

  • Linux: export CURL_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
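When the shell environment cannot be relied on, CURL_CA_BUNDLE can also be set from Python, provided this happens before the data source is opened. The following is a minimal sketch: the helper name use_ca_bundle is hypothetical and is not part of podpac or rasterio, and the path shown is the Linux location from the note above.

```python
import os


def use_ca_bundle(path):
    """Point CURL_CA_BUNDLE at a certificate bundle for this process.

    Hypothetical helper for illustration; not part of podpac or rasterio.
    Returns the previous value (or None) so the caller can restore it.
    """
    previous = os.environ.get("CURL_CA_BUNDLE")
    os.environ["CURL_CA_BUNDLE"] = path
    return previous


# Linux path taken from the note above; adjust it for your system.
previous_bundle = use_ca_bundle("/etc/ssl/certs/ca-certificates.crt")
```

The change only affects the current process and any child processes it starts.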

Methods

__init__(**kwargs)

Do not overwrite me

close_dataset()

Closes the file for the datasource

create_output_array(coords[, data])

Initialize an output data array

eval(coordinates[, output])

Evaluates this node using the supplied coordinates.

eval_group(group)

Evaluate the node for each of the coordinates in the group.

find_coordinates()

Get the available native coordinates for the Node.

from_url(url)

Create podpac Node from a WMS/WCS request.

get_band_numbers(key, value)

Return the bands that have a key equal to a specified value.

get_cache(key[, coordinates])

Get cached data for this node.

get_data(coordinates, coordinates_index)

This method must be defined by the data source implementing the DataSource class.

get_native_coordinates()

Returns a Coordinates object that describes the native coordinates of the data source.

has_cache(key[, coordinates])

Check for cached data for this node.

init()

Overwrite this method if a node needs to do any additional initialization after the standard initialization.

put_cache(data, key[, coordinates, overwrite])

Cache data for this node.

rem_cache(key[, coordinates, mode])

Clear cached data for this node.

Attributes

band

A casting version of the int trait.

band_count

The number of bands

band_descriptions

A description of each band contained in dataset.tags

band_keys

An alternative view of band_descriptions based on the keys present in the metadata

base_definition

Base node definition for DataSource nodes.

base_ref

Default pipeline node reference/name in pipeline node definitions

cache_ctrl

A trait whose value must be an instance of a specified class.

cache_output

A boolean (True, False) trait.

cache_update

A boolean (True, False) trait.

coordinate_index_type

An enum whose value must be in a given sequence.

dataset

A trait which allows any value.

definition

Full pipeline definition for this node.

dtype

A trait which allows any value.

hash

interpolation

A trait type representing a Union type.

interpolation_class

Get the interpolation class currently set for this data source.

interpolators

Return the interpolators selected for the previous node evaluation interpolation.

json

Definition for this node in JSON format.

json_pretty

nan_vals

An instance of a Python list.

native_coordinates

A trait whose value must be an instance of a specified class.

pipeline

Create a pipeline node from this node

source

A trait type representing a Union type.

style

A trait whose value must be an instance of a specified class.

units

A trait for unicode strings.

Members

__init__(**kwargs)

Do not overwrite me

band

A casting version of the int trait.

property band_count

The number of bands

Returns

The number of bands in the dataset

Return type

int

property band_descriptions

A description of each band contained in dataset.tags

Returns

Dictionary of band_number: band_description pairs. Each band_description value is itself a dictionary whose keys depend on the metadata.

Return type

OrderedDict

property band_keys

An alternative view of band_descriptions based on the keys present in the metadata

Returns

Dictionary of metadata keys, where each value is a list holding that key's value for each band. For example, band_keys['TIME'] = ['2015', '2016', '2017'] for a dataset with three bands.

Return type

dict
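The relationship between band_descriptions and band_keys can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the function name band_keys_from_descriptions and the sample metadata are made up, and filling a missing key with None is a choice made for this sketch.

```python
from collections import OrderedDict


def band_keys_from_descriptions(band_descriptions):
    """Regroup per-band metadata into per-key lists (illustrative only).

    A band that lacks a key contributes None, so every list has one
    entry per band. That fill value is an assumption of this sketch.
    """
    # Collect keys in first-seen order across all bands.
    keys = []
    for description in band_descriptions.values():
        for key in description:
            if key not in keys:
                keys.append(key)
    return {
        key: [description.get(key) for description in band_descriptions.values()]
        for key in keys
    }


# Made-up metadata for a three-band dataset.
descriptions = OrderedDict(
    [
        (0, {"TIME": "2015", "UNITS": "mm"}),
        (1, {"TIME": "2016", "UNITS": "mm"}),
        (2, {"TIME": "2017"}),
    ]
)
keys_view = band_keys_from_descriptions(descriptions)
```

Here keys_view["TIME"] holds one value per band, in band order, which is the shape described in the Returns text above.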

close_dataset()[source]

Closes the file for the datasource

dataset

A trait which allows any value.

get_band_numbers(key, value)[source]

Return the bands that have a key equal to a specified value.

Parameters
  • key (str / list) – Key present in the metadata of the band. Can be a single key, or a list of keys.

  • value (str / list) – Value of the key that should be returned. Can be a single value, or a list of values

Returns

An array of band numbers that match the criteria

Return type

np.ndarray
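The matching described above can be sketched over a band_keys-style dictionary. The sketch makes several assumptions that the text does not confirm: multiple key/value pairs are combined with a logical AND, band numbers are 1-based (as in rasterio), and a plain list is returned instead of an np.ndarray. The function name matching_band_numbers and the sample data are hypothetical.

```python
def matching_band_numbers(band_keys, key, value):
    """Return 1-based band numbers whose metadata matches every key/value pair.

    Illustrative stand-in; not the podpac implementation.
    """
    # Accept either a single key/value or parallel lists of them.
    keys = [key] if isinstance(key, str) else list(key)
    values = [value] if isinstance(value, str) else list(value)
    if len(keys) != len(values):
        raise ValueError("key and value must have the same length")

    band_count = len(next(iter(band_keys.values()))) if band_keys else 0
    matches = []
    for index in range(band_count):
        if all(band_keys[k][index] == v for k, v in zip(keys, values)):
            matches.append(index + 1)  # assumed 1-based numbering
    return matches


# Made-up metadata for a three-band dataset.
example_keys = {
    "TIME": ["2015", "2016", "2016"],
    "VAR": ["rain", "rain", "snow"],
}
single = matching_band_numbers(example_keys, "TIME", "2016")
combined = matching_band_numbers(example_keys, ["TIME", "VAR"], ["2016", "snow"])
```

With a single key, every band whose value matches is returned; with lists, only bands that satisfy all pairs remain.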

get_data(coordinates, coordinates_index)[source]

This method must be defined by the data source implementing the DataSource class. When data source nodes are evaluated, this method is called with request coordinates and coordinate indexes. The implementing method can choose whichever input provides the most efficient way of getting data (i.e., via the coordinates or via the index of the coordinates).

Coordinates and coordinate indexes may be strided or subsets of the source data, but all coordinates and coordinate indexes will match 1:1 with the subset data.

This method may return a numpy array, an xarray DataArray, or a podpac UnitsDataArray. If a numpy array or xarray DataArray is returned, podpac.data.DataSource.evaluate() will cast the data into a UnitsDataArray using the requested source coordinates. If a podpac UnitsDataArray is passed back, the podpac.data.DataSource.evaluate() method will not do any further processing. The inherited Node method create_output_array can be used to generate the template UnitsDataArray in your DataSource. See podpac.Node.create_output_array() for more details.

Parameters
  • coordinates (podpac.Coordinates) – The coordinates that need to be retrieved from the data source using the coordinate system of the data source

  • coordinates_index (List) – A list of slices or a boolean array that gives the indices of the data that need to be retrieved from the data source. The values in coordinates_index will vary depending on the coordinate_index_type defined for the data source.

Returns

A subset of the returned data. If a numpy array or xarray DataArray is returned, the data will be cast into UnitsDataArray using the returned data to fill values at the requested source coordinates.

Return type

np.ndarray, xr.DataArray, podpac.UnitsDataArray
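To make the role of coordinates_index concrete, the sketch below applies a list of slices to a small two-dimensional grid. It covers only the list-of-slices case, not the boolean-array case, and uses a nested Python list as a stand-in for the source array; the function name subset_by_index is hypothetical.

```python
def subset_by_index(grid, coordinates_index):
    """Apply one slice per dimension to a 2-D list of lists.

    Illustrative stand-in for indexing a source array with the
    slices a data source receives in get_data.
    """
    row_slice, col_slice = coordinates_index
    return [row[col_slice] for row in grid[row_slice]]


# Made-up 4 x 4 source data.
source_grid = [
    [0, 1, 2, 3],
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
]

# A strided request: every other row, columns 1 and 2.
strided = subset_by_index(source_grid, [slice(0, 4, 2), slice(1, 3)])
```

The result has one row and one column per requested coordinate, which is the 1:1 correspondence between indexes and subset data described above.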

get_native_coordinates()[source]

Returns a Coordinates object that describes the native coordinates of the data source.

In most cases, this method is defined by the data source implementing the DataSource class. If the method is not implemented by the data source, it will try to return self.native_coordinates if self.native_coordinates is not None.

Otherwise, this method will raise a NotImplementedError.

Returns

The coordinates describing the data source array.

Return type

podpac.Coordinates

Notes

Need to pay attention to:

  • the order of the dimensions

  • the stacking of the dimensions

  • the type of coordinates

Coordinates should be non-nan and non-repeating for best compatibility

The default implementation tries to find the lat/lon coordinates based on dataset.affine. It cannot determine the alt or time dimensions, so child classes may have to overload this method.
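How lat/lon coordinates follow from an affine transform can be sketched as below. This is not the library's code: it assumes a north-up raster with no rotation terms, takes the six coefficients as a plain tuple rather than an affine object, and reports pixel centers, which may differ from the convention podpac uses. The function name pixel_center_coordinates and the sample numbers are made up.

```python
def pixel_center_coordinates(affine, width, height):
    """Compute lat/lon of pixel centers for a north-up raster.

    `affine` is (a, b, c, d, e, f) with x = a*col + b*row + c and
    y = d*col + e*row + f. The rotation terms b and d must be zero here.
    """
    a, b, c, d, e, f = affine
    if b != 0 or d != 0:
        raise ValueError("rotated rasters are outside the scope of this sketch")
    # Offset by half a pixel to move from the pixel corner to its center.
    lon = [c + a * (col + 0.5) for col in range(width)]
    lat = [f + e * (row + 0.5) for row in range(height)]
    return lat, lon


# Made-up transform: 0.5 degree pixels, upper-left corner at (-100, 40).
lat, lon = pixel_center_coordinates(
    (0.5, 0.0, -100.0, 0.0, -0.5, 40.0), width=4, height=2
)
```

Nothing in the transform describes alt or time, which is why those dimensions have to be supplied by a child class.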

source

A trait type representing a Union type.