podpac.data.CSV

class podpac.data.CSV(**kwargs)[source]

Bases: podpac.core.data.file.DatasetSource

Create a DataSource from a .csv file.

This class assumes that the data has a storage format such as:

    header 1, header 2, header 3, …
    row1_data1, row1_data2, row1_data3, …
    row2_data1, row2_data2, row2_data3, …
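As a minimal sketch of that layout, the snippet below parses a small in-memory CSV with Python's standard `csv` module. The column titles (`lat`, `lon`, `time`, `data`) mirror the default key names documented for this class; the sample values are invented for illustration, and this is not how podpac itself reads the file.

```python
import csv
import io

# Invented sample values; column titles follow the documented default keys.
CSV_TEXT = """lat,lon,time,data
45.0,-120.0,2018-01-01,0.5
46.0,-121.0,2018-01-02,0.7
"""

reader = csv.reader(io.StringIO(CSV_TEXT))
header = next(reader)   # row 0 holds the column names
rows = list(reader)     # each remaining row is one record

print(header)
print(rows[0])
```

Each record carries its own latitude, longitude, and time alongside the data value, which is the assumption the rest of this class builds on.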

source

Path to the csv file

Type

str

header

Row number containing the column names, default 0. Use None for no header.

Type

int, None

dataset

Raw Pandas DataFrame used to read the data

Type

pd.DataFrame

native_coordinates

The coordinates of the data source.

Type

Coordinates

data_key

data column number or column title, default ‘data’

Type

str, int

lat_key

latitude column number or column title, default ‘lat’

Type

str, int

lon_key

longitude column number or column title, default ‘lon’

Type

str, int

time_key

time column number or column title, default ‘time’

Type

str, int

alt_key

altitude column number or column title, default ‘alt’

Type

str, int

crs

Coordinate reference system of the coordinates

Type

str
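The `header` attribute and the `*_key` attributes above accept either a column number or a column title. The sketch below shows one way such a lookup could behave, using only the standard library. The helper `select_column` is hypothetical and is not part of podpac; it assumes that integer keys are positional and that string keys require a header row.

```python
import csv
import io

def select_column(text, key, header=0):
    """Return one column, addressed by title (str) or position (int).

    Hypothetical helper, not part of podpac; `header` is the row number
    holding the column names, or None when the file has no header row.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if header is None:
        names, body = None, rows
    else:
        names, body = rows[header], rows[header + 1:]
    if isinstance(key, int):
        index = key
    elif names is None:
        raise KeyError("column titles need a header row: %r" % (key,))
    else:
        index = names.index(key)
    return [row[index] for row in body]

text = "lat,lon,data\n45.0,-120.0,0.5\n46.0,-121.0,0.7\n"
lats_by_title = select_column(text, "lat")
lats_by_number = select_column(text, 0)
no_header = select_column("45.0,-120.0,0.5\n", 2, header=None)
```

With `header=None` only numeric keys make sense, since there are no titles to match against.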

Alternative Constructors

from_definition(definition)

Create podpac Node from a dictionary definition.

from_json(s)

Create podpac Node from a JSON definition.

Methods

__init__(**kwargs)

Do not overwrite me

close_dataset()

Close the dataset.

create_output_array(coords[, data])

Initialize an output data array

eval(coordinates[, output])

Evaluates this node using the supplied coordinates.

eval_group(group)

Evaluate the node for each of the coordinates in the group.

find_coordinates()

Get the available native coordinates for the Node.

from_url(url)

Create podpac Node from a WMS/WCS request.

get_alt()

Get altitude coordinates from the csv file.

get_cache(key[, coordinates])

Get cached data for this node.

get_data(coordinates, coordinates_index)

This method must be defined by the data source implementing the DataSource class.

get_lat()

Get latitude coordinates from the csv file.

get_lon()

Get longitude coordinates from the csv file.

get_native_coordinates()

Returns a Coordinates object that describes the native coordinates of the data source.

get_time()

Get time coordinates from the csv file.

has_cache(key[, coordinates])

Check for cached data for this node.

init()

load(path)

Create podpac Node from file.

put_cache(data, key[, coordinates, overwrite])

Cache data for this node.

rem_cache(key[, coordinates, mode])

Clear cached data for this node.

save(path)

Write node to file.

Attributes

alt_key

A trait type representing a Union type.

available_keys

available data keys

base_definition

Base node definition for DatasetSource nodes.

base_ref

Default reference/name in node definitions

cache_ctrl

A trait whose value must be an instance of a specified class.

cache_output

A boolean (True, False) trait.

cache_update

A boolean (True, False) trait.

cf_calendar

A trait for unicode strings.

cf_time

A boolean (True, False) trait.

cf_units

A trait for unicode strings.

coordinate_index_type

An enum whose value must be in a given sequence.

crs

A trait for unicode strings.

data_key

A trait type representing a Union type.

dataset

A trait whose value must be an instance of a specified class.

definition

Full node definition.

dims

dataset coordinate dims

dtype

A trait which allows any value.

hash

header

A trait which allows any value.

interpolation

A trait type representing a Union type.

interpolation_class

Get the interpolation class currently set for this data source.

interpolators

Return the interpolators selected for the previous node evaluation interpolation.

json

definition for this node in json format

json_pretty

lat_key

A trait type representing a Union type.

lon_key

A trait type representing a Union type.

nan_vals

An instance of a Python list.

native_coordinates

A trait whose value must be an instance of a specified class.

output

A trait for unicode strings.

output_keys

A trait type representing a Union type.

outputs

An instance of a Python list.

source

A trait for unicode strings.

style

A trait whose value must be an instance of a specified class.

time_key

A trait type representing a Union type.

units

A trait for unicode strings.

Members

__init__(**kwargs)

Do not overwrite me

alt_key

A trait type representing a Union type.

property available_keys

available data keys

property base_definition

Base node definition for DatasetSource nodes.

Returns

Dictionary containing the location of the Node, the name of the plugin (if required), as well as any parameters and attributes that were tagged by children.

Return type

OrderedDict

data_key

A trait type representing a Union type.

dataset

A trait whose value must be an instance of a specified class.

The value can also be an instance of a subclass of the specified class.

Subclasses can declare default classes by overriding the klass attribute

property dims

dataset coordinate dims

get_alt()[source]

Get altitude coordinates from the csv file.

get_data(coordinates, coordinates_index)[source]

This method must be defined by the data source implementing the DataSource class. When data source nodes are evaluated, this method is called with request coordinates and coordinate indexes. The implementing method can choose which input provides the most efficient method of getting data (i.e., via coordinates or via the index of the coordinates).

Coordinates and coordinate indexes may be strided or subsets of the source data, but all coordinates and coordinate indexes will match 1:1 with the subset data.

This method may return a numpy array, an xarray DataArray, or a podpac UnitsDataArray. If a numpy array or xarray DataArray is returned, podpac.data.DataSource.evaluate() will cast the data into a UnitsDataArray using the requested source coordinates. If a podpac UnitsDataArray is passed back, the podpac.data.DataSource.evaluate() method will not do any further processing. The inherited Node method create_output_array can be used to generate the template UnitsDataArray in your DataSource. See podpac.Node.create_output_array() for more details.

Parameters
  • coordinates (podpac.Coordinates) – The coordinates that need to be retrieved from the data source using the coordinate system of the data source

  • coordinates_index (List) – A list of slices or a boolean array that gives the indices of the data that needs to be retrieved from the data source. The values in coordinates_index will vary depending on the coordinate_index_type defined for the data source.

Returns

A subset of the returned data. If a numpy array or xarray DataArray is returned, the data will be cast into UnitsDataArray using the returned data to fill values at the requested source coordinates.

Return type

np.ndarray, xr.DataArray, podpac.UnitsDataArray
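To illustrate the two index forms named for `coordinates_index`, the sketch below applies either a slice or a boolean mask to a plain list. The helper `take` is hypothetical, covers only a single dimension, and uses lists rather than arrays; it is not podpac's implementation.

```python
def take(values, index):
    """Select items from `values` with a slice or a boolean mask.

    Hypothetical single-dimension helper, for illustration only.
    """
    if isinstance(index, slice):
        return values[index]
    if len(index) != len(values):
        raise ValueError("boolean mask must match the data length")
    return [v for v, keep in zip(values, index) if keep]

data = [0.5, 0.7, 0.9, 1.1]
strided = take(data, slice(0, 4, 2))               # every other row
masked = take(data, [True, False, False, True])    # first and last rows
```

In both cases the selected values line up 1:1 with the coordinates selected by the same index.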

get_lat()[source]

Get latitude coordinates from the csv file.

get_lon()[source]

Get longitude coordinates from the csv file.

get_native_coordinates()[source]

Returns a Coordinates object that describes the native coordinates of the data source.

In most cases, this method is defined by the data source implementing the DataSource class. If the method is not implemented by the data source, it will try to return self.native_coordinates if self.native_coordinates is not None.

Otherwise, this method will raise a NotImplementedError.

Returns

The coordinates describing the data source array.

Return type

podpac.Coordinates

Notes

Need to pay attention to:

  • the order of the dimensions
  • the stacking of the dimension
  • the type of coordinates

Coordinates should be non-nan and non-repeating for best compatibility

Note: CSV files have StackedCoordinates.
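A rough illustration of what stacking means for CSV data: each row contributes one point, so the latitude and longitude columns pair up element by element instead of forming a grid. The helper `stack_points` is hypothetical and uses only the standard library; it also checks the non-nan, non-repeating recommendation from the notes above.

```python
import math

def stack_points(lats, lons):
    """Pair lat/lon values row by row and validate them.

    Hypothetical helper, not part of podpac.
    """
    if len(lats) != len(lons):
        raise ValueError("stacked dimensions must have equal length")
    points = list(zip(lats, lons))
    if any(math.isnan(v) for point in points for v in point):
        raise ValueError("coordinates must not contain NaN")
    if len(set(points)) != len(points):
        raise ValueError("coordinates must not repeat")
    return points

points = stack_points([45.0, 46.0, 47.0], [-120.0, -121.0, -122.0])
# 3 stacked points, whereas an unstacked lat x lon grid would have 3 * 3 = 9.
```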

get_time()[source]

Get time coordinates from the csv file.

header

A trait which allows any value.

lat_key

A trait type representing a Union type.

lon_key

A trait type representing a Union type.

output_keys

A trait type representing a Union type.

time_key

A trait type representing a Union type.