tape.analysis.base

Contains the base class for analysis functions.

Module Contents

Classes

AnalysisFunction

Base class for analysis functions.

class AnalysisFunction

Bases: abc.ABC, Callable

Base class for analysis functions.

Analysis functions take a few arrays representing a single object and return a single pandas.Series holding the result.

cols(ens) -> List[str]

Return the columns that the analysis function takes as input.

meta(ens) -> pd.DataFrame

Return the metadata pandas.DataFrame that Dask requires to pre-build a computation graph. It is effectively the schema of the __call__() output.

on(ens) -> List[str]

Return the columns to group the source table by. Typically, [ens._id_col].

__call__(*cols, **kwargs)

Calculate the analysis function.

abstract cols(ens: Ensemble) -> List[str]

Return the column names that the analysis function takes as input.

Parameters:

ens (Ensemble) – The ensemble object. It may be needed to look up the names of “special” columns such as ens._time_col or ens._err_col.

Returns:

The column names to select and pass to __call__(). For example, [ens._time_col, ens._flux_col].

Return type:

List[str]

abstract meta(ens: Ensemble)

Return the schema of the analysis function output.

Parameters:

ens (Ensemble) – The ensemble object.

Returns:

Dask meta, for example pd.DataFrame(columns=['x', 'y'], dtype=float).

Return type:

pd.DataFrame or (str, dtype) tuple or {str: dtype} dictionary
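
The three accepted meta forms can be illustrated as follows. This is a minimal sketch: the column names x and y and the variable names are placeholders chosen for illustration, not names taken from TAPE.

```python
import pandas as pd

# Form 1: an empty DataFrame whose columns and dtypes describe the output.
meta_frame = pd.DataFrame(columns=["x", "y"], dtype=float)

# Form 2: a (name, dtype) tuple, suitable when the output is a single series.
meta_tuple = ("x", float)

# Form 3: a {name: dtype} dictionary, one entry per output column.
meta_dict = {"x": float, "y": float}

# The empty frame carries no rows, only the schema.
print(meta_frame.dtypes)
```

The DataFrame form is the most explicit of the three, since it carries both the column order and the dtypes in one object.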

abstract on(ens: Ensemble) -> List[str]

Return the columns to group the source table by.

Parameters:

ens (Ensemble) – The ensemble object.

Returns:

The column names to group by. Typically, [ens._id_col].

Return type:

List[str]

abstract __call__(*cols, **kwargs)

Calculate the analysis function.

Parameters:
  • *cols (array_like) – The columns to calculate the analysis function on. It must be consistent with .cols(ens) output.

  • **kwargs – Additional keyword arguments.

Returns:

The result. It must be consistent with the .meta() output.

Return type:

pd.Series or pd.DataFrame or array or value
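
How the four abstract methods fit together can be sketched with a small subclass. The example below is illustrative only and rests on several assumptions: the AnalysisFunction defined here is a local stand-in that mirrors the documented interface so the snippet runs without TAPE installed, ToyEnsemble and its column names are hypothetical placeholders, and the grouping loop is a plain-Python imitation of "group by on(ens), pass cols(ens) to __call__", not the way TAPE or Dask actually dispatches the computation.

```python
from abc import ABC, abstractmethod
from collections import defaultdict
from typing import List


class AnalysisFunction(ABC):
    """Local stand-in mirroring the documented interface (not the TAPE class)."""

    @abstractmethod
    def cols(self, ens) -> List[str]: ...

    @abstractmethod
    def meta(self, ens): ...

    @abstractmethod
    def on(self, ens) -> List[str]: ...

    @abstractmethod
    def __call__(self, *cols, **kwargs): ...


class ToyEnsemble:
    """Hypothetical placeholder exposing only the special column names."""

    _id_col = "object_id"
    _time_col = "mjd"
    _flux_col = "flux"


class MeanFlux(AnalysisFunction):
    """Mean flux per object; returns a plain value."""

    def cols(self, ens):
        return [ens._flux_col]  # one input column, so __call__ takes one array

    def meta(self, ens):
        return ("mean_flux", float)  # (str, dtype) tuple form of the schema

    def on(self, ens):
        return [ens._id_col]  # one group per object

    def __call__(self, flux, **kwargs):
        return sum(flux) / len(flux)


ens = ToyEnsemble()
func = MeanFlux()

rows = [
    {"object_id": 1, "mjd": 0.0, "flux": 1.0},
    {"object_id": 1, "mjd": 1.0, "flux": 3.0},
    {"object_id": 2, "mjd": 0.0, "flux": 10.0},
]

# Group rows by on(ens) and collect one list per column named in cols(ens).
groups = defaultdict(lambda: [[] for _ in func.cols(ens)])
for row in rows:
    key = tuple(row[name] for name in func.on(ens))
    for values, name in zip(groups[key], func.cols(ens)):
        values.append(row[name])

# Call the analysis function once per group, columns in cols(ens) order.
results = {key: func(*columns) for key, columns in groups.items()}
print(results)  # {(1,): 2.0, (2,): 10.0}
```

The positional arguments of __call__ follow the order of the list returned by cols(ens), which is why the two must stay consistent with each other.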