tape.analysis#
Subpackages#
Submodules#
Attributes#
Classes#
- AnalysisFunction: Base class for analysis functions.
- FeatureExtractor: Apply light-curve package feature extractor to a light curve.
- LightCurve: This base class is meant to support various analysis routines and be extended as needed.
- StetsonJ: Compute the StetsonJ statistic on data from one or several bands.
- StructureFunction2: Calculate structure function squared.
Package Contents#
- class AnalysisFunction[source]#
Bases:
abc.ABC, Callable
Base class for analysis functions.
Analysis functions are functions that take a few arrays representing an object and return a single pandas.Series representing the result.
- meta(ens) pd.DataFrame[source]#
Return the metadata pandas.DataFrame required by Dask to pre-build a computation graph. It is basically the schema for calculate() method output.
- abstract cols(ens: Ensemble) List[str][source]#
Return the column names that the analysis function takes as input.
- Parameters:
ens (Ensemble) – The ensemble object. It may be needed to look up the names of “special” columns such as ens._time_col or ens._err_col.
- Returns:
The column names to select and pass to .calculate() method. For example [ens._time_col, ens._flux_col].
- Return type:
List[str]
- abstract meta(ens: Ensemble)[source]#
Return the schema of the analysis function output.
- Parameters:
ens (Ensemble) – The ensemble object.
- Returns:
Dask meta, for example pd.DataFrame(columns=[‘x’, ‘y’], dtype=float).
- Return type:
pd.DataFrame or (str, dtype) tuple or {str: dtype} dictionary
- abstract on(ens: Ensemble) List[str][source]#
Return the columns to group source table by.
- Parameters:
ens (Ensemble) – The ensemble object.
- Returns:
The column names to group by. Typically, [ens._id_col].
- Return type:
List[str]
- abstract __call__(*cols, **kwargs)[source]#
Calculate the analysis function.
- Parameters:
*cols (array_like) – The columns to calculate the analysis function on. It must be consistent with .cols(ens) output.
**kwargs – Additional keyword arguments.
- Returns:
The result. It must be consistent with the .meta() output.
- Return type:
pd.Series or pd.DataFrame or array or value
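The four abstract methods above fit together roughly as follows. This is a minimal, standard-library sketch that mirrors the documented interface with a local stand-in class. `AnalysisFunctionSketch`, `MeanFlux`, and the `SimpleNamespace` ensemble are hypothetical names used for illustration only; they are not part of tape.

```python
from abc import ABC, abstractmethod
from types import SimpleNamespace
from typing import List


class AnalysisFunctionSketch(ABC):
    """Local stand-in mirroring the documented interface (not tape's class)."""

    @abstractmethod
    def cols(self, ens) -> List[str]: ...

    @abstractmethod
    def meta(self, ens): ...

    @abstractmethod
    def on(self, ens) -> List[str]: ...

    @abstractmethod
    def __call__(self, *cols, **kwargs): ...


class MeanFlux(AnalysisFunctionSketch):
    """Hypothetical analysis function: mean flux of one object."""

    def cols(self, ens) -> List[str]:
        # Only the flux column is selected and passed to __call__.
        return [ens._flux_col]

    def meta(self, ens):
        # Output schema, written in the {str: dtype} dictionary form.
        return {"mean_flux": float}

    def on(self, ens) -> List[str]:
        # Group by object id, so there is one result per object.
        return [ens._id_col]

    def __call__(self, flux, **kwargs):
        # Receives one array per name returned by cols().
        return sum(flux) / len(flux)


# Stand-in for an Ensemble, exposing only the attributes used above.
ens = SimpleNamespace(_id_col="object_id", _flux_col="flux")
func = MeanFlux()
result = func([1.0, 2.0, 3.0])
```

The point of the sketch is the contract: the arguments of `__call__` line up with `cols()`, and the returned value matches what `meta()` declares.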
- class FeatureExtractor(feature: light_curve.light_curve_ext._FeatureEvaluator)[source]#
Bases:
tape.analysis.base.AnalysisFunction
Apply light-curve package feature extractor to a light curve
- Parameters:
feature (light_curve.light_curve_ext._FeatureEvaluator) – Feature extractor to apply, see “light-curve” package for more details.
- feature#
Feature extractor to apply, see “light-curve” package for more details.
- Type:
light_curve.light_curve_ext._FeatureEvaluator
- cols(ens: Ensemble) List[str][source]#
Return the column names that the analysis function takes as input.
- Parameters:
ens (Ensemble) – The ensemble object. It may be needed to look up the names of “special” columns such as ens._time_col or ens._err_col.
- Returns:
The column names to select and pass to .calculate() method. For example [ens._time_col, ens._flux_col].
- Return type:
List[str]
- meta(ens: Ensemble) pandas.DataFrame[source]#
Return the schema of the analysis function output.
It always returns a pandas.DataFrame with the same columns as self.feature.names and dtype np.float64. However, if all input columns are single-precision floats, the output dtype will be np.float32.
- on(ens: Ensemble) List[str][source]#
Return the columns to group source table by.
- Parameters:
ens (Ensemble) – The ensemble object.
- Returns:
The column names to group by. Typically, [ens._id_col].
- Return type:
List[str]
- __call__(time, flux, err, band, *, band_to_calc: str, **kwargs) pandas.DataFrame[source]#
Apply a feature extractor to a light curve, concatenating the results over all bands.
- Parameters:
time (numpy.ndarray) – Time values
flux (numpy.ndarray) – Brightness values, flux or magnitudes
err (numpy.ndarray) – Errors for “flux”
band (numpy.ndarray) – Passband names.
band_to_calc (str or int or None) – Name of the passband to calculate features for, usually a string like “g” or “r”, or an integer. If None, features are calculated for all sources and band is ignored.
**kwargs (dict) – Additional keyword arguments to pass to the feature extractor.
- Returns:
features – Feature values for each band. The dtype is the common type of the input arrays.
- Return type:
pandas.DataFrame
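The band_to_calc behaviour described above can be sketched without the light-curve package. In this minimal sketch, `apply_feature`, `toy_feature`, and the feature names are hypothetical stand-ins. A real `_FeatureEvaluator` and the actual layout of the returned pandas.DataFrame may differ.

```python
from typing import Callable, Dict, Optional, Sequence


def apply_feature(
    feature: Callable[..., Sequence[float]],
    names: Sequence[str],
    time: Sequence[float],
    flux: Sequence[float],
    err: Sequence[float],
    band: Sequence[str],
    *,
    band_to_calc: Optional[str] = None,
) -> Dict[str, float]:
    """Apply `feature` to one light curve, optionally restricted to one band."""
    if band_to_calc is not None:
        # Keep only the observations taken in the requested passband.
        keep = [b == band_to_calc for b in band]
        time = [t for t, k in zip(time, keep) if k]
        flux = [f for f, k in zip(flux, keep) if k]
        err = [e for e, k in zip(err, keep) if k]
    # With band_to_calc=None every observation is used and `band` is ignored.
    return dict(zip(names, feature(time, flux, err)))


def toy_feature(time, flux, err):
    """Hypothetical two-value feature: half-amplitude and mean of the flux."""
    return [(max(flux) - min(flux)) / 2, sum(flux) / len(flux)]


time = [0.0, 1.0, 2.0, 3.0]
flux = [1.0, 10.0, 3.0, 20.0]
err = [0.1, 0.1, 0.1, 0.1]
band = ["g", "r", "g", "r"]
names = ["amplitude", "mean"]

g_only = apply_feature(toy_feature, names, time, flux, err, band, band_to_calc="g")
all_bands = apply_feature(toy_feature, names, time, flux, err, band)
```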
- class LightCurve(times: numpy.ndarray, fluxes: numpy.ndarray, errors: numpy.ndarray, minimum_observations: int = 0)[source]#
This base class is meant to support various analysis routines and be extended as needed (hence its location in the analysis package).
The base class ensures that the data for a single lightcurve is well formed: the input arrays all have the same length, NaNs are removed, and there are enough observations to perform a given analysis.
- _times#
- _fluxes#
- _errors#
- _minimum_observations#
- _process_input_data()[source]#
Cleaning and validation occur here, ideally by calling sub-methods for specific checks and cleaning tasks.
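The checks described for this class (equal lengths, NaN removal, a minimum number of observations) can be sketched as below. `clean_light_curve` is a hypothetical helper written with the standard library only. It is not the code behind `_process_input_data()`, and raising `ValueError` on failure is an assumption.

```python
import math
from typing import List, Sequence, Tuple


def clean_light_curve(
    times: Sequence[float],
    fluxes: Sequence[float],
    errors: Sequence[float],
    minimum_observations: int = 0,
) -> Tuple[List[float], List[float], List[float]]:
    """Validate and clean the arrays describing a single light curve."""
    # 1. All inputs must describe the same observations.
    if not (len(times) == len(fluxes) == len(errors)):
        raise ValueError("times, fluxes, and errors must have the same length")

    # 2. Drop every observation in which any of the three values is NaN.
    rows = [
        (t, f, e)
        for t, f, e in zip(times, fluxes, errors)
        if not (math.isnan(t) or math.isnan(f) or math.isnan(e))
    ]

    # 3. Check that enough observations survive the cleaning.
    if len(rows) < minimum_observations:
        raise ValueError(
            f"need at least {minimum_observations} observations, got {len(rows)}"
        )

    return [r[0] for r in rows], [r[1] for r in rows], [r[2] for r in rows]


times, fluxes, errors = clean_light_curve(
    [0.0, 1.0, 2.0, 3.0],
    [10.0, float("nan"), 12.0, 13.0],
    [0.1, 0.1, float("nan"), 0.1],
    minimum_observations=2,
)
```

Counting observations after the NaN filter, rather than before, is a design choice in this sketch: the minimum then refers to usable data points.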
- class StetsonJ[source]#
Bases:
tape.analysis.base.AnalysisFunction
Compute the StetsonJ statistic on data from one or several bands
- cols(ens: Ensemble) List[str][source]#
Return the column names that the analysis function takes as input.
- Parameters:
ens (Ensemble) – The ensemble object. It may be needed to look up the names of “special” columns such as ens._time_col or ens._err_col.
- Returns:
The column names to select and pass to .calculate() method. For example [ens._time_col, ens._flux_col].
- Return type:
List[str]
- meta(ens: Ensemble)[source]#
Return the schema of the analysis function output.
- Parameters:
ens (Ensemble) – The ensemble object.
- Returns:
Dask meta, for example pd.DataFrame(columns=[‘x’, ‘y’], dtype=float).
- Return type:
pd.DataFrame or (str, dtype) tuple or {str: dtype} dictionary
- on(ens: Ensemble) List[str][source]#
Return the columns to group source table by.
- Parameters:
ens (Ensemble) – The ensemble object.
- Returns:
The column names to group by. Typically, [ens._id_col].
- Return type:
List[str]
- __call__(flux: numpy.ndarray, err: numpy.ndarray, band: numpy.ndarray, *, band_to_calc: str | Iterable[str] | None = None, check_nans: bool = False)[source]#
Compute the StetsonJ statistic on data from one or several bands
- Parameters:
flux (numpy.ndarray (N,)) – Array of flux/magnitude measurements
err (numpy.ndarray (N,)) – Array of associated flux/magnitude errors
band (numpy.ndarray (N,)) – Array of associated band labels
band_to_calc (str or list of str, optional) – Bands to calculate StetsonJ on: a single band descriptor or a list of such descriptors. By default None.
check_nans (bool) – Boolean to run a check for NaN values and filter them out.
- Returns:
stetsonJ – StetsonJ statistic for each of input bands.
- Return type:
dict
Note
If no value is passed for band_to_calc, the function is executed on all bands present in band.
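For orientation, the per-band calculation can be sketched with the single-observation form of the Stetson J index: with normalized residuals δ = sqrt(N/(N-1)) (flux - mean)/err and P = δ² - 1, J is the mean of sgn(P) sqrt(|P|). The sketch below assumes a plain inverse-variance weighted mean and unit weights. tape's implementation may use a different mean estimator and NaN handling, and `stetson_j_single` and `stetson_j` are hypothetical names.

```python
import math
from typing import Dict, Iterable, Optional, Sequence, Union


def stetson_j_single(flux: Sequence[float], err: Sequence[float]) -> float:
    """Single-band Stetson J from individual observations (unit weights)."""
    n = len(flux)
    if n < 2:
        return float("nan")  # the sqrt(n / (n - 1)) factor is undefined
    # Assumption: inverse-variance weighted mean as the reference level.
    weights = [1.0 / e**2 for e in err]
    mean = sum(w * f for w, f in zip(weights, flux)) / sum(weights)
    scale = math.sqrt(n / (n - 1))
    total = 0.0
    for f, e in zip(flux, err):
        delta = scale * (f - mean) / e      # normalized residual
        p = delta * delta - 1.0             # single-observation P_k
        total += math.copysign(math.sqrt(abs(p)), p)
    return total / n


def stetson_j(
    flux: Sequence[float],
    err: Sequence[float],
    band: Sequence[str],
    *,
    band_to_calc: Optional[Union[str, Iterable[str]]] = None,
) -> Dict[str, float]:
    """Return {band: J}; all bands present in `band` are used by default."""
    if band_to_calc is None:
        bands = sorted(set(band))
    elif isinstance(band_to_calc, str):
        bands = [band_to_calc]
    else:
        bands = list(band_to_calc)
    result = {}
    for b in bands:
        idx = [i for i, x in enumerate(band) if x == b]
        result[b] = stetson_j_single([flux[i] for i in idx], [err[i] for i in idx])
    return result


flux = [5.0, 5.0, 5.0, 1.0, 2.0]
err = [0.5, 0.5, 0.5, 0.5, 0.5]
band = ["g", "g", "g", "r", "r"]
per_band = stetson_j(flux, err, band)
g_only = stetson_j(flux, err, band, band_to_calc="g")
```

In this form a perfectly constant light curve has P = -1 for every point, so J is -1, while scatter larger than the quoted errors pushes J positive.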
- class StructureFunction2[source]#
Bases:
tape.analysis.base.AnalysisFunction
Calculate structure function squared
- cols(ens: Ensemble) List[str][source]#
Return the column names that the analysis function takes as input.
- Parameters:
ens (Ensemble) – The ensemble object. It may be needed to look up the names of “special” columns such as ens._time_col or ens._err_col.
- Returns:
The column names to select and pass to .calculate() method. For example [ens._time_col, ens._flux_col].
- Return type:
List[str]
- meta(ens: Ensemble) Dict[str, type][source]#
Return the schema of the analysis function output.
- Parameters:
ens (Ensemble) – The ensemble object.
- Returns:
Dask meta, for example pd.DataFrame(columns=[‘x’, ‘y’], dtype=float).
- Return type:
pd.DataFrame or (str, dtype) tuple or {str: dtype} dictionary
- on(ens: Ensemble) List[str][source]#
Return the columns to group source table by.
- Parameters:
ens (Ensemble) – The ensemble object.
- Returns:
The column names to group by. Typically, [ens._id_col].
- Return type:
List[str]
- __call__(time, flux, err=None, band=None, lc_id=None, *, sf_method='basic', argument_container=None) pandas.DataFrame[source]#
Calculate structure function squared using one of a variety of structure function calculation methods defined by the input argument sf_method, or in the argument container object.
- Parameters:
time (numpy.ndarray (N,) or None) – Array of times when measurements were taken. If all array values are None or if a scalar None is provided, then equidistant time between measurements is assumed.
flux (numpy.ndarray (N,)) – Array of flux/magnitude measurements.
err (numpy.ndarray (N,), float, or None, optional) – Array of associated flux/magnitude errors. If a scalar value is provided we assume that error for all measurements. If None is provided, we assume all errors are 0. By default None
band (numpy.ndarray (N,), optional) – Array of associated band labels, by default None
lc_id (numpy.ndarray (N,), optional) – Array of lightcurve ids per data point. By default None
sf_method (str, optional) – The structure function calculation method to be used, by default “basic”.
argument_container (StructureFunctionArgumentContainer, optional) – Container object for additional configuration options, by default None.
- Returns:
sf2 – Structure function squared for each of input bands.
- Return type:
pandas.DataFrame
Notes
If no value is passed for band_to_calc, the function is executed on all bands present in band.
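As a rough illustration of what a structure function squared measures, the sketch below bins all pairs of observations by time lag and averages the squared flux difference, minus the summed squared errors of the pair. That noise correction is one common convention and an assumption here. This is not the code behind the “basic” method, and `structure_function2` and `bin_edges` are hypothetical names. The handling of `time=None` and scalar or missing `err` follows the parameter descriptions above.

```python
from typing import List, Optional, Sequence, Tuple, Union


def structure_function2(
    time: Optional[Sequence[Optional[float]]],
    flux: Sequence[float],
    err: Union[Sequence[float], float, None] = None,
    *,
    bin_edges: Sequence[float],
) -> List[Tuple[float, float]]:
    """Return (lag bin center, SF^2) for every lag bin that contains pairs."""
    n = len(flux)
    # No usable times: assume equidistant sampling with unit spacing.
    if time is None or all(t is None for t in time):
        time = [float(i) for i in range(n)]
    # Missing errors count as zero; a scalar applies to every measurement.
    if err is None:
        err = [0.0] * n
    elif isinstance(err, (int, float)):
        err = [float(err)] * n

    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(n):
        for j in range(i + 1, n):
            lag = abs(time[j] - time[i])
            # Squared difference, corrected for measurement noise (assumption).
            value = (flux[j] - flux[i]) ** 2 - (err[i] ** 2 + err[j] ** 2)
            for k in range(n_bins):
                if bin_edges[k] <= lag < bin_edges[k + 1]:
                    sums[k] += value
                    counts[k] += 1
                    break

    return [
        (0.5 * (bin_edges[k] + bin_edges[k + 1]), sums[k] / counts[k])
        for k in range(n_bins)
        if counts[k] > 0
    ]


edges = [0.5, 1.5, 2.5, 3.5]
sf2_no_err = structure_function2(None, [0.0, 1.0, 2.0, 3.0], bin_edges=edges)
sf2_scalar_err = structure_function2(None, [0.0, 1.0, 2.0, 3.0], 0.5, bin_edges=edges)
```

The pairwise loop is quadratic in the number of observations, which keeps the sketch readable but would be slow for long light curves.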