`src.superphot_plus.format_data_ztf`

This script provides functions for importing, preprocessing, and manipulating data related to ZTF lightcurves.

Module Contents

Functions

`import_labels_only`(input_csvs, allowed_types[, ...])	Filters CSVs for rows where label is in allowed_types and returns names, labels, and redshifts.
`generate_K_fold`(features, classes, num_folds)	Generates a set of K test sets and their corresponding training sets.
`tally_each_class`(labels)	Prints the number of samples with each class label.
`oversample_using_posteriors`(lc_names, labels, ...[, ...])	Oversamples, drawing from posteriors of a certain fit.
`normalize_features`(features[, mean, std])	Normalizes the features for feeding into the neural network.
`oversample_smote`(features, labels)	Uses SMOTE to oversample data from rarer classes.

import_labels_only(input_csvs, allowed_types, fits_dir=None, needs_posteriors=True, sampler=None)[source]

Filters the input CSVs for rows whose label is in allowed_types and returns the matching names, labels, and redshifts.

Parameters:

input_csvs (list of str) – List of input CSV file paths.
allowed_types (list) – List of allowed types for labels.
fits_dir (str, optional) – Directory where fit parameters are stored. Defaults to None.
needs_posteriors (boolean, optional) – Indicates whether to load posterior samples. Defaults to True.
sampler (str, optional) – The sampler to get posteriors from. Defaults to None.

Returns:

Tuple of names, labels and redshifts.

Return type:

tuple of np.ndarray

Notes

Maps groups of similar labels to a single representative label name (e.g., “SN Ic”, “SNIc-BL”, and “21” all become “SN Ibc”).
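The grouping described in this note can be pictured as a lookup from alias to representative label. The sketch below is illustrative only: `LABEL_ALIASES` and `canonical_label` are hypothetical names, and the table holds just the one group quoted in the note, not the module's full mapping.

```python
# Hypothetical alias table: only the single group quoted in the note above.
LABEL_ALIASES = {
    "SN Ibc": ["SN Ic", "SNIc-BL", "21"],
}

# Invert the table so each alias points at its representative label.
CANONICAL = {
    alias: canon
    for canon, aliases in LABEL_ALIASES.items()
    for alias in aliases
}


def canonical_label(raw_label):
    """Return the representative label, or the cleaned input if it has no alias."""
    label = str(raw_label).strip()
    return CANONICAL.get(label, label)
```

Labels that do not appear in the table pass through unchanged, so filtering against allowed_types can happen after the mapping.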

generate_K_fold(features, classes, num_folds)[source]

Generates a set of K test sets and their corresponding training sets.

Parameters:

features (list) – Input features.
classes (list) – Input classes.
num_folds (int) – Number of folds. If -1, sets num_folds=len(features).

Returns:

Generator yielding the indices for training and test sets.

Return type:

generator
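To make the return value concrete, here is a minimal, dependency-free sketch of a generator that yields training and test indices, including the num_folds=-1 case (one fold per sample). The name `k_fold_indices` is hypothetical, and the sketch uses plain contiguous folds; whether `generate_K_fold` shuffles or stratifies by class is not stated above, so neither is assumed here.

```python
def k_fold_indices(n_samples, num_folds):
    """Yield (train_idx, test_idx) index lists for contiguous folds.

    Hypothetical helper: no shuffling and no class stratification.
    """
    if num_folds == -1:
        num_folds = n_samples  # one sample per test set (leave-one-out)
    if not 1 < num_folds <= n_samples:
        raise ValueError("num_folds must be between 2 and n_samples")

    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, num_folds)
    start = 0
    for fold in range(num_folds):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size
```

Each sample index appears in exactly one test set, and every training set is the complement of its test set.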

tally_each_class(labels)[source]

Prints the number of samples with each class label.

Parameters:

labels (list) – Input labels.
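A per-class tally of this kind can be sketched with `collections.Counter`. The helper name `tally` and the output format are assumptions; the documented function only states that it prints the counts, so returning them here is a convenience of the sketch.

```python
from collections import Counter


def tally(labels):
    """Print one line per class label and return the counts (hypothetical helper)."""
    counts = Counter(labels)
    for label, count in sorted(counts.items()):
        print(f"{label}: {count}")
    return counts
```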

oversample_using_posteriors(lc_names, labels, goal_per_class, fits_dir, sampler=None, redshifts=None, oversample_redshifts=False)[source]

Oversamples, drawing from posteriors of a certain fit.

Parameters:

lc_names (list of str) – Lightcurve names.
labels (list) – List of labels.
goal_per_class (int) – Number of samples per class.
fits_dir (str) – Where fit parameters are stored.
sampler (str, optional) – The name of the sampler to use.
redshifts (list, optional) – List of redshift values.
oversample_redshifts (boolean, optional) – Indicates whether to oversample redshifts.

Returns:

Tuple containing oversampled features, labels, and redshifts.

Return type:

tuple of np.ndarray
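One way to picture posterior-based oversampling is sketched below: for each class, repeatedly pick a light curve of that class and take one of its posterior draws as a new feature vector, until the class reaches goal_per_class. This is an assumption about the general idea, not the module's procedure. The function name, the in-memory `posteriors` dictionary (standing in for files under fits_dir), and the uniform random choice are all hypothetical, and redshift handling is omitted.

```python
import random


def oversample_from_posteriors(posteriors, labels, goal_per_class, seed=0):
    """Draw goal_per_class feature vectors per class from posterior samples.

    posteriors: dict mapping light curve name -> list of posterior draws
    labels:     dict mapping light curve name -> class label
    (Both structures are hypothetical stand-ins for data read from disk.)
    """
    rng = random.Random(seed)

    # Group light curve names by class label.
    names_by_class = {}
    for name, label in labels.items():
        names_by_class.setdefault(label, []).append(name)

    features, out_labels = [], []
    for label, names in names_by_class.items():
        for _ in range(goal_per_class):
            name = rng.choice(names)                      # pick a light curve
            features.append(rng.choice(posteriors[name])) # pick one posterior draw
            out_labels.append(label)
    return features, out_labels
```

Because every class ends with exactly goal_per_class entries, rarer classes are drawn from more heavily than common ones.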

normalize_features(features, mean=None, std=None)[source]

Normalizes the features for feeding into the neural network.

Parameters:

features (numpy array) – Input features. Must be a 2-d array where each row corresponds to a data point and each column to a feature.
mean (ndarray, optional) – Mean values for normalization. Defaults to None.
std (ndarray, optional) – Standard deviation values for normalization. Defaults to None.

Returns:

Tuple containing normalized features, mean values, and standard deviation values.

Return type:

tuple of np.ndarray
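The optional mean and std arguments suggest a common pattern: compute the statistics on the training set, then reuse them on held-out data. The plain-Python sketch below illustrates that pattern with per-feature z-scoring. It is an assumption that the module normalizes this way; the name `normalize` and the use of the population standard deviation are choices made for the example, and the documented function works on NumPy arrays rather than lists.

```python
from statistics import fmean, pstdev


def normalize(features, mean=None, std=None):
    """Z-score each feature column; compute mean/std only if not supplied.

    Hypothetical helper. A constant column has zero spread and would
    raise ZeroDivisionError here.
    """
    columns = list(zip(*features))
    if mean is None:
        mean = [fmean(col) for col in columns]
    if std is None:
        std = [pstdev(col) for col in columns]  # population standard deviation
    normalized = [
        [(x - m) / s for x, m, s in zip(row, mean, std)]
        for row in features
    ]
    return normalized, mean, std


# Fit the statistics on training data, then reuse them for test data.
train = [[1.0, 10.0], [3.0, 30.0]]
test = [[2.0, 40.0]]
train_norm, mean, std = normalize(train)
test_norm, _, _ = normalize(test, mean=mean, std=std)
```

Reusing the training statistics keeps the test set on the same scale as the data the network was trained on.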

oversample_smote(features, labels)[source]

Uses SMOTE to oversample data from rarer classes.
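For readers unfamiliar with SMOTE, its core step creates a synthetic sample by interpolating between a minority-class point and one of its nearest minority-class neighbors. The sketch below shows only that step in plain Python. The name `smote_like_sample` and its parameters are hypothetical, and it is not a description of how `oversample_smote` is implemented or which library it calls.

```python
import math
import random


def smote_like_sample(minority, k=2, rng=None):
    """Return one synthetic point between a minority sample and a near neighbor.

    Hypothetical helper; requires at least two minority points.
    """
    rng = rng or random.Random()
    base = rng.choice(minority)

    # Rank the other minority points by Euclidean distance to the base point.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: math.dist(base, p))
    neighbor = rng.choice(others[:k])

    # Interpolate: a random point on the segment from base to neighbor.
    u = rng.random()
    return [b + u * (n - b) for b, n in zip(base, neighbor)]
```

Because each synthetic point lies on a segment between two real minority samples, the new data stays inside the region the rarer class already occupies.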
