cycad module

class cycad.cycad

Bases: object

The cycad object contains data and methods pertaining to an in situ dataset.

autocorrelate(samples=None, bkg_subtract=False)

Apply autocorrelation to self.df to generate a correlation matrix. The result is stored in self.correlation_matrix.

Parameters:

samples (int) – Number of samples to use for the downsampled output
bkg_subtract (bool) – Whether to apply baseline correction to the data
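As a rough illustration of what a scan-by-scan correlation matrix looks like, the sketch below computes Pearson correlations between the columns of a small synthetic array. It assumes each column is one scan on a shared x-grid; the function name, the synthetic data, and the choice of Pearson correlation are assumptions made for this example and are not taken from the cycad source.

```python
import numpy as np

def correlation_matrix_sketch(intensities):
    """Pearson correlation between every pair of scans (columns)."""
    data = np.asarray(intensities, dtype=float)
    # np.corrcoef treats rows as variables, so transpose to compare columns.
    return np.corrcoef(data.T)

# Three synthetic scans sampled on the same x-grid (made-up numbers).
x = np.linspace(0.0, 1.0, 50)
scans = np.column_stack([np.sin(6 * x), np.sin(6 * x) + 0.1 * x, np.cos(6 * x)])
corr = correlation_matrix_sketch(scans)
print(corr.shape)  # (3, 3)
```

The result is square and symmetric with ones on the diagonal, so similar scans show up as bright off-diagonal blocks when the matrix is plotted.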

autocorrelate_ec()

Generate a distance matrix from the one-dimensional array stored in self.df_echem. This would usually be the cycling potentials.
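A distance matrix over a one-dimensional array can be sketched with NumPy broadcasting. The example assumes the distance is the absolute difference between every pair of potentials; the metric and the function name are assumptions for illustration, since the documentation does not state which metric is used.

```python
import numpy as np

def voltage_distance_matrix(potentials):
    """Pairwise absolute differences of a 1-D array of potentials."""
    v = np.asarray(potentials, dtype=float)
    # Broadcasting a column against a row yields an (n, n) matrix.
    return np.abs(v[:, None] - v[None, :])

# Made-up potentials in volts.
distances = voltage_distance_matrix([3.0, 3.5, 4.2])
print(distances.shape)  # (3, 3)
```

Entries close to zero mark pairs of points in the cycle that sit at nearly the same potential.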

# TODO: add in data file reading

static baseline_arPLS(y, ratio=1e-06, lam=100, niter=10, full_output=False)

Baseline correction routine

Parameters: y (numpy.ndarray) – Data to be baseline corrected
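The method name suggests arPLS (asymmetrically reweighted penalized least squares). The sketch below is a minimal dense-matrix version of that general scheme, reusing the ratio, lam and niter defaults from the signature above. It is an illustration under those assumptions, not the cycad implementation: the function name, the dense solver, the clipping of the exponent, and the early-exit guards are choices made for this example.

```python
import numpy as np

def arpls_baseline_sketch(y, ratio=1e-6, lam=100, niter=10):
    """Estimate a smooth baseline under y by iterative reweighting."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-order difference operator, shape (n - 2, n).
    D = np.diff(np.eye(n), n=2, axis=0)
    H = lam * D.T @ D  # smoothness penalty
    w = np.ones(n)
    z = y.copy()
    for _ in range(niter):
        # Weighted penalized least squares: (W + H) z = W y.
        z = np.linalg.solve(np.diag(w) + H, w * y)
        d = y - z
        neg = d[d < 0]
        if neg.size < 2 or neg.std() == 0:
            break  # nothing left to reweight against
        m, s = neg.mean(), neg.std()
        # Logistic weights: points well above the baseline get weight near 0.
        exponent = np.clip(2.0 * (d - (2.0 * s - m)) / s, -500.0, 500.0)
        w_new = 1.0 / (1.0 + np.exp(exponent))
        if np.linalg.norm(w_new - w) / np.linalg.norm(w) < ratio:
            break
        w = w_new
    return z

# Synthetic signal: sloping background plus one narrow peak.
x = np.linspace(0.0, 1.0, 200)
signal = 2.0 * x + np.exp(-((x - 0.5) / 0.02) ** 2)
baseline = arpls_baseline_sketch(signal)
corrected = signal - baseline
```

Subtracting the returned baseline leaves the peak on an approximately flat background. Larger lam values give a stiffer baseline.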

bkg_subtract(df)

static get_skip(file, encoding='utf-8')

Get the number of rows to skip in an mpt file

Parameters:

file (str) – Path to an mpt file
encoding (str) – Encoding of the mpt file

Returns:

Number of rows to skip

Return type:

int
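One way to find the header length is to read the count that the file itself declares. The sketch below assumes the mpt file follows the common EC-Lab ASCII layout, in which an early line reads "Nb header lines : N". The function name is hypothetical, and whether cycad reads this line or counts rows another way is not stated in the documentation.

```python
import re

def header_line_count(path, encoding="utf-8"):
    """Return the header length declared in an mpt-style text file."""
    pattern = re.compile(r"Nb header lines\s*:\s*(\d+)")
    with open(path, encoding=encoding, errors="replace") as handle:
        for line in handle:
            match = pattern.search(line)
            if match:
                return int(match.group(1))
    # No declaration found: assume there is nothing to skip.
    return 0
```

Depending on whether the column-name row should be kept as a header, a caller may need to skip one row fewer than the declared count.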

static natural_sort(l)

Sort a list of strings in a human friendly way.

Parameters: l (list) – list of strings to be sorted
Returns: sorted list of strings
Return type: list
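Human-friendly ordering means that embedded numbers are compared by value, so "scan2" sorts before "scan10". A minimal sketch of that idea, using a hypothetical function name and not necessarily matching the implementation in cycad:

```python
import re

def natural_sort_sketch(names):
    """Sort strings so that embedded integers compare numerically."""
    def key(name):
        # Split into alternating text and digit runs, e.g. ["scan", "10", ""].
        parts = re.split(r"(\d+)", name)
        return [int(p) if p.isdigit() else p.lower() for p in parts]
    return sorted(names, key=key)

print(natural_sort_sketch(["scan10", "scan2", "scan1"]))
# ['scan1', 'scan2', 'scan10']
```

A plain sorted() call would place "scan10" before "scan2", which scrambles the order of sequentially numbered scans.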

static parse_filename(filename)

Parse a file name to get the sample name. Override this method for specific data if you want to use the dataframe directly.

Parameters: filename (str) – string containing a file name
Returns: string containing the sample name
Return type: str

plot(qmin=0.15, qmax=0.9, echem=False, echem_alpha=0.2, echem_quantile=0.2, save=False, filename=None)

Plot the full correlation matrix and the components

Parameters:

qmin (float) – Minimum quantile to use for the color scale
qmax (float) – Maximum quantile to use for the color scale
echem (bool) – Whether to plot the echem data
echem_alpha (float) – Opacity of the echem overlay
echem_quantile (float) – Quantile to use for the echem overlay (how close should the voltages be in the highlighted region)
save (bool) – Whether to save the figure
filename (str) – Filename to save the figure as
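The qmin and qmax parameters can be read as quantiles of the matrix values that set the ends of the color scale. The sketch below shows one way to turn such quantiles into color limits with NumPy. It is an assumption about the general approach, not a copy of the plotting code, and the function name is hypothetical.

```python
import numpy as np

def quantile_color_limits(matrix, qmin=0.15, qmax=0.9):
    """Return (vmin, vmax) taken from quantiles of all matrix values."""
    values = np.asarray(matrix, dtype=float)
    vmin, vmax = np.quantile(values, [qmin, qmax])
    return float(vmin), float(vmax)

# Made-up data: the integers 0..100.
vmin, vmax = quantile_color_limits(np.arange(101.0))
```

The two limits could then be passed as vmin and vmax to a plotting call such as matplotlib's imshow, which clips outliers at both ends of the scale.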

read_data(parse_names=False)

Read data from a list of files found by read_folder. Takes the first column of the first file as the x-values. Filetype is specified by self.read_folder().

The delimiter for csv files will be detected automatically.
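Automatic delimiter detection can be sketched with the standard library's csv.Sniffer. Whether cycad uses this class or another mechanism is not stated here; the function name and the candidate delimiter set are assumptions for this example.

```python
import csv

def detect_delimiter(sample, candidates=",;\t "):
    """Guess the column delimiter from a text sample of the file."""
    # Sniffer raises csv.Error if no consistent delimiter is found.
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter

tab_sample = "10.0\t1.5\n10.1\t1.7\n10.2\t2.1\n"
comma_sample = "10.0,1.5\n10.1,1.7\n10.2,2.1\n"
print(repr(detect_delimiter(tab_sample)))    # '\t'
print(repr(detect_delimiter(comma_sample)))  # ','
```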

Parameters: parse_names (bool) – If True, parse file names for column headings

read_data_csv(path)

Read a dataset to self.df from a single csv file

Parameters: path (str) – Path to the csv file

read_echem_df(df)

Read cycling data from a single-column dataframe or series. If this is called multiple times, the data will be concatenated. The data are resampled to the size of the data dataframe in self.autocorrelate_ec().

Parameters: df (pandas.DataFrame or pandas.Series) – Dataframe or series to read
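Resampling the cycling data to the size of the scan data means mapping a series of arbitrary length onto a fixed number of points. A minimal sketch using linear interpolation follows. The interpolation method and the function name are assumptions, since the documentation does not say how the resampling is done.

```python
import numpy as np

def resample_to_length(values, n_points):
    """Linearly interpolate a 1-D series onto n_points evenly spaced samples."""
    values = np.asarray(values, dtype=float)
    old_positions = np.linspace(0.0, 1.0, values.size)
    new_positions = np.linspace(0.0, 1.0, n_points)
    return np.interp(new_positions, old_positions, values)

# Five made-up voltage readings reduced to three points.
resampled = resample_to_length([0.0, 1.0, 2.0, 3.0, 4.0], 3)
```

The first and last readings are preserved, and the points in between are spread evenly over the original series.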

read_echem_mpt(path, decimal=',')

Read cycling data from an echem mpt file. If this is called multiple times, the data will be concatenated. The data are resampled to the size of the data dataframe in self.autocorrelate_ec().

Parameters:

path (str) – Path to the mpt file
decimal (str) – Decimal separator in the mpt file

read_folder(root, filetype)

Read a sorted list of files in a folder to self.files. The path is stored as self.root. The lowest level folder name is stored as self.runname and used as the run name.

The supported file types are currently HDF (h5, hdf5) and csv-like (‘csv’, ‘txt’, ‘xy’, ‘xye’, ‘tsv’). The file type is stored in self.filetype.

Note: HDF files must currently contain individual scans with I and 2th keys.

Parameters:

root (str) – Path to a folder containing data files.
filetype (str) – Filetype of the files.
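The behaviour described above (collect the matching files in sorted order and take the run name from the lowest-level folder) can be sketched with pathlib. This is a standalone illustration with a hypothetical function name. It uses plain alphabetical sorting and does not reproduce cycad's handling of the HDF and csv type groups.

```python
from pathlib import Path

def list_data_files(root, filetype):
    """Return (sorted file paths, run name) for one folder and extension."""
    folder = Path(root)
    suffix = "." + filetype.lstrip(".").lower()
    files = sorted(
        str(p) for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    )
    # The run name is the name of the lowest-level folder in the path.
    return files, folder.resolve().name
```

For a folder such as data/run_A, the returned run name would be "run_A". A natural sort could be swapped in for sorted() when file names contain scan numbers.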