Preprocessing

Scaling

aggregation

hydrobox.preprocessing.scale.aggregate(x, by, func='mean')

Time series aggregation

This version of the function operates only on a single pandas.Series or pandas.DataFrame instance, which has to be indexed by a pandas.DatetimeIndex. The input data is aggregated to the given frequency, specified as a frequency string accepted by pandas.Grouper, such as ‘1M’ for one month or ‘3Y-Sep’ for three years starting on the first of October.

Parameters:

x : pandas.Series, pandas.DataFrame: The input data, which will be aggregated over the index.
by : string: Specifies the desired temporal resolution. It is passed as the freq argument of a pandas.Grouper object, which groups the data into the new resolution. If by is None or 'all', the whole Series is aggregated to a single value.
func : string: Identifier of the function used for aggregation. It has to be importable from numpy. The function must accept n input values and reduce them to a single value.

Returns:

pandas.Series :: if x was of type pandas.Series
pandas.DataFrame :: if x was of type pandas.DataFrame
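The grouping described above can be illustrated with plain pandas and numpy. This is a minimal sketch of the documented behaviour, not hydrobox's actual source: the sample data is made up, and the lookup of the function name via getattr is an assumption about how a string identifier such as 'mean' might be resolved.

```python
import numpy as np
import pandas as pd

# Two days of hourly values 0..47 (made-up data for illustration).
index = pd.date_range("2020-01-01", periods=48, freq="60min")
series = pd.Series(np.arange(48, dtype=float), index=index)

# Resolve the string identifier to a numpy function (assumed mechanism).
func = getattr(np, "mean")

# Group the DatetimeIndex into daily bins and reduce each bin to one value.
daily = series.groupby(pd.Grouper(freq="1D")).aggregate(func)

print(daily)
```

Each daily bin holds 24 hourly values, so the result contains one aggregated value per day. With hydrobox installed, the equivalent call would presumably be aggregate(series, by='1D', func='mean').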

cut_period

hydrobox.preprocessing.scale.cut_period(x, start, stop)

Truncate time series

Truncates a pandas.Series or pandas.DataFrame to the given period. The start and stop parameters each have to be either a string or a datetime.datetime; strings are converted to datetime.datetime. Returns the truncated time series.

Parameters:

x : pandas.Series, pandas.DataFrame: The input data, which will be truncated.
start : string, datetime: Start of the truncation period. Can be a datetime.datetime or a string. If a string is passed, it has to use the format ‘YYYYMMDDhhmmss’, where the time component ‘hhmmss’ can be omitted.
stop : string, datetime: End of the truncation period. Can be a datetime.datetime or a string. If a string is passed, it has to use the format ‘YYYYMMDDhhmmss’, where the time component ‘hhmmss’ can be omitted.
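The string format and the truncation step can be sketched as follows. This is an illustration under stated assumptions, not hydrobox's implementation: parse_timestamp is a hypothetical helper, the sample data is made up, and the label-based slice used here includes both end points, which may differ from how cut_period treats its boundaries.

```python
from datetime import datetime

import numpy as np
import pandas as pd


def parse_timestamp(value):
    """Hypothetical helper: accept a datetime or a 'YYYYMMDD[hhmmss]' string."""
    if isinstance(value, datetime):
        return value
    # 14 characters means the time component 'hhmmss' is present.
    fmt = "%Y%m%d%H%M%S" if len(value) == 14 else "%Y%m%d"
    return datetime.strptime(value, fmt)


# Ten daily values 0..9 starting on 2020-01-01 (made-up data).
index = pd.date_range("2020-01-01", periods=10, freq="D")
series = pd.Series(np.arange(10, dtype=float), index=index)

start = parse_timestamp("20200103")        # time component omitted
stop = parse_timestamp("20200105120000")   # full format with time

# Label-based slicing on a DatetimeIndex keeps both end points.
truncated = series.loc[start:stop]

print(truncated)
```

Here the slice keeps the entries for 3, 4 and 5 January, because the stop timestamp falls at noon on 5 January.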