Preprocessing

Scaling

aggregation

hydrobox.preprocessing.scale.aggregate(x, by, func='mean')[source]

Time series aggregation

This version of the function operates only on a single pandas.Series or pandas.DataFrame instance, which has to be indexed by a pandas.DatetimeIndex. The input data is aggregated to the given frequency by passing a string that conforms to pandas.Grouper and specifies the desired period, such as ‘1M’ for one month or ‘3Y-Sep’ for three years starting on the first of October.

Parameters:
x : pandas.Series, pandas.DataFrame

The input data, which will be aggregated over its index.

by : string

Specifies the desired temporal resolution. It is passed as the freq argument of a pandas.Grouper object, which groups the data into the new resolution. If by is None, the whole Series will be aggregated to only one value. The same applies to by='all'.

func : string

Function identifier used for aggregation. Has to be importable from numpy. The function must accept n input values and aggregate them to only a single one.

Returns:
pandas.Series :

if x was of type pandas.Series

pandas.DataFrame :

if x was of type pandas.DataFrame
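
The behaviour described above can be approximated with plain pandas and numpy. The following is a minimal sketch, not the hydrobox implementation: the name aggregate_sketch is hypothetical, and the handling of by=None for a DataFrame (one value per column) is an assumption, since the text above only describes the Series case.

```python
import numpy as np
import pandas as pd


def aggregate_sketch(x, by, func="mean"):
    """Hypothetical stand-in that mimics the documented behaviour."""
    if not isinstance(x.index, pd.DatetimeIndex):
        raise ValueError("x has to be indexed by a pandas.DatetimeIndex")

    # Look the function up in numpy by name, e.g. 'mean' -> np.mean.
    f = getattr(np, func)

    if by is None or by == "all":
        # Collapse everything to a single value.
        # Assumption: for a DataFrame, one value per column is returned.
        if isinstance(x, pd.DataFrame):
            return pd.Series({col: f(x[col].to_numpy()) for col in x.columns})
        return f(x.to_numpy())

    # Group by the requested frequency and reduce each group with f.
    return x.groupby(pd.Grouper(freq=by)).aggregate(lambda group: f(group))


# Two days of hourly values 0, 1, ..., 47.
hourly = pd.Series(
    np.arange(48, dtype=float),
    index=pd.date_range("2020-01-01", periods=48, freq="60min"),
)

daily = aggregate_sketch(hourly, "1D")   # daily means: 11.5 and 35.5
total = aggregate_sketch(hourly, None)   # overall mean: 23.5
```

The sketch uses a daily frequency because the set of accepted period aliases (for example ‘1M’) depends on the installed pandas version.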

cut_period

hydrobox.preprocessing.scale.cut_period(x, start, stop)[source]

Truncate time series

Truncates a pandas.Series or pandas.DataFrame to the given period. The start and stop parameters each need to be either a string or a datetime.datetime; strings are converted to datetime.datetime. Returns the truncated time series.

Parameters:
x : pandas.Series, pandas.DataFrame

The input data, which will be truncated.

start : string, datetime

Beginning of the truncation period. Can be a datetime.datetime or a string. If a string is passed, it has to use the format ‘YYYYMMDDhhmmss’, where the time component ‘hhmmss’ can be omitted.

stop : string, datetime

End of truncation. Can be a datetime.datetime or a string. If a string is passed, it has to use the format ‘YYYYMMDDhhmmss’, where the time component ‘hhmmss’ can be omitted.
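
The string format and the truncation step can be illustrated with the standard library and pandas. This is a rough sketch under stated assumptions, not the hydrobox code: parse_bound and cut_period_sketch are hypothetical names, and including both start and stop in the result is an assumption, because the text above does not say whether the bounds are inclusive.

```python
from datetime import datetime

import numpy as np
import pandas as pd


def parse_bound(value):
    """Convert 'YYYYMMDD' or 'YYYYMMDDhhmmss' to a datetime (hypothetical helper)."""
    if isinstance(value, datetime):
        return value
    # Eight characters mean the time component was omitted.
    fmt = "%Y%m%d%H%M%S" if len(value) > 8 else "%Y%m%d"
    return datetime.strptime(value, fmt)


def cut_period_sketch(x, start, stop):
    """Hypothetical stand-in: keep only the rows between start and stop."""
    # Assumption: label-based slicing, which includes both end points.
    return x.loc[parse_bound(start):parse_bound(stop)]


# Ten daily values 0, 1, ..., 9 starting on 2020-01-01.
series = pd.Series(
    np.arange(10),
    index=pd.date_range("2020-01-01", periods=10, freq="D"),
)

# Keeps 3, 4 and 5 January, i.e. the values 2, 3 and 4.
cut = cut_period_sketch(series, "20200103", "20200105120000")
```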