larvaworld.lib.process.dataset

Basic classes for larvaworld-format datasets

Classes

DatasetConfig

The configuration of a LarvaDataset.

ParamLarvaDataset

Base class for named objects that support Parameters and message formatting.

BaseLarvaDataset

Base class for named objects that support Parameters and message formatting.

LarvaDataset

Base class for named objects that support Parameters and message formatting.

LarvaDatasetCollection

Module Contents

class larvaworld.lib.process.dataset.DatasetConfig(**kwargs: Any)

Bases: larvaworld.lib.param.RuntimeDataOps, larvaworld.lib.param.SimMetricOps, larvaworld.lib.param.SimTimeOps

The configuration of a LarvaDataset.

Nticks
refID
group_id
color
env_params
larva_group
agent_ids
N
sample
filtered_at
rescaled_by
pooled_cycle_curves
bout_distros
intermitter
modelConfs
EEB_poly1d
property h5_kdic

Returns the keys of the h5 file that store the parameters of the dataset

update_Nagents()
property arena_vertices
get_sample_bout_distros(m)
class larvaworld.lib.process.dataset.ParamLarvaDataset(**kwargs: Any)

Bases: param.Parameterized

Base class for named objects that support Parameters and message formatting.

Automatic object naming: Every Parameterized instance has a name parameter. If the user doesn’t designate a name=<str> argument when constructing the object, the object will be given a name consisting of its class name followed by a unique 5-digit number.

Automatic parameter setting: The Parameterized __init__ method will automatically read the list of keyword parameters. If any keyword matches the name of a Parameter (see Parameter class) defined in the object’s class or any of its superclasses, that parameter in the instance will get the value given as a keyword argument. For example:

    class Foo(Parameterized):
        xx = Parameter(default=1)

    foo = Foo(xx=20)

in this case foo.xx gets the value 20.

When initializing a Parameterized instance (‘foo’ in the example above), the values of parameters can be supplied as keyword arguments to the constructor (using parametername=parametervalue); these values will override the class default values for this one instance.

If no ‘name’ parameter is supplied, self.name defaults to the object’s class name with a unique number appended to it.

Message formatting: Each Parameterized instance has several methods for optionally printing output. This functionality is based on the standard Python ‘logging’ module: the methods provided here wrap calls to the ‘logging’ module’s root logger and prepend each message with information about the instance from which the call was made. For more information on how to set the global logging level and change the default message prefix, see the documentation for the ‘logging’ module.

config
step_data
endpoint_data
config2
epoch_dict
larva_dicts
validate_IDs()
update_ids_in_data()
update_Nticks()
property c
property ids
property s
property e
property end_ps
property step_ps
property end_ks
property step_ks
property min_tick
timeseries_slice(time_range=None, df=None)
required()
valid(returned=None)
data_exists(ks=[], ps=[], eks=[], eps=[], config_attrs=[], attrs=[])
property chunk_dicts
property epoch_dicts
property fitted_epochs
property pooled_epochs
property cycle_curves
property pooled_cycle_curves
track_par_in_chunk(chunk, par)
epochs_pose_by_ID(chunk, id)
epochs_bearing_by_ID(chunk, id, loc=(0.0, 0.0))
epoch_durs(epochs)
epoch_amps(epochs, a)
epoch_maxs(epochs, a)
epoch_idx(epochs)
comp_chunk_bearing(chunk)
detect_epochs(idx, min_dur=None)
detect_runs(a, vel_thr=0.3, min_dur=0.5)

Annotates crawl-runs in timeseries.

Parameters

a : array

1D np.array : forward velocity timeseries

vel_thr : float

Maximum velocity threshold

min_dur : float, optional

The minimum required duration for a run

Returns

runs : list

A list of pairs of the start-end indices of the runs.
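
The following is a minimal sketch of threshold-based run detection, not the library's implementation. It assumes that a run is a stretch of consecutive samples with velocity at or above vel_thr, that duration is measured as (end - start) * dt, and that the timestep dt is passed explicitly. The name detect_runs_sketch and the dt argument are hypothetical.

```python
def detect_runs_sketch(a, dt, vel_thr=0.3, min_dur=0.5):
    """Return (start, end) index pairs where a stays >= vel_thr for at least min_dur seconds."""
    runs, start = [], None
    for i, v in enumerate(a):
        if v >= vel_thr:
            if start is None:
                start = i  # a new candidate run begins here
        elif start is not None:
            # the candidate run ended at the previous sample
            if (i - 1 - start) * dt >= min_dur:
                runs.append((start, i - 1))
            start = None
    # close a run that is still open at the end of the series
    if start is not None and (len(a) - 1 - start) * dt >= min_dur:
        runs.append((start, len(a) - 1))
    return runs


velocity = [0.1, 0.5, 0.6, 0.7, 0.1, 0.4, 0.1, 0.5, 0.5, 0.5, 0.5]
print(detect_runs_sketch(velocity, dt=0.25))  # [(1, 3), (7, 10)]
```

The single above-threshold sample at index 5 is dropped because it is shorter than min_dur.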

detect_pauses(a, vel_thr=0.3, runs=None, min_dur=None)

Annotates crawl-pauses in timeseries.

Parameters

a : array

1D np.array : forward velocity timeseries

vel_thr : float

Maximum velocity threshold

runs : list, optional

A list of pairs of the start-end indices of the runs. If provided, pauses that overlap with runs are excluded.

min_dur : float, optional

The minimum required duration for a pause

Returns

pauses : list

A list of pairs of the start-end indices of the pauses.
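
As an illustration of the overlap rule for the runs argument, here is a hedged sketch that collects below-threshold stretches and then discards any that share an index with a supplied run. The function name detect_pauses_sketch is hypothetical, min_dur is left out for brevity, and the run (2, 4) in the example is made-up input.

```python
def detect_pauses_sketch(a, vel_thr=0.3, runs=None):
    """Return (start, end) index pairs where a stays below vel_thr, skipping run overlaps."""
    pauses, start = [], None
    # a trailing +inf sentinel closes a pause that reaches the end of the series
    for i, v in enumerate(list(a) + [float("inf")]):
        if v < vel_thr:
            if start is None:
                start = i
        elif start is not None:
            pauses.append((start, i - 1))
            start = None
    if runs:
        # two closed intervals overlap when each one starts before the other ends
        pauses = [
            (p0, p1)
            for p0, p1 in pauses
            if not any(p0 <= r1 and r0 <= p1 for r0, r1 in runs)
        ]
    return pauses


velocity = [0.1, 0.1, 0.5, 0.6, 0.2, 0.2, 0.5, 0.1]
print(detect_pauses_sketch(velocity))                 # [(0, 1), (4, 5), (7, 7)]
print(detect_pauses_sketch(velocity, runs=[(2, 4)]))  # [(0, 1), (7, 7)]
```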

detect_strides(a, vel_thr=0.3, stretch=(0.75, 2.0), fr=None, return_extrema=True)

Annotates strides-runs and pauses in timeseries.

Parameters

a : array

1D np.array : forward velocity timeseries

vel_thr : float

Maximum velocity threshold

stretch : Tuple[float,float]

The min-max stretch of a stride relative to the default derived from the dominant frequency

fr : float, optional

The dominant crawling frequency.

return_extrema : boolean

Whether to additionally return the stride extrema

Returns

strides : list

A list of pairs of the start-end indices of the strides.

i_min : array

Indices of the local minima.

i_max : array

Indices of the local maxima

detect_stridechains(strides)

Annotates stridechains-runs by concatenating consecutive strides.

Parameters

strides : array

2D np.array : the start-end ticks of the stride epochs

Returns

runs : list

A list of pairs of the start-end indices of the runs/stridechains.

run_counts : list

Stride-counts of the runs/stridechains.
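
Concatenating consecutive strides can be sketched as follows. This assumes that two strides are consecutive when the second starts at the tick where the first ends; the actual adjacency criterion used by the library is not stated here, and detect_stridechains_sketch is a hypothetical name.

```python
def detect_stridechains_sketch(strides):
    """Merge back-to-back (start, end) strides into chains and count strides per chain."""
    runs, run_counts = [], []
    for s0, s1 in strides:
        if runs and runs[-1][1] == s0:
            # this stride starts where the current chain ends: extend the chain
            runs[-1][1] = s1
            run_counts[-1] += 1
        else:
            # gap before this stride: start a new chain
            runs.append([s0, s1])
            run_counts.append(1)
    return [tuple(r) for r in runs], run_counts


strides = [(0, 10), (10, 19), (19, 30), (45, 55), (55, 66)]
print(detect_stridechains_sketch(strides))  # ([(0, 30), (45, 66)], [3, 2])
```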

detect_turns(a, min_dur=None)

Annotates turns in timeseries.

Parameters

a : array

1D np.array : angular velocity timeseries

min_dur : float, optional

The minimum required duration for a turn

Returns

Lturns : list

A list of pairs of the start-end indices of the Left turns.

Rturns : list

A list of pairs of the start-end indices of the Right turns.
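
One plausible way to split an angular velocity timeseries into left and right turns is to cut it at sign changes. The sketch below assumes that positive angular velocity means a left turn and negative means a right turn, and that the timestep dt is supplied by the caller; both are assumptions, and detect_turns_sketch is a hypothetical name.

```python
def detect_turns_sketch(a, dt=1.0, min_dur=None):
    """Split a into same-sign stretches; return (left_turns, right_turns) as index pairs."""

    def sign(v):
        return (v > 0) - (v < 0)

    left, right = [], []
    start = 0
    for i in range(1, len(a) + 1):
        # a stretch ends at the end of the series or when the sign changes
        if i == len(a) or sign(a[i]) != sign(a[start]):
            s = sign(a[start])
            long_enough = min_dur is None or (i - 1 - start) * dt >= min_dur
            if s != 0 and long_enough:
                (left if s > 0 else right).append((start, i - 1))
            start = i
    return left, right


ang_vel = [0.2, 0.5, 0.1, -0.3, -0.4, 0.0, 0.6, 0.2]
print(detect_turns_sketch(ang_vel))  # ([(0, 2), (6, 7)], [(3, 4)])
```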

crawl_annotation(strides_enabled: bool = True, vel_thr: float = 0.3) → larvaworld.lib.util.AttrDict
turn_annotation(min_dur=None)
turn_mode_annotation()
patch_residency_annotation()
detect_epoch_on_food_overlap(chunk)
detect_bouts(vel_thr=0.3, strides_enabled=True, castsNweathervanes=True)
comp_pooled_epochs()

Compute pooled epochs from chunk dictionaries.

This method processes the chunk_dicts attribute to create epoch_dicts and pooled_epochs. It first extracts unique epoch keys from the chunk dictionaries and then constructs a dictionary of epochs (epoch_dicts) where each key corresponds to a dictionary of chunk data.

The method then defines an inner function get_vs to concatenate values from the dictionaries, handling cases where the values have different shapes. If the majority of the values are 2-dimensional, it filters out the 1-dimensional ones before concatenation.

Finally, it creates the pooled_epochs attribute by concatenating the values for each epoch key, excluding specific keys such as “turn_slice”, “pause_idx”, “run_idx”, and “stride_idx”.

Attributes:

chunk_dicts (dict): A dictionary containing chunk data.

epoch_dicts (AttrDict): A dictionary of epochs with chunk data.

pooled_epochs (AttrDict): A dictionary of concatenated epoch data.

Raises:

Exception: If there is an issue with concatenating the values in get_vs.

Prints:

“Completed bout detection.” upon successful completion.

fit_pooled_epochs()
generate_pooled_epochs(mID)
comp_bout_distros()
register_bout_distros()
comp_cycle_curves(Nbins=64)
comp_attenuation(Nbins=64)
comp_interference(Nbins=64)
comp_pooled_cycle_curves()
annotate(anot_keys=['bout_detection', 'bout_distribution', 'interference'], is_last=False, **kwargs)
interpolate_nan_values()
filter(filter_f=2.0, recompute=False)
rescale(recompute=False, rescale_by=1.0)
exclude_rows(flag='collision_flag', accepted=[0], rejected=None)
smaller_dataset(p)

Generate a smaller dataset based on the given ReplayConf parameters.

Args:

p (ReplayConf): The configuration for dataset replay.

Returns:

LarvaDataset: A subset of the original dataset.

align_trajectories(track_point=None, arena_dims=None, transposition='origin', replace=True)
preprocess(drop_collisions=False, interpolate_nans=False, filter_f=None, rescale_by=None, transposition=None, recompute=False)
merge_configs()
set_data(step=None, end=None, agents=None, **kwargs)
property data
path_to_file(file='data.h5')
property path_to_config
store(df, key, file='data.h5')
save_dict(d, file)
read(key, file='data.h5')
load(step=True, h5_ks=None)
save(refID=None)
save_config(refID=None)
load_traj(mode='default')
load_dicts(type, ids=None)

Load dictionaries based on the specified type and optional IDs.

Args:

type (str): The type of dictionaries to load.

ids (list, optional): A list of IDs to load. If None, uses self.ids.

Returns:

list: A list of dictionaries corresponding to the specified type and IDs.

Notes:
  • If the specified type and IDs are found in self.larva_dicts, the dictionaries are loaded from there.

  • Otherwise, the dictionaries are loaded from files located in the directory specified by self.config.data_dir.

store_dicts(type, dicts)

Stores a dictionary of dictionaries to individual files.

Args:

type (str): The type/category of the dictionaries to be stored.

dicts (dict): A dictionary where keys are identifiers and values are dictionaries to be stored.

Example:
>>> store_dicts('example_type', {'id1': {'key1': 'value1'}, 'id2': {'key2': 'value2'}})
This will create files 'id1.txt' and 'id2.txt' in the directory specified by self.config.data_dir/individuals/example_type.
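
The one-file-per-ID layout described above can be sketched with the standard library. The on-disk format is not specified in this section, so JSON is used purely as an assumption; store_dicts_sketch and its explicit data_dir argument are hypothetical stand-ins for the method and self.config.data_dir.

```python
import json
import os
import tempfile


def store_dicts_sketch(data_dir, dict_type, dicts):
    """Write each inner dict to <data_dir>/individuals/<dict_type>/<id>.txt (JSON assumed)."""
    folder = os.path.join(data_dir, "individuals", dict_type)
    os.makedirs(folder, exist_ok=True)
    paths = []
    for agent_id, d in dicts.items():
        path = os.path.join(folder, f"{agent_id}.txt")
        with open(path, "w") as f:
            json.dump(d, f)
        paths.append(path)
    return paths


tmp_dir = tempfile.mkdtemp()  # stands in for self.config.data_dir
written = store_dicts_sketch(
    tmp_dir, "example_type", {"id1": {"key1": "value1"}, "id2": {"key2": "value2"}}
)
print([os.path.basename(p) for p in written])  # ['id1.txt', 'id2.txt']
```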
store_larva_dicts()

Stores larva dictionaries by iterating over the items in self.larva_dicts.

This method retrieves each type and its corresponding dictionary from self.larva_dicts and passes them to the store_dicts method for storage.

Returns:

None

property contour_xy_data_byID
property midline_xy_data_byID
property traj_xy_data_byID
data_by_ID(data)
property midline_xy_data
property contour_xy_data
empty_df(dim3=1)
apply_per_agent(pars, func, time_range=None, **kwargs)

Apply a function to each subdataframe of a MultiIndex DataFrame after grouping by the agentID.

Parameters

s : pandas.DataFrame

A MultiIndex DataFrame with levels [‘Step’, ‘AgentID’].

func : function

The function to apply to each subdataframe.

**kwargs : dict

Additional keyword arguments to pass to the ‘func’ function.

Returns

numpy.ndarray

An array of dimensions [N_ticks, N_ids], where N_ticks is the number of unique ‘Step’ values, and N_ids is the number of unique ‘AgentID’ values.

Notes

This function groups the DataFrame ‘s’ by the specified ‘level’, applies ‘func’ to each subdataframe, and returns the results as a numpy array.

midline_xy_1less(mid)
property midline_seg_xy_data_byID
property midline_seg_orients_data_byID
midline_seg_orients_from_mid(mid)

Calculate the orientation of midline segments from midline coordinates.

Parameters:

mid (numpy.ndarray): A 3D array of shape (Nticks, N, 2), where Nticks is the number of timesteps, N is the number of midline points, and 2 represents the x and y coordinates of each point.

Returns:

numpy.ndarray: A 2D array of shape (Nticks, N-1) containing the orientation angles (in radians) of each segment for each timestep, with values in the range [0, 2π).
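
The shape contract above, (Nticks, N, 2) in and (Nticks, N-1) out, can be illustrated with a pure-Python sketch. It assumes that midline points are ordered front to rear and that each segment's orientation points from its rear point towards its front point; the direction convention and the name seg_orients_sketch are assumptions, and nested lists stand in for the numpy array.

```python
import math


def seg_orients_sketch(mid):
    """Return per-timestep segment angles in radians, wrapped with a modulo of 2*pi."""
    orients = []
    for points in mid:
        row = []
        # pair every midline point with the next one to form N-1 segments
        for (x0, y0), (x1, y1) in zip(points[:-1], points[1:]):
            angle = math.atan2(y0 - y1, x0 - x1)  # in (-pi, pi]
            row.append(angle % (2 * math.pi))     # negative angles move to the upper half-turn
        orients.append(row)
    return orients


# one timestep, three midline points, hence two segments
mid = [[(2.0, 0.0), (1.0, 0.0), (1.0, 1.0)]]
print(seg_orients_sketch(mid))  # first angle is 0, second is 3*pi/2
```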

comp_freq(par, fr_range=(0.0, +np.inf))

Compute the frequency of a parameter for each agent.

This method calculates the dominant frequency of a given parameter for each agent in the dataset. It uses the Fast Fourier Transform (FFT) to find the frequency with the highest amplitude within a specified frequency range.

Parameters:

par (str): The name of the parameter to compute the frequency for.

fr_range (tuple, optional): A tuple specifying the frequency range to consider. Defaults to (0.0, +np.inf).

Returns:

None: The result is stored in the endpoint dataframe with the frequency name as the key.
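
To make the idea of a dominant frequency within a range concrete, here is a small sketch for a single timeseries. For self-containment it swaps the FFT for a naive discrete Fourier transform from the standard library, which gives the same spectrum but is only practical for short signals. The function name dominant_freq_sketch and the explicit dt argument are hypothetical.

```python
import cmath
import math


def dominant_freq_sketch(signal, dt, fr_range=(0.0, math.inf)):
    """Return the frequency (Hz) with the largest DFT amplitude inside fr_range."""
    n = len(signal)
    mean = sum(signal) / n  # remove the offset so the zero-frequency bin cannot dominate
    best_f, best_amp = None, -1.0
    for k in range(1, n // 2 + 1):
        f = k / (n * dt)  # frequency of the k-th DFT bin
        if not (fr_range[0] <= f <= fr_range[1]):
            continue
        coef = sum(
            (x - mean) * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        if abs(coef) > best_amp:
            best_f, best_amp = f, abs(coef)
    return best_f


dt, n = 0.1, 100  # 10 Hz sampling for 10 s
signal = [
    math.sin(2 * math.pi * 1.5 * i * dt) + 2 * math.sin(2 * math.pi * 0.3 * i * dt)
    for i in range(n)
]
print(dominant_freq_sketch(signal, dt))                       # close to 0.3
print(dominant_freq_sketch(signal, dt, fr_range=(1.0, 2.5)))  # close to 1.5
```

Restricting fr_range is what lets the weaker 1.5 Hz component be picked out even though the 0.3 Hz component has the larger amplitude overall.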

comp_freqs()

Compute dominant frequencies for translational and angular velocities. The frequency ranges (in Hz) are (1.0, 2.5) and (0.1, 0.8) respectively.

Parameters: None

Returns: None

comp_orientations(mode='minimal', recompute=False)

Compute the orientations of body segments for each timestep, for each agent in the dataset.

Parameters:

mode (str): Determines whether to compute only front and rear orientations or one for each body segment. Options are “minimal” (default) or “full”.

recompute (bool): If True, recompute the orientations even if they already exist. Default is False.

Returns: None

comp_angular(is_last=False, **kwargs)

Perform angular analysis on the dataset.

This method computes orientations, bends, and angular moments for the dataset. If is_last is set to True, the results are saved after computation.

Parameters:

is_last (bool): Flag to indicate if this is the last computation step. If True, the results are saved.

**kwargs: Additional keyword arguments passed to the computation methods.

Returns: None

comp_bend(mode='minimal', recompute=False)

Compute the body bending angle for each timestep, for each agent in the dataset.

Parameters:

mode (str): Determines whether to compute a single angle or one for each intersegmental joint. Options are “minimal” (default) or “full”.

recompute (bool): If True, forces recomputation of the bending angles even if they are already computed. Default is False.

Raises:

Exception: If the bending angle computation method specified in the configuration is not recognized.

Notes:
  • If the bending angles are already computed and recompute is set to False, a message will be printed and the function will exit without recomputing.

  • The bending angle can be computed in two ways:
    1. “from_vectors”: As the difference between front and rear orientations.
    2. “from_angles”: As the sum of the first N front angles, where N is specified in the configuration.

  • The computed bending angles are stored in the step dataframe.
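
The two bend definitions listed in the notes can be sketched for a single timestep as follows. Angles are assumed to be in degrees, and the wrap of the difference into [-180, 180) is an added assumption rather than documented behaviour. Both function names are hypothetical.

```python
def bend_from_vectors_sketch(front_deg, rear_deg):
    """Bend as front orientation minus rear orientation, wrapped into [-180, 180)."""
    diff = front_deg - rear_deg
    return (diff + 180.0) % 360.0 - 180.0


def bend_from_angles_sketch(angles_deg, n_front):
    """Bend as the sum of the first n_front intersegmental angles."""
    return sum(angles_deg[:n_front])


print(bend_from_vectors_sketch(10.0, 350.0))               # 20.0
print(bend_from_vectors_sketch(350.0, 10.0))               # -20.0
print(bend_from_angles_sketch([5.0, 10.0, -3.0, 8.0], 2))  # 15.0
```

Without the wrap, orientations on either side of the 0/360 boundary would yield a spurious bend of -340 degrees in the first example.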

comp_ang_moments(pars=None, mode='minimal', recompute=False)
comp_xy_moments(point='', **kwargs)
comp_tortuosity(dur=20, **kwargs)
comp_dispersal(t0=0, t1=60, **kwargs)
comp_operators(pars)
comp_centroid(**kwargs)
comp_length(mode='minimal', recompute=False)
comp_spatial(**kwargs)
scale_to_length(pars=None, keys=None)
comp_source_metrics()
comp_wind()
comp_wind_metrics(woo, wo)
comp_final_anemotaxis(woo)
comp_PI2(xys, x=0.04)
comp_PI(arena_xdim, xs, return_num=False)
comp_dataPI()
process(proc_keys=['angular', 'spatial'], dsp_starts=[0], dsp_stops=[40, 60], tor_durs=[5, 10, 20], is_last=False, **kwargs)
get_par(par=None, k=None, key='step')
sample_larvagroup(N=1, ps=[])
imitate_larvagroup(N=None, ps=None)
property existing_dispersion_ranges
convert_to_pint()
class larvaworld.lib.process.dataset.BaseLarvaDataset(dir: str | None = None, refID: str | None = None, load_data: bool = True, config: larvaworld.lib.util.AttrDict | None = None, step: pandas.DataFrame | None = None, end: pandas.DataFrame | None = None, agents: list[str] | None = None, initialize: bool = False, **kwargs: Any)

Bases: ParamLarvaDataset

Base class for named objects that support Parameters and message formatting.

static initGeo(to_Geo: bool = False, **kwargs: Any) → BaseLarvaDataset
generate_config(**kwargs)
delete()
set_id(id, save=True)
class larvaworld.lib.process.dataset.LarvaDataset(**kwargs: Any)

Bases: BaseLarvaDataset

Base class for named objects that support Parameters and message formatting.

visualize(parameters={}, **kwargs)
enrich(pre_kws={}, proc_keys=[], anot_keys=[], is_last=True, mode='minimal', recompute=False, **kwargs)
property epoch_bound_dicts
get_chunk_par(chunk, k=None, par=None, min_dur=0, mode='distro')
class larvaworld.lib.process.dataset.LarvaDatasetCollection(labels: list[str] | None = None, colors: list[Any] | None = None, add_samples: bool = False, config: larvaworld.lib.util.AttrDict | None = None, **kwargs: Any)
config = None
datasets
labels = None
Ndatasets
colors = None
group_ids
Ngroups
dir
set_dir(dir=None)
property plot_dir
plot(ids=[], gIDs=[], **kwargs)
get_datasets(datasets=None, refIDs=None, dirs=None, group_id=None)
get_colors()
property data_dict
property data_palette
property data_palette_with_N
property color_palette
property Nticks
property N
property labels_with_N
property fr
property dt
property duration
property tlim
trange(unit='min')
property arena_dims
property arena_geometry
concat_data(key)
classmethod from_agentpy_output(output=None, agents=None, to_Geo=False)

Convert agentpy output to a LarvaDataset