larvaworld.lib.process.dataset
Basic classes for larvaworld-format datasets
Classes
- DatasetConfig: The configuration of a LarvaDataset.
- ParamLarvaDataset: Base class for named objects that support Parameters and message formatting.
- BaseLarvaDataset: Base class for named objects that support Parameters and message formatting.
- LarvaDataset: Base class for named objects that support Parameters and message formatting.
Module Contents
- class larvaworld.lib.process.dataset.DatasetConfig(**kwargs: Any)
Bases: larvaworld.lib.param.RuntimeDataOps, larvaworld.lib.param.SimMetricOps, larvaworld.lib.param.SimTimeOps
The configuration of a LarvaDataset.
- Nticks
- refID
- group_id
- color
- env_params
- larva_group
- agent_ids
- N
- sample
- filtered_at
- rescaled_by
- pooled_cycle_curves
- bout_distros
- intermitter
- modelConfs
- EEB_poly1d
- property h5_kdic
Returns the keys of the h5 file that store the parameters of the dataset.
- update_Nagents()
- property arena_vertices
- get_sample_bout_distros(m)
- class larvaworld.lib.process.dataset.ParamLarvaDataset(**kwargs: Any)
Bases: param.Parameterized
Base class for named objects that support Parameters and message formatting.
Automatic object naming: Every Parameterized instance has a name parameter. If the user doesn’t designate a name=<str> argument when constructing the object, the object will be given a name consisting of its class name followed by a unique 5-digit number.
Automatic parameter setting: The Parameterized __init__ method will automatically read the list of keyword parameters. If any keyword matches the name of a Parameter (see Parameter class) defined in the object’s class or any of its superclasses, that parameter in the instance will get the value given as a keyword argument. For example:
    class Foo(Parameterized):
        xx = Parameter(default=1)

    foo = Foo(xx=20)

in this case foo.xx gets the value 20.
When initializing a Parameterized instance (‘foo’ in the example above), the values of parameters can be supplied as keyword arguments to the constructor (using parametername=parametervalue); these values will override the class default values for this one instance.
If no ‘name’ parameter is supplied, self.name defaults to the object’s class name with a unique number appended to it.
Message formatting: Each Parameterized instance has several methods for optionally printing output. This functionality is based on the standard Python ‘logging’ module; the methods provided here wrap calls to the ‘logging’ module’s root logger and prepend each message with information about the instance from which the call was made. For more information on how to set the global logging level and change the default message prefix, see the documentation for the ‘logging’ module.
- config
- step_data
- endpoint_data
- config2
- epoch_dict
- larva_dicts
- validate_IDs()
- update_ids_in_data()
- update_Nticks()
- property c
- property ids
- property s
- property e
- property end_ps
- property step_ps
- property end_ks
- property step_ks
- property min_tick
- timeseries_slice(time_range=None, df=None)
- required()
- valid(returned=None)
- data_exists(ks=[], ps=[], eks=[], eps=[], config_attrs=[], attrs=[])
- property chunk_dicts
- property epoch_dicts
- property fitted_epochs
- property pooled_epochs
- property cycle_curves
- property pooled_cycle_curves
- track_par_in_chunk(chunk, par)
- epochs_pose_by_ID(chunk, id)
- epochs_bearing_by_ID(chunk, id, loc=(0.0, 0.0))
- epoch_durs(epochs)
- epoch_amps(epochs, a)
- epoch_maxs(epochs, a)
- epoch_idx(epochs)
- comp_chunk_bearing(chunk)
- detect_epochs(idx, min_dur=None)
- detect_runs(a, vel_thr=0.3, min_dur=0.5)
Annotates crawl-runs in timeseries.
Parameters
- a : array
1D np.array: forward velocity timeseries.
- vel_thr : float
Maximum velocity threshold.
- min_dur : float, optional
The minimum required duration for a run.
Returns
- runs : list
A list of pairs of the start-end indices of the runs.
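The run annotation described above can be illustrated with a minimal pure-Python sketch. It is not the larvaworld implementation: the name detect_runs_sketch, the explicit dt argument (seconds per tick) and the rule that a run is a stretch of consecutive ticks with velocity at or above vel_thr are assumptions made for illustration.

```python
def detect_runs_sketch(a, dt, vel_thr=0.3, min_dur=0.5):
    """Return [start, end] index pairs where `a` stays at or above `vel_thr`.

    `dt` (seconds per tick) is a hypothetical argument used to convert
    `min_dur` from seconds to a number of ticks.
    """
    runs = []
    start = None
    for i, v in enumerate(a):
        if v >= vel_thr and start is None:
            start = i                      # a run begins on this tick
        elif v < vel_thr and start is not None:
            runs.append([start, i - 1])    # the run ended on the previous tick
            start = None
    if start is not None:                  # a run is still open at the end
        runs.append([start, len(a) - 1])
    # Keep only the runs that last at least `min_dur` seconds.
    return [r for r in runs if (r[1] - r[0] + 1) * dt >= min_dur]


velocity = [0.0, 0.5, 0.6, 0.7, 0.1, 0.4, 0.0, 0.5, 0.5, 0.5, 0.5]
runs = detect_runs_sketch(velocity, dt=0.25, min_dur=0.5)
print(runs)
```

The single-tick excursion at index 5 is shorter than min_dur and is therefore dropped from the result.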
- detect_pauses(a, vel_thr=0.3, runs=None, min_dur=None)
Annotates crawl-pauses in timeseries.
Parameters
- a : array
1D np.array: forward velocity timeseries.
- vel_thr : float
Maximum velocity threshold.
- runs : list
A list of pairs of the start-end indices of the runs. If provided, pauses that overlap with runs are excluded.
- min_dur : float, optional
The minimum required duration for a pause.
Returns
- pauses : list
A list of pairs of the start-end indices of the pauses.
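The exclusion of pauses that overlap with runs can be sketched as an interval-overlap filter. The helper name exclude_overlapping and the choice to treat a single shared tick as an overlap are assumptions for illustration, not the library's code.

```python
def exclude_overlapping(pauses, runs):
    """Drop every candidate pause that shares at least one tick with any run."""
    return [
        [p0, p1]
        for p0, p1 in pauses
        # Two closed intervals overlap when each one starts before the other ends.
        if not any(p0 <= r1 and r0 <= p1 for r0, r1 in runs)
    ]


candidate_pauses = [[0, 1], [4, 6], [12, 15]]
runs = [[1, 3], [7, 10]]
pauses = exclude_overlapping(candidate_pauses, runs)
print(pauses)
```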
- detect_strides(a, vel_thr=0.3, stretch=(0.75, 2.0), fr=None, return_extrema=True)
Annotates strides-runs and pauses in timeseries.
Parameters
- a : array
1D np.array: forward velocity timeseries.
- vel_thr : float
Maximum velocity threshold.
- stretch : Tuple[float, float]
The min-max stretch of a stride relative to the default derived from the dominant frequency.
- fr : float, optional
The dominant crawling frequency.
- return_extrema : boolean
Whether to additionally return the stride extrema.
Returns
- strides : list
A list of pairs of the start-end indices of the strides.
- i_min : array
Indices of the local minima.
- i_max : array
Indices of the local maxima.
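One way to read the parameters above is sketched below. The rules used here are assumptions, not the documented algorithm: a stride is taken to span two consecutive local minima, it must contain a local maximum at or above vel_thr, and its duration must lie between stretch[0] and stretch[1] times the default stride period 1/fr. The dt argument (seconds per tick) is also hypothetical.

```python
def detect_strides_sketch(a, dt, fr, vel_thr=0.3, stretch=(0.75, 2.0)):
    """Return (strides, i_min, i_max) for a forward-velocity series `a`."""
    inner = range(1, len(a) - 1)
    # Local minima, and local maxima that reach the velocity threshold.
    i_min = [i for i in inner if a[i - 1] > a[i] <= a[i + 1]]
    i_max = [i for i in inner if a[i - 1] < a[i] >= a[i + 1] and a[i] >= vel_thr]
    # Allowed stride length in ticks, relative to the default period 1 / fr.
    min_ticks = stretch[0] / fr / dt
    max_ticks = stretch[1] / fr / dt
    strides = []
    for s0, s1 in zip(i_min[:-1], i_min[1:]):
        has_peak = any(s0 < m < s1 for m in i_max)
        if has_peak and min_ticks <= s1 - s0 <= max_ticks:
            strides.append([s0, s1])
    return strides, i_min, i_max


# Triangle wave with a period of 4 ticks (1 Hz when dt = 0.25 s).
velocity = [0, 1, 2, 1, 0, 1, 2, 1, 0, 1, 2, 1, 0]
strides, i_min, i_max = detect_strides_sketch(velocity, dt=0.25, fr=1.0)
print(strides, i_min, i_max)
```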
- detect_stridechains(strides)
Annotates stridechains-runs by concatenating consecutive strides.
Parameters
- strides : array
2D np.array: the start-end ticks of the stride epochs.
Returns
- runs : list
A list of pairs of the start-end indices of the runs/stridechains.
- run_counts : list
Stride-counts of the runs/stridechains.
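Concatenating consecutive strides into stridechains can be sketched as follows. The assumption here, which is not confirmed by the docstring, is that two strides count as consecutive when one ends on the same tick on which the next one starts; the function name is hypothetical.

```python
def stridechains_sketch(strides):
    """Merge consecutive strides into chains and count the strides in each."""
    runs, run_counts = [], []
    for s0, s1 in strides:
        if runs and runs[-1][1] == s0:
            runs[-1][1] = s1        # extend the current chain
            run_counts[-1] += 1
        else:
            runs.append([s0, s1])   # start a new chain
            run_counts.append(1)
    return runs, run_counts


strides = [[0, 10], [10, 21], [21, 30], [50, 61], [61, 70]]
runs, run_counts = stridechains_sketch(strides)
print(runs, run_counts)
```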
- detect_turns(a, min_dur=None)
Annotates turns in timeseries.
Parameters
- a : array
1D np.array: angular velocity timeseries.
- min_dur : float, optional
The minimum required duration for a turn.
Returns
- Lturns : list
A list of pairs of the start-end indices of the Left turns.
- Rturns : list
A list of pairs of the start-end indices of the Right turns.
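A minimal sketch of sign-based turn annotation is given below. It assumes, without confirmation from the source, that positive angular velocity marks a left turn and negative angular velocity a right turn; the dt argument (seconds per tick) and the function name are hypothetical.

```python
from itertools import groupby


def detect_turns_sketch(a, dt, min_dur=None):
    """Split an angular-velocity series into left and right turn epochs."""
    Lturns, Rturns = [], []
    i = 0
    # Group consecutive ticks by the sign of the angular velocity (-1, 0 or 1).
    for sign, group in groupby(a, key=lambda v: (v > 0) - (v < 0)):
        n = len(list(group))
        long_enough = min_dur is None or n * dt >= min_dur
        if sign != 0 and long_enough:
            (Lturns if sign > 0 else Rturns).append([i, i + n - 1])
        i += n
    return Lturns, Rturns


ang_vel = [1, 2, 1, -1, -2, 0, 3, 3]
Lturns, Rturns = detect_turns_sketch(ang_vel, dt=0.1)
print(Lturns, Rturns)
```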
- crawl_annotation(strides_enabled: bool = True, vel_thr: float = 0.3) -> larvaworld.lib.util.AttrDict
- turn_annotation(min_dur=None)
- turn_mode_annotation()
- patch_residency_annotation()
- detect_epoch_on_food_overlap(chunk)
- detect_bouts(vel_thr=0.3, strides_enabled=True, castsNweathervanes=True)
- comp_pooled_epochs()
Compute pooled epochs from chunk dictionaries.
This method processes the chunk_dicts attribute to create epoch_dicts and pooled_epochs. It first extracts unique epoch keys from the chunk dictionaries and then constructs a dictionary of epochs (epoch_dicts) where each key corresponds to a dictionary of chunk data.
The method then defines an inner function get_vs to concatenate values from the dictionaries, handling cases where the values have different shapes. If the majority of the values have a shape of 2 dimensions, it filters out those with a shape of 1 dimension before concatenation.
Finally, it creates the pooled_epochs attribute by concatenating the values for each epoch key, excluding specific keys such as “turn_slice”, “pause_idx”, “run_idx”, and “stride_idx”.
- Attributes:
chunk_dicts (dict): A dictionary containing chunk data.
epoch_dicts (AttrDict): A dictionary of epochs with chunk data.
pooled_epochs (AttrDict): A dictionary of concatenated epoch data.
- Raises:
Exception: If there is an issue with concatenating the values in get_vs.
- Prints:
“Completed bout detection.” upon successful completion.
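The regrouping and pooling described for comp_pooled_epochs can be sketched with plain dictionaries and lists. This is an illustration under assumptions: the function name and sample keys are hypothetical, plain lists stand in for arrays, and the handling of mixed 1D/2D shapes is omitted.

```python
from itertools import chain

# Keys that hold per-agent indices and are therefore not pooled.
EXCLUDED = {"turn_slice", "pause_idx", "run_idx", "stride_idx"}


def pool_epochs_sketch(chunk_dicts):
    """Regroup {agent: {key: values}} by key, then concatenate across agents."""
    keys = sorted({k for d in chunk_dicts.values() for k in d})
    epoch_dicts = {
        k: {agent: d[k] for agent, d in chunk_dicts.items() if k in d}
        for k in keys
    }
    pooled_epochs = {
        k: list(chain.from_iterable(per_agent.values()))
        for k, per_agent in epoch_dicts.items()
        if k not in EXCLUDED
    }
    return epoch_dicts, pooled_epochs


chunk_dicts = {
    "larva_0": {"run_dur": [1.0, 2.5], "run_idx": [3, 4, 5]},
    "larva_1": {"run_dur": [0.8], "pause_dur": [0.4]},
}
epoch_dicts, pooled_epochs = pool_epochs_sketch(chunk_dicts)
print(pooled_epochs)
```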
- fit_pooled_epochs()
- generate_pooled_epochs(mID)
- comp_bout_distros()
- register_bout_distros()
- comp_cycle_curves(Nbins=64)
- comp_attenuation(Nbins=64)
- comp_interference(Nbins=64)
- comp_pooled_cycle_curves()
- annotate(anot_keys=['bout_detection', 'bout_distribution', 'interference'], is_last=False, **kwargs)
- interpolate_nan_values()
- filter(filter_f=2.0, recompute=False)
- rescale(recompute=False, rescale_by=1.0)
- exclude_rows(flag='collision_flag', accepted=[0], rejected=None)
- smaller_dataset(p)
Generate a smaller dataset based on the given ReplayConf parameters.
- Args:
p (ReplayConf): The configuration for dataset replay.
- Returns:
LarvaDataset: A subset of the original dataset.
- align_trajectories(track_point=None, arena_dims=None, transposition='origin', replace=True)
- preprocess(drop_collisions=False, interpolate_nans=False, filter_f=None, rescale_by=None, transposition=None, recompute=False)
- merge_configs()
- set_data(step=None, end=None, agents=None, **kwargs)
- property data
- path_to_file(file='data.h5')
- property path_to_config
- store(df, key, file='data.h5')
- save_dict(d, file)
- read(key, file='data.h5')
- load(step=True, h5_ks=None)
- save(refID=None)
- save_config(refID=None)
- load_traj(mode='default')
- load_dicts(type, ids=None)
Load dictionaries based on the specified type and optional IDs.
- Args:
type (str): The type of dictionaries to load.
ids (list, optional): A list of IDs to load. If None, uses self.ids.
- Returns:
list: A list of dictionaries corresponding to the specified type and IDs.
- Notes:
If the specified type and IDs are found in self.larva_dicts, the dictionaries are loaded from there.
Otherwise, the dictionaries are loaded from files located in the directory specified by self.config.data_dir.
- store_dicts(type, dicts)
Stores a dictionary of dictionaries to individual files.
- Args:
type (str): The type/category of the dictionaries to be stored.
dicts (dict): A dictionary where keys are identifiers and values are dictionaries to be stored.
- Example:
>>> store_dicts('example_type', {'id1': {'key1': 'value1'}, 'id2': {'key2': 'value2'}})
This will create files 'id1.txt' and 'id2.txt' in the directory specified by self.config.data_dir/individuals/example_type.
- store_larva_dicts()
Stores larva dictionaries by iterating over the items in self.larva_dicts.
This method retrieves each type and its corresponding dictionary from self.larva_dicts and passes them to the store_dicts method for storage.
- Returns:
None
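The file layout described for store_dicts and the per-type loop of store_larva_dicts can be sketched as below. The serialization format is not stated in the docstrings, so writing JSON text is an assumption, as are the function names and the explicit data_dir argument that stands in for self.config.data_dir.

```python
import json
import tempfile
from pathlib import Path


def store_dicts_sketch(data_dir, dict_type, dicts):
    """Write each dictionary to <data_dir>/individuals/<dict_type>/<id>.txt."""
    folder = Path(data_dir) / "individuals" / dict_type
    folder.mkdir(parents=True, exist_ok=True)
    for agent_id, d in dicts.items():
        # JSON is an assumed on-disk format, chosen only for this sketch.
        (folder / f"{agent_id}.txt").write_text(json.dumps(d))
    return folder


def store_larva_dicts_sketch(data_dir, larva_dicts):
    """Store every {type: {id: dict}} entry, one type at a time."""
    for dict_type, dicts in larva_dicts.items():
        store_dicts_sketch(data_dir, dict_type, dicts)


larva_dicts = {"example_type": {"id1": {"key1": "value1"}, "id2": {"key2": "value2"}}}
with tempfile.TemporaryDirectory() as tmp:
    store_larva_dicts_sketch(tmp, larva_dicts)
    folder = Path(tmp) / "individuals" / "example_type"
    stored_files = sorted(p.name for p in folder.iterdir())
    restored = json.loads((folder / "id1.txt").read_text())
print(stored_files, restored)
```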
- property contour_xy_data_byID
- property midline_xy_data_byID
- property traj_xy_data_byID
- data_by_ID(data)
- property midline_xy_data
- property contour_xy_data
- empty_df(dim3=1)
- apply_per_agent(pars, func, time_range=None, **kwargs)
Apply a function to each subdataframe of a MultiIndex DataFrame after grouping by the agentID.
Parameters
- s : pandas.DataFrame
A MultiIndex DataFrame with levels [‘Step’, ‘AgentID’].
- func : function
The function to apply to each subdataframe.
- **kwargs : dict
Additional keyword arguments to pass to the ‘func’ function.
Returns
- numpy.ndarray
An array of dimensions [N_ticks, N_ids], where N_ticks is the number of unique ‘Step’ values, and N_ids is the number of unique ‘AgentID’ values.
Notes
This function groups the DataFrame ‘s’ by the specified ‘level’, applies ‘func’ to each subdataframe, and returns the results as a numpy array.
- midline_xy_1less(mid)
- property midline_seg_xy_data_byID
- property midline_seg_orients_data_byID
- midline_seg_orients_from_mid(mid)
Calculate the orientation of midline segments from midline coordinates.
- Parameters:
mid (numpy.ndarray): A 3D array of shape (Nticks, N, 2), where Nticks is the number of timesteps, N is the number of midline points, and 2 represents the x and y coordinates of each point.
- Returns:
numpy.ndarray: A 2D array of shape (Nticks, N-1) containing the orientation angles (in radians) of each segment for each timestep, with values in the range [0, 2π).
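The shape contract above, (Nticks, N, 2) in and (Nticks, N-1) out with angles in [0, 2π), can be illustrated with nested lists and math.atan2. The point ordering is an assumption: midline points are taken to run from head to tail, and each segment's vector is taken to point from its rear point to its front point.

```python
import math


def seg_orients_sketch(mid):
    """Map midline points of shape (Nticks, N, 2) to angles of shape (Nticks, N-1)."""
    orients = []
    for points in mid:
        row = []
        # Pair each rear point with the point in front of it (assumed ordering).
        for (x_rear, y_rear), (x_front, y_front) in zip(points[1:], points[:-1]):
            angle = math.atan2(y_front - y_rear, x_front - x_rear)
            row.append(angle % (2 * math.pi))  # wrap into [0, 2π)
        orients.append(row)
    return orients


mid = [
    [(2, 0), (1, 0), (0, 0)],  # body aligned with the x axis, head at x = 2
    [(0, 0), (0, 1), (0, 2)],  # body aligned with the y axis, head at y = 0
]
orients = seg_orients_sketch(mid)
print(orients)
```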
- comp_freq(par, fr_range=(0.0, +np.inf))
Compute the frequency of a parameter for each agent.
This method calculates the dominant frequency of a given parameter for each agent in the dataset. It uses the Fast Fourier Transform (FFT) to find the frequency with the highest amplitude within a specified frequency range.
- Parameters:
par (str): The name of the parameter to compute the frequency for.
fr_range (tuple, optional): A tuple specifying the frequency range to consider. Defaults to (0.0, +np.inf).
- Returns:
None: The result is stored in the endpoint dataframe with the frequency name as the key.
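Picking the highest-amplitude frequency inside a range can be sketched with a direct discrete Fourier transform in pure Python. This is an illustration rather than the library's FFT code: the function name, the dt argument (seconds per tick) and the decision to skip the zero-frequency component are assumptions.

```python
import cmath
import math


def dominant_freq_sketch(signal, dt, fr_range=(0.0, math.inf)):
    """Return the frequency (Hz) with the largest DFT amplitude within fr_range."""
    n = len(signal)
    best_f, best_amp = None, -1.0
    for k in range(1, n // 2 + 1):       # k = 0 (the mean) is skipped by choice
        f = k / (n * dt)                 # frequency of the k-th DFT bin
        if not fr_range[0] <= f <= fr_range[1]:
            continue
        amp = abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                      for t, x in enumerate(signal)))
        if amp > best_amp:
            best_f, best_amp = f, amp
    return best_f


dt = 0.1
# Synthetic signal: a strong 1.5 Hz component plus a weaker 0.5 Hz component.
signal = [math.sin(2 * math.pi * 1.5 * i * dt) + 0.5 * math.sin(2 * math.pi * 0.5 * i * dt)
          for i in range(100)]
crawl_fr = dominant_freq_sketch(signal, dt)
slow_fr = dominant_freq_sketch(signal, dt, fr_range=(0.1, 0.8))
print(crawl_fr, slow_fr)
```

Restricting fr_range changes which peak is reported, which is why separate ranges are useful for signals that mix several rhythms.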
- comp_freqs()
Compute dominant frequencies for translational and angular velocities. The frequency ranges (in Hz) are (1.0, 2.5) and (0.1, 0.8) respectively.
Parameters: None
Returns: None
- comp_orientations(mode='minimal', recompute=False)
Compute the orientations of body segments for each timestep, for each agent in the dataset.
- Parameters:
mode (str): Determines whether to compute only front and rear orientations or one for each body segment. Options are “minimal” (default) or “full”.
recompute (bool): If True, recompute the orientations even if they already exist. Default is False.
- Returns:
None
- comp_angular(is_last=False, **kwargs)
Perform angular analysis on the dataset.
This method computes orientations, bends, and angular moments for the dataset. If is_last is set to True, the results are saved after computation.
- Parameters:
is_last (bool): Flag to indicate if this is the last computation step. If True, the results are saved.
**kwargs: Additional keyword arguments passed to the computation methods.
- Returns:
None
- comp_bend(mode='minimal', recompute=False)
Compute the body bending angle for each timestep, for each agent in the dataset.
- Parameters:
mode (str): Determines whether to compute a single angle or one for each intersegmental joint. Options are “minimal” (default) or “full”.
recompute (bool): If True, forces recomputation of the bending angles even if they are already computed. Default is False.
- Raises:
Exception: If the bending angle computation method specified in the configuration is not recognized.
- Notes:
If the bending angles are already computed and recompute is set to False, a message will be printed and the function will exit without recomputing.
The bending angle can be computed in two ways:
1. “from_vectors”: As the difference between front and rear orientations.
2. “from_angles”: As the sum of the first N front angles, where N is specified in the configuration.
The computed bending angles are stored in the step dataframe.
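The two bend definitions listed in the notes can be written out as a short sketch. The function names are hypothetical, and both the use of degrees and the wrapping of the difference into [-180, 180) are assumptions that the docstring does not confirm.

```python
def bend_from_vectors(front_orientation, rear_orientation):
    """Bend as the front minus rear orientation, wrapped into [-180, 180) degrees."""
    return (front_orientation - rear_orientation + 180) % 360 - 180


def bend_from_angles(front_angles, n):
    """Bend as the sum of the first `n` intersegmental angles."""
    return sum(front_angles[:n])


print(bend_from_vectors(10, 350))           # orientations on either side of 0 degrees
print(bend_from_angles([5, -3, 10, 40], 3)) # only the first three angles are summed
```

Wrapping keeps the "from_vectors" result small when the two orientations straddle the 0/360 boundary, instead of reporting a difference close to a full turn.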
- comp_ang_moments(pars=None, mode='minimal', recompute=False)
- comp_xy_moments(point='', **kwargs)
- comp_tortuosity(dur=20, **kwargs)
- comp_dispersal(t0=0, t1=60, **kwargs)
- comp_operators(pars)
- comp_centroid(**kwargs)
- comp_length(mode='minimal', recompute=False)
- comp_spatial(**kwargs)
- scale_to_length(pars=None, keys=None)
- comp_source_metrics()
- comp_wind()
- comp_wind_metrics(woo, wo)
- comp_final_anemotaxis(woo)
- comp_PI2(xys, x=0.04)
- comp_PI(arena_xdim, xs, return_num=False)
- comp_dataPI()
- process(proc_keys=['angular', 'spatial'], dsp_starts=[0], dsp_stops=[40, 60], tor_durs=[5, 10, 20], is_last=False, **kwargs)
- get_par(par=None, k=None, key='step')
- sample_larvagroup(N=1, ps=[])
- imitate_larvagroup(N=None, ps=None)
- property existing_dispersion_ranges
- convert_to_pint()
- class larvaworld.lib.process.dataset.BaseLarvaDataset(dir: str | None = None, refID: str | None = None, load_data: bool = True, config: larvaworld.lib.util.AttrDict | None = None, step: pandas.DataFrame | None = None, end: pandas.DataFrame | None = None, agents: list[str] | None = None, initialize: bool = False, **kwargs: Any)
Bases: ParamLarvaDataset
Base class for named objects that support Parameters and message formatting.
- static initGeo(to_Geo: bool = False, **kwargs: Any) -> BaseLarvaDataset
- generate_config(**kwargs)
- delete()
- set_id(id, save=True)
- class larvaworld.lib.process.dataset.LarvaDataset(**kwargs: Any)
Bases: BaseLarvaDataset
Base class for named objects that support Parameters and message formatting.
- visualize(parameters={}, **kwargs)
- enrich(pre_kws={}, proc_keys=[], anot_keys=[], is_last=True, mode='minimal', recompute=False, **kwargs)
- property epoch_bound_dicts
- get_chunk_par(chunk, k=None, par=None, min_dur=0, mode='distro')
- class larvaworld.lib.process.dataset.LarvaDatasetCollection(labels: list[str] | None = None, colors: list[Any] | None = None, add_samples: bool = False, config: larvaworld.lib.util.AttrDict | None = None, **kwargs: Any)
- config = None
- datasets
- labels = None
- Ndatasets
- colors = None
- group_ids
- Ngroups
- dir
- set_dir(dir=None)
- property plot_dir
- plot(ids=[], gIDs=[], **kwargs)
- get_datasets(datasets=None, refIDs=None, dirs=None, group_id=None)
- get_colors()
- property data_dict
- property data_palette
- property data_palette_with_N
- property color_palette
- property Nticks
- property N
- property labels_with_N
- property fr
- property dt
- property duration
- property tlim
- trange(unit='min')
- property arena_dims
- property arena_geometry
- concat_data(key)
- classmethod from_agentpy_output(output=None, agents=None, to_Geo=False)
Convert agentpy output to a LarvaDataset.