larvaworld.lib.process.dataset
==============================

.. py:module:: larvaworld.lib.process.dataset

.. autoapi-nested-parse::

   Basic classes for larvaworld-format datasets


Classes
-------

.. autoapisummary::

   larvaworld.lib.process.dataset.DatasetConfig
   larvaworld.lib.process.dataset.ParamLarvaDataset
   larvaworld.lib.process.dataset.BaseLarvaDataset
   larvaworld.lib.process.dataset.LarvaDataset
   larvaworld.lib.process.dataset.LarvaDatasetCollection


Module Contents
---------------

.. py:class:: DatasetConfig(**kwargs: Any)

   Bases: :py:obj:`larvaworld.lib.param.RuntimeDataOps`, :py:obj:`larvaworld.lib.param.SimMetricOps`, :py:obj:`larvaworld.lib.param.SimTimeOps`

   The configuration of a LarvaDataset.

   .. py:attribute:: Nticks

   .. py:attribute:: refID

   .. py:attribute:: group_id

   .. py:attribute:: color

   .. py:attribute:: env_params

   .. py:attribute:: larva_group

   .. py:attribute:: agent_ids

   .. py:attribute:: N

   .. py:attribute:: sample

   .. py:attribute:: filtered_at

   .. py:attribute:: rescaled_by

   .. py:attribute:: pooled_cycle_curves

   .. py:attribute:: bout_distros

   .. py:attribute:: intermitter

   .. py:attribute:: modelConfs

   .. py:attribute:: EEB_poly1d

   .. py:property:: h5_kdic

      Returns the keys of the h5 file that store the parameters of the dataset

   .. py:method:: update_Nagents()

   .. py:property:: arena_vertices

   .. py:method:: get_sample_bout_distros(m)


.. py:class:: ParamLarvaDataset(**kwargs: Any)

   Bases: :py:obj:`param.Parameterized`

   Base class for named objects that support Parameters and message formatting.

   Automatic object naming: Every Parameterized instance has a name parameter.
   If the user doesn't designate a name= argument when constructing the object,
   the object will be given a name consisting of its class name followed by a
   unique 5-digit number.

   Automatic parameter setting: The Parameterized __init__ method will
   automatically read the list of keyword parameters.
   If any keyword matches the name of a Parameter (see Parameter class) defined
   in the object's class or any of its superclasses, that parameter in the
   instance will get the value given as a keyword argument. For example::

      class Foo(Parameterized):
          xx = Parameter(default=1)

      foo = Foo(xx=20)

   in this case foo.xx gets the value 20.

   When initializing a Parameterized instance ('foo' in the example above), the
   values of parameters can be supplied as keyword arguments to the constructor
   (using parametername=parametervalue); these values will override the class
   default values for this one instance.

   If no 'name' parameter is supplied, self.name defaults to the object's class
   name with a unique number appended to it.

   Message formatting: Each Parameterized instance has several methods for
   optionally printing output. This functionality is based on the standard
   Python 'logging' module; using the methods provided here, wraps calls to the
   'logging' module's root logger and prepends each message with information
   about the instance from which the call was made. For more information on how
   to set the global logging level and change the default message prefix, see
   documentation for the 'logging' module.

   .. py:attribute:: config

   .. py:attribute:: step_data

   .. py:attribute:: endpoint_data

   .. py:attribute:: config2

   .. py:attribute:: epoch_dict

   .. py:attribute:: larva_dicts

   .. py:method:: validate_IDs()

   .. py:method:: update_ids_in_data()

   .. py:method:: update_Nticks()

   .. py:property:: c

   .. py:property:: ids

   .. py:property:: s

   .. py:property:: e

   .. py:property:: end_ps

   .. py:property:: step_ps

   .. py:property:: end_ks

   .. py:property:: step_ks

   .. py:property:: min_tick

   .. py:method:: timeseries_slice(time_range=None, df=None)

   .. py:method:: required()

   .. py:method:: valid(returned=None)

   .. py:method:: data_exists(ks=[], ps=[], eks=[], eps=[], config_attrs=[], attrs=[])

   .. py:property:: chunk_dicts

   .. py:property:: epoch_dicts

   .. py:property:: fitted_epochs

   .. py:property:: pooled_epochs

   ..

   .. py:property:: cycle_curves

   .. py:property:: pooled_cycle_curves

   .. py:method:: track_par_in_chunk(chunk, par)

   .. py:method:: epochs_pose_by_ID(chunk, id)

   .. py:method:: epochs_bearing_by_ID(chunk, id, loc=(0.0, 0.0))

   .. py:method:: epoch_durs(epochs)

   .. py:method:: epoch_amps(epochs, a)

   .. py:method:: epoch_maxs(epochs, a)

   .. py:method:: epoch_idx(epochs)

   .. py:method:: comp_chunk_bearing(chunk)

   .. py:method:: detect_epochs(idx, min_dur=None)

   .. py:method:: detect_runs(a, vel_thr=0.3, min_dur=0.5)

      Annotates crawl-runs in timeseries.

      Parameters
      ----------
      a : array
          1D np.array : forward velocity timeseries
      vel_thr : float
          Maximum velocity threshold
      min_dur : float, optional
          The minimum required duration for a run

      Returns
      -------
      runs : list
          A list of pairs of the start-end indices of the runs.

   .. py:method:: detect_pauses(a, vel_thr=0.3, runs=None, min_dur=None)

      Annotates crawl-pauses in timeseries.

      Parameters
      ----------
      a : array
          1D np.array : forward velocity timeseries
      vel_thr : float
          Maximum velocity threshold
      runs : list
          A list of pairs of the start-end indices of the runs.
          If provided, pauses that overlap with runs will be excluded.
      min_dur : float, optional
          The minimum required duration for a pause

      Returns
      -------
      pauses : list
          A list of pairs of the start-end indices of the pauses.

   .. py:method:: detect_strides(a, vel_thr=0.3, stretch=(0.75, 2.0), fr=None, return_extrema=True)

      Annotates strides-runs and pauses in timeseries.

      Parameters
      ----------
      a : array
          1D np.array : forward velocity timeseries
      vel_thr : float
          Maximum velocity threshold
      stretch : Tuple[float,float]
          The min-max stretch of a stride relative to the default derived from the dominant frequency
      fr : float, optional
          The dominant crawling frequency.
      return_extrema : boolean
          Whether to additionally return the stride extrema

      Returns
      -------
      strides : list
          A list of pairs of the start-end indices of the strides.
      i_min : array
          Indices of the local minima.
      i_max : array
          Indices of the local maxima

   .. py:method:: detect_stridechains(strides)

      Annotates stridechains-runs by concatenating consecutive strides.

      Parameters
      ----------
      strides : array
          2D np.array : the start-end ticks of the stride epochs

      Returns
      -------
      runs : list
          A list of pairs of the start-end indices of the runs/stridechains.
      run_counts : list
          Stride-counts of the runs/stridechains.

   .. py:method:: detect_turns(a, min_dur=None)

      Annotates turns in timeseries.

      Parameters
      ----------
      a : array
          1D np.array : angular velocity timeseries
      min_dur : float, optional
          The minimum required duration for a turn

      Returns
      -------
      Lturns : list
          A list of pairs of the start-end indices of the Left turns.
      Rturns : list
          A list of pairs of the start-end indices of the Right turns.

   .. py:method:: crawl_annotation(strides_enabled: bool = True, vel_thr: float = 0.3) -> larvaworld.lib.util.AttrDict

   .. py:method:: turn_annotation(min_dur=None)

   .. py:method:: turn_mode_annotation()

   .. py:method:: patch_residency_annotation()

   .. py:method:: detect_epoch_on_food_overlap(chunk)

   .. py:method:: detect_bouts(vel_thr=0.3, strides_enabled=True, castsNweathervanes=True)

   .. py:method:: comp_pooled_epochs()

      Compute pooled epochs from chunk dictionaries.

      This method processes the `chunk_dicts` attribute to create `epoch_dicts`
      and `pooled_epochs`. It first extracts unique epoch keys from the chunk
      dictionaries and then constructs a dictionary of epochs (`epoch_dicts`)
      where each key corresponds to a dictionary of chunk data.

      The method then defines an inner function `get_vs` to concatenate values
      from the dictionaries, handling cases where the values have different shapes.
      If the majority of the values have a shape of 2 dimensions, it filters out
      those with a shape of 1 dimension before concatenation.

      Finally, it creates the `pooled_epochs` attribute by concatenating the
      values for each epoch key, excluding specific keys such as "turn_slice",
      "pause_idx", "run_idx", and "stride_idx".

      Attributes:
          chunk_dicts (dict): A dictionary containing chunk data.
          epoch_dicts (AttrDict): A dictionary of epochs with chunk data.
          pooled_epochs (AttrDict): A dictionary of concatenated epoch data.

      Raises:
          Exception: If there is an issue with concatenating the values in `get_vs`.

      Prints:
          "Completed bout detection." upon successful completion.

   .. py:method:: fit_pooled_epochs()

   .. py:method:: generate_pooled_epochs(mID)

   .. py:method:: comp_bout_distros()

   .. py:method:: register_bout_distros()

   .. py:method:: comp_cycle_curves(Nbins=64)

   .. py:method:: comp_attenuation(Nbins=64)

   .. py:method:: comp_interference(Nbins=64)

   .. py:method:: comp_pooled_cycle_curves()

   .. py:method:: annotate(anot_keys=['bout_detection', 'bout_distribution', 'interference'], is_last=False, **kwargs)

   .. py:method:: interpolate_nan_values()

   .. py:method:: filter(filter_f=2.0, recompute=False)

   .. py:method:: rescale(recompute=False, rescale_by=1.0)

   .. py:method:: exclude_rows(flag='collision_flag', accepted=[0], rejected=None)

   .. py:method:: smaller_dataset(p)

      Generate a smaller dataset based on the given ReplayConf parameters.

      Args:
          p (ReplayConf): The configuration for dataset replay.

      Returns:
          LarvaDataset: A subset of the original dataset.

   .. py:method:: align_trajectories(track_point=None, arena_dims=None, transposition='origin', replace=True)

   .. py:method:: preprocess(drop_collisions=False, interpolate_nans=False, filter_f=None, rescale_by=None, transposition=None, recompute=False)

   .. py:method:: merge_configs()

   .. py:method:: set_data(step=None, end=None, agents=None, **kwargs)

   .. py:property:: data

   .. py:method:: path_to_file(file='data.h5')

   .. py:property:: path_to_config

   ..

   .. py:method:: store(df, key, file='data.h5')

   .. py:method:: save_dict(d, file)

   .. py:method:: read(key, file='data.h5')

   .. py:method:: load(step=True, h5_ks=None)

   .. py:method:: save(refID=None)

   .. py:method:: save_config(refID=None)

   .. py:method:: load_traj(mode='default')

   .. py:method:: load_dicts(type, ids=None)

      Load dictionaries based on the specified type and optional IDs.

      Args:
          type (str): The type of dictionaries to load.
          ids (list, optional): A list of IDs to load. If None, uses self.ids.

      Returns:
          list: A list of dictionaries corresponding to the specified type and IDs.

      Notes:
          - If the specified type and IDs are found in self.larva_dicts, the
            dictionaries are loaded from there.
          - Otherwise, the dictionaries are loaded from files located in the
            directory specified by self.config.data_dir.

   .. py:method:: store_dicts(type, dicts)

      Stores a dictionary of dictionaries to individual files.

      Args:
          type (str): The type/category of the dictionaries to be stored.
          dicts (dict): A dictionary where keys are identifiers and values are
              dictionaries to be stored.

      Example:
          >>> store_dicts('example_type', {'id1': {'key1': 'value1'}, 'id2': {'key2': 'value2'}})

          This will create files 'id1.txt' and 'id2.txt' in the directory
          specified by self.config.data_dir/individuals/example_type.

   .. py:method:: store_larva_dicts()

      Stores larva dictionaries by iterating over the items in `self.larva_dicts`.

      This method retrieves each type and its corresponding dictionary from
      `self.larva_dicts` and passes them to the `store_dicts` method for storage.

      Returns:
          None

   .. py:property:: contour_xy_data_byID

   .. py:property:: midline_xy_data_byID

   .. py:property:: traj_xy_data_byID

   .. py:method:: data_by_ID(data)

   .. py:property:: midline_xy_data

   .. py:property:: contour_xy_data

   .. py:method:: empty_df(dim3=1)

   .. py:method:: apply_per_agent(pars, func, time_range=None, **kwargs)

      Apply a function to each subdataframe of a MultiIndex DataFrame after grouping by the agentID.

      Parameters
      ----------
      s : pandas.DataFrame
          A MultiIndex DataFrame with levels ['Step', 'AgentID'].
      func : function
          The function to apply to each subdataframe.
      **kwargs : dict
          Additional keyword arguments to pass to the 'func' function.

      Returns
      -------
      numpy.ndarray
          An array of dimensions [N_ticks, N_ids], where N_ticks is the number
          of unique 'Step' values, and N_ids is the number of unique 'AgentID' values.

      Notes
      -----
      This function groups the DataFrame 's' by the specified 'level', applies
      'func' to each subdataframe, and returns the results as a numpy array.

   .. py:method:: midline_xy_1less(mid)

   .. py:property:: midline_seg_xy_data_byID

   .. py:property:: midline_seg_orients_data_byID

   .. py:method:: midline_seg_orients_from_mid(mid)

      Calculate the orientation of midline segments from midline coordinates.

      Parameters:
          mid (numpy.ndarray): A 3D array of shape (Nticks, N, 2) where Nticks
              is the number of timesteps, N is the number of midline points, and
              2 represents the x and y coordinates of each point.

      Returns:
          numpy.ndarray: A 2D array of shape (Nticks, N-1) containing the
              orientation angles (in radians) of each segment for each timestep,
              with values in the range [0, 2π).

   .. py:method:: comp_freq(par, fr_range=(0.0, +np.inf))

      Compute the frequency of a parameter for each agent.

      This method calculates the dominant frequency of a given parameter for
      each agent in the dataset. It uses the Fast Fourier Transform (FFT) to find
      the frequency with the highest amplitude within a specified frequency range.

      Parameters:
          par (str): The name of the parameter to compute the frequency for.
          fr_range (tuple, optional): A tuple specifying the frequency range to
              consider. Defaults to (0.0, +np.inf).

      Returns:
          None: The result is stored in the endpoint dataframe with the
              frequency name as the key.

   .. py:method:: comp_freqs()

      Compute dominant frequencies for translational and angular velocities.
      The frequency ranges (in Hz) are (1.0, 2.5) and (0.1, 0.8) respectively.

      Parameters:
          None

      Returns:
          None

   .. py:method:: comp_orientations(mode='minimal', recompute=False)

      Compute the orientations of body segments for each timestep, for each agent in the dataset.

      Parameters:
          mode (str): Determines whether to compute only front and rear
              orientations or one for each body segment. Options are "minimal"
              (default) or "full".
          recompute (bool): If True, recompute the orientations even if they
              already exist. Default is False.

      Returns:
          None

   .. py:method:: comp_angular(is_last=False, **kwargs)

      Perform angular analysis on the dataset.

      This method computes orientations, bends, and angular moments for the
      dataset. If `is_last` is set to True, the results are saved after computation.

      Parameters:
          is_last (bool): Flag to indicate if this is the last computation step.
              If True, the results are saved.
          **kwargs: Additional keyword arguments passed to the computation methods.

      Returns:
          None

   .. py:method:: comp_bend(mode='minimal', recompute=False)

      Compute the body bending angle for each timestep, for each agent in the dataset.

      Parameters:
          mode (str): Determines whether to compute a single angle or one for
              each intersegmental joint. Options are "minimal" (default) or "full".
          recompute (bool): If True, forces recomputation of the bending angles
              even if they are already computed. Default is False.

      Raises:
          Exception: If the bending angle computation method specified in the
              configuration is not recognized.

      Notes:
          - If the bending angles are already computed and recompute is set to
            False, a message will be printed and the function will exit without
            recomputing.
          - The bending angle can be computed in two ways:

            1. "from_vectors": As the difference between front and rear orientations.
            2. "from_angles": As the sum of the first N front angles, where N is
               specified in the configuration.

          - The computed bending angles are stored in the step dataframe.

   .. py:method:: comp_ang_moments(pars=None, mode='minimal', recompute=False)

   ..

   .. py:method:: comp_xy_moments(point='', **kwargs)

   .. py:method:: comp_tortuosity(dur=20, **kwargs)

   .. py:method:: comp_dispersal(t0=0, t1=60, **kwargs)

   .. py:method:: comp_operators(pars)

   .. py:method:: comp_centroid(**kwargs)

   .. py:method:: comp_length(mode='minimal', recompute=False)

   .. py:method:: comp_spatial(**kwargs)

   .. py:method:: scale_to_length(pars=None, keys=None)

   .. py:method:: comp_source_metrics()

   .. py:method:: comp_wind()

   .. py:method:: comp_wind_metrics(woo, wo)

   .. py:method:: comp_final_anemotaxis(woo)

   .. py:method:: comp_PI2(xys, x=0.04)

   .. py:method:: comp_PI(arena_xdim, xs, return_num=False)

   .. py:method:: comp_dataPI()

   .. py:method:: process(proc_keys=['angular', 'spatial'], dsp_starts=[0], dsp_stops=[40, 60], tor_durs=[5, 10, 20], is_last=False, **kwargs)

   .. py:method:: get_par(par=None, k=None, key='step')

   .. py:method:: sample_larvagroup(N=1, ps=[])

   .. py:method:: imitate_larvagroup(N=None, ps=None)

   .. py:property:: existing_dispersion_ranges

   .. py:method:: convert_to_pint()


.. py:class:: BaseLarvaDataset(dir: Optional[str] = None, refID: Optional[str] = None, load_data: bool = True, config: Optional[larvaworld.lib.util.AttrDict] = None, step: Optional[pandas.DataFrame] = None, end: Optional[pandas.DataFrame] = None, agents: Optional[list[str]] = None, initialize: bool = False, **kwargs: Any)

   Bases: :py:obj:`ParamLarvaDataset`

   Base class for named objects that support Parameters and message formatting.

   Automatic object naming: Every Parameterized instance has a name parameter.
   If the user doesn't designate a name= argument when constructing the object,
   the object will be given a name consisting of its class name followed by a
   unique 5-digit number.

   Automatic parameter setting: The Parameterized __init__ method will
   automatically read the list of keyword parameters.
   If any keyword matches the name of a Parameter (see Parameter class) defined
   in the object's class or any of its superclasses, that parameter in the
   instance will get the value given as a keyword argument. For example::

      class Foo(Parameterized):
          xx = Parameter(default=1)

      foo = Foo(xx=20)

   in this case foo.xx gets the value 20.

   When initializing a Parameterized instance ('foo' in the example above), the
   values of parameters can be supplied as keyword arguments to the constructor
   (using parametername=parametervalue); these values will override the class
   default values for this one instance.

   If no 'name' parameter is supplied, self.name defaults to the object's class
   name with a unique number appended to it.

   Message formatting: Each Parameterized instance has several methods for
   optionally printing output. This functionality is based on the standard
   Python 'logging' module; using the methods provided here, wraps calls to the
   'logging' module's root logger and prepends each message with information
   about the instance from which the call was made. For more information on how
   to set the global logging level and change the default message prefix, see
   documentation for the 'logging' module.

   .. py:method:: initGeo(to_Geo: bool = False, **kwargs: Any) -> BaseLarvaDataset
      :staticmethod:

   .. py:method:: generate_config(**kwargs)

   .. py:method:: delete()

   .. py:method:: set_id(id, save=True)


.. py:class:: LarvaDataset(**kwargs: Any)

   Bases: :py:obj:`BaseLarvaDataset`

   Base class for named objects that support Parameters and message formatting.

   Automatic object naming: Every Parameterized instance has a name parameter.
   If the user doesn't designate a name= argument when constructing the object,
   the object will be given a name consisting of its class name followed by a
   unique 5-digit number.

   Automatic parameter setting: The Parameterized __init__ method will
   automatically read the list of keyword parameters.
   If any keyword matches the name of a Parameter (see Parameter class) defined
   in the object's class or any of its superclasses, that parameter in the
   instance will get the value given as a keyword argument. For example::

      class Foo(Parameterized):
          xx = Parameter(default=1)

      foo = Foo(xx=20)

   in this case foo.xx gets the value 20.

   When initializing a Parameterized instance ('foo' in the example above), the
   values of parameters can be supplied as keyword arguments to the constructor
   (using parametername=parametervalue); these values will override the class
   default values for this one instance.

   If no 'name' parameter is supplied, self.name defaults to the object's class
   name with a unique number appended to it.

   Message formatting: Each Parameterized instance has several methods for
   optionally printing output. This functionality is based on the standard
   Python 'logging' module; using the methods provided here, wraps calls to the
   'logging' module's root logger and prepends each message with information
   about the instance from which the call was made. For more information on how
   to set the global logging level and change the default message prefix, see
   documentation for the 'logging' module.

   .. py:method:: visualize(parameters={}, **kwargs)

   .. py:method:: enrich(pre_kws={}, proc_keys=[], anot_keys=[], is_last=True, mode='minimal', recompute=False, **kwargs)

   .. py:property:: epoch_bound_dicts

   .. py:method:: get_chunk_par(chunk, k=None, par=None, min_dur=0, mode='distro')


.. py:class:: LarvaDatasetCollection(labels: Optional[list[str]] = None, colors: Optional[list[Any]] = None, add_samples: bool = False, config: Optional[larvaworld.lib.util.AttrDict] = None, **kwargs: Any)

   .. py:attribute:: config
      :value: None

   .. py:attribute:: datasets

   .. py:attribute:: labels
      :value: None

   .. py:attribute:: Ndatasets

   .. py:attribute:: colors
      :value: None

   .. py:attribute:: group_ids

   .. py:attribute:: Ngroups

   .. py:attribute:: dir

   .. py:method:: set_dir(dir=None)

   .. py:property:: plot_dir

   ..

   .. py:method:: plot(ids=[], gIDs=[], **kwargs)

   .. py:method:: get_datasets(datasets=None, refIDs=None, dirs=None, group_id=None)

   .. py:method:: get_colors()

   .. py:property:: data_dict

   .. py:property:: data_palette

   .. py:property:: data_palette_with_N

   .. py:property:: color_palette

   .. py:property:: Nticks

   .. py:property:: N

   .. py:property:: labels_with_N

   .. py:property:: fr

   .. py:property:: dt

   .. py:property:: duration

   .. py:property:: tlim

   .. py:method:: trange(unit='min')

   .. py:property:: arena_dims

   .. py:property:: arena_geometry

   .. py:method:: concat_data(key)

   .. py:method:: from_agentpy_output(output=None, agents=None, to_Geo=False)
      :classmethod:

      Convert agentpy output to a LarvaDataset
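
The following is a minimal NumPy sketch of two computations described in the
``ParamLarvaDataset`` docstrings above: the segment orientations documented for
``midline_seg_orients_from_mid`` and the FFT-based dominant frequency documented
for ``comp_freq``. It is not the larvaworld implementation. The function names
``segment_orientations`` and ``dominant_frequency``, the explicit ``dt``
argument, and the assumption that midline point 0 is the head are all
hypothetical choices made for illustration.

.. code-block:: python

   import numpy as np


   def segment_orientations(mid: np.ndarray) -> np.ndarray:
       """Orientation of each midline segment in radians, within [0, 2*pi).

       ``mid`` has shape (Nticks, N, 2); the result has shape (Nticks, N - 1).
       """
       # Assumption: point 0 is the head, so segment i points from
       # midline point i + 1 towards midline point i.
       d = mid[:, :-1, :] - mid[:, 1:, :]
       # arctan2 returns angles in (-pi, pi]; the modulo maps them to [0, 2*pi).
       return np.arctan2(d[..., 1], d[..., 0]) % (2 * np.pi)


   def dominant_frequency(signal, dt, fr_range=(0.0, np.inf)):
       """Frequency (Hz) with the largest FFT amplitude inside ``fr_range``.

       ``dt`` is the sampling step in seconds (a hypothetical argument; the
       documented method takes a parameter name instead of raw samples).
       """
       x = np.asarray(signal, dtype=float)
       x = x - x.mean()  # remove the constant offset before the transform
       freqs = np.fft.rfftfreq(x.size, d=dt)
       amps = np.abs(np.fft.rfft(x))
       in_range = (freqs >= fr_range[0]) & (freqs <= fr_range[1])
       if not in_range.any():
           return np.nan  # no FFT bin falls inside the requested range
       return freqs[in_range][np.argmax(amps[in_range])]

Under these assumptions, calling ``dominant_frequency`` on a forward-velocity
series with ``fr_range=(1.0, 2.5)`` and on an angular-velocity series with
``fr_range=(0.1, 0.8)`` would mirror the two ranges quoted for ``comp_freqs``.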