Data Processing
Larvaworld provides a data processing pipeline for trajectory analysis. Every stage of the pipeline operates on a LarvaDataset object.
Processing Pipeline
Raw Data → Preprocess → Process → Annotate → Plot/Analyze
1. Preprocessing
Purpose: Clean and standardize raw trajectories
dataset.preprocess(
    # Note: drop_collisions requires a "collision_flag" column (typically present in imported datasets)
    drop_collisions=False,
    interpolate_nans=True,   # Fill missing data
    filter_f=3.0,            # Low-pass filter at 3 Hz
    rescale_by=0.001,        # Scale (e.g., mm to m)
    transposition="center",  # Center trajectories
)
Available Options
| Parameter | Description | Default |
|---|---|---|
| drop_collisions | Remove frames with collisions | |
| interpolate_nans | Interpolate missing values | |
| filter_f | Low-pass filter cutoff (Hz) | |
| rescale_by | Scale factor | |
| transposition | Alignment mode | |
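To make two of these options concrete, here is a minimal sketch of gap interpolation and rescaling on a single coordinate series, written in plain Python. The helper names `interpolate_nans` and `rescale` are hypothetical and only mirror the parameter names above; Larvaworld's own routines may handle edge cases, filtering, and multi-column data differently.

```python
import math

def interpolate_nans(values):
    """Linearly interpolate interior NaN gaps; leading/trailing NaNs stay as they are."""
    out = list(values)
    known = [i for i, v in enumerate(out) if not math.isnan(v)]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for i in range(left + 1, right):
            frac = (i - left) / gap
            out[i] = out[left] + frac * (out[right] - out[left])
    return out

def rescale(values, factor):
    """Multiply every sample by a constant factor (e.g. 0.001 for mm to m)."""
    return [v * factor for v in values]

raw_x_mm = [0.0, 1.0, float("nan"), 3.0, 4.0]  # one missing frame
filled_mm = interpolate_nans(raw_x_mm)         # gap filled from its neighbours
x_m = rescale(filled_mm, 0.001)                # same series expressed in metres
```

The interpolation only bridges gaps that have valid samples on both sides, which is a deliberate choice in this sketch: extrapolating past the ends of a track would invent positions.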
2. Processing
Purpose: Compute behavioral metrics
dataset.process(
    proc_keys=["angular", "spatial"],
    dsp_starts=[0],
    dsp_stops=[40, 60],
    tor_durs=[5, 10, 20],
)
Processing Categories
Angular Metrics
proc_keys = ["angular"]
Computes:
Orientation angle
Angular velocity
Angular acceleration
Head direction
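As an illustration of how such quantities relate to one another, the sketch below derives an orientation angle from two body points and then differentiates it, wrapping differences so that a crossing of the ±180° boundary does not produce a spurious jump. The function names are hypothetical and the use of degrees is an assumption; this is not Larvaworld's implementation.

```python
import math

def orientation_deg(rear, front):
    """Angle of the rear-to-front vector, in degrees within [-180, 180]."""
    return math.degrees(math.atan2(front[1] - rear[1], front[0] - rear[0]))

def wrap_deg(angle):
    """Wrap an angle difference into [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def angular_velocity(orientations, dt):
    """Finite-difference angular velocity (deg/s) with wrap-around handling."""
    return [wrap_deg(b - a) / dt for a, b in zip(orientations, orientations[1:])]

def angular_acceleration(omega, dt):
    """Finite-difference angular acceleration (deg/s^2)."""
    return [(b - a) / dt for a, b in zip(omega, omega[1:])]

dt = 0.1                                      # seconds per frame (assumed)
orientations = [170.0, 179.0, -175.0, -170.0] # crosses the +/-180 boundary
omega = angular_velocity(orientations, dt)
alpha = angular_acceleration(omega, dt)
```

Without the wrapping step, the second difference would be read as a turn of -354° rather than +6°.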
Spatial Metrics
proc_keys = ["spatial"]
Computes:
Linear velocity
Linear acceleration
Cumulative distance
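A minimal sketch of these three quantities, computed by finite differences from x, y positions sampled at a fixed time step, is shown below. The function names are hypothetical, and Larvaworld may use a different differencing or smoothing scheme.

```python
import math

def linear_velocity(xs, ys, dt):
    """Speed between consecutive frames (distance units per second)."""
    return [
        math.hypot(x1 - x0, y1 - y0) / dt
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    ]

def linear_acceleration(speed, dt):
    """Finite-difference change of speed per second."""
    return [(b - a) / dt for a, b in zip(speed, speed[1:])]

def cumulative_distance(xs, ys):
    """Path length travelled up to each frame, starting at 0."""
    out = [0.0]
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        out.append(out[-1] + math.hypot(x1 - x0, y1 - y0))
    return out

dt = 0.5
xs = [0.0, 3.0, 3.0, 6.0]
ys = [0.0, 4.0, 4.0, 8.0]
speed = linear_velocity(xs, ys, dt)      # move, stand still, move
accel = linear_acceleration(speed, dt)
travelled = cumulative_distance(xs, ys)
```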
Forward Components
Forward/sideways components are part of the spatial metrics computed when
"spatial" is included in proc_keys. They are not a separate proc_keys
entry.
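One common way to obtain forward and sideways components is to project the velocity vector onto the body axis and onto its perpendicular. The sketch below assumes the heading is given in degrees and that positive sideways motion points to the left of the heading; both conventions, and the function name, are assumptions rather than documented Larvaworld behavior.

```python
import math

def forward_sideways(vx, vy, heading_deg):
    """Split a velocity vector into components along and across the heading."""
    theta = math.radians(heading_deg)
    forward = vx * math.cos(theta) + vy * math.sin(theta)
    sideways = -vx * math.sin(theta) + vy * math.cos(theta)
    return forward, sideways

# Moving along +x while facing +x: purely forward motion.
fwd_a, side_a = forward_sideways(1.0, 0.0, 0.0)
# Moving along +y while facing +x: purely sideways motion.
fwd_b, side_b = forward_sideways(0.0, 1.0, 0.0)
# Moving diagonally while facing the same diagonal: purely forward again.
fwd_c, side_c = forward_sideways(1.0, 1.0, 45.0)
```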
Dispersal
Dispersal metrics are always computed for the combinations of
dsp_starts and dsp_stops passed to process:
dataset.process(
    proc_keys=["angular", "spatial"],
    dsp_starts=[0],      # Start times (s)
    dsp_stops=[40, 60],  # Stop times (s)
)
This adds endpoint columns such as dispersion_0_60_mean / std / max and
their scaled counterparts (e.g. scaled_dispersion_0_60_mean).
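The sketch below shows one plausible reading of a dispersal metric for a single track: the straight-line distance from the position held at the start time, evaluated for every frame up to the stop time and then summarized. The function name `dispersal` and the way the summary values are formed are assumptions; how Larvaworld aggregates these into its endpoint columns is not specified here.

```python
import math

def dispersal(xs, ys, dt, start, stop):
    """Distance from the position at `start` for each frame in [start, stop] (seconds)."""
    i0 = int(round(start / dt))
    i1 = int(round(stop / dt))
    x0, y0 = xs[i0], ys[i0]
    return [math.hypot(x - x0, y - y0) for x, y in zip(xs[i0:i1 + 1], ys[i0:i1 + 1])]

dt = 1.0
xs = [0.0, 1.0, 2.0, 3.0, 2.0]
ys = [0.0, 0.0, 0.0, 4.0, 0.0]

dsp = dispersal(xs, ys, dt, start=0, stop=4)
dsp_max = max(dsp)               # farthest excursion inside the window
dsp_mean = sum(dsp) / len(dsp)   # average distance from the starting point
```

Unlike cumulative distance, this measure can decrease over time, as it does in the last frame when the track turns back toward its origin.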
Tortuosity
Tortuosity metrics are controlled by tor_durs:
dataset.process(
    proc_keys=["angular", "spatial"],
    tor_durs=[5, 10, 20],  # Window durations (s)
)
This adds endpoint columns such as tortuosity_5_mean / std,
tortuosity_10_mean / std, etc.
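A widely used definition of tortuosity is one minus the ratio of net displacement to path length, so a straight path scores 0 and a path that returns to its start scores 1. The sketch below applies that definition over sliding windows of a fixed duration. Whether Larvaworld uses exactly this definition and windowing is an assumption, and the function names are hypothetical.

```python
import math

def tortuosity(xs, ys):
    """1 - (net displacement / path length); NaN if the track never moves."""
    path = sum(
        math.hypot(x1 - x0, y1 - y0)
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    )
    if path == 0:
        return float("nan")
    net = math.hypot(xs[-1] - xs[0], ys[-1] - ys[0])
    return 1.0 - net / path

def windowed_tortuosity(xs, ys, dt, dur):
    """Tortuosity of every sliding window that spans `dur` seconds."""
    n = int(round(dur / dt))  # steps per window
    return [tortuosity(xs[i:i + n + 1], ys[i:i + n + 1]) for i in range(len(xs) - n)]

straight = tortuosity([0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 0.0, 0.0])
l_shaped = tortuosity([0.0, 3.0, 3.0], [0.0, 0.0, 4.0])  # path 7, net 5
windows = windowed_tortuosity([0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 0.0, 0.0], dt=1.0, dur=2.0)
```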
3. Annotation
Purpose: Detect behavioral events
dataset.annotate(
    anot_keys=[
        "bout_detection",
        "bout_distribution",
        "interference",
    ]
)
Annotation Types
Bout Detection
anot_keys = ["bout_detection"]
Detects:
Strides: Individual peristaltic waves
Runs: Chains of strides
Pauses: Immobility epochs
Turns: Reorientation maneuvers
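To give a feel for epoch-based detection, the sketch below finds pauses as contiguous stretches where speed stays under a threshold for a minimum duration. The threshold of 0.2, the minimum duration of 1 s, and the helper name `detect_epochs` are illustrative assumptions; Larvaworld's detection of strides, runs, and turns relies on its own criteria, which are not reproduced here.

```python
def detect_epochs(flags, dt, min_dur=0.0):
    """Return (start_time, stop_time) for each run of True flags lasting >= min_dur."""
    epochs = []
    start = None
    # A trailing False acts as a sentinel so a run reaching the last frame is closed.
    for i, flag in enumerate(list(flags) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if (i - start) * dt >= min_dur:
                epochs.append((start * dt, i * dt))
            start = None
    return epochs

dt = 0.5
velocity = [1.0, 0.1, 0.05, 0.0, 1.2, 0.9, 0.1, 1.0]
is_slow = [v < 0.2 for v in velocity]  # hypothetical immobility threshold

pauses = detect_epochs(is_slow, dt, min_dur=1.0)
```

The single slow frame near the end is discarded because it is shorter than the minimum duration, which is what keeps momentary dips in speed from being counted as pauses.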
Bout Distribution
anot_keys = ["bout_distribution"]
Computes:
Distribution fitting (exponential, power-law)
Duration/length statistics
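For reference, the standard maximum-likelihood estimators for these two distribution families are short enough to write out. The sketch below fits an exponential rate and a continuous power-law exponent to a list of bout durations; the function names and the choice of `xmin` are assumptions, and Larvaworld's fitting procedure may differ (for example in how it selects the range or compares candidate distributions).

```python
import math

def fit_exponential_rate(durations):
    """MLE of the rate for an exponential distribution on [0, inf)."""
    return len(durations) / sum(durations)

def fit_power_law_alpha(durations, xmin):
    """MLE of the exponent for a continuous power law on [xmin, inf)."""
    tail = [d for d in durations if d >= xmin]
    log_sum = sum(math.log(d / xmin) for d in tail)
    if log_sum == 0:
        raise ValueError("need at least one duration strictly above xmin")
    return 1.0 + len(tail) / log_sum

durations = [1.0, 2.0, 4.0, 8.0]  # toy bout durations in seconds
rate = fit_exponential_rate(durations)
alpha = fit_power_law_alpha(durations, xmin=1.0)
```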
Interference
anot_keys = ["interference"]
Analyzes:
Crawl-turn coupling
Phase relationships
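One simple way to look at a phase relationship is to average a turning signal within bins of the stride cycle. The sketch below does this for angular velocity against a stride phase given in radians. It is only an illustration of the idea; the function name is hypothetical and nothing here should be read as the method Larvaworld uses for its interference analysis.

```python
import math

def phase_binned_mean(phases, values, n_bins):
    """Mean of `values` inside each of `n_bins` equal bins of phase in [0, 2*pi)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for phase, value in zip(phases, values):
        frac = (phase % (2 * math.pi)) / (2 * math.pi)
        b = min(int(frac * n_bins), n_bins - 1)  # guard against rounding up to n_bins
        sums[b] += value
        counts[b] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]

stride_phase = [0.1, 0.2, 3.3, 3.4]    # radians within the stride cycle
abs_ang_vel = [10.0, 20.0, 2.0, 4.0]   # toy turning speeds (deg/s)

profile = phase_binned_mean(stride_phase, abs_ang_vel, n_bins=2)
```

In this toy data, turning is stronger in the first half of the cycle than in the second, which is the kind of pattern a coupling analysis is meant to reveal.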
Data Structure
LarvaDataset
dataset = run.datasets[0]
# Endpoint data (summary per larva)
print(dataset.e) # Pandas DataFrame
# Step-wise data (time-series)
print(dataset.s) # Pandas DataFrame
# Configuration
print(dataset.c) # AttrDict
Endpoint Metrics
dataset.e.columns
Available:
cum_dur: Total duration (s)
velocity_mean, scaled_velocity_mean: Mean speed (and scaled variant)
dispersion_0_60_mean: Example dispersal metric (if computed)
tortuosity_5_mean: Example tortuosity metric (if computed)
Step-wise Data
dataset.s.columns
Available:
x, y: Position
bend, front_orientation, rear_orientation: Angular kinematics
velocity, scaled_velocity: Speed (and scaled variant)
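The relationship between the two tables can be sketched without Larvaworld at all: step-wise rows hold one record per larva per time step, and endpoint values summarize each larva's rows. The example below uses plain dictionaries rather than DataFrames, and the way `cum_dur` and `velocity_mean` are derived here is an assumption chosen for illustration.

```python
from collections import defaultdict
from statistics import mean

dt = 0.1  # seconds per time step (assumed)

# Step-wise records: one row per larva per time step.
steps = [
    {"id": "larva_0", "velocity": 1.0},
    {"id": "larva_0", "velocity": 2.0},
    {"id": "larva_0", "velocity": 3.0},
    {"id": "larva_1", "velocity": 0.5},
    {"id": "larva_1", "velocity": 1.5},
]

# Group the step-wise velocities by larva.
by_larva = defaultdict(list)
for row in steps:
    by_larva[row["id"]].append(row["velocity"])

# Endpoint records: one summary row per larva.
endpoint = {
    larva_id: {"cum_dur": len(v) * dt, "velocity_mean": mean(v)}
    for larva_id, v in by_larva.items()
}
```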
Example Workflow
from larvaworld.lib.sim import ExpRun
# Run experiment
run = ExpRun(experiment="dish", N=3, duration=1.0, screen_kws={}, store_data=False)
run.simulate()
# Get dataset
dataset = run.datasets[0]
# 1. Preprocess
dataset.preprocess(
    interpolate_nans=True,
    filter_f=3.0,
    transposition="center",
)
# 2. Process
dataset.process(
    proc_keys=["angular", "spatial"],
    dsp_starts=[0],
    dsp_stops=[40, 60],
    tor_durs=[5, 10],
)
# 3. Annotate
dataset.annotate(
    anot_keys=["bout_detection", "bout_distribution"]
)
# 4. Analyze
print("=== Summary Statistics ===")
# Only describe the columns that were actually computed
wanted = ["cum_dur", "velocity_mean", "dispersion_0_60_mean", "tortuosity_5_mean"]
cols = [c for c in wanted if c in dataset.e.columns]
print(dataset.e[cols].describe() if cols else dataset.e.describe())
Saving Processed Data
from larvaworld.lib import reg
# Save dataset to HDF5 and register as a reference ID
dataset.save(refID="my_experiment")
# Load later
dataset = reg.loadRef(id="my_experiment", load=True)