Model evaluation

In this tutorial, we will describe the Model Evaluation mode, one of the Simulation modes available in larvaworld.

This mode is used to evaluate a number of larva models for similarity with a preexisting reference dataset, most often one obtained by recording real experiments.
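To make "similarity with a reference dataset" concrete, here is a minimal sketch of one generic way to compare two samples of a behavioral parameter: the two-sample Kolmogorov-Smirnov statistic, i.e. the largest gap between the two empirical cumulative distributions. This illustrates the general idea only and is not necessarily the measure larvaworld applies. The function name `ks_statistic` and all numbers below are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic (max gap between empirical CDFs)."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Hypothetical values for illustration only (not larvaworld output)
reference = [0.9, 1.0, 1.1, 1.2, 1.3]
model_close = [0.95, 1.05, 1.1, 1.25, 1.3]
model_far = [2.0, 2.1, 2.2, 2.3, 2.4]

print(ks_statistic(reference, model_close))
print(ks_statistic(reference, model_far))
```

A value of 0 means the two empirical distributions coincide, while 1 means they do not overlap at all, so under this measure a lower score indicates a model closer to the reference.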

Let’s import the relevant classes:

%load_ext param.ipython
import panel as pn

import larvaworld as lw
from larvaworld.lib import reg
from larvaworld.lib.sim.model_evaluation import EvalRun, EvalConf, DataEvaluation
# from larvaworld.lib.reg.generators import ExpConf

# Set the verbosity level to 1 (lower values print more information)
lw.VERBOSE = 1

# Tutorial safety switches (avoid GUI/heavy compute by default)
RUN_EVAL_DEMO = False
RUN_PLOTS_DEMO = False

DEMO_REF_ID = "exploration.30controls"
DEMO_MODEL_IDS = ["explorer", "navigator"]
DEMO_N = 5
DEMO_DURATION_MIN = 0.5  # >= 0.33 min recommended for 20s metrics
DEMO_SCREEN_KWS = {}
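The comment on `DEMO_DURATION_MIN` can be checked with simple arithmetic. The sketch below assumes that the "20s metrics" need at least 20 seconds of simulated time; the helper `covers_window` is hypothetical and not part of larvaworld.

```python
def covers_window(duration_min, window_s):
    """Return True if a run of `duration_min` minutes spans at least `window_s` seconds."""
    return duration_min * 60 >= window_s


print(covers_window(0.5, 20))   # True: 0.5 min is 30 s
print(covers_window(0.25, 20))  # False: 0.25 min is 15 s
```

Under this assumption the exact threshold is 20 / 60 ≈ 0.333 min, so the demo value of 0.5 min leaves a comfortable margin.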

A look at the respective configuration class makes it easy to get an idea of the involved arguments:

  • Reference dataset, designated via ID or directory

  • Larva models, retrieved via ID, along with the respective larva-group IDs and group size

  • Evaluation metrics and setup

# Show the attributes of the EvalConf class
%params EvalConf

# Show the attributes of the EvalConf class as a nested dictionary
# EvalConf.param

The preconfigured larva-model configurations can be inspected and selected by their unique IDs:

ids = reg.conf.Model.confIDs
print(ids)

The existing reference datasets can likewise be inspected via their IDs:

refIDs = reg.conf.Ref.confIDs
print(refIDs)
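Before configuring an evaluation, it can help to confirm that the requested IDs actually appear in these lists. The following is a minimal sketch using a hypothetical helper, `missing_ids`, which is not part of larvaworld. The list `available_models` is a stand-in so the snippet runs on its own; in the notebook one would pass `ids` or `refIDs` instead.

```python
def missing_ids(requested, available):
    """Return the requested IDs that are absent from `available`, preserving order."""
    known = set(available)
    return [i for i in requested if i not in known]


# Stand-in list for illustration; replace with `ids` (or `refIDs`) in the notebook
available_models = ["explorer", "navigator", "forager"]

print(missing_ids(["explorer", "navigator"], available_models))  # []
print(missing_ids(["explorer", "walker"], available_models))     # ['walker']
```

An empty result means every requested ID is available; anything else lists the IDs to correct before launching.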

The model-evaluation launcher also accepts a number of runtype arguments:

# Show the attributes of the EvalRun class
%params EvalRun

# Show the attributes of the EvalRun class as a nested dictionary
EvalRun.param

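A model-evaluation simulation can be configured and launched in a few lines. With RUN_EVAL_DEMO set to False, as above, the run is only instantiated and not simulated: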

kws = {
    "refID": DEMO_REF_ID,
    "modelIDs": DEMO_MODEL_IDS,
    "experiment": "dish",
    "N": DEMO_N,
    "duration": DEMO_DURATION_MIN,
    "screen_kws": DEMO_SCREEN_KWS,
}

r = EvalRun(**kws)
if RUN_EVAL_DEMO:
    r.simulate()

Once the simulation has run, the results can be plotted:

  • The simulated and reference datasets

  • The competing larva-models

if RUN_PLOTS_DEMO:
    r.plot_results(show=False)
    r.plot_models(show=False)