Skip to content

feat: create raw data loader and setup of repo#4

Open
arjunsridhar12345 wants to merge 23 commits into
devfrom
3-create-raw-data-loader
Open

feat: create raw data loader and setup of repo#4
arjunsridhar12345 wants to merge 23 commits into
devfrom
3-create-raw-data-loader

Conversation

@arjunsridhar12345
Copy link
Copy Markdown
Collaborator

@arjunsridhar12345 arjunsridhar12345 commented May 21, 2026

This PR attempts to address #3 by creating a module for loading the raw data using the data contract for the Dynamic Foraging experiment defined here: https://github.com/AllenNeuralDynamics/Aind.Behavior.DynamicForaging/blob/main/src/aind_behavior_dynamic_foraging/data_contract/_dataset.py.

Using this diagram as guidance: https://github.com/AllenNeuralDynamics/aind-software-docs/blob/main/diagrams/dynamic_foraging/low_level/dynamic-foraging-low-level-processing.svg

An initial example notebook has been made to try and show how to access the raw data:
https://github.com/AllenNeuralDynamics/dynamic-foraging-processing/blob/3-create-raw-data-loader/examples/raw_data_loader_example.ipynb

NOTE: The temp dataset file exists ONLY because test data has been acquired, and the test dataset I had does not have all of the streams captured that are defined in the data contract. This file will be removed once a full dataset is acquired and validated

In addition, the PR does some setup for the repo so it tries to align with the coding standards that were recently merged by a group of engineers from SIPE and Sci Comp as the current package template has not been updated yet. There may be things I missed or maybe messed up but at a high level is the following. Looking back, this might have been better off in a separate PR but these changes are relatively small and I think can be lumped in to the first PR for this repo:

  • Uses uv
  • Uses ruff instead of black and flake8 with uv
  • Runs pytest with uv
  • Runs tests on pushes to dev and main
  • Runs tests across the os matrix
  • Updates the contributing file with the above

Small note to finish: This PR targets the dev branch in accordance with the above coding standards

Another small note: I can't add branch protection rules, @dougollerenshaw could you add them or make me an admin on the repo?

Comment thread .github/workflows/test_and_lint.yml
Comment thread src/dynamic_foraging_processing/raw_data_loader/loader.py
Comment thread src/dynamic_foraging_processing/raw_data_loader/temp_dataset.py
Comment thread src/dynamic_foraging_processing/raw_data_loader/loader.py
Copy link
Copy Markdown

@seanmcculloch seanmcculloch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me, my comments were all addressed

Copy link
Copy Markdown
Collaborator

@arielleleon arielleleon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome start! Just a few small things

@@ -0,0 +1,90 @@
"""Raw data loader for dynamic foraging datasets."""
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rename this to loader.py? The names are too repetitive from dynamic_foraging_processing.raw_data_loader.raw_data_loader import RawDataLoader. The import at __init__ glosses over this but I would argue it needs to be renamed

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tried to address in 5286ee9

@@ -0,0 +1,139 @@
"""Tests for ``dynamic_foraging_processing.raw_data_loader.raw_data_loader``."""
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One you change raw_data_loader.py -> loader.py, this should become test_loader.py

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tried to address in 5286ee9


def __init__(
self,
path: os.PathLike,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make this either t.Union[ str | os.Pathlike ] or just str | os.Pathlike. You already cast it as a Path in the make_dataset

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tried to address in 5286ee9

Copy link
Copy Markdown
Collaborator

@arielleleon arielleleon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants