Add split functionality to synthetic_data by stes · Pull Request #268 · AdaptiveMotorControlLab/CEBRA

stes · 2025-09-06T12:26:50Z

During refactoring of the pivae synthetic data we did not pull over the split function, which causes task.py to fail due to these lines:

CEBRA/third_party/pivae/task.py

Lines 220 to 221 in e982248

    
    train_set.split('train')
    valid_set.split('valid')

This pulls over the split function to the main package.

TODO: check if the pivae runs now

Copilot

Pull Request Overview

This PR adds a missing split functionality to the synthetic_data module that was accidentally omitted during a refactoring of the pivae synthetic data code, which was causing failures in task.py.

Adds a split method to handle train/valid/all data splits
Implements 80/20 train/validation split logic
Provides compatibility with existing pivae task code
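
A minimal sketch of what such a split method might look like, pieced together from the diff fragments quoted in the review comments. The class name `SyntheticData`, its constructor, and the handling of the `'all'` case are assumptions made for illustration; they are not taken from cebra/datasets/synthetic_data.py.

```python
import numpy as np


class SyntheticData:
    """Hypothetical stand-in for the synthetic dataset class (name assumed)."""

    def __init__(self, neural, index):
        self.neural = np.asarray(neural)
        self.index = np.asarray(index)

    def split(self, split):
        # Positional 80/20 split: the first 80% of samples are used for
        # training, the remaining 20% for validation.
        tot_len = len(self.neural)
        cut = int(tot_len * 0.8)
        train_idx = np.arange(tot_len)[:cut]
        valid_idx = np.arange(tot_len)[cut:]

        if split == 'train':
            idx = train_idx
        elif split == 'valid':
            idx = valid_idx
        elif split == 'all':
            idx = np.arange(tot_len)  # assumed behaviour: keep every sample
        else:
            raise ValueError(f"Unknown split: {split!r}")

        # Restrict the stored arrays in place, as the diff fragments do.
        self.neural = self.neural[idx]
        self.index = self.index[idx]
        self.idx = idx
```

With this shape, the calls `train_set.split('train')` and `valid_set.split('valid')` from task.py would each restrict their own dataset object in place.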


Copilot · 2025-09-24T16:39:37Z

cebra/datasets/synthetic_data.py

+        train_idx = np.arange(tot_len)[:int(tot_len*0.8)]
+        valid_idx = np.arange(tot_len)[int(tot_len*0.8):]


[nitpick] The hardcoded 0.8 split ratio should be extracted as a configurable parameter or class constant to improve maintainability and allow for different split ratios.
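
One way the suggestion could be followed is to move the ratio into a named default and compute both index arrays in a small helper. This is only a sketch: `DEFAULT_TRAIN_FRACTION`, `split_indices`, and the `train_fraction` parameter are hypothetical names, not part of the PR.

```python
import numpy as np

DEFAULT_TRAIN_FRACTION = 0.8  # hypothetical constant replacing the literal 0.8


def split_indices(tot_len, train_fraction=DEFAULT_TRAIN_FRACTION):
    """Return (train_idx, valid_idx) for a contiguous positional split."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(tot_len * train_fraction)
    indices = np.arange(tot_len)
    return indices[:cut], indices[cut:]
```

Computing the cut point once also avoids evaluating `int(tot_len*0.8)` separately for the two slices.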

Copilot · 2025-09-24T16:39:37Z

cebra/datasets/synthetic_data.py

+        if split == 'train':
+            self.neural = self.neural[train_idx]
+            self.index = self.index[train_idx]
+            self.idx = train_idx


The method modifies instance state by setting self.idx, but this attribute is not defined in __init__. This could lead to inconsistent state if split() is called multiple times or if other code expects self.idx to always exist.

Copilot · 2025-09-24T16:39:38Z

cebra/datasets/synthetic_data.py

+        elif split == 'valid':
+            self.neural = self.neural[valid_idx]
+            self.index = self.index[valid_idx]
+            self.idx = valid_idx


The method modifies instance state by setting self.idx, but this attribute is not defined in __init__. This could lead to inconsistent state if split() is called multiple times or if other code expects self.idx to always exist.
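
A sketch of one possible way to address this concern for both branches: define `self.idx` in the constructor and refuse a second split. The class name `GuardedSyntheticData`, the `_is_split` flag, and the choice to raise on repeated calls are all assumptions for illustration, not the PR's implementation.

```python
import numpy as np


class GuardedSyntheticData:
    """Hypothetical dataset class; names and behaviour are assumed."""

    def __init__(self, neural, index):
        self.neural = np.asarray(neural)
        self.index = np.asarray(index)
        # Defined up front so the attribute always exists; it initially
        # covers every sample.
        self.idx = np.arange(len(self.neural))
        self._is_split = False

    def split(self, split):
        if self._is_split:
            # A second call would slice already-sliced data, so fail loudly.
            raise RuntimeError("split() was already applied to this dataset")
        tot_len = len(self.neural)
        cut = int(tot_len * 0.8)
        if split == 'train':
            idx = np.arange(cut)
        elif split == 'valid':
            idx = np.arange(cut, tot_len)
        elif split == 'all':
            return  # nothing to restrict; idx already covers all samples
        else:
            raise ValueError(f"Unknown split: {split!r}")
        self.neural = self.neural[idx]
        self.index = self.index[idx]
        self.idx = idx
        self._is_split = True
```

Raising on a repeated call is a deliberate simplification here; keeping an untouched copy of the original arrays and re-slicing from it would be an alternative if repeated splitting needs to be supported.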

Add split functionality to synthetic_data

46e50db

cla-bot bot added the CLA signed label Sep 6, 2025

MMathisLab requested a review from Copilot September 24, 2025 16:38

Copilot AI reviewed Sep 24, 2025


Merge branch 'main' into stes/add-split

a720f8c

MMathisLab marked this pull request as ready for review January 15, 2026 17:06


stes wants to merge 2 commits into main from stes/add-split
