New data record schema (Parquet) to replace .nc files

The current `.nc` files are simply the standard I/O format used by PyPSA. For this app they have a lot of disadvantages and were not designed for this use case. PyPSA currently can't lazy-load at all: even for a small plot, the full `.nc` file has to be loaded into memory. The schema is very PyPSA-specific, isn't cleanly structured, and can't be easily extended. ...

Proposed new data record / schema:
- Schema derived from PyPSA, but owned by the App so we can adapt it as needed.
- A data record is a single instance of that schema, populated with data.
- Processing multiple records (queries for analytics, but it could be anything) should work simply by pointing at N records, as long as the query and processing generalise over multiple dimensions. The same query definition should give me the energy balance of a single data record as well as of multiple data records combined.
- Multiple scenarios are not represented by a single record, but the same paros logic can work on multiple records, which allows processing of multiple scenarios.
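
To make the "same query, one or N records" idea concrete, here is a minimal sketch. It is not the proposed implementation: each record is modelled as an in-memory list of long-format rows standing in for a Parquet table, and `energy_balance`, the record names, and the sign convention (loads negative) are all assumptions for illustration.

```python
from collections import defaultdict

def energy_balance(records):
    """Sum `value` per (record, ComponentType).

    `records` maps a record name to its long-format rows, so a single
    record is just the N == 1 case of the same query.
    """
    totals = defaultdict(float)
    for name, rows in records.items():
        for row in rows:
            totals[(name, row["ComponentType"])] += row["value"]
    return dict(totals)

# Hypothetical rows; column names follow the layout in this issue.
record_a = [
    {"ComponentType": "Generator", "component": "gas", "snapshot": 0, "value": 70.0},
    {"ComponentType": "Generator", "component": "wind", "snapshot": 0, "value": 30.0},
    {"ComponentType": "Load", "component": "city", "snapshot": 0, "value": -100.0},
]
record_b = [
    {"ComponentType": "Generator", "component": "wind", "snapshot": 0, "value": 80.0},
    {"ComponentType": "Load", "component": "city", "snapshot": 0, "value": -80.0},
]

single = energy_balance({"A": record_a})                  # one record
both = energy_balance({"A": record_a, "B": record_b})     # same query, N records
```

Because the record name is just another grouping dimension, workflow scenarios stored as separate records can be compared without changing the query.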

```
DataRecordA/
  ├── manifest.json          # version, attribute catalog — immutable, user may not change anything here
  ├── snapshots.parquet
  ├── periods.parquet
  ├── components.parquet
  ├── scenarios.parquet      # for stochastic optimization, not workflow scenarios
  ├── data/
  │   └── <attr>.parquet     # ComponentType | component | snapshot | scenario | period | value
  └── results/
      └── <attr>.parquet     # ComponentType | component | snapshot | scenario | period | value
```

To be discussed / unclear:
- [ ] Should a data record be immutable, and if so, when? For example, by snapshotting it before or after a job in a workflow.
- [ ] Should a single data record store data of multiple results? Or would you rather create one data record per result and combine them for shared analytics?

Todos:
- [ ] Implement first version
- [ ] Draft collections of multiple data records, similar to the statistics module in PyPSA, which works on a single Network as well as on a collection of Networks.
