Merged
4 changes: 2 additions & 2 deletions docs/hands-on.md
@@ -5,12 +5,12 @@ This page provides a step-by-step guide to help you run different components of
---

### Tile-based Training Implementation
Refer to the [training component documentation](./training-cwl.md) and run multiple training jobs with various model hyperparameter.
Refer to the [training component documentation](./training-cwl.md) and run multiple training jobs with various model hyperparameters.

---

### Tile-based Inference Implementation
Check the [inference component documentation](./inference-cwl.md) and run inference using different sentinel-2 products in parallel using calrissian or once in a time using cwltool.
Check the [inference component documentation](./inference-cwl.md) and run inference using different Sentinel-2 products in parallel using `calrissian`, or one at a time using `cwltool`.

---

10 changes: 5 additions & 5 deletions docs/index.md
@@ -3,9 +3,9 @@

## Introduction

This learning resource demonstrates a machine learning system for classification of Sentinel-2 images into 10 different classes using cloud-native technologies. The system leverages MLFLOW to track the training process and select the best candidate trained model from MLFLOW server.
This learning resource demonstrates a Machine Learning (ML) system for classification of Sentinel-2 images into 10 different classes using cloud-native technologies. The system leverages MLFLOW to track the training process and selects the best candidate trained model from the MLFLOW server.

There are two workflows developed one for training a deep learning model classifier on EuroSAT dataset and one for running prediction on a real world Sentinel-2 data. The automation is achieved using Kubernetes-native tools, making the setup scalable, modular, and suitable for Earth observation and geospatial applications.
There are two workflows: one for training a deep learning model classifier on the EuroSAT dataset, and one for running predictions on real-world Sentinel-2 data. The automation is achieved using Kubernetes-native tools, making the setup scalable, modular, and suitable for Earth Observation and geospatial applications.



@@ -15,8 +15,8 @@ This setup integrates the following technologies and concepts:

### MLFLOW

* Manage end-to-end ML workflows, from development to production
* End-to-end MLOps solution for traditional ML, including integrations with traditional ML models, and Deep learning one.
* Manages end-to-end ML workflows, from development to production
* End-to-end MLOps solution for traditional ML, including integrations with traditional ML models and deep learning ones
* Simple, low-code performance tracking with autologging
* State-of-the-art UI for model analysis and comparison

@@ -28,7 +28,7 @@ This setup integrates the following technologies and concepts:

The system is designed to handle the following flow:

1. Training pipeline: A CNN model trained on [EuroSAT](https://github.com/phelber/EuroSAT) dataset which already exist on a dedicated STAC endpoint. The MLFLOW track the whole process to monitor the life cycle of training.
1. Training pipeline: A CNN model is trained on the [EuroSAT](https://github.com/phelber/EuroSAT) dataset, which already exists on a dedicated STAC endpoint. MLFLOW tracks the whole process to monitor the life cycle of training.

2. Inference: Run the inference pipeline to perform tile-based classification on Sentinel-2 L1C products.

42 changes: 15 additions & 27 deletions docs/inference-container.md
@@ -1,18 +1,22 @@
# Inference container:

This module enables users to create an inference pipeline that take a Sentinel-2 STAC Item from the [Planetary Computer](https://planetarycomputer.microsoft.com/api/stac/v1/collections), and generates a binary mask TIFF image using a pre-trained CNN model. For details on how the model was trained, refer to the [training container documentation](./training-container.md).
This module enables users to create an inference pipeline that takes a Sentinel-2 STAC Item from the [Planetary Computer](https://planetarycomputer.microsoft.com/api/stac/v1/collections), and generates a binary mask TIFF image using a pre-trained CNN model. For details on how the model was trained, refer to the [training container documentation](./training-container.md).



## **Make Inference Module:**
## **`Make Inference` Module:**

**Inputs**:
- `input_reference`: The reference to a Sentinel-2 product on [planetary computer](https://planetarycomputer.microsoft.com/api/stac/v1/collections). The application will give you an accurate result if the sentinel-2 product has no/low cloud-cover.

- `input_reference`: A list of Sentinel-2 product references from [Planetary Computer](https://planetarycomputer.microsoft.com/api/stac/v1/collections). Note: the inference application provides accurate results only when the Sentinel-2 product has low or no cloud cover. High cloud coverage may significantly reduce prediction accuracy.

**Outputs**:

- `{STAC_ITEM_ID}_classified.tif`: A binary `.tif` image in `COG` format classifies:
- `{STAC_ITEM_ID}_classified.tif`: A binary `.tif` image in `COG` format containing the full-resolution land cover classification predicted by the model, with each pixel assigned to a land cover class as defined in the table below.
- `overview_{STAC_ITEM_ID}_classified.tif`: A binary `.tif` image in `COG` format containing a lower-resolution overview of the classification result, generated to support fast visualisation and efficient browsing across zoom levels.
- `STAC objects`: STAC objects related to the provided masks, including STAC Catalog and STAC Item.

*Land Cover Classes*
| Class ID | Class Name |
|----------|-----------------------|
| 0 | AnnualCrop |
@@ -27,28 +31,12 @@ This module enables users to create an inference pipeline that take a Sentinel-2
| 9 | SeaLake |
| 10 | No Data |

- `overview_{STAC_ITEM_ID}_classified.tif`: A binary `.tif` image in `COG` format classifies:

| Class ID | Class Name |
|----------|-----------------------|
| 0 | AnnualCrop |
| 1 | Forest |
| 2 | HerbaceousVegetation |
| 3 | Highway |
| 4 | Industrial |
| 5 | Pasture |
| 6 | PermanentCrop |
| 7 | Residential |
| 8 | River |
| 9 | SeaLake |
| 10 | No Data |

- `STAC objects`: STAC objects related to the provided masks, including STAC catalog and STAC Item.

## How the Application Works

The application begins by reading a Sentinel-2 STAC Item from the [Planetary Computer](https://planetarycomputer.microsoft.com/api/stac/v1/collections). It then filters and selects 12 specific asset references in the order expected by the machine learning model. These assets correspond to common Sentinel-2 bands, as shown below:
The application begins by reading the input Sentinel-2 STAC Item(s) from the [Planetary Computer](https://planetarycomputer.microsoft.com/api/stac/v1/collections) and then extracting the 12 common Sentinel-2 spectral bands (see table below), ordered to match those expected by the trained ML model.

*Sentinel-2 Spectral Bands*
| Index | Asset Key | Asset Common Name |
|-------|------------|-------------------|
| 1 | B01 | Coastal |
@@ -64,11 +52,11 @@ The application begins by reading a Sentinel-2 STAC Item from the [Planetary Com
| 11 | B11 | SWIR 1 (16) |
| 12 | B12 | SWIR 2 (22) |

As a preprocessing step, all selected assets are resampled to a uniform resolution of 10 meters.
As part of the preprocessing, all selected bands are resampled to a consistent spatial resolution of 10 meters.

The pipeline then proceeds with a sliding window approach: it reads and stacks small image chips from the selected bands in the order listed above. These chips are fed into a trained CNN model, which predicts the corresponding class for each chip.
The pipeline then proceeds with a sliding window approach: it reads and stacks small image chips from the resampled bands (in the specified order), forming multi-band input arrays. These image chips are fed to the trained CNN model, which predicts the corresponding land cover (LC) class for each chip.

Finally, the application generates:
- The classification prediction map (as a GeoTIFF mask)
At the end of the process, the application generates:
- The LC classification prediction map (COG mask)
- A visual overview image
- An updated STAC item containing metadata and references to the output files
- An updated STAC Catalog and Item containing metadata and references to the output files.
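
The chip-by-chip loop described in "How the Application Works" can be sketched roughly as follows. This is a minimal illustration rather than the application's actual code: the names `iter_windows` and `classify_raster` are hypothetical, the chip size is left as a parameter because the text does not state it, the windows are assumed not to overlap (stride equal to the chip size), and `predict` is a stand-in callable for the trained CNN. The default `nodata` value of 10 follows the "No Data" row of the class table.

```python
from typing import Callable, Iterator, List, Tuple

# (row_off, col_off, height, width), all in pixels
Window = Tuple[int, int, int, int]


def iter_windows(height: int, width: int, chip: int) -> Iterator[Window]:
    """Yield non-overlapping chip windows covering a height x width raster."""
    if chip <= 0:
        raise ValueError("chip must be positive")
    for row in range(0, height, chip):
        for col in range(0, width, chip):
            # Edge windows are clipped so they never run past the raster.
            yield row, col, min(chip, height - row), min(chip, width - col)


def classify_raster(
    height: int,
    width: int,
    chip: int,
    predict: Callable[[Window], int],
    nodata: int = 10,
) -> List[List[int]]:
    """Fill a class map chip by chip; `predict` stands in for the CNN."""
    # Start from an all-"No Data" mask and overwrite it window by window.
    mask = [[nodata] * width for _ in range(height)]
    for window in iter_windows(height, width, chip):
        row, col, h, w = window
        label = predict(window)
        for r in range(row, row + h):
            mask[r][col:col + w] = [label] * w
    return mask
```

In a real pipeline, `predict` would read the stacked 12-band chip for the given window and return the class ID predicted by the model. Here it only receives the window, so the tiling logic can be checked in isolation.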
22 changes: 10 additions & 12 deletions docs/inference-cwl.md
@@ -7,29 +7,29 @@ This Application Package provides a CWL document that performs inference by appl
To execute the application, users have the option to use either [cwltool](https://github.com/common-workflow-language/cwltool) or [Calrissian](https://github.com/Duke-GCB/calrissian) as the CWL runner.

## Inputs:
- `input_reference`: A list of reference to a Sentinel-2 product on [planetary computer](https://planetarycomputer.microsoft.com/api/stac/v1/collections). The application will give you an accurate result if the sentinel-2 product has no/low cloud-cover.
- `input_reference`: A list of Sentinel-2 product references from [Planetary Computer](https://planetarycomputer.microsoft.com/api/stac/v1/collections). Note: the inference application provides accurate results only when the Sentinel-2 product has low or no cloud cover. High cloud coverage may significantly reduce prediction accuracy.

## How to Execute the Application Package

Before running the application with a CWL runner, make sure to download and use the latest version of the CWL document:

```bash
cd /workspace/machine-learning-process/inference/app-package
VERSION="0.0.4"
cd inference/app-package
VERSION=$(curl -s https://api.github.com/repos/eoap/machine-learning-process/releases/latest | jq -r '.tag_name')
curl -L -o "tile-sat-inference.cwl" \
"https://github.com/eoap/machine-learning-process/releases/download/${VERSION}/tile-sat-inference.${VERSION}.cwl"
```

### **Run the Application Package**:
There are two methods to execute the application:

- Executing the `tile-sat-inference` app using `cwltool`:
- Executing `tile-sat-inference` using `cwltool`:

```bash
cwltool --podman --debug --parallel tile-sat-inference.cwl#tile-sat-inference params.yml
```

- Executing the `tile-sat-inference` using `calrissian`:
- Executing `tile-sat-inference` using `calrissian`:

```bash

@@ -39,16 +39,14 @@ There are two methods to execute the application:
>
> `kubectl get pods`

## How the CWL document designed:
The CWL file can be triggered using `cwltool` or `calrissian`. The user provides a `params.yml` file that passes all inputs needed by the CWL file to execute the module. The CWL file is designed to execute the module based on the structure below:
## How the CWL document is designed:
The CWL file can be triggered using `cwltool` or `calrissian`. The execution requires a `params.yml` file, which supplies all the necessary inputs defined in the CWL specification. The workflow is structured to run the module according to the diagram outlined below:

![Inference Workflow](imgs/inference.png)
![image](imgs/inference.png "Inference Workflow")

> **`[]`** in the image above indicates that the user may pass a list of parameters to the application package.

The Application Package will generate a list of directories containing intermediate or final output. The number of folders containing a `{STAC_ITEM_ID}_classified.tif` and the corresponding STAC objects, such as STAC Catalog and STAC Item, depends on the number of input Sentinel-2 items.
The Application Package will generate a number of directories containing intermediate and final outputs. Each directory will contain a `{STAC_ITEM_ID}_classified.tif` file, along with the corresponding STAC objects (i.e. the STAC Catalog and STAC Item). The number of directories depends on the number of input Sentinel-2 products provided.
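
As a rough illustration of the `params.yml` input file mentioned above, the sketch below renders one from a list of Sentinel-2 references. It assumes that `input_reference` is passed as a plain list of strings. The helper name `render_params` and the example URLs are hypothetical and do not come from the repository.

```python
import json
from typing import List


def render_params(input_references: List[str]) -> str:
    """Render a params.yml body with one `input_reference` entry per product."""
    if not input_references:
        raise ValueError("at least one Sentinel-2 reference is required")
    lines = ["input_reference:"]
    for ref in input_references:
        # json.dumps gives a double-quoted string, which YAML also accepts.
        lines.append(f"  - {json.dumps(ref)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder URLs for illustration only; replace with real STAC Item links.
    refs = [
        "https://example.com/stac/items/ITEM_A",
        "https://example.com/stac/items/ITEM_B",
    ]
    with open("params.yml", "w", encoding="utf-8") as handle:
        handle.write(render_params(refs))
```

With two references in the list, the package would be expected to produce two output directories, one per input product, as described above.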


## Troubleshooting

The user might encounter to memory issues during the execution with CWL Runners(especially with the `cwltool`). This can be addressing by reducing the `ramMax`(e.g. `ramMax: 1000`) parameter in the cwl file.
Users might encounter memory-related issues when executing workflows with CWL Runners (especially with `cwltool`). These issues can often be mitigated by reducing the `ramMax` parameter (e.g. `ramMax: 1000`) specified in the CWL file, which can help prevent excessive memory allocation.
22 changes: 8 additions & 14 deletions docs/insights.md
@@ -10,34 +10,28 @@ It also includes recommendations for future improvements and practical advice fo

### Modular Workflow Templates

Decision: Separate the CWL execution, training pipeline, and inference pipeline into distinct workflow templates.
* Decision: Separate the CWL execution, training pipeline, and inference pipeline into distinct workflow templates.

Outcome:
* Enhanced reusability for other geospatial pipelines requiring similar preprocessing steps.
* Outcome: Enhanced reusability for other geospatial pipelines requiring similar preprocessing steps.

### STAC Integration

Decision: Leverage the STAC API, Geoparquet, and DuckDB for querying and storing geospatial data.

Outcome:
* Improved interoperability with other geospatial tools and standards.
* Decision: Leverage the STAC API, Geoparquet, and DuckDB for querying and storing geospatial data.
* Outcome: Improved interoperability with other geospatial tools and standards.

### Tracking the process

Decision: Use MLFLOW exclusively for tracking the process of training workflow and selecting the best model candidate.
* Decision: Use MLFLOW exclusively for tracking the training workflow and selecting the best model candidate.

### Test inference with Sentinel-2 product
Decision: Use Stars tool to stage-in a sentinel-2 product ready to pass to inference module.

* Decision: Use the Stars tool to stage in a Sentinel-2 product, ready to pass to the inference module.

## Challenges and Solutions


### Build Docker Images

Challenge: Initially, we used an [advanced tooling technique](https://github.com/eoap/advanced-tooling) that leveraged **Taskfile** to build a Kaniko-based image and reference the CWL files. The image was then pushed to [ttl.sh](https://ttl.sh/), a temporary image registry. This will help us to execute the application packages using calrissian. However, this process was slow and hard to debug, often failing due to the large size of the Kaniko images.

Solution: We now push the Docker images to a dedicated GitHub Container Registry.
* Challenge: Initially, we used an [advanced tooling technique](https://github.com/eoap/advanced-tooling) that leveraged **Taskfile** to build a Kaniko-based image and reference the CWL files. The image was then pushed to [ttl.sh](https://ttl.sh/), a temporary image registry, so that the application packages could be executed using `calrissian`. However, this process was slow and hard to debug, often failing due to the large size of the Kaniko images.
* Solution: We now push the Docker images to a dedicated GitHub Container Registry.



11 changes: 6 additions & 5 deletions docs/mlm.md
@@ -1,15 +1,16 @@
# Describes a trained machine learning model
This Item describe a trained machine learning model using [MLM](https://github.com/stac-extensions/mlm) STAC extension. The STAC Machine Learning Model (MLM) Extension provides a standard set of fields to describe machine learning models trained on overhead imagery and enable running model inference.
# Describes a trained Machine Learning model

This tutorial describes a trained Machine Learning model using the [MLM](https://github.com/stac-extensions/mlm) STAC extension. The STAC MLM Extension provides a standard set of fields to describe machine learning models trained on overhead imagery and enable running model inference.

The main objectives of the extension are:

- to enable building model collections that can be searched alongside associated STAC datasets
- record all necessary bands, parameters, modeling artifact locations, and high-level processing steps to deploy an inference service.
- to enable building model collections that can be searched alongside associated STAC datasets;
- to record all necessary bands, parameters, modeling artifact locations, and high-level processing steps to deploy an inference service.

For additional information please follow this [Describe-MLmodel](./Describe-MLmodel.md) notebook.

## For developers:
To run the notebook successfully, you must install the dependencies with hatch:
To run the notebook successfully, you must install the dependencies with `hatch`:

```
hatch shell prod
2 changes: 1 addition & 1 deletion docs/packages.md
@@ -7,4 +7,4 @@ This tutorial provides two separate application packages:

Each application package has its own Docker image, which has been published to a dedicated GitHub Container Registry.

For more details on how each package works, refer to the documentation for [training](./training-container.md) and [inference](./inference-container.md).
For more details on how each package works, refer to the Reference Guides for [training](./training-container.md) and [inference](./inference-container.md).