3 changes: 3 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -225,6 +225,9 @@ txc data pkg import ./data-package \
--prefetch-limit 100 # pre-cache record lookups

txc data pkg convert --input export.xlsx --output data.xml

# Tear down records inserted by a previous import (handy for CI test teardown):
txc data pkg cleanup ./data-package --yes
```

See [docs/configuration-migration.md](docs/configuration-migration.md) for the full deep-dive into CMT internals, deduplication logic, and tuning strategies.
43 changes: 42 additions & 1 deletion docs/configuration-migration.md
@@ -20,6 +20,7 @@ The **Configuration Migration Tool (CMT)** is a Microsoft utility for migrating
| Safety-check override | ✅ `--override-safety-checks` | ❌ |
| Prefetch tuning | ✅ `--prefetch-limit` | ❌ |
| XLSX → CMT XML conversion | ✅ `txc data package convert` | ❌ |
| Cleanup of records produced by a package | ✅ `txc data package cleanup` | ❌ |
| Authentication | `txc` profiles | PAC auth profiles |

### When to Use CMT
@@ -29,6 +30,7 @@ Use CMT / `txc data package` when you need to:
- Move **reference/configuration data** (currencies, business units, security roles, option-set seed data) between environments.
- Preserve **record GUIDs** across environments so that lookups and relationships remain intact.
- Automate data seeding in CI/CD pipelines.
- Tear down test data inserted by a previous package import — see [`txc data package cleanup`](#cleanup---delete-records-produced-by-a-package).
- Migrate **file columns** and **image columns** between environments.

For **bulk transactional data** or **ETL workloads**, consider the Dataverse Import Data Wizard, Azure Data Factory, or SSIS instead.
@@ -155,6 +157,45 @@ txc data package import data.zip \
--profile staging
```

### Cleanup — delete records produced by a package

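Delete every record listed in a data package from the target environment, typically to tear down what a previous import created. This is useful in automated tests that import a fixture before each suite and remove it afterwards. Every record listed in the package is targeted, including records that already existed before the import.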

```
txc data package cleanup <path> [options]
```

| Argument / Option | Alias | Required | Default | Description |
|---|---|---|---|---|
| `<path>` *(argument)* | — | **Yes** | — | Path to the CMT data package (`.zip` file or folder containing `data.xml` and `data_schema.xml`). |
| `--connection-count <N>` | — | No | `1` | Open N parallel `ServiceClient` connections; entities are sharded across them. Higher values speed up cleanup of many small entities at the cost of more concurrent throttle pressure. |
| `--batch-size <N>` | — | No | `200` | How many `DeleteRequest` messages to send per `ExecuteMultiple` batch. Lower is safer, higher is faster. |
| `--dry-run` | — | No | `false` | Parse the package and report what would be deleted without issuing any `DeleteRequest`. |
| `--missing-action <value>` | — | No | `by-natural-key` | What to do when a record can't be deleted by its GUID: `by-natural-key` (look it up via `primarynamefield` + every `updateCompare="true"` field), `skip` (count as not-found), or `fail` (abort the run). |
| `--continue-on-error` | — | No | `true` | Keep going after the first per-record failure. Set to `false` to abort on the first error. |
| `--yes` | — | No | `false` | Skip the interactive confirmation prompt. Required in non-interactive sessions because the command is destructive. |
| `--allow-production` | — | No | `false` | Inherited; required when the target profile is detected as Production. |
| `--profile <name>` | `-p` | No | *(active profile)* | Profile name to resolve. |
| `--verbose` | — | No | `false` | Emit verbose logging for this invocation. |
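
When the GUIDs in the package are expected to match the environment exactly, the natural-key fallback can be switched off. The sketch below wraps the command in a helper named `cleanup_by_guid_only`; the function name is hypothetical and not part of `txc`, and only flags from the table above are used.

```bash
# Hypothetical helper (not part of txc): delete strictly by GUID.
# Usage: cleanup_by_guid_only <package-path> <profile>
cleanup_by_guid_only() {
  local package="$1" profile="$2"
  # --missing-action skip: a record the server cannot find by GUID is counted
  # as not-found; no natural-key lookup is attempted.
  # --yes: skips the confirmation prompt, which a CI job cannot answer.
  txc data pkg cleanup "$package" --missing-action skip --profile "$profile" --yes
}
```

For example, `cleanup_by_guid_only ./data.zip dev` runs the cleanup against the `dev` profile without ever matching records by name.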

**How it works:**

1. The package is parsed (folder or `.zip`) and entities are processed in the **reverse** of the schema's `<entityImportOrder>`, so child records are deleted before their parents.
2. For each entity, `DeleteRequest` messages are batched through `ExecuteMultiple`. Records the server can't find by GUID fall through to a `QueryExpression` lookup against the schema's natural-key columns (the `primarynamefield` plus every field marked `updateCompare="true"`); an exact single match is then deleted.
3. Any `<m2mrelationship>` blocks in `data.xml` are issued as `DisassociateRequest`s before the endpoint records are deleted, so cleanup also works when only one side of the N:N relationship lives in the package.
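
The ordering and batching in steps 1 and 2 can be illustrated with two small shell helpers. This is only a sketch: `reverse_lines` and `batch_count` are hypothetical names that are not part of `txc`, the entity names in the usage note are made up, and the batch arithmetic assumes each `ExecuteMultiple` batch carries at most `--batch-size` delete requests.

```bash
# Illustration only: these helpers mimic the ordering and batching described
# above; they are not part of txc.

# Print the lines read from stdin in reverse order (portable alternative to `tac`).
reverse_lines() {
  awk '{ line[NR] = $0 } END { for (i = NR; i > 0; i--) print line[i] }'
}

# Number of batches needed for RECORDS deletes at BATCH_SIZE per batch
# (integer ceiling division; defaults to the documented batch size of 200).
batch_count() {
  local records="$1" batch_size="${2:-200}"
  echo $(( (records + batch_size - 1) / batch_size ))
}
```

With a hypothetical import order of `businessunit`, `account`, `contact`, running `printf '%s\n' businessunit account contact | reverse_lines` prints `contact`, `account`, `businessunit`, which is the order cleanup would use. `batch_count 450` prints `3`: two full batches of 200 and one of 50.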

**Example — clean up test data after an integration test:**

```bash
txc data pkg cleanup ./fixtures/seed-data --profile dev --yes
```

**Example — preview without touching the environment:**

```bash
txc data pkg cleanup ./data.zip --dry-run --profile dev --yes
```
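
**Example (sketch): import a fixture, run tests, always clean up:**

The fixture pattern described at the top of this section could be wrapped as follows. This is a minimal sketch under stated assumptions: `with_fixture` is a hypothetical helper name, the test command is whatever the caller passes in, and `--profile` is assumed to work on `import` as in the import example earlier in this document.

```bash
# Hypothetical helper (not part of txc).
# Usage: with_fixture <package-path> <profile> <test-command> [args...]
with_fixture() {
  local package="$1" profile="$2"
  shift 2

  # Seed the environment; give up early if the import itself fails.
  txc data pkg import "$package" --profile "$profile" || return "$?"

  # Run the caller's test command and remember its exit status.
  local test_status=0
  "$@" || test_status=$?

  # Clean up whether or not the tests passed.
  local cleanup_status=0
  txc data pkg cleanup "$package" --profile "$profile" --yes || cleanup_status=$?

  # Report a test failure first; otherwise surface a cleanup failure.
  if [ "$test_status" -ne 0 ]; then
    return "$test_status"
  fi
  return "$cleanup_status"
}
```

A call such as `with_fixture ./fixtures/seed-data dev dotnet test` would import the package, run the tests, and remove the fixture even when the tests fail.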

### `txc data package convert`

Convert tables from an XLSX file to CMT data package XML.
@@ -839,7 +880,7 @@ This preserves the hour/minute/second component while shifting the date.

### `deleteBeforeAdd` is dead code

- The CMT API accepts a `deleteBeforeAdd` parameter that is supposed to delete all existing records before importing. **This parameter exists in the code but is never actually executed** — the delete logic is unreachable. Do not rely on it. If you need a clean slate, truncate the target entity manually before import.
+ The CMT API accepts a `deleteBeforeAdd` parameter that is supposed to delete all existing records before importing. **This parameter exists in the code but is never actually executed** — the delete logic is unreachable. Do not rely on it. If you need a clean slate, use [`txc data package cleanup`](#cleanup---delete-records-produced-by-a-package) to tear down a previous import, or truncate the target entity manually.

### Image columns work despite documentation

4 changes: 4 additions & 0 deletions docs/data-plane.md
@@ -210,6 +210,9 @@ For schema-driven dataset migration — exporting a curated slice of configurati
txc data pkg export --schema ./data_schema.xml --output ./data-package --export-files
txc data pkg import ./data-package
txc data pkg convert --input export.xlsx --output data.xml

# Tear down everything a previous import created (CI teardown, test fixtures):
txc data pkg cleanup ./data-package --yes
```

See [configuration-migration.md](configuration-migration.md) for the full deep-dive: deduplication logic, batching, parallel channels, prefetch tuning, and other options not exposed by PAC CLI or the CMT GUI.
@@ -227,6 +230,7 @@ See [configuration-migration.md](configuration-migration.md) for the full deep-d
| All-or-nothing semantics (rollback on any failure) | `record … --stage` × N, then `txc env changeset apply --strategy transaction` |
| Heterogeneous mix, no rollback, but want a single round-trip | `record … --stage` × N, then `txc env changeset apply --strategy batch` |
| Schema-driven dataset migration between environments | `txc data pkg export` / `import` (CMT) |
| Tear down records inserted by a previous CMT import (CI test teardown) | `txc data pkg cleanup` |

---

66 changes: 66 additions & 0 deletions src/TALXIS.CLI.Core/Contracts/Dataverse/IDataPackageService.cs
@@ -16,6 +16,56 @@ public sealed record DataPackageExportResult(
string? ErrorMessage,
bool InteractiveAuthRequired);

/// <summary>
/// What to do when a record listed in the package cannot be deleted by its
/// primary-key GUID (server returns <c>ObjectDoesNotExist</c>).
/// </summary>
public enum DataPackageCleanupMissingAction
{
/// <summary>Look the record up by the schema's natural-key columns and delete that match if found.</summary>
ByNaturalKey = 0,
/// <summary>Count the record as not-found and move on.</summary>
Skip = 1,
/// <summary>Abort the run on the first miss.</summary>
Fail = 2,
}

/// <summary>
/// Per-call tuning options for <see cref="IDataPackageService.CleanupAsync"/>.
/// </summary>
public sealed record DataPackageCleanupOptions(
int BatchSize,
int ConnectionCount,
bool DryRun,
DataPackageCleanupMissingAction MissingAction,
bool ContinueOnError);

/// <summary>
/// Per-entity cleanup statistics returned by <see cref="IDataPackageService.CleanupAsync"/>.
/// </summary>
public sealed record DataPackageCleanupEntityResult(
string EntityLogicalName,
int Total,
int DeletedByGuid,
int DeletedByNaturalKey,
int NotFound,
int Errors,
IReadOnlyList<string> ErrorMessages);

/// <summary>
/// Outcome returned by <see cref="IDataPackageService.CleanupAsync"/>.
/// </summary>
public sealed record DataPackageCleanupResult(
bool Succeeded,
string? ErrorMessage,
bool InteractiveAuthRequired,
IReadOnlyList<DataPackageCleanupEntityResult> EntityResults,
int TotalDeletedByGuid,
int TotalDeletedByNaturalKey,
int TotalNotFound,
int TotalErrors,
int M2mDisassociations);

/// <summary>
/// Imports and exports Configuration-Migration-Tool (CMT) data packages
/// for the Dataverse environment referenced by a profile. Hides subprocess
@@ -41,4 +91,20 @@ Task<DataPackageExportResult> ExportAsync(
bool exportFiles,
bool verbose,
CancellationToken ct);

/// <summary>
/// Deletes every record described by a CMT data package from the live
/// environment referenced by <paramref name="profileName"/>. Entities are
/// processed in reverse <c>&lt;entityImportOrder&gt;</c>. Deletes first
/// dispatch by the record's GUID (<c>&lt;record id&gt;</c>); on
/// <c>ObjectDoesNotExist</c> the schema's primary-name field and every
/// <c>updateCompare="true"</c> field are used as a natural-key fallback
/// per <paramref name="options"/>.
/// </summary>
Task<DataPackageCleanupResult> CleanupAsync(
string? profileName,
string dataPackagePath,
DataPackageCleanupOptions options,
bool verbose,
CancellationToken ct);
}
187 changes: 187 additions & 0 deletions src/TALXIS.CLI.Features.Data/DataPackageCleanupCliCommand.cs
@@ -0,0 +1,187 @@
using System.ComponentModel;
using System.Text.Json;
using DotMake.CommandLine;
using Microsoft.Extensions.Logging;
using TALXIS.CLI.Core;
using TALXIS.CLI.Core.Abstractions;
using TALXIS.CLI.Core.Contracts.Dataverse;
using TALXIS.CLI.Core.DependencyInjection;
using TALXIS.CLI.Logging;

namespace TALXIS.CLI.Features.Data;

[CliDestructive("Permanently deletes every record listed in the CMT data package from the target Dataverse environment.")]
[CliLongRunning]
[CliWorkflow("data-operations")]
[CliCommand(
Name = "cleanup",
Description = "Deletes every record contained in a CMT data package from the LIVE Dataverse environment referenced by the active profile. Intended for tearing down test data inserted by a previous 'data package import'."
)]
public class DataPackageCleanupCliCommand : ProfiledCliCommand, IDestructiveCommand
{
protected override ILogger Logger { get; } = TxcLoggerFactory.CreateLogger(nameof(DataPackageCleanupCliCommand));

[CliArgument(Description = "Path to the CMT data package (.zip file or folder containing data.xml and data_schema.xml)")]
public required string Data { get; set; }

[CliOption(Name = "--connection-count", Description = "How many parallel connections to open. Entities are sharded across connections — higher values speed up cleanup of many small entities at the cost of more concurrent throttle pressure.", Required = false)]
[DefaultValue(1)]
public int ConnectionCount { get; set; } = 1;

[CliOption(Name = "--batch-size", Description = "How many DeleteRequest messages to send per ExecuteMultiple batch. Lower is safer, higher is faster.", Required = false)]
[DefaultValue(200)]
public int BatchSize { get; set; } = 200;

[CliOption(Name = "--dry-run", Description = "Parse the package and report what would be deleted without issuing any DeleteRequest.", Required = false)]
[DefaultValue(false)]
public bool DryRun { get; set; }

[CliOption(Name = "--missing-action", Description = "What to do when a record can't be deleted by its GUID: by-natural-key (look it up via primary-name + updateCompare fields), skip (count as not-found), or fail (abort the run).", Required = false)]
[DefaultValue("by-natural-key")]
public string MissingAction { get; set; } = "by-natural-key";

[CliOption(Name = "--continue-on-error", Description = "Keep going after the first per-record failure. Default true — set to false to abort on the first error.", Required = false)]
[DefaultValue(true)]
public bool ContinueOnError { get; set; } = true;

[CliOption(Name = "--yes", Description = "Skip interactive confirmation for this destructive operation.", Required = false)]
public bool Yes { get; set; }

protected override async Task<int> ExecuteAsync()
{
if (string.IsNullOrWhiteSpace(Data))
{
Logger.LogError("A path to a CMT data package (.zip or folder) must be provided.");
return ExitValidationError;
}

if (!File.Exists(Data) && !Directory.Exists(Data))
{
Logger.LogError("Data package not found: {DataPath}", Data);
return ExitValidationError;
}

if (BatchSize <= 0)
{
Logger.LogError("--batch-size must be greater than zero (got {BatchSize}).", BatchSize);
return ExitValidationError;
}

if (ConnectionCount <= 0)
{
Logger.LogError("--connection-count must be greater than zero (got {ConnectionCount}).", ConnectionCount);
return ExitValidationError;
}

if (!TryParseMissingAction(MissingAction, out var missingAction))
{
Logger.LogError("--missing-action must be one of: by-natural-key, skip, fail (got '{Value}').", MissingAction);
return ExitValidationError;
}

var service = TxcServices.Get<IDataPackageService>();
var options = new DataPackageCleanupOptions(BatchSize, ConnectionCount, DryRun, missingAction, ContinueOnError);

var result = await service.CleanupAsync(Profile, Data, options, Verbose, CancellationToken.None).ConfigureAwait(false);

if (result.InteractiveAuthRequired)
{
Logger.LogError("Interactive authentication is required. Run 'txc config auth login' for profile '{Profile}' and retry.", Profile ?? "(default)");
OutputFormatter.WriteResult("failed", "Interactive authentication required.", exitCode: ExitError);
return ExitError;
}

if (result.ErrorMessage is not null)
{
Logger.LogError("{ErrorMessage}", result.ErrorMessage);
OutputFormatter.WriteResult("failed", result.ErrorMessage, exitCode: ExitError);
return ExitError;
}

EmitReport(result);

if (!result.Succeeded)
{
Logger.LogError(
"Data package cleanup completed with errors. Deleted: {DeletedByGuid} by id, {DeletedByNaturalKey} by natural key. Not found: {NotFound}. Errors: {Errors}.",
result.TotalDeletedByGuid, result.TotalDeletedByNaturalKey, result.TotalNotFound, result.TotalErrors);
OutputFormatter.WriteResult("failed", $"{result.TotalErrors} record(s) failed to delete.", exitCode: ExitError);
return ExitError;
}

var summary = DryRun
? $"Dry run: would delete {result.TotalDeletedByGuid} record(s) and disassociate {result.M2mDisassociations} M:N pair(s)."
: $"Deleted {result.TotalDeletedByGuid + result.TotalDeletedByNaturalKey} record(s) ({result.TotalDeletedByNaturalKey} via natural-key fallback). {result.TotalNotFound} not found. Disassociated {result.M2mDisassociations} M:N pair(s).";
OutputFormatter.WriteResult("succeeded", summary);
return ExitSuccess;
}

private void EmitReport(DataPackageCleanupResult result)
{
if (!OutputContext.IsJson)
{
foreach (var entity in result.EntityResults)
{
if (entity.Total == 0)
continue;
Logger.LogInformation(
"{Entity}: {Total} record(s) — deleted {DeletedByGuid} by id, {DeletedByNaturalKey} by natural key, {NotFound} not found, {Errors} error(s).",
entity.EntityLogicalName, entity.Total, entity.DeletedByGuid, entity.DeletedByNaturalKey, entity.NotFound, entity.Errors);
foreach (var message in entity.ErrorMessages)
Logger.LogWarning(" {ErrorMessage}", message);
}
return;
}

var payload = new
{
entities = result.EntityResults.Select(e => new
{
entity = e.EntityLogicalName,
total = e.Total,
deletedByGuid = e.DeletedByGuid,
deletedByNaturalKey = e.DeletedByNaturalKey,
notFound = e.NotFound,
errors = e.Errors,
errorMessages = e.ErrorMessages,
}).ToArray(),
totals = new
{
deletedByGuid = result.TotalDeletedByGuid,
deletedByNaturalKey = result.TotalDeletedByNaturalKey,
notFound = result.TotalNotFound,
errors = result.TotalErrors,
m2mDisassociations = result.M2mDisassociations,
},
dryRun = DryRun,
};
OutputFormatter.WriteData(payload);
}

/// <summary>
/// Pure helper kept for test coverage: parses the textual value passed to
/// <c>--missing-action</c> into the corresponding enum.
/// </summary>
public static bool TryParseMissingAction(string? value, out DataPackageCleanupMissingAction action)
{
switch (value?.Trim().ToLowerInvariant())
{
case null:
case "":
case "by-natural-key":
case "natural-key":
case "naturalkey":
action = DataPackageCleanupMissingAction.ByNaturalKey;
return true;
case "skip":
action = DataPackageCleanupMissingAction.Skip;
return true;
case "fail":
action = DataPackageCleanupMissingAction.Fail;
return true;
default:
action = DataPackageCleanupMissingAction.ByNaturalKey;
return false;
}
}
}
2 changes: 1 addition & 1 deletion src/TALXIS.CLI.Features.Data/DataPackageCliCommand.cs
@@ -6,7 +6,7 @@ namespace TALXIS.CLI.Features.Data;
Name = "package",
Alias = "pkg",
Description = "Configuration migration tool (CMT) for moving data between different environments",
- Children = new[] { typeof(DataPackageImportCliCommand), typeof(DataPackageExportCliCommand) },
+ Children = new[] { typeof(DataPackageImportCliCommand), typeof(DataPackageExportCliCommand), typeof(DataPackageConvertCliCommand), typeof(DataPackageCleanupCliCommand) },
ShortFormAutoGenerate = CliNameAutoGenerate.None
)]
public class DataPackageCliCommand