add srelib poc #922
Skipping CI for Draft Pull Request.
Walkthrough
The PR integrates a HashiCorp go-plugin-backed SRE library client. A new
Changes
srelib Plugin Client Integration
Sequence Diagram(s)
sequenceDiagram
participant main
participant resolveSrelibPlugin
participant srelibpkg as pkg/srelib.NewClient
participant goPlugin as go-plugin process
participant ocmUtils as pkg/utils (GetClusters)
main->>resolveSrelibPlugin: resolve plugin path
resolveSrelibPlugin-->>main: pluginPath
main->>srelibpkg: NewClient(pluginPath)
srelibpkg->>goPlugin: start process + RPC handshake
goPlugin-->>srelibpkg: v1.Client dispensed
srelibpkg-->>main: *Client
main->>ocmUtils: SetSrelibClient(*Client)
Note over ocmUtils: activeSrelibClient set
ocmUtils->>goPlugin: GetClusters / GetClusterAnyStatus via inner v1.Client
goPlugin-->>ocmUtils: []*cmv1.Cluster
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 13 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (13 passed)
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: petrkotas
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@go.mod`:
- Line 280: The local filesystem replace directive for
github.com/petrkotas/srelib that points to ../srelib will cause build failures
in CI systems and clean environments where that relative path does not exist.
Remove this replace directive from go.mod entirely, or if the module is needed
as a dependency, replace it with a resolvable reference such as a published
version, a commit-based pseudo-version, or a git branch reference that can be
accessed reproducibly across all build environments.
In `@pkg/srelib/client.go`:
- Line 42: The type assertion `raw.(v1.Client)` in the return statement is
unguarded and will panic if the plugin dispense does not return a value of the
expected type. Guard the type assertion by using the two-value form (value, ok
:= raw.(v1.Client)) to check if the assertion succeeds. If the assertion fails,
return nil and a descriptive error instead of allowing the panic. Only create
and return the Client struct if the type assertion succeeds.
In `@pkg/utils/ocm.go`:
- Around line 69-77: The GetClusterAnyStatus and GetClusters functions must
guard access to the global activeSrelibClient with nil checks to prevent panics
when called outside the main entrypoint (such as from docgen or tests), and the
GetClusters function must remove the log.Fatalf call that terminates the process
on error, instead returning the error to the caller so they can handle the
failure appropriately according to their error handling expectations.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 8f0259b6-27c3-4e3b-aafa-6b9b22f08c0c
⛔ Files ignored due to path filters (1)
- go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
- go.mod
- main.go
- pkg/srelib/client.go
- pkg/utils/ocm.go
replace github.com/observatorium/api v0.1.2 => github.com/observatorium/api v0.1.3-0.20260326145919-3032029d0fde

replace github.com/petrkotas/srelib => ../srelib
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Replace directives:"
rg -n '^replace\s+' go.mod
echo "srelib entries:"
rg -n 'github.com/petrkotas/srelib' go.mod
if [ -d ../srelib ]; then
  echo "Found ../srelib locally (still non-portable across CI/consumers)."
else
  echo "../srelib is absent; this replace will fail in clean environments."
fi
Repository: openshift/osdctl
Length of output: 425
Remove local replace directive; it blocks builds in any clean environment.
`replace github.com/petrkotas/srelib => ../srelib` will fail immediately on CI systems and in consumer builds, since `../srelib` is absent from the repository. This is a build blocker, not just a reproducibility issue. Either remove the replace directive or reference a resolvable module version (published release, commit-based pseudo-version, or git branch).
Per supply chain security guidelines, all dependency references must be reproducible and pinnable. Local filesystem paths violate this requirement and must not reach production.
Source: Coding guidelines
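One way to keep a local replace directive from slipping through review is a small check that scans go.mod text for replace targets that are filesystem paths. The following is a minimal sketch, not part of the PR: the function name `localReplaces` is hypothetical, and the parser is deliberately simplified (it handles single-line directives and parenthesized `replace ( ... )` blocks, treats targets starting with `./`, `../`, or `/` as local, and ignores Windows-style paths).

```go
package main

import (
	"fmt"
	"strings"
)

// localReplaces returns the replace directives in gomod whose target is a
// local filesystem path instead of a module path with a version.
// Hypothetical helper for illustration; it is a simplified text scan, not a
// full go.mod parser.
func localReplaces(gomod string) []string {
	var found []string
	inBlock := false
	for _, line := range strings.Split(gomod, "\n") {
		line = strings.TrimSpace(line)
		// Drop trailing comments before inspecting the directive.
		if i := strings.Index(line, "//"); i >= 0 {
			line = strings.TrimSpace(line[:i])
		}
		switch {
		case line == "replace (":
			inBlock = true
			continue
		case inBlock && line == ")":
			inBlock = false
			continue
		case strings.HasPrefix(line, "replace "):
			line = strings.TrimSpace(strings.TrimPrefix(line, "replace "))
		case !inBlock:
			// Not a replace directive and not inside a replace block.
			continue
		}
		_, target, ok := strings.Cut(line, "=>")
		if !ok {
			continue
		}
		target = strings.TrimSpace(target)
		if strings.HasPrefix(target, "./") || strings.HasPrefix(target, "../") || strings.HasPrefix(target, "/") {
			found = append(found, line)
		}
	}
	return found
}

func main() {
	gomod := `module example.com/demo

replace github.com/observatorium/api v0.1.2 => github.com/observatorium/api v0.1.3-0.20260326145919-3032029d0fde

replace github.com/petrkotas/srelib => ../srelib
`
	for _, r := range localReplaces(gomod) {
		fmt.Println("local replace:", r)
	}
}
```

A check like this could run in CI and fail the build when the returned slice is non-empty; whether that is preferable to simply reviewing go.mod by hand is a judgment call for the maintainers.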
		return nil, fmt.Errorf("srelib: dispense plugin: %w", err)
	}

	return &Client{inner: raw.(v1.Client), killer: pc}, nil
🧩 Analysis chain
🏁 Script executed:
find . -type f -name "client.go" | grep srelib
Repository: openshift/osdctl
Length of output: 83
🏁 Script executed:
head -60 pkg/srelib/client.go | cat -n
Repository: openshift/osdctl
Length of output: 1697
Unguarded type assertion from plugin dispense can panic.
`raw.(v1.Client)` will panic if the plugin returns a value that doesn't match the expected type. This can happen due to version mismatch, incorrect plugin configuration, or handshake issues. Guard the assertion and return an error instead:
Proposed fix
-   return &Client{inner: raw.(v1.Client), killer: pc}, nil
+   inner, ok := raw.(v1.Client)
+   if !ok {
+       pc.Kill()
+       return nil, fmt.Errorf("srelib: unexpected plugin client type %T", raw)
+   }
+   return &Client{inner: inner, killer: pc}, nil
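The guarded assertion described in this comment can be tried in isolation. The sketch below is self-contained and uses stand-ins rather than the real plugin types: `clusterClient`, `fakeClient`, and `asClusterClient` are hypothetical names, `clusterClient` plays the role of `v1.Client`, and the `cleanup` callback plays the role of `pc.Kill()`.

```go
package main

import (
	"errors"
	"fmt"
)

// clusterClient is a hypothetical stand-in for the plugin's v1.Client interface.
type clusterClient interface {
	GetClusters(ids []string) ([]string, error)
}

// fakeClient is a hypothetical implementation that exists only for this example.
type fakeClient struct{}

func (fakeClient) GetClusters(ids []string) ([]string, error) { return ids, nil }

// errUnexpectedType is returned when the dispensed value has the wrong type.
var errUnexpectedType = errors.New("unexpected plugin client type")

// asClusterClient converts a dispensed value into a clusterClient using the
// two-value (comma-ok) assertion, so a mismatch yields an error, not a panic.
// cleanup runs on failure, mirroring pc.Kill() in the proposed fix.
func asClusterClient(raw any, cleanup func()) (clusterClient, error) {
	c, ok := raw.(clusterClient)
	if !ok {
		cleanup()
		return nil, fmt.Errorf("%w: %T", errUnexpectedType, raw)
	}
	return c, nil
}

func main() {
	killed := false
	if _, err := asClusterClient("not a client", func() { killed = true }); err != nil {
		fmt.Println("error:", err, "| cleanup ran:", killed)
	}

	c, err := asClusterClient(fakeClient{}, func() {})
	fmt.Println("client ok:", c != nil, "| err:", err)
}
```

With the single-value form `raw.(clusterClient)`, the first call in `main` would panic instead of returning an error, which is the failure mode the comment flags.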
func GetClusterAnyStatus(conn *sdk.Connection, clusterId string) (*cmv1.Cluster, error) {
	// identifier in the accounts management service. To find those clusters we need to check
	// directly in the clusters management service.
	clustersSearch := fmt.Sprintf(ClusterServiceClusterSearch, clusterId, clusterId, clusterId)
	clustersListResponse, err := conn.ClustersMgmt().V1().Clusters().List().Search(clustersSearch).Size(1).Send()
	if err != nil {
		return nil, fmt.Errorf("can't retrieve clusters for clusterId '%s': %w", clusterId, err)
	}

	// If there is exactly one cluster matching then return it:
	clustersTotal := clustersListResponse.Total()
	if clustersTotal == 1 {
		return clustersListResponse.Items().Slice()[0], nil
	}

	return nil, fmt.Errorf("there are %d clusters with identifier or name '%s', expected 1", clustersTotal, clusterId)
	return activeSrelibClient.GetClusterAnyStatus(clusterId)
}

func GetClusters(ocmClient *sdk.Connection, clusterIds []string) []*cmv1.Cluster {
	for i, id := range clusterIds {
		clusterIds[i] = GenerateQuery(id)
	}

	clusters, err := ApplyFilters(ocmClient, []string{strings.Join(clusterIds, " or ")})
	clusters, err := activeSrelibClient.GetClusters(clusterIds)
	if err != nil {
		log.Fatalf("error while retrieving cluster(s) from ocm: %[1]s", err)
		log.Fatalf("error retrieving cluster(s) via srelib: %s", err)
	}
🧩 Analysis chain
🏁 Script executed:
cat -n pkg/utils/ocm.go | head -100
Repository: openshift/osdctl
Length of output: 4078
🏁 Script executed:
# Search for activeSrelibClient initialization and usage
rg "activeSrelibClient" -B 2 -A 2
Repository: openshift/osdctl
Length of output: 966
🏁 Script executed:
# Check for SetSrelibClient function definition
rg "SetSrelibClient" -B 3 -A 3
Repository: openshift/osdctl
Length of output: 661
🏁 Script executed:
# Look for all calls to GetClusters to understand usage patterns
rg "GetClusters" -B 2 -A 2
Repository: openshift/osdctl
Length of output: 2532
🏁 Script executed:
# Check for error handling in callers of GetClusters
rg "GetClusters|GetClusterAnyStatus" --type go -A 5 | head -50
Repository: openshift/osdctl
Length of output: 2319
🏁 Script executed:
# Find test files
fd -e _test.go | head -20
Repository: openshift/osdctl
Length of output: 42
🏁 Script executed:
# Search for calls to GetClusters and GetClusterAnyStatus in tests
rg "GetClusters|GetClusterAnyStatus" --type go -B 3 -A 3 | grep -E "(_test\.go|test_|Test)" -A 5 | head -40
Repository: openshift/osdctl
Length of output: 42
🏁 Script executed:
# Check if tests initialize the client
rg "SetSrelibClient" --type go
Repository: openshift/osdctl
Length of output: 191
🏁 Script executed:
# Look for alternative entrypoints or main functions
rg "func main" --type go
Repository: openshift/osdctl
Length of output: 178
🏁 Script executed:
# Check docgen/main.go for potential usage
cat -n utils/docgen/main.go | head -80
Repository: openshift/osdctl
Length of output: 222
🏁 Script executed:
# Check for any other imports or usage of GetClusters/GetClusterAnyStatus in tools
rg "GetClusters|GetClusterAnyStatus" utils/docgen/ 2>/dev/null || echo "No usage in docgen"
Repository: openshift/osdctl
Length of output: 79
🏁 Script executed:
# Check if there are any test directories at all
find . -type d -name "*test*" 2>/dev/null | head -10
Repository: openshift/osdctl
Length of output: 131
🏁 Script executed:
# Look for go.sum or go.mod to understand if there's a testing framework
ls -la | grep -E "go\.(mod|sum)|Makefile|\.github"
Repository: openshift/osdctl
Length of output: 295
🏁 Script executed:
# Check callers of GetClusters and GetClusterAnyStatus to understand usage patterns
rg "GetClusters|GetClusterAnyStatus" --type go -B 2 -A 5 | grep -E "(cmd|pkg)/[^/]+\.(go|ts)" | head -50
Repository: openshift/osdctl
Length of output: 42
🏁 Script executed:
# Verify the exact error handling expectations from callers
rg "GetClusters\(" -A 3 --type go | head -50
Repository: openshift/osdctl
Length of output: 2191
Guard global client access and stop terminating process from utility helpers.
These helpers directly dereference the global `activeSrelibClient` without nil checks, causing a panic if they are invoked before it is initialized, that is, outside the main entrypoint (e.g., from docgen or tests). Additionally, `log.Fatalf` in `GetClusters` hard-exits on recoverable RPC failures, making lookup failures non-recoverable and breaking the error handling contract expected by callers.
Suggested fix
 func GetClusterAnyStatus(conn *sdk.Connection, clusterId string) (*cmv1.Cluster, error) {
-    return activeSrelibClient.GetClusterAnyStatus(clusterId)
+    if activeSrelibClient == nil {
+        return nil, fmt.Errorf("srelib client is not initialized")
+    }
+    return activeSrelibClient.GetClusterAnyStatus(clusterId)
 }
-func GetClusters(ocmClient *sdk.Connection, clusterIds []string) []*cmv1.Cluster {
-    clusters, err := activeSrelibClient.GetClusters(clusterIds)
-    if err != nil {
-        log.Fatalf("error retrieving cluster(s) via srelib: %s", err)
-    }
-    return clusters
+func GetClusters(ocmClient *sdk.Connection, clusterIds []string) ([]*cmv1.Cluster, error) {
+    if activeSrelibClient == nil {
+        return nil, fmt.Errorf("srelib client is not initialized")
+    }
+    return activeSrelibClient.GetClusters(clusterIds)
 }
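The two points in this comment, guarding the package-level client and returning errors instead of exiting, can be exercised with a small self-contained sketch. Everything here is a stand-in and an assumption: `clusterGetter`, `stubClient`, `setClient`, and `getClusters` are hypothetical names, clusters are plain strings rather than `*cmv1.Cluster`, and `activeClient` mirrors `activeSrelibClient`.

```go
package main

import (
	"errors"
	"fmt"
)

// clusterGetter is a hypothetical stand-in for the srelib client used by pkg/utils.
type clusterGetter interface {
	GetClusters(ids []string) ([]string, error)
}

// activeClient mirrors the package-level activeSrelibClient; it is nil until set.
var activeClient clusterGetter

// errNotInitialized lets callers detect the "client was never set" case.
var errNotInitialized = errors.New("srelib client is not initialized")

// setClient mirrors SetSrelibClient from the sequence diagram.
func setClient(c clusterGetter) { activeClient = c }

// getClusters guards the global and returns failures to the caller instead
// of terminating the process with log.Fatalf.
func getClusters(ids []string) ([]string, error) {
	if activeClient == nil {
		return nil, errNotInitialized
	}
	clusters, err := activeClient.GetClusters(ids)
	if err != nil {
		return nil, fmt.Errorf("retrieving cluster(s) via srelib: %w", err)
	}
	return clusters, nil
}

// stubClient is a hypothetical client that can simulate an RPC failure.
type stubClient struct{ fail bool }

func (s stubClient) GetClusters(ids []string) ([]string, error) {
	if s.fail {
		return nil, errors.New("rpc unavailable")
	}
	return ids, nil
}

func main() {
	// Called before initialization: an error comes back, no panic.
	if _, err := getClusters([]string{"abc"}); err != nil {
		fmt.Println("before init:", err)
	}

	setClient(stubClient{})
	clusters, err := getClusters([]string{"abc"})
	fmt.Println("after init:", clusters, err)
}
```

Note that changing the return type, as the suggested fix does for `GetClusters`, means every existing caller has to be updated to handle the returned error.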
Looks good - I'd love to see this for some more complex flows like the assume role chaining for AWS or constructing backplane clients for different OCM connections.
Summary by CodeRabbit
New Features
Chores