50 changes: 50 additions & 0 deletions PLAN.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
# Proxy-Owned UDS Agent Socket

## Summary

Move end-to-end UDS weblog trace transport from per-weblog `socat` processes into the central proxy. UDS weblogs keep using tracer auto-detection of `/var/run/datadog/apm.socket`; the proxy owns that socket and forwards traffic through its existing HTTP proxy path to the agent.

Lambda `socat` remains out of scope because it exposes the Lambda extension's container-local `127.0.0.1:8126`, which the central proxy cannot reach directly.

## Key Changes

- Add a Unix stream listener in `utils/proxy/core.py`.
  - Default socket: `/var/run/datadog/apm.socket`.
  - Optional env override: `PROXY_APM_RECEIVER_SOCKET`.
  - Remove any stale socket on startup, create the parent directory, and set permissive socket permissions.
  - For each UDS connection, open a TCP connection to the proxy's local weblog port `8126` so existing mitmproxy logging and forwarding stay unchanged.

- Update container wiring in `utils/_context/containers.py`.
  - Mount one runtime host directory, `./<logs>/interfaces/test_agent_socket`, into both proxy and UDS weblog containers at `/var/run/datadog`.
  - For UDS weblogs, remove `DD_AGENT_HOST` and `DD_TRACE_AGENT_PORT` from the runtime environment so tracers keep testing automatic UDS discovery.
  - Keep `DD_APM_RECEIVER_SOCKET=/var/run/datadog/apm.socket` as the UDS marker and path.

- Clean up end-to-end UDS weblog images.
  - Remove `socat` installation, `UDS_WEBLOG=1`, `set-uds-transport.sh` copies, and startup calls from UDS variants under Java, Python, Node.js, Go, .NET, and Ruby.
  - Remove the now-unused UDS transport helper if no non-Lambda references remain.
  - Keep Lambda `socat` references unchanged.
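
A minimal sketch of the listener described for `utils/proxy/core.py`, assuming an `asyncio`-based implementation. The helper names (`resolve_socket_path`, `serve_uds`, `_pipe`) and the constants are hypothetical; only the socket path, the `PROXY_APM_RECEIVER_SOCKET` override, and port `8126` come from this plan, and the real proxy code may be structured differently.

```python
import asyncio
import os
from pathlib import Path

# Values taken from the plan; the constant names themselves are illustrative.
DEFAULT_SOCKET = "/var/run/datadog/apm.socket"
UPSTREAM_HOST = "127.0.0.1"
UPSTREAM_PORT = 8126


def resolve_socket_path() -> str:
    """Use the env override when it is set and non-empty, else the default."""
    return os.environ.get("PROXY_APM_RECEIVER_SOCKET") or DEFAULT_SOCKET


async def _pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Copy bytes from reader to writer until EOF, then half-close the writer."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
        if writer.can_write_eof():
            writer.write_eof()  # signal EOF but keep the other direction open
    except (ConnectionError, OSError):
        pass  # peer went away; the caller closes both sides


async def serve_uds(
    socket_path: str, host: str = UPSTREAM_HOST, port: int = UPSTREAM_PORT
) -> asyncio.AbstractServer:
    """Listen on socket_path and forward each connection to host:port over TCP."""
    path = Path(socket_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.unlink(missing_ok=True)  # remove a stale socket from a previous run

    async def handle(client_reader, client_writer):
        try:
            upstream_reader, upstream_writer = await asyncio.open_connection(host, port)
        except OSError:
            client_writer.close()
            return
        try:
            await asyncio.gather(
                _pipe(client_reader, upstream_writer),
                _pipe(upstream_reader, client_writer),
            )
        finally:
            upstream_writer.close()
            client_writer.close()

    server = await asyncio.start_unix_server(handle, path=str(path))
    os.chmod(path, 0o777)  # permissive, so tracers running as any user can connect
    return server
```

One way to run it is `server = await serve_uds(resolve_socket_path())` followed by `await server.serve_forever()` inside the proxy's event loop. Forwarding raw bytes to the existing TCP port, rather than parsing HTTP in the listener, is what would let the mitmproxy path stay unchanged.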

## Test Plan

- Static checks:
  - `rg "socat|set-uds-transport|UDS_WEBLOG" utils/build/docker` should only show intentional Lambda usage, if any.
  - `./format.sh`.

- Build representative UDS weblogs:
  - `./build.sh python -w uds-flask`
  - `./build.sh nodejs -w uds-express4`
  - `./build.sh java -w uds-spring-boot`
  - `./build.sh golang -w uds-echo`
  - `./build.sh dotnet -w uds`
  - Ruby UDS variants if credentials/images are available.

- Runtime smoke:
  - `TEST_LIBRARY=python WEBLOG_VARIANT=uds-flask ./run.sh tests/test_smoke.py::Test_Library::test_receive_request_trace`
  - Repeat for at least one non-Python UDS weblog.
  - Confirm `logs_*/interfaces/library` receives trace payloads and the proxy stdout reports the UDS listener.
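
The static `rg` check could also be scripted, for example to run it in CI. This is a rough sketch only: the function name `find_leftovers` is hypothetical, and treating any path containing `lambda` as intentional usage is an assumption, not something the plan specifies.

```python
import re
from pathlib import Path

# Same pattern as the rg command in the test plan.
LEFTOVER = re.compile(r"socat|set-uds-transport|UDS_WEBLOG")


def find_leftovers(root: str, allowed_marker: str = "lambda") -> list[tuple[str, int, str]]:
    """Return (relative path, line number, line) for matches outside allowed paths."""
    root_path = Path(root)
    hits = []
    for path in sorted(root_path.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(root_path).as_posix()
        if allowed_marker in relative.lower():
            continue  # assumed to be intentional Lambda usage
        text = path.read_text(errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if LEFTOVER.search(line):
                hits.append((relative, number, line.strip()))
    return hits
```

Calling `find_leftovers("utils/build/docker")` and asserting that the result is empty would give the same signal as the manual `rg` step.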

## Assumptions

- Existing UDS weblogs all use `/var/run/datadog/apm.socket`; that remains the supported socket path for this change.
- This preserves UDS auto-detection semantics and does not switch to `DD_TRACE_AGENT_URL=unix://...`.
- Lambda `socat` cleanup is excluded by choice because it is a separate localhost-extension bridge problem.
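
The auto-detection assumption depends on UDS weblogs seeing the socket path but no explicit host or port. A small sketch of that environment rule follows; the function name `uds_weblog_environment` is hypothetical, and detecting UDS weblogs by the `uds` variant prefix mirrors the container change in this pull request.

```python
APM_SOCKET = "/var/run/datadog/apm.socket"


def uds_weblog_environment(environment: dict[str, str], weblog_variant: str) -> dict[str, str]:
    """Return a copy of environment adjusted for UDS weblogs; others are unchanged."""
    env = dict(environment)
    if weblog_variant.startswith("uds"):
        # Keep the socket path as the UDS marker.
        env["DD_APM_RECEIVER_SOCKET"] = APM_SOCKET
        # Drop the TCP settings so the tracer has to discover the socket itself.
        env.pop("DD_AGENT_HOST", None)
        env.pop("DD_TRACE_AGENT_PORT", None)
    return env
```

Returning a copy keeps the caller's dictionary intact, which makes the rule easy to unit test in isolation.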
9 changes: 8 additions & 1 deletion tests/test_the_test/test_ci_orchestrator.py
@@ -1,6 +1,6 @@
 from utils import scenarios

-from utils.scripts.ci_orchestrators.workflow_data import get_endtoend_definitions
+from utils.scripts.ci_orchestrators.workflow_data import _is_supported, get_endtoend_definitions


 @scenarios.test_the_test
@@ -22,3 +22,10 @@ def test_get_endtoend_definitions():
     # graphql_appsec is executed on graphql23 weblog
     # so the job should be equals to weblog count
     assert len(defs["endtoend_defs"]["parallel_jobs"]) == weblog_count
+
+
+@scenarios.test_the_test
+def test_ipv6_is_not_supported_for_uds_weblogs():
+    assert not _is_supported("dotnet", "uds", "IPV6", "dev")
+    assert not _is_supported("python", "uds-flask", "IPV6", "dev")
+    assert _is_supported("python", "flask-poc", "IPV6", "dev")
14 changes: 7 additions & 7 deletions utils/_context/_scenarios/debugger.py
@@ -1,6 +1,7 @@
 import pytest

 from utils._logger import logger
+from utils.proxy.ports import ProxyPorts

 from .core import scenario_groups
 from .endtoend import EndToEndScenario
@@ -63,13 +64,12 @@ def configure(self, config: pytest.Config):
             # when run from macOS as well.
             self.agent_container.volumes["/sys/kernel/debug"] = {"bind": "/sys/kernel/debug", "mode": "ro"}
             self.agent_container.volumes["/sys/fs/cgroup"] = {"bind": "/sys/fs/cgroup", "mode": "ro"}
-            # Set the system-probe to output to the proxy the same way the
-            # libraries are being told to. For golang, the system-probe acts
-            # as a tracer library and sends data to the trace-agent just like
-            # the other libraries.
-            weblog_env = self.weblog_container.environment
-            self.agent_container.environment["DD_TRACE_AGENT_PORT"] = weblog_env["DD_TRACE_AGENT_PORT"]
-            self.agent_container.environment["DD_AGENT_HOST"] = weblog_env["DD_AGENT_HOST"]
+            # For golang, the system-probe acts as a tracer library and sends
+            # data to the trace-agent. Route it through the proxy explicitly;
+            # UDS weblogs do not keep DD_AGENT_HOST/DD_TRACE_AGENT_PORT in
+            # their container environment.
+            self.agent_container.environment["DD_TRACE_AGENT_PORT"] = str(ProxyPorts.weblog)
+            self.agent_container.environment["DD_AGENT_HOST"] = "proxy"

         if not self.replay:
             self.warmups.append(self._wait_for_agent_debugging)
34 changes: 23 additions & 11 deletions utils/_context/containers.py
@@ -20,6 +20,7 @@
 from utils._context.component_version import ComponentVersion, Version
 from utils._context.docker import get_docker_client
 from utils._context.ports import ContainerPorts
+from utils.proxy.config import DEFAULT_APM_RECEIVER_SOCKET
 from utils.proxy.tuf import get_tuf_root_json
 from utils.proxy.ports import ProxyPorts
 from utils.proxy.mocked_response import (
@@ -48,6 +49,8 @@

 _DEFAULT_NETWORK_NAME = "system-tests_default"
 _NETWORK_NAME = "bridge" if "GITLAB_CI" in os.environ else _DEFAULT_NETWORK_NAME
+_AGENT_SOCKET_HOST_DIR = "interfaces/test_agent_socket"
+_APM_SOCKET_CONTAINER_DIR = "/var/run/datadog"


 def create_network() -> Network:
@@ -182,6 +185,14 @@ def container_name(self):
     def log_folder_path(self):
         return f"{self.host_project_dir}/{self.host_log_folder}/docker/{self.name}"

+    def _mount_agent_socket_dir(self) -> None:
+        socket_dir = Path(self.host_project_dir) / self.host_log_folder / _AGENT_SOCKET_HOST_DIR
+        socket_dir.mkdir(mode=0o777, exist_ok=True, parents=True)
+        self.volumes[f"./{self.host_log_folder}/{_AGENT_SOCKET_HOST_DIR}"] = {
+            "bind": _APM_SOCKET_CONTAINER_DIR,
+            "mode": "rw",
+        }
+
     def get_existing_container(self) -> Container:
         for container in get_docker_client().containers.list(all=True, filters={"name": self.container_name}):
             if container.name == self.container_name:
@@ -612,6 +623,7 @@ def __init__(
                 "DD_SITE": os.environ.get("DD_SITE"),
                 "DD_API_KEY": os.environ.get("DD_API_KEY", _FAKE_DD_API_KEY),
                 "DD_APP_KEY": os.environ.get("DD_APP_KEY"),
+                "PROXY_APM_RECEIVER_SOCKET": os.environ.get("PROXY_APM_RECEIVER_SOCKET"),
                 "SYSTEM_TESTS_IPV6": str(enable_ipv6),
                 "SYSTEM_TESTS_MOCKED_BACKEND": str(mocked_backend),
             },
@@ -677,6 +689,7 @@ def __init__(

     def configure(self, *, host_log_folder: str, replay: bool):
         super().configure(host_log_folder=host_log_folder, replay=replay)
+        self._mount_agent_socket_dir()

         # Write tracer mocked responses JSON
         tracer_mocks_path = f"{self.log_folder_path}/{MockedTracerResponse.internal_filename}"
@@ -994,6 +1007,12 @@ def configure(self, *, host_log_folder: str, replay: bool):

         self.weblog_variant = self.image.labels["system-tests-weblog-variant"]

+        if self.uds_mode:
+            self._mount_agent_socket_dir()
+            self.environment["DD_APM_RECEIVER_SOCKET"] = DEFAULT_APM_RECEIVER_SOCKET
+            self.environment.pop("DD_AGENT_HOST", None)
+            self.environment.pop("DD_TRACE_AGENT_PORT", None)
+
         # Some weblogs like uwsgi-poc may have known connection issues, when cpu is under heavy load.
         # In this case, we retry the request a few times if the connection was aborted to avoid flaky tests.
         if self.weblog_variant == "uwsgi-poc":
@@ -1141,12 +1160,11 @@ def library(self) -> ComponentVersion:

     @property
     def uds_socket(self):
-        assert self.image.env is not None, "No env set"
-        return self.image.env.get("DD_APM_RECEIVER_SOCKET", None)
+        return self.environment.get("DD_APM_RECEIVER_SOCKET", None)

     @property
     def uds_mode(self):
-        return self.uds_socket is not None
+        return self.weblog_variant.startswith("uds")

     @property
     def telemetry_heartbeat_interval(self):
@@ -1474,10 +1492,7 @@ def __init__(self, agent_port: int = 8126) -> None:

     def configure(self, *, host_log_folder: str, replay: bool) -> None:
         super().configure(host_log_folder=host_log_folder, replay=replay)
-        self.volumes[f"./{self.host_log_folder}/interfaces/test_agent_socket"] = {
-            "bind": "/var/run/datadog/",
-            "mode": "rw",
-        }
+        self._mount_agent_socket_dir()


 class VCRCassettesContainer(TestedContainer):
@@ -1582,10 +1597,7 @@ def __init__(self, extra_env_vars: dict | None = None) -> None:

     def configure(self, *, host_log_folder: str, replay: bool) -> None:
         super().configure(host_log_folder=host_log_folder, replay=replay)
-        self.volumes[f"./{self.host_log_folder}/interfaces/test_agent_socket"] = {
-            "bind": "/var/run/datadog/",
-            "mode": "rw",
-        }
+        self._mount_agent_socket_dir()

     def get_env(self, env_var: str):
         """Get env variables from the container"""
8 changes: 0 additions & 8 deletions utils/build/docker/dotnet/uds.Dockerfile
@@ -44,11 +44,3 @@ COPY --from=build-app /app/out .

 COPY utils/build/docker/dotnet/weblog/app.sh app.sh
 CMD [ "./app.sh" ]
-
-# The lines above is a copy of poc.Dockerfile
-# The lines below are added for the UDS version only
-
-RUN DEBIAN_FRONTEND=noninteractive apt-get install -y socat
-ENV UDS_WEBLOG=1
-ENV DD_APM_RECEIVER_SOCKET=/var/run/datadog/apm.socket
-COPY utils/build/docker/set-uds-transport.sh set-uds-transport.sh
4 changes: 0 additions & 4 deletions utils/build/docker/dotnet/weblog/app.sh
@@ -4,10 +4,6 @@ set -eu

 echo 'starting app'

-if [ "${UDS_WEBLOG:-0}" = "1" ]; then
-    ./set-uds-transport.sh
-fi
-
 if ( ! dotnet app.dll); then
     echo recovering dump to /var/log/system-tests/dumps
     mkdir -p /var/log/system-tests/dumps
6 changes: 1 addition & 5 deletions utils/build/docker/golang/app.sh
@@ -1,7 +1,3 @@
 #!/bin/bash

-if [ ${UDS_WEBLOG:-} = "1" ]; then
-    ./set-uds-transport.sh
-fi
-
-exec ./weblog
+exec ./weblog
6 changes: 1 addition & 5 deletions utils/build/docker/golang/uds-echo.Dockerfile
@@ -1,6 +1,6 @@
 FROM golang:1.25-alpine AS build

-RUN apk add --no-cache jq curl bash gcc musl-dev socat git
+RUN apk add --no-cache jq curl bash gcc musl-dev git

 # print important lib versions
 RUN go version && curl --version
@@ -16,7 +16,6 @@ RUN go mod download && go mod verify
 # copy the app code
 COPY utils/build/docker/golang/app /app
 COPY utils/build/docker/golang/app.sh /app/app.sh
-COPY utils/build/docker/set-uds-transport.sh set-uds-transport.sh

 # download the proper tracer version
 COPY utils/build/docker/golang/install_ddtrace.sh binaries* /binaries/
@@ -25,9 +24,6 @@ ENV DD_TRACE_HEADER_TAGS='user-agent'

 RUN go build -v -tags appsec -o weblog ./echo

-ENV DD_APM_RECEIVER_SOCKET=/var/run/datadog/apm.socket
-ENV UDS_WEBLOG=1
-
 CMD ["./app.sh"]

 # Datadog setup
4 changes: 0 additions & 4 deletions utils/build/docker/java/spring-boot/app.sh
@@ -1,9 +1,5 @@
 #!/bin/bash

-if [ ${UDS_WEBLOG:-} = "1" ]; then
-    ./set-uds-transport.sh
-fi
-
 java \
     -Xmx362m \
     -XX:ErrorFile=/var/log/system-tests/hs_err_%p_%t_%u.log \
4 changes: 0 additions & 4 deletions utils/build/docker/java/uds-spring-boot.Dockerfile
@@ -25,9 +25,5 @@ COPY --from=build /dd-tracer/dd-java-agent.jar .
 ENV DD_TRACE_HEADER_TAGS='user-agent:http.request.headers.user-agent'
 ENV DD_TRACE_INTERNAL_EXIT_ON_FAILURE=true

-COPY utils/build/docker/set-uds-transport.sh set-uds-transport.sh
-ENV DD_APM_RECEIVER_SOCKET=/var/run/datadog/apm.socket
-RUN apt-get update && apt-get install socat -y
-ENV UDS_WEBLOG=1
 COPY ./utils/build/docker/java/ConfigChaining.properties /app/ConfigChaining.properties
 COPY utils/build/docker/java/spring-boot/app.sh app.sh
4 changes: 0 additions & 4 deletions utils/build/docker/java_otel/spring-boot/app.sh
@@ -1,7 +1,3 @@
 #!/bin/bash

-if [ ${UDS_WEBLOG:-} = "1" ]; then
-    ./set-uds-transport.sh
-fi
-
 java -Xmx362m -javaagent:/app/dd-java-agent.jar -jar /app/myproject-0.0.1-SNAPSHOT.jar --server.port=7777
4 changes: 0 additions & 4 deletions utils/build/docker/nodejs/app.sh
@@ -1,9 +1,5 @@
 #!/bin/bash

-if [ "${UDS_WEBLOG:-}" = "1" ]; then
-    ./set-uds-transport.sh
-fi
-
 set -e

 if [ -e /volumes/dd-trace-js ]; then
6 changes: 1 addition & 5 deletions utils/build/docker/nodejs/parametric/app.sh
@@ -1,9 +1,5 @@
 #!/bin/bash

-if [ "${UDS_WEBLOG:-}" = "1" ]; then
-    ./set-uds-transport.sh
-fi
-
 set -e

 if [ -e /volumes/dd-trace-js ]; then
@@ -13,4 +9,4 @@ if [ -e /volumes/dd-trace-js ]; then
 fi

 # shellcheck disable=SC2086
-node server.js ${SYSTEM_TESTS_EXTRA_COMMAND_ARGUMENTS:-}
+node server.js ${SYSTEM_TESTS_EXTRA_COMMAND_ARGUMENTS:-}
4 changes: 0 additions & 4 deletions utils/build/docker/nodejs/uds-express4.Dockerfile
@@ -23,15 +23,11 @@ ENV PGDATABASE=system_tests_dbname
 ENV PGHOST=postgres
 ENV PGPORT=5433

-ENV DD_APM_RECEIVER_SOCKET=/var/run/datadog/apm.socket
-ENV UDS_WEBLOG=1
-
 ENV DD_DATA_STREAMS_ENABLED=true

 # docker startup
 COPY utils/build/docker/nodejs/app.sh app.sh
 RUN printf 'node app.js' >> app.sh
-COPY utils/build/docker/set-uds-transport.sh set-uds-transport.sh
 CMD ./app.sh

 COPY utils/build/docker/nodejs/install_ddtrace.sh binaries* /binaries/
3 changes: 0 additions & 3 deletions utils/build/docker/python/flask/app.sh
@@ -9,9 +9,6 @@ echo "--- PIP FREEZE ---"
 python -m pip freeze
 echo "------------------"

-if [[ ${UDS_WEBLOG:-} = "1" ]]; then
-    ./set-uds-transport.sh
-fi
 # CAVEAT: to debug the Python App, use these lines
 # export FLASK_APP=app
 # ddtrace-run flask run --no-reload --host=0.0.0.0 --port=7777
5 changes: 0 additions & 5 deletions utils/build/docker/python/uds-flask.Dockerfile
@@ -21,11 +21,6 @@ ENV _DD_APPSEC_DEDUPLICATION_ENABLED=false

 ENV FLASK_APP=app.py

-ENV DD_APM_RECEIVER_SOCKET=/var/run/datadog/apm.socket
-RUN apt-get update && apt-get install socat -y
-ENV UDS_WEBLOG=1
-COPY utils/build/docker/set-uds-transport.sh set-uds-transport.sh
-
 CMD ./app.sh

 # docker build -f utils/build/docker/python.flask-poc.Dockerfile -t test .
4 changes: 1 addition & 3 deletions utils/build/docker/ruby/uds-rails.Dockerfile
@@ -19,9 +19,7 @@ RUN /binaries/install_ddtrace.sh

 RUN bundle exec rails db:prepare

-COPY utils/build/docker/set-uds-transport.sh set-uds-transport.sh
-
-RUN echo "#!/bin/bash\n./set-uds-transport.sh\nbundle exec puma -b tcp://0.0.0.0 -p 7777 -w 1" > app.sh
+RUN echo "#!/bin/bash\nbundle exec puma -b tcp://0.0.0.0 -p 7777 -w 1" > app.sh
 RUN chmod +x app.sh

 CMD [ "./app.sh" ]
4 changes: 1 addition & 3 deletions utils/build/docker/ruby/uds-sinatra.Dockerfile
@@ -13,9 +13,7 @@ COPY utils/build/docker/ruby/install_ddtrace.sh binaries* /binaries/
 RUN /binaries/install_ddtrace.sh

 ENV DD_TRACE_HEADER_TAGS=user-agent
-ENV DD_APM_RECEIVER_SOCKET=/var/run/datadog/apm.socket

-COPY utils/build/docker/set-uds-transport.sh set-uds-transport.sh
-RUN echo "#!/bin/bash\n./set-uds-transport.sh\nbundle exec puma -b tcp://0.0.0.0 -p 7777 -w 1" > app.sh
+RUN echo "#!/bin/bash\nbundle exec puma -b tcp://0.0.0.0 -p 7777 -w 1" > app.sh
 RUN chmod +x app.sh
 CMD [ "./app.sh" ]