Sources
Datadog Agent
Connect Datadog Agents to Streamfold to process metrics, logs, and traces.
The Datadog Agent is an open-source agent, built in Go, that supports Linux and Windows environments. The agent collects metrics and logs from the environment it is installed on, and it can receive metrics or traces from applications, which it batches and sends at a regular interval. There are numerous installation methods, including pre-built Docker images that run in Kubernetes environments and pre-built images for all major clouds.
Streamfold supports using the Datadog Agent as a source for metrics, logs, and traces. The agent can be configured to send this data to your Streamfold ingress endpoint instead of Datadog. From Streamfold, telemetry data can then be forwarded to Datadog or to any number of other destinations.
Overview
Streamfold's integration with the Datadog Agent works by overriding the configurable URL endpoints for the telemetry collected by the agent. Without Streamfold you would typically leave these at their defaults, which point to the Datadog cloud service. If you are in a restricted network environment you may have proxied the endpoints through an external host, but that is uncommon.
The configuration section below details how to override these endpoints and includes instructions for different environments. The Datadog Agent is a collection of integrations, and each integration uses different endpoints, so each requires its own override configuration. The three integrations Streamfold supports today are:
Metrics: Collects any integration metrics or custom metrics sent to the DogStatsD service. Some additional event types, such as service checks, are also sent when the metrics endpoint is overridden. Streamfold supports collecting and forwarding all of these events to Datadog, or you can filter them to other destinations.
Logs: If you are using the Datadog Agent to collect log messages, you can override this endpoint to send your log data to Streamfold. Log collection is not enabled by default in the agent.
Traces: APM (trace) data is collected and aggregated by the agent if you are using a compatible tracing instrumentation library. Overriding the trace endpoints to use Streamfold sends all trace span data through your Streamfold pipeline. Stats data associated with APM usage is also collected and sent.
The Datadog Agent includes many more integrations, such as network monitoring, process monitoring, and profiling. Streamfold does not currently support these. The configurations listed below only override the three integrations above; any additional integrations you use are not impacted and will continue to report directly to Datadog.
The agent supports dual shipping of telemetry data: any of the data types can be sent to more than one endpoint at a time. This can be useful if you are just getting started with Streamfold and don't want to impact your existing Datadog integration; you can add Streamfold as a second endpoint while you become comfortable with the product. Be careful not to report the same data back to Datadog twice, since that may impact your bill.
You should know!
You can only add a single Datadog Agent source to your Streamfold account at the moment.
Supported Versions
We support versions 6 & 7 of the Datadog Agent on Linux platforms. Earlier versions are not supported.
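If you are unsure which agent version a host is running, you can query the agent binary; a minimal check, assuming a standard host install with datadog-agent on the PATH:
# Print the running agent version; it should report 6.x or 7.x
datadog-agent version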
Configuration
The following sections detail how to override the endpoints for metrics, logs, and traces. We've documented the options for environment variable, YAML, and Helm chart deployment models. For all configuration sections the following custom values are required:
- ingress endpoint: your ingress endpoint, represented below as ingress.streamfold.com
- ingress token: your provided ingress API token, represented below as <sf-ingress-token>
Set any and all integration overrides that you would like to be processed by Streamfold.
We have also included dual shipping instructions for each configuration approach.
Authentication
You can find your API token in the configuration instructions in the application.
Environment variables
Environment variables are the standard approach for configuration in a Docker-based environment.
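As a rough sketch of how these variables are applied, each override below is passed to the containerized agent with a -e flag; the image tag is illustrative and any flags you already pass to the agent container (such as your existing API key) are omitted here and remain unchanged:
# Minimal sketch: metrics-only override passed at container start
docker run -d --name datadog-agent \
  -e DD_URL="https://streamfold:<sf-ingress-token>@ingress.streamfold.com" \
  gcr.io/datadoghq/agent:7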
Complete
This is the complete environment setup for metrics, logs, and traces. Alternatively, you can use the split-out configs below if you only want to send a portion of your traffic.
Single endpoint:
DD_URL="https://streamfold:<sf-ingress-token>@ingress.streamfold.com"
DD_LOGS_CONFIG_LOGS_DD_URL="ingress.streamfold.com:443"
DD_LOGS_CONFIG_API_KEY="<sf-ingress-token>"
DD_LOGS_CONFIG_USE_HTTP=true
DD_APM_DD_URL="https://streamfold:<sf-ingress-token>@ingress.streamfold.com"
Dual ship:
DD_ADDITIONAL_ENDPOINTS="{\"https://streamfold:<sf-ingress-token>@ingress.streamfold.com\": [\"<sf-ingress-token>\"]}"
DD_LOGS_CONFIG_USE_HTTP=true
DD_LOGS_CONFIG_ADDITIONAL_ENDPOINTS="[{\"api_key\": \"<sf-ingress-token>\", \"Host\": \"ingress.streamfold.com\", \"Port\": 443, \"is_reliable\": false}]"
DD_APM_ADDITIONAL_ENDPOINTS="{\"https://streamfold:<sf-ingress-token>@ingress.streamfold.com\": [\"<sf-ingress-token>\"]}"
Metrics
Single endpoint:
DD_URL="https://streamfold:<sf-ingress-token>@ingress.streamfold.com"
Dual ship:
DD_ADDITIONAL_ENDPOINTS="{\"https://streamfold:<sf-ingress-token>@ingress.streamfold.com\": [\"<sf-ingress-token>\"]}"
Logs
Single endpoint:
DD_LOGS_CONFIG_LOGS_DD_URL="ingress.streamfold.com:443"
DD_LOGS_CONFIG_API_KEY="<sf-ingress-token>"
DD_LOGS_CONFIG_USE_HTTP=true
Dual ship:
DD_LOGS_CONFIG_USE_HTTP=true
DD_LOGS_CONFIG_ADDITIONAL_ENDPOINTS="[{\"api_key\": \"<sf-ingress-token>\", \"Host\": \"ingress.streamfold.com\", \"Port\": 443, \"is_reliable\": false}]"
Traces (APM)
Single endpoint:
DD_APM_DD_URL="https://streamfold:<sf-ingress-token>@ingress.streamfold.com"
Dual ship:
DD_APM_ADDITIONAL_ENDPOINTS="{\"https://streamfold:<sf-ingress-token>@ingress.streamfold.com\": [\"<sf-ingress-token>\"]}"
YAML config
YAML config is standard for most host-based installations.
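On a typical Linux host install these settings live in the agent's main datadog.yaml file and take effect after an agent restart; the path and service name below are the common defaults and may differ on your platform:
# Common defaults for a Linux host install
sudo vi /etc/datadog-agent/datadog.yaml    # add the overrides shown below
sudo systemctl restart datadog-agent       # restart so the new endpoints take effect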
Complete
This is the complete YAML configuration required for metrics, logs, and traces.
Single endpoint:
dd_url: "https://streamfold:<sf-ingress-token>@ingress.streamfold.com"
logs_config:
  use_http: true
  logs_dd_url: "ingress.streamfold.com:443"
  api_key: "<sf-ingress-token>"
apm_config:
  apm_dd_url: "https://streamfold:<sf-ingress-token>@ingress.streamfold.com"
Dual ship:
additional_endpoints:
  "https://streamfold:<sf-ingress-token>@ingress.streamfold.com":
    - "<sf-ingress-token>"
logs_config:
  use_http: true
  additional_endpoints:
    - api_key: "<sf-ingress-token>"
      Host: "ingress.streamfold.com"
      Port: 443
      is_reliable: false
apm_config:
  additional_endpoints:
    "https://streamfold:<sf-ingress-token>@ingress.streamfold.com":
      - "<sf-ingress-token>"
Metrics
Single endpoint:
dd_url: "https://streamfold:<sf-ingress-token>@ingress.streamfold.com"
Dual ship:
additional_endpoints:
  "https://streamfold:<sf-ingress-token>@ingress.streamfold.com":
    - "<sf-ingress-token>"
Logs
Single endpoint:
logs_config:
  use_http: true
  logs_dd_url: "ingress.streamfold.com:443"
  api_key: "<sf-ingress-token>"
Dual ship:
logs_config:
  use_http: true
  additional_endpoints:
    - api_key: "<sf-ingress-token>"
      Host: "ingress.streamfold.com"
      Port: 443
      is_reliable: false
Traces (APM)
Single endpoint:
apm_config:
  apm_dd_url: "https://streamfold:<sf-ingress-token>@ingress.streamfold.com"
Dual ship:
apm_config:
  additional_endpoints:
    "https://streamfold:<sf-ingress-token>@ingress.streamfold.com":
      - "<sf-ingress-token>"
Helm charts
This section assumes you are using the default datadog/datadog Helm chart for installing the agent. The commands below use --reuse-values, which should maintain your existing configuration and only override the required values.
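If you want to confirm what --reuse-values will preserve before upgrading, you can inspect the release's current values first; the release name and namespace are placeholders for your own:
# Show the user-supplied values currently set on the release
helm get values <release name> -n <namespace>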
Complete
This is the complete configuration required for metrics, logs, and traces.
Single endpoint:
helm upgrade --reuse-values \
--set datadog.dd_url=https://streamfold:<sf-ingress-token>@ingress.streamfold.com \
--set datadog.envDict.DD_LOGS_CONFIG_USE_HTTP=true \
--set datadog.envDict.DD_LOGS_CONFIG_LOGS_DD_URL=ingress.streamfold.com:443 \
--set datadog.envDict.DD_LOGS_CONFIG_API_KEY=<sf-ingress-token> \
--set datadog.envDict.DD_APM_DD_URL=https://streamfold:<sf-ingress-token>@ingress.streamfold.com \
<release name> datadog/datadog
Dual ship:
helm upgrade --reuse-values \
--set agents.useConfigMap=true \
--set "agents.customAgentConfig.additional_endpoints.https://ingress\.streamfold\.com={<sf-ingress-token>}" \
--set agents.customAgentConfig.logs_config.use_http=true \
--set agents.customAgentConfig.logs_config.additional_endpoints[0].api_key=<sf-ingress-token> \
--set agents.customAgentConfig.logs_config.additional_endpoints[0].Host=ingress.streamfold.com \
--set agents.customAgentConfig.logs_config.additional_endpoints[0].Port=443 \
--set agents.customAgentConfig.logs_config.additional_endpoints[0].is_reliable=false \
--set "agents.customAgentConfig.apm_config.additional_endpoints.https://ingress\.streamfold\.com={<sf-ingress-token>}" \
<release name> datadog/datadog
Metrics
Single endpoint:
helm upgrade --reuse-values --set datadog.dd_url=https://streamfold:<sf-ingress-token>@ingress.streamfold.com <release name> datadog/datadog
Dual ship:
helm upgrade --reuse-values \
--set agents.useConfigMap=true \
--set "agents.customAgentConfig.additional_endpoints.https://ingress\.streamfold\.com={<sf-ingress-token>}" \
<release name> datadog/datadog
Logs
Single endpoint:
helm upgrade --reuse-values --set datadog.envDict.DD_LOGS_CONFIG_USE_HTTP=true \
--set datadog.envDict.DD_LOGS_CONFIG_LOGS_DD_URL=ingress.streamfold.com:443 \
--set datadog.envDict.DD_LOGS_CONFIG_API_KEY=<sf-ingress-token> \
<release name> datadog/datadog
Dual ship:
helm upgrade --reuse-values \
--set agents.useConfigMap=true \
--set agents.customAgentConfig.logs_config.use_http=true \
--set agents.customAgentConfig.logs_config.additional_endpoints[0].api_key=<sf-ingress-token> \
--set agents.customAgentConfig.logs_config.additional_endpoints[0].Host=ingress.streamfold.com \
--set agents.customAgentConfig.logs_config.additional_endpoints[0].Port=443 \
--set agents.customAgentConfig.logs_config.additional_endpoints[0].is_reliable=false \
<release name> datadog/datadog
Traces (APM)
Single endpoint:
helm upgrade --reuse-values --set datadog.envDict.DD_APM_DD_URL=https://streamfold:<sf-ingress-token>@ingress.streamfold.com \
<release name> datadog/datadog
Dual ship:
helm upgrade --reuse-values \
--set agents.useConfigMap=true \
--set "agents.customAgentConfig.apm_config.additional_endpoints.https://ingress\.streamfold\.com={<sf-ingress-token>}" \
<release name> datadog/datadog
Authentication
The Datadog Agent Streamfold source uses an ingress token for authentication. You can find your ingress tokens in your source configuration instructions. Metrics and traces use the token as part of a basic-auth scheme, while the logs configuration above sends the ingress token as the DD_API_KEY header.
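As an illustration of the basic-auth scheme (you do not need to run this), the credentials embedded in the override URL translate to a standard Authorization header built from the streamfold username and your ingress token:
# The URL https://streamfold:<sf-ingress-token>@ingress.streamfold.com is equivalent to
# sending "Authorization: Basic <base64 of streamfold:<sf-ingress-token>>" on each request.
printf 'streamfold:<sf-ingress-token>' | base64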
We do not recommend setting the top-level datadog.api_key configuration value, as this can break other integrations in the Datadog Agent that Streamfold does not yet support.
Event formats
The Datadog Agent can emit many different event types depending on what you are collecting. It is important to know the different event types so that you can correctly work with them in Streamfold. All events emitted from the agent will have a meta section that identifies the origin and type of the event. This is an example of the meta section from a Series V2 metric event (some headers have been stripped for brevity):
"_meta": {
"source": {
"id": "01H01990VE4BW9NGXWT6SH2CH4",
"name": "Datadog Agent",
"type": "datadog_agent"
},
"http": {
"path": "/api/v2/series",
"method": "POST",
"headers": {
"Dd-Agent-Version": [
"7.46.0"
],
"Content-Type": [
"application/x-protobuf"
],
...
}
},
"peer": {
"client_ip": "172.16.0.1:57484"
},
"type": "metrics",
"datadog_resource": "metrics"
}
The most important sections are:
- @source.{id, name, type}: These represent the source entry as defined in your Streamfold account. The name matches the value you entered when creating the source, the id is the unique source identifier, and the type will always be "datadog_agent".
- @http: Any source that uses HTTP delivery will include information on the incoming HTTP request.
- @type: This is the actual type of the event and defines the format of the contained data. The sections below break out the specific types with examples of each.
- @datadog_resource: There are three values supported right now: metrics, logs, or apm. The value corresponds to the resource type configured above when overriding the endpoint URLs. A given resource may have multiple event types, identified by the @type value.
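For example, following the same selector syntax used in the filters below, you could match only the APM events coming from this source with:
@datadog_resource == apm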
The following are examples of the different event types emitted for each of the three resource types. For each type we include an example filter, formatted in the Streamfold selector syntax, that can be used to filter events of that type. We also include an abbreviated example of an event that was collected from the source so you can understand how to work with the event data.
These are just some samples; feel free to collect your own real data by hooking up a Datadog Agent source to an S3 bucket.
Metrics
Series Metrics
Most custom and integration metrics will be sent as a series payload. This payload mirrors the submit metric API payload format. In Streamfold a single series metrics submission will be processed as a single event. You can use range operators or custom Datadog functions to process the individual metrics within the event.
Example filter:
@type == metrics
Example event:
A single series payload containing an internal agent metric; the series list has been trimmed.
{
"series": [
{
"tags": [
"version:7.46.0",
"client:go",
"client_version:5.1.1",
"client_transport:udp"
],
"type": 2,
"unit": "",
"interval": 10,
"metric": "datadog.dogstatsd.client.packets_dropped_writer",
"points": [
{
"timestamp": 1692049750,
"value": 0.000000
}
],
"resources": [
{
"name": "91857706c12078",
"type": "host"
}
],
"source_type_name": ""
},
...
]
}
Sketches
Sketches are a custom Datadog data structure for calculating percentiles with low relative error rates. They are used to transmit any Distribution type metrics submitted to the agent. Datadog does not have public API support for submitting sketches directly, so there is no public API documentation.
Example filter:
@type == sketches
Example event:
A single distribution metric sent over DogStatsD, emitted from the agent as a DDSketch.
{
"sketches": [
{
"metric": "demo_custom_dist_metric",
"host": "demo.example.com",
"tags": [
"env:demo",
"tier:testing"
],
"dogsketches": [
{
"avg": 3.066667,
"sum": 9.200000,
"k": [
1355,
1417,
1435
],
"n": [
1,
1,
1
],
"ts": 1692108080,
"cnt": 3,
"min": 1.300000,
"max": 4.500000
}
]
}
]
}
Service checks
The results of Datadog service checks executed by the agent are also sent when the metrics endpoint is overridden, so these events will be delivered to Streamfold as well.
Example filter:
@type == check_run
Example event:
Two service check results initiated by the Datadog Agent; custom checks would also be included in this list.
{
"checks": [
{
"message": "",
"tags": null,
"check": "ntp.in_sync",
"host_name": "demo.example.com",
"timestamp": 1692107389,
"status": 0
},
{
"timestamp": 1692107389,
"status": 0,
"message": "",
"tags": [
"check:ntp"
],
"check": "datadog.agent.check_status",
"host_name": "demo.example.com"
}
]
}
Logs
Each log line submitted by the agent will be converted into a single event in Streamfold. This allows you to individually process log lines in a Streamfold stream. The schema for a log line matches the public API for submitting logs to Datadog.
Example filter:
@type == log
Example event:
Two event examples, each representing a single log line.
{
"ddsource": "demo",
"status": "error",
"hostname": "demo.example.com",
"service": "demo",
"ddtags": "image_name:ghcr.io/open-telemetry/demo,short_image:demo,image_tag:v1.0.0-recommendationservice,docker_image:ghcr.io/open-telemetry/demo:v1.0.0-recommendationservice,container_name:recommendation-service,container_id:af7f23ce29ccd2b8e20be6a1aaf3beac795881f9a15f20a22f9ab7f3115ac13d",
"message": "debug_error_string = \"UNKNOWN:failed to connect to all addresses; last error: UNKNOWN: ipv4:172.22.0.16:50053: Failed to connect to remote host: No route to host {created_time:\"2023-08-15T13:34:28.846768719+00:00\", grpc_status:14}\"",
"timestamp": 1692106468853
}
{
"ddsource": "demo",
"status": "error",
"hostname": "demo.example.com",
"service": "demo",
"ddtags": "image_name:ghcr.io/open-telemetry/demo,short_image:demo,image_tag:v1.0.0-recommendationservice,docker_image:ghcr.io/open-telemetry/demo:v1.0.0-recommendationservice,container_name:recommendation-service,container_id:af7f23ce29ccd2b8e20be6a1aaf3beac795881f9a15f20a22f9ab7f3115ac13d",
"message": "Unable to connect",
"timestamp": 1692106468853
}
Traces (APM)
Traces, or APM data, are sent in a single large payload that batches multiple spans together. Trace submission is not supported on the public API, so there is no public API documentation for trace spans.
Example filter:
@type == traces
Example event:
An example trace payload from a Ruby on Rails application. The spans have been trimmed down for brevity.
{
"error_tps": 10.000000,
"rare_sampler_enabled": false,
"hostname": "91857706c12078",
"env": "none",
"tracer_payloads": [
{
"container_id": "",
"language_version": "3.1.0",
"env": "development",
"tags": {},
"hostname": "",
"app_version": "0.0.1",
"language_name": "ruby",
"tracer_version": "1.10.1",
"runtime_id": "",
"chunks": [
{
"priority": 2,
"origin": "",
"spans": [
{
"parent_id": 3428916435696766167,
"start": 1691620559485806764,
"duration": 10999329,
"error": 0,
"meta": {
"rails.route.action": "index",
"operation": "controller",
"component": "action_pack",
"version": "0.0.1",
"env": "development",
"rails.route.controller": "HomeController"
},
"metrics": {
"_dd.measured": 1.000000
},
"name": "rails.action_controller",
"span_id": 4265833200870890720,
"trace_id": 3309294656658767415,
"type": "web",
"meta_struct": {},
"service": "banjo-store",
"resource": "HomeController#index"
}
],
"tags": {},
"dropped_trace": false
}
]
}
],
"tags": {},
"agent_version": "7.46.0",
"target_tps": 10.000000
}