Embedding VRL in Rust: Transformation at Your Fingertips


Vector, by Datadog, is known as a fast observability pipeline. At the core of its data manipulation layer is VRL (Vector Remap Language), a domain-specific language designed for transforming structured event data.
While most users interact with VRL through Vector config files, you can also embed VRL directly into your Rust application, giving you access to a production-grade transformation engine in a few lines of code.
In this post, we’ll explore what VRL is, and how to embed it as a Rust library to power custom pipelines, CLIs, or internal tooling.
What is VRL?
VRL (Vector Remap Language) is a safe, type-aware, embeddable language designed to transform observability data like logs, metrics, and traces.
It supports:
Field manipulation
Type coercion
Built-in functions (e.g. parse_json, strip_whitespace, parse_timestamp)
Error handling
Test cases (via vrl-stdlib)
It’s written in Rust, compiled to run in Vector's high-performance engine, and optimized for real-time transformation.
Why Use VRL as a Library?
Using VRL as a standalone Rust library allows you to:
Run Vector-style remaps without deploying Vector itself
Use VRL to transform arbitrary JSON data in your app
Create custom dev tools or CLI utilities for event shaping
Prototype and test transformations in a controlled environment
How to Use VRL in Rust
Here’s a minimal example showing how to parse and run VRL expressions in a Rust program.
🚀 Step 1: Create the project
cargo new vrl-intro && cd vrl-intro
cargo add log vrl serde_json
Step 2: Run a simple VRL Program
use log::warn;
use std::collections::BTreeMap;
use vrl::compiler::{TargetValue, TimeZone, compile, runtime::Runtime};
use vrl::prelude::*;
use vrl::{
    value,
    value::{Secrets, Value},
};

fn main() {
    let functions = vrl::stdlib::all();
    let program = r#"
        # Remove key foo
        del(.foo)
        # Parse HTTP status code into local variable
        http_status_code = parse_int!(.http_status)
        del(.http_status) # remove key http_status
        # Add key status
        if http_status_code >= 200 && http_status_code <= 299 {
            .status = "success"
        } else {
            .status = "error"
        }
    "#;

    let result = compile(&program, &functions).unwrap();
    if !result.warnings.is_empty() {
        warn!("{:?}", result.warnings);
    }

    let mut target = TargetValue {
        value: value!({
            message: "Hello VRL",
            foo: "delete me",
            http_status: "200"
        }),
        metadata: Value::Object(BTreeMap::new()),
        secrets: Secrets::new(),
    };

    let mut runtime = Runtime::default();
    runtime
        .resolve(&mut target, &result.program, &TimeZone::default())
        .expect("Failed to resolve");

    assert_eq!(
        target.value,
        value!({
            message: "Hello VRL",
            status: "success",
        })
    );
}
In the code above we use the following VRL program:
# Remove key foo
del(.foo)
# Parse HTTP status code into local variable
http_status_code = parse_int!(.http_status)
del(.http_status) # remove key http_status
# Add key status
if http_status_code >= 200 && http_status_code <= 299 {
    .status = "success"
} else {
    .status = "error"
}
to transform the following event:
{
    "message": "Hello VRL",
    "foo": "delete me",
    "http_status": "200"
}
which gives us the following output:
{
    "message": "Hello VRL",
    "status": "success"
}
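For readers who know Rust better than VRL, here is a rough plain-Rust sketch of what that program does, using only the standard library. It is not how VRL is implemented: the Event alias and the remap and event_from helpers are made up for illustration, and the event is simplified to a map of strings. The `?` on the parse step plays the role of the `!` in parse_int!, which aborts the program when the value is not an integer.

```rust
use std::collections::BTreeMap;

// Simplified stand-in for a VRL event: string keys, string values.
type Event = BTreeMap<String, String>;

// Hypothetical helper that builds an event from string pairs.
fn event_from(pairs: &[(&str, &str)]) -> Event {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}

// Rough equivalent of the VRL program, written by hand.
fn remap(mut event: Event) -> Result<Event, String> {
    // del(.foo)
    event.remove("foo");

    // http_status_code = parse_int!(.http_status), then del(.http_status)
    let raw = event.remove("http_status").ok_or("missing http_status")?;
    let http_status_code = raw
        .parse::<i64>()
        .map_err(|e| format!("invalid http_status {:?}: {}", raw, e))?;

    // if/else that sets .status
    let status = if (200..=299).contains(&http_status_code) {
        "success"
    } else {
        "error"
    };
    event.insert("status".to_string(), status.to_string());
    Ok(event)
}

fn main() {
    let event = event_from(&[
        ("message", "Hello VRL"),
        ("foo", "delete me"),
        ("http_status", "200"),
    ]);
    println!("{:?}", remap(event));
}
```

The VRL version is shorter and, more importantly, can be changed without recompiling the host application, which is the main reason to embed it.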
Many functions are available, and we loaded all of them by calling vrl::stdlib::all().
You can also add your own function, using an existing one such as abs.rs as a template. The "Overwriting an existing function" section below gives a more advanced example.
Notes and Tips
Performance: VRL compiles expressions once and executes quickly, making it suitable even in low-latency contexts.
Error Handling: Use the ?? operator or match functions in VRL to handle errors gracefully.
Testing: You can also use vrl-stdlib to unit test scripts independently.
Safe Evaluation: All VRL executions are sandboxed: no arbitrary code execution, file system access, or unsafe operations.
Overwriting an existing function
The split function takes a value and splits it by a pattern. However, it keeps the empty strings produced by leading, trailing, or zero-width matches, which can be surprising, as these examples show:
split("foobar", "b") --> ["foo", "ar"]
split("barbaz", "ba") --> ["", "r", "z"]
split("bab", "") --> ["", "b", "a", "b", ""]
What if what we wanted was this instead 👇
split("foobar", "b") --> ["foo", "ar"]
split("barbaz", "ba") --> ["r", "z"]
split("bab", "") --> ["b", "a", "b"]
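The only difference between the two sets of results is that empty strings are dropped. Rust's own str::split behaves like the first set, so the idea can be tried in plain Rust before touching VRL. This is a minimal std-only sketch; the two helper names are made up for illustration.

```rust
/// Default behavior: keeps the empty pieces produced by leading,
/// trailing, or zero-width matches.
fn split_keep_empty(input: &str, pattern: &str) -> Vec<String> {
    input.split(pattern).map(str::to_string).collect()
}

/// Desired behavior: the same split, with empty pieces dropped.
fn split_drop_empty(input: &str, pattern: &str) -> Vec<String> {
    input
        .split(pattern)
        .filter(|piece| !piece.is_empty())
        .map(str::to_string)
        .collect()
}

fn main() {
    println!("{:?}", split_keep_empty("barbaz", "ba")); // ["", "r", "z"]
    println!("{:?}", split_drop_empty("barbaz", "ba")); // ["r", "z"]
    println!("{:?}", split_keep_empty("bab", "")); // ["", "b", "a", "b", ""]
    println!("{:?}", split_drop_empty("bab", "")); // ["b", "a", "b"]
}
```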
In this case, we can simply overwrite the split function with our own code before compiling our program. To keep things simple, I've removed the limit parameter that exists in the original split function. Here's how to proceed:
- Add the code of the new split function
fn split(value: Value, pattern: Value) -> Resolved {
    let string = value.try_bytes_utf8_lossy()?;
    match pattern {
        Value::Regex(pattern) => Ok(pattern // the pattern is first evaluated as a regex
            .splitn(string.as_ref(), usize::MAX) // I just use MAX as the limit
            .filter(|s| !s.is_empty()) // keep an eye on this
            .collect::<Vec<_>>()
            .into()),
        Value::Bytes(bytes) => { // if the pattern is not a regex then it's a string
            let pattern = String::from_utf8_lossy(&bytes);
            Ok(string
                .splitn(usize::MAX, pattern.as_ref())
                .filter(|s| !s.is_empty())
                .collect::<Vec<_>>()
                .into())
        }
        value => Err(ValueError::Expected {
            got: value.kind(),
            expected: Kind::regex() | Kind::bytes(),
        }
        .into()),
    }
}

#[derive(Clone, Copy, Debug)]
pub struct Split;

impl Function for Split {
    fn identifier(&self) -> &'static str {
        "split"
    }

    fn parameters(&self) -> &'static [Parameter] {
        &[
            Parameter {
                keyword: "value",
                kind: kind::BYTES,
                required: true,
            },
            Parameter {
                keyword: "pattern",
                kind: kind::BYTES | kind::REGEX, // is a string or a regex
                required: true,
            },
        ]
    }

    fn examples(&self) -> &'static [Example] {
        &[
            Example {
                title: "split string",
                source: r#"split("foobar", "b")"#,
                result: Ok(r#"["foo", "ar"]"#),
            },
            Example {
                title: "split once",
                source: r#"split("foobarbaz", "ba")"#,
                result: Ok(r#"["foo", "r", "z"]"#),
            },
            Example {
                title: "split regex",
                source: r#"split("barbaz", r'ba')"#,
                result: Ok(r#"["r", "z"]"#),
            },
        ]
    }

    fn compile(
        &self,
        _state: &state::TypeState,
        _ctx: &mut FunctionCompileContext,
        arguments: ArgumentList,
    ) -> Compiled {
        let value = arguments.required("value");
        let pattern = arguments.required("pattern");
        Ok(SplitFn { value, pattern }.as_expr())
    }
}

#[derive(Debug, Clone)]
pub(crate) struct SplitFn {
    value: Box<dyn Expression>,
    pattern: Box<dyn Expression>,
}

impl FunctionExpression for SplitFn {
    fn resolve(&self, ctx: &mut Context) -> Resolved {
        let value = self.value.resolve(ctx)?;
        let pattern = self.pattern.resolve(ctx)?;
        split(value, pattern)
    }

    fn type_def(&self, _: &state::TypeState) -> TypeDef {
        TypeDef::array(Collection::from_unknown(Kind::bytes())).infallible()
    }
}
- Add tests
...
#[cfg(test)]
mod tests {
    use super::*;
    use serde_json::json;

    #[test]
    fn test_split_with_b_pattern() {
        let test = crate::split("foobar".into(), "b".into());
        assert_eq!(test, Ok(json!(["foo", "ar"]).into()));
    }

    #[test]
    fn test_split_examples() {
        use vrl::prelude::Function;

        let mut functions = vrl::stdlib::all();
        let split = functions
            .iter_mut()
            .find(|f| f.identifier() == "split")
            .unwrap();
        *split = Box::new(Split);

        let mut runtime = vrl::compiler::runtime::Runtime::default();
        let timezone = vrl::compiler::TimeZone::default();

        for example in Split.examples() {
            let compiled = vrl::compiler::compile(example.source, &functions).unwrap();
            assert!(
                compiled.warnings.is_empty(),
                "Warnings during compile: {:?}",
                compiled.warnings
            );

            let mut target_value = vrl::compiler::TargetValue {
                value: serde_json::json!({}).into(),
                metadata: vrl::value::Value::Object(Default::default()),
                secrets: vrl::value::Secrets::new(),
            };

            let result = runtime
                .resolve(&mut target_value, &compiled.program, &timezone)
                .expect("Failed to resolve program");

            let expected: vrl::value::Value =
                serde_json::from_str(example.result.as_ref().unwrap()).unwrap();
            assert_eq!(result, expected, "Failed example: {}", example.title);
        }
    }
}
I know it’s a lot, but the most important parts are:
The call to .filter(|s| !s.is_empty()) to get rid of empty strings before collecting them
The examples function that shows what we expect in a few specific cases
A simple test named test_split_with_b_pattern that calls the split function directly from Rust
A complete test named test_split_examples that uses the new Split struct to overwrite the one in vrl::stdlib::all(), then goes through every example: it compiles the source program, runs it, and compares the result with the expected output
cargo test
Blocking waiting for file lock on build directory
Compiling vrl-test v0.1.0 (/Users/cscetbon/src/git/vrl-test)
Finished `test` profile [unoptimized + debuginfo] target(s) in 1.98s
Running unittests src/main.rs (target/debug/deps/vrl_test-51e3932d7c0094c0)
running 2 tests
test tests::test_split_with_b_pattern ... ok
test tests::test_split_examples ... ok
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.01s
The code is available at https://github.com/cscetbon/vrl-test if you want to tinker with it.
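The replacement step in test_split_examples (find the entry named "split", then assign a new boxed value to it) can also be looked at in isolation. The sketch below uses only std: the Function trait here is a small stand-in for the one in vrl, the origin method exists only to show which implementation is active, and register is a made-up helper. Unlike the test, it looks up the index with position and falls back to appending instead of calling unwrap, so a missing name does not panic.

```rust
// Stand-in for vrl's `Function` trait so the sketch needs only std.
trait Function {
    fn identifier(&self) -> &'static str;
    // Hypothetical method, only here to show which implementation is active.
    fn origin(&self) -> &'static str;
}

struct StdlibSplit;
struct StdlibUpcase;
struct CustomSplit;

impl Function for StdlibSplit {
    fn identifier(&self) -> &'static str { "split" }
    fn origin(&self) -> &'static str { "stdlib" }
}

impl Function for StdlibUpcase {
    fn identifier(&self) -> &'static str { "upcase" }
    fn origin(&self) -> &'static str { "stdlib" }
}

impl Function for CustomSplit {
    fn identifier(&self) -> &'static str { "split" }
    fn origin(&self) -> &'static str { "custom" }
}

/// Replaces the function that has the same identifier, or appends it.
fn register(functions: &mut Vec<Box<dyn Function>>, new_fn: Box<dyn Function>) {
    let id = new_fn.identifier();
    match functions.iter().position(|f| f.identifier() == id) {
        Some(index) => functions[index] = new_fn,
        None => functions.push(new_fn),
    }
}

/// Lists (identifier, origin) pairs in registry order.
fn origins(functions: &[Box<dyn Function>]) -> Vec<(&'static str, &'static str)> {
    functions.iter().map(|f| (f.identifier(), f.origin())).collect()
}

fn main() {
    let mut functions: Vec<Box<dyn Function>> =
        vec![Box::new(StdlibSplit), Box::new(StdlibUpcase)];
    register(&mut functions, Box::new(CustomSplit));
    println!("{:?}", origins(&functions)); // [("split", "custom"), ("upcase", "stdlib")]
}
```

Replacing in place keeps a single entry per identifier, so the registry never holds two functions with the same name.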
But what about extending the existing functions?
As in the previous example, you would define your new functions, such as YourNewFunction1, YourNewFunction2, and YourNewFunction3, and then add them to your vector of functions:
...
let mut functions = vrl::stdlib::all();
functions.extend(vec![
    Box::new(YourNewFunction1) as Box<dyn Function>,
    Box::new(YourNewFunction2),
    Box::new(YourNewFunction3),
]);
...
You need as Box<dyn Function> on the first element so that Rust infers Box<dyn Function> as the element type of that vector (the vec type, not Vector from Datadog 😵‍💫). Without the cast, the element type is inferred from the first item as Box<YourNewFunction1>, and the remaining items, which have different concrete types, fail to type-check. With the cast in place, they are coerced to Box<dyn Function> automatically.
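The inference rule is easy to check outside of VRL. In this std-only sketch, the Function trait is a minimal stand-in for the one in vrl and the identifiers helper is made up for illustration. It also shows an alternative that avoids the cast by annotating the binding.

```rust
// Stand-in for vrl's `Function` trait so the sketch needs only std.
trait Function {
    fn identifier(&self) -> &'static str;
}

struct YourNewFunction1;
struct YourNewFunction2;

impl Function for YourNewFunction1 {
    fn identifier(&self) -> &'static str { "your_new_function_1" }
}

impl Function for YourNewFunction2 {
    fn identifier(&self) -> &'static str { "your_new_function_2" }
}

/// Collects the identifiers in order.
fn identifiers(functions: &[Box<dyn Function>]) -> Vec<&'static str> {
    functions.iter().map(|f| f.identifier()).collect()
}

fn main() {
    // Option 1: cast the first element; the later ones are coerced to match.
    let with_cast = vec![
        Box::new(YourNewFunction1) as Box<dyn Function>,
        Box::new(YourNewFunction2),
    ];

    // Option 2: annotate the binding, so no cast is needed.
    let annotated: Vec<Box<dyn Function>> =
        vec![Box::new(YourNewFunction1), Box::new(YourNewFunction2)];

    // With neither hint, the element type would be inferred as
    // `Box<YourNewFunction1>` and the second element would not type-check.
    println!("{:?}", identifiers(&with_cast));
    println!("{:?}", identifiers(&annotated));
}
```

Option 2 does not help when the vec! is passed straight to extend, as in the snippet above, because there is no binding to annotate. That is where the cast on the first element is the shorter route.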
Real-World Use Cases
Embedding in CLI tools to batch-process JSON logs.
Integrating into ingestion pipelines where lightweight transformation is needed before sending to S3/Redshift.
Running tests for VRL scripts in CI.
Applying consistent filters or transformations across services.
Conclusion
With VRL embedded in your Rust app, you have a reliable, production-ready data transformation engine—ideal for logs, metrics, and more. Whether you're building dev tools, data pipelines, or internal observability layers, VRL offers clarity, safety, and speed for your transformations. In an upcoming post, I'll introduce a new syntax I developed for representing data in a graph database. Stay tuned!
Written by Cyril Scetbon
