Embedding VRL in Rust: Transformation at Your Fingertips

Cyril Scetbon

Vector by Datadog is well known as a fast observability data pipeline. At the core of its data manipulation layer is VRL (Vector Remap Language), a domain-specific language designed for transforming structured event data.

While most users interact with VRL through Vector config files, you can also embed VRL directly into your Rust application, giving you access to a production-grade transformation engine in a few lines of code.

In this post, we’ll explore what VRL is, and how to embed it as a Rust library to power custom pipelines, CLIs, or internal tooling.


What is VRL?

VRL (Vector Remap Language) is a safe, type-aware, embeddable language designed to transform observability data like logs, metrics, and traces.

It supports:

  • Field manipulation

  • Type coercion

  • Built-in functions (e.g. parse_json, strip_whitespace, parse_timestamp)

  • Error handling

  • Test cases (via vrl-stdlib)

It’s written in Rust, compiled to run in Vector's high-performance engine, and optimized for real-time transformation.


Why Use VRL as a Library?

Using VRL as a standalone Rust library allows you to:

  • Run Vector-style remaps without deploying Vector itself

  • Use VRL to transform arbitrary JSON data in your app

  • Create custom dev tools or CLI utilities for event shaping

  • Prototype and test transformations in a controlled environment


How to Use VRL in Rust

Here’s a minimal example showing how to parse and run VRL expressions in a Rust program.

🚀 Step 1: Initialize the project

cargo new vrl-intro && cd vrl-intro
cargo add log vrl serde_json

Step 2: Run a simple VRL Program

use log::warn;
use std::collections::BTreeMap;
use vrl::compiler::{TargetValue, TimeZone, compile, runtime::Runtime};
use vrl::prelude::*;
use vrl::{
    value,
    value::{Secrets, Value},
};

fn main() {
    let functions = vrl::stdlib::all();
    let program = r#"
        # Remove key foo
        del(.foo)

        # Parse HTTP status code into local variable
        http_status_code = parse_int!(.http_status)
        del(.http_status) # remove key http_status

        # Add key status
        if http_status_code >= 200 && http_status_code <= 299 {
            .status = "success"
        } else {
            .status = "error"
        }
    "#;

    let result = compile(&program, &functions).unwrap();

    if !result.warnings.is_empty() {
        warn!("{:?}", result.warnings);
    }

    let mut target = TargetValue {
        value: value!({
            message: "Hello VRL",
            foo: "delete me",
            http_status: "200"
        }),
        metadata: Value::Object(BTreeMap::new()),
        secrets: Secrets::new(),
    };

    let mut runtime = Runtime::default();
    runtime
        .resolve(&mut target, &result.program, &TimeZone::default())
        .expect("Failed to resolve");
    assert_eq!(
        target.value,
        value!({
            message: "Hello VRL",
            status: "success",
        })
    );
}

In the code above, we use the following VRL program:

# Remove key foo
del(.foo)

# Parse HTTP status code into local variable
http_status_code = parse_int!(.http_status)
del(.http_status) # remove key http_status

# Add key status
if http_status_code >= 200 && http_status_code <= 299 {
  .status = "success"
} else {
  .status = "error"
}

to transform the following event:

{
  "message": "Hello VRL",
  "foo": "delete me",
  "http_status": "200"
}

which gives us the following output:

{
  "message": "Hello VRL",
  "status": "success"
}

Many functions are available, and we loaded all of them by calling vrl::stdlib::all(). You can also add your own function, using an existing one such as abs.rs as a template. A more advanced example follows later in this post.

Notes and Tips

  • Performance: VRL compiles expressions once and executes quickly, making it suitable even in low-latency contexts.

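  • Error Handling: Use the ?? error-coalescing operator (for example parse_int(.http_status) ?? 0), or capture the error with an assignment such as value, err = parse_int(.http_status), to handle errors gracefully.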

  • Testing: You can also use vrl-stdlib to unit test scripts independently.

  • Safe Evaluation: All VRL executions are sandboxed—no arbitrary code execution, file system access, or unsafe operations.


Overwriting an existing function

The split function splits a value by a pattern, which can be a string or a regex. However, it keeps empty strings in its result, which can be surprising, as these examples show:

split("foobar", "b") --> ["foo", "ar"]
split("barbaz", "ba") --> ["", "r", "z"]
split("bab", "") --> ["", "b", "a", "b", ""]

What if what we wanted was this instead 👇

split("foobar", "b") --> ["foo", "ar"]
split("barbaz", "ba") --> ["r", "z"]
split("bab", "") --> ["b", "a", "b"]
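The difference between the two sets of results comes down to whether empty pieces are kept. As a minimal sketch using only Rust's standard library (the helper names split_keep_empty and split_drop_empty are made up for illustration and are not part of VRL), the first helper mirrors the current behavior and the second one the behavior we want:

```rust
/// Splits like `str::split`, keeping the empty pieces it produces
/// (leading/trailing matches, or an empty pattern).
fn split_keep_empty<'a>(s: &'a str, pat: &str) -> Vec<&'a str> {
    s.split(pat).collect()
}

/// Same split, but empty pieces are filtered out before collecting.
fn split_drop_empty<'a>(s: &'a str, pat: &str) -> Vec<&'a str> {
    s.split(pat).filter(|piece| !piece.is_empty()).collect()
}

fn main() {
    // A match at the very start yields a leading empty piece.
    assert_eq!(split_keep_empty("barbaz", "ba"), vec!["", "r", "z"]);
    assert_eq!(split_drop_empty("barbaz", "ba"), vec!["r", "z"]);

    // An empty pattern matches at both ends of the string.
    assert_eq!(split_keep_empty("bab", ""), vec!["", "b", "a", "b", ""]);
    assert_eq!(split_drop_empty("bab", ""), vec!["b", "a", "b"]);

    // When no empty piece is produced, both helpers agree.
    assert_eq!(split_keep_empty("foobar", "b"), split_drop_empty("foobar", "b"));
}
```

The only difference between the two helpers is the filter call, which is the change the replacement function relies on.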

In this case, we can simply overwrite the split function with our own code before compiling our program. To keep things simple, I’ve removed the limit parameter that exists in the original split function. Here's how to proceed:

  1. Add the code of the new split function

fn split(value: Value, pattern: Value) -> Resolved {
    let string = value.try_bytes_utf8_lossy()?;
    match pattern {
        Value::Regex(pattern) => Ok(pattern // the pattern is first evaluated as a regex
            .splitn(string.as_ref(), usize::MAX) // I just use MAX as the limit
            .filter(|s| !s.is_empty()) // keep an eye on this
            .collect::<Vec<_>>()
            .into()),
        Value::Bytes(bytes) => { // if the pattern is not a regex then it's a string
            let pattern = String::from_utf8_lossy(&bytes);
            Ok(string
                .splitn(usize::MAX, pattern.as_ref())
                .filter(|s| !s.is_empty())
                .collect::<Vec<_>>()
                .into())
        }
        value => Err(ValueError::Expected {
            got: value.kind(),
            expected: Kind::regex() | Kind::bytes(),
        }
        .into()),
    }
}

#[derive(Clone, Copy, Debug)]
pub struct Split;

impl Function for Split {
    fn identifier(&self) -> &'static str {
        "split"
    }

    fn parameters(&self) -> &'static [Parameter] {
        &[
            Parameter {
                keyword: "value",
                kind: kind::BYTES,
                required: true,
            },
            Parameter {
                keyword: "pattern",
                kind: kind::BYTES | kind::REGEX, // is a string or a regex
                required: true,
            },
        ]
    }

    fn examples(&self) -> &'static [Example] {
        &[
            Example {
                title: "split string",
                source: r#"split("foobar", "b")"#,
                result: Ok(r#"["foo", "ar"]"#),
            },
            Example {
                title: "split on a multi-character pattern",
                source: r#"split("foobarbaz", "ba")"#,
                result: Ok(r#"["foo", "r", "z"]"#),
            },
            Example {
                title: "split regex",
                source: r#"split("barbaz", r'ba')"#,
                result: Ok(r#"["r", "z"]"#),
            },
        ]
    }

    fn compile(
        &self,
        _state: &state::TypeState,
        _ctx: &mut FunctionCompileContext,
        arguments: ArgumentList,
    ) -> Compiled {
        let value = arguments.required("value");
        let pattern = arguments.required("pattern");

        Ok(SplitFn { value, pattern }.as_expr())
    }
}

#[derive(Debug, Clone)]
pub(crate) struct SplitFn {
    value: Box<dyn Expression>,
    pattern: Box<dyn Expression>,
}

impl FunctionExpression for SplitFn {
    fn resolve(&self, ctx: &mut Context) -> Resolved {
        let value = self.value.resolve(ctx)?;
        let pattern = self.pattern.resolve(ctx)?;

        split(value, pattern)
    }

    fn type_def(&self, _: &state::TypeState) -> TypeDef {
        TypeDef::array(Collection::from_unknown(Kind::bytes())).infallible()
    }
}

  2. Add tests

...
#[cfg(test)]
mod tests {
    use super::*;
    use serde_json::json;

    #[test]
    fn test_split_with_b_pattern() {
        let test = crate::split("foobar".into(), "b".into());
        assert_eq!(test, Ok(json!(["foo", "ar"]).into()));
    }

    #[test]
    fn test_split_examples() {
        use vrl::prelude::Function;
        let mut functions = vrl::stdlib::all();
        let split = functions
            .iter_mut()
            .find(|f| f.identifier() == "split")
            .unwrap();
        *split = Box::new(Split);
        let mut runtime = vrl::compiler::runtime::Runtime::default();
        let timezone = vrl::compiler::TimeZone::default();

        for example in Split.examples() {
            let compiled = vrl::compiler::compile(example.source, &functions).unwrap();
            assert!(
                compiled.warnings.is_empty(),
                "Warnings during compile: {:?}",
                compiled.warnings
            );

            let mut target_value = vrl::compiler::TargetValue {
                value: serde_json::json!({}).into(),
                metadata: vrl::value::Value::Object(Default::default()),
                secrets: vrl::value::Secrets::new(),
            };

            let result = runtime
                .resolve(&mut target_value, &compiled.program, &timezone)
                .expect("Failed to resolve program");
            let expected: vrl::value::Value =
                serde_json::from_str(example.result.as_ref().unwrap()).unwrap();

            assert_eq!(result, expected, "Failed example: {}", example.title);
        }
    }
}

I know it’s a lot, but the most important parts are:

  • The call to .filter(|s| !s.is_empty()) to get rid of empty strings before collecting them

  • The examples function that shows what we expect in a few specific cases

  • A simple test named test_split_with_b_pattern that calls the split function directly from Rust

  • A complete test named test_split_examples that replaces the split entry in vrl::stdlib::all() with the new Split struct, then compiles and runs the source of every example and compares the value it returns to the expected result

cargo test
    Blocking waiting for file lock on build directory
   Compiling vrl-test v0.1.0 (/Users/cscetbon/src/git/vrl-test)
    Finished `test` profile [unoptimized + debuginfo] target(s) in 1.98s
     Running unittests src/main.rs (target/debug/deps/vrl_test-51e3932d7c0094c0)

running 2 tests
test tests::test_split_with_b_pattern ... ok
test tests::test_split_examples ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.01s

The code is available at https://github.com/cscetbon/vrl-test if you want to tinker with it.
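In test_split_examples the override happens inline. To apply the same override in main before calling compile, the lookup-and-replace step can be pulled into a small helper. The sketch below is only an illustration under stated assumptions: Function is a stand-in trait reduced to identifier (plus a hypothetical origin method, used only to show which implementation is registered) so that the code runs without the vrl crate, and override_function is a name invented here.

```rust
// Stand-in for `vrl::prelude::Function` (assumption: only `identifier` matters here).
trait Function {
    fn identifier(&self) -> &'static str;
    // Hypothetical method, not part of VRL: tells the implementations apart.
    fn origin(&self) -> &'static str;
}

struct StdlibSplit;
struct StdlibUpcase;
struct CustomSplit;

impl Function for StdlibSplit {
    fn identifier(&self) -> &'static str { "split" }
    fn origin(&self) -> &'static str { "stdlib" }
}

impl Function for StdlibUpcase {
    fn identifier(&self) -> &'static str { "upcase" }
    fn origin(&self) -> &'static str { "stdlib" }
}

impl Function for CustomSplit {
    fn identifier(&self) -> &'static str { "split" }
    fn origin(&self) -> &'static str { "custom" }
}

/// Replaces the function that has the same identifier as `replacement`,
/// or appends `replacement` when no such function exists.
/// Returns true when an existing entry was replaced.
fn override_function(
    functions: &mut Vec<Box<dyn Function>>,
    replacement: Box<dyn Function>,
) -> bool {
    let id = replacement.identifier();
    if let Some(slot) = functions.iter_mut().find(|f| f.identifier() == id) {
        *slot = replacement;
        return true;
    }
    functions.push(replacement);
    false
}

fn main() {
    let mut functions: Vec<Box<dyn Function>> =
        vec![Box::new(StdlibSplit), Box::new(StdlibUpcase)];

    let replaced = override_function(&mut functions, Box::new(CustomSplit));

    assert!(replaced);
    assert_eq!(functions.len(), 2); // replaced in place, nothing appended
    assert_eq!(functions[0].origin(), "custom");
}
```

Since the helper only relies on identifier(), the same body should carry over to the vector returned by vrl::stdlib::all(), with the real Function trait in place of the stand-in.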


But what about extending the existing functions?

As in the previous example, you would define your new functions, such as YourNewFunction1, YourNewFunction2, and YourNewFunction3, and then add them to your vector of functions:

...
let mut functions = vrl::stdlib::all();
functions.extend(vec![
  Box::new(YourNewFunction1) as Box<dyn Function>,
  Box::new(YourNewFunction2),
  Box::new(YourNewFunction3),
]);
...

The as Box<dyn Function> cast on the first element fixes the element type of the vector (Rust's Vec type, not Vector from Datadog 😵‍💫) to Box<dyn Function>, so the remaining elements are coerced to that type. Without the cast, Rust would infer Vec<Box<YourNewFunction1>> from the first element and reject the other elements with a type mismatch.
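The coercion rule can be tried in isolation. This is a minimal sketch, assuming a stand-in Function trait with a single identifier method so that it runs without the vrl crate. YourNewFunction1 and YourNewFunction2 are the placeholder names from above, and the helper names are invented for this example. It also shows an alternative to the cast: annotating the type of the binding.

```rust
// Stand-in for `vrl::prelude::Function` (assumption: reduced to one method).
trait Function {
    fn identifier(&self) -> &'static str;
}

struct YourNewFunction1;
struct YourNewFunction2;

impl Function for YourNewFunction1 {
    fn identifier(&self) -> &'static str { "your_new_function_1" }
}

impl Function for YourNewFunction2 {
    fn identifier(&self) -> &'static str { "your_new_function_2" }
}

/// Option 1: cast the first element. The element type becomes
/// `Box<dyn Function>` and the following elements are coerced to it.
fn extend_with_cast(functions: &mut Vec<Box<dyn Function>>) {
    functions.extend(vec![
        Box::new(YourNewFunction1) as Box<dyn Function>,
        Box::new(YourNewFunction2),
    ]);
}

/// Option 2: annotate the binding. Every element is coerced to the
/// annotated element type, so no cast is needed.
fn extend_with_annotation(functions: &mut Vec<Box<dyn Function>>) {
    let extra: Vec<Box<dyn Function>> =
        vec![Box::new(YourNewFunction1), Box::new(YourNewFunction2)];
    functions.extend(extra);
}

/// Collects the identifiers, in order, to inspect what was registered.
fn identifiers(functions: &[Box<dyn Function>]) -> Vec<&'static str> {
    functions.iter().map(|f| f.identifier()).collect()
}

fn main() {
    let mut functions: Vec<Box<dyn Function>> = Vec::new();
    extend_with_cast(&mut functions);
    extend_with_annotation(&mut functions);
    assert_eq!(functions.len(), 4);
    println!("{:?}", identifiers(&functions));
}
```

Both options produce the same vector, so the choice is mostly a matter of taste. The cast is shorter when the vec! literal is passed directly to extend.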


Real-World Use Cases

  • Embedding in CLI tools to batch-process JSON logs.

  • Integrating into ingestion pipelines where lightweight transformation is needed before sending to S3/Redshift.

  • Running tests for VRL scripts in CI.

  • Applying consistent filters or transformations across services.


Where to Go From Here


Conclusion

With VRL embedded in your Rust app, you have a reliable, production-ready data transformation engine—ideal for logs, metrics, and more. Whether you're building dev tools, data pipelines, or internal observability layers, VRL offers clarity, safety, and speed for your transformations. In an upcoming post, I'll introduce a new syntax I developed for representing data in a graph database. Stay tuned!
