Rust DataFrame Alternatives to Polars: Meet Elusion v4.0.0

The Rust ecosystem has seen tremendous growth in data processing libraries, with Polars leading the charge as a blazingly fast DataFrame library.
However, a new contender has emerged that takes a fundamentally different approach to data engineering and analysis: Elusion.
While Polars focuses on pure performance and memory efficiency with its Apache Arrow-based columnar engine, Elusion is equally dedicated to performance and memory efficiency (it is likewise built on Apache Arrow, via DataFusion), but positions itself as a comprehensive data engineering platform that prioritizes flexibility, ease of use, and integration capabilities alongside high performance.
Architecture Philosophy: Different Approaches to the Same Goals
Polars: Performance-First Design
Polars is written from scratch in Rust, designed close to the machine and without external dependencies. It's based on Apache Arrow's memory model, provides very cache-efficient columnar data structures, and focuses on:
- Ultra-fast query execution with SIMD optimizations
- Memory-efficient columnar processing
- Lazy evaluation with query optimization
- Streaming for out-of-core processing
Elusion: Flexibility-First Design
Elusion takes a different approach, prioritizing developer experience and integration capabilities:
- Core Philosophy: "Elusion wants you to be you!"
Unlike traditional DataFrame libraries that enforce specific patterns, Elusion offers flexibility in constructing queries without enforcing specific patterns or chaining orders. You can build your queries in ANY SEQUENCE that makes sense to you, writing functions in ANY ORDER, and Elusion ensures consistent results regardless of the function call order.
Loading files into DataFrames:
- Regular loading, CustomDataFrame::new(): ~4.95 seconds for complex queries on 900k rows
- Streaming loading, CustomDataFrame::new_with_stream(): ~3.62 seconds for the same operations
- Performance improvement: 26.9% faster with the streaming approach
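Timings like the ones above are straightforward to collect with the standard library. Here is a minimal, self-contained harness sketch; the two sleep-based closures are placeholders standing in for the actual `CustomDataFrame::new(...)` and `CustomDataFrame::new_with_stream(...)` workloads, which you would substitute in a real benchmark.

```rust
use std::time::{Duration, Instant};

// Time an arbitrary workload and report how long it took.
fn time_it<F: FnOnce()>(label: &str, work: F) -> Duration {
    let start = Instant::now();
    work();
    let elapsed = start.elapsed();
    println!("{label}: {:.2?}", elapsed);
    elapsed
}

fn main() {
    // Placeholder workloads so the harness runs standalone; swap in the
    // regular and streaming Elusion query runs to reproduce the comparison.
    let regular = time_it("regular", || std::thread::sleep(Duration::from_millis(20)));
    let streaming = time_it("streaming", || std::thread::sleep(Duration::from_millis(10)));

    let speedup = 100.0 * (1.0 - streaming.as_secs_f64() / regular.as_secs_f64());
    println!("streaming was {speedup:.1}% faster");
}
```

Averaging several runs (and discarding a warm-up run) gives more stable numbers than a single measurement.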
- Polars approach:
let df = LazyCsvReader::new("data.csv").finish()?
.filter(col("amount").gt(lit(100)))
.select([col("customer"), col("amount")])
.collect()?;
- Elusion approach - flexible ordering:
let df = CustomDataFrame::new("data.csv", "sales").await?
.filter("amount > 100")
.select(["customer", "amount"])
.elusion("result").await?;
// Or reorder as you see fit - same result
let df = CustomDataFrame::new("data.csv", "sales").await?
.select(["customer", "amount"]) // Select first
.filter("amount > 100") // Filter second
.elusion("result").await?;
Polars Basic file loading:
let df = LazyCsvReader::new("data.csv").finish()?
.collect()?;
// Parquet with options
let df = LazyFrame::scan_parquet("data.parquet", ScanArgsParquet::default())?
.collect()?;
Elusion Data Loading - Comprehensive Sources:
use elusion::prelude::*;
Local files with auto-recognition
let df = CustomDataFrame::new("data.csv", "sales").await?;
let df = CustomDataFrame::new("data.xlsx", "sales").await?; // Excel support
let df = CustomDataFrame::new("data.parquet", "sales").await?;
Streaming for large files (currently only supports .csv files)
let df = CustomDataFrame::new_with_stream("large_data.csv", "sales").await?;
Load entire folders
let df = CustomDataFrame::load_folder(
"/path/to/folder",
Some(vec!["csv", "xlsx"]), // Filter file types or `None` for all types
"combined_data"
).await?;
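Conceptually, the type filter in `load_folder` boils down to enumerating a directory and keeping only files with matching extensions before combining them. A stdlib-only sketch of that selection step (the folder name and demo files here are made up for illustration; this is not Elusion's internal code):

```rust
use std::fs;
use std::path::PathBuf;

// Keep files in `dir` whose extension is in `allowed`; `None` keeps every file.
// This mirrors the Some(vec!["csv", "xlsx"]) / None filter load_folder accepts.
fn matching_files(dir: &str, allowed: Option<&[&str]>) -> std::io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let keep = match (allowed, path.extension()) {
            (None, _) => path.is_file(),
            (Some(exts), Some(ext)) => {
                path.is_file() && exts.iter().any(|e| ext.eq_ignore_ascii_case(e))
            }
            _ => false,
        };
        if keep {
            files.push(path);
        }
    }
    Ok(files)
}

fn main() -> std::io::Result<()> {
    // Build a small demo folder in the temp directory so this runs standalone.
    let dir = std::env::temp_dir().join("elusion_folder_demo");
    fs::create_dir_all(&dir)?;
    fs::write(dir.join("sales.csv"), "a,b\n1,2\n")?;
    fs::write(dir.join("notes.txt"), "skipped by the filter")?;
    for f in matching_files(dir.to_str().unwrap(), Some(&["csv", "xlsx"]))? {
        println!("{}", f.display());
    }
    Ok(())
}
```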
Azure Blob Storage (currently supports csv and json files)
let df = CustomDataFrame::from_azure_with_sas_token(
"https://account.blob.core.windows.net/container",
"sas_token",
Some("folder/file.csv"), //or keep `None` to take everything from folder
"azure_data"
).await?;
SharePoint
let df = CustomDataFrame::load_from_sharepoint(
"tenant-id",
"client-id",
"https://company.sharepoint.com/sites/Site",
"Documents/data.xlsx",
"sharepoint_data"
).await?;
REST API to DataFrame
let api = ElusionApi::new();
api.from_api_with_headers(
"https://api.example.com/data",
headers,
"/path/to/output.json"
).await?;
let df = CustomDataFrame::new("/path/to/output.json", "api_data").await?;
Database connections
let postgres_df = CustomDataFrame::from_postgres(&conn, query, "pg_data").await?;
let mysql_df = CustomDataFrame::from_mysql(&conn, query, "mysql_data").await?;
Polars: Structured Approach
Polars requires logical ordering
let result = df
.lazy()
.filter(col("amount").gt(lit(100)))
.group_by([col("category")])
.agg([col("amount").sum().alias("total")])
.sort(["total"], SortMultipleOptions::default())
.collect()?;
Elusion: Any-Order Flexibility
All of these produce the same result:
Traditional order:
let result1 = df
.select(["category", "amount"])
.filter("amount > 100")
.agg(["SUM(amount) as total"])
.group_by(["category"])
.order_by(["total"], ["DESC"])
.elusion("result").await?;
Filter first
let result2 = df
.filter("amount > 100")
.agg(["SUM(amount) as total"])
.select(["category", "amount"])
.group_by(["category"])
.order_by(["total"], ["DESC"])
.elusion("result").await?;
Aggregation first
let result3 = df
.agg(["SUM(amount) as total"])
.filter("amount > 100")
.group_by(["category"])
.select(["category", "amount"])
.order_by(["total"], ["DESC"])
.elusion("result").await?;
All produce identical results!
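One way to see how any-order chaining can stay deterministic: each method only records its clause as builder state, and the query is assembled in a single canonical order at execution time. This is a conceptual sketch of that idea, not Elusion's actual internals; the `QueryBuilder` type and `to_sql` method are invented here for illustration.

```rust
// Each method records its clause; `to_sql` assembles them in one canonical
// order, so the chaining order of the calls cannot change the result.
#[derive(Default, Clone)]
struct QueryBuilder {
    select: Vec<String>,
    filter: Option<String>,
    group_by: Vec<String>,
    agg: Vec<String>,
    order_by: Option<String>,
}

impl QueryBuilder {
    fn select(mut self, cols: &[&str]) -> Self { self.select = cols.iter().map(|s| s.to_string()).collect(); self }
    fn filter(mut self, cond: &str) -> Self { self.filter = Some(cond.to_string()); self }
    fn agg(mut self, exprs: &[&str]) -> Self { self.agg = exprs.iter().map(|s| s.to_string()).collect(); self }
    fn group_by(mut self, cols: &[&str]) -> Self { self.group_by = cols.iter().map(|s| s.to_string()).collect(); self }
    fn order_by(mut self, col: &str, dir: &str) -> Self { self.order_by = Some(format!("{col} {dir}")); self }

    fn to_sql(&self, table: &str) -> String {
        // When aggregating, the output columns are the group keys plus the
        // aggregates; otherwise fall back to the plain select list.
        let mut cols: Vec<String> = self.group_by.clone();
        cols.extend(self.agg.iter().cloned());
        if cols.is_empty() { cols = self.select.clone(); }
        let mut sql = format!("SELECT {} FROM {}", cols.join(", "), table);
        if let Some(f) = &self.filter { sql.push_str(&format!(" WHERE {f}")); }
        if !self.group_by.is_empty() { sql.push_str(&format!(" GROUP BY {}", self.group_by.join(", "))); }
        if let Some(o) = &self.order_by { sql.push_str(&format!(" ORDER BY {o}")); }
        sql
    }
}

fn main() {
    // Same clauses, two different chaining orders.
    let a = QueryBuilder::default()
        .select(&["category", "amount"])
        .filter("amount > 100")
        .agg(&["SUM(amount) AS total"])
        .group_by(&["category"])
        .order_by("total", "DESC");
    let b = QueryBuilder::default()
        .order_by("total", "DESC")
        .group_by(&["category"])
        .agg(&["SUM(amount) AS total"])
        .filter("amount > 100")
        .select(&["category", "amount"]);
    assert_eq!(a.to_sql("sales"), b.to_sql("sales"));
    println!("{}", a.to_sql("sales"));
}
```

Because every clause lands in a named field rather than being applied immediately, reordering the calls only changes which assignment happens first, never the assembled query.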
Advanced Features: Where Elusion Shines
- Built-in Visualization and Reporting: create interactive dashboards
let plots = [
(&line_plot, "Sales Timeline"),
(&bar_chart, "Category Performance"),
(&histogram, "Distribution Analysis"),
];
let tables = [
(&summary_table, "Summary Stats"),
(&detail_table, "Transaction Details")
];
CustomDataFrame::create_report(
Some(&plots),
Some(&tables),
"Sales Analysis Dashboard",
"dashboard.html",
Some(layout_config),
Some(table_options)
).await?;
- Automated Pipeline Scheduling: schedule data engineering pipelines
let scheduler = PipelineScheduler::new("5min", || async {
// Load from Azure
let df = CustomDataFrame::from_azure_with_sas_token(
azure_url, sas_token, Some("folder/"), "raw_data"
).await?;
// Process data
let processed = df
.select(["date", "amount", "category"])
.agg(["SUM(amount) as total", "COUNT(*) as transactions"])
.group_by(["date", "category"])
.order_by(["date"], ["ASC"])
.elusion("processed").await?;
// Write results
processed.write_to_parquet(
"overwrite",
"output/processed_data.parquet",
None
).await?;
Ok(())
}).await?;
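The `"5min"` argument implies a mapping from human-readable interval strings to durations. As a sketch of what such parsing could look like (the accepted suffix set here is an assumption for illustration, not Elusion's documented grammar):

```rust
use std::time::Duration;

// Parse strings like "30s", "5min", "2h" into a Duration.
// The suffix set is illustrative; check Elusion's docs for the real one.
fn parse_interval(spec: &str) -> Option<Duration> {
    // Split at the first non-digit character: "5min" -> ("5", "min").
    let split = spec.find(|c: char| !c.is_ascii_digit())?;
    let (num, unit) = spec.split_at(split);
    let n: u64 = num.parse().ok()?;
    let secs = match unit {
        "s" | "sec" => n,
        "min" => n * 60,
        "h" | "hour" => n * 3_600,
        _ => return None,
    };
    Some(Duration::from_secs(secs))
}

fn main() {
    assert_eq!(parse_interval("5min"), Some(Duration::from_secs(300)));
    assert_eq!(parse_interval("2h"), Some(Duration::from_secs(7_200)));
    println!("5min -> {:?}", parse_interval("5min").unwrap());
}
```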
- Advanced JSON Processing: can handle complex JSON structures with arrays and objects
let df = CustomDataFrame::new("complex_data.json", "json_data").await?;
If you have JSON fields/columns in your files, you can explode them:
- Extract simple JSON fields:
let simple = df.json([
"metadata.'$timestamp' AS event_time",
"metadata.'$user_id' AS user",
"data.'$amount' AS transaction_amount"
]);
- Extract from JSON arrays:
let complex = df.json_array([
"events.'$value:id=purchase' AS purchase_amount",
"events.'$timestamp:id=login' AS login_time",
"events.'$status:type=payment' AS payment_status"
]);
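To make the `'$field:key=value'` selector syntax concrete: `$value:id=purchase` reads as "take the `value` field from the array element whose `id` equals `purchase`". Here is a small parser sketch that splits a selector into its field name and optional filter; the `Selector` struct and function name are illustrative, not part of Elusion's API.

```rust
// Illustrative model of the "'$field:key=value'" selectors shown above.
#[derive(Debug, PartialEq)]
struct Selector<'a> {
    field: &'a str,                     // field to extract
    filter: Option<(&'a str, &'a str)>, // optional (key, expected value) on the same element
}

fn parse_selector(raw: &str) -> Option<Selector<'_>> {
    let body = raw.strip_prefix('$')?;
    match body.split_once(':') {
        // "$amount" -> plain field, no filter.
        None => Some(Selector { field: body, filter: None }),
        // "$value:id=purchase" -> field plus a key=value filter.
        Some((field, cond)) => {
            let (key, val) = cond.split_once('=')?;
            Some(Selector { field, filter: Some((key, val)) })
        }
    }
}

fn main() {
    let s = parse_selector("$value:id=purchase").unwrap();
    assert_eq!(s.field, "value");
    assert_eq!(s.filter, Some(("id", "purchase")));
    println!("{s:?}");
}
```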
When to Choose Which
Choose Polars When:
- Pure performance is the top priority
- You prefer structured, optimized query patterns
- Memory efficiency is critical
- You need minimal dependencies
Choose Elusion When:
- You need integration flexibility (cloud storage, APIs, databases)
- Developer experience and query flexibility matter
- You want built-in visualization and reporting
- You need automated pipeline scheduling
- You are working with diverse data sources (Excel, SharePoint, REST APIs)
- You prefer intuitive, any-order query building
Installation and Getting Started
- Polars
[dependencies]
polars = { version = "0.50.0", features = ["lazy"] }
- Elusion
[dependencies]
elusion = "4.0.0"
tokio = { version = "1.45.0", features = ["rt-multi-thread"] }
- Elusion With specific features
elusion = { version = "4.0.0", features = ["dashboard", "azure", "postgres"] }
Rust version requirement:
Polars: >= 1.80
Elusion: >= 1.81
Real-World Example: Sales Data Analysis
- Polars Implementation:
use polars::prelude::*;
let df = LazyCsvReader::new("sales.csv").finish()?
.filter(col("amount").gt(lit(100)))
.group_by([col("category")])
.agg([
col("amount").sum().alias("total_sales"),
col("amount").mean().alias("avg_sale"),
col("customer_id").n_unique().alias("unique_customers")
])
.sort(["total_sales"], SortMultipleOptions::default().with_order_descending(true))
.collect()?;
println!("{}", df);
- Elusion Implementation:
use elusion::prelude::*;
#[tokio::main]
async fn main() -> ElusionResult<()> {
// Load data (flexible source)
let df = CustomDataFrame::new("sales.csv", "sales").await?;
// Build query in any order that makes sense to you
let analysis = df
.filter("amount > 100")
.agg([
"SUM(amount) as total_sales",
"AVG(amount) as avg_sale",
"COUNT(DISTINCT customer_id) as unique_customers"
])
.group_by(["category"])
.order_by(["total_sales"], ["DESC"])
.elusion("sales_analysis").await?;
// If you'd like to display the result
analysis.display().await?;
// Create visualization
let bar_chart = analysis.plot_bar(
"category",
"total_sales",
Some("Sales by Category")
).await?;
// Generate report
CustomDataFrame::create_report(
Some(&[(&bar_chart, "Sales Performance")]),
Some(&[(&analysis, "Summary Table")]),
"Sales Analysis Report",
"sales_report.html",
None,
None
).await?;
Ok(())
}
Conclusion
Elusion v4.0.0 represents a paradigm shift in DataFrame libraries, prioritizing developer experience, integration flexibility, and comprehensive data engineering capabilities. The choice between Polars and Elusion depends on your priorities:
For raw computational performance and memory efficiency: Polars For comprehensive data engineering with flexible development: Elusion
Elusion's "any-order" query building, extensive integration capabilities, built-in visualization, and automated scheduling make it particularly attractive for teams that need to work with diverse data sources and want a more intuitive development experience. Both libraries showcase the power of Rust in the data processing space, offering developers high-performance alternatives to traditional Python-based solutions. The Rust DataFrame ecosystem is thriving, and having multiple approaches ensures that different use cases and preferences are well-served.
Try Elusion v4.0.0 today:
cargo add elusion@4.0.0
For more information and examples, visit the Elusion GitHub repository, and join the growing community of Rust data engineers who are discovering the flexibility and power of any-order DataFrame operations.
Written by Borivoj Grujicic