Verse::Schema

Ensuring the integrity and structure of incoming data is both important and time-consuming in modern web application development, especially when building APIs.

Invalid or unexpected data can lead to bugs, security risks, and frustrating user experiences. How can we effectively validate data, coerce it into the right types, and handle complex structures without drowning in boilerplate? Enter Verse::Schema.

The Challenge: Reliable Data Handling

Applications constantly interact with external data sources – API requests, database records, and configuration files. This data often arrives in less-than-ideal formats (e.g., strings instead of numbers, inconsistent keys). We need a robust system that can:

  1. Validate Structure: Ensure data conforms to expected shapes (hashes, arrays).

  2. Check Types: Verify that values have the correct data types (String, Integer, Boolean, etc.).

  3. Coerce Intelligently: Automatically convert compatible types (e.g., "123" to 123).

  4. Apply Rules: Enforce business logic beyond simple type checks.

  5. Provide Defaults: Handle missing optional values gracefully.

  6. Be Developer-Friendly: Offer a clear, concise, and maintainable way to define these rules.

Verse::Schema: Validation Made Clear and Powerful

Verse::Schema is a Ruby gem born from the need for a validation library that is both powerful and easy to understand. It provides an intuitive Domain-Specific Language (DSL) to define schemas, validate data against them, and transform it into a usable format.

The Journey to Verse::Schema

When initially building the Verse framework, we relied on the powerful Dry-rb ecosystem, particularly components like Dry::Validation, Dry::Schema, and Dry::Params, for handling data validation. While these tools are robust, we encountered significant hurdles as our needs evolved:

  1. Introspection Challenges: The internal Abstract Syntax Tree (AST) structure used by Dry-rb libraries made it difficult to introspect the defined schemas programmatically. This was a major roadblock for our goal of automatically generating API documentation directly from the schemas.

  2. Complexity and Learning Curve: The separation of concerns into multiple gems (Dry::Types, Dry::Schema, Dry::Logic, Dry::Validation Contracts/Params) created a steep learning curve. Developers found it confusing to navigate the distinctions and interactions between these components.

  3. Extensibility Hurdles: While extensible, tailoring the libraries to some of our specific needs required deep dives into complex internals.

These challenges led us to realize that a custom-built solution, tailored specifically for clarity, ease of use, and straightforward introspection, would better serve the goals of the Verse framework. Thus, Verse::Schema was created.

Core Principles:

  • Explicitness: Schemas clearly define the expected data structure and types.

  • Intelligent Coercion: Sensible automatic type conversions reduce manual effort.

  • Symbolized Keys: Hash keys are consistently converted to symbols.

  • Extensibility: Supports custom rules, transformations, and schema composition.
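To illustrate what the symbolized-keys principle means in practice, here is a minimal plain-Ruby sketch. The `deep_symbolize` helper is hypothetical and is not part of Verse::Schema; it only shows the kind of normalization that lets the rest of an application rely on symbol keys, even when the input came from JSON with string keys.

```ruby
require 'json'

# Recursively convert hash keys to symbols.
# Hypothetical helper for illustration; not part of Verse::Schema.
def deep_symbolize(value)
  case value
  when Hash
    value.each_with_object({}) { |(key, item), out| out[key.to_sym] = deep_symbolize(item) }
  when Array
    value.map { |item| deep_symbolize(item) }
  else
    value
  end
end

# JSON.parse returns string keys by default.
payload = JSON.parse('{"name": "Alice", "address": {"city": "Metropolis"}}')
normalized = deep_symbolize(payload)

puts normalized[:address][:city] # prints Metropolis
```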

Defining Your Data Structure: The Basics

At its heart, Verse::Schema uses a simple DSL to define fields, their types, and constraints.

require 'verse/schema'

# Define a basic schema for a user profile
UserProfileSchema = Verse::Schema.define do
  field(:name, String).filled # Must be a non-empty string
  field(:age, Integer).rule("must be 18 or older") { |age| age >= 18 }
  field?(:email, String) # Optional field
  field(:role, String).default("guest") # Field with a default value
end

# Validate incoming data
data = { name: "Alice", age: "30", email: "alice@example.com" }
result = UserProfileSchema.validate(data)

if result.success?
  puts "Validation successful!"
  p result.value
  # => { name: "Alice", age: 30, email: "alice@example.com", role: "guest" }
  # Note: age was coerced from "30" to 30
else
  puts "Validation failed:"
  p result.errors
end

# Example of failed validation
invalid_data = { name: "Bob", age: 17 }
invalid_result = UserProfileSchema.validate(invalid_data)
puts invalid_result.errors # => { age: ["must be 18 or older"] }

Key features demonstrated:

  • field: Defines a required field.

  • field?: Defines an optional field.

  • .filled: A common rule ensuring the value is present and not empty (for strings or arrays).

  • .rule: Defines custom validation logic with a descriptive message.

  • .default: Provides a default value if the field is missing.

  • Automatic Coercion: "30" was automatically converted to the integer 30.
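For comparison, the same checks can be written by hand. The sketch below is plain Ruby with no gem involved; the `validate_user` helper, its return shape, and most of its error messages are assumptions made for illustration, not the gem's behavior. It is meant to show how much boilerplate the four-line schema replaces.

```ruby
# Hand-written equivalent of the UserProfileSchema checks.
# Hypothetical helper; it does not use or mirror Verse::Schema internals.
def validate_user(data)
  errors = {}

  # name: required, non-empty string
  name = data[:name]
  errors[:name] = ["must be filled"] unless name.is_a?(String) && !name.empty?

  # age: coerce "30" to 30; Integer(..., exception: false) returns nil on failure
  raw_age = data[:age]
  age = raw_age.is_a?(Integer) ? raw_age : Integer(raw_age.to_s, exception: false)
  if age.nil?
    errors[:age] = ["must be an integer"]
  elsif age < 18
    errors[:age] = ["must be 18 or older"]
  end

  # email: optional, but must be a string when present
  email = data[:email]
  errors[:email] = ["must be a string"] unless email.nil? || email.is_a?(String)

  return [false, errors] unless errors.empty?

  # role: fall back to a default when the key is missing
  value = { name: name, age: age, role: data.fetch(:role, "guest") }
  value[:email] = email if email
  [true, value]
end

ok, value = validate_user({ name: "Alice", age: "30" })
# ok is true; value is { name: "Alice", age: 30, role: "guest" }

ok, errors = validate_user({ name: "Bob", age: 17 })
# ok is false; errors is { age: ["must be 18 or older"] }
```

Every new field adds another block like these, which is the repetition a declarative schema is designed to remove.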

Handling Complex Data Structures

Real-world data isn't always flat. Verse::Schema provides tools for handling nested objects, arrays, and dictionaries.

Nested Schemas (Structs)

AddressSchema = Verse::Schema.define do
  field(:street, String)
  field(:city, String)
end

PersonSchema = Verse::Schema.define do
  field(:name, String)
  field(:address, AddressSchema) # Nesting the AddressSchema
  # Alternatively, define inline:
  # field(:address) do
  #   field(:street, String)
  #   field(:city, String)
  # end
end

data = { name: "Charlie", address: { street: "123 Main St", city: "Metropolis" } }
result = PersonSchema.validate(data)
p result.value[:address] # => { street: "123 Main St", city: "Metropolis" }

Arrays (Collections)

# Array of simple types
TagsSchema = Verse::Schema.define do
  field(:tags, Array, of: String)
end
result = TagsSchema.validate({ tags: ["ruby", "validation", 123] })
p result.value # => { tags: ["ruby", "validation", "123"] } # 123 coerced to "123"

# Array of complex objects
ItemSchema = Verse::Schema.define do
  field(:id, Integer)
  field(:name, String)
end
OrderSchema = Verse::Schema.define do
  field(:items, Array, of: ItemSchema)
  # Or inline:
  # field(:items, Array) do
  #   field(:id, Integer)
  #   field(:name, String)
  # end
end
result = OrderSchema.validate({ items: [{ id: "1", name: "Gadget" }, { id: 2, name: "Widget" }] })
p result.value[:items]
# => [ { id: 1, name: "Gadget" }, { id: 2, name: "Widget" } ]

Dictionaries (Hashes with Typed Values)

ScoreSchema = Verse::Schema.define do
  field(:scores, Hash, of: Integer) # Values must be Integers
end
result = ScoreSchema.validate({ scores: { math: "95", science: 88.0 } })
p result.value # => { scores: { math: 95, science: 88 } }

Advanced Features

Verse::Schema goes beyond basic validation with powerful features for complex scenarios.

Custom Rules and Transformations

  • Field Rules: Apply multiple rules directly to fields (.rule(...)).

  • Schema Rules: Validate relationships between multiple fields (rule([:field1, :field2], ...)).

  • Transformations: Modify data after validation using .transform { |value| ... } on fields or the entire schema. This is great for parsing data (e.g., comma-separated strings to arrays) or instantiating objects.

SchemaWithTransform = Verse::Schema.define do
  field(:tags_string, String).transform { |str| str.split(',').map(&:strip) }
end
result = SchemaWithTransform.validate({ tags_string: " ruby , validation, gem " })
p result.value # => { tags_string: ["ruby", "validation", "gem"] }
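The transformation example covers only one of the three ideas listed above. For schema rules, which relate several fields to each other, a plain-Ruby sketch of the underlying idea may help. The `check_range` helper and the way it attaches the message to each field are assumptions for illustration; the gem's actual rule syntax and error layout are described in its README.

```ruby
# Plain-Ruby illustration of a cross-field rule.
# Hypothetical helper; not the Verse::Schema API.
def check_range(data)
  errors = {}
  if data[:max] < data[:min]
    message = "max must be greater than or equal to min"
    # Attach the message to every field that takes part in the rule.
    %i[min max].each { |field| (errors[field] ||= []) << message }
  end
  errors
end

check_range({ min: 1, max: 10 }) # returns {} (no errors)
check_range({ min: 10, max: 1 }) # returns errors for both :min and :max
```

A check like this only makes sense once each field has passed its own type validation, since the comparison assumes both values are already numbers.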

Polymorphic Data with Selectors

Handle data structures where the shape depends on the value of another field (e.g., an event object whose data field structure depends on the event_type).

FacebookData = Verse::Schema.define { field(:url, String) }
TwitterData = Verse::Schema.define { field(:tweet_id, String) }

EventSchema = Verse::Schema.define do
  field(:source, Symbol).in?(%i[facebook twitter])
  field(:data, { facebook: FacebookData, twitter: TwitterData }, over: :source)
end

result_fb = EventSchema.validate({ source: :facebook, data: { url: "http://fb.com/..." } })
result_tw = EventSchema.validate({ source: :twitter, data: { tweet_id: "12345" } })

p result_fb.value[:data] # => { url: "http://fb.com/..." }
p result_tw.value[:data] # => { tweet_id: "12345" }

Schema Composition

Combine and reuse schema definitions:

  • Inheritance: Create new schemas based on existing ones (Verse::Schema.define(ParentSchema) do ... end).

  • Aggregation: Merge schemas using the + operator (SchemaA + SchemaB).
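As a mental model for the two composition styles, schemas can be pictured as sets of named, typed fields. The sketch below uses plain hashes for this; the constant names are hypothetical, and this is not how Verse::Schema stores or combines fields internally.

```ruby
# Plain-Ruby model of schema composition: a field set is a hash of name => type.
# Illustration only; Verse::Schema does not necessarily work this way.
BASE_FIELDS    = { id: Integer, name: String }.freeze
CONTACT_FIELDS = { email: String, phone: String }.freeze

# "Inheritance": start from the parent's fields, then add or override some.
ADMIN_FIELDS = BASE_FIELDS.merge(permissions: Array).freeze

# "Aggregation": combine two independent field sets into one.
PROFILE_FIELDS = BASE_FIELDS.merge(CONTACT_FIELDS).freeze

p PROFILE_FIELDS.keys # prints [:id, :name, :email, :phone]
```

In both cases the original definitions stay untouched, which is what makes small schemas reusable as building blocks.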

Data Classes: From Validation to Objects

Verse::Schema can automatically generate simple Struct-like data classes from your schemas. This provides a convenient, object-oriented way to access validated and coerced data, avoiding nested hash lookups.

# Using AddressSchema from earlier
Address = AddressSchema.dataclass

# Create an instance (validation happens automatically)
address_data = { street: "456 Oak Ave", city: "Gotham", zip: "54321" } # zip is extra, ignored by default
addr_obj = Address.new(address_data)

puts addr_obj.street # => "456 Oak Ave"
puts addr_obj.city   # => "Gotham"

# Access the underlying schema
# puts Address.schema
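For readers unfamiliar with the Struct-like style, the following sketch uses Ruby's built-in `Struct` as a stand-in. It involves no validation or coercion and is not what `dataclass` generates; `AddressStruct` is a hypothetical name. It only shows the difference between attribute access and hash lookups, and one way extra keys can be dropped.

```ruby
# Ruby's built-in Struct as a rough stand-in for a generated data class.
# Illustration only; no validation or coercion happens here.
AddressStruct = Struct.new(:street, :city, keyword_init: true)

raw = { street: "456 Oak Ave", city: "Gotham", zip: "54321" }

# Struct raises ArgumentError on unknown keywords, so keep only declared members.
addr = AddressStruct.new(**raw.slice(*AddressStruct.members))

puts addr.street # prints 456 Oak Ave
puts addr.city   # prints Gotham
```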

Why Choose Verse::Schema?

  • Clarity: The DSL is designed to be readable and explicit.

  • Simplicity: Focuses on common validation tasks without unnecessary complexity.

  • Powerful Coercion: Handles type conversions intelligently.

  • Flexibility: Supports complex rules, transformations, and polymorphism.

  • Composition: Encourages reusable schema definitions.

  • Introspection: Schemas can be inspected, enabling features like automatic documentation generation.

  • Data Classes: Easily bridge the gap between raw data and structured objects.

Get Started

Ready to simplify your data validation?

  1. Install the gem:

     gem install verse-schema
     # or add to your Gemfile
     gem 'verse-schema'
    
  2. Explore the code and documentation:

All the information, source code, and an extensive README are available on GitHub:

https://github.com/verse-rb/verse-schema

Give Verse::Schema a try in your next Ruby project and experience a cleaner, more robust way to handle data validation and coercion.

Written by

Yacine Petitprez