Understanding YAML: A Beginner's Guide

What is YAML?

YAML is a data serialization language that is often used to create configuration files. The name originally stood for Yet Another Markup Language, but it now stands for YAML Ain't Markup Language. I'll tell you the reason later.


Why and where is it used?

So YAML is a human-readable data serialization language, but what does that mean? Suppose you have built an application and want to deploy it on different platforms. Your application has data and configuration that need to be modified for each platform. This is where YAML files come in. YAML serializes data: the data is converted into a stream of bytes so that it is easier to save or transmit. This process is called data serialization. In DevOps, YAML files play a crucial role in deploying applications.

Figure: Java serialization and deserialization (source: Studytonight)

YAML is used to create configuration files. Configuration files set the parameters and initial settings of a computer application or program, and these define how the application should run. Once the data has been serialized, anyone who wants to access the application on the other end deserializes it, which means the byte stream is converted back into objects.

There are many other data serialization languages, e.g. JSON, XML and CSV.

What sets YAML apart from these is that it is also used to create configuration files that are easy for humans to read.

That's why it's called YAML Ain't Markup Language: the name stresses that YAML is meant for data, not for marking up documents.

So now you know what YAML is.


Understanding a demo YAML file

YAML serializes data as key-value pairs. Like programming languages, YAML has several datatypes.

Let's first have a look at this simple YAML file.

This YAML file gives a glimpse of the different datatypes. Another thing you'll notice is that it's human readable.
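The original screenshot is not reproduced here, so the file below is only an illustrative stand-in. Apart from myself: Vishwajeet, every key and value in it is an assumption made for this sketch.

```yaml
# Illustrative demo file; only the "myself" key comes from the article
myself: Vishwajeet
fruit: mango
about: |
  I am a developer.
  I write about YAML.
count: 3
price: 22.4
message: >
  this text is
  folded into
  a single line
```

Even without knowing the rules yet, you can probably guess what each line means, which is the point of YAML.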

Key-Value pair

Now let's go through each part of this YAML file.

As you can see, the file contains the line myself: Vishwajeet.

Here, myself is the key and Vishwajeet is the value. Together they form a key-value pair, separated by a colon (:).

This syntax is followed throughout the file.

Now, one thing to keep in mind is that YAML is highly sensitive to syntax. For example, a misplaced whitespace can cause an error.
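A minimal sketch of the key-value syntax and of why whitespace matters. The details and language keys are assumptions added only for illustration.

```yaml
# A key-value pair: key, colon, space, value
myself: Vishwajeet

# Without the space after the colon, myself:Vishwajeet would be
# read as one plain string rather than as a key-value pair.

# Indentation (spaces, not tabs) expresses nesting.
details:
  language: YAML
```

Moving language: YAML back to the left margin would turn it into a separate top-level key instead of a child of details.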

YAML basic datatypes

You can state the datatype of a value explicitly by writing !!datatype after the colon (:).

In the above image you can see how the datatypes are written right after the key. These datatypes are:

!!float - for floating point values, e.g. 22.4, 55.6, 567.89, etc.

!!int - for integer values, e.g. 1, 2, 3, 4, 77, 88, etc.

!!str - for strings, e.g. "Hello", "How are you"

!!null - for null values, which are written as null or ~

!!timestamp - for time representation
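The tags listed above could be used as in the sketch below. All keys and values here are assumptions chosen for illustration, and !!timestamp comes from the YAML 1.1 type set, so support for it depends on your parser.

```yaml
# Explicit type tags written right after the colon
height: !!float 22.4
age: !!int 77
greeting: !!str "Hello"
nickname: !!null ~
joined: !!timestamp 2024-01-15
```

In practice the tags are usually optional, because most parsers infer these types from the value itself.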

There are also a few more things you need to learn.

Literal Block Scalar - In code lines 4-6 you can see a pipe symbol (|), which is used to denote a literal block scalar. It means that the scalar value (the value of the key) is interpreted literally, so that newlines are preserved.

Folded Block Scalar - In code lines 9-12 you can see the > symbol, which is used to denote a folded block scalar. It means that the value of the key is interpreted so that newlines are converted to spaces.
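To make the difference concrete, here is a small sketch that puts the same two lines of text through both styles. The keys literal and folded are assumptions for this example, and the comments show the equivalent quoted string.

```yaml
literal: |
  line one
  line two
# same as: literal: "line one\nline two\n"

folded: >
  line one
  line two
# same as: folded: "line one line two\n"
```

In both cases a single trailing newline is kept by default.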

So these were some basic YAML datatypes to get you started. You can check out more through YouTube tutorials, but I would recommend reading the official documentation.

YAML advanced datatypes

Now let's go a little further with YAML files.

Basic things about YAML files:

Sections (documents) within a file are separated using three hyphens (---).

I've used it only once, just as an example, in code line 5.

The end of a YAML document is marked by three full stops (...).
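A short sketch of both markers in one file. The name keys and their values are assumptions used only to give each section some content.

```yaml
# first section
name: app
---
# second section
name: db
...
```

A parser that supports multiple documents reads this as two separate sections, each with its own name key.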

Now let's understand this code


Sequence

This code snippet demonstrates two ways to define a list in YAML: the block sequence (lines 2-4) and the inline sequence (line 7). Both are valid and can be used according to the readability preference or context requirement. The !!seq is an explicit data type declaration for a sequence, which is optional as YAML can infer data types automatically. The comment on line 6 does not affect the data structure and is for human readability. The key takeaway is the flexibility of YAML in representing data structures in a human-readable format.

The YAML code in the above image defines a list under the key student:. Here’s a breakdown of the code:

  • Line 1: The key student: is followed by !!seq, indicating that the value is a sequence (list).

  • Lines 2-4: List items marks, name, and roll no are defined, each on a new line and prefixed with a hyphen -, which is the standard way to define list items in YAML.

  • Line 6: A comment is indicated by #, which is not processed by YAML parsers.

  • Line 7: An alternative inline syntax for defining the same list is shown, using square brackets [ ] and separating items with commas.
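Based on the breakdown above, the two list styles could be sketched as follows. The key student_inline is an assumption: it is renamed here only so that the same key does not appear twice in one file.

```yaml
student: !!seq
  - marks
  - name
  - roll_no

# the same list written inline
student_inline: [marks, name, roll_no]
```

Both keys end up holding the same three-item list.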


Sparse Sequence

The YAML code in the image above defines a sequence under the key sparse seq:. Here’s a breakdown of the code:

  • Line 10: The comment # some oh the keys of the seq will be empty suggests that some elements in the sequence may be null or empty.

  • Line 11: The key sparse seq: indicates the start of a sequence.

  • Lines 12-15: The sequence contains three elements: ‘hey’, ‘how’, and ‘Null’. The ‘Null’ element represents a null value or an empty item in the sequence.

  • Line 18: The comment # nested sequence introduces a nested sequence.

  • Lines 19-22: The nested sequence contains three elements: ‘mango’, ‘apple’, and ‘banana’, which are presumably items within a list.

  • Line 24: The line - marks indicates another list item, but it’s not clear if it’s related to the previous list or if it’s starting a new one.

  • Line 25: The line - roll_no suggests another list item without further context.

This snippet shows how to define sequences in YAML, including sparse sequences that allow null values and nested sequences. The comments provide clarity on the structure and intent of the code. Remember, YAML is designed to be human-readable and easy to understand, which is demonstrated in this example.
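A rough sketch of a sparse and a nested sequence, loosely following the breakdown above. The exact layout of the original code is not known, and the key nested seq is an assumption added so that both examples fit in one file.

```yaml
# some of the items of the seq will be empty
sparse seq:
  - hey
  - how
  -          # empty item, read as null
  - Null     # also read as null

# nested sequence
nested seq:
  -
    - mango
    - apple
    - banana
  -
    - marks
    - roll_no
```

Here nested seq is a list whose two items are themselves lists.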


Maps

The YAML code defines a map using the !!map tag. Basically, a set of key-value pairs is called a map.

  • Lines 33-35: Show the key role: and, inside it, two more keys, age: and job:

  • This is called nested mapping.

  • Lines 45-46: Show the key job: used twice. A plain map requires unique keys, so duplicates like these are handled by !!pairs, covered in the next section.
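The nested mapping described above might look like the sketch below. The values 56 and developer and the key role_inline are assumptions for illustration.

```yaml
role: !!map
  age: 56
  job: developer

# the same map written inline
role_inline: {age: 56, job: developer}
```

As with sequences, the !!map tag is optional because the structure already tells the parser that this is a map.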


Pairs

The YAML code uses the !!pairs tag. Basically, with !!pairs the same key can appear more than once.

Lines 45-46: Show the key job: used twice, each time with its own value. The result is an array of hash tables (the data is stored in array format, where each entry has its own unique index).
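A minimal sketch of !!pairs with a repeated key. The key pair example and the values student and teacher are assumptions, and since !!pairs comes from the YAML 1.1 type set, not every parser supports it.

```yaml
pair example: !!pairs
  - job: student
  - job: teacher
```

Each list item is a single key-value pair, which is why the repeated job: key does not clash.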


Set

Here’s a breakdown of the code:

  • Line 52: The comment # lset will allow you to have unique values suggests that the list should behave like a set, where all values are unique.

  • Following Line: The key names: starts the list.
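A set could be sketched as below, assuming the !!set tag from the YAML 1.1 type set. The three names are placeholders for illustration.

```yaml
# !!set will allow you to have unique values
names: !!set
  ? Vishwajeet
  ? Rahul
  ? Ajay
```

Each ? introduces one member of the set. Parser support for !!set varies.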


OMAP

The !!omap tag specifies that the entries in the map should be kept in order. This is useful when the order of elements is important and must be preserved. Each entry in the map represents a person with their respective details. The keys Vishwajeet: and Rahul: likely serve as identifiers for each person within the map.

  • Key people: Indicates the start of the ordered map.

  • Entries:

    • First Entry: Has the key Vishwajeet: and contains details such as name: Vishwajeet Singh, age: 56, and height: 876.

    • Second Entry: Has the key Rahul: with details name: Rahul, age: 57, and height: 678.
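Putting the entries above together, the ordered map could be sketched like this. The exact layout of the original code is an assumption, and !!omap comes from the YAML 1.1 type set, so parser support varies.

```yaml
people: !!omap
  - Vishwajeet:
      name: Vishwajeet Singh
      age: 56
      height: 876
  - Rahul:
      name: Rahul
      age: 57
      height: 678
```

An ordered map is written as a list of single-key maps, which is what lets the parser keep the entries in order.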


Anchors and Aliases

Anchors and aliases are used to avoid repetition. Here’s a breakdown of the code:

  • Anchor &likes: Defines a reusable anchor for the properties of likings:.

  • Properties: Includes name: Vishwajeet Singh, fav fruit: mango, and dislikes: grapes.

  • Alias *likes: Reuses the properties of likings: for person2:.

  • Override: The fav fruit: for Ajay is overridden to berries.

This technique allows for concise YAML files by reusing common properties while still permitting specific overrides. It’s useful for maintaining clean and DRY (Don’t Repeat Yourself) code in configurations.
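A sketch of how the anchor, alias and override could fit together. It assumes the override is done with the merge key (<<), a YAML 1.1 feature that many but not all parsers support, and the line name: Ajay is an assumption based on the mention of Ajay above.

```yaml
likings: &likes
  name: Vishwajeet Singh
  fav fruit: mango
  dislikes: grapes

person2:
  <<: *likes          # pull in every property from the anchor
  name: Ajay
  fav fruit: berries  # overrides mango for this person
```

Here person2 still gets dislikes: grapes from the anchor, while name and fav fruit take the locally defined values.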


YAML vs Others

Now it's time to wrap up with a quick look at how the readability of YAML, JSON and XML files differs for the same data.

  • Key School: Indicates the start of the data structure.

  • Properties:

    • name: The name of the school is set to ABC.

    • Students: A list of students is defined under this key.

  • Student Entry:

    • rno: The roll number of the student is 34.

    • name: The student’s name is Ajay.

    • marks: The student’s marks are 68.

YAML
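The original image is not reproduced here. Based on the properties listed above, the YAML version would look roughly like this sketch, where the exact indentation is an assumption.

```yaml
School:
  name: ABC
  Students:
    - rno: 34
      name: Ajay
      marks: 68
```

There are no brackets, quotes or closing tags to keep track of, only keys, values and indentation.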

JSON

XML

From these images you can observe how easy it is to read YAML files.

Thanks for reading!


Written by

Vishwajeet Singh

I am a developer from India. My profound understanding and insight into the realm of technology invariably compels me to continually broaden my knowledge base with emerging tech stacks, spearhead innovative projects, and foster collaborative relationships with like-minded individuals.