Wrangle log data in the CLI with pb

Nitish Tiwari

We released pb, a modern, fast and intuitive CLI for log management, a couple of months ago. pb is built for developers who like the comfort of their terminal. Log data is primarily textual, so it was important that pb plugs into other text manipulation tools such as grep, awk and sed through the Linux pipe (|).

In this post, we will see how pb can be used to wrangle log data in the terminal. We'll also explore the new pb tail command to see how it can be used to tail logs in the terminal.

Pre-requisites

pb is a terminal client for Parseable. So you'll need a Parseable installation up and running to use pb. You can follow the official Parseable documentation for installation instructions.

Installation

pb is available for Linux, macOS and Windows. You can download the latest release from here. Once downloaded, extract the binary, move it to a location in your $PATH and rename it to pb.

Get started

pb allows you to manage streams, users, roles, alerts and more. In this post, we'll see how to manage streams, tail logs and query logs using pb.

➜  pb help           

pb is the command line interface for Parseable

Usage:
  pb [flags]
  pb [command]

Available Commands:
  help        Help about any command
  profile     Manage different Parseable targets
  query       Run SQL query on a log stream
  role        Manage roles
  stream      Manage streams
  tail        tail a log stream
  user        Manage users
  version     Print version

Flags:
  -h, --help      help for pb
  -v, --version   Print version

Use "pb [command] --help" for more information about a command.

Set up a profile

By default, pb ships with a profile called demo. This profile points to the Parseable demo installation. We'll use this profile (i.e. the Demo server) to get started with pb right away in this blog post.

If you'd rather use your own Parseable installation, you can create a new profile using the pb profile add command. For example, to create a profile called local pointing to a Parseable installation running on http://localhost:8000, run the following command.

pb profile add local http://localhost:8000 admin admin

Stream management

pb lets you create, delete and list streams on the Parseable server. You can also fetch info about a stream. To list all streams, run the following command

➜  pb stream list  

•  frontend
•  druide2e
•  playminio
•  backend

To create a new stream, run the following command

➜  pb stream create teststream

Fetch info about a stream using the info command. For example, to fetch info about the backend stream, run the following command

➜  pb stream info backend

Info:
  Event Count:     521311
  Ingestion Size:  171 MB
  Storage Size:    23 MB
  Compression Ratio:    86.58%

No retention period set on stream

No alerts set on stream
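
The compression ratio in the output above looks like the share of ingested bytes saved by storage. Here is a rough check using the rounded sizes shown, assuming the ratio is computed as 1 - storage / ingestion. That formula is an assumption and is not confirmed by pb itself; the function name compression_ratio is made up for this sketch.

```shell
# Rounded sizes copied from the `pb stream info backend` output above.
ingestion_mb=171
storage_mb=23

# Assumed formula (not confirmed by pb): ratio = (1 - storage / ingestion) * 100
compression_ratio() {
  awk -v i="$1" -v s="$2" 'BEGIN { printf "%.2f\n", (1 - s / i) * 100 }'
}

compression_ratio "$ingestion_mb" "$storage_mb"
```

This prints 86.55, close to the 86.58% that pb reports. The small gap would be consistent with pb working from exact byte counts rather than the rounded MB values.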

Tail logs

pb can tail logs in real time. The Parseable server exposes an Arrow Flight based streaming endpoint that pb hooks into to fetch logs as they arrive. These logs are still stored in Parseable and remain queryable later; live tail simply lets you see them as they hit the Parseable server.

The tail command takes only the stream name, and returns the logs in real-time. Run the following command to tail logs from a given stream

pb tail frontend | jq .

To filter logs with jq, you can use the approach below

pb tail frontend | jq '. | select(.method == "PATCH")'

You can also add additional filters to the tail response with grep. For example

pb tail frontend | grep "POST" | jq .

To stop tailing logs, press Ctrl + C.
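
One caveat with the grep filter above: grep "POST" matches the string anywhere in a record, not only in the method field. Here is a minimal sketch of a stricter filter. It assumes each tailed record arrives as a single-line JSON object with a "method" key, as the jq filter above suggests. The sample records and the function names sample_tail and filter_method are made up for illustration and stand in for the output of pb tail frontend.

```shell
# Hypothetical sample records standing in for the output of `pb tail frontend`.
# Assumption: one JSON object per line, with a "method" field.
sample_tail() {
  printf '%s\n' \
    '{"host":"10.0.0.1","method":"POST","status":200,"path":"/api/cart"}' \
    '{"host":"10.0.0.2","method":"GET","status":200,"path":"/docs/POST-guide"}' \
    '{"host":"10.0.0.3","method":"POST","status":500,"path":"/api/pay"}'
}

# Keep only records whose "method" field is exactly the given value.
# Optional whitespace after the colon is tolerated.
filter_method() {
  grep -E "\"method\":[[:space:]]*\"$1\""
}

sample_tail | filter_method POST
```

Against a real stream, the same function could replace the plain grep, for example pb tail frontend | filter_method POST | jq . (provided the records really are one per line).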

Query logs

Above, we saw how to tail logs from a given stream. Let's now see how to query logs using pb, starting with a simple query. Run the following command to query all logs from the last 1 minute.

pb query "select * from frontend" --from=1m --to=now | jq . | less

The pb query command takes a SQL query as input and returns the results in JSON format. The --from and --to flags are used to specify the time range for the query. The jq command is used to format the JSON output. Finally, the less command is used to paginate the output.

Next, let's try a filter in the SQL query. Run the following command to query all logs from the last 1 minute, where the status field is 500.

pb query "select host, id, method, status from frontend where status = 500" --from=1m --to=now | jq . | less

You can also redirect the output to a file. Run the following command to query all logs from the last 1 minute where the status field is 500, and write the output to a file called 500.json.

pb query "select host, id, method, status from frontend where status = 500" --from=1m --to=now | jq . > 500.json

You can also add additional filters to the query response with grep. For example

pb query "select host, id, method, status from frontend where status = 500" --from=1m --to=now | grep "POST" | jq . | less
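
The same pipe works with sed and awk. Here is a small sketch that counts records per HTTP method. It assumes the query output is one JSON object per line with a "method" field. The sample records and the function names sample_query and count_by_method are hypothetical and stand in for real pb query output.

```shell
# Hypothetical records standing in for the output of
# `pb query "select host, id, method, status from frontend where status = 500" --from=1m --to=now`.
# Assumption: one JSON object per line.
sample_query() {
  printf '%s\n' \
    '{"host":"10.0.0.1","id":"a1","method":"POST","status":500}' \
    '{"host":"10.0.0.2","id":"b2","method":"GET","status":500}' \
    '{"host":"10.0.0.1","id":"c3","method":"POST","status":500}'
}

# Extract the "method" value with sed, count duplicates with sort | uniq -c,
# then use awk to print "method<TAB>count".
count_by_method() {
  sed -n 's/.*"method":[[:space:]]*"\([^"]*\)".*/\1/p' \
    | sort \
    | uniq -c \
    | awk '{ print $2 "\t" $1 }'
}

sample_query | count_by_method
```

With a live server, the function could sit at the end of the query pipeline in place of grep, as long as the output really is one record per line.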

Conclusion

In this post, we saw how to wrangle log data in the terminal with pb, including how to tail logs in real time and how to query them. pb is a modern, fast and intuitive CLI for log management, built for developers who like the comfort of their terminal. It plugs into other text manipulation tools such as grep, awk and sed through the Linux pipe (|).
