How to investigate a Sentry alert with Doctor Droid
Sentry
Sentry is one of the best tools in the industry right now for error and exception tracking. It has high-quality SDKs across the stack, great integrations, and a powerful dashboard for learning about an exception in the code. You can read more about Sentry here.
Debugging an exception in Sentry
An exception can arise for many different reasons. It could be a code change at the place where the exception was raised, a code change upstream, bad user input, or just an edge case that hadn't appeared until now. With canary deployments it gets even trickier, as different containers or users could be on different versions of the code.
In production, investigating a simple-looking Sentry issue can span multiple data sources and contexts:
Your infrastructure & deployment resources like Kubernetes
Your code repository to check the code for recent changes or even analysing the flow of data
Your database/logs to check for user entered data
Discussion with internal team members regarding expected behaviour
Using Doctor Droid to debug the issue
Initiating an investigation
You can start an investigation of an alert directly from the home page, which lists all the recent alerts.
Once an investigation is created, this is what it looks like.
Here are the key elements of the investigation panel:
The alerts that it's investigating
The recommended investigation strategy and preliminary data for your evaluation
Additional panels for related investigations or alerts
Investigation Strategy
So what has it fetched so far? Depending on the alert context, the platform recommends different steps.
Here, it identified that the first thing to check is the stack trace itself in Sentry, so it fetches the stack trace from Sentry, including the culprit.
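To make this step concrete, here is a minimal sketch of pulling the culprit and the application's own frames out of a Sentry event payload. It is not how Doctor Droid does it internally. It only assumes the standard Sentry event JSON layout (`exception.values[].stacktrace.frames[]`), and the function name `summarize_event` and the sample event are made up for illustration.

```python
from __future__ import annotations

from typing import Any


def summarize_event(event: dict[str, Any]) -> dict[str, Any]:
    """Return the culprit and the in-app stack frames of a Sentry event."""
    frames = []
    for exc in event.get("exception", {}).get("values", []):
        stacktrace = exc.get("stacktrace") or {}
        for frame in stacktrace.get("frames", []):
            # Skip library frames; in_app marks frames from your own code.
            if frame.get("in_app"):
                frames.append(
                    {
                        "file": frame.get("filename"),
                        "function": frame.get("function"),
                        "line": frame.get("lineno"),
                    }
                )
    # Sentry orders frames from caller to callee, so reverse the list
    # to put the frame that raised the exception first.
    frames.reverse()
    return {"culprit": event.get("culprit"), "frames": frames}


# Illustrative payload (hypothetical values, not from a real project).
sample_event = {
    "culprit": "app.views.checkout in apply_discount",
    "exception": {
        "values": [
            {
                "type": "KeyError",
                "value": "'coupon'",
                "stacktrace": {
                    "frames": [
                        {"filename": "django/core/handlers/base.py",
                         "function": "get_response", "lineno": 181, "in_app": False},
                        {"filename": "app/views/checkout.py",
                         "function": "checkout", "lineno": 42, "in_app": True},
                        {"filename": "app/views/checkout.py",
                         "function": "apply_discount", "lineno": 87, "in_app": True},
                    ]
                },
            }
        ]
    },
}

summary = summarize_event(sample_event)
print(summary["culprit"])
for frame in summary["frames"]:
    print(f'{frame["file"]}:{frame["line"]} in {frame["function"]}')
```

The file paths that come out of this summary are what the next step uses to look for recent code changes.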
Once it has the stack trace, it looks in your GitHub repository for recent code changes touching the code in that stack trace. It shows you the recent commits and their URLs, so you can check whether the issue was potentially introduced by a change made in the last couple of days.
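If you wanted to run a similar lookup by hand, one option is GitHub's REST "list commits" endpoint, which accepts `path`, `since`, and `until` query parameters. The sketch below only builds the request URL and makes no network call. The helper name `commits_url`, the repository `example-org/example-repo`, and the two-day lookback are assumptions for illustration, not something Doctor Droid exposes.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def commits_url(owner: str, repo: str, path: str,
                alert_time: datetime, lookback_days: int = 2) -> str:
    """Build a GitHub 'list commits' URL for one file and a time window."""
    # Normalise to UTC so the literal "Z" suffix in ISO_FORMAT is correct.
    until = alert_time.astimezone(timezone.utc)
    since = until - timedelta(days=lookback_days)
    query = urlencode(
        {
            "path": path,  # only commits touching this file
            "since": since.strftime(ISO_FORMAT),
            "until": until.strftime(ISO_FORMAT),
        }
    )
    return f"https://api.github.com/repos/{owner}/{repo}/commits?{query}"


# Hypothetical alert time and file path taken from a stack trace.
alert_time = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
url = commits_url("example-org", "example-repo", "app/views/checkout.py", alert_time)
print(url)
```

Fetching that URL with an authenticated HTTP client would return the commits that touched the file in the window, each with its own URL to review.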
You can then check whether any recent deployment in your Kubernetes infrastructure correlates with the alert. Given that this alert relates to the prototype instance, we can see a couple of releases for prototype in the last hour, so one of them could potentially be the reason this alert came up. And now you have all the data in one place.
You can also chat with it and ask for more data.
What's also good is that it gives you references to existing playbooks, dashboards, or any other data points your system already has. We have integrations with almost every tool that your monitoring and observability stack is likely to include.
We also have options for you to self-host these integrations so that the data remains within your own plane.
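The correlation described here boils down to a time-window filter: keep the rollouts of the affected workload that happened shortly before the alert. Below is a minimal sketch under stated assumptions. The `Rollout` record, the function `rollouts_before_alert`, and the sample data are hypothetical, and in practice the records would come from your cluster's deployment history.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Rollout:
    workload: str          # e.g. the deployment name
    revision: int
    deployed_at: datetime


def rollouts_before_alert(rollouts: list[Rollout], workload: str,
                          alert_time: datetime,
                          window: timedelta = timedelta(hours=1)) -> list[Rollout]:
    """Return rollouts of `workload` within `window` before the alert, newest first."""
    start = alert_time - window
    matches = [
        r for r in rollouts
        if r.workload == workload and start <= r.deployed_at <= alert_time
    ]
    return sorted(matches, key=lambda r: r.deployed_at, reverse=True)


utc = timezone.utc
alert_time = datetime(2024, 1, 10, 12, 0, tzinfo=utc)

# Hypothetical deployment history.
history = [
    Rollout("prototype", 40, datetime(2024, 1, 10, 9, 0, tzinfo=utc)),
    Rollout("prototype", 41, datetime(2024, 1, 10, 11, 20, tzinfo=utc)),
    Rollout("prototype", 42, datetime(2024, 1, 10, 11, 45, tzinfo=utc)),
    Rollout("billing", 7, datetime(2024, 1, 10, 11, 50, tzinfo=utc)),
]

recent = rollouts_before_alert(history, "prototype", alert_time)
for r in recent:
    print(r.workload, r.revision, r.deployed_at.isoformat())
```

A rollout that lands inside the window is only a candidate cause. It still needs to be checked against the stack trace and the commits it shipped.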
Try it today
If this looked exciting to you, we have a lot more demos coming up, such as how to auto-investigate an API latency alert or how to investigate CPU utilisation alerts on your databases.
Visit www.drdroid.io and try it out on your own stack. We provide a free trial of the tool. If you have any questions, please reach out to us, and we'll be happy to answer.
Written by
Siddarth Jain
Hey! I'm currently building something cool for engineers at Dr Droid. Love discussing developer tools with anyone interested. I prefer to work remotely and spend time in nature.