Introducing HashBlog - Automatically generating a blog on every commit
What is HashBlog?
To put it simply, HashBlog is a GitHub action that automatically generates a blog post describing the changes you made in your commits.
You can test out the action by adding a workflow with the following code to your repository. Additionally, you need to add your Hashnode access token to your repository's action secrets.
```yaml
on: [push]
jobs:
  blog-generation-job:
    runs-on: ubuntu-latest
    name: Automatic hashnode blog generation
    steps:
      - name: HashBlog
        id: hashblog
        uses: srajansohani/blog-the-change@v1.1
        with:
          blog-domain: # your sub-domain, e.g. "aquiem.hashnode.dev"
        env:
          HASHNODE_ACCESS_TOKEN: ${{ secrets.HASHNODE_ACCESS_TOKEN }}
```
Adding customisation
Customising tags
```yaml
on: [push]
jobs:
  blog-generation-job:
    runs-on: ubuntu-latest
    name: Automatic hashnode blog generation
    steps:
      - name: HashBlog
        id: hashblog
        uses: srajansohani/blog-the-change@v1.1
        with:
          blog-domain: # your sub-domain, e.g. "aquiem.hashnode.dev"
          add_tags: true # automatically add tags based on the files changed
          tags: "[javascript, ts]" # make sure to use the tag slugs, not the tag names
        env:
          HASHNODE_ACCESS_TOKEN: ${{ secrets.HASHNODE_ACCESS_TOKEN }}
```
Adding custom cover image
```yaml
on: [push]
jobs:
  blog-generation-job:
    runs-on: ubuntu-latest
    name: Automatic hashnode blog generation
    steps:
      - name: HashBlog
        id: hashblog
        uses: srajansohani/blog-the-change@v1.1
        with:
          blog-domain: # your sub-domain, e.g. "aquiem.hashnode.dev"
          cover-image-url: # your cover image URL
        env:
          HASHNODE_ACCESS_TOKEN: ${{ secrets.HASHNODE_ACCESS_TOKEN }}
```
Adding blog to a series
```yaml
on: [push]
jobs:
  blog-generation-job:
    runs-on: ubuntu-latest
    name: Automatic hashnode blog generation
    steps:
      - name: HashBlog
        id: hashblog
        uses: srajansohani/blog-the-change@v1.1
        with:
          blog-domain: # your sub-domain, e.g. "aquiem.hashnode.dev"
          series-slug: # your series slug (make sure to create the series before adding the slug)
        env:
          HASHNODE_ACCESS_TOKEN: ${{ secrets.HASHNODE_ACCESS_TOKEN }}
```
You can view the action on the marketplace here
How we made HashBlog
The original idea was to have a GitHub action that would generate the parts of the blog which directly reference the code, along with their explanations, after which the user could review the blog and add their personal touches. But we could not find a mutation to create a draft via the GraphQL APIs (I believe this is already posted in the feature request channel, and we would love to see an API for it).
So instead, we moved to directly publishing the blog on the user's sub-domain. This caused a bit of a problem, as we could no longer give the user the ability to add their personal touches (which I believe is very important for every blog). One could go and edit a published blog, but I don't think it's a good idea to put that extra hassle on the user. So, ultimately, we tried to give the user as much control as possible via additional configuration.
So, our scope broadly included the following:
- Get the information required to generate the blog, such as the code changes (i.e. the git diff of the commit) along with the issues that the commit references and resolves.
- Use Gemini's API to generate the content for the blog.
- Use Hashnode's GraphQL API to actually publish the blog.
Extracting the commit details from GitHub
Initially, we thought that this was going to be one of the biggest challenges, as we had to get the git diff of every file, consider multiple local commits, and generate the blog from the entire set of changes.
But with a bit of googling and a lot of staring at the action's payload, we found a pretty simple solution. If we can get the base commit's SHA and the head's SHA, GitHub's Octokit already has a `compareCommits()` function that returns the file diffs. With this, our code actually got a lot simpler.
```javascript
import { octokit } from "./octokit.js";

const getDiffData = async (payload) => {
  const repository = payload.repository;
  const owner = repository.owner.login;
  const repo = repository.name;
  const commitSha = payload.after;

  const commit = await octokit.repos.getCommit({
    owner,
    repo,
    ref: commitSha,
  });

  // A push to a new branch has an all-zero "before" sha,
  // so there is no base commit to compare against.
  const parentCommitSha = payload.before;
  if (payload.before === "0000000000000000000000000000000000000000") {
    return commit.data.files;
  }

  const commitDiff = await octokit.repos.compareCommits({
    owner,
    repo,
    base: parentCommitSha,
    head: commitSha,
  });
  return commitDiff.data.files;
};

export { getDiffData };
```
Similarly, we found that Octokit also has functions to get the details of the issues referenced in the commit, which we got from the action payload.
```javascript
import { octokit } from "./octokit.js";

// Extract issue numbers such as "#123" from the commit message
const extractNumbers = (inputString) => {
  const pattern = /#(\d+)/g;
  const matches = inputString.match(pattern);
  if (matches) {
    return matches.map((match) => parseInt(match.slice(1), 10));
  }
  return [];
};

const getIssues = async (payload) => {
  const commitMessage = payload.head_commit.message;
  const issueNumbers = extractNumbers(commitMessage);
  const repository = payload.repository;
  const owner = repository.owner.login;
  const issues = [];
  for (let i = 0; i < issueNumbers.length; i++) {
    const issue = await octokit.rest.issues.get({
      owner: owner,
      repo: repository.name,
      issue_number: issueNumbers[i],
    });
    issues.push(issue.data);
  }
  return issues;
};

export { getIssues };
```
Getting the actual blog content using Gemini's APIs
This one was the total opposite of the previous step. We thought that this would be the easiest one to implement, and indeed the implementation was pretty easy, but actually getting the model to generate good summaries was a totally different thing.
As this was our first time using generative AI in a project, we did not know just how much an LLM could hallucinate, and because of that we spent the majority of our time getting this part right. Even after that, the model still hallucinates. We even considered fine-tuning a separate model to generate code summaries, but that idea was dropped, as we could not find a good dataset, and even if we fine-tuned a free model, it would be difficult to have it generate better content than these LLMs.
Finally, we decided to have four sections in our blog (overview, file-wise changes, issues resolved, and summary) and make a separate API call to generate the content for each section, then combine them.
Also, the reason for choosing Gemini was simply that it was free (although the rate limit is 60 QPM), but we are still looking into possibly fine-tuning a public model.
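The per-section approach can be sketched roughly like this (a simplified illustration, not the exact HashBlog code: the section names come from the list above, the prompts are placeholders, and the request/response shapes follow Google's public Generative Language REST API):

```javascript
// The four blog sections described above; each gets its own API call.
const SECTIONS = ["Overview", "File-wise changes", "Issues resolved", "Summary"];

// Build a focused prompt for a single section (simplified placeholder prompt).
const buildPrompt = (section, diff, issues) =>
  `Write the "${section}" section of a blog post about this commit.\n` +
  `Diff:\n${diff}\n\nIssues:\n${issues}`;

// Stitch the generated pieces back together into one markdown body.
const combineSections = (titles, bodies) =>
  titles.map((title, i) => `## ${title}\n\n${bodies[i]}`).join("\n\n");

// One Gemini request per section, via the public REST endpoint.
const generateSection = async (prompt) => {
  const response = await fetch(
    `https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=${process.env.GEMINI_API_KEY}`,
    {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ contents: [{ parts: [{ text: prompt }] }] }),
    }
  );
  const data = await response.json();
  return data.candidates[0].content.parts[0].text;
};

const generateBlogContent = async (diff, issues) => {
  const bodies = [];
  for (const section of SECTIONS) {
    bodies.push(await generateSection(buildPrompt(section, diff, issues)));
  }
  return combineSections(SECTIONS, bodies);
};
```

Keeping each prompt scoped to a single section makes hallucinations easier to contain: a bad generation only corrupts one section instead of the whole post.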
Creating the blog on the user's sub-domain
This was pretty straightforward, thanks to Hashnode's amazing docs and the GraphQL playground.
First of all, we get the publication ID of the sub-domain.
```javascript
import * as core from "@actions/core";

const getPublicationID = async (blogDomain) => {
  try {
    const response = await fetch("https://gql.hashnode.com/", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        query: `
          query Publication {
            publication(host: "${blogDomain}") {
              id
            }
          }
        `,
      }),
    });
    const responseData = await response.json();
    return responseData.data.publication.id;
  } catch (error) {
    core.setFailed(error.message);
  }
};
```
There are a few more API calls that we needed to make in order to support user customisation, such as resolving the series ID from the provided series slug and getting the tag details for the provided tag slugs.
```javascript
const getSerieseID = async (blogDomain, seriesSlug) => {
  try {
    const response = await fetch("https://gql.hashnode.com/", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        query: `
          query Publication {
            publication(host: "${blogDomain}") {
              series(slug: "${seriesSlug}") {
                id
              }
            }
          }
        `,
      }),
    });
    const responseData = await response.json();
    return responseData.data.publication.series.id;
  } catch (error) {
    core.setFailed(error.message);
  }
};

const getTagDetails = async (tagSlug) => {
  try {
    const response = await fetch("https://gql.hashnode.com/", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        query: `
          query Tag {
            tag(slug: "${tagSlug}") {
              id
              name
              slug
            }
          }
        `,
      }),
    });
    const responseData = await response.json();
    return responseData.data.tag;
  } catch (error) {
    core.setFailed(error.message);
  }
};
```
Finally, we publish the blog using the user's Hashnode access token.
```javascript
const publishBlog = async (blogDomain, inputData, seriesSlug = undefined) => {
  try {
    const publicationID = await getPublicationID(blogDomain);
    inputData.input.publicationId = publicationID;
    if (seriesSlug) {
      const seriesID = await getSerieseID(blogDomain, seriesSlug);
      inputData.input.seriesId = seriesID;
    }
    const response = await fetch("https://gql.hashnode.com/", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: process.env.HASHNODE_ACCESS_TOKEN,
      },
      body: JSON.stringify({
        query: `
          mutation PublishPost($input: PublishPostInput!) {
            publishPost(input: $input) {
              post {
                id
                title
                content {
                  text
                }
                url
                tags {
                  name
                }
              }
            }
          }
        `,
        variables: inputData,
      }),
    });
    const responseData = await response.json();
    return responseData;
  } catch (error) {
    core.setFailed(error.message);
  }
};
```
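For reference, the `inputData` argument that `publishBlog()` receives could be assembled along these lines (a hypothetical sketch: the field names follow Hashnode's `PublishPostInput` type, but the helper, the title, and the markdown here are placeholders, not the actual HashBlog code):

```javascript
// Assemble the variables object for the PublishPost mutation.
// tagDetails are the { id, name, slug } objects returned by getTagDetails().
const buildInputData = (title, markdown, tagDetails = []) => ({
  input: {
    title,
    contentMarkdown: markdown,
    tags: tagDetails.map(({ id, name, slug }) => ({ id, name, slug })),
  },
});

// Example usage; publishBlog() then fills in publicationId
// (and seriesId, when a series slug is configured).
const inputData = buildInputData(
  "Changes in my-repo: commit abc123",
  "## Overview\n\n...",
  []
);
```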
You can view the entire code here
Conclusion
I worked on this project along with my friend Srajan, and this was actually our first hackathon, so I hope you find it interesting. There are still a lot of things we want to add, and if you have any suggestions, feel free to leave a comment on this blog or create an issue on our GitHub repository.