Improving Tag Quality in CKAN: Dynamic Modals for Missing Tag Metadata

Table of contents
- Problem
- Solution Overview
- Step 1: Fetch Missing Tag Metadata (Custom Helper Function)
- Step 2: Add a Custom API for JS to Query
- Step 3: Enrich package_show with Metadata
- Step 4: Render Modal Conditionally in Template
- Step 5: JS to Dynamically Render the Modal
- Step 6: Submit Metadata Updates to Custom Controller
- Result

CKAN supports tagging datasets, but often tags are added without proper context — no description, no source URL. This leads to inconsistent reuse and confusion across datasets.
In this implementation, I built a dynamic modal that appears only when a user has added tags missing metadata, and only for that user. The modal prompts them to define what the tag means and where it came from, improving long-term tag governance.
This post covers the problem, the solution approach, and code-level steps I followed.
Problem
One of the trickiest parts of improving tag quality in CKAN wasn't prompting users; it was detecting, at page-load time, whether any metadata was missing.
CKAN pages are rendered server-side, and while we can display pre-determined content, we can't easily decide whether to show a modal based on data only known after the page loads — such as which tags a user created and whether they’ve defined metadata for them.
The frontend needed to know:
- Who the current user is
- Which tags exist on the current dataset
- Whether any of those tags (authored by the user) are missing description or source
All of this had to happen without affecting other users and without requiring full page reloads.
That’s where the real challenge lay: enabling real-time, user-specific modal logic based on backend metadata state.
Solution Overview
I implemented a system where:
- Tags created by a user are checked for missing description/source
- A modal prompts only that user on the dataset page
- The modal allows inline definition of all missing metadata
- The backend stores these definitions via a custom endpoint
This solution uses CKAN's package_show override, a custom API action, frontend JavaScript injection, and modal control via Bootstrap.
Step 1: Fetch Missing Tag Metadata (Custom Helper Function)
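import ckan.model as model
import ckan.plugins.toolkit as toolkit
from sqlalchemy import func

# TagMetadata is this extension's own model (not part of CKAN core);
# import it from wherever your extension defines it.

def get_incomplete_tags(dataset_id, user):
    """Return the dataset's tags that `user` authored but left incomplete."""
    missing_tags = []
    if not user:
        return missing_tags
    context = {'model': model, 'session': model.Session, 'user': user}
    dataset = toolkit.get_action("package_show")(context, {"id": dataset_id})
    for tag_data in dataset.get("tags", []):
        tag_name = tag_data.get("name")
        tag = model.Session.query(model.Tag).filter(
            func.lower(model.Tag.name) == func.lower(tag_name)
        ).first()
        if tag is None:
            continue  # no matching tag row, so there is nothing to check
        tag_meta = model.Session.query(TagMetadata).filter(
            TagMetadata.tag_id == tag.id,
            TagMetadata.created_by == user
        ).first()
        # Treat NULL and empty strings alike as "missing".
        if tag_meta and not (tag_meta.description and tag_meta.source_url):
            missing_tags.append({"name": tag.name, "id": str(tag.id)})
    return missing_tags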
This returns only the tags that the current user authored and that are still missing a description or a source URL.
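To make the selection rule concrete, here is a minimal sketch of the same filter applied to plain dictionaries instead of database rows. The `rows` structure and the name `incomplete_for_user` are illustrative assumptions, not part of the extension; the point is only to show which rows end up flagged.

```python
def incomplete_for_user(rows, user):
    """Return tags authored by `user` that lack a description or source URL.

    `rows` is a hypothetical list of dicts shaped like TagMetadata records.
    """
    flagged = []
    for row in rows:
        if row.get("created_by") != user:
            continue  # someone else's tag: never prompt this user for it
        description = (row.get("description") or "").strip()
        source_url = (row.get("source_url") or "").strip()
        if not description or not source_url:
            flagged.append({"name": row["name"], "id": str(row["tag_id"])})
    return flagged


rows = [
    {"tag_id": 1, "name": "climate", "created_by": "alice",
     "description": "", "source_url": ""},
    {"tag_id": 2, "name": "census", "created_by": "alice",
     "description": "Census data", "source_url": "https://example.org"},
    {"tag_id": 3, "name": "roads", "created_by": "bob",
     "description": None, "source_url": None},
]
print(incomplete_for_user(rows, "alice"))
```

For "alice" only the "climate" tag is flagged: "census" is complete, and "roads" belongs to another user even though it is also incomplete.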
Step 2: Add a Custom API for JS to Query
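@toolkit.side_effect_free  # allows GET requests to /api/3/action/missing_tag_metadata
def missing_tag_metadata_action(context, data_dict):
    user = context.get("user")
    dataset_name = toolkit.get_or_bust(data_dict, "dataset_name")
    # CKAN wraps the return value in {"success": ..., "result": ...} itself.
    return get_incomplete_tags(dataset_name, user)

# Register the action from the plugin class, which must implement IActions:
#     def get_actions(self):
#         return {"missing_tag_metadata": missing_tag_metadata_action}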
This gives the frontend a way to determine — after page load — if the modal should be triggered.
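CKAN exposes registered actions under /api/3/action/<action_name> and wraps every result in an envelope with `success` and `result` keys. As a rough illustration of what a client of this action has to handle, the sketch below builds the request URL and unwraps a response body. The site URL, dataset name, and sample payload are made up for the example, and `action_url` and `unwrap` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode


def action_url(site_url, action, **params):
    """Build a CKAN action API URL such as .../api/3/action/<action>?key=value."""
    query = urlencode(params)
    url = f"{site_url.rstrip('/')}/api/3/action/{action}"
    return f"{url}?{query}" if query else url


def unwrap(body):
    """Return the `result` of a CKAN API response body, or raise if it failed."""
    payload = json.loads(body)
    if not payload.get("success"):
        raise RuntimeError(payload.get("error", "CKAN action failed"))
    return payload["result"]


url = action_url("https://ckan.example.org", "missing_tag_metadata",
                 dataset_name="air-quality")
sample = '{"success": true, "result": [{"name": "climate", "id": "1"}]}'
print(url)
print(unwrap(sample))
```

An empty `result` list would mean there is nothing to prompt for, so the modal can stay hidden.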
Step 3: Enrich package_show with Metadata
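import ckan.plugins.toolkit as toolkit
from ckan.logic.action.get import package_show  # the core action being wrapped

@toolkit.side_effect_free  # keep GET access, as on the core package_show
def custom_package_show(context, data_dict):
    result = package_show(context, data_dict)
    # fetch_tags_with_metadata is this extension's own helper.
    enriched_tags = fetch_tags_with_metadata(result.get("id"))
    for tag in result.get("tags", []):
        tag["description"] = next((t["description"] for t in enriched_tags if t["name"] == tag["name"]), None)
        tag["source_url"] = next((t["source_url"] for t in enriched_tags if t["name"] == tag["name"]), None)
    return result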
This ensures that every dataset page loads enriched tag data, so we can decide what’s missing.
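The enrichment scans the metadata list twice per tag. For datasets with many tags, a name-keyed dictionary does the same merge with a single lookup per tag. The following is a sketch of that variant on plain data; `enrich_tags` and the sample rows are illustrative, and it assumes tag names are unique within the metadata list.

```python
def enrich_tags(tags, metadata_rows):
    """Attach description and source_url to each tag dict, in place."""
    by_name = {row["name"]: row for row in metadata_rows}
    for tag in tags:
        meta = by_name.get(tag["name"], {})
        # Tags without a metadata row get None, matching the next(..., None) default.
        tag["description"] = meta.get("description")
        tag["source_url"] = meta.get("source_url")
    return tags


tags = [{"name": "climate"}, {"name": "roads"}]
metadata_rows = [
    {"name": "climate", "description": "Climate records",
     "source_url": "https://example.org/climate"},
]
enriched = enrich_tags(tags, metadata_rows)
print(enriched)
```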
Step 4: Render Modal Conditionally in Template
{% if tags_missing_metadata and tags_missing_metadata|length > 0 %}
<script>
  // tojson already emits a script-safe JavaScript literal. Wrapping it in
  // quotes for JSON.parse breaks on names containing quotes or backslashes.
  const missingTags = {{ tags_missing_metadata|tojson }};
  document.addEventListener("DOMContentLoaded", function () {
    showMetadataModal(missingTags);
  });
</script>
{% endif %}
If any missing tags exist for the current user, we inject a script to trigger the modal after load.
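Embedding JSON inside a script tag is only safe if characters such as <, >, & and ' are escaped, which is what Jinja's tojson filter does. The sketch below imitates that escaping with the standard library to show the idea; `script_safe_json` is an illustrative name, and this is not Jinja's actual implementation.

```python
import json


def script_safe_json(value):
    """Serialize `value` to JSON that can sit inside a <script> block."""
    return (
        json.dumps(value)
        .replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
        .replace("'", "\\u0027")
    )


tags = [{"name": "</script><b>it's", "id": "7"}]
encoded = script_safe_json(tags)
print(encoded)

# The \uXXXX escapes are valid JSON, so the data survives a round trip.
decoded = json.loads(encoded)
```

Because the escaped text never contains a literal `</script>`, a hostile tag name cannot close the script element early.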
Step 5: JS to Dynamically Render the Modal
function showMetadataModal(tags) {
  const tableBody = document.querySelector("#metadataTable tbody");
  tableBody.innerHTML = "";
  tags.forEach(tag => {
    const row = document.createElement("tr");
    // Static markup only. Tag values are assigned through .value below,
    // so a tag name is never parsed as HTML.
    row.innerHTML = `<td>
        <input type="hidden" name="tag_ids[]">
        <input class="form-control" disabled>
      </td>
      <td><input class="form-control tag-desc" name="descriptions[]" required></td>
      <td><input class="form-control tag-source" name="sources[]" required></td>`;
    row.querySelector('input[type="hidden"]').value = tag.id;
    row.querySelector("input[disabled]").value = tag.name;
    tableBody.appendChild(row);
  });
  // A static backdrop with keyboard disabled keeps the modal open until submit.
  new bootstrap.Modal(document.getElementById("metadataModal"), {
    backdrop: "static", keyboard: false
  }).show();
}
This renders all missing tags into a Bootstrap modal, ready for inline definition.
Step 6: Submit Metadata Updates to Custom Controller
JavaScript Submit Handler:
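fetch("/custom/tags/metadata/update", {
  method: "POST",
  body: new FormData(document.getElementById("metadataForm"))
})
  .then(response => {
    // fetch only rejects on network errors, so check the HTTP status explicitly.
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    return response.json();
  })
  .then(() => window.location.href = "/dataset")
  .catch(() => alert("Failed to update metadata"));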
Python Route to Handle the Post:
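from flask import Blueprint, jsonify, request
import ckan.lib.helpers as h
import ckan.plugins.toolkit as toolkit

blueprint = Blueprint('tag_metadata', __name__)  # blueprint name is illustrative

@blueprint.route('/custom/tags/metadata/update', methods=['POST'])
def update_tag_metadata():
    # Assumes CKAN 2.9+, where toolkit.g.user holds the logged-in username.
    if not toolkit.g.user:
        toolkit.abort(403, 'You must be logged in to update tag metadata')
    tag_ids = request.form.getlist('tag_ids[]')
    descriptions = request.form.getlist('descriptions[]')
    sources = request.form.getlist('sources[]')
    for tag_id, desc, src in zip(tag_ids, descriptions, sources):
        # Skip rows that are still blank after trimming whitespace.
        if tag_id and desc.strip() and src.strip():
            # update_tag_metadata_in_db is this extension's own helper.
            update_tag_metadata_in_db({
                'id': tag_id,
                'description': desc.strip(),
                'source_url': src.strip()
            })
    h.flash_success('Tag metadata updated successfully')
    return jsonify({'success': True})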
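One caveat with pairing the three form lists through zip: it stops at the shortest list, so a malformed submission silently drops rows. A stricter variant might validate the lengths first and report which rows were rejected. The sketch below shows that idea on plain lists; `pair_submissions` is a hypothetical helper, not part of the extension.

```python
def pair_submissions(tag_ids, descriptions, sources):
    """Split parallel form lists into complete rows and the ids of rejected rows."""
    if not (len(tag_ids) == len(descriptions) == len(sources)):
        raise ValueError("form lists have different lengths")
    complete, rejected = [], []
    for tag_id, desc, src in zip(tag_ids, descriptions, sources):
        desc, src = desc.strip(), src.strip()
        if tag_id and desc and src:
            complete.append({"id": tag_id, "description": desc, "source_url": src})
        else:
            rejected.append(tag_id)  # blank description or source after trimming
    return complete, rejected


complete, rejected = pair_submissions(
    ["1", "2"],
    ["Climate records", "   "],
    ["https://example.org/climate", "https://example.org/roads"],
)
print(complete)
print(rejected)
```

Returning the rejected ids gives the route something specific to send back, so the modal could highlight the rows that still need input instead of reporting blanket success.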
Result
✅ Modal appears only when the current user’s tags are incomplete
✅ Tag metadata can be updated without navigating away
✅ Users are nudged toward better data hygiene, with zero disruption to viewers
✅ Easily extendable for additional metadata fields
This pattern — fetch on load → check for gaps → trigger modal — can be reused across CKAN wherever metadata enforcement is needed.
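As a sketch of how the "check for gaps" step could extend to more fields, the required fields can live in one tuple that both the check and the modal read from. Everything here is illustrative: `REQUIRED_FIELDS`, the function names, and the extra "license" field are assumptions, not part of the implementation described above.

```python
REQUIRED_FIELDS = ("description", "source_url")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the names of required fields that are absent or blank in `record`."""
    return [
        field for field in required
        if not str(record.get(field) or "").strip()
    ]


def records_needing_prompt(records, required=REQUIRED_FIELDS):
    """Map each record's name to its missing fields, skipping complete records."""
    gaps = {}
    for record in records:
        missing = missing_fields(record, required)
        if missing:
            gaps[record["name"]] = missing
    return gaps


records = [
    {"name": "climate", "description": "Climate records", "source_url": ""},
    {"name": "census", "description": "Census data",
     "source_url": "https://example.org/census"},
]
print(records_needing_prompt(records))
# Enforcing a hypothetical extra field only means passing a longer tuple.
print(records_needing_prompt(records, REQUIRED_FIELDS + ("license",)))
```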
Written by Shanit Paul
