Optimizing SwitchMap-NG: Efficient Device Metrics Collection and Front

Modern systems need to do more than just display data — they need to collect it reliably, store it efficiently, and expose it cleanly for frontend components. In my recent work on SwitchMap-NG, a network monitoring system, I focused on strengthening these foundations by extending the backend to capture detailed performance metrics while optimizing the frontend to keep the interface responsive and performant. The effort centered on two main areas: building robust metrics collection pipelines and applying application-level optimizations for smoother user experience.

Collecting and Storing Metrics

Monitoring systems rely on a reliable pipeline: data collection → storage → API exposure → client consumption. I extended this pipeline to handle system metrics more effectively.

1. Extending Metric Queries

Implemented new query functions for Cisco, Juniper, and generic devices to capture CPU utilization, memory usage, and uptime.

Designed the methods to return a consistent, nested dictionary format, simplifying downstream processing.

Example for cisco devices:

  def system(self, oidonly=False):
      """Return CPU and memory stats for Cisco devices."""
      from collections import defaultdict

      final = defaultdict(lambda: defaultdict(dict))

      # CPU OID
      cpu_total_oid = ".1.3.6.1.4.1.9.9.109.1.1.1.1.10"
      cpu_values = self.snmp_object.swalk(cpu_total_oid) or {}
      final["cpu"]["total"] = {"value": sum(int(v) for v in cpu_values.values() if v is not None)}

      # Memory OIDs
      used_oid = ".1.3.6.1.4.1.9.9.48.1.1.1.5"
      free_oid = ".1.3.6.1.4.1.9.9.48.1.1.1.6"
      used_values = self.snmp_object.swalk(used_oid) or {}
      free_values = self.snmp_object.swalk(free_oid) or {}
      final["memory"]["used"] = {"value": sum(int(v) for v in used_values.values() if v is not None)}
      final["memory"]["free"] = {"value": sum(int(v) for v in free_values.values() if v is not None)}

      return final

This ensures that all metrics, regardless of vendor, are uniformly structured, making them easy to store and query.

2. Database Modeling (MySQL)

To persist historical metrics, I designed a dedicated MySQL table optimized for append-only logging:

class DeviceMetricsHistory(BASE):
    __tablename__ = "smap_device_metrics_history"
    __table_args__ = {"mysql_engine": "InnoDB"}

    id = Column(BIGINT(unsigned=True), primary_key=True, autoincrement=True)
    hostname = Column(VARBINARY(256), nullable=False, index=True)
    timestamp = Column(DateTime, nullable=False, default=datetime.datetime.utcnow, index=True)
    uptime = Column(BIGINT(unsigned=True), nullable=True)
    cpu_utilization = Column(Float, nullable=True)
    memory_utilization = Column(Float, nullable=True)

Highlights:

Append-only inserts: Each poll creates a new row with a timestamp to maintain a complete history.
Indexed for performance: Queries on hostname and timestamp remain fast even as data grows.
ORM-friendly: Supports batch inserts to reduce transaction overhead during polling.

3. Exposing Metrics via GraphQL

To make device metrics accessible for the frontend efficiently, I created a GraphQL node, connection, and resolver:

class DeviceMetrics(SQLAlchemyObjectType, DeviceMetricsAttribute):
    """Device metrics node with decoded hostname."""

    class Meta:
        model = DeviceMetricsModel
        interfaces = (graphene.relay.Node,)

class Query(graphene.ObjectType):
    """Define GraphQL queries."""

    node = relay.Node.Field()

    deviceMetrics = BatchSQLAlchemyConnectionField(
        DeviceMetrics.connection, hostname=String()
    )

    def resolve_deviceMetrics(self, info, hostname=None, **kwargs):
        query = DeviceMetrics.get_query(info)
        if hostname:
            query = query.filter(
                DeviceMetricsModel.hostname == hostname.encode()
            )
        return query

Benefits:

Single query endpoint: Fetch metrics for all devices or a specific device.
Handles VARBINARY safely: Ensures hostname filtering works correctly in MySQL.
Efficient for large datasets: Supports batch fetching with BatchSQLAlchemyConnectionField.
Clean API for frontend: Reduces redundant endpoints and payload size, simplifying integration with charts and dashboards.

This work completes the pipeline from backend metric collection to frontend visualization, complementing the metric queries, GraphQL schema, and MySQL model.

Device Metrics in the UI

With the backend in place, the collected device metrics are visualized in SwitchMap-NG:

Device Overview: Shows uptime, CPU, and memory usage in responsive charts.
History Page: Step line charts track trends over time, with time-range filters and autocomplete search.
UX Optimizations: Memoization and responsive layouts keep charts smooth and aligned across different screen sizes.

Frontend Performance Optimizations

Collecting metrics is only half the story — making them efficient to query and display is equally important.

Reducing Redundant Work

Used memoization to prevent re-rendering of static components when only chart data changes.
Limited query load by fetching metrics per device rather than all devices at once.

Handling Large Datasets

Restricted custom date ranges to 180 days to prevent overly heavy queries.
Considering next steps: horizontal scrolling, aggregating data points, or enabling zoom in charts.

Improving Query Efficiency

Resolved mismatches between text and VARBINARY formats in database columns.
Reduced response size by returning only the fields needed by the UI.

Next Steps

Introduce and fully test the frontend, validating charts, tables, and interactions.
Explore simulated environments to test backend performance at scale.

Takeaways

The key lesson: performance and query efficiency need to be considered early. By cleaning up the schema and structuring the database and ORM models efficiently, the system is better prepared to handle growth without slowing down.

Next, I’ll focus on dataset scaling strategies and smarter testing approaches to maintain strong performance as usage increases.

Optimizing Metrics Pipelines: From Database Design to Frontend Performance