Welcome to My Technical Journal — Systems & Network Architecture by Q

Quang Nguyen
7 min read

Hey there — I’m Q (Quang Nguyen), a Systems Engineer (and sometimes Network Engineer, depending on the day) who enjoys building systems that don’t fall apart the second you look away. I started this blog not just to show what I’ve done, but to talk through the why, the how, and the “oh wow, that actually worked” moments that come with working in infrastructure.

I’ve been officially working as a Systems Engineer for almost a year now — still early in the game, but already thrown into the deep end in the best way. From designing Proxmox clusters and Ceph storage pools to rethinking how our networks handle failover and segmentation, I’ve had a front-row seat to what makes systems reliable. And breakable. And fixable. All of it.

Oh — and I just wrapped up my Master’s in Information Technology in December 2024 (yep, still feels surreal saying that). A year before that, I finished my Bachelor’s. So yeah — a lot of learning in a short time, and I don’t plan on stopping.

This blog is part diary, part documentation, part “wait, let me try to explain this better.” Whether you’re an engineer, a student, or just curious what Ceph or SD-WAN or failover actually looks like when someone’s really doing it — welcome. I’m figuring it out as I go, and I’m bringing you with me.


What I’ve Done So Far

While I don’t have decades of experience, I’ve executed projects most engineers don’t touch until years into their careers. Here's a glimpse into what I’ve built — and what I’m actively planning:

  1. 🖥️ System Design & Virtualization

  • Built and deployed a 3-node Proxmox HA cluster integrated with Ceph for distributed, fault-tolerant storage.

  • Designed a Proxmox Backup Server (PBS) + TrueNAS backup system based on the 3-2-1 strategy.

  • Implemented versioned backups, deduplication, and instance-level restores across the cluster.
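For reference, the broad strokes of standing up a cluster like this on Proxmox look roughly like the snippet below. This is a hedged sketch rather than my exact build — the cluster name, device path, and pool name are placeholders, and the real deployment involved a lot more tuning:

```
# On the first node: create the cluster
pvecm create pve-cluster

# On each additional node: join it
pvecm add <ip-of-first-node>

# On every node: install Ceph and create an OSD on a dedicated disk
pveceph install
pveceph osd create /dev/sdb

# Create a replicated pool (3 copies, stays writable with 2)
pveceph pool create vm-pool --size 3 --min_size 2
```

With `size 3` and `min_size 2`, every object lives on all three nodes and the pool keeps serving I/O if one node drops out — which is the whole point of pairing a 3-node HA cluster with Ceph.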


  2. 🌐 Network Architecture & Security

  • Migrated from Cisco ASR 1002X to SonicWall NSa 2700 in HA mode with dual-ISP failover and SD-WAN-like behavior

  • Architected VLAN segmentation and ACLs across internal Cisco Catalyst switches

  • Integrated internal Windows DNS with an external GoDaddy DNS zone to securely publish internal web services. (LOL, I shouldn’t have done this in the beginning, since GoDaddy doesn’t fully support auto-failover; my bad on this one, lesson learned :D).
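For the VLAN segmentation piece, the Catalyst side of a design like this boils down to something along these lines. It’s a hedged, generic IOS sketch — VLAN IDs, names, and subnets are made up for illustration, not pulled from my environment:

```
! Define the VLANs
vlan 10
 name SERVERS
vlan 20
 name USERS

! Trunk toward the firewall, carrying both VLANs
interface GigabitEthernet1/0/1
 switchport mode trunk
 switchport trunk allowed vlan 10,20

! Example ACL: users may reach server web services, nothing else
ip access-list extended USERS-TO-SERVERS
 permit tcp 192.168.20.0 0.0.0.255 192.168.10.0 0.0.0.255 eq 443
 deny   ip  192.168.20.0 0.0.0.255 192.168.10.0 0.0.0.255
 permit ip any any
```

The interesting design work isn’t the syntax — it’s deciding which segments exist and which flows between them are actually justified.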

🛠️ In Progress: Remote Branch Expansion & SD-WAN Strategy

Currently in the planning and evaluation phase of expanding secure connectivity to remote branches, including:

  • Evaluating options between GRE over IPsec vs. VTI over IPsec for site-to-site tunnel configuration (this is what I have in mind right now; if I find a better option for this in the future, I’ll make sure to change it - 08/03/2025)

  • Designing SD-WAN logic for intelligent route failover and link optimization

  • Selecting optimal tunnel types based on overhead, route control, and application performance

  • Planning centralized NAT and firewall control via SonicWall while ensuring scalable branch growth
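To make the GRE-over-IPsec vs. VTI comparison concrete, here is roughly what the two look like in Cisco-style config. This is a hedged sketch with hypothetical addresses and names, not a production config — and the real decision also depends on what the SonicWall side supports:

```
! Shared pieces: an IPsec profile protecting the tunnel
crypto ipsec transform-set ESP-AES-SHA esp-aes 256 esp-sha-hmac
crypto ipsec profile BRANCH-PROF
 set transform-set ESP-AES-SHA

! GRE over IPsec (the default tunnel mode): a small extra GRE
! header, but it carries multicast, so routing protocols like
! OSPF can run across the tunnel
interface Tunnel0
 ip address 10.255.0.1 255.255.255.252
 tunnel source GigabitEthernet0/0
 tunnel destination 203.0.113.2
 tunnel protection ipsec profile BRANCH-PROF

! VTI: same interface with one extra line, which drops the GRE
! header (less overhead, but IPsec-only, unicast IP payloads)
! tunnel mode ipsec ipv4
```

That trade-off — routing-protocol support and flexibility vs. lower per-packet overhead — is exactly the overhead/route-control/application-performance comparison mentioned above.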

These aren’t just theoretical ideas — they’re live planning efforts, and they’ll be executed in production once tested, validated, and approved. All designs will be documented here in detail as they evolve.


  3. 🔒 Authentication, MFA & Certificate Handling

Security has been a core consideration in every system I’ve designed — especially at the user authentication level. Instead of relying on traditional VPN or RADIUS-based models, I implemented desktop-level MFA integration that directly reinforces endpoint security without overcomplicating the infrastructure.

Key achievements include:

  • Deployed Okta Desktop MFA integration with Active Directory, allowing users to securely log in at the Windows login screen using push notifications when connected to the corporate network

  • Designed an offline fallback strategy using time-based one-time passwords (TOTP) for scenarios where users are disconnected (e.g., traveling, production floor, or off-grid environments)

  • Eliminated dependency on RADIUS or cloud SSO for initial login — enabling a hybrid MFA workflow that adapts to both online and offline user states

  • Managed and automated SSL certificate provisioning through Let’s Encrypt (DNS-01 challenge via DuckDNS) and GoDaddy, fully integrated with NGINX web services. (This will change, since I’m planning to migrate to Cloudflare in order to achieve true auto-failover.)


  4. 🧱 Hardware & Infrastructure Visioning

When building out infrastructure, I don’t just look at what's immediately needed — I plan for what comes next. Every hardware decision I make considers scalability, fault tolerance, and real-world maintenance conditions.

Most of the work I've done started with evaluating the current limitations of our stack and identifying where performance bottlenecks or single points of failure existed. From there, I designed solutions around hardware that could scale, be easily integrated, and support future high-availability and storage demands.

This led to selecting and implementing the following, based on the equipment available in our current infrastructure:

  • HPE ProLiant DL360 Gen9 servers for their reliability, dual-CPU support, and modular expandability

  • SSD-backed drives to support Ceph OSD nodes, ensuring performance consistency under cluster load

  • A physical topology that supports multi-path networking, redundant power, and efficient rack design

My role wasn’t just plugging in hardware — it was consulting internally to answer:

  • What performance targets do we need to hit?

  • What’s the cost-benefit of scaling now vs. later?

  • How do we design for disaster recovery from day one?

I worked closely with stakeholders and team members to ensure hardware selection aligned with both technical and business goals, including future Ceph expansion, faster VM provisioning, and reliable Proxmox HA behavior.

Infrastructure design isn’t just about what you have — it’s about what you can grow into without redoing everything later. Every piece of hardware I recommend is part of that bigger picture.


🪞 Reflecting on the Journey So Far

I’ll be honest — I didn’t set out expecting to be the person responsible for clustered systems, dual-ISP failover, or high-availability architecture this early in my career. I just followed the work, asked questions, and built what needed to be built.

I still hold a CCNA (LOL, I barely got it back in August 2024) and I’m studying for Cisco’s ENCOR 350-401, but most of what I know hasn’t come from certifications — it’s come from hands-on troubleshooting, real deployments, and asking "What happens if this fails?" over and over again. (I’d be lying if I said I don’t have a mentor, LOL. I do have one: someone I ask how things work, and who helps me validate the designs I build before they’re put into practice and tested for failure. Trust me, it’s suffering and enjoyable at the same time.)

A lot of engineers don’t get to touch this kind of work until years down the line. I see that not as pressure, but as opportunity. And the more I build, the more I realize how much there is still to learn — which is exactly why I’m here.


✍️ Why I’m Writing This Blog

This isn’t just a technical blog — it’s more like a working journal.

I wanted a space to capture everything I’ve been doing in one place: the ideas, the diagrams, the "oh no" moments and the "yes, finally" fixes. I’ve found that writing things down helps me process what I’ve learned — and if it happens to help someone else along the way, even better.

So if you’re here just to learn, to compare notes, or to find a config snippet you forgot — welcome. This blog isn’t about posturing or proving anything. It’s just my way of recording the messy, evolving, and (sometimes) pretty cool things I get to work on.


🔍 What You’ll Find Here

Some things I’ll be diving into:

  • Proxmox + Ceph clustering and how to build it with reliability in mind

  • Failover design using SonicWall, dual ISPs, and NAT trickery

  • Site-to-site planning with GRE/IPsec, VTI, and SD-WAN logic (yes, we’ll compare them)

  • Backup strategies with PBS, TrueNAS, and recovery validation

  • MFA strategies for desktop login — even when you’re offline

  • Occasional thoughts about “why this matters” and what I’d do differently next time

If I can explain something clearly, it means I really understand it — so this blog is as much for me as it is for you.


🙌 A Quick Note Before You Go

Everything I share here is real — tested, implemented, or in active planning — but always scrubbed of sensitive info. I’m writing with transparency, but with care.

If you’re someone building your first homelab, managing real infrastructure, or just trying to understand what a Ceph OSD is supposed to do — I hope you find something here that clicks.

Thanks for reading.
Hope to see you around the next post.

— Q
