Troubleshooting Concepts for EC2

Ankita Lunawat

1. Check EC2 Instance Status

  • Verify Instance State: Ensure that the EC2 instance is running.

    • Go to the AWS Management Console → EC2 Dashboard → Instances → Check the instance's state (Running, Stopped, Terminated).
  • Instance Status Checks:

    • Go to the instance details and check Instance Status Checks.

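    • Ensure both the System status check and the Instance status check have passed. A failed system status check points to a problem on the AWS side (host hardware, power, or network). A failed instance status check points to a problem inside the instance (boot failure, misconfigured networking, exhausted memory, or a corrupted file system).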
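The same checks can be read from the command line. This is a minimal sketch, assuming the AWS CLI is installed and configured with credentials; the function name check_instance_status and the instance ID in the usage line are placeholders, not anything from your account.

```shell
# Print instance state, system status check, and instance status check
# (tab-separated) for one instance.
# --include-all-instances makes stopped instances show up as well.
check_instance_status() {
  aws ec2 describe-instance-status \
    --instance-ids "$1" \
    --include-all-instances \
    --query 'InstanceStatuses[0].[InstanceState.Name,SystemStatus.Status,InstanceStatus.Status]' \
    --output text
}
```

Usage: check_instance_status i-0123456789abcdef0 (a placeholder ID). A healthy instance should report running, followed by ok for both checks.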

2. Check Security Group and NACLs

  • Security Groups: Ensure that the security group attached to your instance allows the necessary inbound/outbound traffic.

    • For SSH (Linux): Allow inbound traffic on port 22 (SSH).

    • For RDP (Windows): Allow inbound traffic on port 3389 (RDP).

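    • Ensure outbound rules are not restrictive (typically allow all outbound for simplicity). Security groups are stateful, so replies to allowed inbound connections are permitted automatically.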

  • Network ACL (NACL):

    • Check if your NACL is allowing traffic to and from the instance’s subnet (both inbound and outbound rules).

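    • Ensure port 22 (SSH) or port 3389 (RDP) is allowed inbound. NACLs are stateless, so the outbound rules must also allow the return traffic on ephemeral ports (typically 1024-65535).

    • Double-check that no DENY rule matches your traffic. NACL rules are evaluated in ascending rule-number order and the first match wins, so a low-numbered DENY overrides a later ALLOW.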
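To review the rules discussed above without clicking through the console, something like the following may help. It is a sketch that assumes a configured AWS CLI; the function names and the sg-... and subnet-... IDs in the usage line are placeholders.

```shell
# List the inbound rules of one security group as a table.
show_sg_rules() {
  aws ec2 describe-security-groups \
    --group-ids "$1" \
    --query 'SecurityGroups[].IpPermissions[]' \
    --output table
}

# List all entries (inbound and outbound) of the NACL associated with a subnet.
show_subnet_nacl() {
  aws ec2 describe-network-acls \
    --filters "Name=association.subnet-id,Values=$1" \
    --query 'NetworkAcls[].Entries[]' \
    --output table
}
```

Usage: show_sg_rules sg-0123456789abcdef0 and show_subnet_nacl subnet-0123456789abcdef0. In the NACL output, entries with Egress set to true are outbound rules.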

3. Check Route Table and IGW (Internet Gateway)

  • Route Table: Verify that your subnet's route table has a route to the Internet Gateway (IGW) for internet access.

    • Ensure there’s a route like 0.0.0.0/0 pointing to the IGW.
  • Internet Gateway (IGW): Ensure that an Internet Gateway is attached to your VPC and that your instance is in a public subnet (a subnet whose route table sends 0.0.0.0/0 to that IGW).

    • For public access, instances must have a public IP or Elastic IP attached and be routed through the IGW.
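The route check above can be scripted roughly as follows. This sketch assumes a configured AWS CLI and uses made-up function names. One caveat: the filter only matches subnets with an explicit route table association, so a subnet that silently uses the VPC's main route table returns nothing here.

```shell
# Print "destination <TAB> gateway" for every route in the subnet's route table.
subnet_routes() {
  aws ec2 describe-route-tables \
    --filters "Name=association.subnet-id,Values=$1" \
    --query 'RouteTables[].Routes[].[DestinationCidrBlock,GatewayId]' \
    --output text
}

# Succeed (exit 0) only if the subnet has a 0.0.0.0/0 route through an IGW.
has_igw_default_route() {
  subnet_routes "$1" | grep -Eq '^0\.0\.0\.0/0[[:space:]]+igw-'
}
```

Usage: has_igw_default_route subnet-0123456789abcdef0 && echo "public subnet" (the subnet ID is a placeholder).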

4. Key Pair and SSH Configuration

  • Correct Key Pair: Ensure you are using the correct .pem file for SSH (Linux). If you lose the private key, you won't be able to SSH into the instance.

  • File Permissions: Verify that your key file is readable only by you. SSH refuses to use a private key whose permissions are too open.

      chmod 400 your-key.pem
    
  • SSH Configuration:

    • Use the correct syntax for SSH:

        ssh -i your-key.pem ec2-user@public_ip_address
      

      Double-check that you're using the correct user (ec2-user, ubuntu, centos, etc., depending on the AMI).
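Picking the wrong login user is a common cause of "Permission denied" errors. The helper below is only an illustrative sketch: the family labels (amazon-linux, ubuntu, and so on) and the function names are invented for this example, and the usernames are the usual defaults for those AMIs, so confirm them against your AMI's documentation.

```shell
# Map an AMI family label (hypothetical labels for this sketch) to the
# login user that is the usual default for that family.
default_ssh_user() {
  case "$1" in
    amazon-linux|rhel|suse) echo "ec2-user" ;;
    ubuntu)                 echo "ubuntu" ;;
    debian)                 echo "admin" ;;
    centos)                 echo "centos" ;;
    fedora)                 echo "fedora" ;;
    *)                      echo "ec2-user" ;;  # fallback guess only
  esac
}

# $1 = key file, $2 = AMI family label, $3 = public IP or DNS name
ssh_to_instance() {
  ssh -i "$1" "$(default_ssh_user "$2")@$3"
}
```

Usage: ssh_to_instance your-key.pem ubuntu public_ip_address. If the connection still fails, adding -v to the ssh command shows which key and user are being offered.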

5. Elastic IP / Public IP

  • Elastic IP: If your instance is in a public subnet, check if it has a public IP or Elastic IP assigned.

  • Changed Public IP: When an instance is stopped and then started, its auto-assigned public IP changes unless you use an Elastic IP. Make sure you're using the latest public IP when trying to connect.
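To look up the address the instance has right now, a small sketch along these lines can be used, assuming a configured AWS CLI (the function name and instance ID are placeholders):

```shell
# Print the instance's current public IPv4 address.
# The CLI prints "None" when no public IP is assigned.
current_public_ip() {
  aws ec2 describe-instances \
    --instance-ids "$1" \
    --query 'Reservations[0].Instances[0].PublicIpAddress' \
    --output text
}
```

Usage: current_public_ip i-0123456789abcdef0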

6. Check EC2 Instance's OS and Firewall

  • Instance-Level Firewall (iptables/UFW):

    • If your instance has an internal firewall (like iptables or ufw), check that it isn’t blocking SSH, RDP, or other required services.

    • You can use AWS Systems Manager Session Manager (if previously configured) to get access without relying on SSH.

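  • Disable Firewall Temporarily (for testing only; re-enable it afterwards):

      sudo ufw disable                # For UFW (Ubuntu)
      sudo systemctl stop firewalld   # For firewalld (RHEL/CentOS 7 and later)
      sudo systemctl stop iptables    # For the iptables service (older RedHat/CentOS)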
    

7. SSH Through Another Instance (Bastion Host)

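  • If your instance is in a private subnet or you can't SSH directly, use an EC2 instance in a public subnet (a Bastion Host) as a jump point. Connect to the bastion first and from there to the private instance, using SSH agent forwarding or a jump-host connection so that the private key never has to be copied onto the bastion. The private instance's security group must allow SSH from the bastion.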
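One way to do this is OpenSSH's -J (ProxyJump) option, available in OpenSSH 7.3 and later. The sketch below assumes the key has already been loaded into a running ssh-agent (for example with ssh-add your-key.pem), because an -i flag on the command line is not applied to the jump host. The function name is made up for this example.

```shell
# $1 = user@bastion_public_ip, $2 = user@private_ip_address
# Tunnels through the bastion; the key stays on your machine in ssh-agent.
ssh_via_bastion() {
  ssh -J "$1" "$2"
}
```

Usage: ssh_via_bastion ec2-user@bastion_public_ip ec2-user@private_ip_address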

8. EC2 Serial Console (for Nitro-based instances)

  • AWS provides an EC2 Serial Console for Nitro-based instances that allows low-level system debugging (available for both Windows and Linux).

  • This is useful when you lose SSH access due to configuration errors (e.g., invalid SSHD settings).

  • Enable account-level access via the AWS Console → EC2 Dashboard → Account attributes → EC2 Serial Console, then connect via Instances → Actions → Monitor and troubleshoot → EC2 Serial Console.

9. Verify IAM Roles and Instance Profile

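  • Check that the IAM role attached to your instance (through its instance profile) grants the access you need. A misconfigured role can cause access issues, especially if you're using AWS services like SSM to connect. Session Manager, for example, needs permissions such as those in the AmazonSSMManagedInstanceCore managed policy.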
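To see which instance profile, if any, is attached, a sketch like this can be used. It assumes a configured AWS CLI; the function name and instance ID are placeholders.

```shell
# Print the ARN and association state of the instance profile
# attached to an instance (prints nothing if none is attached).
show_instance_profile() {
  aws ec2 describe-iam-instance-profile-associations \
    --filters "Name=instance-id,Values=$1" \
    --query 'IamInstanceProfileAssociations[].[IamInstanceProfile.Arn,State]' \
    --output text
}
```

Usage: show_instance_profile i-0123456789abcdef0. An empty result means no role is attached, which alone is enough to stop Session Manager from working.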

10. VPC Peering / VPN Issues (for private instances)

  • If you're connecting through VPC Peering or VPN, ensure that the routes, security groups, and NACLs on both ends are configured correctly.

  • Confirm that DNS resolution works for instances in private subnets and that the VPC DNS settings (the enableDnsSupport and enableDnsHostnames attributes) are correct.

11. System Logs and Errors

  • Review the System Logs from the AWS Console.

    • Navigate to EC2 Dashboard → Instances → Actions → Monitor and troubleshoot → Get system log.

    • Look for any errors that could indicate issues with the OS, SSH configuration, or network.
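The same log can be pulled with the AWS CLI and filtered for likely problems. This is a rough sketch: the function name is invented, the search pattern is only a starting point, and it assumes the CLI is configured.

```shell
# Fetch the instance's console output and keep only lines that
# mention errors, failures, or denials (case-insensitive).
console_errors() {
  aws ec2 get-console-output \
    --instance-id "$1" \
    --query 'Output' \
    --output text | grep -iE 'error|fail|denied'
}
```

Usage: console_errors i-0123456789abcdef0 (a placeholder ID). No output means nothing matched the pattern, not necessarily that the boot was clean.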

12. Reboot or Stop/Start Instance

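  • Reboot the instance first. A reboot restarts the OS on the same underlying host and keeps the public IP, so it mainly helps with hung processes or a stuck OS.

  • If a reboot doesn’t work, try stopping and then starting the instance. This usually moves an EBS-backed instance to a different underlying host, which can clear a failed system status check. The public IP changes if you’re not using an Elastic IP, and any data on instance store volumes is lost.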
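A stop followed by a start can be scripted so that each step waits for the previous one to finish. This is a sketch with an invented function name, assuming a configured AWS CLI and an EBS-backed instance.

```shell
# Stop the instance, wait until it is fully stopped, start it again,
# and wait until it is running. Each step runs only if the previous one succeeded.
stop_then_start() {
  aws ec2 stop-instances --instance-ids "$1" &&
  aws ec2 wait instance-stopped --instance-ids "$1" &&
  aws ec2 start-instances --instance-ids "$1" &&
  aws ec2 wait instance-running --instance-ids "$1"
}
```

Usage: stop_then_start i-0123456789abcdef0. Afterwards, look up the new public IP before reconnecting unless an Elastic IP is attached.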

13. Verify SSM Agent for AWS Systems Manager

  • If the SSM Agent is installed and configured on the instance, you can connect using Session Manager without SSH access and without opening port 22.

  • Ensure the SSM Agent is running on the instance, that the instance has the appropriate IAM role attached, and that it can reach the Systems Manager endpoints (through the internet or VPC endpoints).
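Whether Systems Manager can actually see the instance can be checked from your own machine. The sketch below assumes a configured AWS CLI and, for the connect step, that the Session Manager plugin is installed locally; the function names are made up for this example.

```shell
# Print the agent's ping status as seen by Systems Manager
# (for example Online or ConnectionLost; "None" if the instance is not registered).
ssm_ping_status() {
  aws ssm describe-instance-information \
    --filters "Key=InstanceIds,Values=$1" \
    --query 'InstanceInformationList[0].PingStatus' \
    --output text
}

# Open an interactive Session Manager shell on the instance.
ssm_connect() {
  aws ssm start-session --target "$1"
}
```

Usage: ssm_ping_status i-0123456789abcdef0, and if it reports Online, ssm_connect i-0123456789abcdef0.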
