Chaos Engineering | Types of Attacks
Modern distributed software systems must be tested for potential weaknesses and faults. Chaos engineering is the practice of deliberately injecting failures into a distributed computing system to verify that it can tolerate unexpected disruptions. It draws on concepts from chaos theory, which deals with random and unpredictable behavior. If you are interested in learning more about chaos engineering and its history, please refer to this article from Gremlin.
In this article, we will discuss various categories of attacks and some use cases.
Resource Attack
Resource attacks generate load across CPU, memory, and storage devices.
They help you prepare for sudden changes in load, validate auto-scaling, and test monitoring and alerting configuration. It is like preparing your system for a Black Friday sale in advance.
CPU Attack
A CPU attack consumes CPU cycles to put the system under heavy load, which helps identify stability and performance problems under stress. It can also be used to validate auto-scaling and alerting mechanisms.
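As a rough illustration of the CPU attack described above, the following Python sketch keeps one or more cores busy for a fixed duration. The function names `burn_cpu` and `run_cpu_attack` are hypothetical and not taken from any chaos engineering tool; a real tool would also offer controls such as a target utilization percentage.

```python
import multiprocessing
import time


def burn_cpu(seconds: float) -> int:
    """Busy-loop on one core for roughly `seconds`; return the iteration count."""
    deadline = time.monotonic() + seconds
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1  # pointless work that keeps the core busy
    return iterations


def run_cpu_attack(workers: int, seconds: float) -> None:
    """Start one busy-loop process per worker and wait for all of them."""
    procs = [
        multiprocessing.Process(target=burn_cpu, args=(seconds,))
        for _ in range(workers)
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
```

To load every core for a minute, you could call `run_cpu_attack(multiprocessing.cpu_count(), 60)` from inside an `if __name__ == "__main__":` guard while watching whether alerts fire and auto-scaling reacts.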
Memory Attack
Memory leaks are a common cause of "Out Of Memory" errors in production. A memory leak happens when an application keeps allocating memory without releasing it. A memory attack consumes memory on the host, which helps validate hypotheses about memory-intensive workloads such as in-memory caches and machine learning models. It can also help with cloud migration by exercising the auto-scaling configuration.
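A minimal sketch of the idea, assuming a simple "allocate, hold, release" model: the code below allocates memory in chunks, holds it for a while, and then frees it. The names `consume_memory` and `memory_attack` are hypothetical. Depending on the operating system, zero-filled pages may not be fully committed until they are written to, so treat the numbers as approximate.

```python
import time


def consume_memory(total_mb: int, chunk_mb: int = 16) -> list:
    """Allocate roughly `total_mb` megabytes in chunks and return the chunks."""
    chunks = []
    allocated = 0
    while allocated < total_mb:
        size = min(chunk_mb, total_mb - allocated)
        # bytearray(n) allocates n zero bytes
        chunks.append(bytearray(size * 1024 * 1024))
        allocated += size
    return chunks


def memory_attack(total_mb: int, hold_seconds: float) -> int:
    """Hold `total_mb` megabytes for `hold_seconds`, then release them.

    Returns the number of bytes that were held.
    """
    chunks = consume_memory(total_mb)
    time.sleep(hold_seconds)  # window in which to observe the system
    held = sum(len(chunk) for chunk in chunks)
    chunks.clear()  # drop the references so the memory can be reclaimed
    return held
```

While the memory is held, you would check whether the cache evicts entries as expected, whether memory alerts fire, and whether the process is killed by the OOM killer.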
Disk Attack
Disk attacks are often used to simulate reading or writing a large data set, such as a restored backup or a replicated database. They can also help identify gaps in automatic disk cleanup processes.
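The disk-filling part of such an attack can be sketched as follows. This is an illustration only: `fill_disk` and `disk_attack` are hypothetical names, and the file is written to a temporary directory that is removed afterwards so the sketch cleans up after itself.

```python
import os
import tempfile


def fill_disk(directory: str, total_mb: int, chunk_mb: int = 1) -> str:
    """Write a file of `total_mb` megabytes into `directory`; return its path."""
    path = os.path.join(directory, "chaos_disk_fill.bin")
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    remaining = total_mb
    with open(path, "wb") as f:
        while remaining > 0:
            size = min(chunk_mb, remaining)
            f.write(chunk[: size * 1024 * 1024])
            remaining -= size
        f.flush()
        os.fsync(f.fileno())  # make sure the data actually reaches the disk
    return path


def disk_attack(total_mb: int) -> int:
    """Fill a scratch directory, report the bytes written, then clean up."""
    with tempfile.TemporaryDirectory() as scratch:
        path = fill_disk(scratch, total_mb)
        written = os.path.getsize(path)
    return written
```

In a real experiment you would point `fill_disk` at the volume under test and leave the file in place long enough to see whether cleanup jobs and disk-usage alerts behave as expected.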
I/O Attack
An I/O attack can help you prepare for slower storage solutions by simulating their performance. This attack helps validate disk-heavy workloads (such as batch processes that read from and write to disk) and the effectiveness of in-memory caches.
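One way to see the effect of an I/O attack is to measure synchronous write latency before and during the experiment. The sketch below is an assumption about how you might take that measurement, not part of any specific tool; `timed_sync_writes` is a hypothetical helper.

```python
import os
import tempfile
import time


def timed_sync_writes(count: int, block_size: int = 4096) -> list:
    """Write `count` blocks, fsync after each, and return per-write latencies."""
    latencies = []
    block = os.urandom(block_size)
    with tempfile.NamedTemporaryFile() as f:
        for _ in range(count):
            start = time.perf_counter()
            f.write(block)
            f.flush()
            os.fsync(f.fileno())  # force the write through to the device
            latencies.append(time.perf_counter() - start)
    return latencies
```

Comparing, for example, `max(timed_sync_writes(100))` with and without the attack running gives a rough sense of how much slower storage appears to the application.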
State Attack
State attacks change the state of your environment by terminating processes, shutting down or restarting hosts, and changing the system clock. This lets you prepare your systems for unexpected changes in your environment such as power outages, node failures, clock drift, or application crashes.
Process Killer Attack
Process killer attacks allow teams to terminate a specific process or set of processes. This verifies that watchdogs restart the application or service as expected, and it tests leader re-election in clustered workloads.
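The kill-and-restart cycle can be sketched with Python's `subprocess` module. The worker here is a hypothetical stand-in (a child process that just sleeps), and `kill_and_restart` plays the role of both the attack and a very simple watchdog.

```python
import subprocess
import sys

# Hypothetical stand-in for a real service: a child process that just sleeps.
WORKER_CMD = [sys.executable, "-c", "import time; time.sleep(60)"]


def start_worker() -> subprocess.Popen:
    """Start the stand-in worker process."""
    return subprocess.Popen(WORKER_CMD)


def kill_and_restart(proc: subprocess.Popen) -> subprocess.Popen:
    """Kill the worker (the attack), then restart it as a watchdog would."""
    proc.kill()  # SIGKILL on POSIX; terminates the process on Windows
    proc.wait(timeout=5)  # reap the killed process
    return start_worker()
```

In a real experiment the restart would be done by your supervisor (for example a service manager or an orchestrator), and the thing to measure is how long the service stays unavailable.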
Shutdown Attack
This is similar to Chaos Monkey: an entire host is shut down, which pushes teams to build highly resilient systems. It helps validate disaster recovery (DR) scenarios such as automatic workload migration, replication, and high availability of clustered workloads.
Time Travel Attack
Time travel attacks allow you to change the system clock. This lets you prepare for scenarios such as Daylight Saving Time (DST) changes, clock drift between hosts, and expiring SSL/TLS certificates.
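A real time travel attack changes the host's system clock, which requires elevated privileges. The same hypothesis can be explored in-process by injecting a clock into the code under test. The sketch below assumes that approach; `OffsetClock` and `certificate_is_valid` are hypothetical names used only for illustration.

```python
from datetime import datetime, timedelta, timezone


class OffsetClock:
    """Clock that reports real UTC time shifted by a fixed offset."""

    def __init__(self, offset: timedelta = timedelta(0)):
        self.offset = offset

    def now(self) -> datetime:
        return datetime.now(timezone.utc) + self.offset


def certificate_is_valid(not_after: datetime, clock: OffsetClock) -> bool:
    """Return True while the clock's current time is before the expiry date."""
    return clock.now() < not_after


# A certificate that expires 30 days from now.
expiry = datetime.now(timezone.utc) + timedelta(days=30)
valid_today = certificate_is_valid(expiry, OffsetClock())
valid_after_jump = certificate_is_valid(expiry, OffsetClock(timedelta(days=31)))
```

Here `valid_today` is `True` and `valid_after_jump` is `False`, which mirrors what a clock jump of 31 days would reveal about certificate expiry handling.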
Network Attack
Network attacks let you simulate unhealthy network conditions including dropped connections, high latency, packet loss, and DNS outages. This lets you build applications that are resilient to unreliable network conditions.
Blackhole Attack
Blackhole attacks help you simulate outages by dropping network traffic between services. This lets you uncover hard dependencies, test fallback and failover mechanisms, and prepare your applications for unreliable networks. You can also validate the monitoring and alerting mechanisms for a cluster.
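The fallback behavior that a blackhole attack is meant to exercise can be sketched with an in-process stand-in for the network. Everything here is hypothetical: `BlackholeNetwork` only imitates dropped traffic by raising `TimeoutError`, and the host names are placeholders.

```python
class BlackholeNetwork:
    """In-process stand-in for a network where some hosts are blackholed."""

    def __init__(self):
        self.blocked = set()

    def blackhole(self, host: str) -> None:
        """Start dropping all traffic to `host`."""
        self.blocked.add(host)

    def request(self, host: str, payload: str) -> str:
        if host in self.blocked:
            # Dropped traffic usually shows up to the caller as a timeout.
            raise TimeoutError(f"no response from {host}")
        return f"{host} handled {payload}"


def call_with_fallback(net: BlackholeNetwork, primary: str,
                       fallback: str, payload: str) -> str:
    """Try the primary host and fall back to the secondary on failure."""
    try:
        return net.request(primary, payload)
    except OSError:  # TimeoutError and ConnectionError are both OSErrors
        return net.request(fallback, payload)
```

If a service has no fallback path at all, the same experiment exposes it as a hard dependency: the call fails outright once the primary is blackholed.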
Latency Attack
Latency is the amount of time taken for a network request to travel from one network endpoint to another. The latency attack injects a delay into outbound network traffic, letting you validate your system's responsiveness under slow network conditions. It also helps you tune circuit breaker settings such as retry counts and timeout thresholds.
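To illustrate how injected latency interacts with a timeout threshold, the sketch below wraps a call with an artificial delay and enforces a timeout with a fallback value. The helpers `with_latency` and `call_with_timeout` are hypothetical; a production circuit breaker would also track failure counts and open or close the circuit accordingly.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def with_latency(func, delay_seconds: float):
    """Return a version of `func` that sleeps before running (injected delay)."""
    def wrapper(*args, **kwargs):
        time.sleep(delay_seconds)
        return func(*args, **kwargs)
    return wrapper


def call_with_timeout(func, timeout_seconds: float, fallback):
    """Run `func` in a worker thread; return `fallback` if it takes too long."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func)
    try:
        return future.result(timeout=timeout_seconds)
    except FutureTimeout:
        return fallback
    finally:
        # Do not block on the slow call; the worker thread finishes on its own.
        pool.shutdown(wait=False)
```

Running the same call with increasing delays shows the point at which the timeout starts returning the fallback, which is a useful input when choosing the threshold.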
DNS Attack
Recently, an Akamai DNS failure caused many popular websites to become unreachable. The DNS attack simulates a DNS outage by blocking network access to DNS servers. This lets you prepare for DNS outages, test your fallback DNS servers, and validate DNS resolver configurations.
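The fallback logic that a DNS attack tests can be sketched as a chain of resolvers tried in order. The resolvers below are hypothetical stand-ins: one always fails to imitate the outage, and the other answers from a static table. The address `192.0.2.10` comes from a range reserved for documentation.

```python
import socket


def failing_resolver(hostname: str) -> str:
    """Stand-in for a primary DNS server that is unreachable."""
    raise socket.gaierror("simulated DNS outage")


def make_static_resolver(table: dict):
    """Build a stand-in fallback resolver that answers from a fixed table."""
    def resolve(hostname: str) -> str:
        if hostname not in table:
            raise socket.gaierror(f"unknown host {hostname}")
        return table[hostname]
    return resolve


def resolve_with_fallback(hostname: str, resolvers: list) -> str:
    """Try each resolver in order and return the first successful answer."""
    last_error = None
    for resolve in resolvers:
        try:
            return resolve(hostname)
        except socket.gaierror as exc:
            last_error = exc  # remember the failure and try the next resolver
    if last_error is None:
        raise ValueError("no resolvers configured")
    raise last_error
```

With a real resolver in the chain, for example one built on `socket.getaddrinfo`, the same structure shows whether lookups still succeed while the primary DNS servers are blocked.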
Packet Loss Attack
This attack is very helpful for streaming services, such as live video or multiplayer gaming, which rely on a high throughput of data. When there is network congestion, many packets are queued, and some packets may be lost once the queue capacity of your hardware is exceeded. Packet loss attacks let you replicate this condition, observe the end-user experience, and tune the replay mechanism for a better user experience.
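The effect of packet loss on a sender that retries can be sketched with a simulated link. `LossyLink` and `send_reliably` are hypothetical names; the link drops each packet with a fixed probability, and a seeded random generator keeps runs repeatable.

```python
import random


class LossyLink:
    """Simulated link that drops each packet with probability `loss_rate`."""

    def __init__(self, loss_rate: float, seed: int = 0):
        self.loss_rate = loss_rate
        self.rng = random.Random(seed)  # seeded for repeatable experiments
        self.received = []

    def send(self, packet) -> bool:
        """Return True if the packet was delivered, False if it was dropped."""
        if self.rng.random() < self.loss_rate:
            return False
        self.received.append(packet)
        return True


def send_reliably(link: LossyLink, packets: list, max_attempts: int = 10) -> int:
    """Resend each packet until delivered; return the total send attempts."""
    attempts = 0
    for packet in packets:
        for _ in range(max_attempts):
            attempts += 1
            if link.send(packet):
                break
        else:
            raise RuntimeError(f"packet {packet!r} lost {max_attempts} times")
    return attempts
```

Comparing the returned attempt count with the number of packets gives a rough measure of the retransmission overhead at a given loss rate.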
Summary
This article gave an overview of the main categories of chaos engineering attacks: resource attacks (CPU, memory, disk, and I/O), state attacks (process killer, shutdown, and time travel), and network attacks (blackhole, latency, DNS, and packet loss). For each one, it described the failure it simulates and what it helps validate, such as auto-scaling, monitoring and alerting, failover, and timeout and retry settings. Running these attacks deliberately helps uncover weaknesses before they cause outages, which is why chaos engineering matters for the reliability and resilience of modern software systems.
In the next article, we will discuss other Chaos Engineering concepts.
Written by
Amit Himani
As a Senior Cloud Architect at a well-known product-based company, I possess a wealth of experience in hybrid cloud technologies and a passion for performance engineering, SRE, and Chaos engineering. In my leisure time, I take pleasure in staying abreast of emerging technologies and keeping up with industry trends. I also enjoy sharing my knowledge and insights with others by writing informative articles. I firmly believe in the significance of continuous learning and personal development to achieve success. Through my writing, I aspire to inspire and motivate others to pursue their own growth and professional aspirations. Thank you for visiting my blog. I hope you find the content informative and engaging. Please feel free to leave a comment or contact me if you have any queries or would like to connect.