Hadoop #Split-Brain Scenario & #Fencing

AATISH SINGH

Let's #hadoop

📌 What is #Split Brain Scenario and #fencing in Hadoop?

✔ In the context of distributed systems, including Hadoop clusters, a "#split_brain scenario" and "#fencing" are related concepts that deal with the issue of maintaining data consistency and avoiding data corruption in case of network partitions or communication failures among nodes.

✅ Split Brain Scenario:

▪ A split-brain scenario occurs when the nodes of a cluster lose communication with one another, leading to the formation of isolated subclusters or partitions.

▪ Each partition continues to operate independently, potentially making decisions and processing data unaware of the other partitions' state.

▪ This can lead to data inconsistencies, conflicts, and corruption when the partitions rejoin or when communication is restored.

▪ In the context of Hadoop, the typical case is an HDFS high-availability setup in which two NameNodes both believe they are active. This could cause data integrity issues, data loss, or incorrect results when processing large-scale data.

▪ It is important to prevent or manage such scenarios to maintain the stability and reliability of the Hadoop cluster.
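To make the divergence concrete, here is a minimal Python sketch of two partitions that keep accepting writes for the same key while they cannot see each other. It is a toy model only: the `Replica` class, the `conflicting_keys` helper, and the file path are hypothetical names chosen for illustration and are not part of Hadoop.

```python
# Toy illustration only: two "partitions" of a replicated key-value store
# that keep accepting writes while they cannot see each other.
# Replica and conflicting_keys are hypothetical names, not Hadoop APIs.


class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value):
        self.data[key] = value


def conflicting_keys(a, b):
    """Return the keys whose values differ between two replicas."""
    shared = a.data.keys() & b.data.keys()
    return sorted(k for k in shared if a.data[k] != b.data[k])


# Before the partition, both replicas agree.
left, right = Replica("left"), Replica("right")
for replica in (left, right):
    replica.write("/data/report.csv", "v1")

# During the partition, each side accepts a different update for the same key.
left.write("/data/report.csv", "v2-from-left")
right.write("/data/report.csv", "v2-from-right")

# After the network heals, there is no single correct value to converge on.
print(conflicting_keys(left, right))  # ['/data/report.csv']
```

Neither side did anything wrong locally, which is why the problem has to be prevented rather than repaired after the fact.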

✅ Fencing:

▪ Fencing is a mechanism used to prevent or resolve split-brain scenarios in distributed systems like Hadoop.

▪ The purpose of fencing is to ensure that only one partition of the cluster remains active and continues to function while the other partitions are isolated or shut down.

▪ By fencing off the isolated partitions, you prevent multiple "brains" from making independent decisions, reducing the risk of data corruption and inconsistencies.

▪ In Hadoop, fencing is part of HDFS NameNode high availability, where it helps preserve data consistency across a failover. Automatic failover relies on ZooKeeper, a distributed coordination service, together with a ZKFailoverController (ZKFC) process that runs alongside each NameNode.

▪ ZooKeeper provides the distributed lock used for leader election: the ZKFC of the active NameNode holds an ephemeral lock znode, and if that session is lost (for example, after a crash or network partition) a standby's ZKFC can acquire the lock. Before the standby is promoted, the previous active NameNode is fenced using the methods listed in dfs.ha.fencing.methods, such as sshfence or a custom shell command.

▪ As a result, only the active NameNode serves clients and manages data operations, while the others remain in standby. When the Quorum Journal Manager is used, the JournalNodes additionally allow only one NameNode at a time to write to the edit log.

▪ By employing proper fencing mechanisms, Hadoop can effectively manage network partitions, maintain data consistency, and ensure the reliability of its distributed processing tasks.

▪ Fencing is an essential aspect of designing and configuring a robust Hadoop cluster for high availability and fault tolerance.
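One common way to implement fencing at the storage layer is with monotonically increasing epoch numbers (sometimes called fencing tokens). The sketch below is a simplified Python model of that general idea, not Hadoop's actual code: `SharedLog`, `StaleEpochError`, and the sample log entries are hypothetical names introduced purely for illustration.

```python
# Toy sketch of epoch-based fencing. All names here are hypothetical and
# are not part of any Hadoop API.


class StaleEpochError(Exception):
    """Raised when a writer with an outdated epoch tries to write."""


class SharedLog:
    """Shared storage that accepts writes only from the newest epoch."""

    def __init__(self):
        self.highest_epoch = 0
        self.entries = []

    def new_epoch(self):
        # Granting a new epoch implicitly fences every earlier writer.
        self.highest_epoch += 1
        return self.highest_epoch

    def append(self, epoch, entry):
        if epoch < self.highest_epoch:
            raise StaleEpochError(
                f"epoch {epoch} is fenced; current is {self.highest_epoch}"
            )
        self.entries.append((epoch, entry))


log = SharedLog()

# Node A becomes active and writes normally.
epoch_a = log.new_epoch()
log.append(epoch_a, "mkdir /data")

# A partition hides A from the coordinator, so node B is promoted.
epoch_b = log.new_epoch()
log.append(epoch_b, "create /data/part-0")

# A still believes it is active, but its late write is rejected.
try:
    log.append(epoch_a, "delete /data")
    old_writer_fenced = False
except StaleEpochError:
    old_writer_fenced = True

print(old_writer_fenced)  # True
print(log.entries)        # [(1, 'mkdir /data'), (2, 'create /data/part-0')]
```

The design point is that the old writer does not need to cooperate or even be reachable: the shared storage itself refuses stale writes, so a node that wrongly believes it is still active cannot corrupt the log.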

