How To Setup Highly Available Kubernetes Clusters And Applications?

Views: 14,438

DevOps Toolkit

Comments: 46

@DevOpsToolkit 3 years ago

Is your system highly available (HA)? If it is, how did you architect it?

@mohammadbagheri6841 11 months ago

Just wow, you described every tiny aspect of it in just 17 minutes! You've earned a subscriber!

@MarkusEicher70 4 months ago

Such a concise and easy to understand explanation of HA. Well done. Thank you, Viktor.

@MUSHIN_888 2 months ago

I played him at 1.5x speed for most of the video, then when I put him back at normal speed it was like slo-mo. Thanks for the help, man, much appreciated!!!

@cloud-ji3qm 2 years ago

Unbelievable how simply you explained this complex subject and made it easy to understand, thank you!

@vback4238 2 years ago

Thank you for making this as simple as ABC. Wow! You are great!

@deap5193 3 years ago

Viktor, thanks for following up with this wonderful piece after our last convo. Impressed, brilliant content. You are close to your fans, and one feels every second that you have IRL experience and are not just reframing other tutorials. I don't know any better YouTuber out there. Now I'm just thinking about how to get multiple zones without the big 3, haha. But yeah, there is nobody. I mean, just look at what killer bare-metal machines you get at Equinix, set up on the fly in 90 seconds or so, and what you pay. And tbh, I'll set up an equally well-featured k8s cluster on bare metal, with all the bells and whistles, faster, better, and easier to manage than on any cloud provider, and yes, with seamless k8s updates. Just without multiple zones, but yeah. Maybe I'll find a way some day, haha.

@DevOpsToolkit 3 years ago

I should have said in the video that it is not about being perfect. 100% HA is impossible, and we need to work with what we have. The goal is to get as far as makes business sense. If you do not have 3 DCs/zones, you use one. If you cannot afford 3 control planes, you can have one. If your apps do not scale, it is what it is. The important thing is to know what is what and to do the best with what we have. A good example is DigitalOcean. Clusters there cannot be HA. That does not mean that no one should use it. Instead, it means that there is a tradeoff, potentially compensated by the low price. When running on-prem, almost no one has 3 geographically close DCs with low latency. That does not mean that there are no benefits to it or that everyone should use VMs in a public cloud, but that it is always a "win some, lose some" type of calculation we need to make.

@miloslavhantl8637 1 year ago

Very nicely explained how to accomplish it and which aspects to be aware of. Thank you a lot, Viktor!

@mikegbow4203 3 years ago

Your videos are helping me a lot with really understanding Kubernetes and containerization. Thank you!

@javisartdesign 3 years ago

Great explanation! Quorum, leader election, Raft, gossip, etc.: all these concepts, protocols, and patterns must be understood by anybody who wants to build distributed systems. Other topics such as the CAP theorem, two-phase commit, and ACID transactions are the foundations of these concepts.

@romainlaisne 1 year ago

Very nice overview. Thanks!

@robarros21 3 years ago

the new k0s project is really cool for kubernetes environments

@ghadeerelsalhawy 1 year ago

Thank you so much for the explanation.

@quackycoder9565 3 years ago

Really interesting and informative! Please keep sharing your knowledge! Thanks!:)

@JackReacher1 2 years ago

1:48 Is that the same engineer I know who would say "If you don't know kubectl, what are you doing here in an eksctl video"? 😂 Btw, another one of those good videos.

@DevOpsToolkit 2 years ago

That's the one :)

@aliakbarhemmati31 2 years ago

I think we should differentiate etcd nodes from other control plane nodes. Yes, if we have two etcd nodes, we cannot call it HA. But what about the API server? Because it is stateless, I think having more than one instance is enough for it to be HA.

@aliakbarhemmati31 2 years ago

By the way, thanks for your great videos

@DevOpsToolkit 2 years ago

Agreed. HA for control plane (etcd) nodes means that there are at least three. Two nodes are not enough since the failure of one means that consensus is lost (a quorum needs more than 50% of the members). So, it's not more than one (for the control plane). It's three or more (always an odd number).
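The quorum arithmetic in this reply can be sketched in a few lines. This is only an illustration, assuming majority-based consensus as used by Raft; the function names `quorum` and `fault_tolerance` are made up for the example and are not part of etcd or Kubernetes.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a strict majority (more than 50%)."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority can still be formed."""
    return members - quorum(members)


if __name__ == "__main__":
    # Print the quorum size and tolerated failures for small cluster sizes.
    for n in range(1, 8):
        print(f"members={n} quorum={quorum(n)} tolerated_failures={fault_tolerance(n)}")
```

Two members tolerate no failures, just like one member, and four tolerate a single failure, just like three. That is why an even member count adds cost without adding tolerance.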

@anshuman2121 3 years ago

Great video. Good work. You could add an animation to show how HA on 3 servers works and, briefly, how to set up a cluster and quorum.

@DevOpsToolkit 3 years ago

I'd love to do that, but my artistic skills are very limited. I would need help with that.

@illiakailli 2 years ago

Thanks for a nice explanation! It really helps to start thinking about important things. I have a question: is it legitimate to state 'rules of thumb' with such certainty without knowing specifically which clusters we are talking about? You mentioned speed degradation when a cluster spans multiple geographical regions, but how important is this speed for each specific cluster? For example, if this is a non-sharded database cluster, then fast replication might be important, but what if it's sharded? What if it doesn't need to transfer much data across nodes and just needs to send packets to maintain quorum? My point is that it really depends on your specific app, business constraints, budget, and all that jazz. Also, by saying that you need to host the database somewhere else, you're really just shifting responsibility to some other team: they will have to solve the same problems you outlined.

@DevOpsToolkit 2 years ago

The further away servers in the same cluster are, the bigger the latency. Now, that does not mean that no one should have clusters that span multiple regions. It's always about pros and cons. If increased latency is less important than the benefits of having multi-region clusters, I say "go for it". I'm only trying to raise awareness about a potential issue, not saying that no one should go for multi-region clusters :)

@fenarRH 3 years ago

+ Notes from experience, formed from wrong expectations of k8s consumers: Etcd uses the Raft consensus algorithm to replicate requests among members and reach agreement. Consensus performance, especially commit latency, is limited by two physical constraints: network IO latency and disk IO latency. If your cp nodes are spread across multiple locations, the general approach is to keep latency
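The two constraints named in this comment can be illustrated with a deliberately simplified model. It assumes that a write is committed once a majority of members has persisted it, and that the leader writes to its own disk while followers receive, persist, and acknowledge in parallel. The function `commit_latency_ms` and all the numbers used with it are hypothetical; this is not how etcd measures or reports latency.

```python
from __future__ import annotations


def commit_latency_ms(leader_fsync_ms: float,
                      followers: list[tuple[float, float]]) -> float:
    """Estimate the time until a write is durable on a majority of members.

    followers holds one (round_trip_ms, fsync_ms) pair per follower.
    """
    members = len(followers) + 1
    # A majority is members // 2 + 1; the leader counts as one of them.
    acks_needed = members // 2
    if acks_needed == 0:
        return leader_fsync_ms
    # Each follower acknowledges after its network round trip plus its disk write.
    follower_times = sorted(rtt + fsync for rtt, fsync in followers)
    # The commit waits for the slowest of the fastest `acks_needed` followers.
    return max(leader_fsync_ms, follower_times[acks_needed - 1])
```

In this model, a three-member cluster with one distant follower is still bounded by the nearby one, while two distant followers put the full round trip on every write.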

@ajk7151 1 year ago

Doesn't only etcd require a minimum of 3? In that case, only 2 control plane nodes are required for HA if there are 3 external etcd nodes. Please clarify.

@DevOpsToolkit 1 year ago

I guess you're right if etcds are external. I always had them inside control planes, though, so three etcds equals three control plane nodes. In your setup, you'd have five nodes: 2 control plane nodes and 3 etcd nodes. Right? If that's the case, that results in more, not less, hardware (assuming that a reduction in hardware is what you're aiming for).

@ajk7151 1 year ago

@DevOpsToolkit I was thinking in terms of datacenters. Only etcd requires 3 datacenters, while control planes & workers can be managed with 2 datacenters.

@DevOpsToolkit 1 year ago

Yes, as long as those datacenters are colocated so that there is no latency between them. Also, the main question is whether you do or you don't have 3 DCs. If you do, the rest is easy.
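The datacenter question in this exchange comes down to whether the etcd members that remain after losing one datacenter still form a majority. A minimal sketch, assuming majority-based quorum; the function name and the datacenter labels are hypothetical.

```python
from __future__ import annotations


def survives_any_single_dc_loss(etcd_members_per_dc: dict[str, int]) -> bool:
    """Return True if etcd keeps a majority after any one datacenter is lost."""
    total = sum(etcd_members_per_dc.values())
    if total < 1:
        raise ValueError("at least one etcd member is required")
    needed = total // 2 + 1  # strict majority of the full membership
    # Check every datacenter in turn: remove its members and compare with the majority.
    return all(total - lost >= needed for lost in etcd_members_per_dc.values())


if __name__ == "__main__":
    print(survives_any_single_dc_loss({"dc-a": 1, "dc-b": 1, "dc-c": 1}))
    print(survives_any_single_dc_loss({"dc-a": 2, "dc-b": 1}))
```

With members split over only two datacenters, whichever one holds the majority becomes a single point of failure, and an even split loses quorum no matter which side goes down.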

@andreykaparulin9214 3 years ago

thanks from Russia : )

@kiranyadav3528 1 year ago

Hi Viktor, thanks for the detailed explanation. My requirement exactly matches your solution, but I am unable to find enough resources to help me deploy it. Can you please share a link to where your solution is practically implemented, or any supporting architecture or documentation that would help me build this setup?

@DevOpsToolkit 1 year ago

Can you be a bit more specific? Are you looking for a way to have a cluster itself in HA? If that's the case, which vendor are you using? Is it some other part of the HA story?

@Blkhole02 3 years ago

Great overview! From a purely infrastructure perspective (compute, storage, network), it's becoming increasingly hard to mess up HA, as long as you stick with the major cloud providers and you do your basic due diligence when designing it (multiple AZs, using a hosted LB, taking advantage of replication features offered by the various hosted services such as RDS or Aurora). Totally different story when running on-prem though... to this day I still get goosebumps when I see a VMware HA alert.

@DevOpsToolkit 3 years ago

That's, more or less, what I say to people claiming that they can have just as good or better setup on-prem. Do "real" HA and call me when it's done to tell me how you failed or how much it cost you.

@spy.catcher 3 years ago

nice transparent screen notes

@DevOpsToolkit 3 years ago

Thanks

@shukhrate4203 3 years ago

One comment: adding more replicas is scale-out; adding more CPU/RAM or changing the instance type is scale-up.

@DevOpsToolkit 3 years ago

You're right. Adding more replicas is scale-out, or horizontal scaling, and adding more resources is scale-up, or vertical scaling. I should have been clearer that only horizontal scaling matters for HA and that this does not exclude combining it with vertical scaling.
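One way to see why only horizontal scaling helps availability is a rough probability sketch. It assumes replicas fail independently, which does not hold when they share a node or a zone, and the function name is hypothetical.

```python
def availability_with_replicas(single_replica_availability: float, replicas: int) -> float:
    """Probability that at least one replica is up, assuming independent failures."""
    if not 0.0 <= single_replica_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("at least one replica is required")
    # The service is down only when every replica is down at the same time.
    return 1.0 - (1.0 - single_replica_availability) ** replicas
```

Under that assumption, three replicas that are each up 99% of the time give roughly 0.999999 combined, while giving a single replica more CPU or RAM leaves the replica count, and therefore this figure, unchanged.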