Saturday, April 04, 2015
Weekly Kubernetes Community Hangout Notes - April 3 2015
Every week the Kubernetes contributing community meets virtually over Google Hangouts. We want anyone who’s interested to know what’s discussed in this forum.
Agenda:
- Quinton - Cluster federation
- Satnam - Performance benchmarking update
Notes from meeting:
- Quinton - Cluster federation
- Ideas floating around after the meetup in SF
- Please read and comment
- Not targeted for 1.0, but putting a doc together to show the roadmap
- Can be built outside of Kubernetes
- API to control things across multiple clusters, including some logic:
- Auth(n)(z) 
- Scheduling Policies 
- … 
- Different reasons for cluster federation
- Zone (un)availability: resilient to zone failures
- Hybrid cloud: some workloads in the cloud, some on-prem, for various reasons
- Avoiding cloud provider lock-in, for various reasons
- “Cloudbursting” - automatic overflow into the cloud 
- Hard problems 
- Location affinity: how close do pods need to be?
- Workload coupling
- Absolute location (e.g., EU data needs to stay in the EU)
 
- Cross-cluster service discovery: how do services/DNS work across clusters?
 
- Cross-cluster workload migration: how do you move an application piece by piece across clusters?
 
- Cross-cluster scheduling: how do you know enough about the candidate clusters to decide where to schedule?
- Possibly use a cost function to achieve affinities with minimal complexity
- Cost can also determine where to schedule (underused clusters are cheaper than overused clusters); see the sketch below
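As a rough illustration of that idea, here is a minimal sketch of such a cost function, assuming hypothetical Cluster and Workload types and a crude weighting of utilization against an absolute-location constraint (none of these names come from an actual Kubernetes API):

```go
package main

import "fmt"

// Cluster and Workload are hypothetical types for illustration only.
type Cluster struct {
	Name        string
	Region      string
	Utilization float64 // fraction of capacity in use, 0.0–1.0
}

type Workload struct {
	Name           string
	RequiredRegion string // "" means no absolute location constraint
}

// cost returns a lower score for clusters that are a better fit:
// underused clusters are "cheaper" than overused ones, and violating
// an absolute location constraint makes a cluster effectively unusable.
func cost(c Cluster, w Workload) float64 {
	if w.RequiredRegion != "" && c.Region != w.RequiredRegion {
		return 1e9 // e.g. EU data must stay in the EU
	}
	return c.Utilization
}

// pickCluster chooses the cheapest cluster for a workload.
func pickCluster(clusters []Cluster, w Workload) Cluster {
	best := clusters[0]
	for _, c := range clusters[1:] {
		if cost(c, w) < cost(best, w) {
			best = c
		}
	}
	return best
}

func main() {
	clusters := []Cluster{
		{Name: "us-east", Region: "us", Utilization: 0.9},
		{Name: "eu-west", Region: "eu", Utilization: 0.4},
	}
	w := Workload{Name: "billing", RequiredRegion: "eu"}
	fmt.Println(pickCluster(clusters, w).Name) // prints "eu-west"
}
```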
 
- Implicit requirements
- Cross-cluster integration shouldn’t create cross-cluster failure modes; each cluster should remain independently usable in a disaster situation where Ubernetes dies.
 
- Unified visibility: we want unified monitoring, alerting, logging, introspection, UX, etc.
 
- Unified quota and identity management: we want the user database and auth(n)/(z) in a single place
 
- Important to note: most causes of software failure are not infrastructure failures, for example:
- Botched software upgrades 
- Botched config upgrades 
- Botched key distribution 
- Overload 
- Failed external dependencies 
- Discussion:
- Where do you draw the “ubernetes” line? Likely at the availability zone, but it could be at the rack or the region
 
- Important not to pigeonhole the design and prevent other uses
- Satnam - Soak Test 
- Want to measure things that run for a long time to make sure that the cluster is stable over time. Performance doesn’t degrade, no memory leaks, etc. 
- github.com/GoogleCloudPlatform/kubernetes/test/soak/… 
- Single binary, puts lots of pods on each node, and queries each pod to make sure that it is running. 
- Pod creation has gotten much, much faster (even in the past week), which speeds the test up.
- Once the pods are up and running, we hit them via the proxy. The decision to go through the proxy was deliberate, so that we also exercise the Kubernetes apiserver (a rough sketch of this pattern follows below).
- Code is already checked in. 
- Pin pods to each node, exercise every pod, make sure that you get a response for each node. 
- Single binary, run forever. 
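For illustration, here is a minimal sketch of the pattern described above (list the soak pods, then hit each one through the apiserver proxy), written against the modern client-go library rather than the actual code under test/soak; the namespace, label selector, and port are assumptions:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig (path is an assumption).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	ctx := context.Background()
	ns := "soak-test" // hypothetical namespace for the soak pods

	// List the pods the soak test created (label selector is made up).
	pods, err := client.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{
		LabelSelector: "app=serve-hostname",
	})
	if err != nil {
		panic(err)
	}

	// Hit each pod through the apiserver proxy, so every request also
	// exercises the Kubernetes apiserver, as described in the notes.
	for _, pod := range pods.Items {
		body, err := client.CoreV1().RESTClient().
			Get().
			Namespace(ns).
			Resource("pods").
			Name(pod.Name + ":80"). // proxy to port 80 on the pod
			SubResource("proxy").
			DoRaw(ctx)
		if err != nil {
			fmt.Printf("pod %s: error: %v\n", pod.Name, err)
			continue
		}
		fmt.Printf("pod %s responded: %s\n", pod.Name, body)
	}
}
```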
- Brian - v1beta3 is enabled by default; v1beta1 and v1beta2 are deprecated and will be turned off in June. Upgrading existing clusters, etc. should still work.