Following are the presentations used during the session:
https://www.glistensoft.com/GlistenPresentation-CloudComputing-20200913.pdf
https://www.glistensoft.com/GlistenPresentation-DevOps-20200913.pdf
YouTube Video on the presentation:
As application architectures grow more and more complex, implementing them in Enterprise environments becomes increasingly difficult. I was recently setting up a Kubernetes cluster in an Enterprise environment and ran into several challenges that I believe will appear in most Enterprise environments. They are listed below, along with some resolutions:
A proxy server adds complexity to how the various Docker and Kubernetes services communicate with the outside world.
Docker requires its own HTTP proxy configuration in order to communicate:
https://docs.docker.com/network/proxy/#use-environment-variables
https://docs.docker.com/config/daemon/systemd/#httphttps-proxy
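As a sketch of the daemon-side configuration described in the Docker docs linked above (the proxy host proxy.example.com:3128 and the NO_PROXY entries are placeholders for your environment), a systemd drop-in looks like this:

```ini
# /etc/systemd/system/docker.service.d/http-proxy.conf
# Placeholder proxy host and internal ranges; replace with your Enterprise values.
[Service]
Environment="HTTP_PROXY=http://proxy.example.com:3128"
Environment="HTTPS_PROXY=http://proxy.example.com:3128"
Environment="NO_PROXY=localhost,127.0.0.1,10.0.0.0/8"
```

After creating the drop-in, reload systemd and restart Docker: `sudo systemctl daemon-reload && sudo systemctl restart docker`.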
Kubernetes control-plane services such as the apiserver, controller-manager, and scheduler also need the no_proxy environment variable set so that internal-network traffic bypasses the proxy.
https://github.com/kubernetes/kubeadm/issues/324
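As an illustration (a sketch only; the host IP and CIDRs are placeholders that must match your cluster's node addresses, service network, and pod network), the variables can be exported before running kubeadm:

```shell
# Placeholder values: 10.0.0.7 is the control-plane node IP, 10.90.0.0/16 the
# service CIDR, and 10.244.0.0/16 the pod CIDR -- adjust all three for your cluster.
export NO_PROXY=localhost,127.0.0.1,10.0.0.7,10.90.0.0/16,10.244.0.0/16,.svc,.cluster.local
export no_proxy=$NO_PROXY
```

Both the upper- and lowercase forms are exported because different tools read different spellings.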
Typical errors you will see in the system logs because of proxied communication look like this:
{"log":"E0329 17:47:54.136036 1 leaderelection.go:224] error retrieving resource lock kube-system/kube-controller-manager: Get https://10.0.0.7:6443/api/v1/namespaces/kube-system/endpoints/kube-controller-manager: Gateway Timeout\n","stream":"stderr","time":"2018-03-29T17:47:54.136283562Z"}
We initially configured the Kubernetes cluster with Flannel in both our local and the Enterprise environment, but the client wanted to use Weave networking. In our local environment, which has no proxy, we were able to set up the Weave network; when we deployed the same configuration in the Enterprise environment, however, the DNS service did not come up cleanly. The default DNS pod was unable to reach the apiserver's service network (10.90.x.x) and always timed out, which added to the instability of pod communication.
There may well be a solution for Weave, but after a week of troubleshooting we were unable to find it, so we switched back to Flannel networking.
In an Enterprise environment, some security hardening is already in place, and it can cause issues during deployment of the cluster. For example, IPv6 had already been disabled on the servers as part of the Enterprise build, but the kubeadm deployment expects IPv6 to be enabled and tries to disable it itself; if it is already disabled, the deployment fails.
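A quick way to check whether IPv6 has been disabled on a node, and to re-enable it temporarily before running kubeadm (a sketch; a permanent change would go through your Enterprise configuration process), is:

```shell
# Prints 1 if IPv6 is disabled, 0 if enabled.
sysctl net.ipv6.conf.all.disable_ipv6
# Re-enable IPv6 until the next reboot:
sudo sysctl -w net.ipv6.conf.all.disable_ipv6=0
sudo sysctl -w net.ipv6.conf.default.disable_ipv6=0
```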
Another condition involved AppArmor. In some installations AppArmor is enabled by default, which prevents the Docker service from functioning properly, so it needs to be disabled.
Due to the complexity of the architecture and the number of diverse services involved, a significant number of ports must be opened for internal service communication. Keeping a list of these ports, tracking them, and being able to troubleshoot them is always a challenge.
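For reference, the default port requirements from the Kubernetes documentation can be opened like this (a sketch, assuming ufw as the host firewall and Flannel's VXLAN backend; your distribution may use firewalld or iptables instead):

```shell
# Control-plane node (Kubernetes defaults):
sudo ufw allow 6443/tcp        # kube-apiserver
sudo ufw allow 2379:2380/tcp   # etcd server and client API
sudo ufw allow 10250/tcp       # kubelet API
sudo ufw allow 10251/tcp       # kube-scheduler
sudo ufw allow 10252/tcp       # kube-controller-manager
# Worker nodes:
sudo ufw allow 10250/tcp       # kubelet API
sudo ufw allow 30000:32767/tcp # NodePort services
# Flannel VXLAN backend:
sudo ufw allow 8472/udp
```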
The baud rate used for SOL access is the volatile bit rate (in kbps). SOL requires serial-console redirection to be enabled in the BIOS, and the serial communication speed must match the configured baud rate.
Configuring the console login process
$ dmesg |grep tty
[ 0.000000] console [tty0] enabled
[ 1.073325] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 1.094732] 00:03: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 1.115064] 0000:00:16.3: ttyS1 at I/O 0x9080 (irq = 17) is a 16550A
ttyS1 is the serial port connected to the BMC. Create an upstart job for it (on Ubuntu releases using upstart, this goes in /etc/init/ttyS1.conf):
# ttyS1 - getty
#
# This service maintains a getty on ttyS1 from the point the system is
# started until it is shut down again.
start on stopped rc or RUNLEVEL=[2345]
stop on runlevel [!2345]
respawn
exec /sbin/getty -L 57600 ttyS1 vt102
$ sudo start ttyS1
$ sudo /sbin/telinit q
# ipmitool -I lanplus -H <System's BMC IP address> -U <userid> -P <password> sol activate
You should now see a login prompt and be able to log in as a system user.
# If you change this file, run ‘update-grub’ afterwards to update
# /boot/grub/grub.cfg.
GRUB_DEFAULT=0
GRUB_TIMEOUT=1
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX="console=tty0 console=ttyS1,57600n8"
# Uncomment to disable graphical terminal (grub-pc only). The unit number
# indicates the serial communication port: COM1 = 0, COM2 = 1, COM3 = 2, etc.
GRUB_TERMINAL="serial console"
GRUB_SERIAL_COMMAND="serial --speed=57600 --unit=1 --word=8 --parity=no --stop=1"
# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command 'vbeinfo'
#GRUB_GFXMODE=640x480
# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true
Note: Comment out GRUB_HIDDEN_TIMEOUT=0 and GRUB_HIDDEN_TIMEOUT_QUIET=true if they are present in the original /etc/default/grub.
Additionally, if the OS was installed via MAAS, then the above settings need to be configured in /etc/default/grub.d/50-curtin-settings.cfg instead.
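A minimal sketch of what that curtin drop-in might contain (the values simply mirror the serial-console settings above; adjust the baud rate and unit number to your hardware):

```ini
# /etc/default/grub.d/50-curtin-settings.cfg
GRUB_CMDLINE_LINUX="console=tty0 console=ttyS1,57600n8"
GRUB_TERMINAL="serial console"
GRUB_SERIAL_COMMAND="serial --speed=57600 --unit=1 --word=8 --parity=no --stop=1"
```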
# update-grub
# ipmitool -I lanplus -H <IPMI-IP> -U <username> -P <password> -C3 sol activate
# ipmitool -I lanplus -H <IPMI-IP> -U <username> -P <password> -C3 sol deactivate
# ipmitool -I lanplus -H <IPMI-IP> -U <username> -P <password> -C3 sol info 1
# ipmitool -I lanplus -H <IPMI-IP> -U <username> -P <password> -C3 sol set volatile-bit-rate <value>