Quote
Hi @all, I'm excited to bring kubewekend back this weekend. There is more stuff to tackle, but not all of it fits into one weekend. This time we will learn and figure out more about storage inside a Kubernetes cluster, how you can provide it, and Ceph - an open-source project with really cool ideas behind it for us to digest. Without further delay, welcome back to kubewekend.
Storage inside Kubewekend
Let's start from networking inside Kubernetes: as you remember, cilium is one of the components inside the cluster, and the community calls that kind of component a CNI (Container Network Interface). Storage inside Kubernetes follows the same concept, and it requires you to install a driver called a CSI (Container Storage Interface).
You can check the article below to understand more about why kubernetes introduced this concept to the community.
Info
The Container Storage Interface (CSI) is a standard for exposing arbitrary block and file storage systems to containerized workloads on Container Orchestration Systems (COs) like Kubernetes. Using CSI third-party storage providers can write and deploy plugins exposing new storage systems in Kubernetes without ever having to touch the core Kubernetes code.
Note
In the summary above, you can see the real problem inside kubernetes before CSI was released. Technically, a plugin/volume mechanism already existed inside kubernetes, but it ran into challenges because it lived inside the kubernetes core - whenever an upgrade, bug, or issue came up, the community had to wait for the next release of kubernetes to get it under control.
That is why integrating third parties into the kubernetes core is a real challenge: if you know cloud services, you know there are many storage solutions and techniques, and it simply isn't feasible to put all of them inside kubernetes without making the cluster explode. So it is really impressive how the community created an alternative mechanism to reduce the problem. And now we have CSI.
About CSI Concept
Really inspiring: this is a tough topic and contains a lot of details you may never care about, but I want to untangle some of the difficult parts, starting with an overview of CSI, and I think it will surprise you.
You can explore it with me through the community docs and the GitHub repository.
As you can see, CSI is built on top of gRPC from Google; kubernetes itself started as an open-source project at Google, which is probably why gRPC is used here - just my guess 🙂
If I spread out the whole CSI architecture, it would really stress us out 🙂. Honestly, at this moment I don't understand it deeply enough to pass all of it on to you. But following the Specification, the concept is described with protobuf - one of the pieces that belongs to the gRPC protocol (you can find more information about the protocol in my Compile gRPC for 1st time blog). CSI uses the gRPC protocol and protobuf to hold the conversation between the CO (Container Orchestrator - Kubernetes is one of them) and the Plugin (Controller - Node).
This split of connection and responsibility is what makes CSI work in a stable way. We are end users of it, so if you are wondering how to develop your own CSI driver, I think that is a long story LOL, but we will learn about it.
Following the article about CSI, you can actually create your own CSI driver; if I'm not wrong, you need to implement the functions or interfaces that let kubernetes, or whatever orchestrator, interact with your storage. Read the information below.
Info
The kubernetes-csi site details how to develop, deploy, and test a CSI driver on Kubernetes. In general, CSI Drivers should be deployed on Kubernetes along with the following sidecar (helper) containers:
- external-attacher: Watches Kubernetes VolumeAttachment objects and triggers ControllerPublish and ControllerUnpublish operations against a CSI endpoint.
- external-provisioner: Watches Kubernetes PersistentVolumeClaim objects and triggers CreateVolume and DeleteVolume operations against a CSI endpoint.
- node-driver-registrar: Registers the CSI driver with kubelet using the Kubelet device plugin mechanism.
- cluster-driver-registrar (Alpha): Registers a CSI Driver with the Kubernetes cluster by creating a CSIDriver object which enables the driver to customize how Kubernetes interacts with it.
- external-snapshotter (Alpha): Watches Kubernetes VolumeSnapshot CRD objects and triggers CreateSnapshot and DeleteSnapshot operations against a CSI endpoint.
- livenessprobe: May be included in a CSI plugin pod to enable the Kubernetes Liveness Probe mechanism.

Storage vendors can build Kubernetes deployments for their plugins using these components, while leaving their CSI driver completely unaware of Kubernetes.
Quote
I think that kind of work gets quite difficult, but if you want to, you can do it yourself; it just isn't covered in this series and session. We will stop with CSI here and come back later when we have actually learned more about it.
Now let's head to the mechanism of CSI inside the kubewekend cluster.
Development and Deployment of CSI
Info
Kubernetes users interested in how to deploy or manage an existing CSI driver on Kubernetes should look at the documentation provided by the author of the CSI driver.
In the current situation, the minimal requirements are around how kubernetes components find and communicate with a CSI driver.
Following the documentation, they give us a perspective on how the communication flows between the components.
Note
Because these requirements are minimally prescriptive, CSI driver developers are free to implement and deploy their drivers as they see fit.
And you can do that with some recommended mechanisms, like:
- Kubernetes CSI Sidecar Containers
- Kubernetes CSI objects
- CSI Driver Testing tools
How we can use CSI inside kubewekend
Quote
Actually, this question is the main reason you clicked into my blog: the mechanism isn't really simple or easy to get hands-on with, and that is the best part when you want to contribute your own solution inside kubernetes.
In my opinion, the first step is just to figure out which existing solutions can help us inject or implement CSI - from cloud services, from object storage, and from the techniques they already provide us.
You can find one of them in the List of Kubernetes CSI Drivers, and you need to make sure to install exactly what the driver provider requires so that your Kubernetes cluster can interact with storage through that driver.
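As a quick sanity check (a sketch, assuming you already have kubectl access to the cluster), you can list the CSI drivers and node registrations Kubernetes knows about:

```bash
# List the CSI drivers registered in the cluster and their per-node registrations
kubectl get csidrivers
kubectl get csinodes

# Many drivers also run controller/node pods you can inspect, often in kube-system
kubectl get pods --all-namespaces | grep -i csi
```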
Assuming a CSI storage plugin is already deployed on a Kubernetes cluster, users can use CSI volumes through the familiar Kubernetes storage API objects: PersistentVolumeClaims, PersistentVolumes, and StorageClasses. Documented here.
Source: OVH Cloud
Info
Following this concept, you can picture the connection between these objects via the image. Starting from a Pod (the unit of kubernetes), it uses a PVC (PersistentVolumeClaim) to connect to and claim data from a PV (PersistentVolume), which represents the connection to the actual storage (file, object, or more). kubernetes also provides the SC (StorageClass) to help us dynamically create PVs when we don't want to implement the whole process ourselves, and that is how kubernetes attaches and mounts the volume to the pod.
You can explore more details about storage in Kubernetes and the methods you can get hands-on with below.
Dynamic Provisioning
Info
Enable automatic creation/deletion of volumes for CSI Storage plugins that support dynamic provisioning by creating a StorageClass pointing to the CSI plugin.
Warning
When you work with storage like Azure Blob or Azure Files, you can run into problems where provisioning gets stuck while talking to your cloud service; that is why you need to provide parameter keys (csiProvisionerSecretName, csiProvisionerSecretNamespace, etc.) to help the StorageClass connect to your cloud provider.
Dynamic provisioning is triggered by the creation of a PersistentVolumeClaim object. The following PersistentVolumeClaim, for example, triggers dynamic provisioning using the StorageClass above.
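Since the referenced manifests are not shown here, here is a minimal sketch of both; the provisioner name csi.example.com is a placeholder for whatever CSI driver you actually installed:

```bash
kubectl apply -f - <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-csi-sc
provisioner: csi.example.com        # placeholder: the CSI driver that will create volumes
reclaimPolicy: Delete
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: example-csi-sc   # points at the StorageClass above
  resources:
    requests:
      storage: 1Gi
EOF
```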
It will automatically create a new volume and map it to a PV; the PV is then bound to the PVC and switches to the ready (Bound) state. You can mark one StorageClass as default, which gives you a default volume type whenever a PVC does not specify one.
Pre-Provisioned Volumes
Info
You can always expose a pre-existing volume in Kubernetes by manually creating a PersistentVolume object to represent the existing volume
We will dive deeper into this in the next part; I just mention the method here to help you see how we can provision and implement volumes in Kubernetes.
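As a rough sketch (the driver name and volumeHandle below are placeholders, not from the original post), a pre-provisioned volume pairs a manually created PV with a PVC that binds to it explicitly:

```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: preprovisioned-pv
spec:
  capacity:
    storage: 5Gi
  accessModes: ["ReadWriteOnce"]
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: csi.example.com            # placeholder CSI driver name
    volumeHandle: existing-volume-id   # ID of the volume that already exists in the backend
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: preprovisioned-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ""                 # empty class so it binds to the pre-created PV
  volumeName: preprovisioned-pv        # bind explicitly to the PV above
  resources:
    requests:
      storage: 5Gi
EOF
```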
Attaching and Mounting
You can use a volume inside your pod by referencing the PVC in the pod's yaml definition. This is what attaches and mounts your PV to a path inside the pod (see the sketch after the note below).
Info
And when your pod calls the PVC, Kubernetes will trigger the appropriate operations against the external CSI plugin (ControllerPublishVolume, NodeStageVolume, NodePublishVolume, etc.) to ensure the specified volume is attached, mounted, and ready to use by the containers in the pod.
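A minimal sketch of that attach and mount flow from the pod side, reusing the hypothetical example-pvc claim from above:

```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html   # where the volume appears inside the pod
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: example-pvc               # the PVC defined in the earlier sketch
EOF
```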
Conclusion
Quote
It is a really tough topic; when first meeting this concept, anyone working around Kubernetes will get stuck for a few days without figuring out what the problem is. But once you understand and can picture each step of provisioning and implementing volumes in Kubernetes, the experience becomes really cool and helps you get hands-on with many of the concepts Kubernetes provides.
Check these out when you want to learn more about PVC, PV and StorageClass; I think they can help you:
- Setup MySQL with Wordpress in k8s - Easy migrate or not !!
- Kubernetes Volumes explained
- Official Kubernetes about Storage
- Azure Blob Storage CSI driver for Kubernetes
- Create and use a volume with Azure Blob storage in Azure Kubernetes Service (AKS)
Practice with Volume in kubewekend
Question
The target of the practice in this session is to supply the kubewekend cluster with one of the biggest platforms that comes to mind when you think about object storage, with huge techniques behind it: Ceph. We will check it out and see what we need to do to provide it for the cluster.
About Ceph?
Info
Ceph delivers object, block, and file storage in one unified system. Ceph is highly reliable, easy to manage, and free. The power of Ceph can transform your company's IT infrastructure and your ability to manage vast amounts of data.
You can explore Ceph with me through the Intro to Ceph article; as you can see, Ceph provides three types of storage and you can manage all of them with Ceph alone.
Follow these links to learn what you want about Ceph, and to see which problems Ceph can solve for you.
We will discover Ceph in depth in a dedicated blog post, but in this session I just want to help you use Ceph (or any CSI) as an example, so you can imagine how to set one up for your Kubernetes cluster.
Methodology to install Ceph in Kubewekend
You can check the List of Kubernetes CSI Drivers; with Ceph you have two driver types to install inside your cluster.
Warning
Sad to tell you, the cephfs in-tree volume plugin will be removed when kubernetes reaches version 1.31, and it is already deprecated in version 1.28. Read more at cephfs. Buttttt, it just means you are recommended to switch from the volume plugin to the CSI driver, and that is exactly what we want to do in this session 🙂.
Now, the first thing you need to do is install ceph-cli, which works as a daemon on your host and helps you provide the CSI driver for your cluster (the same idea as with cilium).
When you create ceph inside the cluster, it doesn't simply install a single daemonset; it installs a bunch of components for the cluster, including:
- Monitors: A Ceph Monitor (ceph-mon) maintains maps of the cluster state, including the monitor map, manager map, the OSD map, the MDS map, and the CRUSH map.
- Managers: A Ceph Manager daemon (ceph-mgr) is responsible for keeping track of runtime metrics and the current state of the Ceph cluster, including storage utilization, current performance metrics, and system load.
- Ceph OSDs: An Object Storage Daemon (Ceph OSD, ceph-osd) stores data, handles data replication, recovery, rebalancing, and provides some monitoring information to Ceph Monitors and Managers by checking other Ceph OSD Daemons for a heartbeat.
- MDSs: A Ceph Metadata Server (MDS, ceph-mds) stores metadata for the Ceph File System.
Info
Ceph stores data as objects within logical storage pools. Using the CRUSH algorithm, Ceph calculates which placement group (PG) should contain the object, and which OSD should store the placement group. The CRUSH algorithm enables the Ceph Storage Cluster to scale, rebalance, and recover dynamically.
One question remains: how can we install ceph for our cluster? That is a really interesting question. We have multiple ways to handle it:
- Using helm
- Using rook
- Using rbd-manifest and fs-manifest. Example: Medium - Deploy Ceph, integrate with Kubernetes
Technically, according to the ceph installation documentation, using rook is the recommended way to run Ceph in Kubernetes or to connect an existing Ceph storage cluster to Kubernetes. And I really like that, because you can expect more problems whenever you miss a step.
Rook is an open source cloud-native storage orchestrator, providing the platform, framework, and support for Ceph storage to natively integrate with cloud-native environments.
Info
Rook automates deployment and management of Ceph to provide self-managing, self-scaling, and self-healing storage services. The Rook operator does this by building on Kubernetes resources to deploy, configure, provision, scale, upgrade, and monitor Ceph.
Rook provides us with a bunch of CRDs (Custom Resource Definitions) in Kubernetes (we will relate to those in another session) that act as interfaces or APIs, making it easy to implement and interact with Ceph through Rook. I'm just excited to tell you about it; you can learn more about rook and ceph through the links below.
Prerequisites
I will follow the Quickstart to get hands-on with ceph and rook.
First of all, it requires me to check the kubernetes configuration, so let's go through that.
Kubernetes version
Info
Kubernetes versions v1.26 through v1.31 are supported.
You can check the kubernetes version with kubectl get node. Currently, our kubewekend cluster runs v1.28.9, which is enough for doing stuff with rook and ceph.
CPU Architecture
Info
Architectures supported are amd64 / x86_64 and arm64.
You can validate that with the uname command on Linux to check whether your host meets the condition.
Ceph Prerequisites
To configure the Ceph storage cluster, at least one of these local storage types is required:
- Raw devices (no partitions or formatted filesystems)
- Raw partitions (no formatted filesystem)
- LVM Logical Volumes (no formatted filesystem)
- Persistent Volumes available from a storage class in block mode
Currently, if you look for raw devices, you won't find any for ceph, and that will make this deployment fail.
No raw devices exist, poor me 😢. But I will do some work to provide one; head to the Vagrantfile to add the additional configuration for your host.
Based on the Vagrant - Basic Usage Documentation, you can do this with vagrant by adding some configuration.
And run the reload command with Vagrant.
After you ssh into the machine again, run lsblk -f and you will see the raw device has been added.
Check again with lsblk to make sure the volume capacity matches your expectation (see the sketch below).
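For reference, a sketch of those checks, assuming the extra Vagrant disk shows up as /dev/sdb (yours may differ):

```bash
# A raw device has empty FSTYPE and MOUNTPOINT columns in this output
lsblk -f

# Confirm the capacity of the extra disk matches what you declared in the Vagrantfile
lsblk /dev/sdb
```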
And now you are ready to continue.
Kernel
I think that with the kernel upgrade from the last session, our cluster will be alright. To make sure there is no problem, you can run a couple of commands from the documentation to check the rbd module; if no error is returned, you should be okay 🙂
One more thing: CephFS requires a minimum kernel version of 4.17, but we are well past that; the newest kernel on Ubuntu 20.04 can be 5.15.0-117-generic. Both checks are sketched below.
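A sketch of both checks, run on the node (the exact commands in the Rook prerequisites may differ slightly):

```bash
# Make sure the rbd kernel module can be loaded; no output means no error
sudo modprobe rbd
lsmod | grep rbd

# Check the running kernel version (CephFS needs 4.17 or newer)
uname -r
```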
Now we can move on to installing Ceph; let's see what problems occur along the way 🙂
Install Ceph with Rook
Following the quickstart, a simple Rook cluster is created for Kubernetes with the following kubectl commands and example manifests.
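At the time of writing, the quickstart boils down to commands like these (branch and paths may differ between Rook releases; treat this as a sketch):

```bash
# Get the example manifests from the Rook repository
git clone --single-branch --branch master https://github.com/rook/rook.git
cd rook/deploy/examples

# Install the CRDs, common resources and the Rook operator
kubectl create -f crds.yaml -f common.yaml -f operator.yaml

# Create the Ceph cluster itself
kubectl create -f cluster.yaml
```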
Warning
Be careful and follow step by step.
Remember to make sure you have at least 2 or 3 nodes when configuring ceph (to prevent issues that are hard to understand), so you need to add more nodes to your cluster for this session's practice. In my situation, you just need to modify kind-config.yaml to get that working (sketched below).
Then recreate the cluster with the kind create command.
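A sketch of what that could look like (the node layout and the cluster name kubewekend are assumptions; adapt them to your existing kind-config.yaml):

```bash
# Add extra worker nodes to the kind configuration
cat > kind-config.yaml <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
EOF

# Recreate the cluster with the new configuration
kind delete cluster --name kubewekend
kind create cluster --name kubewekend --config kind-config.yaml
```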
Re-run the rook commands to install ceph again; you will also need some tools from the ceph repository to manage the ceph cluster.
But hold up, there is actually a problem, or at least a difference, between cluster.yaml and cluster-test.yaml. You should just use the test variant as an alternative, because the full cluster version needs more underneath; that is why running the ceph status command returned an error. Read more at Medium - Implementing ROOK Ceph Storage solution on Virtual kubernetes clusters.
With that, you can simply use kind-config.yaml with only a control-plane node.
Info
Actually, cluster-test.yaml is a smaller version that you can use for testing on a small cluster. The error doesn't prove anything by itself: you run ceph inside the cluster, but the ceph command wasn't talking to the cluster, so it couldn't give you a status.
Thinking about the situation, we need to point the ceph command at the right cluster before we can use it to check status inside the kubewekend cluster. So again, apply cluster-test.yaml instead of cluster.yaml.
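Roughly (a sketch, from the same Rook examples directory):

```bash
# If the full cluster.yaml was applied earlier, remove it first, then apply the lighter test variant
kubectl delete -f cluster.yaml --ignore-not-found
kubectl create -f cluster-test.yaml
```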
Wait at least 5 minutes (it can be longer or shorter), and afterwards you will see the resources appear inside the rook-ceph namespace.
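You can watch it come up with something like:

```bash
# Watch the Rook operator bring up the Ceph daemons (mon, mgr, osd, csi plugins, ...)
kubectl -n rook-ceph get pods --watch
```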
And done, now you have an interface inside kubernetes for the ceph cluster, which you can retrieve with:
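For example (a sketch; the resource comes from the cluster-test.yaml manifest you applied):

```bash
# The CephCluster custom resource is the interface Rook exposes for the cluster state
kubectl -n rook-ceph get cephcluster
```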
Now we create the StorageClasses so the ceph cluster can be used for both rbd and cephfs.
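In the Rook examples directory those are typically created like this (file names and paths may differ between releases; treat this as a sketch):

```bash
# RBD (block) storage: creates the replicapool CephBlockPool and the rook-ceph-block StorageClass
kubectl create -f csi/rbd/storageclass.yaml

# CephFS (shared filesystem): the filesystem itself plus its StorageClass
kubectl create -f filesystem.yaml
kubectl create -f csi/cephfs/storageclass.yaml

kubectl get storageclass
```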
Lastly, we will change the default StorageClass configuration so that ceph-block handles it.
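A sketch of that switch, assuming the kind-provided default class is named standard:

```bash
# Make rook-ceph-block the default StorageClass and unset the old default
kubectl patch storageclass standard \
  -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
kubectl patch storageclass rook-ceph-block \
  -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

# The default class is marked with "(default)" in this output
kubectl get storageclass
```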
You will need to install the ceph toolbox to help you monitor the ceph cluster status.
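The toolbox ships as another example manifest; a sketch of deploying it and opening a shell inside (the deployment name is taken from the Rook examples and may change between releases):

```bash
# Deploy the toolbox pod and wait for it to be ready
kubectl create -f toolbox.yaml
kubectl -n rook-ceph rollout status deploy/rook-ceph-tools

# Open a shell inside the toolbox; the ceph/rados CLIs are preconfigured there
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
```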
All available tools in the toolbox are ready for your troubleshooting needs.
Example:
- ceph status
- ceph osd status
- ceph df
- rados df
Actually insane: when you exec into the toolbox pod container, you will run into some things that cause trouble for the mount in the next part.
And use the ceph command to check the status.
If you apply the StorageClass like above, it will create a replicapool that makes your ceph cluster unhealthy, referring to an empty pool and pushing your cluster towards data loss. It was a real struggle, but I figured it out, so I will try to fix it.
First of all, we reformat the disk back to raw, because the disk is still in use by the previous ceph cluster (a sketch of the wipe commands is below).
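A sketch of cleaning the disk back to a raw state, assuming the extra disk is /dev/sdb (double-check the device name before wiping; this is destructive):

```bash
DISK=/dev/sdb   # assumption: the extra Vagrant disk

# Wipe partition tables, leftover Ceph metadata and filesystem signatures
sudo sgdisk --zap-all "$DISK"
sudo dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
sudo wipefs -a "$DISK"
```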
Now reboot your machine
After that, connect to the machine again, but remember you need to recreate the cluster to avoid things that are difficult to control. Now try creating everything again. Wait at least 5 minutes and check whether the ceph cluster is working or not.
If the problem persists and the cluster still shows HEALTH_WARN, you can exec into the toolbox and run the ceph commands.
It means your cluster is not working with the raw disk (no OSD exists), therefore you must go back to the Vagrantfile configuration: comment out the extra_disk so the disk becomes UNATTACHED, reload, then uncomment it and reload again.
And now rook-ceph-osd-0-xxxxx-xxx will be provisioned for you; it was really tough but effective.
Validate the health status of the ceph cluster again with the toolbox.
Great, now you need to modify (or delete and recreate) the pool behind the RBD StorageClass so it does not create the replicapool with a replica count of 3 (sketch below).
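One way to do that (a sketch based on the Rook CephBlockPool CRD; only suitable for a test cluster) is to shrink the pool's replica count so a single OSD can satisfy it:

```bash
kubectl apply -f - <<EOF
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 1                        # a single replica fits a single-OSD test cluster
    requireSafeReplicaSize: false  # explicitly allow the unsafe size of 1
EOF
```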
Now apply the rbd and cephfs StorageClasses again.
Success
Like above, you now have two StorageClasses, one for rbd and one for cephfs.
Try to mount and attach the volume with RBD
I just picked up the example from Rook that creates WordPress and MySQL with PVCs, so let's try to create one of them and see the result.
Through this manifest, you can see it will provide the following inside the namespace:
- A MySQL Deployment that uses a PVC
- A PVC using rook-ceph-block with a 5GB request
- A Service to expose port 3306 of MySQL
And now we apply it with the kubectl apply -f command.
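A sketch of applying the Rook example and verifying the claim gets bound (the file name comes from the Rook examples directory and may differ by release):

```bash
# Apply the MySQL example manifest from rook/deploy/examples
kubectl apply -f mysql.yaml

# The PVC should move to Bound once the RBD image is provisioned
kubectl get pvc
kubectl get pods
```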
The PVC for MySQL is now created, and the result shows it was created successfully.
Conclusion
Success
A really tough week; I stayed up late with you to complete this session. I learned a lot from it, and getting hands-on with volumes was a new experience for me. CSI is one of the things that makes Kubernetes pleasant to work with, delivering more solutions for the community and creating opportunities for development and deployment to reach the next level with these techniques. As for Ceph and Rook, I can say they are really hard to control, learn, and get hands-on with, given the huge architecture behind them, but like I said, kubewekend is a chance for myself and for my community to learn something new. I hope you enjoy the content.
Quote
A couple of weeks have passed while I tried to balance my workload, so I have kept you waiting for a long time. But this week is really cool, and I can be back with you to learn and do something great inside Kubewekend. I appreciate my followers; our community is what tells me to improve every day and bring you interesting content. So stay safe, learn something new, and I will see you next week, maybe with Kubewekend. Bye bye 🙂