Hem's Blog: gcp

Showing posts with label gcp. Show all posts

Saturday, 28 January 2023

Kubernetes in a nutshell

In my previous blog, We learned how to deploy an API to the GCP K8s engine, Today we will learn about Kubernetes as an Overview.

Google created Kubernetes (K8s) as part of their internal infrastructure to manage the containerized applications running on their infrastructure.

Kubernetes is an open-source platform designed to automate the deployment, scaling, and management of containerized applications. It's been around for a while now and has become the standard for managing containerized applications in production environments.

What is Kubernetes?

Kubernetes is an orchestration system that automates the deployment, scaling, and management of containers. It provides a unified platform for deploying and managing containers, making it easier for organizations to run and scale their applications. Kubernetes is highly extensible, allowing organizations to customize it to meet their specific needs.

Kubernetes (K8s) is an open-source container orchestration system for automating the deployment, scaling, and management of containerized applications. It works by using a master node to control and manage a group of worker nodes.

How does Kubernetes work?

Kubernetes works by dividing an application into smaller units, called containers. Each container holds a piece of an application, such as a microservice. These containers can be deployed and managed independently, making it easier to scale and manage applications.

Kubernetes uses a declarative approach to manage containers, meaning you define what you want your application to look like and Kubernetes takes care of the rest. This makes it easy to manage complex applications, as you don't need to worry about the details of how containers are deployed and managed..

The main components of a Kubernetes cluster are:

The API server: The entry point for all administrative tasks. It exposes the Kubernetes API and communicates with the other components.
etcd: A distributed key-value store that stores the configuration data of the cluster.
The controller manager: Responsible for maintaining the desired state of the system by making changes to the actual state of the system as necessary.
The kubelet: Runs on each worker node and communicates with the API server. It is responsible for starting and stopping containers on the node.
The kube-proxy: Runs on each worker node and provides network connectivity to the containers.

Kubernetes uses a declarative approach, where the user defines the desired state of the system in the form of manifests, and the system ensures that the actual state of the system matches the desired state.

Why use Kubernetes?

Kubernetes provides a number of benefits over traditional approaches to managing containers. Some of the most significant benefits include:

Scalability: Kubernetes makes it easy to scale your application, either by adding more containers or by increasing the resources assigned to existing containers.
Resilience: Kubernetes automatically monitors containers and restarts them if they fail, ensuring that your application is always available.
Portability: Kubernetes can run on a variety of cloud platforms, as well as on-premises. This makes it easier to move your application from one platform to another, reducing vendor lock-in.
Integration: Kubernetes integrates with a variety of tools and platforms, making it easier to integrate your application with other systems.

Kubernetes Architecture

Kubernetes is based on a master-worker architecture. The master node is responsible for managing the cluster, while worker nodes run the containers. The master node communicates with worker nodes to deploy and manage containers, and to ensure that the desired state of the application is maintained.

Kubernetes uses a number of components to manage containers, including:

API server: The API server is the central component of the Kubernetes architecture. It provides a RESTful interface for managing the cluster, and is used by other components to interact with the cluster.
etcd: etcd is a distributed key-value store that Kubernetes uses to store cluster data. This data is used to ensure that the desired state of the application is maintained.
Scheduler: The scheduler is responsible for scheduling containers to run on worker nodes. It uses data from the etcd store to determine the optimal placement of containers.
Controller manager: The controller manager is responsible for managing the state of the cluster. It monitors the state of the cluster and takes action to ensure that the desired state is maintained.
Kubelet: The kubelet is a component that runs on each worker node. It communicates with the master node to receive instructions for deploying and managing containers.

Deploying Applications on Kubernetes

To deploy an application in a Kubernetes cluster, you would create a deployment manifest that defines the desired state of the application, such as the number of replicas and the container image to use. The Kubernetes control plane will then ensure that the actual state of the system matches the desired state by creating the necessary pods and replication controllers.

Another example is if an application running on a node goes down, Kubernetes will automatically create a new pod to replace the failed one. Also, if the load on an application increases, Kubernetes can automatically scale the number of replicas to handle the increased load.

Kubernetes also provides features such as service discovery and load balancing to make it easier to access applications running in the cluster, as well as rolling updates to allow for updates to be made to the system with minimal downtime.

Overall, Kubernetes provides a powerful platform for managing containerized applications at scale, making it easier to deploy, scale, and manage applications in a production environment.

The process of deploying an application in a Kubernetes cluster involves several steps:

Containerizing the application: The first step is to containerize the application by creating a Docker image that includes the application code and all its dependencies.
Creating a deployment manifest: Once the application is containerized, you need to create a deployment manifest that defines the desired state of the application. This includes the number of replicas, the container image to use, and any environment variables or volumes that the application requires.
Creating a service manifest: A Service manifest defines the desired state of the service. It is responsible for the network communication between the pods and the external world.
Applying the manifests: The next step is to apply the manifests to the cluster. This can be done using the kubectl command-line tool, which communicates with the Kubernetes API server to create the necessary resources in the cluster.
Verifying the deployment: After applying the manifests, you can use the kubectl command-line tool to verify that the deployment was successful. This includes checking that the pods and replication controllers were created and that the desired number of replicas is running.
Updating the deployment: If you need to make changes to the deployment, such as updating the container image or changing the number of replicas, you can do so by modifying the deployment manifest and reapplying it to the cluster. Kubernetes will then update the actual state of the system to match the desired state.
Scaling the deployment: If the workload increases, you can scale the deployment by modifying the replicas count in the deployment manifest and reapplying it to the cluster. Kubernetes will then automatically create new pods to handle the increased load.
Monitoring the deployment: Monitoring the deployment, including the health and performance of the application, is important to ensure that the application is running as expected and to troubleshoot any issues that may arise.

Overall, the process of deploying an application in a Kubernetes cluster involves containerizing the application, creating manifests, applying them to the cluster, and then monitoring and updating the deployment as needed.

How we can secure the deployment in Kubernetes

Secure communication: Ensure that all communication between components within the cluster, as well as between the cluster and external systems, is secure. This can be done by using secure protocols such as HTTPS and securing etcd with proper authentication and authorization.
Network segmentation: Use network policies to segment the network and limit communication between pods and services.
Role-based access control (RBAC): Use RBAC to control access to the Kubernetes API, and limit the actions that users, groups, and service accounts can perform within the cluster.
Secrets and configMaps management: Use Kubernetes secrets and configMaps to store sensitive information such as passwords, tokens, and certificates in an encrypted form and avoid storing them in the application code.
Pod security policies: Use pod security policies to define the security context for pods, including setting resource limits and enabling security features such as AppArmor or SELinux.
Regular Auditing: Regularly audit the cluster for security risks and compliance issues, and take action as necessary.
Secure your nodes: Secure the nodes by using a firewall, configuring secure boot, using a trusted platform module (TPM), and securing the operating system.
Use security add-ons: Use security add-ons such as Kubernetes Network Policy, PodSecurityPolicy, Kubernetes Secrets, Kubernetes ConfigMaps, etc to secure your deployment. Use third-party tools:
Use third-party tools such as Kube-bench, Kube-hunter, etc to scan and test the cluster for vulnerabilities and misconfigurations.

Overall, securing a deployment in Kubernetes requires a combination of different security measures to protect the communication, network, and access control, as well as the data and application running within the cluster.

This deployment file creates a deployment named "my-app" with 3 replicas and runs the container as non-root user, with read-only root file system. The environment variables, SECRET_KEY, and CONFIG_SETTINGS are set using Kubernetes Secrets and ConfigMaps respectively, to store sensitive and non-sensitive information. Also, it uses a pod security policy to set the security context of the pod.

Happy Coding and Keep Sharing!!

Sunday, 22 January 2023

Google Cloud vs AWS - Comparing in 2023

Google Cloud Platform (GCP) and Amazon Web Services (AWS) are both popular cloud computing platforms that offer a wide range of services for businesses and organizations. Both GCP and AWS provide infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS) offerings, allowing customers to build, deploy and run applications in the cloud.

Here are some key differences between GCP and AWS:

Services: GCP and AWS offer a similar set of services, but they may have different names and slightly different functionality. GCP has a strong focus on big data and machine learning, while AWS has a wider range of services and a more established ecosystem of partners and third-party tools.
Pricing: GCP and AWS have different pricing models, with GCP generally being more flexible and customizable, while AWS often has a more straightforward pricing structure. GCP also offers sustained-use discounts, which can lower the cost of running long-running workloads.
Networking: GCP has a strong emphasis on global networking and offers services such as Google's global load balancer and Cloud VPN, while AWS has a more established ecosystem of partners and third-party tools for networking.
Data and Analytics: AWS has a wide range of data and analytics services, including Redshift, RDS, and Elasticsearch, while GCP has a big data focus with services such as BigQuery and Cloud Dataflow.
Machine learning: GCP has a strong focus on machine learning, with services such as TensorFlow, Cloud ML Engine, and Cloud Vision API, while AWS also has a range of machine learning services including SageMaker, Rekognition, and Lex.
Support: AWS has a more established support system with different levels of support options and a larger community, while GCP has a more limited support system and a smaller community.

Overall, GCP and AWS are both powerful cloud platforms that offer a wide range of services. The choice between the two will depend on the specific needs of your organization, including the services you require, your budget, and your existing infrastructure.

GCP over AWS and Vice-Versa?

Choosing between Google Cloud Platform (GCP) and Amazon Web Services (AWS) can depend on several factors, including the specific services and features offered by each platform, the pricing model, and the overall fit with your organization's existing infrastructure and workflow.

Here are some factors to consider when deciding between GCP and AWS:

Services: If your organization has specific needs for big data and machine learning, GCP may be a better choice as it has a strong focus on these areas. On the other hand, if your organization requires a wide range of services and a more established ecosystem of partners and third-party tools, AWS may be a better choice.
Pricing: GCP offers more flexible and customizable pricing, while AWS often has a more straightforward pricing structure. GCP also offers sustained-use discounts, which can lower the cost of running long-running workloads.
Networking: GCP has a strong emphasis on global networking, with services such as Google's global load balancer and Cloud VPN, while AWS has a more established ecosystem of partners and third-party tools for networking.
Data and Analytics: If your organization has a need for data warehousing and business intelligence, AWS has a wide range of services like Redshift, RDS, Elasticsearch, and more, while GCP has a big data focus with services such as BigQuery and Cloud Dataflow.
Machine learning: GCP has a strong focus on machine learning, with services such as TensorFlow, Cloud ML Engine, and Cloud Vision API, while AWS also has a range of machine learning services including SageMaker, Rekognition, and Lex.
Support: If your organization requires a more established support system with different levels of support options and a larger community, AWS may be a better choice. GCP has a more limited support system and a smaller community.
Hybrid and Multi-cloud: If your organization is planning to adopt a multi-cloud strategy, AWS has a more mature offering for hybrid and multi-cloud scenarios, with services such as Outposts and App Runner

Ultimately, the best choice between GCP and AWS will depend on the specific needs of your organization. It is important to evaluate the services offered by each platform, as well as the pricing, networking, data and analytics, Machine Learning, support, and other factors that are important to your organization.

Service Level Agreement (SLA)

They both offer a wide range of services with different Service Level Agreements (SLAs).

AWS offers an SLA of 99.95% availability for its Elastic Compute Cloud (EC2) and Elastic Block Store (EBS) services. Additionally, it offers an SLA of 99.99% for its Amazon RDS, Amazon DynamoDB, and Amazon ElastiCache services.

GCP offers a similar level of availability for its Compute Engine and Persistent Disk services, with an SLA of 99.95%. GCP also offers an SLA of 99.99% for its Cloud SQL and Cloud Datastore services.

When it comes to SLA, both AWS and GCP offer very similar levels of availability for their core services. However, AWS has a slightly higher SLA for some of its services than GCP.

It's also important to note that, while SLA is an important factor to consider when choosing a cloud provider, it's not the only one. Other factors such as security, scalability, and pricing should also be taken into account.

It's always a good idea to thoroughly review the SLA and other details of the services you plan to use with each provider before making a decision, as well as regularly monitor the services to ensure they meet their SLA.

AWS or GCP, Who has better availability region wise

Both AWS and GCP have a global presence, with multiple data centers and availability regions around the world.

AWS currently has 77 availability regions worldwide and plans to have 84 by the end of 2022. These regions are spread across 24 countries and are designed to provide low latency and high availability for customers.

GCP has 35 regions worldwide and it is spread across 14 countries. It also has plans to expand to more regions in the future and will have a total of 44 regions available by the end of 2024.

In terms of region coverage, AWS has more availability regions than GCP. However, it's important to note that the number of regions doesn't necessarily translate to better availability. The availability of service also depends on factors such as network infrastructure, data center design, and disaster recovery capabilities.

The most popular service of AWS and GCP is based on different regions there are other factors such as industry type or use-case that service is more popular but today we will see only region based

In North America, AWS's Elastic Compute Cloud (EC2) and Simple Storage Service (S3) are among the most popular services. EC2 is widely used for hosting web applications, running big data workloads, and more, while S3 is popular for storing and retrieving files, images, and backups.
In Europe, AWS's Elastic Container Service (ECS) and Elastic Container Registry (ECR) are also popular among users. ECS allows users to easily manage and run containerized applications, while ECR is a fully-managed Docker container registry that makes it easy to store, manage, and deploy Docker container images.
In Asia, AWS Elastic Block Store (EBS) and Amazon Relational Database Service (RDS) are among the most popular services. EBS provides block-level storage for use with EC2 instances, while RDS provides a managed relational database service for use with databases such as MySQL, PostgreSQL, and Oracle.
As for GCP, In North America, Google Compute Engine (GCE) and Google Cloud Storage (GCS) are among the most popular services. GCE allows users to launch virtual machines and configure network and security settings, while GCS is an object storage service that allows users to store and retrieve large amounts of data in the cloud.
In Europe, GCP's BigQuery and Cloud SQL are popular among users. BigQuery is a fully managed, cloud-native data warehouse that enables super-fast SQL queries using the processing power of Google's infrastructure, while Cloud SQL is a fully-managed database service for MySQL, PostgreSQL, and SQL Server.
In Asia, GCP's Cloud Spanner and Cloud Translation API are also popular among users. Cloud Spanner is a fully-managed, horizontally scalable, relational database service, while Cloud Translation API allows developers to easily translate text between thousands of language pairs.

Happy Coding and Keep Sharing!!!

Tuesday, 11 October 2022

SpringBoot API with GitHub Actions, Docker Deployment

Today, We are going to explore and see other possibilities of the most important aspect of SDLC, Which is Continues Integration & Continues Deployment aka CI/CD. There are many tools (Jenkins, Bamboo, etc) available in the market which we can use to Build, Test and Deploy the changes on servers.

In the above diagram, the entire CI/CD is taken care of by Jenkins which is a 3rd party tool. In the real world, this required additional resources (infrastructure) and a team to manage this.

So, since We are using GitHub is there a way we can reduce this additional stuff. Yes, we can use GitHub Actions where the entire CI/CD will run on the same platform. We all have seen this option in GitHub but very rarely do we go there.

To understand it better, let's build a sample Spring Boot application --> Push the code in GitHub -->Trigger Github Actions --> Docker hub.

First, we need to create a repository in GitHub and then go to the Actions tab and click new Workflow options, here we will get many workflow options that we want to integrate with our application, but for this demo, we need to select " Java with Maven".

After you click on configure it will create a maven.yml file which you need to merge with your code, but before that, we need to update the yml to support our application build.

and yes that's it so whenever we merge the code in the master branch the GitHub Actions workflow will trigger and build the code, but we want is that after building the code the, latest changes should also deploy to the Container Registry I am using Docker here, but you can use any other.

In order to push the changes to the docker, we first need to create a repository in the docker hub and after that, we need to tell our maven.yml file about this new step.

 # This workflow will build a Java project with Maven, and cache/restore any dependencies to improve the workflow execution time  
 # For more information see: https://help.github.com/actions/language-and-framework-guides/building-and-testing-java-with-maven  
 name: Java CI with Maven  
 on:  
  push:  
   branches: [ "master" ]  
  pull_request:  
   branches: [ "master" ]  
 jobs:  
  build:  
   runs-on: ubuntu-latest  
   steps:  
   - uses: actions/checkout@v3  
   - name: Set up JDK 17  
    uses: actions/setup-java@v3  
    with:  
     java-version: '17'  
     distribution: 'temurin'  
     cache: maven  
   - name: Build with Maven  
    run: mvn clean install  
   - name: Build & Push Docker Image  
    uses: mr-smithers-excellent/docker-build-push@v5  
    with:  
     image: hemkant/github-actions  
     tags: latest  
     registry: docker.io  
     dockerfile: Dockerfile  
     username: ${{ secrets.DOCKER_USERNAME}}  
     password: ${{ secrets.DOCKER_PASSWORD}}

I have used another image here which will perform all the operations docker-build-push. after that, the credentials to access the docker hub is stored in GitHub secrets.

After all of these let's commit some code and see, how all these work together. In the below screenshot, we can see all the workflow triggers whenever I committed the code.

and let's also see if the steps we mentioned in our maven.yml file are followed or not, for that we can click on any item to check and it will show us all the details.

And the docker file which I used here is

 FROM openjdk:17  
 EXPOSE 8080  
 ADD target/github-actions.jar github-actions.jar  
 ENTRYPOINT ["java", "-jar", "/github-actions.jar"]

let's check the Build & Push Docker Image step. looks like everything is fine here, and image is pushed to Docker Hub

The last thing we should also check is Docker Hub, looks good the image is pushed successfully.

We have covered all the points which we discussed at the beginning of this blog. Code Repo.

In the next blog, We will deploy the same image on the Google Cloud Platform, Kubernetes.

Happy Coding and Keep Sharing!!