Saturday, 28 January 2023

Kubernetes in a nutshell

In my previous blog post, we learned how to deploy an API to Google Kubernetes Engine (GKE) on GCP. Today we will take a high-level look at Kubernetes itself.

Google created Kubernetes (K8s) based on its experience managing containerized applications on its internal infrastructure, and released it as an open-source project in 2014.

Kubernetes is an open-source platform designed to automate the deployment, scaling, and management of containerized applications. It's been around for a while now and has become the standard for managing containerized applications in production environments.

What is Kubernetes? 

Kubernetes is an orchestration system that automates the deployment, scaling, and management of containers. It provides a unified platform for deploying and managing containers, making it easier for organizations to run and scale their applications. Kubernetes is highly extensible, allowing organizations to customize it to meet their specific needs.

At a high level, Kubernetes works by using a control plane (historically called the master node) to control and manage a group of worker nodes.

How does Kubernetes work? 

Kubernetes runs applications that are packaged into smaller units called containers. Each container holds a piece of an application, such as a microservice. Kubernetes groups one or more containers into a pod, the smallest unit it deploys. Pods can be deployed and managed independently, making it easier to scale and manage applications.

Kubernetes uses a declarative approach to manage containers, meaning you define what you want your application to look like and Kubernetes takes care of the rest. This makes it easy to manage complex applications, as you don't need to worry about the details of how containers are deployed and managed.

The main components of a Kubernetes cluster are: 

  • The API server: The entry point for all administrative tasks. It exposes the Kubernetes API and communicates with the other components. 
  • etcd: A distributed key-value store that stores the configuration data of the cluster. 
  • The controller manager: Responsible for maintaining the desired state of the system by making changes to the actual state of the system as necessary. 
  • The kubelet: Runs on each worker node and communicates with the API server. It is responsible for starting and stopping containers on the node. 
  • The kube-proxy: Runs on each worker node and maintains the network rules that route traffic for Services to the right pods.

Together, these components implement the declarative approach: the user defines the desired state of the system in the form of manifests, and the system continually works to make the actual state match the desired state.
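To make the desired-state idea concrete, here is a minimal sketch of a reconciliation loop in Python. It is only an illustration: the `Cluster` class and the `reconcile` function are hypothetical names invented for this example, not part of Kubernetes or any client library.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for cluster state; not a real Kubernetes client."""
    running: dict = field(default_factory=dict)  # app name -> replica count


def reconcile(desired: dict, cluster: Cluster) -> list:
    """Compare desired and actual replica counts and return the actions taken."""
    actions = []
    for app, want in desired.items():
        have = cluster.running.get(app, 0)
        if have < want:
            actions.append(f"start {want - have} x {app}")
        elif have > want:
            actions.append(f"stop {have - want} x {app}")
        cluster.running[app] = want
    # Anything still running that is no longer declared gets removed.
    for app in list(cluster.running):
        if app not in desired:
            actions.append(f"stop {cluster.running.pop(app)} x {app}")
    return actions


cluster = Cluster(running={"web": 1, "old-job": 2})
print(reconcile({"web": 3}, cluster))  # actions needed to reach the desired state
print(reconcile({"web": 3}, cluster))  # second pass finds nothing left to do
```

The point of the second call is that reconciliation is idempotent: once the actual state matches the manifest, applying the same desired state again changes nothing.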

Why use Kubernetes? 
Kubernetes provides a number of benefits over traditional approaches to managing containers. Some of the most significant benefits include: 
  • Scalability: Kubernetes makes it easy to scale your application, either by adding more containers or by increasing the resources assigned to existing containers. 
  • Resilience: Kubernetes automatically monitors containers and restarts them if they fail, which helps keep your application available.
  • Portability: Kubernetes can run on a variety of cloud platforms, as well as on-premises. This makes it easier to move your application from one platform to another, reducing vendor lock-in. 
  • Integration: Kubernetes integrates with a variety of tools and platforms, making it easier to integrate your application with other systems.

Kubernetes Architecture

Kubernetes is based on a master-worker architecture. The master node (called the control plane in current Kubernetes documentation) is responsible for managing the cluster, while worker nodes run the containers. The control plane communicates with worker nodes to deploy and manage containers, and to ensure that the desired state of the application is maintained.

Kubernetes uses a number of components to manage containers, including: 
  • API server: The API server is the central component of the Kubernetes architecture. It provides a RESTful interface for managing the cluster, and is used by other components to interact with the cluster. 
  • etcd: etcd is a distributed key-value store that Kubernetes uses to store cluster data. This data is used to ensure that the desired state of the application is maintained. 
  • Scheduler: The scheduler is responsible for assigning pods to worker nodes. It reads cluster state through the API server and picks a node for each pod based on resource requests and placement constraints.
  • Controller manager: The controller manager is responsible for managing the state of the cluster. It monitors the state of the cluster and takes action to ensure that the desired state is maintained. 
  • Kubelet: The kubelet is a component that runs on each worker node. It communicates with the API server to receive instructions for deploying and managing the containers on its node.
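As a rough illustration of what the scheduler in the list above does, the sketch below filters out nodes that cannot fit a pod and then scores the rest. It is a deliberately simplified assumption: the real scheduler weighs many more factors, and `pick_node` and the node names are hypothetical.

```python
from typing import Optional


def pick_node(cpu_request: float, free_cpu: dict) -> Optional[str]:
    """Return the node with the most free CPU that still fits the request."""
    # Filter: keep only nodes with enough free CPU for the pod.
    feasible = {name: free for name, free in free_cpu.items() if free >= cpu_request}
    if not feasible:
        return None  # in a real cluster the pod would stay pending
    # Score: prefer the node with the most free CPU (ties broken by name).
    return max(sorted(feasible), key=lambda name: feasible[name])


free_cpu = {"node-a": 0.5, "node-b": 2.0, "node-c": 1.5}
print(pick_node(1.0, free_cpu))  # node-b
print(pick_node(4.0, free_cpu))  # None
```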

Deploying Applications on Kubernetes

To deploy an application in a Kubernetes cluster, you would create a deployment manifest that defines the desired state of the application, such as the number of replicas and the container image to use. The Kubernetes control plane will then ensure that the actual state of the system matches the desired state by creating the necessary ReplicaSet and pods.

For example, if a pod fails or the node it runs on goes down, Kubernetes will automatically create a new pod to replace the failed one. Also, if the load on an application increases, Kubernetes can automatically scale the number of replicas to handle the increased load, provided a Horizontal Pod Autoscaler is configured for it.
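The automatic scaling mentioned above is handled by the Horizontal Pod Autoscaler, whose documented core calculation is desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). Here is a small sketch of that arithmetic; the function name and the default minimum and maximum are assumptions for illustration, and the real autoscaler also applies a tolerance and stabilization rules that are left out here.

```python
import math


def desired_replicas(current: int, current_metric: float, target_metric: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Scale the replica count in proportion to how far the metric is from target."""
    raw = math.ceil(current * current_metric / target_metric)
    # Clamp to the configured bounds (hypothetical defaults above).
    return max(min_replicas, min(max_replicas, raw))


# 3 replicas at 90% average CPU with a 60% target -> scale up.
print(desired_replicas(3, current_metric=90, target_metric=60))  # 5
# 3 replicas at 20% average CPU with a 60% target -> scale down.
print(desired_replicas(3, current_metric=20, target_metric=60))  # 1
```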

Kubernetes also provides features such as service discovery and load balancing to make it easier to access applications running in the cluster, as well as rolling updates to allow for updates to be made to the system with minimal downtime. 

Overall, Kubernetes provides a powerful platform for managing containerized applications at scale, making it easier to deploy, scale, and manage applications in a production environment.


The process of deploying an application in a Kubernetes cluster involves several steps: 

  • Containerizing the application: The first step is to containerize the application by creating a Docker image that includes the application code and all its dependencies. 
  • Creating a deployment manifest: Once the application is containerized, you need to create a deployment manifest that defines the desired state of the application. This includes the number of replicas, the container image to use, and any environment variables or volumes that the application requires. 
  • Creating a service manifest: A Service manifest defines how the application is reached over the network. The Service gives the pods a stable address and load-balances traffic to them from inside the cluster or from the external world.
  • Applying the manifests: The next step is to apply the manifests to the cluster. This can be done using the kubectl command-line tool, which communicates with the Kubernetes API server to create the necessary resources in the cluster. 
  • Verifying the deployment: After applying the manifests, you can use the kubectl command-line tool to verify that the deployment was successful. This includes checking that the pods and ReplicaSet were created and that the desired number of replicas is running.
  • Updating the deployment: If you need to make changes to the deployment, such as updating the container image or changing the number of replicas, you can do so by modifying the deployment manifest and reapplying it to the cluster. Kubernetes will then update the actual state of the system to match the desired state. 
  • Scaling the deployment: If the workload increases, you can scale the deployment by modifying the replicas count in the deployment manifest and reapplying it to the cluster. Kubernetes will then automatically create new pods to handle the increased load. 
  • Monitoring the deployment: Monitoring the deployment, including the health and performance of the application, is important to ensure that the application is running as expected and to troubleshoot any issues that may arise. 
Overall, the process of deploying an application in a Kubernetes cluster involves containerizing the application, creating manifests, applying them to the cluster, and then monitoring and updating the deployment as needed.
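As a sketch of the "create manifests" steps above, the snippet below builds a Deployment and a Service as Python dictionaries and prints them as JSON, which kubectl accepts alongside YAML. The names, labels, image, and ports are placeholders invented for this example. It also shows how a Service finds its pods: the Service selector has to match the pod labels.

```python
import json

APP_LABELS = {"app": "my-api"}  # hypothetical label shared by pods and Service

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "my-api"},
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": APP_LABELS},
        "template": {
            "metadata": {"labels": APP_LABELS},
            "spec": {"containers": [{
                "name": "my-api",
                "image": "example.com/my-api:1.0.0",  # placeholder image
                "ports": [{"containerPort": 8080}],
            }]},
        },
    },
}

service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "my-api"},
    "spec": {"selector": APP_LABELS, "ports": [{"port": 80, "targetPort": 8080}]},
}


def selects(selector: dict, labels: dict) -> bool:
    """A selector matches when every key/value pair appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


pod_labels = deployment["spec"]["template"]["metadata"]["labels"]
print("Service selects the pods:", selects(service["spec"]["selector"], pod_labels))

# Bundle both objects so they can be saved to a file and applied together.
manifest_list = {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}
print(json.dumps(manifest_list, indent=2))
```

Saving the printed JSON to a file and running kubectl apply -f on it would cover the "applying the manifests" step; changing "replicas" and reapplying covers the scaling step.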

How can we secure a deployment in Kubernetes?

  • Secure communication: Ensure that all communication between components within the cluster, as well as between the cluster and external systems, is secure. This can be done by using secure protocols such as HTTPS and securing etcd with proper authentication and authorization. 
  • Network segmentation: Use network policies to segment the network and limit communication between pods and services. 
  • Role-based access control (RBAC): Use RBAC to control access to the Kubernetes API, and limit the actions that users, groups, and service accounts can perform within the cluster. 
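  • Secrets and ConfigMaps management: Use Kubernetes Secrets to store sensitive information such as passwords, tokens, and certificates, and ConfigMaps for non-sensitive configuration, rather than storing them in the application code. Note that Secret values are only base64-encoded by default, so enable encryption at rest and restrict access to them with RBAC.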
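  • Pod security policies: Define the security context for pods, for example running as a non-root user and enabling security features such as AppArmor or SELinux, and set resource limits in the pod spec. The PodSecurityPolicy API was removed in Kubernetes 1.25; Pod Security Admission with the Pod Security Standards is its built-in replacement.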
  • Regular Auditing: Regularly audit the cluster for security risks and compliance issues, and take action as necessary. 
  • Secure your nodes: Secure the nodes by using a firewall, configuring secure boot, using a trusted platform module (TPM), and securing the operating system. 
  • Use built-in security features: Use the features Kubernetes already provides, such as NetworkPolicy, Pod Security Admission, Secrets, and ConfigMaps, to secure your deployment.
  • Use third-party tools: Use third-party tools such as kube-bench and kube-hunter to scan and test the cluster for vulnerabilities and misconfigurations.

Overall, securing a deployment in Kubernetes requires a combination of different security measures to protect the communication, network, and access control, as well as the data and application running within the cluster.
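One detail worth seeing in practice: values in the data field of a Secret manifest are base64-encoded, and base64 is an encoding, not encryption. The short sketch below shows that anyone who can read the manifest can recover the value; the helper names and the sample password are made up for this example.

```python
import base64


def encode_secret_value(plain: str) -> str:
    """Encode a value the way it appears under `data` in a Secret manifest."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse the encoding; no key or password is needed."""
    return base64.b64decode(encoded).decode("utf-8")


encoded = encode_secret_value("s3cr3t-password")
print(encoded)
print(decode_secret_value(encoded))  # the original value comes straight back
```

This is why access control on Secrets and encryption at rest matter even when Secrets are used correctly.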



The deployment file for this example creates a Deployment named "my-app" with 3 replicas and runs the container as a non-root user with a read-only root file system. The environment variables SECRET_KEY and CONFIG_SETTINGS are set from a Kubernetes Secret and a ConfigMap respectively, to hold sensitive and non-sensitive information. It also sets the security context of the pod.
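The original file is not reproduced in this text, so the following is only a sketch of a manifest that matches the description, written as a Python dictionary and printed as JSON. The image name, the user ID, and the Secret and ConfigMap names and keys are assumptions made for illustration.

```python
import json

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "my-app"},
    "spec": {
        "replicas": 3,
        "selector": {"matchLabels": {"app": "my-app"}},
        "template": {
            "metadata": {"labels": {"app": "my-app"}},
            "spec": {
                # Pod-level security context: refuse to run as root.
                "securityContext": {"runAsNonRoot": True, "runAsUser": 1000},
                "containers": [{
                    "name": "my-app",
                    "image": "example.com/my-app:1.0.0",  # placeholder image
                    # Container-level security context: read-only root file system.
                    "securityContext": {
                        "readOnlyRootFilesystem": True,
                        "allowPrivilegeEscalation": False,
                    },
                    "env": [
                        {"name": "SECRET_KEY", "valueFrom": {"secretKeyRef": {
                            "name": "my-app-secret", "key": "secret-key"}}},  # assumed names
                        {"name": "CONFIG_SETTINGS", "valueFrom": {"configMapKeyRef": {
                            "name": "my-app-config", "key": "config-settings"}}},  # assumed names
                    ],
                }],
            },
        },
    },
}

print(json.dumps(deployment, indent=2))
```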
 

Happy Coding and Keep Sharing!!

Thursday, 26 January 2023

Serverless computing

Serverless computing is a cloud computing execution model in which the cloud provider dynamically manages the allocation of machine resources. With serverless computing, the cloud provider is responsible for provisioning, scaling, and managing the servers that run the code, rather than the user. This allows developers to focus on writing code and deploying their applications, without the need to worry about the underlying infrastructure.



In serverless computing, the code is run in stateless compute containers that are triggered by events and automatically scaled to match the rate of incoming requests. This eliminates the need for provisioning, scaling, and maintaining servers, resulting in lower costs and increased scalability. 

Examples of serverless computing include AWS Lambda, Azure Functions, and Google Cloud Functions. These services allow developers to create and deploy their code as small, single-purpose functions, which are automatically triggered by events such as an HTTP request or a database update. 
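To give a feel for what such a single-purpose function looks like, here is a minimal sketch in the style of an AWS Lambda Python handler behind an HTTP trigger. The greeting logic and the "name" parameter are invented for this example; the event shape follows the API Gateway proxy format, where "queryStringParameters" may be missing or null.

```python
import json


def handler(event, context):
    """Entry point in the style of an AWS Lambda Python handler."""
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


# Local invocation with a hand-built event; no cloud account is needed.
response = handler({"queryStringParameters": {"name": "Kubernetes"}}, None)
print(response["statusCode"], response["body"])
```

Because the function is just a plain function of its event, it can be tested locally like this before it is ever deployed.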

Serverless computing is commonly used for building web and mobile backends, real-time data processing, and event-driven architectures. 

It's important to note that, despite the name, there are still servers running behind the scenes. The difference is that the provider manages the servers, and the user pays only for the resources used (compute, storage, etc.) and not for the servers themselves.

Here are a few common use cases for serverless computing: 

  • Event-driven computing: Serverless architectures are well-suited for processing events, such as changes to a database or new files being uploaded to a storage service. This allows for real-time data processing and efficient scaling. 
  • APIs and Microservices: Serverless computing is often used to build and deploy APIs, as well as to run microservices. This allows for better scalability and cost management, as resources are only allocated when an API request is made or a microservice is invoked. 
  • Background tasks and cron jobs: Serverless computing can be used to run background tasks, such as image processing or data analysis, which can be triggered by a schedule or a specific event. 
  • Web and mobile apps: Serverless architectures can be used to build and deploy web and mobile applications, allowing for faster development, lower costs, and better scalability. 
  • IoT and edge computing: Serverless computing can be used to build and deploy applications for Internet of Things (IoT) devices and for edge computing, where compute resources are located at or near the edge of a network. 
  • Chatbots and voice assistants: Serverless functions can handle the logic of chatbots and voice assistants, so that compute power is used only when a user interacts with the chatbot or assistant.

Advantages of using serverless applications in .NET Core:

  • Cost-effective: serverless architecture eliminates the need for provisioning and maintaining servers, resulting in lower costs. 
  • Scalability: serverless applications can automatically scale in response to increased traffic, without the need for manual intervention. 
  • Flexibility: serverless architecture allows for the deployment of small, single-purpose functions, making it easier to build and maintain a microservices-based architecture. 
  • Reduced operational complexity: serverless applications are abstracted away from the underlying infrastructure, reducing the operational complexity of deploying and managing applications. 
  • Improved availability: serverless applications can be designed to automatically fail over to other instances in the case of a failure, improving the overall availability of the application.

Disadvantages of using serverless applications:
  • Cold start: serverless applications may experience a delay in response time when they first receive a request after a period of inactivity, known as a "cold start." 
  • Limited control over the underlying infrastructure: serverless applications do not provide the same level of control over the underlying infrastructure as traditional server-based applications. 
  • Concurrency limitations: serverless applications may be subject to concurrency limitations, depending on the platform and the number of instances available. 
  • Limited support for long-running tasks: serverless architecture is best suited for short-lived, stateless tasks, and may not be the best choice for long-running, stateful tasks. 
  • Higher latency: serverless applications may experience higher latency because of the need to spin up new instances to handle incoming requests.
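A common way to soften the cold-start cost listed above is to do expensive setup once per instance and reuse it across invocations. The sketch below simulates that pattern; the sleep stands in for slow setup such as opening a database connection, and all names here are hypothetical.

```python
import time

_client = None   # survives between invocations while the instance stays warm
init_count = 0   # counts how often the expensive setup actually ran


def get_client():
    """Create the expensive resource on first use, then reuse it."""
    global _client, init_count
    if _client is None:
        time.sleep(0.05)  # stand-in for slow setup
        _client = {"connected": True}
        init_count += 1
    return _client


def handler(event, context):
    client = get_client()
    return {"ok": client["connected"], "initializations": init_count}


print(handler({}, None))  # first call pays the setup cost
print(handler({}, None))  # later calls on the same instance reuse the client
```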

Here are the general steps to build a serverless application:
  • Choose a cloud provider: There are several popular cloud providers that offer serverless computing services, such as AWS Lambda, Azure Functions, and Google Cloud Functions. Choose the one that best fits your needs and has good Java support. 
  • Set up the development environment: Before you start building your serverless application, you will need to set up your development environment. This typically involves installing the necessary software and configuring your development environment. 
  • Create a new function: Once your development environment is set up, you can create a new function. This typically involves specifying the function's name, the trigger that will invoke the function, and the code that will be executed when the function is invoked. 
  • Write the code: Write the code for your function using Java. The code should handle the input and output of the function.
  • Test the function: Test the function locally before deploying it to the cloud. This can be done using the cloud provider's command-line tools or SDK.
  • Deploy the function: Once your function is tested and working, deploy it to the cloud provider's serverless computing service. 
  • Monitor and maintain: After deploying your function, monitor and maintain it. This includes monitoring the function's performance and error logs, and making updates and fixes as necessary.

Steps to build a serverless application in .NET Core
  • Install the .NET Core SDK and the AWS SDK for .NET on your local machine. 
  • Create a new .NET Core project using the "dotnet new" command. 
  • Add the AWS Lambda NuGet packages to the project, such as Amazon.Lambda.Core.
  • Create a new class that will serve as the entry point for the Lambda function. Its handler method takes the input event and, optionally, an Amazon.Lambda.Core.ILambdaContext parameter.
  • Add the necessary code to handle the input and output of the Lambda function in the class created in the previous step. 
  • Create an AWS profile and configure the AWS SDK for .NET with the appropriate credentials. 
  • Use the "dotnet lambda deploy-function" command, provided by the Amazon.Lambda.Tools .NET global tool, to deploy the Lambda function to AWS.
  • Test the deployed Lambda function using the AWS Lambda console or the AWS CLI. 
  • Add any other functionality, such as connecting to a database or invoking other AWS services, as needed for your application. 
  • Continuously monitor and update your serverless application to ensure optimal performance and stability.

Happy Coding and Keep Sharing !!

Sunday, 22 January 2023

Google Cloud vs AWS - Comparing in 2023

Google Cloud Platform (GCP) and Amazon Web Services (AWS) are both popular cloud computing platforms that offer a wide range of services for businesses and organizations. Both GCP and AWS provide infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS) offerings, allowing customers to build, deploy and run applications in the cloud. 



Here are some key differences between GCP and AWS: 

  • Services: GCP and AWS offer a similar set of services, but they may have different names and slightly different functionality. GCP has a strong focus on big data and machine learning, while AWS has a wider range of services and a more established ecosystem of partners and third-party tools. 
  • Pricing: GCP and AWS have different pricing models, with GCP generally being more flexible and customizable, while AWS often has a more straightforward pricing structure. GCP also offers sustained-use discounts, which can lower the cost of running long-running workloads. 
  • Networking: GCP has a strong emphasis on global networking and offers services such as Google's global load balancer and Cloud VPN, while AWS has a more established ecosystem of partners and third-party tools for networking. 
  • Data and Analytics: AWS has a wide range of data and analytics services, including Redshift, RDS, and OpenSearch Service (formerly Amazon Elasticsearch Service), while GCP has a big data focus with services such as BigQuery and Cloud Dataflow.
  • Machine learning: GCP has a strong focus on machine learning, with services such as Vertex AI (the successor to Cloud ML Engine) and Cloud Vision API, plus close ties to the open-source TensorFlow library, while AWS also has a range of machine learning services including SageMaker, Rekognition, and Lex.
  • Support: AWS has a more established support system with different levels of support options and a larger community, while GCP has a more limited support system and a smaller community. 
Overall, GCP and AWS are both powerful cloud platforms that offer a wide range of services. The choice between the two will depend on the specific needs of your organization, including the services you require, your budget, and your existing infrastructure.

GCP over AWS and Vice-Versa?

Choosing between Google Cloud Platform (GCP) and Amazon Web Services (AWS) can depend on several factors, including the specific services and features offered by each platform, the pricing model, and the overall fit with your organization's existing infrastructure and workflow. 

Here are some factors to consider when deciding between GCP and AWS: 

  • Services: If your organization has specific needs for big data and machine learning, GCP may be a better choice as it has a strong focus on these areas. On the other hand, if your organization requires a wide range of services and a more established ecosystem of partners and third-party tools, AWS may be a better choice. 
  • Pricing: GCP offers more flexible and customizable pricing, while AWS often has a more straightforward pricing structure. GCP also offers sustained-use discounts, which can lower the cost of running long-running workloads. 
  • Networking: GCP has a strong emphasis on global networking, with services such as Google's global load balancer and Cloud VPN, while AWS has a more established ecosystem of partners and third-party tools for networking. 
  • Data and Analytics: If your organization has a need for data warehousing and business intelligence, AWS has a wide range of services like Redshift, RDS, OpenSearch Service (formerly Amazon Elasticsearch Service), and more, while GCP has a big data focus with services such as BigQuery and Cloud Dataflow.
  • Machine learning: GCP has a strong focus on machine learning, with services such as Vertex AI (the successor to Cloud ML Engine) and Cloud Vision API, plus close ties to the open-source TensorFlow library, while AWS also has a range of machine learning services including SageMaker, Rekognition, and Lex.
  • Support: If your organization requires a more established support system with different levels of support options and a larger community, AWS may be a better choice. GCP has a more limited support system and a smaller community. 
  • Hybrid and Multi-cloud: If your organization is planning to adopt a hybrid or multi-cloud strategy, AWS has a mature offering for hybrid scenarios, with services such as Outposts and EKS Anywhere, while GCP offers Anthos for hybrid and multi-cloud deployments.
Ultimately, the best choice between GCP and AWS will depend on the specific needs of your organization. It is important to evaluate the services offered by each platform, as well as the pricing, networking, data and analytics, Machine Learning, support, and other factors that are important to your organization.

Service Level Agreement (SLA)

AWS and GCP both offer a wide range of services with different Service Level Agreements (SLAs).

AWS offers an SLA of 99.95% availability for its Elastic Compute Cloud (EC2) and Elastic Block Store (EBS) services. Additionally, it offers an SLA of 99.99% for its Amazon RDS, Amazon DynamoDB, and Amazon ElastiCache services. 

GCP offers a similar level of availability for its Compute Engine and Persistent Disk services, with an SLA of 99.95%. GCP also offers an SLA of 99.99% for its Cloud SQL and Cloud Datastore services. 

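When it comes to SLAs, both AWS and GCP offer very similar levels of availability for their core services. The exact figure depends on the service and on how it is deployed (for example, a single instance versus multiple zones), so some services carry a slightly higher or lower SLA on one provider than on the other.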

It's also important to note that, while SLA is an important factor to consider when choosing a cloud provider, it's not the only one. Other factors such as security, scalability, and pricing should also be taken into account. 

It's always a good idea to thoroughly review the SLA and other details of the services you plan to use with each provider before making a decision, as well as regularly monitor the services to ensure they meet their SLA.
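To put the percentages above in perspective, it helps to convert an SLA into the downtime it still allows. The sketch below does that arithmetic for a 30-day month; the function name and the 30-day period are choices made for this example, and real SLAs define their own measurement periods.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted in the period without breaching the SLA."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


for sla in (99.95, 99.99):
    print(f"{sla}% -> {allowed_downtime_minutes(sla):.2f} minutes per 30 days")
# 99.95% -> 21.60 minutes per 30 days
# 99.99% -> 4.32 minutes per 30 days
```

The gap between 99.95% and 99.99% looks tiny as a percentage, but it is the difference between roughly 22 minutes and roughly 4 minutes of allowed downtime per month.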


AWS or GCP: who has better availability region-wise?

Both AWS and GCP have a global presence, with multiple data centers and availability regions around the world. 

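AWS has reported 77 Availability Zones across 24 geographic regions, with plans announced to reach 84 Availability Zones. An Availability Zone is an isolated group of data centers within a region, so this count is not directly comparable to a count of regions. These locations are designed to provide low latency and high availability for customers.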

GCP has 35 regions worldwide, spread across 14 countries. It also plans to expand to more regions in the future, with a total of 44 regions expected to be available by the end of 2024.

In terms of overall footprint, AWS is larger than GCP. However, it's important to note that the number of regions or zones doesn't necessarily translate to better availability. The availability of a service also depends on factors such as network infrastructure, data center design, and disaster recovery capabilities.

The most popular AWS and GCP services differ by region. Other factors, such as industry type or use case, also affect which services are popular, but today we will look only at the regional differences:

  • In North America, AWS's Elastic Compute Cloud (EC2) and Simple Storage Service (S3) are among the most popular services. EC2 is widely used for hosting web applications, running big data workloads, and more, while S3 is popular for storing and retrieving files, images, and backups. 
  • In Europe, AWS's Elastic Container Service (ECS) and Elastic Container Registry (ECR) are also popular among users. ECS allows users to easily manage and run containerized applications, while ECR is a fully-managed Docker container registry that makes it easy to store, manage, and deploy Docker container images. 
  • In Asia, AWS Elastic Block Store (EBS) and Amazon Relational Database Service (RDS) are among the most popular services. EBS provides block-level storage for use with EC2 instances, while RDS provides a managed relational database service for use with databases such as MySQL, PostgreSQL, and Oracle. 
  • As for GCP, in North America, Google Compute Engine (GCE) and Google Cloud Storage (GCS) are among the most popular services. GCE allows users to launch virtual machines and configure network and security settings, while GCS is an object storage service that allows users to store and retrieve large amounts of data in the cloud.
  • In Europe, GCP's BigQuery and Cloud SQL are popular among users. BigQuery is a fully managed, cloud-native data warehouse that enables super-fast SQL queries using the processing power of Google's infrastructure, while Cloud SQL is a fully-managed database service for MySQL, PostgreSQL, and SQL Server. 
  • In Asia, GCP's Cloud Spanner and Cloud Translation API are also popular among users. Cloud Spanner is a fully-managed, horizontally scalable, relational database service, while Cloud Translation API allows developers to easily translate text between thousands of language pairs.

Happy Coding and Keep Sharing!!!

Friday, 20 January 2023

Spring Boot Security module best practice

Spring Boot Security refers to Spring Boot's support for securing Spring-based applications, added through the spring-boot-starter-security starter. It is built on top of Spring Security, which is a powerful and highly customizable authentication and access-control framework.



Spring Boot Security provides several features out of the box, including: 

Authentication: Spring Boot Security supports several authentication mechanisms, including basic authentication, token-based authentication (JWT), and OAuth2/OpenID Connect. 

Authorization: Spring Boot Security supports role-based access control (RBAC) and can be configured to check for specific roles or permissions before allowing access to an endpoint. 

CSRF protection: Spring Boot Security provides built-in protection against cross-site request forgery (CSRF) attacks. 

Encryption: Spring Boot Security supports HTTPS and can be configured to encrypt all traffic to and from the application. 

Security Configuration: Spring Boot Security lets you secure the application through a set of security configuration properties that can be adapted to different security scenarios.

It also provides support for the integration of Spring Security with other Spring modules such as Spring MVC, Spring Data, and Spring Cloud.

Securing a Spring Boot API is important for several reasons: 

  • Confidentiality: By securing an API, you can prevent unauthorized access to sensitive data and protect against data breaches. 
  • Integrity: Securing an API can prevent unauthorized changes to data and ensure that data is not tampered with in transit. 
  • Authentication: By securing an API, you can ensure that only authorized users can access the API and perform specific actions. 
  • Authorization: Securing an API can also ensure that users can only access the resources and perform the actions that they are authorized to do. 
  • Compliance: Many industries and governments have regulations that mandate certain security measures for handling sensitive data. Failing to secure an API can result in non-compliance and penalties. 
  • Reputation: Security breaches can lead to loss of trust and damage to an organization's reputation. 
  • Business continuity: Security breaches can lead to loss of revenue, legal action, and other negative consequences. Securing an API can help to minimize the risk of a security breach and ensure business continuity.

There are several ways to secure a Spring Boot API, including: 

  • Basic authentication: This method involves sending a username and password with each request to the API. Spring Security can be used to implement basic authentication. 
  • Token-based authentication: This method involves sending a token with each request to the API. The token can be generated by the server and passed to the client, or the client can obtain the token from a third-party service. JSON Web Tokens (JWT) are a popular choice for token-based authentication in Spring Boot. 
  • OAuth2 and OpenID Connect: These are industry-standard protocols for authentication and authorization. Spring Security can be used to implement OAuth2 and OpenID Connect. 
  • HTTPS: All data sent to and from the API should be encrypted using HTTPS to protect against eavesdropping.
  • Input validation: Input validation should be used to prevent malicious data from being passed to the API. Spring Boot provides built-in support for input validation.
  • Monitoring and maintenance: Regularly monitor and maintain the security of the application and its dependencies.

It's important to note that the best approach will depend on the requirements of your specific application and the level of security that is needed.
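To show what the first option in the list above actually sends over the wire, here is a small language-neutral sketch in Python of the HTTP Basic scheme (RFC 7617). It is not Spring code, and the helper names are invented for this example. The credentials are only base64-encoded, which is why Basic authentication should always be combined with HTTPS.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of the Authorization header for HTTP Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def parse_basic_auth(header: str) -> tuple:
    """Recover (username, password) from a Basic Authorization header."""
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


header = basic_auth_header("user", "pass")
print(header)                    # Basic dXNlcjpwYXNz
print(parse_basic_auth(header))  # ('user', 'pass')
```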

Token-based authentication is one of the most commonly used methods to secure a Spring Boot API. The reason token-based authentication is so popular is that it provides several advantages over other authentication methods:
  • Stateless: Tokens are self-contained and do not require the server to maintain a session, which makes it easy to scale the API horizontally. 
  • Decoupled: Tokens are decoupled from the API, which means that the API does not need to know anything about the user. This makes it easy to add or remove authentication providers without affecting the API. 
  • Portable: Tokens can be passed between different systems, which makes it easy to authenticate users across different platforms. 
  • Open standard: JSON Web Token (JWT), a widely used token format, is an open standard (RFC 7519) and can be easily integrated into different systems.
  • Composable: Tokens can also be used in combination with OAuth2 and OpenID Connect.

It's important to note that token-based authentication is not suitable for all use cases, but it is widely used and can be a good choice for many Spring Boot APIs.
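To illustrate why tokens are "stateless", here is a minimal sketch of signing and verifying an HS256 JWT using only the Python standard library. It is a teaching sketch, not Spring Security code and not a replacement for a maintained JWT library: the function names are made up, and checks such as validating the "alg" header are left out.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature with an HMAC-SHA256 signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify(token: str, secret: bytes) -> dict:
    """Check the signature and expiry using only the token and the secret."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if claims.get("exp", float("inf")) < time.time():
        raise ValueError("token expired")
    return claims


token = sign({"sub": "alice", "exp": int(time.time()) + 60}, b"demo-secret")
print(verify(token, b"demo-secret")["sub"])  # alice
```

Notice that verify needs no session store: everything required to trust the request is in the token plus the shared secret, which is what makes horizontal scaling easy.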

Happy Coding and Keep Sharing!!

Wednesday, 18 January 2023

What is Spring Batch and Partitioning?

Spring Batch is a framework for batch processing in Spring. It provides a set of reusable functions for processing large volumes of data, such as reading and writing data from/to files or databases, and performing complex calculations and transformations on the data. 

Spring Batch provides several key features, including: 

  • Chunk-oriented processing: Spring Batch processes data in small chunks, which allows for efficient memory usage and the ability to process large volumes of data. 
  • Transactions: Spring Batch supports the use of transactions, which ensures that data is processed consistently and that any errors can be rolled back. 
  • Job and step abstractions: Spring Batch uses the concepts of jobs and steps to organize the batch processing logic. A job is a high-level abstraction that represents a complete batch process, while a step is a more specific task that is part of a job. 
  • Retry and skippable exception handling: Spring Batch provides built-in retry and skippable exception handling, which makes it easy to handle errors and recover from failures during batch processing. 
  • Parallel processing: Spring Batch allows for parallel processing of steps, which can improve the performance of batch processing. 
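  • Job scheduling: Spring Batch is not a scheduler itself, but it is designed to work with one. Jobs are typically triggered by a scheduler such as Quartz, cron, or Spring's `@Scheduled` support, using either a Cron-like expression or a fixed delay. 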
  • Extensibility: Spring Batch allows for custom code to be added to the framework by providing a set of callbacks and interfaces that can be implemented to perform custom logic. 
Spring Batch is typically used in situations where data needs to be processed in large volumes, where performance is critical, and where the data needs to be processed in a consistent and repeatable manner.
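The chunk-oriented processing described above can be sketched in plain Java. This is only an illustration of the read, process, write-per-chunk idea and does not use the Spring Batch API; `ChunkSketch` and `processInChunks` are hypothetical names, and the transaction that Spring Batch wraps around each chunk is omitted.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical illustration of chunk-oriented processing (not Spring Batch code).
class ChunkSketch {
    // Reads items one by one, processes each, and hands them to the writer one chunk at a time.
    // Returns the number of chunks that were written.
    static <I, O> int processInChunks(Iterator<I> reader,
                                      Function<I, O> processor,
                                      Consumer<List<O>> writer,
                                      int chunkSize) {
        int chunks = 0;
        List<O> buffer = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            buffer.add(processor.apply(reader.next()));
            if (buffer.size() == chunkSize) {
                writer.accept(new ArrayList<>(buffer)); // one write per full chunk
                buffer.clear();
                chunks++;
            }
        }
        if (!buffer.isEmpty()) {
            writer.accept(new ArrayList<>(buffer)); // final, possibly smaller chunk
            chunks++;
        }
        return chunks;
    }
}
```

Only one chunk is held in memory at a time, which is why this style copes with large inputs. In Spring Batch the chunk size corresponds to the commit interval of a step.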
Example of how to implement a Spring Batch job using a Spring Boot application:

  • First, add the Spring Batch and Spring Boot Starter dependencies to your pom.xml file:


  • Create a Spring Batch Job configuration class, where you define the steps that make up the job and how they are related:

  • Create a Spring Boot main class, where you can run the job using the Spring Batch JobLauncher:



    Partitioning is a feature of Spring Batch that divides a large amount of data into smaller, manageable partitions, which are then processed in parallel. This can improve the performance of batch processing by allowing multiple threads or machines to work on different parts of the data at the same time. 

    Partitioning works by having a manager step split the data into partitions, which are processed by different workers (local threads or remote machines). Each partition is processed independently, and the manager step aggregates the results once all partitions have finished. 

    Partitioning provides several key features, including: 
    • Data partitioning: The data is divided into smaller, manageable partitions, which can then be processed in parallel. 
    • Parallel processing: Partitions are processed in parallel, which can improve the performance of batch processing. 
    • Scalability: Batch processing can be scaled out by adding more worker threads or machines. 
    • Flexibility: Different partitioning strategies can be used depending on the specific requirements of the data and the batch process. 
    • Integration with Spring Batch: Partitioning is part of Spring Batch and works with its Job and Step abstractions. 
    Sample code that demonstrates how to implement partitioning in a Spring Batch application:
    • First, create a Job and a Step that uses the partitioning feature:

    • Next, create a `Partitioner` class that will be responsible for dividing the data into partitions. This class should implement the `org.springframework.batch.core.partition.support.Partitioner` interface:

    • Finally, create an ItemReader, ItemProcessor, and ItemWriter that will be used by each partition:

    Partitioning is useful in situations where a large amount of data needs to be processed in a short period of time, and where the batch process can be parallelized to improve performance. Not all scenarios can be parallelized, so the data and the process need to be analyzed properly before deciding to use partitioning.
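As a rough illustration of what a partitioning strategy computes, the plain-Java sketch below splits an ID range into contiguous partitions. It is not a Spring Batch `Partitioner` implementation: the class name, the `partitionN` keys, and the `{minId, maxId}` arrays are assumptions, chosen to mirror the kind of values a real `Partitioner` would place into each partition's `ExecutionContext`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical range-splitting logic, similar in spirit to a column-range partitioner.
class RangePartitionSketch {
    // Splits [minId, maxId] into at most gridSize contiguous ranges.
    // Each value is {startId, endId}, both inclusive.
    static Map<String, long[]> partition(long minId, long maxId, int gridSize) {
        Map<String, long[]> result = new LinkedHashMap<>();
        long total = maxId - minId + 1;
        if (total <= 0 || gridSize <= 0) {
            return result; // nothing to partition
        }
        long size = (total + gridSize - 1) / gridSize; // ceiling division
        long start = minId;
        int index = 0;
        while (start <= maxId) {
            long end = Math.min(start + size - 1, maxId);
            result.put("partition" + index, new long[] {start, end});
            start = end + 1;
            index++;
        }
        return result;
    }
}
```

Each worker would then read only the rows whose ID falls inside its own range, so no two workers process the same record.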

    Happy Coding and keep Sharing!!

    Sunday, 8 January 2023

    Microservice Architecture in .NET

    In my previous blog we discussed Microservice Architecture using Java. Today we are going to see how we can build a Microservice architecture using the .NET Core API, and which tools we can use for resilience, observability, fault tolerance, monitoring, rate limiting, and the other components that are necessary for any Microservice-driven architecture.

    Before we move forward, let's review some pros and cons of Microservice and Monolithic architectures.




    To start, let's build an example in which we create Catalog and Inventory APIs and build our Microservice architecture around them. The big question is: when should we use Microservices?

    • It's fine to start with a monolith and then move to Microservices.
    • We should start looking at Microservices when:
      • The code base is larger than what a small team can maintain.
      • A team can't move fast anymore.
      • Builds become too slow due to the large code base.
      • Time to market is compromised due to infrequent deployments and long testing times.
    • It's all about team autonomy.

    In our previous Microservice architecture in Java, we built Microservices that communicate synchronously. Today we are going to build asynchronous Microservices and understand their benefits.



    Synchronous Communication 

    • The client sends a request and waits for a response from the service, which is exactly what happens in our Microservice example in Java.
    • The client cannot proceed without the response.
    • The client thread may use a blocking or a non-blocking (callback-based) implementation.
    • REST over HTTP is the traditional approach.
    • Partial failures will happen: in a distributed system, whenever a service makes a synchronous request to another service, there is a risk of partial failure.
    • So we must design our services to be resilient:
      • Use timeouts for a more responsive experience and to ensure resources are never tied up indefinitely.
      • Implement the Circuit Breaker pattern to prevent our service from reaching resource exhaustion.
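The Circuit Breaker idea from the list above can be sketched in a few lines. The example is plain Java to stay consistent with the Java posts on this blog; it is not the Polly API and not Resilience4J, and the class name, the threshold, and the open duration are assumptions chosen for illustration.

```java
import java.util.function.Supplier;

// Hypothetical, minimal circuit breaker: opens after N consecutive failures,
// then rejects calls until openMillis have passed.
class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private int consecutiveFailures = 0;
    private long openedAt = -1; // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    // Runs the action unless the circuit is open; uses the fallback on rejection or failure.
    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get(); // fail fast, the remote service is not called
        }
        try {
            T result = action.get();
            consecutiveFailures = 0; // a success closes the circuit again
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = System.currentTimeMillis();
            }
            return fallback.get();
        }
    }

    private boolean isOpen() {
        if (openedAt < 0) {
            return false;
        }
        if (System.currentTimeMillis() - openedAt >= openMillis) {
            // Half-open: allow one trial call; a single failure re-opens the circuit.
            openedAt = -1;
            consecutiveFailures = failureThreshold - 1;
            return false;
        }
        return true;
    }
}
```

While the circuit is open the failing service is not called at all, which is what protects the caller from tying up threads and connections. A production library adds metrics, sliding windows, and proper concurrency handling that this sketch leaves out.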

    Asynchronous Communication

    • The client does not wait for a response in a timely manner
    • There might be no response at all
    • Usually involves the use of a lightweight message broker
    • Message broker has high availability 
    • Messages are sent to the broker and could be received by a single receiver or multiple receivers 
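The last point, one message reaching a single receiver or several receivers, can be illustrated with a tiny in-memory stand-in for a broker. This is plain Java for illustration only; it is not RabbitMQ and offers none of its durability or availability guarantees, and the class and topic names are assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical in-memory broker: every subscriber of a topic gets its own queue.
class InMemoryBroker {
    private final Map<String, List<BlockingQueue<String>>> topics = new ConcurrentHashMap<>();

    // Registers a new receiver and returns the queue it should consume from.
    BlockingQueue<String> subscribe(String topic) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        topics.computeIfAbsent(topic, t -> new CopyOnWriteArrayList<>()).add(queue);
        return queue;
    }

    // Delivers the message to every subscriber and returns immediately,
    // without waiting for any receiver to process it.
    int publish(String topic, String message) {
        List<BlockingQueue<String>> subscribers = topics.getOrDefault(topic, List.of());
        for (BlockingQueue<String> queue : subscribers) {
            queue.offer(message);
        }
        return subscribers.size();
    }
}
```

The publisher never blocks on a consumer, so a slow or temporarily unavailable receiver does not hold up the sender.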

    In the asynchronous Microservice that I built:

    • For resilience and transient-fault handling (Retry and Circuit Breaker), I have used the Polly .NET library.
    • For API Gateway, Caching, and Rate Limiting, I have used the Ocelot .NET library. 
    • For Monitoring, I have used Prometheus and Grafana.
    • RabbitMQ is used for messaging.
    • MongoDB is used to store the data.

    The Code is available on GitHub

    Happy coding and keep sharing!!

    Wednesday, 19 October 2022

    Best Time to Buy and Sell Stock with Cooldown

    Problem Statement:- You are given an array of prices where prices[i] is the price of a given stock on the ith day.

    Find the maximum profit you can achieve. You may complete as many transactions as you like (i.e., buy one and sell one share of the stock multiple times) with the following restrictions:

    • After you sell your stock, you cannot buy stock on the next day (i.e., cooldown one day).

    Note: You may not engage in multiple transactions simultaneously (i.e., you must sell the stock before you buy again).

     Example 1:  
     Input: prices = [1,2,3,0,2]  
     Output: 3  
     Explanation: transactions = [buy, sell, cooldown, buy, sell]  
     Example 2:  
     Input: prices = [1]  
     Output: 0  
    

    In example 1, we have an array of prices and the list of transactions. Before we move forward, let's go back to the main restriction: after we sell a stock, we cannot buy again on the next day. There must be at least a one-day gap (the cooldown day).

    Now let's go back to the example. We buy for 1 and sell for 2, which gives a profit of 1. Then we have a cooldown day, after which we buy for 0 and sell for 2, which gives a profit of 2.

    So the total output is 3. Let's see how we can solve this problem in linear time, O(n).



    The downside of the brute-force decision tree approach is its time complexity. The height of the tree is n, where n is the size of the prices array, and the number of decisions we can make at every step is 2, so the overall time complexity is O(2^n).

    Using a Dynamic Programming technique called caching (memoization), we can reduce the time complexity to O(n).

     State: Buying and selling  
     if buying => i+1  
     if selling => i+2 (remember we need to wait for the cooldown day so i+2)  
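The buying/selling state transitions above can be sketched as a memoized recursion. This is one possible Java implementation under the stated rules, not necessarily the code in the linked repo; the class and method names are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Memoized decision tree: at each day we either skip, or act (buy or sell) depending on the state.
class StockCooldown {
    static int maxProfit(int[] prices) {
        return dfs(prices, 0, true, new HashMap<>());
    }

    private static int dfs(int[] prices, int i, boolean buying, Map<Integer, Integer> memo) {
        if (i >= prices.length) {
            return 0; // no days left
        }
        int key = i * 2 + (buying ? 1 : 0); // unique key per (day, state) pair
        Integer cached = memo.get(key);
        if (cached != null) {
            return cached;
        }
        int skip = dfs(prices, i + 1, buying, memo); // do nothing today
        int act;
        if (buying) {
            act = dfs(prices, i + 1, false, memo) - prices[i]; // buy today, next state is selling
        } else {
            act = dfs(prices, i + 2, true, memo) + prices[i];  // sell today, skip the cooldown day
        }
        int best = Math.max(skip, act);
        memo.put(key, best);
        return best;
    }
}
```

There are only 2n distinct (day, state) pairs and each is computed once, which gives O(n) time and O(n) extra space. For very long price arrays an iterative version would avoid deep recursion.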
    





    Happy Coding and Keep Sharing !!   Code Repo

    Tuesday, 18 October 2022

    Microservice Architecture in Java

    Microservice Architecture enables large teams to build scalable applications that are composed of multiple small loosely coupled services. In Microservice each service handles a dedicated function inside a large-scale application.

    Challenges that we all see when designing Microservice Architecture are "Right-Sizing and Identifying the limitations and Boundaries of the Services".

    Some of the most commonly used approaches in the industry:-

    • Domain Driven:- In this approach, we need good domain knowledge, and it takes a lot of time to reach close alignment with all the business stakeholders to identify the needs and requirements for developing Microservices around business capabilities.  
    • Event Storming Sizing:- We conduct a session with all business stakeholders, identify the various events in the system, and group them into domains based on that.

    The Microservice Architecture below is for a bank, where we have Loan, Card, Account, and Customer Microservices, along with the other services required for a successful implementation of a Microservice Architecture. 


    Let's look at the most critical components that are required for Microservice Architecture Implementation. 

    The API Gateway handles all incoming requests and routes to the relevant microservices.  The API gateway depends on the Identity Provider service to handle the authentication.

    To locate the service to route an incoming request to, the API Gateway consults a service registry and discovery service. All Microservices register with the Service Registry and discover the location of other Microservices using the Discovery service. 
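The register-then-look-up flow can be pictured with a small in-memory registry. This plain-Java sketch is not Eureka and uses none of its API; the class name, the service name, and the URLs are assumptions, and real registries also handle heartbeats and the eviction of dead instances.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory service registry with round-robin lookup.
class ServiceRegistrySketch {
    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();
    private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    // Called by each service instance when it starts up.
    void register(String serviceName, String url) {
        instances.computeIfAbsent(serviceName, k -> new CopyOnWriteArrayList<>()).add(url);
    }

    // Called by the gateway or another service to find an instance to call.
    Optional<String> resolve(String serviceName) {
        List<String> urls = instances.get(serviceName);
        if (urls == null || urls.isEmpty()) {
            return Optional.empty();
        }
        int next = counters.computeIfAbsent(serviceName, k -> new AtomicInteger()).getAndIncrement();
        return Optional.of(urls.get(Math.floorMod(next, urls.size()))); // rotate across instances
    }
}
```

Because callers ask the registry by name, new instances can be added without changing any client configuration, and the round-robin lookup spreads requests across them.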

    Let's take a look at the components in detail for a Successful Microservice Architecture and why they are required.
    1. Handle Routing Requirements (API Gateway):- Spring Cloud Gateway is a library for building an API gateway. It sits between a requester and a resource, where it intercepts and analyzes the request. It is also the API gateway preferred by the Spring Cloud team. It has the following advantages:- 
      1. Built on Spring 5, Reactor, and Spring WebFlux.
      2. It also integrates with circuit breakers and with service discovery through Eureka.  
    2. Configuration Service:- We can't hard-code the config details inside the service, and in a DTAP (Development, Test, Acceptance, Production) setup it would be a nightmare to manage all config in the application properties and to keep it up to date when a new service joins. So in a Microservice architecture we have a config service that loads the configuration from a Git repo, file system, or database and injects it into the Microservices while they are starting up. Since we are talking about Java, I have used Spring Cloud Config for configuration management.
    3. Service Registry and Discovery:- In a Microservice Architecture, how do services locate each other inside a network, how do we tell our application architecture when a new service is onboarded or a new node is added for an existing service, and how will load balancing work? This all looks very complicated, but we have the Spring Cloud Discovery Service using the Eureka agent. Some advantages of using service discovery: 
      1. No Limitation on Availability 
      2. Peer to Peer communication between service Discovery agent
      3. Dynamically Managed IPs, Configurations, and Load Balance.
      4. Fault-tolerance and Resilience 
    4. Resilience Inside Microservices:- Here we make sure that we handle service failures gracefully, avoid cascading effects when one of the services fails, and have self-healing capabilities. For resilience, the Spring ecosystem supports Resilience4J, a lightweight and easy-to-use fault tolerance library inspired by Netflix Hystrix. Before Resilience4J, Netflix Hystrix was most commonly used for resiliency, but it is now in maintenance mode. Resilience4J offers the following patterns for increasing fault tolerance. 
      1. Circuit Breaking:- Stops making requests to a service while it is failing.
      2. Fallback:- Provides an alternative path when a service fails.
      3. Retry:- Retries a call when a service has failed temporarily.
      4. Rate Limit:- Limits the number of calls a service gets in a given period of time.
      5. Bulkhead:- Limits the number of concurrent calls to a service to avoid overloading it.
    5. Distributed Tracing and Logging:- To debug a problem in a microservice architecture we need to aggregate all the logs and traces and monitor the chain of service calls. For that we have Spring Cloud Sleuth and Zipkin.
      1. Sleuth:- Provides auto-configuration for distributed tracing. It adds trace and span IDs to all the logs by instrumenting other Spring components, so that a correlation ID is passed along through all the calls in the system.
      2. Zipkin:- Is used to visualize the trace data. 
    6. Monitoring:- Is used to monitor service metrics and health checks and to create alerts based on them. Let's see the most commonly used tools.
      1. Actuator:- Is mainly used to expose operational information such as health, dump, info, and memory.
      2. Micrometer:- Exposes Actuator data in a format that the monitoring system understands. All we need to do is add the vendor-specific Micrometer dependency to the service.
      3. Prometheus:- A time-series database that stores metric data and also has basic data-visualization capability.
      4. Grafana:- Pulls data from various data sources such as Prometheus, offers a rich UI for creating custom dashboards, and allows us to set rule-based alerts and notifications. 

    We have covered all the relevant components for a successful Microservice Architecture. I built the Microservices using the Spring Framework and all the above components. Code Repo

    Happy Coding and Keep Sharing!!
     

    Wednesday, 12 October 2022

    Deploy Spring Boot API Docker Image to GCP Kubernetes Engine

    In the previous blog, we built a demo Spring Boot API and deployed it to Docker Hub using GitHub Actions. In this blog, we will deploy that same Docker image to Kubernetes.  A quick recap [read].

    In order to deploy the Docker image to Google Cloud, we need a Google Cloud account, so sign up for the Free Trial if you don't have one already. It will first show you the billing page, and after that it will redirect you to the landing page. Here we first need to create a project, because everything we do in Google Cloud happens inside a project, and billing is also generated per project.


    Here you can see all the billing-related information based on your use, after that, we need to go to the services section and click on the left burger menu and select Kubernetes Engine - > Cluster.



    Here we first need to create a Cluster because then only we would be able to deploy anything. I have selected the Self-Managed Cluster option,  you can select the same or the recommended one which is then managed by Google.


    Here we need to enter the Cluster name followed by the Location Type. The rest of the settings can be left as default. Click on the Create button, which starts the process of creating the cluster. It will take 1-2 minutes.

    So, the Cluster is created successfully with 12GB of Total Memory, and 6 CPUs which should be sufficient for our demo application to run.

    The next step is we need to create our deployment file.

     apiVersion: apps/v1
     kind: Deployment
     metadata:
       name: spring-docker-k8s-deployment
     spec:
       replicas: 2
       selector:
         matchLabels:
           app: spring-docker-k8s
       template:
         metadata:
           labels:
             app: spring-docker-k8s
         spec:
           containers:
             - name: spring-docker-k8s
               image: hemkant/github-actions
               ports:
                 - containerPort: 5678
    
    In this deployment file, I am using the same Docker image that we pushed to Docker Hub, with two replicas.

    Next, we need to execute this deployment file and for that, we can use Google Cloud shell. 


     
    Go to Cluster and click on three dots and Connect, this will open the shell prompt in the browser for us to run kubectl commands, after that we need to run the command to authenticate with GC.



    After that, we should be able to upload the deployment file which we created.


    Once your file is uploaded you can run ls command to check, and you should see the file in the directory.


    Next, we need to run "kubectl apply -f <filename.yaml>"


    This command will create the Pods inside the cluster which we created, from the menu go to Workloads.



    Here we can see the deployment is done and the status is ok with 2/2 Pods. Next, we need to expose the traffic on a specific port which is 8080 for our application.



    After a couple of mins, you can go to the Service & Ingress menu to get the external endpoint to access this application from the public domain. 

    That's it we have successfully deployed our Spring Boot API docker image to Google Cloud Kubernetes Engine. Deployment YAML


    Happy Coding and Keep Sharing!!


    Tuesday, 11 October 2022

    SpringBoot API with GitHub Actions, Docker Deployment

    Today, we are going to explore one of the most important aspects of the SDLC: Continuous Integration & Continuous Deployment, aka CI/CD. There are many tools available in the market (Jenkins, Bamboo, etc.) which we can use to build, test, and deploy changes to servers.



    In the above diagram, the entire CI/CD is taken care of by Jenkins, which is a 3rd party tool. In the real world, this requires additional resources (infrastructure) and a team to manage it.

    Since we are already using GitHub, is there a way to reduce this overhead? Yes, we can use GitHub Actions, where the entire CI/CD runs on the same platform. We have all seen this option in GitHub, but we rarely go there.



    To understand it better, let's build a sample Spring Boot application --> Push the code in GitHub -->Trigger Github Actions --> Docker hub.

    First, we need to create a repository in GitHub and then go to the Actions tab and click new Workflow options, here we will get many workflow options that we want to integrate with our application, but for this demo, we need to select " Java with Maven".


    After you click on configure it will create a maven.yml file which you need to merge with your code, but before that, we need to update the yml to support our application build.


    And yes, that's it. Whenever we merge code into the master branch, the GitHub Actions workflow will trigger and build the code. But what we also want is that, after the build, the latest changes are pushed to a container registry. I am using Docker Hub here, but you can use any other.

    In order to push the changes to the docker, we first need to create a repository in the docker hub and after that, we need to tell our maven.yml file about this new step.  

     # This workflow will build a Java project with Maven, and cache/restore any dependencies to improve the workflow execution time
     # For more information see: https://help.github.com/actions/language-and-framework-guides/building-and-testing-java-with-maven
     name: Java CI with Maven
     on:
       push:
         branches: [ "master" ]
       pull_request:
         branches: [ "master" ]
     jobs:
       build:
         runs-on: ubuntu-latest
         steps:
           - uses: actions/checkout@v3
           - name: Set up JDK 17
             uses: actions/setup-java@v3
             with:
               java-version: '17'
               distribution: 'temurin'
               cache: maven
           - name: Build with Maven
             run: mvn clean install
           - name: Build & Push Docker Image
             uses: mr-smithers-excellent/docker-build-push@v5
             with:
               image: hemkant/github-actions
               tags: latest
               registry: docker.io
               dockerfile: Dockerfile
               username: ${{ secrets.DOCKER_USERNAME }}
               password: ${{ secrets.DOCKER_PASSWORD }}
    
    I have used another action here, mr-smithers-excellent/docker-build-push, which performs the Docker build and push operations. The credentials for accessing Docker Hub are stored in GitHub secrets.


     
    After all of these let's commit some code and see, how all these work together. In the below screenshot, we can see all the workflow triggers whenever I committed the code.



    and let's also see if the steps we mentioned in our maven.yml file are followed or not, for that we can click on any item to check and it will show us all the details.


    The Dockerfile which I used here is:
     FROM openjdk:17  
     EXPOSE 8080  
     ADD target/github-actions.jar github-actions.jar  
     ENTRYPOINT ["java", "-jar", "/github-actions.jar"]  
    

    let's check the Build & Push Docker Image step. looks like everything is fine here, and image is pushed to Docker Hub

    The last thing we should also check is Docker Hub, looks good the image is pushed successfully.
     



    We have covered all the points which we discussed at the beginning of this blog. Code Repo.

    In the next blog, We will deploy the same image on the Google Cloud Platform, Kubernetes. 

    Happy Coding and Keep Sharing!!

    Sunday, 9 October 2022

    Spring Boot API with MongoDB Atlas

    In this blog, we are going to explore a Spring Boot application with MongoDB Atlas. For that, we first need to create an account at https://cloud.mongodb.com and configure some default settings in order to spin up a new free cluster. We also get the option to select the cloud provider and region.

     


    After creating the cluster and configuring credentials, we are all set. As you can see, I have created a new cluster in the AWS cloud with a collection called "task". It is a shared cluster, so it is free and no charges apply. 


    Next, let's initialize the Spring Boot application for that, we can go to https://start.spring.io/  or if you have a Spring Boot plugin in your IDE you can use that as well.



    In the Spring initializer, I have added four dependencies. 
    1. Spring Web:- For building RESTful APIs.
    2. Spring Data MongoDB: -  It is a part of the Spring Data project, which provides integration with the MongoDB document database.
    3. Spring Boot Actuator:- it is a sub-project of Spring Boot and is used for monitoring purposes.
    4. Lombok:- For reducing boilerplate code (getters, setters, constructors) through Java annotations. 
    After this, We can generate the code and open it in your favorite IDE. I have written code for CRUD operations to test the API with MongoDB Atlas.

    Let's see 1-2 examples.
    • Create some content in MongoDB Atlas using the Spring Boot API:

    • Let's invoke the GET by ID endpoint

    • GET ALL endpoint

    • Rest of the operations you can check in the code but one more thing I wanted to show here is how the data is stored in the MongoDB Atlas, for that we can go to the cluster and click on Browse Collection 

    There are other things that we can explore such as Realtime Monitoring, Enabling Data API which is available in MongoDB Atlas.


      
     That's it in this blog. Code Repo.


    Happy Coding and Keep Sharing!!

    Wednesday, 5 October 2022

    Data Structures and Algorithms - Problem - 3 Coin Change challenge 2

    In the previous coin challenge, we solved the problem where we have an integer array of coins of different denominations and an integer amount representing an amount of money. We have an infinite number of each coin, the order of coins doesn't matter, and we need to return the fewest number of coins needed to make up that amount.

    In this blog, we are going to look at another famous coin challenge problem.

    Problem Statement:- We are given an integer array of coins representing coins of different denominations and an integer amount representing a total amount of money. 

    Return the number of combinations that make up that amount. If that amount of money cannot be made up by any combination of the coins, return 0. 

     Example 1:  
     Input: coins = [5,10]  and  amount = 8
     Output: 0 
    

    In the above example, the Output is 0 because there is no way we can create any combination from the given array to sum 8.

     Example 2:  
     Input: coins = [1,5,10] and  amount = 8
     Output: 2
    

    In example 2, the output is 2 because we can sum 8 in 2 ways from the given array.

     Option 1 :- 1+1+1+1+1+1+1+1 = 8  
     Option 2 :- 5+1+1+1 = 8  
    

    Let's try to solve this in code using a Dynamic Programming approach.
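A bottom-up Dynamic Programming solution could look like the sketch below. It is one common way to count combinations and is not necessarily identical to the code in the linked repo; the class and method names are assumptions, and all coin values are assumed to be positive.

```java
// Counts the combinations (order does not matter) of coins that sum to the amount.
class CoinChangeCombinations {
    static int change(int amount, int[] coins) {
        // ways[a] = number of combinations that make up amount a with the coins seen so far
        int[] ways = new int[amount + 1];
        ways[0] = 1; // one way to make 0: use no coins
        for (int coin : coins) {
            // Iterating coins in the outer loop counts each combination once,
            // regardless of the order in which the coins would be picked.
            for (int a = coin; a <= amount; a++) {
                ways[a] += ways[a - coin];
            }
        }
        return ways[amount];
    }
}
```

The time complexity is O(n * amount), where n is the number of coins, and the space complexity is O(amount). Swapping the two loops would count ordered sequences instead of combinations, which is a different problem.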


    Happy Coding and Keep Sharing!! Code Repo