From Shipping Containers to Software: Evolution of Docker.


A short history of containers...

Chroot: Introduced in 1979

No, it's not Docker that first introduced containers. The idea goes back to 1979, when chroot was introduced.

  1. The chroot system call and command were introduced in Version 7 Unix, developed at AT&T Bell Laboratories and released in 1979.

  2. Chroot, short for "change root," changes the apparent root directory for the current running process and its children. This creates a kind of restricted environment, often called a "chroot jail".

  3. Here's a breakdown of how chroot works:

  • Apparent Root Directory: The directory that the system treats as the root of the filesystem. Normally, this is the actual root directory (/).

  • Chroot Jail: An environment created by chroot where processes can only access files and directories within the designated root directory and its subdirectories. They are essentially cut off from the rest of the system.
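
To make this concrete, here's a minimal, hedged sketch of building a chroot jail by hand on Linux. It must be run as root, the paths are illustrative, and the exact shared libraries to copy vary by distro:

# Create a tiny root filesystem containing just a shell
mkdir -p /tmp/jail/bin
cp /bin/sh /tmp/jail/bin/
# On glibc systems, also copy the libraries the shell needs.
# List them with: ldd /bin/sh
chroot /tmp/jail /bin/sh
# Inside, / now means /tmp/jail -- the rest of the host filesystem is invisible.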

FreeBSD Jails: Introduced in 1999


  • FreeBSD: A free and open-source operating system, developed and maintained by the FreeBSD Project, a non-profit community of developers.

  • It introduced jails: lightweight virtualisation based on process isolation and resource controls.

  • Jails share the kernel with the host system but have their own user space (processes, filesystem, network configuration).
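
As a hedged sketch of what that looks like in practice, here's how a throwaway jail might be started on FreeBSD, assuming a root filesystem has already been unpacked at /jails/demo (the name, hostname, and IP are illustrative):

# Start a jail with its own hostname and IP, running a shell (run as root)
jail -c name=demo path=/jails/demo host.hostname=demo.local ip4.addr=192.0.2.10 command=/bin/sh
# From the host, list running jails
jls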

Solaris Zones: Introduced in 2004


  • Originally called Solaris Containers. Solaris was developed by Sun Microsystems, a computer hardware and software company that was acquired by Oracle Corporation in 2010.

  • A zone can operate in two modes:

    • Container mode: similar to jails, with process and resource isolation but a shared kernel.

    • Whole-system mode: behaves like a virtual machine with a separate kernel, providing more complete isolation.
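
For flavour, here's a hedged sketch of the Solaris zone workflow; the zone name and path are illustrative, and exact steps vary by Solaris release:

# Define, install, boot, and enter a zone (run as root)
zonecfg -z demo 'create; set zonepath=/zones/demo; verify; commit'
zoneadm -z demo install
zoneadm -z demo boot
zlogin demo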

OpenVZ: Introduced in 2005


  • OpenVZ was created by SWsoft, a software company specialising in server automation and virtualisation solutions, roughly around 2001.

  • SWsoft initially developed OpenVZ as a commercially available product named Virtuozzo. Later, in 2005, they open-sourced the core technology behind Virtuozzo as the OpenVZ Project, making it freely available for use and development by the community.

  • Virtualization Approach:

    • Traditional virtualization (e.g., VMware, VirtualBox) relies on emulating hardware. Each virtual machine (VM) has its own virtual CPU, memory, and other resources. This offers strong isolation but can be resource-intensive.

    • OpenVZ uses a different approach called container-based virtualisation. It virtualises the operating system environment rather than emulating hardware. Multiple isolated virtual private servers (VPS) share the host kernel but have their own user space (processes, filesystems, configurations). This is more lightweight and efficient than traditional virtualisation. (A short vzctl sketch follows this list.)

  • Issues with it:

    • Resource Management:

      • While OpenVZ is efficient, resource allocation can be complex. CPU, memory, and other resources are carved out of the host system's pool, which can lead to noisy neighbors if one VPS consumes a large amount of resources, impacting others.
    • Limited Flexibility:

      • Since all VPSes share the same kernel version, you're limited by the kernel version available on the host system. This can restrict the ability to run applications that require a specific kernel version.
    • Limited Scalability:

      • Live migration of VPSes can be more challenging due to the shared kernel. Scaling resources up or down might require downtime for reconfiguration.
    • Limited Support:

      • Development activity on OpenVZ has slowed down in recent years. While still functional, there might be fewer resources and support available compared to more actively developed containerization technologies like Docker or Kubernetes.
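
Here's the vzctl sketch promised above; it assumes an OpenVZ host with an OS template already downloaded, and the template name, container ID, and IP are all illustrative:

# Create and start container 101 from an OS template (run as root)
vzctl create 101 --ostemplate centos-7-x86_64
vzctl set 101 --ipadd 192.0.2.101 --save
vzctl start 101
# The container sees only its own processes
vzctl exec 101 ps aux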

Meanwhile ....

Google's journey with containerisation started well before Docker and Kubernetes became household names. Here's a timeline to understand their involvement:


Borg: Introduced circa 2003-2004

Mid-2000s:

  • Google engineers began working on solutions for efficient resource management and application deployment at scale within their data centres. This involved containerisation concepts.

  • One key contribution around this time was the development of cgroups (control groups). This technology, now a core part of the Linux kernel, allows for isolating and limiting resource usage (CPU, memory, etc.) for processes. This became a foundational element for containerization.
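
To get a feel for cgroups, here's a hedged sketch using the modern cgroup v2 interface; it assumes a Linux host with cgroup v2 mounted at /sys/fs/cgroup, the cpu controller enabled, and root privileges:

# Create a group and cap it at roughly half a CPU
mkdir /sys/fs/cgroup/demo
echo "50000 100000" > /sys/fs/cgroup/demo/cpu.max   # 50ms quota per 100ms period
# Move the current shell into the group; its children inherit the cap
echo $$ > /sys/fs/cgroup/demo/cgroup.procs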

2003-2004 (estimated):

  • Google's internal cluster management system, Borg, is believed to have been introduced around this timeframe. While not publicly announced, Borg served as a cornerstone for container orchestration at Google's massive scale. It likely employed containerisation techniques for managing workloads.

2008-2013:

  • dotCloud, the company that would later create Docker, uses Linux Containers (LXC) for its platform-as-a-service (PaaS) offering.

2013:

  • This year marks a turning point. Google publicly acknowledged their use of containerisation through two events:

    • Omega Paper: A research paper by Google engineers described Omega, Google's next-generation cluster management and scheduling system, built on containerisation principles.

    • Borg Presentation: At a conference, Google engineers shed light on their internal Borg system, hinting at its container-based approach to workload management.

2014 (Summer):

  • Recognising the potential of containerisation beyond Google's internal needs, Google engineers played a pivotal role in the development of Kubernetes. This open-source project aimed to provide a more accessible and user-friendly platform for container orchestration, building upon concepts pioneered in Borg.

So What is Docker?


Earlier, the prevailing practice was to utilize a single server for running a solitary application. However, VMware addressed this limitation by introducing virtual machines, enabling servers to accommodate multiple operating systems and consequently run multiple applications.

Despite this advancement, a drawback persisted: virtual machines necessitated their own operating systems, thereby requiring dedicated allocations of resources such as RAM and storage.

The question arose: could multiple isolated applications run on the same operating system within a server? This query found resolution with the introduction of "containers."

Containers operate by leveraging virtualisation at the operating system level, as opposed to the hardware level utilised by traditional virtual machines.

Docker serves two primary functions:

  1. Containerization: Docker provides tools and a platform for creating container images. Developers use Docker to package their applications along with their dependencies into containers. This process is often referred to as containerization or Dockerizing an application.

  2. Container Engine: Docker includes a container runtime component, which is responsible for running and managing containers on a host system. This runtime component, known as containerd, is an essential part of the Docker platform. containerd provides the necessary runtime environment for executing containers, including starting, stopping, and managing the container lifecycle. (A small hands-on aside follows.)
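
As a hedged aside, containerd ships with a low-level CLI called ctr, which lets you drive the runtime directly, bypassing the Docker CLI entirely (assumes containerd is running; run as root; the container ID "demo" is illustrative):

# Pull an image and run a container straight through containerd
ctr images pull docker.io/library/hello-world:latest
ctr run --rm docker.io/library/hello-world:latest demo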

Now, let's delve into how containers function

  1. Shared Kernel: In a containerized environment, multiple containers run on the same underlying host operating system. Each container shares the host OS kernel, which is the core component responsible for managing hardware resources and providing essential services to applications.

  2. Containerization: Containers are encapsulated environments that package an application along with its runtime dependencies, libraries, and configuration files. This packaging creates a lightweight, portable, and isolated unit that can be easily moved and deployed across different environments.

  3. Container Engine: To manage containers, you use a container engine such as Docker Engine. The container engine interacts with the host OS kernel to create, start, stop, and manage containers. It provides tools and APIs for building, running, and orchestrating containers at scale. Container engines manage resource utilisation through a combination of kernel features, such as control groups (cgroups) and namespaces, and internal management mechanisms within the container engine itself. Here's how it typically works (a small namespace demo follows this list):

    1. Control Groups (cgroups):
      Control groups are a kernel feature that allows the container engine to allocate and control the usage of system resources such as CPU, memory, disk I/O, and network bandwidth among containers and processes. Container runtimes use cgroups to set resource limits and constraints for each container, ensuring fair sharing of resources and preventing individual containers from consuming excessive resources.
    2. Namespaces:
      Namespaces provide isolation for processes and resources within a container. Each container has its own set of namespaces, including the PID namespace (for process isolation), the network namespace (for network isolation), the mount namespace (for filesystem isolation), and others. Namespaces allow containers to have their own view of the system, separate from other containers and the host system.
    3. Resource Limits and Constraints:
      Container engines allow users to specify resource limits and constraints for each container, including CPU shares, memory limits, disk quotas, and network bandwidth limits. These limits are enforced by the container engine using cgroups, ensuring that containers do not exceed their allocated resources and that system resources are fairly distributed among containers.
    4. Internal Management Mechanisms:
      Container engines implement internal management mechanisms to monitor resource utilization and enforce resource limits. These mechanisms may include resource monitoring daemons, resource schedulers, and controllers that periodically check container resource usage, adjust resource allocations, and enforce resource limits as needed.
  4. Resource Isolation: While containers share the same kernel, they are isolated at the user space level. Each container has its own filesystem, network interfaces, and process tree, which provides a level of isolation similar to that of virtual machines. This isolation ensures that containers do not interfere with each other and enables multiple containers to run concurrently on the same host.

  5. Resource Management: Containers can be allocated CPU, memory, and other resources by the container engine. Resource allocation and management are handled by the host operating system's kernel, which ensures that containers receive their fair share of resources without impacting the performance of other containers or the host system.

  6. Security: Containers employ various security mechanisms to ensure isolation and protect the host system. These mechanisms include namespaces, which provide process and network isolation, and control groups (cgroups), which enforce resource limits and quotas. Additionally, container images can be scanned for vulnerabilities, and runtime security features can be applied to further enhance security.
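
As promised above, here's a small hedged demo of namespaces using the unshare tool from util-linux on Linux; it gives a shell its own PID, mount, and UTS namespaces, much as a container engine would:

# Enter fresh PID, mount, and UTS namespaces with a new /proc
sudo unshare --pid --fork --mount-proc --uts --mount /bin/bash
hostname inside-ns   # renames the host only within this UTS namespace
ps aux               # this bash is PID 1; other host processes are invisible
exit                 # back outside, the host hostname is untouched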

Dockerfile vs Docker Image vs Docker Container.

  • Dockerfile: Think of a Dockerfile like a LEGO instruction booklet. It tells you exactly how to build something using LEGO bricks. But instead of LEGO bricks, it tells your computer how to build a special package called a Docker image.

  • Docker Image: Now, imagine the Docker image as a fully built LEGO set. It's like a spaceship or a castle you've built following the instructions in the booklet (the Dockerfile). The image has everything it needs inside, like the toy's pieces, tools, and even snacks.

  • Docker Container: Finally, think of a Docker container like a playtime with your LEGO creation. You take your built spaceship or castle out of the box (the Docker image), and now it's ready to play with! It's like bringing your LEGO creation to life and having fun with it. And when playtime is over, you put it back in the box (stop the container), but the LEGO set (the Docker image) remains ready for the next adventure.

When you "open" or run a Docker image, it creates a Docker container. So, while the image is the package that contains all the necessary stuff, the container is where the program or app actually runs and does its job.

The Docker daemon, often referred to as dockerd, is a background process that manages Docker containers, images, volumes, networks, and other Docker objects on a host system. It serves as the central component of the Docker engine, responsible for handling Docker API requests, managing container lifecycles, and interacting with the host operating system to create and manage isolated environments known as containers.
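
You can see this client/daemon split for yourself; with Docker Desktop running, the output shows a Client section (the CLI) and a Server section (the engine behind it):

docker version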

What is a Runtime?

In simple terms, a "runtime" is the environment where your software runs and gets executed. It's like the stage where a play is performed. With respect to containers, a "container runtime" is a software component responsible for running containers. It provides the necessary environment for the containerised application to execute, including managing resources, networking, and storage.


Docker, upon entering the scene, quickly became one of the most recognized containerisation tools. Subsequently, Kubernetes emerged as a platform specifically designed to orchestrate Docker containers. Initially, these two tools were tightly integrated, offering limited support for other containerisation solutions.

However, as Kubernetes gained popularity, the demand arose for compatibility with alternative containerisation tools. In response, Kubernetes introduced the Container Runtime Interface (CRI), allowing integration with various container runtimes, provided they adhere to OCI standards.

Wait, wait, wait... what are OCI and CRI?

  1. OCI (Open Container Initiative): OCI is a set of specifications for container runtimes and image formats. It defines standards for containers and container images, ensuring interoperability between different container runtime implementations and image formats. OCI specifications ensure that containers built and run with one tool can be executed with another, promoting portability and flexibility in the container ecosystem. Kubernetes uses OCI-compliant container runtimes to execute containers within pods. The OCI maintains three key specifications (a short runc sketch follows this list):

    1. OCI Image Specification:
      Defines the format for container images, including layers, manifests, and metadata such as annotations.
    2. OCI Runtime Specification:
      Defines how a container's runtime environment is configured and executed from an unpacked filesystem bundle.
    3. OCI Distribution Specification:
      Defines the HTTP API that registries expose for pushing, fetching, and storing container images.
  2. CRI (Container Runtime Interface): CRI is an interface between Kubernetes and container runtimes. It standardises how Kubernetes interacts with container runtimes, allowing Kubernetes to support multiple container runtimes seamlessly. CRI defines a set of gRPC APIs that Kubernetes components use to communicate with container runtimes. By decoupling Kubernetes from specific container runtimes, CRI enables users to choose the container runtime that best fits their requirements.
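
Here's the runc sketch mentioned above. runc is the reference implementation of the OCI Runtime Specification, and this hedged example borrows a root filesystem from Docker to run a container with no Docker engine involved (run as root; busybox and the bundle path are illustrative):

# Assemble an OCI bundle: a rootfs plus a config.json
mkdir -p bundle/rootfs && cd bundle
docker export "$(docker create busybox)" | tar -C rootfs -xf -
runc spec            # writes a default OCI config.json
runc run demo        # starts a container straight from the OCI bundle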

Continuing: once CRI was introduced, Docker was the one runtime that did not implement the CRI standard, yet Kubernetes still had to support it. So Kubernetes introduced dockershim, a component within Kubernetes responsible for integrating the Docker container runtime with the Kubernetes Container Runtime Interface (CRI). dockershim served as a bridge between Kubernetes and Docker, allowing Kubernetes to manage Docker containers through the CRI.

When Docker was first introduced, it provided a complete platform for building, packaging, distributing, and running containerised applications. Docker later split this platform apart, extracting containerd as the component that handles fundamental container operations such as creating, running, pausing, resuming, and deleting containers.


Initially, containerd did not implement the CRI, but as Kubernetes grew, Docker Inc. made changes, and containerd was adapted to support the CRI standard. This adaptation involved the creation of a CRI plugin for containerd, which allowed Kubernetes to communicate with containerd using the standardised CRI gRPC APIs.

  1. Donation of containerd to CNCF:
    In 2017, Docker, Inc. donated containerd to the CNCF, a vendor-neutral organisation that hosts various open-source projects related to cloud-native computing. This move aimed to foster broader collaboration and ensure the long-term sustainability of containerd as a critical infrastructure component for container orchestration platforms.

Maintaining dockershim separately was a big task, so it was decided to drop dockershim support in Kubernetes v1.24.

Let's start with some hands-on...

I'd recommend downloading Docker Desktop for your OS. Once done, start the application, open your terminal, and type...

docker run hello-world

The docker run command is used to create and start a new Docker container based on a specified Docker image. Let's break down the docker run hello-world command:

  • docker: This is the Docker CLI (Command Line Interface) command that allows you to interact with Docker.

  • run: This is a sub-command of the docker command used to create and start a new container.

  • hello-world: This is the name of the Docker image that you want to use to create the container. In this case, hello-world is a special Docker image provided by Docker that is commonly used for testing and demonstration purposes. It's a very lightweight image that simply prints a "Hello from Docker!" message to your terminal when run.

When you execute docker run hello-world, Docker does the following:

akshatsharma@Akshats-MacBook-Air ~ % docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
478afc919002: Pull complete 
Digest: sha256:a26bff933ddc26d5cdf7faa98b4ae1e3ec20c4985e6f87ac0973052224d24302
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (arm64v8)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/

Now Let's understand this..

  1. akshatsharma@Akshats-MacBook-Air ~ % docker run hello-world: This is the command-line instruction to create and run a container from the "hello-world" image.

  2. Unable to find image 'hello-world:latest' locally: Docker first checks if the image "hello-world" with the tag "latest" is available on your local machine. Since it's not found locally, Docker proceeds to download it from Docker Hub.

  3. latest: Pulling from library/hello-world: Docker informs you that it's pulling the "hello-world" image from the Docker Hub repository named "library/hello-world".

  4. 478afc919002: Pull complete: This line indicates that the pulling process of the image layer with the ID "478afc919002" is complete.

  5. Digest: sha256:a26bff933ddc26d5cdf7faa98b4ae1e3ec20c4985e6f87ac0973052224d24302: This is the digest of the pulled image. It's a unique identifier for the image, generated based on its content. This ensures the integrity of the image.

  6. Status: Downloaded newer image for hello-world:latest: Docker notifies you that it has successfully downloaded the "hello-world" image with the "latest" tag.

  7. Hello from Docker!: This message confirms that Docker is installed and functioning correctly.

  8. This message shows that your installation appears to be working correctly.: Further reinforcement that Docker is installed correctly and functioning.

  9. The subsequent lines (1. to 4.) provide a breakdown of the steps Docker took to generate the "Hello from Docker!" message. It outlines the process from the Docker client contacting the Docker daemon to the creation of a container running the "hello-world" image.

  10. To try something more ambitious, you can run an Ubuntu container with...: This line suggests running an Ubuntu container with an interactive shell, demonstrating the flexibility of Docker for running different environments.

  11. $ docker run -it ubuntu bash: This is a command to run an Ubuntu container interactively with the Bash shell.

  12. Share images, automate workflows, and more with a free Docker ID:: Docker encourages you to create a Docker ID to access additional features like image sharing and workflow automation on Docker Hub.

  13. https://hub.docker.com/: Provides the URL to Docker Hub, where you can create your Docker ID and explore more Docker features.

  14. For more examples and ideas, visit:: This line invites you to explore further documentation and examples on getting started with Docker.

  15. https://docs.docker.com/get-started/: Provides the URL to the Docker documentation for more examples and ideas on getting started with Docker.

A few important facts :)

  1. Layered Image Architecture during Image Pull: When you pull a Docker image from a registry, such as Docker Hub, the image arrives as a set of layers. These layers are the building blocks of the image. When you pull an image, Docker retrieves these layers individually and assembles them on your local machine to form the complete image. (A docker history sketch follows this list.)

    If you already have some of the layers cached locally, Docker will only need to download the layers that are new or have been updated since your last pull. This optimization makes pulling images faster, especially when multiple images share common layers.

    For instance, if you pull an updated version of an image that shares many layers with the previous version you pulled, Docker will only download the new or changed layers, while reusing the existing ones from the cache.

  2. Base Operating System: Docker containers are designed to be lightweight and portable. Instead of containing an entire operating system like a virtual machine, Docker containers typically only include the runtime environment and dependencies required to run the application. However, some applications may still require a full operating system to function properly. In such cases, Docker images may include a minimal operating system along with the application.

    For example, if you're running a Java application in a Docker container, you might use an image that includes the Java runtime environment (JRE) along with a minimal Linux distribution like Alpine Linux.

  3. Port Forwarding: When you run a web server or any networked application inside a Docker container, by default it's isolated from the host machine's network. However, you can expose ports from the container to the host machine using port forwarding. This allows you to access the application running inside the container from your local machine.

    For example, if you have a web server running inside a Docker container on port 8080, you can expose this port to your host machine's port 8080 using the -p flag when running the container:

     docker run -p 8080:8080 <image_name>
    

    Now, you can access the web server running inside the container by navigating to http://localhost:8080 in your web browser.
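
Here's the docker history sketch mentioned under fact 1; it lets you inspect the layer stack of an image, and a subsequent pull shows the layer cache at work:

# Show the layers that make up an image you already have
docker history hello-world
# Pull another image: any layers already in the local cache
# are reported as "Already exists" instead of being downloaded
docker pull ubuntu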

Till now we saw how to pull images. What about creating our own?

Dockerfile:

We already talked about Dockerfiles at the start. A Dockerfile is a text document that contains instructions for building a Docker image, which is later used for developing, shipping, and running applications in containers.

In a Dockerfile, you specify things like the base image to use, commands to run during the build process, environment variables, and instructions for copying files into the image. Once you have a Dockerfile, you can use the docker build command to build an image from it.

Here's a very basic example of a Dockerfile that pulls the Ubuntu base image, installs Python, copies a Python script (main.py) into the image, and then runs that script:

# Step 1: Use the official Ubuntu base image
FROM ubuntu:latest

# Step 2: Update package lists and install Python
RUN apt-get update && apt-get install -y python3

# Step 3: Set the working directory inside the container
WORKDIR /app

# Step 4: Copy the local main.py file into the container at /app
COPY main.py /app

# Step 5: Run the Python script when the container starts
CMD ["python3", "main.py"]

Explanation of each step:

  1. FROM ubuntu:latest: This line specifies the base image to use for the Docker image. In this case, we're using the latest version of Ubuntu.

  2. RUN apt-get update && apt-get install -y python3: This line installs Python 3 inside the Docker image by updating the package lists and then installing Python 3 using apt-get.

  3. WORKDIR /app: This line sets the working directory inside the container to /app. This means that any subsequent commands will be executed relative to this directory.

  4. COPY main.py /app: This line copies the main.py file from your local directory (where the Dockerfile is located) into the /app directory inside the container.

  5. CMD ["python3", "main.py"]: This line specifies the command to run when a container is started from the image. Here, it runs the main.py script using Python 3. This will be the default command executed when you start a container from this image.
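
One note: the Dockerfile above assumes a main.py sitting next to it. Any script will do; for this example, a hypothetical one-liner is enough:

# Create a minimal main.py beside the Dockerfile
echo 'print("Hello from inside the container!")' > main.py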

Once you have this Dockerfile saved in a directory along with your main.py script, you can build the Docker image using the docker build command:

docker build -t my-python-app .

Then, you can run a container based on that image using the docker run command:

docker run my-python-app

This will start a container based on your Docker image, which will execute the main.py script inside the container.

All the topics we've discussed above cover everything you need to know to begin with Docker. As you move forward, feel free to explore online tutorials and additional articles that delve deeper into Docker.

Thank-you!

I'm pleased you've reached the end of this article. I hope you've gained some valuable insights. If you have, please consider leaving a Like. Your support encourages me for future write-ups.