This article is intended for database administrators and system administrators who plan to configure the DBFS FUSE file-system on a Linux machine using the Docker containers. I would recommend reading the previous articles “Introduction to Docker and Docker toolbox” and “Docker and SQL Server on Linux”. This article assumes that the reader has a rough knowledge of the Docker Containers and SQL Server.
This article is a walk-through of the step-by-step procedure to set up and configure the DBFS FUSE file system to access local or remote SQL database instance on Linux CentOS docker containers.
In this article, you’ll see that the following topics are covered:
- Introduction to the Docker, DBFS, FUSE and DMV Tool
- Explain DBFS concepts
- Flow diagram
- Step by step installation
- Discuss the required configuration setup docker components
- Mount and data extraction procedures
- Setup DBFS to run custom SQL queries
- How to a add an existing project to GitHub using the command line
- and more…
Introduction to the Docker, DBFS, FUSE and DMV Tool
This section briefs about Docker, DBFS, FUSE, and DMV Tool. Let us deep-dive into the concepts.
Docker
Docker provides the ability to package and run an application in a loosely isolated environment called a container. Docker is an open-platform engine that facilitates the faster delivery, faster development, and faster deployment in an isolated environment.
DBFS and FUSE
DBFS( Database File System), it’s a process of mounting a virtual file system on top of that database objects using a FUSE technique which provides an interface between a VFS and Kernel module. This setup will allow the access to those files using dbfs programs.
A big part SQL Server monitoring is to ensure that SQL Server is performing at optimum levels and it’s highly available. Traditionally, to get system metrics or performance information, one could rely on GUI admin tools such as SSMS or command line tools such as SQLCMD to run queries.
Now, there is another option to query the information using a new experimental Linux tool is introduced and it is known as DBFS, which enables you to access live DMVs mounted to a virtual file-system using FUSE file-system concept.
DBFS uses FUSE to mount MS SQL Server DMVs and custom queries as a virtual file system. This gives you the ability to explore information about your database (Dynamic Management Views) using native bash commands.
There is no need to log in to the SQL Server using a GUI or command line tool or run SQL queries. The configuration file meta-data is used internally by DBFS process to query the SQL Server databases. The output is a JSON file for SQL Server 2016 and later version or CSV file for older versions of SQL Server and it is that easy to be used with Unix/Linux programs such as grep, awk, and sed.
DBFS uses the FUSE file system module to create two zero byte files for each DMV—one for showing the data in CSV format and one for showing the data in JSON format. When a file is “read,” the relevant information from the corresponding DMV is queried from SQL Server and displayed just like the contents of any CSV or JSON text file.
Features
- Access data in JSON or CSV format.
- Compatible grep, sed, and awk programs.
- Real-time DMV data
- Cross-platform support works well SQL Server on Windows and SQL Server on Linux
Note:
This tool is currently only available for Ubuntu, Red Hat, and CentOS (SUSE coming soon!).
DMV Tool
In short, the DMV TOOL, a command line, cross-platform tool that allows accessing live data by mounting SQL query output file or DMVs (Dynamic Management Views) to a virtual directory using FUSE and by querying file structure as an object.
Explain DBFS concepts
As we all know that in Linux, everything is a file. The FUSE (Filesystem in Userspace) a loadable kernel programs for all several Linux distros that provides a platform for non-privileged users to create their own file systems without even touching the existing kernel code. This is achieved by mounting a file-system in a user space using an interface known FUSE modules
Implementation Flow Diagram
In this section, the high-level diagrammatic representation of the implementation steps is curated.
Getting started
As we already discussed, our goal here is to get the configuration to work with the process. Let us follow the following steps to configure the DBFS:
- Log in to the console with the SU account(Super User Mode)
- Install wget utility. The GNU Wget is non-interactive, file retrieval utility to download files from the web or using FTP protocols.
# yum install wget
- To get Docker set up on a CentOS, go ahead and start to download the .rpm package for the corresponding release and install it manually using YUM command. Browse the URL to download the latest docker-ce file for the Docker version that you want to install.
- Download the rpm package using wget utility
# wget https://download.docker.com/linux/centos/7/x86_64/stable/Packages/docker-ce-18.03.0.ce-1.el7.centos.x86_64.rpm
- Install Docker CE package using the following yum install command
#yum install docker-ce-18.03.0.ce-1.el7.centos.x86_64.rpm
- Check the docker package information using yum info command.
#yum info docker
- Now, Docker is installed but the service is not started. Use the systemctl command to start Docker, verify and enable the docker service
#systemctl start docker #systemctl status docker #systemctl enable docker
- Verify that docker is installed correctly by running the welcome image.
#docker run hello-world
- Let’s download the CentOS latest image from the library. Use docker pull command to extract the CentOS image, the community edition of RHEL Linux distros. We can also specify a specific image from the list using tags or just append colon ‘-‘ and type latest would only give the most current image of CentOS.
#docker pull centos:latest
- Once the pull is complete, go ahead and create a container from the extracted image. We’ll do that with sudo docker run.
# docker run –it centos:latest bash
Here, -it refers to interactive terminal, centos:latest is the reference to the image and bash defines the shell that will open when the container starts.
Now, we can see that the container is created. We can see the status of all Docker containers with the command docker ps –a
# docker ps -a
- Let’s go ahead and install the wget utility
# yum install wget
- Download the DBFS (dbfs-0.2.5-0) package using the wget utility
wget https://github.com/Microsoft/dbfs/releases/download/0.2.5/dbfs-0.2.5-0.x86_64.rpm
- Download the EPEL repository using wget utility
#wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
- Setup the EPEL package
#rpm -ivh epel-release-latest-7.noarch.rpm
- Install the DBFS package
#yum install dbfs-0.2.5-0.x86_64.rpm
- Verify the DBFS package information
#yum info dbfs
- Lets quit the docker container session by typing EXIT. Now, commit the docker image. This commits all the previous change that was made to the centos:latest image. The commited image will have the wget utilities, repositories of EPEL and DBFS package.
#exit #docker commit 7af55818f007
- Add the tag to the committed image using docker tag command. Let us name the tag as centos:dbfs. We can see the new image using docker images.
#docker tag 88cd8b6c264890547bc0fa24047f5248f2f6c40b5147a580c18a78018f8c664c centos:dbfs #docker images
Docker image is ready with the DBFS configuration.
Wrapping Up
This article covered Docker setup and configuration steps and the process of configuring DBFS FUSE technique on the CentOS Linux container.
In a nutshell, here’s what Docker can do for you: It can get more applications running on the same hardware than other technologies; it makes it easy for developers to quickly create ready-to-run container applications, and it makes managing and deploying applications much easier. Put it all together and I can see why Docker rode the hype cycle as fast as I can recall ever seeing an enterprise technology go.
The workings of DBFS and SQL Server will be explained in the next article. That’s all for now….stay tuned for more updates.
References
- https://docs.docker.com/engine/reference/run/#/runtime-privilege-and-linux-capabilities
- https://cloudblogs.microsoft.com/sqlserver/2017/05/17/try-new-sql-server-command-line-tools-to-generate-t-sql-scripts-and-monitor-dynamic-management-views/