DBFS stands for Database File System, and it’s Oracle’s method for housing an operating system-like file-system within an Oracle relational database. It has its roots with iFS, but its functionality and capabilities have been extended. Here are some reasons why DBFS is a good thing:
I’m sure there are more excellent features of DBFS, but the list above is particular relevant to the Exadata database machine.
Setting up DBFS on Exadata
Create the DBFS Tablespace
The first step in the process is to create your DBFS tablespace. The steps below will go through this process:
From this, we can see we’ve got a 872Gb chunk of free space in our DBFS_DG ASM disk group. I’ll create a tablespace, for starters, at 100Gb:
Create the DBFS Oracle user
Next we’ll create a database user, DBFS:
Perform additional OS configuration
The steps below are required to enable oracle to mount a DBFS file-system. I’ll do these on both database tier nodes in our Exadata database machine:
Create the DBFS store
Mount DBFS
The first thing we need to do is create a text file containing the DBFS password. I’ve created this on /home/oracle/dbfs/passwd.txt on both nodes – note that this is only required when mounting the file-system, not for normal use of DBFS.
Now we mount the DBFS file-system:
After this, let’s check it:
When complete with cm01dbm01, we’ll do the same mount on cm01dbm02:
Configure the target database
After we’ve got our DBFS file-system created, there are a couple of things we need to ensure in order to use it:
Let’s walk through these.
Recommendations for Data Loads
Sample: External Table
In this section, I’ll show how to create an external table based on a comma-separated text file. We’ll use this external table as our staging table, and from this, create an operational table. I’ll use our SOE.CUSTOMER tables to generate a CSV file first:
This will spool a 1,000,000-row file into our DBFS file-system – I’m going to “amplify” it to 10 million rows to demonstrate some performance:
OK, now let’s create our external table and validate the row counts:
Loading a “real” table from our external table
Below, let’s follow our test case and insert values from our external table to a real table:
Now we’ll do a direct path insert:
We loaded 10 million rows in under 13 seconds. Not to bad. Let’s do a PCTAS:
Using SQL*Loader to Compare Times
Now that we see how much time it took to load via an external table, let’s compare with direct path SQL*Loader:
It’s difficult to tell from the above, but the job completed it a couple of minutes. Longer than using parallel direct path insert, but still not too bad.
Load and Transform the “Old Way”
Developers typically write data load programs, and many developers do things this way:
In the example below, we’re doing to write a program that does just this. Specifically, it will:
Let’s run our example:
As you can see, the main PL/SQL job ran in 25.5 minutes, and end-to-end the execution was a little under 27 minutes.
Load and Transform with External Tables and SQL
Now we know how long the old way took, let’s try it a new way, using SQL and external tables. In the below, I’m doing an analytic function, ROW_NUMBER(), to get the unique/distinct values, which is what the “old style” program did:
As you can see, the job ran in a little over 8 seconds. If we still wanted to populated our CUST_DUPE table, it would look like this:
So end-to-end, the ELT job took about 30 seconds with SQL and external tables, compared to over 25 minutes with PL/SQL. Clearly, using SQL is in this case a much more efficient way to do things. As a developer, if you think it’s impossible to do conditional transformation logic with SQL instead of PL/SQL, it’s time to update your skills – you can do a great deal of work with analytic functions, SQL case statements, and so forth.
1050 Wilshire Drive,
Suite 170,
Troy, MI 48084
Phone: (248) 465-9533
Toll free: 1-877-868-1753
Email: [email protected]
© Centroid, Inc. All rights reserved. Contact Privacy Policy Terms of Use CCPA Policy
Centroid is a cloud services and technology company that provides Oracle enterprise workload consulting and managed services across Oracle, Azure, Amazon, Google, and private cloud. From applications to technology to infrastructure, Centroid’s depth of Oracle expertise and breadth of cloud capabilities helps clients modernize, transform, and grow their business to the next level.
© Centroid, Inc. All rights reserved. Contact Privacy Policy Terms of Use CCPA Policy