Optimizing BLOB Data Extraction from Oracle Database with Connect Server

  • 14 May 2024
  • 1 reply
  • 78 views

Userlevel 2
Badge

Exploring databases and retrieving information from standard data types such as varchar, string, or char is usually straightforward. However, dealing with BLOB (Binary Large Object) type columns introduces added complexity to the process.

Generally, BLOB refers to Binary Large Object, representing a collection of binary data stored as a single entity. This encompasses various file types such as images, videos, spreadsheets, and PDFs, all stored within cells designated as BLOB.

In this discussion, we'll delve into the process of accessing a PDF file stored within a BLOB type column via Connect Server, ensuring the preservation of both the file's integrity in terms of size and its actual content.

 

Firstly, I have created a database featuring four columns: ID, FILE_NAME, FILE_TYPE, and COLUMN4. COLUMN4 was specifically designated as the BLOB type column within the database.

Below is a screenshot illustrating this setup.

 

Within this table, there are two records, each storing a PDF file within the Column4 field, as depicted in the screenshot below:

 

To ascertain the actual size of the PDF file, simply double-click on the value within Column4. This action will reveal details regarding the file's size, as demonstrated in the screenshot below:

The displayed information confirms that the file size is 312003 bytes or approximately 304 KB.

 

 

 

Configuring Connect Server for Database Interaction

 

After downloading the Oracle connector in Connect Server, the initial step involves adding the path of the OCI files to the Environment variables. Once this configuration is complete, proceed to input your connection details for the Oracle DB instance. Subsequently, you can seamlessly query your data using the Data Explorer feature.

Upon executing your query within Data Explorer, the retrieved data will be displayed as follows:

 

 

However, the pressing question remains: How can I verify if this file is indeed correct or matches the size as stored in my database? Crucially, how do I retrieve the file stored within my BLOB column?

It's important to note that what we currently see as the value for Column 4 is the memory address where the array is stored. Here, 'B' signifies BLOB, and '@4b468b75' denotes the memory location where that specific array resides.

 

Now, addressing both questions concerning how to obtain the actual value of this file to ensure it matches my stored file and its correct size. To achieve this, three essential components are required:

 

1. Establishing an OData Endpoint for the Connection in Connect Server

 

To proceed, navigate to the ODATA tab and select "Create Table." Choose the Connection Name, then confirm the specific table you wish to reference (in my case, it's Table 1). Next, ensure to provide all necessary permissions (GET, POST, PUT, DELETE) and save the changes.

Upon completion, you'll find the endpoints listed under the API section within the ODATA tab, as illustrated in the screenshot:

 

 

 

2. Accessing the GET API in an External API Testing Tool (e.g., POSTMAN) with Username and Password

 

You can generate a username within the USERS tab. The application already includes an Admin user by default. Your password will be the AUTH token generated for that particular user (screenshot below). Ensure that all required permissions are also granted.

 

 

After executing the GET request in POSTMAN, your data should be retrieved in the following format.

 

 

Please note that the value for COLUMN4 is now returned in the BASE64 encoded string format.

 

3. Decoding the BASE64 Encoded String and Saving as PDF

 

The final step involves opening any text editor app (such as Notepad++) and pasting the provided string into it. Once completed, decode the BASE64 encoded string using the following steps: Navigate to Plugins > MIME Tools > Base64 Decode.

 

 

After decoding, save the resulting content locally with a chosen name and the .pdf extension (e.g., Abc.pdf). Following storage, navigate to the properties section of the file to confirm its actual size.

 

 

Opening this file should yield the exact PDF document that you originally uploaded into your Oracle Database.

 

Important Note:

 

It's essential to acknowledge the possibility that upon saving this file locally, it may not precisely match the size of the version uploaded to your Oracle database. Additionally, when opening the file, you may encounter the following:

 

 

In such instances, there's a high likelihood that the encoded string obtained via Postman was incorrect or not of the appropriate length, with the latter being the more common reason. It's important to note that the length of the Base64 encoded string is contingent upon a property called MaxLOBSize.

 

MaxLOBSize denotes the maximum volume of bytes permitted to be queried by non-parameterized SELECT queries. By default, this property is set to 2000 bytes. Adjusting this property may be necessary to accommodate larger data sizes.

This value can be configured within the Advanced settings of your Connection in Connect Server.

 

 

For more information on the same, please refer to: CData JDBC Driver for Oracle - MaxLobSize

 

Please note that determining the optimal length for this setting may involve some trial and error to accommodate all of your file extensions. For instance, I encountered a PDF file with a size of 46.6 KB (49831 bytes), and setting the value of this property to 80000 enabled me to retrieve the exact file after decoding the returned string.

 

Hope this information proves helpful.


1 reply

Userlevel 7
Badge

Hi @Tapan18118,

 

Thank you for the information provided!

 

I acknowledge the importance of this information. Could we consider publishing it as a Knowledge Base article on our CData Community rather than a question? I believe it would be more beneficial as part of our Articles section within the How-to with CData category. By creating a dedicated Knowledge Base article, we can provide more clarity and address any uncertainties our customers may have.

Reply