Earlier this year Oracle announced support for MongoDB API on the Oracle Autonomous database family. It has been recognized as a huge milestone by Forbes and multiple other analysts, technical and market experts. In my opinion it was logical and fully predicted one more step to the main idea of “Converged Database”. I didn’t coin that term – Oracle was using it for two or three years as of now. The main idea is to create a fully managed database platform supporting the most of APIs and interfaces and put the data together under the Oracle Autonomous Database umbrella. Here I will try to look inside and understand what it is, how it works, what is supported and what is not.
First, let’s understand the goals and what Oracle means when it talks about supporting MongoDB. Oracle says that it speaks MongoDB and it means that Oracle supports the tools and MongoDB developers ecosystem to keep working with the data as they are in a MongoDB database. At the same time it provides benefits of having your data in the Oracle Autonomous database. I am going to go through all the potential benefits to understand why somebody would move from MongoDB to Oracle Autonomous.
I am a technical guy and don’t take any press-releases or promises without a hands-on tests and understanding how and what actually works there. I did some tests using my own deployment and some sample data taken from open sources. So, what do you have behind the scenes? The first is of course the database. We can use either Autonomous JSON or any other Autonomous database. To understand it better I used the Oracle Autonomous Transaction Processing (ATP) database on the shared infrastructure. It is based on the Exadata hardware and represents a pluggable database as a service. It has the most of the features of any Oracle database and supports multiple data types. Among others it supports the JSON data type which is behind the MongoDB support.
When we create a collection using either MongoDB API or a command line utility like mongosh it translates the command sent by MongoDB Wire Protocol to a SQL statement executed on the Autonomous database. The main goals are to provide an environment for MongoDB based applications where it can run with minimal or no changes and attract MongoDB developers providing compatible backend with option to evolve and use the benefits of the Oracle Autonomous in the future. It sounds great and I am sure this is a big step forward for the Autonomous databases. So, what exactly do we get there?
Let’s start from the benefits. We are getting fully managed, reliable, fully ACID-compliant RDBMS with SQL support and packed with multiple features. It makes it possible to improve and extend the application including machine learning, advanced security, reporting and indexing and many more features coming with the Oracle database. It can help to unify the management and administration for the database engine and reduce operational overhead. As for now both Autonomous JSON and Autonomous Transaction Processing support the MongoDB API.
And while we have many potential improvements we have some trade-offs. Let me try to observe what I noticed during my tests.
We are starting from the version. As of April 2022 we can use version 4.2.14 API for MongoDB which is not the latest one. The version 4.2.14 was issued in May 2021. This is a supported version but if you want to use some new commands and operators from version 4.4 and 5.0 you might need to find an equivalent in the Oracle database toolbox.
The authentication supports only the PLAIN (LDAP SASL) method and you need to keep in mind that behind the scenes we are using Oracle schemas. And using the “root” role provides you the option to create other database schemas while dbOwner gives you only privileges in the current Oracle schema. So, the authorization model is not exactly the same as in MongoDB and I might elaborate on it in some of the subsequent articles.
If your MongoDB deployment scripts contain index creation commands they will be ignored and you need to create the required indexes using Oracle interface. And you need to keep in mind that inside the same database (schema) you cannot use the same index name for different collections. Naming for the same type of objects (including indexes) must be unique in the schema. It is different than in mongoDB and has to be adjusted. Even if you are unable to create the indexes using mongoDB interface you still can use hints to tell the Oracle optimizer which index to pick up. If you don’t do that then the Oracle optimizer will decide it by itself.
Oracle Autonomous doesn’t use the MongoDB aggregation pipelines and you need to use SQL for such operations where the Oracle optimizer behind the scenes will choose the optimal order of execution. So even though there is a clear room for improvements and transition from the MongoDB aggregations there are new requirements too. You cannot use your old aggregations.
Also you have to pay your attention to the difference in the structure for a document in MongoDB and Oracle. Here is the typical Oracle table with MongoDB information after the import through the standard mongoimport tool.
SQL> info title_episode TABLE: TITLE_EPISODE LAST ANALYZED:2022-04-13 14:06:15.0 ROWS :6635610 SAMPLE SIZE :6635610 INMEMORY :DISABLED COMMENTS : COLUMNS NAME DATA TYPE NULL DEFAULT COMMENTS *ID VARCHAR2(255 BYTE) No CREATED_ON TIMESTAMP(6) No sys_extract_utc(SYSTIMESTAMP) LAST_MODIFIED TIMESTAMP(6) No sys_extract_utc(SYSTIMESTAMP) VERSION VARCHAR2(255 BYTE) No DATA JSON Yes Indexes ADMIN.SYS_C0026862 UNIQUE VALID ID
All document fields including the “_id” are stored in the JSON type column “DATA” and when the “_id” in the MongoDB is unique and used to locate the document it is different for Oracle. The import will populate the “ID” column in Oracle table by values from “_id” or generate them both if you import from a file which doesn’t have the “_id” field. And the “_id” is never used after and you cannot change it even from an application. At the same time the “ID” column value can be changed. Also the “ID” is always VARCHAR2(255 BYTE) type when the “_id” in MongoDB can use different data types. As a result the “ID” can be the result of conversion from another data type. It creates a situation when “_id” and “ID” can have different values. That has to be taken into account when you move your data to Oracle.
And, of course, we are talking about slightly different implementations for binary JSON formats. The BSON used in MongoDB, for example, has a notion of field order when OSON doesn’t guarantee it. If your application logic builds on the ordered fields it has to be taken into consideration.
One more thing I noticed when I was testing the new API was the size of a collection in the Oracle database. Of course, it could be different for different types of data but if we compare the same imported data in the Oracle database and a single replica of a MongoDB we see a significant difference in size. For example a collection in the MongoDB (1 replica) takes 1373Mb and the same data take 12655Mb which is about 9 times more. The same index in oracle takes about 3 times more space. Again, it is not a fair comparison and probably better to compare with a cloud implementation of MongoDB but if you move from an on-prem solution you should count the storage allocation and estimate the cost. I know some say that storage is cheap in the cloud but I want to be realistic and be aware about all implications.
And finally I want to talk about the toolset. The command line tools like mongosh and mongoimport work perfectly well with the Oracle API but if you want to use the MongoDB Compass it might be a bit frustrating. You can connect and run some queries but the rest of the tabs like aggregations, indexes, and explain plans are not supported. But as a replacement you have a web based interface for your JSON collection in the OCI console and it can do all those tasks and maybe more.
As a short summary I would conclude that the new MongoDB compatible API for Oracle Autonomous creates new opportunities for the MongoDB community and companies to move the MongoDB workload to the Oracle Cloud Infrastructure and get all the benefits of the advanced database engine. But at the same time the migration should be carefully planned and some parts of your application, deployment process and maybe even some logic could be shifted and required to be adjusted to the new platform.