As someone who once embraced the allure of NoSQL databases, I want to share my journey of changing database technology by explaining why I switched from RavenDB to PostgreSQL. I remember being captivated by RavenDB's exotic feel when I first encountered it. But I have since realized that PostgreSQL is the better choice for my case.
RavenDB's initial charm
The initial impression was great. RavenDB seemed like a breath of fresh air in the database world. I was particularly impressed by its document-oriented approach and flexible schema. I loved experimenting with JavaScript indexes, pushing the boundaries of what I could achieve, and I successfully delivered production-scale app features leveraging the power of JavaScript index functions. The backup and restore functionality was also a standout feature, working seamlessly every time I needed it.
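To give you an idea, here is a minimal sketch of what such an index looks like. The collection and field names are illustrative, not my actual production indexes, and `map` is provided by RavenDB's index engine, so the snippet is evaluated server-side rather than run as a standalone script:

```ts
// Shape the data at index time: compute a derived `total` per document
// so queries can filter and sort on it cheaply.
map('Transactions', (tx) => {
  return {
    account: tx.account,
    total: tx.lines.reduce((sum, line) => sum + line.amount, 0),
  };
})
```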
They also offer a free cloud-based playground which you can use while developing your app. I went a step further: both the production and development databases for this AFIEther website resided there 🙀!
The cracks began to show
However, as time went on and I delved deeper into real-world application development with RavenDB, some fundamental challenges began to surface. These issues ultimately led me to re-evaluate my database choice.
I decided to move to PostgreSQL now, before my app grows and makes the switch harder. Today, the website and its associated worker node run fully on PostgreSQL in both the DEV and PROD environments.
Relational reality: What you model becomes relational at some point. Despite the initial appeal of a document database, I realized that many of my data models inherently leaned towards a relational structure, regardless of the immediate app use case.
The join venture: What you can achieve with a JOIN might get super tricky in RavenDB. The absence of straightforward JOIN operations, a cornerstone of relational databases, became increasingly problematic. Complex queries that would have been simple with SQL turned into convoluted workarounds.
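For illustration, here is the kind of query I mean, sketched with TypeORM's raw query API (table and column names are hypothetical): one readable JOIN in SQL, versus related-document loading or client-side stitching of multiple queries in RavenDB.

```ts
import { DataSource } from 'typeorm';

// One JOIN does what would otherwise take a custom index or two round trips.
async function transactionsWithAccountNames(ds: DataSource) {
  return ds.query(`
    SELECT t.id, t.amount, a.name AS account_name
    FROM transactions t
    JOIN accounts a ON a.id = t.account_id
  `);
}
```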
Aggregation mystery: Even finding the MAX value of a column is tricky. Simple aggregations like this, which are trivial in SQL, required creating custom indexes in RavenDB, adding unnecessary complexity.
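As a sketch (table and column names are hypothetical), this is a one-liner in SQL, while RavenDB needs a dedicated map-reduce index for the same question:

```ts
import { DataSource } from 'typeorm';

// MAX over a column: trivial in SQL, an extra index to maintain in RavenDB.
async function maxAmount(ds: DataSource): Promise<number> {
  const [row] = await ds.query('SELECT MAX(amount) AS max FROM transactions');
  return Number(row.max);
}
```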
Flexible schema is a double-edged sword: It's nice that you don't face migrations whenever your schema changes, but it's a problem too. While the schema-less nature of RavenDB seemed liberating at first, I soon discovered its drawbacks. The lack of enforced data types led to inconsistencies where the same field could hold different data types across documents, causing runtime errors: if one document stores a string and another a number under the same field name, your business logic will crash.
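A contrived but representative example of the failure mode:

```ts
// Two documents from the same collection; nothing enforces the field's type.
const docA = { amount: 19.99 };    // written as a number
const docB = { amount: '19.99' };  // written as a string by an older code path

function format(doc: { amount: number }): string {
  return doc.amount.toFixed(2);
}

format(docA);          // "19.99"
format(docB as any);   // runtime crash: doc.amount.toFixed is not a function
```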
Licensing and scalability: RavenDB is super scalable, but you pay the price. Its scalability is undeniable, yet the limitations imposed by the free license, particularly the restricted resource usage, make it unsuitable for anything beyond small-scale projects. In contrast, PostgreSQL, being open-source, offers unrestricted resource utilization.
Licensing headaches in Docker: A few months ago, I needed to develop something quickly. However, the RavenDB Docker container refused to run unless I upgraded it to a new minor version. And after that, it wouldn't work without the bloody license. I vowed to myself that I'd migrate away should it ever do that again. And it did!
A strange bug in a JavaScript index: I was facing an annoying bug where a transaction report would not be grouped properly by the index. The output was miserable, littered with NaN values. The data seemed fine, and despite numerous attempts to solve it, I eventually gave up.
Cloud connectivity concerns: On top of that, I often ran into connectivity issues. Frequent connection problems with their managed cloud instances, even on the free tier, raised concerns about reliability. While I appreciate the availability of a free cloud offering, the instability was a significant drawback.
Problematic collection identities: I also had to waste time implementing a mechanism to check that an ID belongs to a certain collection. Querying a collection with an ID belonging to another collection still returned the document, which created issues with entity mappings and forced me to add extra validation logic, as sketched below.
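Here is a hypothetical sketch of the kind of guard I ended up writing (the ID format follows my {collection}_{guid} convention; names are illustrative):

```ts
// Reject IDs whose collection prefix doesn't match the collection we query,
// since RavenDB would otherwise happily return the document anyway.
function assertBelongsTo(id: string, collection: string): void {
  if (!id.startsWith(`${collection}_`)) {
    throw new Error(`ID '${id}' does not belong to collection '${collection}'`);
  }
}

assertBelongsTo('transactions_3f2b9a', 'transactions'); // ok
assertBelongsTo('accounts_9c1d4e', 'transactions');     // throws
```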
Embracing PostgreSQL - hurdles
Adding all this up, I gathered enough motivation to make the switch, which was quite challenging but worth every minute spent on it! Luckily, I had put in place a fairly good architecture for the AFIEther project, with a clear separation of entities, persistence, and business logic. The business logic talks to the persistence layer mainly through what I call DocumentInterface objects. This way it is decoupled from database access, so you only need to change one layer while the logic stays the same.
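To make the idea concrete, here is a minimal sketch of that seam (illustrative, not the actual AFIEther code):

```ts
// Business logic depends only on this interface. One implementation wrapped
// the RavenDB session; the migration added a TypeORM-backed one, and the
// logic layer never noticed the difference.
interface DocumentInterface<T> {
  load(id: string): Promise<T | null>;
  save(entity: T): Promise<void>;
  delete(id: string): Promise<void>;
}
```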
Even so, I encountered many issues, among which I want to mention a few:
Timestamps: In RavenDB, I stored timestamps as milliseconds from the Epoch. PostgreSQL's timestamp, however, is a proper date/time type, which TypeORM surfaces as a JavaScript Date. I had to make many modifications to mappings and property types.
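A sketch of the conversion work this implied (values are illustrative):

```ts
const createdAtMs = 1700000000000;        // what a RavenDB document held
const createdAt = new Date(createdAtMs);  // what the Postgres entity now holds
const backToMs = createdAt.getTime();     // for code paths still expecting millis
```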
Big numbers: There are places where I need reliable numeric data, so I opted for PostgreSQL's numeric type. However, TypeORM returns this type as a string (to avoid precision loss), so I had to make adjustments to handle that case.
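A minimal sketch of the fix, using a TypeORM column transformer (entity and field names are hypothetical; this assumes plain JavaScript numbers are precise enough for the values involved):

```ts
import { Column, Entity, PrimaryGeneratedColumn } from 'typeorm';

// Convert at the persistence boundary: `numeric` comes back as a string
// to avoid silent precision loss, so we opt into the conversion explicitly.
const numericToNumber = {
  to: (value: number) => value,
  from: (value: string) => parseFloat(value),
};

@Entity()
class Quote {
  @PrimaryGeneratedColumn('uuid')
  id: string;

  @Column('numeric', { precision: 18, scale: 8, transformer: numericToNumber })
  price: number;
}
```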
Nullable columns: Another major issue was that the initial migration for the Postgres database contained many non-nullable columns. RavenDB is very flexible with data; in PostgreSQL, I had to fix the null errors one by one.
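The fix is mechanical but had to be applied field by field (illustrative entity):

```ts
import { Column, Entity, PrimaryGeneratedColumn } from 'typeorm';

// A field that could simply be absent in a RavenDB document must be
// declared nullable explicitly, or inserting the old data fails.
@Entity()
class Profile {
  @PrimaryGeneratedColumn('uuid')
  id: string;

  @Column({ type: 'text', nullable: true })
  bio: string | null;
}
```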
Attachments: Oh, this one required a bit of feature development, because in RavenDB every document can have binary attachments. In PostgreSQL, nope. For this case, I had to create a new attachments table for each entity that had attachments. I built it in a reusable way, though, so I'm proud of that!
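A hypothetical sketch of the reusable shape (names are illustrative): the shared columns live in an abstract base class, and each owning entity gets its own concrete attachments table.

```ts
import { Column, Entity, PrimaryGeneratedColumn } from 'typeorm';

// Shared attachment columns, inherited by every per-entity attachments table.
abstract class AttachmentBase {
  @PrimaryGeneratedColumn('uuid')
  id: string;

  @Column('text')
  fileName: string;

  @Column('bytea')
  content: Buffer;

  @Column('uuid')
  ownerId: string; // FK to the owning entity's table
}

@Entity('invoice_attachments')
class InvoiceAttachment extends AttachmentBase {}
```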
Entity IDs: I wanted to preserve the same IDs throughout the migration. But because of the collection identity issue I mentioned above, a RavenDB document ID was composed of two parts: {collection_identifier}_{guid}. In PostgreSQL, I wanted to keep just the guid part. This introduced some referential integrity problems, which I solved with a data copy script.
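The core of that copy script, sketched as a hypothetical helper with illustrative IDs:

```ts
// Strip the RavenDB collection prefix and keep only the GUID part,
// e.g. 'transactions_3f2b9a' becomes '3f2b9a'.
function toPostgresId(ravenId: string): string {
  const sep = ravenId.indexOf('_');
  return sep >= 0 ? ravenId.slice(sep + 1) : ravenId;
}
```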
The result is more than satisfactory
I can definitely say it was worth the effort. The PostgreSQL database feels more stable and reliable, and loading speed is visibly faster.
I took advantage of the managed PostgreSQL offering from Vultr, my project's cloud provider. This way, the apps and the database run on the same internal network, delivering responses much faster!
If you want to check out their offerings, I encourage you to use my link below 👇. You'll get some free credit to convince yourself of their sturdy cloud offerings at a very good rate.
Conclusion
I'm more than satisfied with how everything works after this migration. It was not without challenges, but I tried to overcome each obstacle with the right approach: simplicity, efficiency, and clean architecture for the continuation of the project.
On this note, I'd like to thank you all for reading this article. I hope you found it useful, and don't forget to subscribe to my newsletter.