After switching to DynamoDB I noticed a couple of interesting things. With hosted solutions I pay for server resources: RAM, CPU, disk space, sometimes network. With DynamoDB I pay for WCUs and RCUs (write capacity units and read capacity units), plus network traffic that leaves AWS. By running EC2 instances inside an AWS VPC and talking to DynamoDB over the local connection, I can avoid the network charges.
That makes me think differently about query optimization. DynamoDB also promotes the use of overloaded indexes and has an interesting pattern for indexing data.
AWS recommends using a single DynamoDB table for everything, and creating a separate table only when you need a different data access pattern, not just to store a different kind of data.
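Here is a minimal sketch of what that single-table idea looks like in practice, assuming a hypothetical table called `trafikito` with generic, overloaded `PK`/`SK` attributes (this is not Trafikito's actual schema):

```python
import boto3

# Hypothetical single table; PK and SK are overloaded so that different
# entity types can live side by side in the same table.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("trafikito")

# A "user" item and one of its "monitor" items share the same partition key,
# so one query on PK can fetch the user together with all of its monitors.
table.put_item(Item={"PK": "USER#42", "SK": "PROFILE", "email": "user@example.com"})
table.put_item(Item={"PK": "USER#42", "SK": "MONITOR#homepage", "url": "https://example.com"})
```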
The price
Each query, depending on what it does, consumes some WCUs and RCUs. I also pay for disk space, but it's a small fraction of the price, and, since Trafikito uses global tables, for the cross-region network traffic that keeps regions in sync.
Since I access DynamoDB from EC2 instances inside AWS, the traffic between EC2 and the database is free. But if you call the REST API from some third-party VPS, you pay for the traffic going out of the AWS cloud. Outgoing traffic is expensive, but because I use local EC2 instances it's not an issue in my case.
Indexing
Every table must have a primary key on one attribute, the partition key, and I can optionally add a sort key to it.
I can add several global secondary indexes (GSIs), each with the same pair: a partition key plus an optional sort key. There are also local secondary indexes (LSIs), but I don't use them because of their technical limitations.
Read more at docs: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/LSI.html
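To make the key structure concrete, here is a sketch of a table definition with one GSI, using the same hypothetical table and attribute names as above (not the real Trafikito tables):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Hypothetical table: primary key is PK (partition) + SK (sort), plus one
# overloaded GSI that flips the keys to support a second access pattern.
dynamodb.create_table(
    TableName="trafikito",
    AttributeDefinitions=[
        {"AttributeName": "PK", "AttributeType": "S"},
        {"AttributeName": "SK", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "PK", "KeyType": "HASH"},
        {"AttributeName": "SK", "KeyType": "RANGE"},
    ],
    GlobalSecondaryIndexes=[
        {
            "IndexName": "GSI1",
            "KeySchema": [
                {"AttributeName": "SK", "KeyType": "HASH"},
                {"AttributeName": "PK", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
            "ProvisionedThroughput": {"ReadCapacityUnits": 1, "WriteCapacityUnits": 1},
        }
    ],
    ProvisionedThroughput={"ReadCapacityUnits": 1, "WriteCapacityUnits": 1},
)
```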
Different feel while designing the data
When I work with DynamoDB I always keep in mind that I pay per query to the database, not for resources in general.
With a traditional setup it's hard to say how much RAM, CPU, and SSD a query will consume, and for how long.
So you don't really know how much a query costs, or how much you pay if you decide to add extra optional fields that you might use in the future.
With DynamoDB I know exactly. Writing 1 KB of data per second costs 1 WCU. Reading 4 KB of data (strongly consistent) costs 1 RCU. Writing 0.7 KB still costs 1 WCU, and writing 1.001 KB costs twice as much, 2 WCUs, because item size is rounded up.
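The rounding rule is simple enough to capture in a few lines. A small sketch of the arithmetic, using the documented sizes for standard writes and strongly consistent reads (the item sizes are made up):

```python
import math

def write_capacity_units(item_size_kb: float) -> int:
    # Writes are billed in 1 KB increments, rounded up.
    return math.ceil(item_size_kb / 1.0)

def read_capacity_units(item_size_kb: float) -> int:
    # Strongly consistent reads are billed in 4 KB increments, rounded up.
    return math.ceil(item_size_kb / 4.0)

print(write_capacity_units(0.7))    # 1 WCU
print(write_capacity_units(1.001))  # 2 WCUs
print(read_capacity_units(4.0))     # 1 RCU
```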
It's good to know the cost of each operation. It makes optimization easier.
I know my data better
With all these constraints, my data structure is better. I constantly have to think about optimization, about data access patterns, and about what Trafikito really needs to deliver the best performance for the price tag.
I know the access pattern of every bit saved to the database. There is no dead data that just sits there without any real need.
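For example, because every item is stored with a known access pattern, reads are targeted key lookups rather than scans. A sketch using the hypothetical single-table keys from above:

```python
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("trafikito")  # hypothetical table name

# One targeted query fetches all monitor items for a single user; no scan,
# so the RCU cost is bounded by the size of the items actually returned.
response = table.query(
    KeyConditionExpression=Key("PK").eq("USER#42") & Key("SK").begins_with("MONITOR#")
)
for item in response["Items"]:
    print(item["SK"], item.get("url"))
```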
So?
Trafikito must deliver value at low cost, or no cost at all, to many people around the globe. I want it to be available to everyone and to stay quick and reliable from any location.
To make it happen, you need global infrastructure, which is hard to build and maintain.
The bigger your database is, the harder it is to modify. When building software you always have to think about the database layer.
With DynamoDB I have to think about it from the perspective of business logic — how to optimize data access patterns, not how to maintain the best possible configuration of the database software and operating system.
I don't have to think about which OS to choose, which hardware to use, or which configuration. I don't have to monitor any of these things. I can just focus on business logic and spend time on data access patterns, data storage optimization, and similar things.
DynamoDB limits make Trafikito better
When I started using DynamoDB it was frustrating.
Indexing and data access patterns are very different compared to SQL, MongoDB, and Redis, but I'm very thankful to the AWS teams for delivering such a service.
It's just great. It makes me a better developer, and it makes Trafikito a better service.