Queue of execution in DynamoDB

Lukas Liesis
Jan 5, 2023

In many scenarios you have a bunch of items that you want to execute one after another. There are many ways to get this done. Often you would use a proper queue solution like SQS, or a database with native pub/sub support, like Redis. But for now let's talk about a solution built on DynamoDB.

We will assume this execution queue is not mission-critical and can tolerate delays. To save money we will run it on an EC2 spot instance. If the queue were important, I would pick a regular EC2 instance instead.

When items come in, you don't know the pace at which they will arrive, so do the minimum work needed to ingest them without issues: save each item directly to the database for later and do nothing else.

Have a file that runs on an interval, started by crontab or whatever scheduler you have.

This file takes the next item to execute, executes it, and deletes it from the queue.
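As a minimal sketch of that ingest step, the function below only builds the request for a DynamoDB PutItem call and does nothing else. The table name and the attribute names (`id`, `status`, `createdAt`, `payload`) are my assumptions for illustration, not a fixed schema, and sending the request with an AWS SDK client is left out.

```typescript
// Shape of a low-level DynamoDB PutItem request (only the fields used here).
interface PutItemInput {
  TableName: string;
  Item: Record<string, { S: string } | { N: string }>;
}

// Build the request that stores a new item as "pending".
// Attribute names are hypothetical; pick whatever fits your table.
function buildEnqueueRequest(
  tableName: string,
  id: string,
  payload: unknown,
  nowMs: number,
): PutItemInput {
  return {
    TableName: tableName,
    Item: {
      id: { S: id },
      status: { S: "pending" },
      // DynamoDB numbers are sent as strings on the wire.
      createdAt: { N: String(nowMs) },
      payload: { S: JSON.stringify(payload) },
    },
  };
}
```

Ingest stays cheap this way: one write per incoming item, with all real work deferred to the executor.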

Picking next item to execute

You must be able to run multiple execution processes, maybe on the same machine, maybe on different EC2 instances. Maybe the main executor is a regular EC2 instance and the additional machines are spot instances, to balance performance and cost. The key point is to make sure the same item is never executed by different processes at the same time.

Our source of truth in this setup is DynamoDB. To pick the next item, use an update operation to mark the next pending item with the id of the current running process. Make the update conditional, so that it succeeds only if the item is still pending; otherwise two processes could claim the same item. The process id must be set when the process starts, to some random value that identifies this process. If you kill the file's execution and restart it, the process gets a new id.

So the item is pending, and you update it with this process id. From then on the item belongs to this process, and this process is executing it.
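Here is a rough sketch of what such a claim could look like as a DynamoDB UpdateItem request. The attribute names (`status`, `owner`, `claimedAt`), the table name and the way the process id is generated are all assumptions of mine. The code only builds the request; sending it and finding the id of the next pending item (for example with a query) are not shown.

```typescript
// Random id created once when the process starts; a restart yields a new id.
// (Assumption: any sufficiently random string works as an identifier.)
const PROCESS_ID =
  "proc-" + Date.now().toString(36) + "-" + Math.random().toString(36).slice(2, 10);

// Shape of a low-level DynamoDB UpdateItem request (only the fields used here).
interface UpdateItemInput {
  TableName: string;
  Key: Record<string, { S: string }>;
  UpdateExpression: string;
  ConditionExpression: string;
  ExpressionAttributeNames: Record<string, string>;
  ExpressionAttributeValues: Record<string, { S: string } | { N: string }>;
}

// Build a conditional update that claims an item for one process.
// The condition makes the write fail if the item is no longer pending,
// so only one of several competing processes can win.
function buildClaimRequest(
  tableName: string,
  itemId: string,
  processId: string,
  nowMs: number,
): UpdateItemInput {
  return {
    TableName: tableName,
    Key: { id: { S: itemId } },
    UpdateExpression: "SET #status = :running, #owner = :pid, #claimedAt = :now",
    ConditionExpression: "#status = :pending",
    // Placeholders avoid clashes with DynamoDB reserved words such as "status".
    ExpressionAttributeNames: {
      "#status": "status",
      "#owner": "owner",
      "#claimedAt": "claimedAt",
    },
    ExpressionAttributeValues: {
      ":running": { S: "running" },
      ":pending": { S: "pending" },
      ":pid": { S: processId },
      ":now": { N: String(nowMs) },
    },
  };
}
```

If another process got there first, DynamoDB rejects the write with a ConditionalCheckFailedException, and this process should simply move on to another item.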

Execution of the item

An EC2 spot instance may be terminated by AWS at any moment. The instance gets a notice 2 minutes before termination, so your process has 2 minutes to finish.

Once you have updated the item in DynamoDB, the clock starts. You have 2 minutes to execute the item. After that it must be assumed that your process failed to execute it, and the same item becomes available to any other process.
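The 2-minute rule can be modelled in memory as below. This is only an illustration of the logic with hypothetical field names (`status`, `owner`, `claimedAt`), not DynamoDB code. In DynamoDB the same rule would have to live in the condition of the update, along the lines of `#status = :pending OR #claimedAt < :expiredBefore`.

```typescript
// Time a process has to finish an item before others may take it over.
const LEASE_MS = 2 * 60 * 1000;

// Hypothetical in-memory view of a queue item.
interface QueueItem {
  id: string;
  status: "pending" | "running";
  owner?: string;
  claimedAt?: number;
}

// An item can be claimed if nobody has it, or if its claim is older than the lease.
function isClaimable(item: QueueItem, nowMs: number, leaseMs: number = LEASE_MS): boolean {
  if (item.status === "pending") return true;
  return item.claimedAt !== undefined && nowMs - item.claimedAt > leaseMs;
}

// Claim the item for a process; returns false if another process still holds it.
function tryClaim(
  item: QueueItem,
  processId: string,
  nowMs: number,
  leaseMs: number = LEASE_MS,
): boolean {
  if (!isClaimable(item, nowMs, leaseMs)) return false;
  item.status = "running";
  item.owner = processId;
  item.claimedAt = nowMs;
  return true;
}
```

With this rule, a process that dies together with its spot instance does not block the item forever: after the lease runs out, the next process to look at the item may take it.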

Finishing the execution

Once you are done executing your item, you must delete it from the queue so that no other process picks it up for a second execution attempt.
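A sketch of the delete, again only as a request builder for DynamoDB's DeleteItem. The condition on the owner is an extra guard I am assuming here, not something the setup strictly requires: it keeps a slow process from deleting an item that has been handed to another process in the meantime. The attribute name `owner` is hypothetical.

```typescript
// Shape of a low-level DynamoDB DeleteItem request (only the fields used here).
interface DeleteItemInput {
  TableName: string;
  Key: Record<string, { S: string }>;
  ConditionExpression: string;
  ExpressionAttributeNames: Record<string, string>;
  ExpressionAttributeValues: Record<string, { S: string }>;
}

// Build a delete that only succeeds while the item still belongs to this process.
function buildFinishRequest(
  tableName: string,
  itemId: string,
  processId: string,
): DeleteItemInput {
  return {
    TableName: tableName,
    Key: { id: { S: itemId } },
    ConditionExpression: "#owner = :pid",
    ExpressionAttributeNames: { "#owner": "owner" },
    ExpressionAttributeValues: { ":pid": { S: processId } },
  };
}
```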

Running the file

The file that does all of these steps (pick, execute, finish) must run on some interval. This interval defines the lag between data ingest and the start of execution when the queue is empty.

Let's say the interval on which the file runs is set to 1 minute. The file runs, there is nothing to do, and it exits. Immediately after the exit a new item is saved to the queue. This item will wait for the full 1 minute before its execution starts. Depending on the queue and what you are doing, this may be very bad UX. Sometimes you may need to run on a short interval, like a couple of seconds.

Let's say your interval is 1 second. In this case you will call the database every second to check whether there are pending items in the queue. Again, depending on what you are doing, this may be overkill and a waste of resources.

Often, a good starting point is to run the file every 5 seconds for fast-paced things and every 5–15 minutes for slow-paced things.

The less time a single item takes to execute, the better.

If you want to control how many executions happen at once on a single machine, you must track the processes on that machine.
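To put numbers on this trade-off, here is a tiny helper. It assumes one database check per run and that an item arriving right after a run waits for one full interval.

```typescript
const SECONDS_PER_DAY = 24 * 60 * 60;

// How many times per day the file checks the database at a given interval.
function pollsPerDay(intervalSeconds: number): number {
  return Math.floor(SECONDS_PER_DAY / intervalSeconds);
}

// Worst case for an empty queue: the item arrives right after a run has exited.
function worstCaseLagSeconds(intervalSeconds: number): number {
  return intervalSeconds;
}

// Compare a few intervals: 1 s, 5 s, 1 min, 5 min, 15 min.
for (const interval of [1, 5, 60, 300, 900]) {
  console.log(
    "interval " + interval + "s: " + pollsPerDay(interval) +
      " polls/day, up to " + worstCaseLagSeconds(interval) + "s lag",
  );
}
```

A 1-second interval means 86,400 checks per day, 5 seconds means 17,280, and 15 minutes means only 96.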

You can try a “timing” technique. If the file runs every 5 seconds and a single item takes, say, 300 ms to execute, you keep picking items until you have spent 5 seconds minus 300 ms (to make sure you do not go over the limit). You execute as many items as possible and end the process as the 5 seconds come to a close.

You can try a sub-process technique, which to me feels far more robust. In Node.js this would be child processes. You run a single runner file that handles the concurrency of all workers, where 1 worker is 1 execution process. In the EC2 spot instance case, you must listen for the AWS notification about the upcoming termination. Once it arrives, do not spawn any new processes, and inside the running processes finish the current execution and do not pick anything new, so that by the time of termination the processes sit idle.
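A minimal sketch of the timing idea follows. To keep it short, execution is synchronous and the clock is passed in as a function, so it can be tried without a real queue. `pickNext` and `execute` are hypothetical callbacks standing in for the DynamoDB claim and for your actual work.

```typescript
// Keep executing items until the time budget, minus a safety reserve, is used up.
// Returns how many items were executed in this run.
function runWithinBudget<T>(
  pickNext: () => T | undefined, // returns undefined when the queue is empty
  execute: (item: T) => void,
  now: () => number, // current time in ms, e.g. () => Date.now()
  budgetMs: number = 5000, // how often the file runs
  reserveMs: number = 300, // roughly the time one item takes
): number {
  const deadline = now() + budgetMs - reserveMs;
  let executed = 0;
  // Only start a new item while there is still room for one more execution.
  while (now() < deadline) {
    const item = pickNext();
    if (item === undefined) break; // nothing left to do, exit early
    execute(item);
    executed++;
  }
  return executed;
}
```

In a real run you would pass `() => Date.now()` as the clock. The reserve only protects you if 300 ms is a realistic upper bound for one item; a single slow item can still push the run past the 5 seconds.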
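The runner side of this can be sketched as a small class that tracks how many workers are alive and refuses to start new ones once draining begins. This is a simplified model under my own assumptions: `spawnWorker` is a hypothetical callback where a real runner would start a Node.js child process, and how the termination notice is detected (for example by polling the EC2 instance metadata service) is left out.

```typescript
// Tracks running workers and stops starting new ones once draining begins.
class Runner {
  private running = 0;
  private draining = false;
  private readonly maxWorkers: number;
  private readonly spawnWorker: (onExit: () => void) => void;

  // spawnWorker must call onExit when the worker it started has exited.
  constructor(maxWorkers: number, spawnWorker: (onExit: () => void) => void) {
    this.maxWorkers = maxWorkers;
    this.spawnWorker = spawnWorker;
  }

  // Call this when the spot termination notice arrives.
  drain(): void {
    this.draining = true;
  }

  // Start workers until the concurrency limit is reached.
  // Returns how many workers were started by this call.
  fill(): number {
    const slots = this.draining ? 0 : Math.max(0, this.maxWorkers - this.running);
    for (let i = 0; i < slots; i++) {
      this.running++;
      this.spawnWorker(() => {
        this.running--;
      });
    }
    return slots;
  }

  get activeWorkers(): number {
    return this.running;
  }
}
```

The runner would call `fill()` on a timer or whenever a worker exits. After `drain()` the running workers finish their current item and exit, nothing new is started, and the machine is idle when AWS takes it away.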
