How do you protect your system from DDoS and other attacks?

Lukas Liesis
4 min read · Mar 18, 2019

I found this question on the PitchGround group and decided to write a quick post on this.

Trafikito.com is a monitoring service. When your service is down, Trafikito must still be up. To make that happen, I have implemented several layers of defense to protect against all kinds of bad things.

Tools and tactics used to protect Trafikito at the network layer

1. Cloudflare.com, which has a huge collection of data about all kinds of attacks and can block requests before they even touch Trafikito servers.
2. AWS Virtual Private Cloud configuration, an aggressive firewall, and local private connections to AWS DynamoDB global tables. All data between the NodeJS process and the database travels inside private AWS networks.
3. Load balancing API requests across several regions at the DNS level as well as at the application layer.
4. Moving the static-file load off the servers by using the AWS CloudFront content delivery network and Cloudflare.

At the application layer

1. By per-request throttling mechanisms for all API endpoints.
2. By validating every single bit of incoming data. Trafikito has 69 functions dedicated just to this task.
3. By active monitoring with notifications by email and Slack. Notifications go out even before there are issues with the servers. You can have the same thing for free at https://trafikito.com/
4. By session-less connections and an API-centric application design.
5. By using static files for all content. The main app is built with ReactJS + Webpack, and other pages are under development with Gatsby.
6. By falling back to another endpoint with the lowest lag from the client’s location.

At the database layer

1. By using AWS DynamoDB tables to store all critical data. DynamoDB can handle more than 10 trillion requests per day and support peaks of more than 20 million requests per second.
2. By using AWS DynamoDB Streams to continuously replicate data between global tables in all 3 regions (Asia, US, and Europe), and by using local tables for each EC2 instance.
3. By using continuous Point-in-Time Recovery backups. Trafikito global tables can be restored to any point in time during the last 35 days.
4. By encrypting all sensitive data and generating unique salted hashes for all of it.

The first layer is the Cloudflare service

Cloudflare does a great job scanning for DDoS attacks. They have a huge collection of data about them: who, when, from where, and so on. So they can often detect a DDoS attack before it even starts and block those requests. I see 3 attacks blocked during the last 24 hours.

Trafikito.com stats for last 24 hours at Cloudflare.com

AWS magic

The whole system is load balanced across 3 separate AWS regions: USA, Europe, and Asia. When I deploy an update, I run a rolling update: for a minute or so one region goes offline to update itself, but the system stays online with the other 2 regions. It can work perfectly fine with only a single region online, although depending on the client’s location the service may be a bit slower.

Trafikito.com edge locations for load-balancing and quicker load
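The rolling update described above can be sketched roughly as follows. This is only an illustration and not Trafikito's deployment code: the `Region` shape and the `deploy` and `isHealthy` callbacks are assumptions made for the example. The default of one region that must stay online follows the statement that the system works with a single region.

```typescript
// Hypothetical sketch: update regions one at a time so that at least
// `minOnline` other regions keep serving traffic during the rollout.
type Region = { name: string; online: boolean; version: string };

async function rollingUpdate(
  regions: Region[],
  newVersion: string,
  deploy: (region: Region, version: string) => Promise<void>, // assumed deploy hook
  isHealthy: (region: Region) => Promise<boolean>, // assumed health check
  minOnline: number = 1,
): Promise<void> {
  for (const region of regions) {
    const othersOnline = regions.filter((r) => r !== region && r.online).length;
    if (othersOnline < minOnline) {
      throw new Error("Refusing to take " + region.name + " offline: too few regions online");
    }
    region.online = false; // take the region out of rotation
    await deploy(region, newVersion);
    region.version = newVersion;
    if (!(await isHealthy(region))) {
      // Stop here: the region stays offline and the rest keep the old version.
      throw new Error(region.name + " failed its health check after the update");
    }
    region.online = true; // back in rotation before the next region is touched
  }
}
```

Stopping at the first failed health check means a bad release takes out at most one region, while the untouched regions continue to serve the previous version.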

Incoming data validation

All API endpoints validate each bit of incoming data. No unexpected entries. At Trafikito there are 69 dedicated functions to validate all kinds of input.

Trafikito.com source code. List of validation functions.
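To give a feel for what such validation functions can look like, here is a minimal sketch. It is not Trafikito's source: the helper names (`isNonEmptyString`, `isIntInRange`, `validatePayload`) and the example schema are invented for illustration.

```typescript
// Hypothetical validators; the real 69 functions are not shown in this post.
type Validator = (value: unknown) => boolean;

const isNonEmptyString = (max: number): Validator => (v) =>
  typeof v === "string" && v.length > 0 && v.length <= max;

const isIntInRange = (min: number, max: number): Validator => (v) =>
  typeof v === "number" && Number.isInteger(v) && v >= min && v <= max;

// Returns a list of problems; an empty list means the payload is acceptable.
function validatePayload(payload: unknown, schema: Record<string, Validator>): string[] {
  if (typeof payload !== "object" || payload === null || Array.isArray(payload)) {
    return ["payload must be a JSON object"];
  }
  const data = payload as Record<string, unknown>;
  const errors: string[] = [];
  // Reject fields the endpoint does not know about ("no unexpected entries").
  for (const key of Object.keys(data)) {
    if (!Object.prototype.hasOwnProperty.call(schema, key)) {
      errors.push("unexpected field: " + key);
    }
  }
  // Every expected field must be present and pass its validator.
  for (const key of Object.keys(schema)) {
    if (!schema[key](data[key])) {
      errors.push("invalid or missing field: " + key);
    }
  }
  return errors;
}

// Assumed schema for a made-up "create server" endpoint.
const createServerSchema: Record<string, Validator> = {
  name: isNonEmptyString(64),
  intervalSeconds: isIntInRange(60, 3600),
};
```

An endpoint would call `validatePayload(body, createServerSchema)` first and refuse the request if the returned list is not empty.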

API throttling

All API endpoints have a throttling mechanism. If you make an unusual number of requests, the API slows itself down for you.
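One way to get this "slow down instead of reject" behavior is sketched below. The fixed-window counter, the option names, and all numbers are assumptions for the example; the post does not describe which algorithm Trafikito actually uses.

```typescript
// Hypothetical throttle: requests above `limit` per window are not rejected,
// they are answered with a growing artificial delay.
type ThrottleOptions = {
  limit: number; // requests per window that are served without delay
  windowMs: number; // window length in milliseconds
  stepMs: number; // extra delay added per request above the limit
  maxDelayMs: number; // upper bound on the delay
};

function createSlowdownThrottle(options: ThrottleOptions) {
  // Note: this map is never pruned; real code would evict idle clients.
  const windows = new Map<string, { start: number; count: number }>();

  // Returns how many milliseconds to wait before handling this request.
  return function delayFor(clientId: string, now: number = Date.now()): number {
    let current = windows.get(clientId);
    if (!current || now - current.start >= options.windowMs) {
      current = { start: now, count: 0 }; // start a fresh window
      windows.set(clientId, current);
    }
    current.count += 1;
    const excess = Math.max(0, current.count - options.limit);
    return Math.min(excess * options.stepMs, options.maxDelayMs);
  };
}
```

A request handler would look up the delay for the caller (for example by API key or IP address) and wait that long before doing any work, so well-behaved clients are never slowed down.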

Monitoring

All servers are monitored once per minute. If a server starts heading toward overload, I get an email and a Slack message. It’s all actually done with Trafikito itself. We also use an application monitoring service independent of Trafikito as a backup.

Trafikito.com homepage https://trafikito.com/
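The idea of being notified before a server is actually overloaded comes down to having a warning level below the critical one. A minimal sketch follows, with assumed metric names and thresholds, and with the email and Slack senders passed in as plain functions; it is not how Trafikito itself is implemented.

```typescript
// Hypothetical once-per-minute check with a warning level below the critical one.
type Metric = "cpuPercent" | "memoryPercent";
type Sample = { server: string; cpuPercent: number; memoryPercent: number };
type Alert = { server: string; metric: Metric; value: number; level: "warning" | "critical" };

// Assumed thresholds, picked only for illustration.
const thresholds: Array<{ metric: Metric; warning: number; critical: number }> = [
  { metric: "cpuPercent", warning: 70, critical: 90 },
  { metric: "memoryPercent", warning: 75, critical: 95 },
];

function evaluate(sample: Sample): Alert[] {
  const alerts: Alert[] = [];
  for (const t of thresholds) {
    const value = sample[t.metric];
    if (value >= t.critical) {
      alerts.push({ server: sample.server, metric: t.metric, value, level: "critical" });
    } else if (value >= t.warning) {
      alerts.push({ server: sample.server, metric: t.metric, value, level: "warning" });
    }
  }
  return alerts;
}

// Each channel is an assumed sender, e.g. one for email and one for a Slack webhook.
function notify(alerts: Alert[], channels: Array<(message: string) => void>): void {
  for (const alert of alerts) {
    const message = "[" + alert.level + "] " + alert.server + ": " + alert.metric + " at " + alert.value + "%";
    for (const send of channels) send(message);
  }
}
```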

API-centric static files application

The whole application is made of static files. Many of them are cached on a lot of AWS CloudFront edges, and Cloudflare does its own caching. All API requests are also load balanced, with fallback at the application layer.
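Application-layer fallback can be sketched like this: try the endpoint with the lowest measured lag first and move to the next one when a call fails. The `Endpoint` shape, the `call` function, and the idea that latencies are already measured are assumptions for the example.

```typescript
// Hypothetical client-side fallback: try API endpoints from lowest to highest
// measured latency and return the first successful response.
type Endpoint = { url: string; latencyMs: number };

async function requestWithFallback<T>(
  endpoints: Endpoint[],
  call: (url: string) => Promise<T>, // assumed request function, e.g. a fetch wrapper
): Promise<T> {
  const ordered = endpoints.slice().sort((a, b) => a.latencyMs - b.latencyMs);
  let lastError: unknown = new Error("no endpoints configured");
  for (const endpoint of ordered) {
    try {
      return await call(endpoint.url);
    } catch (error) {
      lastError = error; // remember the failure and try the next endpoint
    }
  }
  throw lastError; // every endpoint failed
}
```

Because the request function is passed in, the same helper works in the browser and on the server, and it only gives up when every endpoint has failed.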

The database

And lastly, for the database, I use AWS DynamoDB global tables, which can handle a ton of requests and have continuous backups with point-in-time recovery.

DynamoDB can scale to more than 10 trillion requests per day with peaks greater than 20 million requests per second, over petabytes of storage.

This is the performance of the database during the last 2 weeks:

The average response time (~3–4ms) of the database at Trafikito.com

Lukas Liesis

2 decades exp of building business value through web tech. Has master's degree in Physics. Built startups with ex-Googlers. Was world’s top 3% developer.