I write about tech, Indian classical music, literature, and the workplace among other things. 1x engineer on weekdays. https://kovidrathee.medium.com/membership

The title is a bit over-the-top. Nothing is as sexy as building rockets, and nothing else has to be. As you rightly pointed out at the end of the article, there are many ways of contributing to the causes you care about.

Data privacy, security, and governance are going to be some of the big challenges for both organisations and individuals in the time to come. Keeping people's data safe, not just from a cybersecurity point of view but also from a governance point of view, sounds really cool too. It probably won't get compared to rocket science, but that's okay.

Great read for someone who wants to understand how Kubernetes can help data engineers. I just started reading the Zero to Hero JupyterHub documentation on @Vladimir's recommendation. It's more useful than I thought - great introduction to Kubernetes and Helm charts!

Interesting! Haven't come across this as a necessary condition before. The other five, i.e., completeness, uniqueness, validity, accuracy, and consistency make perfect sense.

Timelessness as defined here is how useful or relevant your data is according to its age. The other five conditions can usually be tested by writing some code. How do you test the timelessness of data? Isn't this too subjective? It would be interesting to understand how to track this data quality metric for a column or a table.

If estimating the timelessness parameter is just for the sake of deciding on an archival strategy, then that's not very critical. You can eyeball that parameter.
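To make the question concrete, here is a minimal sketch of what "tested by writing some code" could look like for two of the other conditions, next to one possible age-based metric. Everything in it is an assumption for illustration: the sample rows, the column names, the helper names, and above all the `max_age` threshold, which is exactly where the subjectivity creeps in.

```python
from datetime import datetime, timedelta

# Hypothetical sample rows; the column names are assumptions for illustration.
rows = [
    {"id": 1, "email": "a@example.com", "updated_at": datetime(2024, 1, 10)},
    {"id": 2, "email": None, "updated_at": datetime(2023, 6, 1)},
    {"id": 2, "email": "c@example.com", "updated_at": datetime(2021, 3, 15)},
]


def completeness(rows, column):
    """Fraction of rows where the column is not null."""
    return sum(r[column] is not None for r in rows) / len(rows)


def uniqueness(rows, column):
    """Fraction of distinct values among all values in the column."""
    values = [r[column] for r in rows]
    return len(set(values)) / len(values)


def freshness(rows, column, as_of, max_age):
    """Fraction of rows whose timestamp is no older than max_age at as_of.

    max_age is a judgment call made per table or column, not a universal rule.
    """
    return sum(as_of - r[column] <= max_age for r in rows) / len(rows)


as_of = datetime(2024, 2, 1)
report = {
    "email_completeness": completeness(rows, "email"),
    "id_uniqueness": uniqueness(rows, "id"),
    "fresh_within_180_days": freshness(rows, "updated_at", as_of, timedelta(days=180)),
}
print(report)
```

Completeness and uniqueness need no extra input, while the age-based number only means something once someone has decided what "too old" is for that table. Tracking it over time would amount to storing this ratio per run.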

Great, in-depth article on the struggles of choosing the right tech for solving data engineering problems and course-correcting when things go south. From a MySQL instance to Apache Druid, and from diffing complete data sets for updates to building a solution that Druid doesn't recommend but that works for you, this is a good, optimistic read for anyone facing challenges in choosing the right tech and making it work.

Photo by Alex Kotliarskyi on Unsplash

DATA ON AWS

And the newest TSDBs on the AWS Marketplace

Following on from my previous posts on timeseries databases, where I made the case for using them and shared some hands-on tutorials, in this post I’ll talk about different timeseries databases and how you can run them on AWS.

Running Timeseries Databases on AWS 🏃

Hosting and running any open source or enterprise database…

I get where you are coming from, but databases like PostgreSQL and MySQL are extremely good at complex queries. I think a better way to put it would be that relational databases aren't good at handling the kind of queries that reporting and business intelligence need.

In other words, read scaling is not a problem with relational databases if you're sending the right kind of reads to the database.
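A rough way to see the difference between the two kinds of reads is to compare their query plans. The sketch below uses SQLite from Python's standard library as a stand-in (not PostgreSQL or MySQL, whose planners and output differ), and the `orders` table and its columns are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for a relational database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(10_000)],
)


def plan(sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# A transactional-style read: fetch one row by primary key.
point_read = plan("SELECT amount FROM orders WHERE id = 42")

# A reporting-style read: aggregate over every row in the table.
report_read = plan(
    "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"
)

print(point_read)
print(report_read)
```

The first plan is a key lookup that touches a single row; the second has to scan the whole table to build the aggregate. Plenty of the first kind of read scales comfortably, while the second kind is what reporting and BI workloads tend to send.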

Kovid Rathee
