I write about tech, Indian classical music, literature, and the workplace among other things. 1x engineer on weekdays. https://kovidrathee.medium.com/membership
Photo by Uriel SC on Unsplash

What usually goes wrong with database index design and implementation

Introduction

Index design is one of the most important things you do when working with a database. Effective index design can have dramatic effects on database performance. Well-designed indexes can make your database fast and efficient. Badly designed indexes can make your database miserably slow and can cost you a lot…

The title is a bit over-the-top. Nothing is as sexy as building rockets. Nothing else has to be; as you rightly pointed out at the end of the article, there are many ways of contributing to the causes you care about.

Data privacy, security, and governance are going to be some of the big challenges for both organisations and individuals in the time to come. Keeping people's data safe, not just from a cybersecurity point of view but also from a governance point of view, sounds really cool too. It probably won't get compared to rocket science, but that's okay.

Great read for someone who wants to understand how Kubernetes can help data engineers. I just started reading the Zero to Hero JupyterHub documentation on @Vladimir's recommendation. It's more useful than I thought - great introduction to Kubernetes and Helm charts!

Interesting! Haven't come across this as a necessary condition before. The other five, i.e., completeness, uniqueness, validity, accuracy, and consistency make perfect sense.

Timelessness as defined here is how useful or relevant your data is according to its age. The other five conditions can usually be tested by writing some code. How do you test the timelessness of data? Isn't this too subjective? It would be interesting to understand how to track this data quality metric for a column or a table.

If estimating the timelessness parameter is just for the sake of deciding on an archival strategy, then that's not very critical. You can eyeball that parameter.

Great, in-depth article on the struggles of choosing the right tech for solving data engineering problems and course-correcting when things go south. From a MySQL instance to Apache Druid, from diffing complete data sets for updates to building a solution that Druid doesn't recommend but that works for you, this is a good, optimistic read for anyone facing challenges choosing the right tech and making it work.

Photo by Liam Briese on Unsplash

A short introduction to the latest in-memory database service from AWS.

In a recent post in this series, I talked about different options for caching in AWS. I talked about ElastiCache for Redis, as it is one of the most popular caches out there, especially because of its advanced data types like sorted sets which make a developer’s life way easier…

Photo by Alex Kotliarskyi on Unsplash

And the newest TSDBs on the AWS Marketplace

Following on from my previous posts on timeseries databases, where I made the case for using them and shared some hands-on tutorials, in this post I’ll talk about different timeseries databases and how you can run them on AWS.

Running Timeseries Databases on AWS 🏃

Hosting and running any open source or enterprise database…

Kovid Rathee
