@Duy Tran - A reference to the TableAPI you've built brought me here. Absolutely wonderful job with the article, and an even better job building a scalable data system that enforces standardization.
Although you've underscored the importance of having the standardized metadata by way of defining interfaces and APIs…
@Amir - Both indexes and partitions help with joining tables. If you aren't on a distributed system (say, you run a single MySQL or PostgreSQL instance) but still have a lot of data, indexes can grow too big, and maintaining them can become a pain. So, you can think about partitioning…
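To make the partitioning idea concrete, here is a minimal sketch, assuming a month-based split and using SQLite from Python's standard library purely for illustration. The table naming scheme and the helpers `partition_name`, `insert_event`, and `count_events` are hypothetical; MySQL and PostgreSQL offer their own built-in partitioning, which this does not reproduce.

```python
import sqlite3
from datetime import date


def partition_name(day: date) -> str:
    # Hypothetical naming scheme: one table per calendar month.
    return f"events_{day.year}_{day.month:02d}"


def insert_event(conn: sqlite3.Connection, day: date, user_id: int) -> None:
    # Route the row to the table (partition) for its month, creating it if needed.
    table = partition_name(day)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} (user_id INTEGER, day TEXT)")
    conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (user_id, day.isoformat()))


def count_events(conn: sqlite3.Connection, day_in_month: date) -> int:
    # Only the partition for the requested month is read; other months are never touched.
    table = partition_name(day_in_month)
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()
    if exists is None:
        return 0
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = sqlite3.connect(":memory:")
insert_event(conn, date(2024, 1, 5), user_id=1)
insert_event(conn, date(2024, 1, 20), user_id=2)
insert_event(conn, date(2024, 2, 3), user_id=1)
print(count_events(conn, date(2024, 1, 1)))
```

The point of the sketch is that a query for one month scans only that month's rows, so each partition (and any index on it) stays small.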
@Robby - Awesome introduction to DuckDB! I have been reading about it and have been doing some basic PoCs using DuckDB lately. I'm not sure how many databases have actually implemented the EXCLUDE keyword. I think it's a novelty in DuckDB. With ClickHouse, Druid, and now DuckDB (and maybe some…
The fact that the engineering team quickly figured out what was wrong and fixed it within four hours (which is impressive, given the scale of the problem), and then thought deeply about what happened in the post-incident analysis, is a testament to some great work under severe pressure.
It's great when recommendation engines work and you end up reading something meaningful. I stumbled upon this blog post as it showed up in my feed. It's full of very practical advice regarding compensation, growth, and the ethics of changing jobs in a really competitive market.
I agree with most…
I didn't know about this tool. I would have written some kind of recursive SQL query or some Python code to generate test data, but it's great to know that Databricks already provides this out of the box.
This can be great for running small-scale tests for loads specific to your business…
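For comparison, the hand-rolled route mentioned above (a recursive SQL query driven from Python) might look something like the sketch below. It is not the Databricks tool; it uses SQLite from Python's standard library, and the function name `generate_test_rows`, the column names, and the arithmetic used to spread the values are all made up for illustration.

```python
import sqlite3


def generate_test_rows(n: int) -> list:
    """Return n deterministic (id, customer_id, amount) rows built by a recursive CTE."""
    if n <= 0:
        return []
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(
            """
            WITH RECURSIVE seq(id) AS (
                SELECT 1
                UNION ALL
                SELECT id + 1 FROM seq WHERE id < ?
            )
            SELECT id,
                   (id * 7919) % 100 AS customer_id,               -- values in 0..99
                   ((id * 104729) % 100000) / 100.0 AS amount      -- values in 0.00..999.99
            FROM seq
            """,
            (n,),
        ).fetchall()
    finally:
        conn.close()


for row in generate_test_rows(3):
    print(row)
```

Because the values are derived from the row id rather than a random source, repeated runs produce the same data, which keeps small load tests reproducible.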