Đối với một Data Engineer như mình, ưu tiên chọn một ngôn ngữ dựa trên việc nó có giải quyết được hết hầu hết các nhu cầu và bài toán của mình hay không: Data Engineering, Distributed System và Web Development. Và cuối cùng mình dự định sẽ bắt đầu với Rust, bởi vì ...
Spark 3.1 on the Kubernetes project is now officially declared as production-ready and Generally Available. Spot instances in Kubernetes can cut your bill by up to 70-80% if you are willing to trade in reliability. The new feature - SPIP: Add better handling for node shutdown (SPARK-20624) was implemented to deal with the problem of losing an executor when working with spot nodes - the need to recompute the shuffle or cached data.
Hey, I just found this tool, so incredibly clever that it uses Github Actions for uptime monitor and status page.
I'm looking for some of alternatives for Docker. Currently, there are a few of container technologies which are Docker’s most direct competitors, such as rkt, Podman, containerd, ...
More than 200+ companies are using ClickHouse today. With many features support, it's equally powerful for both Analytics and Big Data service backend.
A tool for writing better scripts. I usually choose to write a Python or Deno script instead of a shell script for more convenience. I found this tool is so great, helping to write the script quickly.
Bitbucket Pipelines document is fragmented everywhere. It always makes me search for a while every time I write a new one for CI/CD. So I'll make a few notes here.
Postgres has built-in functions to handle Full Text Search queries. This is like a "search engine" within Postgres.