data-engineering

Materialize

Materialize is a streaming database with a SQL API. Although it uses SQL idioms and can process data from databases, it has very little in common with “databases” as most people think of them.
Setting up Airflow on AWS

Airflow is one of my favorite tools, and I use it frequently to set up and manage data science pipelines. The Airflow UI gives a clear picture of the DAGs and their current status. I may be wrong here, but in my experience Airflow on a single machine does not scale. To scale Airflow, we can use Kubernetes.
A simple approach for background task in Django

When there is a long-running task, there are usually two requirements: as a user, I want to know the progress of the task, and as a user, I want to get the output of the task once it has finished. We will use two out-of-the-box features, threading in Python and the cache in Django, to achieve this.
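
The idea can be sketched roughly as follows. This is a minimal illustration, not the linked article's code: `InMemoryCache`, `long_running_task`, `start_task`, and `get_status` are hypothetical names, and the small in-memory class only stands in for Django's cache so the sketch runs without a Django project. In a real project the assumption is that you would import `cache` from `django.core.cache` and use its `get` and `set` calls the same way.

```python
import threading
import uuid


class InMemoryCache:
    """Hypothetical stand-in exposing only the get/set subset of a cache API."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)


# Assumption: in a Django project this would be
# `from django.core.cache import cache` instead of the stand-in.
cache = InMemoryCache()


def long_running_task(task_id, items):
    """Square each item, recording progress in the cache after every step."""
    results = []
    for i, item in enumerate(items, start=1):
        results.append(item * item)
        progress = int(100 * i / len(items))
        cache.set(task_id, {"progress": progress, "output": None})
    # Store the final output so a later request can fetch it.
    cache.set(task_id, {"progress": 100, "output": results})


def start_task(items):
    """Start the task in a background thread; return its id and the thread."""
    task_id = str(uuid.uuid4())
    cache.set(task_id, {"progress": 0, "output": None})
    thread = threading.Thread(
        target=long_running_task, args=(task_id, items), daemon=True
    )
    thread.start()
    return task_id, thread


def get_status(task_id):
    """Return the latest progress/output record, or None for an unknown id."""
    return cache.get(task_id)
```

A view that starts the work would call `start_task` and hand the `task_id` back to the client, and a second view would call `get_status(task_id)` to report progress and, once finished, the output. Note that a per-process cache would not be visible across multiple worker processes, so a shared cache backend is assumed for anything beyond a single process.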
7 Tips To Maximize PyTorch Performance

Over the last 10 months, while working on PyTorch Lightning, the team and I have been exposed to many styles of structuring PyTorch code, and we have identified a few key places where people inadvertently introduce bottlenecks.
Airtable

Airtable’s intuitive yet powerful platform gives everyone the flexibility to create their own solution and make work flow faster.