Machine learning operations (MLOps) is becoming an exciting space as we figure out the best practices and technologies to deploy machine learning models in the real world. MLOps enables ML teams to build responsible and scalable machine learning systems and infrastructure.
Link
This book is an introduction to the Polars DataFrame library. Its goal is to introduce you to Polars by going through examples and comparing it to other solutions. Some design choices are introduced here. The guide will also introduce you to optimal usage of Polars.
Link
smart_open is a Python 3 library for efficient streaming of very large files from/to storage systems such as S3, GCS, Azure Blob Storage, HDFS, WebHDFS, HTTP, HTTPS, SFTP, or the local filesystem. It supports transparent, on-the-fly (de-)compression for a variety of different formats.
Link
A time series is a sequence of data points indexed in time order. It records observations of the same variable at successive points in time, so the data as a whole covers a period of time.
Link
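The definition above can be made concrete with a small sketch using only the Python standard library. The readings, the start date, and the helper names `is_time_ordered` and `successive_changes` are made up for illustration; they do not come from the linked article.

```python
from datetime import date, timedelta

# Hypothetical daily readings of one variable (illustrative values only).
start = date(2024, 1, 1)
readings = [3.5, 4.1, 2.8, 5.0, 4.4]

# Pair each observation with its date, so the data is indexed in time order.
series = [(start + timedelta(days=i), value) for i, value in enumerate(readings)]


def is_time_ordered(points):
    """Return True if the timestamps are strictly increasing."""
    return all(a[0] < b[0] for a, b in zip(points, points[1:]))


def successive_changes(points):
    """Change in the variable between each pair of successive points."""
    return [(b[0], b[1] - a[1]) for a, b in zip(points, points[1:])]


if __name__ == "__main__":
    print(is_time_ordered(series))
    for day, change in successive_changes(series):
        print(day.isoformat(), round(change, 2))
```

Keeping the timestamp next to each value is what distinguishes a time series from a plain list of numbers: operations such as the successive change only make sense because the order is meaningful.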
Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code. It borrows concepts from software engineering and applies them to machine-learning code; applied concepts include modularity, separation of concerns and versioning. Kedro is hosted by the LF AI & Data Foundation.
Link
Implementations are for learning purposes only. As they may be less efficient than the implementations in the Python standard library, use them at your discretion.
Link
AWS Step Functions is a visual workflow service that allows you to orchestrate AWS services, automate business processes, and build serverless applications.
Link
Amazon Kinesis Data Streams is used to ingest real-time streaming data. That data can now be ingested into Amazon Redshift for real-time analytics. Learn how to integrate Kinesis Data Streams with Amazon Redshift.
Link