Machine learning is a field of artificial intelligence (AI) concerned with learning from data. It has three main branches:
Supervised learning: Fitting predictive models using data for which outcomes are available.
Unsupervised learning: Transforming and partitioning data where outcomes are not available.
Reinforcement learning: Learning by interacting with an environment, where an agent chooses actions and receives reward feedback, often without observing the full state of the environment. Reinforcement learning is frequently applied in robotics.
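To make the three branches concrete, here is a minimal sketch using only the Python standard library. The function names (`fit_line`, `two_means`, `epsilon_greedy_bandit`) and the toy data are assumptions made for illustration. Each function is just one simple instance of its branch: least squares for supervised learning, a one-dimensional 2-means clustering for unsupervised learning, and an epsilon-greedy bandit as about the simplest reinforcement learning setting.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Supervised: least-squares fit of y = a*x + b from labelled pairs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, y_bar - a * x_bar


def two_means(points, iterations=20):
    """Unsupervised: 1-D k-means with k=2, no outcomes needed."""
    lo, hi = min(points), max(points)  # start from the two extremes
    for _ in range(iterations):
        left = [p for p in points if abs(p - lo) <= abs(p - hi)]
        right = [p for p in points if abs(p - lo) > abs(p - hi)]
        if not right:  # all points are identical
            break
        lo, hi = mean(left), mean(right)
    return lo, hi


def epsilon_greedy_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Reinforcement: learn arm values from reward feedback alone."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)
    values = [0.0] * len(reward_probs)
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(reward_probs))  # explore
        else:
            arm = max(range(len(values)), key=values.__getitem__)  # exploit
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental average of the rewards observed for this arm
        values[arm] += (reward - values[arm]) / counts[arm]
    return values


slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
centres = two_means([1.0, 1.2, 0.8, 9.0, 9.2, 8.8])
estimates = epsilon_greedy_bandit([0.2, 0.8])
```

The supervised function is the only one that sees outcomes (`ys`). The clustering function sees only the inputs, and the bandit sees only the rewards that follow its own choices.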
Posts on machine learning
In the following posts, machine learning is applied to solve problems using R.
LMOps is a research initiative on fundamental research and technology for building AI products with foundation models, especially on the general technology for enabling AI capabilities with LLMs and generative AI models.
Link
Industrializing data consumption by ML systems during both experimentation (historical batch) and production (batch and stream): growing beyond toy ML on CSV files and single-threaded Flask deployments of pickled models.
[Link](https://medium.com/@sunil_iitb/ml-systems-industrialization-and-mlops-b30106974454)
Loading your training data becomes an escalating challenge as datasets grow bigger in size and the number of nodes scales. We built StreamingDataset to make training on large datasets from cloud storage as fast, cheap, and scalable as possible. Specially designed for multi-node, distributed training, StreamingDataset maximizes correctness guarantees, performance, and ease of use.
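The core difficulty described above, having several nodes read one sharded dataset without duplicating or dropping samples, can be sketched in a few lines. This is an illustrative simplification and not the StreamingDataset API or its actual shuffling algorithm. The names `node_sample_order` and `locate`, and the shard sizes used, are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def node_sample_order(shard_sizes: Sequence[int], rank: int,
                      world_size: int, epoch: int, seed: int = 0) -> List[int]:
    """Return the global sample IDs that one node reads in one epoch."""
    ids = list(range(sum(shard_sizes)))
    # Every node uses the same seed and epoch, so all nodes compute the
    # same permutation without communicating.
    random.Random(seed * 1_000_003 + epoch).shuffle(ids)
    # A strided slice gives each rank a disjoint part of that permutation.
    return ids[rank::world_size]


def locate(sample_id: int, shard_sizes: Sequence[int]) -> Tuple[int, int]:
    """Map a global sample ID to (shard index, offset within the shard)."""
    for shard, size in enumerate(shard_sizes):
        if sample_id < size:
            return shard, sample_id
        sample_id -= size
    raise IndexError("sample_id out of range")


shard_sizes = [4, 3, 5]  # hypothetical shards holding 12 samples in total
per_node = [node_sample_order(shard_sizes, rank, 3, epoch=0) for rank in range(3)]
```

Because the permutation depends only on the seed and the epoch, the nodes agree on the order and together cover every sample exactly once. A production loader would also need to download, cache, and evict shards, which this sketch leaves out.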
Link
Dust apps rely on model providers to interact with large language models. You can set up your first model provider by clicking on the Providers pane and setting up the OpenAI provider. You’ll need to create an account at OpenAI and retrieve your API key.
Link
This repository contains a curated list of awesome open source libraries that will help you deploy, monitor, version, scale and secure your production machine learning.
Link
Aqueduct gives you a simple Python-native API to define machine learning pipelines, the ability to deploy those pipelines on your existing infrastructure (e.g., Spark, Kubernetes, Lambda), and visibility into the code, data, and metadata associated with your workflows. Aqueduct is fully open-source and runs securely in your cloud.
Link