YouTube Transcripts → Knowledge Graphs for RAG Applications

Here we will explore how to scrape YouTube video transcripts and load them into a knowledge graph for Retrieval-Augmented Generation (RAG) applications. We will use Google Cloud Platform to store the raw transcripts, LangChain to create documents from those transcripts, and a Neo4j graph database to store the resulting documents. In this example, we will build a knowledge graph containing objective musical facts spoken by Anthony Fantano himself on a select few music genres.
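To make the shape of that pipeline concrete before touching any real services, here is a minimal, dependency-free sketch of the middle steps: splitting a transcript into document chunks and preparing the parameters for a graph write. Everything in it is an assumption for illustration only. `TranscriptDocument` is a plain stand-in for a LangChain-style document (text plus metadata), not LangChain's own class, and the `Video` and `Chunk` labels, the `PART_OF` relationship, the chunk sizes, and the function names are hypothetical choices rather than the schema used later on.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptDocument:
    """Stand-in for a LangChain-style document: text plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def chunk_transcript(text, video_id, chunk_size=200, overlap=50):
    """Split a transcript into overlapping character windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    docs = []
    for index, start in enumerate(range(0, len(text), step)):
        docs.append(
            TranscriptDocument(
                page_content=text[start:start + chunk_size],
                metadata={"video_id": video_id, "chunk": index},
            )
        )
        # Stop once a window has reached the end of the transcript.
        if start + chunk_size >= len(text):
            break
    return docs


# Hypothetical Cypher: link each chunk node to the video it came from.
MERGE_CHUNK = (
    "MERGE (v:Video {id: $video_id}) "
    "MERGE (c:Chunk {id: $chunk_id}) "
    "SET c.text = $text "
    "MERGE (c)-[:PART_OF]->(v)"
)


def to_cypher_params(doc):
    """Build the parameter dict that MERGE_CHUNK expects for one document."""
    video_id = doc.metadata["video_id"]
    return {
        "video_id": video_id,
        "chunk_id": f"{video_id}-{doc.metadata['chunk']}",
        "text": doc.page_content,
    }
```

With a live database, each parameter dict would be passed alongside `MERGE_CHUNK` to a Neo4j driver session. Using `MERGE` rather than `CREATE` means re-running the load does not duplicate nodes for a video that has already been ingested.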
