In today’s discussion, we will explore how to create knowledge graphs using LangChain. A knowledge graph represents unstructured data in a structured format: entities become nodes, and the relationships between them become edges. Knowledge graphs have gained popularity in fields like biology and have become more practical with graph databases and analytics tools such as Neo4j. This tutorial demonstrates how generative AI can help create knowledge graphs from textual or unstructured datasets.
To get started, you need to install several libraries:
langchain-experimental
langchain-community
langchain
networkx
langchain-google-genai (if using the free API by Google)
langchain-core
json-repair
After installing the libraries, import the essential packages, including LLMGraphTransformer.
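from langchain_core.documents import Document
from langchain_experimental.graph_transformers import LLMGraphTransformer
# ChatGoogleGenerativeAI ships in the langchain-google-genai package
from langchain_google_genai import ChatGoogleGenerativeAI
import networkx as nx
Configure your LLM with your API key. For instance, if you are using Google's free API, set it up accordingly.
api_key = 'your_google_api_key'
# The model name is an assumption; use any Gemini model your key can access
llm = ChatGoogleGenerativeAI(model='gemini-1.5-flash', google_api_key=api_key)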
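A key written directly into the script is fine for a quick test, but it is easy to commit by accident. Here is a minimal sketch of reading it from the environment instead. The variable name GOOGLE_API_KEY and the helper load_api_key are assumptions made for illustration, not part of LangChain.

```python
import os


def load_api_key(var_name: str = "GOOGLE_API_KEY") -> str:
    """Return the API key stored in the environment variable `var_name`.

    The variable name is an assumption for this sketch; use whatever
    name your own setup already defines.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key
```

With this in place, api_key = load_api_key() can stand in for the literal string.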
Consider a baseline text about Marie Curie. This unstructured text will be converted into a structured format.
text = "Marie Curie was a Polish and naturalized-French physicist and chemist who conducted pioneering research on radioactivity."
The goal is to convert this text into a format where relationships and nodes are represented clearly:
Marie Curie -> Polish Nationality
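The target format is essentially a set of (source, relationship, target) triples. A minimal sketch of that structure in plain Python follows. The Triple class, the render helper, and the triples themselves are hand-written for illustration and are not output from the LLM.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One edge of a knowledge graph: source --relation--> target."""
    source: str
    relation: str
    target: str


# Hand-written triples for the Marie Curie sentence (not LLM output).
triples = [
    Triple("Marie Curie", "nationality", "Poland"),
    Triple("Marie Curie", "nationality", "France"),
]

# Entities are everything that appears at either end of an edge.
entities = sorted({t.source for t in triples} | {t.target for t in triples})


def render(triple: Triple) -> str:
    """Format a triple in arrow notation."""
    return f"{triple.source} -[{triple.relation}]-> {triple.target}"
```

Keeping the relationship as its own field, rather than folding it into the target text, is what lets the same entity be reused across many edges.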
Wrap the text in a Document, which is the input type LLMGraphTransformer expects, and create the transformer around your LLM.
doc = Document(page_content=text)
transformer = LLMGraphTransformer(llm=llm)
Identify the node types and possible relationships within the text. In this example, the node types are ‘country’ and ‘person,’ while relationships might include ‘nationality,’ ‘located in,’ ‘worked at,’ ‘spouse,’ and ‘mother.’
nodes = ['country', 'person']
relationships = ['nationality', 'located in', 'worked at', 'spouse', 'mother']
# The constraints are constructor arguments, so rebuild the transformer with them
transformer = LLMGraphTransformer(llm=llm, allowed_nodes=nodes, allowed_relationships=relationships)
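Restricting the relationships amounts to a whitelist. The sketch below applies the same idea to triples after the fact, which can be handy for checking what an extraction returned. The filter_triples helper and the sample data are hypothetical and are not part of LLMGraphTransformer.

```python
def filter_triples(triples, allowed_relationships):
    """Keep only triples whose relationship is on the whitelist.

    Matching is case-insensitive and treats spaces and underscores as
    equal, so 'located in' matches 'LOCATED_IN'.
    """
    def normalize(name):
        return name.strip().lower().replace("_", " ")

    allowed = {normalize(r) for r in allowed_relationships}
    return [t for t in triples if normalize(t[1]) in allowed]


# Hand-written (source, relationship, target) triples, not LLM output.
sample = [
    ("Marie Curie", "NATIONALITY", "Poland"),
    ("Marie Curie", "RESEARCHED", "Radioactivity"),
    ("Marie Curie", "worked_at", "University of Paris"),
]
relationships = ['nationality', 'located in', 'worked at', 'spouse', 'mother']
kept = filter_triples(sample, relationships)
```

Here the RESEARCHED triple is dropped because that relationship is not on the list, while the other two survive despite differences in case and separators.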
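Generate the graph by passing the loaded document to the transformer. The call takes a list of documents and returns one graph document per input, each holding the extracted nodes and relationships. Copying the relationships into a networkx graph makes them available as edges.
graph_documents = transformer.convert_to_graph_documents([doc])
# MultiDiGraph keeps several relationships between the same pair of nodes
knowledge_graph = nx.MultiDiGraph()
for rel in graph_documents[0].relationships:
    knowledge_graph.add_edge(rel.source.id, rel.target.id, relation=rel.type)
For better readability or to train machine learning models, convert the graph into CSV format.
import csv
# newline='' stops the csv module from writing blank rows on Windows
with open('knowledge_graph.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(['Source ID', 'Target ID', 'Relationship'])
    for source, target, data in knowledge_graph.edges(data=True):
        writer.writerow([source, target, data['relation']])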
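For downstream use, the edge list can be read back with the same csv module. Here is a minimal sketch, assuming the three-column header 'Source ID', 'Target ID', 'Relationship'. The load_edges helper is hypothetical and the sample rows are hand-written rather than produced by the LLM.

```python
import csv
import io
from collections import defaultdict


def load_edges(csv_text):
    """Parse an edge-list CSV into {source: [(relationship, target), ...]}."""
    adjacency = defaultdict(list)
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        adjacency[row["Source ID"]].append((row["Relationship"], row["Target ID"]))
    return dict(adjacency)


# Hand-written sample in the same layout as knowledge_graph.csv.
sample_csv = (
    "Source ID,Target ID,Relationship\n"
    "Marie Curie,Poland,nationality\n"
    "Marie Curie,France,nationality\n"
)
edges = load_edges(sample_csv)
```

For a file on disk, the text can come from open('knowledge_graph.csv', newline='').read().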
If you don't provide labels for entities and relationships, the LLM will identify potential entities and relationships on its own, albeit possibly less accurately.
# Omit allowed_nodes and allowed_relationships to leave the schema up to the LLM
open_transformer = LLMGraphTransformer(llm=llm)
anonymous_graph = open_transformer.convert_to_graph_documents([doc])
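When the LLM picks its own labels, it helps to see how varied they are before relying on them. The sketch below counts relationship labels across triples. The relationship_counts helper is hypothetical, and the triples are hand-written stand-ins for an unconstrained extraction, not real LLM output.

```python
from collections import Counter


def relationship_counts(triples):
    """Count each relationship label in (source, relationship, target) triples."""
    return Counter(relation for _, relation, _ in triples)


# Hand-written triples standing in for an unconstrained extraction.
free_form = [
    ("Marie Curie", "NATIONALITY", "Poland"),
    ("Marie Curie", "CITIZEN_OF", "France"),
    ("Marie Curie", "NATIONALITY", "France"),
]
counts = relationship_counts(free_form)
```

Near-synonymous labels such as NATIONALITY and CITIZEN_OF showing up side by side are a hint that an explicit relationship list would give a more consistent graph.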
Creating a knowledge graph helps convert unstructured data into a structured format useful for various applications, including machine learning workflows. Experiment with smaller datasets before scaling up to conserve costs and ensure efficacy.
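One way to start small is to run the pipeline on a reproducible sample of the corpus first. This is a minimal sketch using only the standard library. The helpers sample_texts and total_characters are hypothetical, and character count is only a rough proxy for how much text an API call will process.

```python
import random


def sample_texts(texts, k, seed=0):
    """Return a reproducible random sample of at most k texts."""
    texts = list(texts)
    if k >= len(texts):
        return texts
    return random.Random(seed).sample(texts, k)


def total_characters(texts):
    """Rough size proxy for the amount of text that would be sent to the LLM."""
    return sum(len(t) for t in texts)
```

Fixing the seed means the same pilot subset is used on every run, so changes in the resulting graph come from the configuration rather than from the sample.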
A knowledge graph is a structured representation of information where entities are nodes, and relationships between them are edges.
LangChain helps automate the extraction of structured data from unstructured text by handing the text to an LLM, making the process efficient and scalable.
You need langchain-experimental, langchain-community, langchain, networkx, langchain-google-genai, langchain-core, and json-repair.
Load your text into a Document, configure the node types and relationships, and use LLMGraphTransformer to generate the knowledge graph.
The LLM can attempt to identify entities and relationships on its own if you don’t specify them, though specificity improves accuracy.
Start with smaller datasets to gauge performance and cost before scaling up to larger datasets, especially when using paid APIs.