LangChain has become one of the most talked-about topics in the developer ecosystem, especially among those building enterprise applications that use large language models (LLMs) for natural-language interactions with data.
In a recent blog post, ‘Breaking the Language Model Barriers with LangChain’, Jayita Bhattacharyya, an associate consultant (Python and AI developer) at Infosys and Intel openAPI evangelist, explains components such as memory, chains and agents, and walks through examples of how LangChain works in an enterprise setup, using Hugging Face.
LangChain is a Python framework for LLMs, released in October 2022 and developed by programmer Harrison Chase. At its core, the framework's main offering is an abstraction layer that makes it easier for programmers to integrate LLMs into their programs.
Initially, it supported only the OpenAI and Cohere APIs, along with a Python interpreter. Today, the project has grown to support over 20 model providers and hosting platforms, over 50 document loaders, more than 10 vector databases, and over 15 tools commonly used by LLMs. Last month, it raised seed funding of $10 million from Benchmark. Soon after, it received another round of funding in the range of $20 to $35 million from Sequoia, at a reported valuation of $200 million.
What makes LangChain special?
Developers have stated that what NumPy and Pandas did for machine learning, LangChain has done for LLMs, significantly increasing their usability and functionality. By using LangChain, developers can empower their applications by connecting them to an LLM, or leverage a large dataset by connecting an LLM to it.
One of the fascinating aspects of LangChain is its ability to create a chain of commands – an intuitive way to relay instructions to an LLM. Each command or ‘link’ of this chain can either call an LLM or a different utility, allowing for the creation of AI agents that can decide information flow based on user input. In addition to chaining, the package has ways to implement memory and built-in benchmarks to evaluate the potential utility of an LLM.
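To make the chaining idea concrete, here is a minimal sketch in plain Python with no LangChain dependency. The names `fake_llm`, `word_count` and `run_chain` are hypothetical, invented for illustration only; a real chain would call a hosted model where `fake_llm` returns a canned string.

```python
from typing import Callable, List

# A "link" is any callable that takes text and returns text.
Link = Callable[[str], str]


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; returns a canned reply."""
    return f"Summary of: {prompt}"


def word_count(text: str) -> str:
    """A non-LLM utility link: appends the word count of its input."""
    return f"{text} ({len(text.split())} words)"


def run_chain(links: List[Link], user_input: str) -> str:
    """Feed the output of each link into the next one, in order."""
    value = user_input
    for link in links:
        value = link(value)
    return value


result = run_chain([fake_llm, word_count], "LangChain connects LLMs to tools")
print(result)
```

Because every link has the same text-in, text-out shape, an LLM call and an ordinary utility can be swapped or reordered without changing the code that runs the chain.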
Example
LangChain using Python:
Installation:
pip install langchain
# or
conda install langchain -c conda-forge
LangChain using Hugging Face:
Here’s a step-by-step guide to experimenting with LangChain.
- Install the huggingface_hub library:
pip install huggingface_hub
- Set the API token:
import os
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "YOUR_HF_TOKEN"
- Import the necessary modules:
from langchain import HuggingFaceHub
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage  # message type passed to the chat model below
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.prompts import PromptTemplate, ChatPromptTemplate
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.retrievers import ArxivRetriever
from langchain import OpenAI, ConversationChain
from langchain import LLMChain
from langchain.agents import load_tools, initialize_agent
- Use the HuggingFaceHub model for translation:
llm = HuggingFaceHub(repo_id="google/flan-t5-xl", model_kwargs={"temperature": 0, "max_length": 64})
result = llm("translate English to German: How old are you?")
- Utilise the ChatOpenAI model for chat-based interactions (this needs an OpenAI API key, set in the OPENAI_API_KEY environment variable):
chat = ChatOpenAI(temperature=0)
result = chat([HumanMessage(content="Translate this sentence from English to French. I love programming.")])
- Use Hugging Face Embeddings for text embedding models:
embeddings = HuggingFaceEmbeddings()
text = "This is a test document."
query_result = embeddings.embed_query(text)
doc_result = embeddings.embed_documents([text])
- Construct prompts using PromptTemplate and ChatPromptTemplate:
string_prompt = PromptTemplate.from_template("tell me a joke about {subject}")
string_prompt_value = string_prompt.format_prompt(subject="soccer")
string_prompt_value.to_string()
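What a prompt template does can be approximated in a few lines of standard-library Python. The class below, `SimplePromptTemplate`, is a hypothetical, simplified analogue written for illustration; it is not LangChain's implementation and covers only plain `{name}` placeholders.

```python
from string import Formatter
from typing import List


class SimplePromptTemplate:
    """Hypothetical, simplified analogue of a prompt template."""

    def __init__(self, template: str):
        self.template = template
        # Collect the placeholder names that appear as {name} in the template.
        self.input_variables: List[str] = [
            field for _, field, _, _ in Formatter().parse(template) if field
        ]

    def format(self, **kwargs: str) -> str:
        """Fill in the placeholders, failing loudly if one is missing."""
        missing = set(self.input_variables) - set(kwargs)
        if missing:
            raise KeyError(f"missing prompt variables: {sorted(missing)}")
        return self.template.format(**kwargs)


joke_prompt = SimplePromptTemplate("tell me a joke about {subject}")
print(joke_prompt.input_variables)
print(joke_prompt.format(subject="soccer"))
```

Keeping the template separate from the values lets the same prompt be reused across many inputs and checked for missing variables before any model is called.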
- Load documents using TextLoader:
loader = TextLoader('./state_of_the_union.txt')
documents = loader.load()
- Split text using CharacterTextSplitter:
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
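The roles of `chunk_size` and `chunk_overlap` are easier to see in a small sketch. The function below, `split_text`, is hypothetical and uses fixed-width character windows; LangChain's CharacterTextSplitter instead splits on a separator and merges the pieces, so treat this as an illustration of the two parameters rather than of the library's algorithm.

```python
from typing import List


def split_text(text: str, chunk_size: int, chunk_overlap: int = 0) -> List[str]:
    """Cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)
```

A non-zero overlap repeats a little text at each boundary, which can help when a sentence that matters for retrieval would otherwise be cut in half.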
- Create a VectorStore using FAISS:
db = FAISS.from_documents(docs, embeddings)
query = "What did the president say about Ketanji Brown Jackson"
docs = db.similarity_search(query)
print(docs[0].page_content)
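Conceptually, a similarity search embeds the query and ranks the stored texts by how close their vectors are to it. The sketch below shows that idea with a toy word-count embedding and brute-force cosine ranking. Both are assumptions made for illustration: `embed`, `cosine` and `similarity_search` are hypothetical names, and a real vector store relies on learned dense embeddings and an index rather than a linear scan.

```python
import math
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Dict[str, float]:
    """Toy embedding (an assumption for illustration): lowercase word counts."""
    return {word: float(n) for word, n in Counter(text.lower().split()).items()}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_search(query: str, texts: List[str], k: int = 1) -> List[str]:
    """Return the k texts whose embeddings are most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(texts, key=lambda t: cosine(query_vec, embed(t)), reverse=True)
    return ranked[:k]


corpus = [
    "the president nominated a judge",
    "the economy grew last quarter",
    "a new vaccine was approved",
]
top = similarity_search("who did the president nominate as judge", corpus, k=1)
print(top[0])
```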
- Use ArxivRetriever to retrieve relevant documents:
retriever = ArxivRetriever(load_max_docs=2)
docs = retriever.get_relevant_documents(query='1605.08386')
print(docs[0].metadata) # meta-information of the Document
- Implement conversation chains using ConversationChain:
llm = OpenAI(temperature=0)
conversation = ConversationChain(llm=llm, verbose=True)
conversation.predict(input="Hi there!")
conversation.predict(input="Let's talk about AI.")
conversation.predict(input="I'm interested in Foundational Models.")
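The memory behind a conversation chain can be pictured as a transcript that is replayed to the model on every turn. The sketch below illustrates that idea in plain Python. `BufferedConversation` and `counting_llm` are hypothetical names, and the "Human:" / "AI:" prompt layout is an assumption chosen for illustration, not a statement of how LangChain formats its prompts.

```python
from typing import Callable, List, Tuple


class BufferedConversation:
    """Hypothetical sketch: keeps every turn and replays it in the prompt."""

    def __init__(self, llm: Callable[[str], str]):
        self.llm = llm
        self.history: List[Tuple[str, str]] = []  # (human, ai) pairs

    def build_prompt(self, user_input: str) -> str:
        """Render past turns followed by the new input."""
        lines: List[str] = []
        for human, ai in self.history:
            lines.append(f"Human: {human}")
            lines.append(f"AI: {ai}")
        lines.append(f"Human: {user_input}")
        lines.append("AI:")
        return "\n".join(lines)

    def predict(self, user_input: str) -> str:
        """Call the model with the full transcript, then remember the turn."""
        reply = self.llm(self.build_prompt(user_input))
        self.history.append((user_input, reply))
        return reply


def counting_llm(prompt: str) -> str:
    """Stand-in model: reports how many human turns it can see."""
    return f"I can see {prompt.count('Human:')} human turn(s)."


sketch = BufferedConversation(counting_llm)
first = sketch.predict("Hi there!")
second = sketch.predict("Let's talk about AI.")
print(first)
print(second)
```

The model itself is stateless here; the growing prompt is what carries the context, which is also why long conversations eventually need summarising or truncating.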
- Create an LLMChain for combining components:
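# Example template (an assumption; any template with a {question} variable works)
prompt = PromptTemplate(template="Question: {question}\nAnswer:", input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=llm)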
question = "Can Barack Obama have a conversation with George Washington?"
print(llm_chain.run(question))
- Use agents with tools and LLM for decision-making:
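llm = OpenAI(temperature=0)
tools = load_tools(["wikipedia", "llm-math"], llm=llm)  # the llm-math tool needs an LLM
# The agent type below is one common choice, assumed here for illustration
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)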
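The decision-making step of an agent can be sketched as a loop that picks a tool by name and then calls it. In the plain-Python sketch below, a hard-coded rule (`choose_tool`) stands in for the LLM's decision, and `calculator` and `lookup` are hypothetical stand-ins for the maths and Wikipedia tools. None of these names come from LangChain.

```python
import ast
import operator
from typing import Callable, Dict

# Arithmetic operators the toy calculator is allowed to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression: str) -> str:
    """Evaluate +, -, *, / over numbers without using eval()."""

    def evaluate(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("unsupported expression")

    return str(evaluate(ast.parse(expression, mode="eval").body))


def lookup(topic: str) -> str:
    """Stub for a reference lookup; a real tool would query an external source."""
    return f"(stub) encyclopedia entry for '{topic}'"


TOOLS: Dict[str, Callable[[str], str]] = {"calculator": calculator, "lookup": lookup}


def choose_tool(user_input: str) -> str:
    """Stand-in for the LLM's decision: route inputs with digits to the calculator."""
    return "calculator" if any(ch.isdigit() for ch in user_input) else "lookup"


def run_agent(user_input: str) -> str:
    """Pick a tool for the input, call it, and label the answer with the tool name."""
    tool_name = choose_tool(user_input)
    return f"[{tool_name}] {TOOLS[tool_name](user_input)}"


print(run_agent("2 * (3 + 4)"))
print(run_agent("LangChain"))
```

In a real agent the routing rule is replaced by a model that reads the tool descriptions and the user input, and the loop may run several times before a final answer is produced.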
Wrapping up
LangChain offers access to state-of-the-art large language models (LLMs) from various providers such as OpenAI, Hugging Face, Cohere, AI21 Labs, and more. These models are accessed through API calls using platform-specific API tokens, allowing users to leverage their advanced capabilities in language processing and understanding. In this article, we have used the Hugging Face library.
From translation and chat-based interactions to text embeddings and document retrieval, the library offers a wide range of functionalities. With Hugging Face Hub, users can access pre-trained models and leverage their capabilities for various applications. LangChain, with its intuitive interfaces like PromptTemplate and ChatPromptTemplate, simplifies prompt engineering and management.
The integration of document loaders, text splitters, vector stores, and retrievers enhances the processing and analysis of textual data. Additionally, the memory module and conversational chains enable the creation of more interactive and context-aware applications. With LangChain, agents can be designed to make informed decisions by combining tools and language models. Overall, Hugging Face and LangChain provide a comprehensive ecosystem for NLP tasks, offering flexibility and efficiency in building language-driven applications.