What You’ll Learn

What You’ll Need

Originally developed for internal Cisco engineering use, pyATS is at the core of the Cisco test automation solution. Cisco pyATS was made available, for free, to the general public in 2017.

It is currently used:

In this tutorial, we will use a pyATS job to gather the Cisco IOS XE routing table and save it as a JSON file. This file will be the data source for the retrieval-augmented generation (RAG) LangChain.

LangChain is not just a tool for integrating language models; it’s a comprehensive framework that allows developers to build complex language-based applications. It supports various functionalities such as chaining different language models, customizing language model behaviors, and integrating external knowledge sources. In our project, LangChain plays a pivotal role. It’s not only used for integrating ChatGPT-4 but also for orchestrating the flow of information between the user interface (Streamlit), the language model, and external data sources like Cisco pyATS and ChromaDB. This orchestration is crucial for creating a seamless and intelligent user experience.

Please visit https://www.langchain.com/, and specifically https://python.langchain.com/docs/modules/data_connection/, for more information about using LangChain for RAG.

RAG is an innovative technique in the field of natural language processing. It combines the strengths of two artificial intelligence (AI) paradigms: neural retrieval (finding relevant documents or data) and neural sequence generation (creating coherent and contextually relevant text based on the retrieved data).

In the context of our Streamlit app, RAG can dramatically enhance the capabilities of ChatGPT-4. For example, when a user asks a technical question, RAG can retrieve technical documentation or relevant data from ChromaDB, which ChatGPT-4 then uses to craft a detailed, accurate response. This makes the app not just a conversational agent but also a powerful tool for information retrieval and knowledge dissemination.
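
Conceptually, RAG is a two-step loop: retrieve the chunks most relevant to the question, then generate an answer grounded in them. The following is only a conceptual sketch with a hypothetical helper name, assuming a LangChain-style vector store and chat model; it is not the application’s actual code, which we build later in this tutorial.

# Conceptual RAG sketch (hypothetical helper; assumes a LangChain-style
# vector store and chat model, not this app's actual code)
def answer_with_rag(question, vector_store, llm):
    # 1. Neural retrieval: find the stored chunks most similar to the question
    chunks = vector_store.similarity_search(question, k=5)
    context = "\n".join(doc.page_content for doc in chunks)

    # 2. Neural generation: ask the LLM to answer using only the retrieved context
    prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
    return llm.invoke(prompt)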

The following image encapsulates the RAG approach with LangChain:

Retrieval Augmented Generation

ChatGPT represents a significant advancement in language models. Unlike traditional models that simply generate text based on the input they receive, ChatGPT is designed to understand and maintain context over a conversation. This ability makes it ideal for applications requiring interactive and engaging dialogues.

Within our Streamlit app, ChatGPT-4 is the core engine driving the conversational interface. Its advanced capabilities allow it to handle complex queries, engage in meaningful dialogues, and provide responses that are contextually relevant, informative, and engaging. This transforms our app from a simple query-response system into an interactive conversational platform.

Streamlit stands out in the realm of web application frameworks for its simplicity and focus on data science and machine learning projects. It allows developers to create beautiful, interactive web apps with minimal effort, using straightforward Python scripts. Streamlit apps are inherently interactive, with widgets for user input and the ability to refresh data and views in real-time.

For our application, Streamlit is the gateway through which users interact with the powerful back end (LangChain, ChatGPT-4, and Cisco pyATS). It’s not just a user interface; it’s an interactive platform that allows users to input their queries, interact with the responses, and even visualize data and results in a user-friendly manner.

Please visit https://streamlit.io for more information.
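
To give a sense of how little code a Streamlit app requires, here is a minimal, hedged example; the file name hello.py is just an illustration. Save it and launch it with streamlit run hello.py:

# Minimal Streamlit sketch: a title, a text input, and a button
import streamlit as st

st.title("Hello, Streamlit")
name = st.text_input("What is your name?")
if st.button("Greet"):
    st.write(f"Hello, {name}!")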

Docker containers are a cornerstone of modern software deployment strategies. They encapsulate an application and its environment, ensuring that it works uniformly across different computing environments. This encapsulation includes the application’s code, runtime environment, system tools, libraries, and settings. Containers are isolated but can communicate with each other through well-defined channels.

In the deployment of our Streamlit app, Docker containers provide several benefits. They ensure that the app runs consistently regardless of where it is deployed, be it a local machine, a test environment, or the cloud. This is particularly important given the complexity of our app, which involves multiple components like LangChain, ChatGPT-4, and Cisco pyATS. Docker containers simplify the process of managing these dependencies, making deployment, scaling, and testing more straightforward and reliable.

Create a parent folder called streamlit_langchain_pyats.

Inside this folder, create three subfolders: docker, scripts, and streamlit_langchain_pyats.

In the root, we will also have a docker-compose.yaml file.

First, let’s set up our docker-compose.yaml file. This file will be used to start the application, using the docker compose up command. The application can be stopped by using the docker compose down command.

---

version: '3'

services:
  streamlit_langchain_pyats:
    image: ciscou/streamlit_langchain_pyats:streamlit_langchain_pyats
    container_name: streamlit_langchain_pyats
    restart: always
    build:
      context: ./
      dockerfile: ./docker/Dockerfile
    ports:
      - "8501:8501"

This will name the container and use the Dockerfile, which we will build next, to assemble the container. We are exposing port 8501 in order to reach the Streamlit application after the container starts via localhost:8501.

Next, inside the docker subfolder, create a Dockerfile.

This Dockerfile will provide the base Ubuntu Linux environment, along with Python, pip, and all the required pyATS, LangChain, RAG, and Streamlit packages. It will run a Bourne Again Shell (Bash) script to start the Streamlit application.

FROM ubuntu:latest

ARG DEBIAN_FRONTEND=noninteractive

RUN echo "==> Upgrading apk and installing system utilities ...." \
 && apt -y update \
 && apt-get install -y wget \
 && apt-get -y install sudo \
 && sudo apt-get update -y

RUN echo "==> Installing Python3 and pip ...." \
 && apt-get install python3 -y \
 && apt install python3-pip -y \
 && apt install openssh-client -y

RUN echo "==> Adding pyATS ..." \
 && pip install pyats[full]

RUN echo "==> Install dos2unix..." \
  && sudo apt-get install dos2unix -y

RUN echo "==> Install langchain requirements.." \
  && pip install -U langchain langchain-openai langchain-community \
  && pip install chromadb \
  && pip install openai \
  && pip install tiktoken

RUN echo "==> Install jq.." \
  && pip install jq

RUN echo "==> Install streamlit.." \
  && pip install streamlit

COPY /streamlit_langchain_pyats /streamlit_langchain_pyats/
COPY /scripts /scripts/

RUN echo "==> Convert script..." \
  && dos2unix /scripts/startup.sh

CMD ["/bin/bash", "/scripts/startup.sh"]

Finally, we will create the startup.sh Bash script inside the scripts folder as follows:

cd streamlit_langchain_pyats
streamlit run chat_with_routing_table.py

pyATS uses the concept of a “testbed” to describe devices, including their connectivity requirements. Testbeds also include some additional information such as the platform’s operating system. Create a Secure Shell (SSH)-based pyATS testbed for the Cisco DevNet Always-On IOS XE Sandbox. Create the following YAML file, named testbed.yaml, inside the streamlit_langchain_pyats folder:

---

devices:
  Cat8000V:
    alias: "Sandbox Router"
    type: "router"
    os: "iosxe"
    platform: Cat8000V
    credentials:
      default:
        username: admin
        password: C1sco12345
    connections:
      cli:
        protocol: ssh
        ip: sandbox-iosxe-latest-1.cisco.com
        port: 22
        arguments:
          connection_timeout: 360

pyATS has a validation command that you can use to validate the structure and syntax of the testbed file.

Validate your testbed file:

(ciscou) $ pyats validate testbed --testbed-file testbed.yaml

Review and correct any Lint messages found by pyATS validation before moving on to the next step.
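
If you prefer to sanity-check the testbed from Python rather than the CLI, the following optional sketch (not required by the tutorial) loads the testbed, connects to the sandbox router, and parses a command:

# Optional sanity check (not required by the tutorial)
from genie.testbed import load

testbed = load("testbed.yaml")
device = testbed.devices["Cat8000V"]
device.connect()                     # SSH to the DevNet Always-On sandbox
print(device.parse("show version"))  # structured, JSON-like output
device.disconnect()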

We can do a lot of things with pyATS, including tests against the JSON keys and values. pyATS jobs are Python 3 scripts that use the testbed file and can be used for testing, documentation, snapshots, differentials, and even configuration management. For our purposes in this application, the pyATS job will run as a prestep to gather the routing table and save it as a JSON file.

pyATS jobs are made up of three files: the testbed, a job file, and the Python 3 script. The job file is a control file that loads the testbed file and the Python logic file and utilizes the pyATS framework. In our example, our script will be called show_ip_route_langchain.py.

Inside the streamlit_langchain_pyats folder, create the following pyATS job file, called show_ip_route_langchain_job.py:

import os
from genie.testbed import load

def main(runtime):

    # ----------------
    # Load the testbed
    # ----------------
    if not runtime.testbed:
        # If no testbed is provided, load the default one.
        # Load default location of Testbed
        testbedfile = os.path.join('testbed.yaml')
        testbed = load(testbedfile)
    else:
        # Use the one provided
        testbed = runtime.testbed

    # Find the location of the script in relation to the job file
    testscript = os.path.join(os.path.dirname(__file__), 'show_ip_route_langchain.py')

    # run script
    runtime.tasks.run(testscript=testscript, testbed=testbed)

pyATS test scripts have a universal approach that can be used to help users develop their test scripts:

pyATS Search IOS-XE Parsers

Let’s start our pyATS test script with the common setup and common cleanup sections first and then write the test to capture and save the show ip route command output as JSON. We will be using pyATS AEtest, the core testing harness, along with the banner utility for pretty log output.

Create the show_ip_route_langchain.py file as follows:

import os
import json
import logging
from pyats import aetest
from pyats.log.utils import banner

# ----------------
# Get logger for script
# ----------------

log = logging.getLogger(__name__)

# ----------------
# AE Test Setup
# ----------------
class common_setup(aetest.CommonSetup):
    """Common Setup section"""
# ----------------
# Connected to devices
# ----------------
    @aetest.subsection
    def connect_to_devices(self, testbed):
        """Connect to all the devices"""
        testbed.connect()
# ----------------
# Mark the loop for Learn Interfaces
# ----------------
    @aetest.subsection
    def loop_mark(self, testbed):
        aetest.loop.mark(Show_IP_Route_Langchain, device_name=testbed.devices)

# ----------------
# Test Case #1
# ----------------
class Show_IP_Route_Langchain(aetest.Testcase):
    """pyATS Get and Save Show IP Route"""

    @aetest.test
    def setup(self, testbed, device_name):
        """ Testcase Setup section """
        # Set current device in loop as self.device
        self.device = testbed.devices[device_name]

    @aetest.test
    def get_raw_config(self):
        raw_json = self.device.parse("show ip route")
        ## Add a parent key for jquery inside the JSON loader step
        self.parsed_json = {"info": raw_json}

    @aetest.test
    def create_file(self):
        with open('Show_IP_Route.json', 'w') as f:
            f.write(json.dumps(self.parsed_json, indent=4, sort_keys=True))

class CommonCleanup(aetest.CommonCleanup):
    @aetest.subsection
    def disconnect_from_devices(self, testbed):
        testbed.disconnect()

# for running as its own executable
if __name__ == '__main__':
    aetest.main()

A few notes about the script: We are using pyATS’s ability to .parse the raw routing table output into structured JSON. We then wrap the result in a parent key, info, because LangChain’s JSONLoader requires a jq schema to select the content. You will see this info key again during the LangChain step.
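
As a quick, optional illustration of why that parent key matters, here is a hedged sketch using the jq Python package we installed in the Dockerfile. The sample data is hypothetical and heavily trimmed, but the .info[] expression is the same one the JSONLoader will use as its jq_schema:

# Optional illustration; the sample data is hypothetical and trimmed
import jq

parsed_json = {"info": {"vrf": {"default": {"address_family": {"ipv4": {"routes": {}}}}}}}

# ".info[]" iterates the values under the "info" parent key, which is exactly
# what JSONLoader does with its jq_schema argument
documents = jq.compile(".info[]").input(parsed_json).all()
print(documents)  # each element becomes a document for the loader to chunk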

You will need an OpenAI account and application programming interface (API) key to proceed. There is a free tier, which this guide can use, based on the GPT-3.5 Turbo model. For the more advanced ChatGPT-4 or ChatGPT-4 Turbo models, you’ll need a ChatGPT Plus account (monthly subscription fees).

Visit https://platform.openai.com to create an account. Once you have an account, visit your profile, then APIs, and generate an API key. Keep this API key secret and safe at all times.


We will use environment variables and a .env file to protect our OpenAI API key.

Now that we have a pyATS job that will ultimately gather the routing table and create a JSON file of the parsed data, wrapped inside a parent key for querying, we can proceed with our LangChain.

Before we start, inside the streamlit_langchain_pyats folder along with the pyATS job file, script, and testbed, create a .env file. Inside this file, create a key to store your OpenAI API key.

OPENAI_API_KEY="<your OpenAI API key>"

First, we need to import our various packages:

import os
import streamlit as st
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_community.document_loaders import JSONLoader
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
from langchain.text_splitter import RecursiveCharacterTextSplitter

As you can see, we are going to import os and dotenv to secure our OpenAI API key, Streamlit for the application prototype web GUI, and most importantly, a variety of LangChain imports.

We will use langchain_openai to provide API access to the ChatOpenAI API as well as the OpenAIEmbeddings API to create our vectors (list of floating point numbers, a vital part of our RAG approach).
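
To make the idea of an embedding concrete, here is a minimal, hedged sketch; it assumes a valid OPENAI_API_KEY is already loaded into the environment and uses OpenAI’s default embedding model:

# Minimal sketch: an embedding is just a long list of floating point numbers
# (assumes OPENAI_API_KEY is already set in the environment)
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
vector = embeddings.embed_query("S* 0.0.0.0/0 [1/0] via 10.10.20.254")
print(len(vector), vector[:5])  # vector length and its first few floats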

Also, from the LangChain community packages, we will use our vector store (ChromaDB, a local, private, in-memory vector store or database) as well as our document loader (in this case, JSONLoader, used to load the pyATS-parsed JSON payload of the show ip route command).

Additionally, from LangChain, we will import our memory (ConversationBufferMemory), which enables the LangChain and large language model (LLM), ChatGPT, to maintain history and awareness of the previous questions and answers as if it were humanlike in that regard.

The ConversationalRetrievalChain, from LangChain, provides the linkage between our LLM and our vector store, allowing our questions to retrieve relevant vectors and chunks of data in order to augment the generation of the response.

Now we can write our LangChain, followed by the Streamlit conversational user interface.

We will first perform some preprocessing and load our environment variables and get the application ready to start the LangChain and conversational user interface.

# Function to run pyATS job
def run_pyats_job():
    os.system("pyats run job show_ip_route_langchain_job.py")

# Use Streamlit's caching to run the job only once
if 'job_done' not in st.session_state:
    st.session_state['job_done'] = st.cache_resource(run_pyats_job)()

# Instantiate openAI client
load_dotenv()

openai_api_key = os.getenv('OPENAI_API_KEY')

llm = ChatOpenAI(temperature=0, model="gpt-4-1106-preview")

Here, we are running our pyATS job, using the os.system command in Python, which saves a local file in JSON format that our LangChain can utilize. We are also loading our OpenAI API key and setting our model. For free usage, or if you are not a paid ChatGPT Plus user, change the model from gpt-4-1106-preview (also known as ChatGPT-4 Turbo) to gpt-3.5-turbo.

llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo")

The temperature setting indicates how “creative” or “flexible” the LLM’s answer generation is. You can experiment with this setting; 0.7 is a common middle ground between creative and factual output. However, in this application we are referencing, via RAG, the exact JSON values from the routing table, so the “least creative” setting of 0.0 yields the most factual results with the fewest hallucinations.

Now that we have set up our Python environment and have all the required libraries, we will make a new class in Python. A Python class is an object-oriented structure that we can instantiate and has internal methods that we can call. This class will represent our LangChain, which we will instantiate and call from our Streamlit application.

class ChatWithRoutingTable:
    def __init__(self):
        self.conversation_history = []
        self.load_text()
        self.split_into_chunks()
        self.store_in_chroma()
        self.setup_conversation_memory()
        self.setup_conversation_retrieval_chain()

As you can see in the class definition, we are going to run a series of functions when this class is initialized. These functions map directly to the LangChain RAG diagram discussed earlier.

First, we will load, using the appropriate JSONLoader, the pyATS-parsed routing table from the show ip route command:

    def load_text(self):
        self.loader = JSONLoader(
            file_path='Show_IP_Route.json',
            jq_schema=".info[]",
            text_content=False
        )
        self.pages = self.loader.load_and_split()

We will use the loader’s built-in “load and split” function to break the JSON down into high-level pages, or chunks, of data. But we want to break those pages down further into even smaller chunks, with a slight overlap between them, before storing them in our vector store (ChromaDB).


    def split_into_chunks(self):
        # Create a text splitter
        self.text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1000,
            chunk_overlap=100,
            length_function=len,
        )
        self.docs = self.text_splitter.split_documents(self.pages)

Now that we have a text splitter set up for 1,000 characters with a 100-character overlap, we can move on to creating our vectors (lists of floating point numbers) and storing them in our ChromaDB vector store:

    def store_in_chroma(self):
        embeddings = OpenAIEmbeddings()
        self.vectordb = Chroma.from_documents(self.docs, embedding=embeddings)
        self.vectordb.persist()

As of this step, the self.vectordb variable is storing our chunks of JSON data as well as the vectors and floating point number references to these chunks of data. It is ready to be referenced in the retrieval phase of the LangChain.
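
If you want to peek at what retrieval will return before wiring up the full chain, you could add a small helper method such as the following to the class. This is an optional debugging sketch with a hypothetical method name, not part of the tutorial’s final script:

    # Optional debugging sketch (hypothetical helper, not part of the final script):
    # query the vector store directly to see which routing-table chunks
    # a question would retrieve
    def preview_retrieval(self, question, k=3):
        for doc in self.vectordb.similarity_search(question, k=k):
            print(doc.page_content[:200])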

Next, we will set up our memory:

    def setup_conversation_memory(self):
        self.memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

Then, we will set up our ConversationalRetrievalChain, which links the ChatGPT LLM to our vector store:

    def setup_conversation_retrieval_chain(self):
        self.qa = ConversationalRetrievalChain.from_llm(llm, self.vectordb.as_retriever(search_kwargs={"k": 10}), memory=self.memory)

Now, in our Streamlit app, we can invoke self.qa to answer the user input question!

Our last two functions are more related to presenting a conversation and retaining and using the full history inside the Streamlit app. Our LangChain is complete, so we now just need to handle the user input and conversational history:

    def chat(self, question):
        # Format the user's prompt and add it to the conversation history
        user_prompt = f"User: {question}"
        self.conversation_history.append({"text": user_prompt, "sender": "user"})

        # Format the entire conversation history for context, excluding the current prompt
        conversation_context = self.format_conversation_history(include_current=False)

        # Concatenate the current question with conversation context
        combined_input = f"Context: {conversation_context}\nQuestion: {question}"

        # Generate a response using the ConversationalRetrievalChain
        response = self.qa.invoke(combined_input)

        # Extract the answer from the response
        answer = response.get('answer', 'No answer found.')

        # Format the AI's response
        ai_response = f"Cisco IOS XE: {answer}"
        self.conversation_history.append({"text": ai_response, "sender": "bot"})

        # Update the Streamlit session state by appending new history with both user prompt and AI response
        st.session_state['conversation_history'] += f"\n{user_prompt}\n{ai_response}"

        # Return the formatted AI response for immediate display
        return ai_response


    def format_conversation_history(self, include_current=True):
        formatted_history = ""
        history_to_format = self.conversation_history[:-1] if not include_current else self.conversation_history
        for msg in history_to_format:
            speaker = "You: " if msg["sender"] == "user" else "Bot: "
            formatted_history += f"{speaker}{msg['text']}\n"
        return formatted_history

Now that we have set up our class and are ready to initialize it and invoke the various RAG functions, we can move on to our Streamlit front-end integration.

# Create an instance of your class
chat_instance = ChatWithRoutingTable()

# Streamlit UI for chat
st.title("Chat with Cisco IOS XE Routing Table")

# Initialize conversation history in session state if not present
if 'conversation_history' not in st.session_state:
    st.session_state['conversation_history'] = ""

user_input = st.text_input("Ask a question about the routing table:", key="user_input")

if st.button("Ask"):
    with st.spinner('Processing...'):
        # Call the chat method and get the AI's response
        ai_response = chat_instance.chat(user_input)
        # Display the conversation history
        st.text_area("Conversation History:", value=st.session_state['conversation_history'], height=300, key="conversation_history_display")

In this Streamlit integration, we are first instantiating our ChatWithRoutingTable class and then providing the front-end interface into our LangChain.

Now users can browse to http://localhost:8501 after bringing up our Docker container!

Here is what our complete chat_with_routing_table.py file should look like:

import os
import streamlit as st
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_community.document_loaders import JSONLoader
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Function to run pyATS job
def run_pyats_job():
    os.system("pyats run job show_ip_route_langchain_job.py")

# Use Streamlit's caching to run the job only once
if 'job_done' not in st.session_state:
    st.session_state['job_done'] = st.cache_resource(run_pyats_job)()

# Instantiate openAI client
load_dotenv()

openai_api_key = os.getenv('OPENAI_API_KEY')

llm = ChatOpenAI(temperature=0, model="gpt-4-1106-preview")

class ChatWithRoutingTable:
    def __init__(self):
        self.conversation_history = []
        self.load_text()
        self.split_into_chunks()
        self.store_in_chroma()
        self.setup_conversation_memory()
        self.setup_conversation_retrieval_chain()

    def load_text(self):
        self.loader = JSONLoader(
            file_path='Show_IP_Route.json',
            jq_schema=".info[]",
            text_content=False
        )
        self.pages = self.loader.load_and_split()

    def split_into_chunks(self):
        # Create a text splitter
        self.text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1000,
            chunk_overlap=100,
            length_function=len,
        )
        self.docs = self.text_splitter.split_documents(self.pages)

    def store_in_chroma(self):
        embeddings = OpenAIEmbeddings()
        self.vectordb = Chroma.from_documents(self.docs, embedding=embeddings)
        self.vectordb.persist()

    def setup_conversation_memory(self):
        self.memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

    def setup_conversation_retrieval_chain(self):
        self.qa = ConversationalRetrievalChain.from_llm(llm, self.vectordb.as_retriever(search_kwargs={"k": 10}), memory=self.memory)

    def chat(self, question):
        # Format the user's prompt and add it to the conversation history
        user_prompt = f"User: {question}"
        self.conversation_history.append({"text": user_prompt, "sender": "user"})

        # Format the entire conversation history for context, excluding the current prompt
        conversation_context = self.format_conversation_history(include_current=False)

        # Concatenate the current question with conversation context
        combined_input = f"Context: {conversation_context}\nQuestion: {question}"

        # Generate a response using the ConversationalRetrievalChain
        response = self.qa.invoke(combined_input)

        # Extract the answer from the response
        answer = response.get('answer', 'No answer found.')

        # Format the AI's response
        ai_response = f"Cisco IOS XE: {answer}"
        self.conversation_history.append({"text": ai_response, "sender": "bot"})

        # Update the Streamlit session state by appending new history with both user prompt and AI response
        st.session_state['conversation_history'] += f"\n{user_prompt}\n{ai_response}"

        # Return the formatted AI response for immediate display
        return ai_response


    def format_conversation_history(self, include_current=True):
        formatted_history = ""
        history_to_format = self.conversation_history[:-1] if not include_current else self.conversation_history
        for msg in history_to_format:
            speaker = "You: " if msg["sender"] == "user" else "Bot: "
            formatted_history += f"{speaker}{msg['text']}\n"
        return formatted_history

# Create an instance of your class
chat_instance = ChatWithRoutingTable()

# Streamlit UI for chat
st.title("Chat with Cisco IOS XE Routing Table")

# Initialize conversation history in session state if not present
if 'conversation_history' not in st.session_state:
    st.session_state['conversation_history'] = ""

user_input = st.text_input("Ask a question about the routing table:", key="user_input")

if st.button("Ask"):
    with st.spinner('Processing...'):
        # Call the chat method and get the AI's response
        ai_response = chat_instance.chat(user_input)
        # Display the conversation history
        st.text_area("Conversation History:", value=st.session_state['conversation_history'], height=300, key="conversation_history_display")

Here is what our final folder and file structure should look like:

Final Files and Folder Structure

Please review all the contents of your files and make sure that the structure matches the image above before proceeding further.

Next, bring up your Docker container and your Streamlit app:

~/streamlit_langchain_pyats# docker-compose up
[+] Running 1/0
 ✔ Container streamlit_langchain_pyats  Created                                                                                                       0.0s
Attaching to streamlit_langchain_pyats
streamlit_langchain_pyats  |
streamlit_langchain_pyats  | Collecting usage statistics. To deactivate, set browser.gatherUsageStats to False.
streamlit_langchain_pyats  |
streamlit_langchain_pyats  |
streamlit_langchain_pyats  |   You can now view your Streamlit app in your browser.
streamlit_langchain_pyats  |
streamlit_langchain_pyats  |   Network URL: http://172.20.0.2:8501
streamlit_langchain_pyats  |   External URL: http://142.170.39.142:8501
streamlit_langchain_pyats  |

Now you can visit http://localhost:8501 to “chat” with your Cisco IOS XE routing table.

You can view the pyATS job output in the Streamlit user interface:

pyATS Learning via Streamlit

Or in your VS Code terminal console logs:

pyATS Learning via Streamlit

After the pyATS job and LangChain setup are complete, you can go ahead and start chatting with the routing table!

pyATS Learning via Streamlit

pyATS Learning via Streamlit

Congratulations! In this tutorial, you have created a LangChain that uses RAG over Cisco pyATS-parsed JSON. The chunks and their vectors are stored in ChromaDB, and ChatGPT then performs conversational retrieval against that vector store. All of this is wrapped inside a Streamlit application running in a Docker container.

As an action item takeaway, try other commands inside pyATS. Try different loaders, such as TextLoader, to chat with output like show running-config, show logging, or even show tech-support.
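
For example, a hedged sketch of the TextLoader variation might look like the following; the file name show_run.txt and the use of device.execute() to save raw (unparsed) CLI output are assumptions for illustration only:

# Hedged sketch of a TextLoader variation. Assumes the pyATS test case first
# saves raw CLI output, for example:
#   with open('show_run.txt', 'w') as f:
#       f.write(self.device.execute('show running-config'))
from langchain_community.document_loaders import TextLoader

loader = TextLoader("show_run.txt")  # hypothetical file name
pages = loader.load_and_split()      # then chunk, embed, and store as before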

Thank you for taking your first steps in AI with RAG!

Learn More