✨Magic Moments✨™️ With AI
AI (We Get It)
Like everyone else, I have been pretty inundated with the seemingly constant stream of AI content over the last few months. Obviously, generative AI has taken the main stage — in Google’s case, quite literally.
https://www.youtube.com/watch?v=-P-ein58laA
I was able to play with Copilot when it was in beta (Another GitHub Copilot Blog Post), as well as Amazon’s competitor, CodeWhisperer. There were definitely some nice moments where they generated some boilerplate for me, but that is about as far as it went. What I really wanted to do was create something I could actually use, something that would genuinely save me time — an actual product. So, I decided to try to solve an annoying problem I’ve had for a few years now, using my shiny new ChatGPT+ subscription (I know, I’m rolling my eyes at me, too).
Making Something I Would Pay For
This is pretty specific, but it’s a huge pet peeve of mine. Many, many times a day I go to the AWS SAML sign-in page. All I want is for my cursor to land in a text box at the top of the page so I can type the account I want to sign in to, hit Enter, and be signed in. Basically, something like this:
So I popped over to ChatGPT and asked it this:
I actually started by pasting in the HTML from the website to give it context, but I kept hitting the context length limit, so I figured I would just ask whether it had seen the page before. With a little back and forth, it came back with this:
(function() {
  const createFilterBox = () => {
    const filterBox = document.createElement('input');
    filterBox.setAttribute('type', 'text');
    filterBox.setAttribute('placeholder', 'Filter AWS accounts...');
    filterBox.style.marginBottom = '10px';
    filterBox.style.width = '100%';
    filterBox.addEventListener('input', filterAccounts);
    filterBox.addEventListener('keydown', handleEnterKey);
    return filterBox;
  };

  const filterAccounts = (event) => {
    const searchString = event.target.value.toLowerCase();
    const accountElements = document.querySelectorAll('.saml-account');
    let firstVisibleAccount = null;

    accountElements.forEach(account => {
      const accountName = account.innerText.toLowerCase();
      if (accountName.includes(searchString)) {
        account.style.display = '';
        if (!firstVisibleAccount) {
          firstVisibleAccount = account;
        }
      } else {
        account.style.display = 'none';
      }
    });

    if (firstVisibleAccount) {
      firstVisibleAccount.querySelector('input[type="radio"]').checked = true;
    }
  };

  const handleEnterKey = (event) => {
    if (event.key === 'Enter') {
      const signInButton = document.querySelector('#signin_button');
      if (signInButton) {
        signInButton.click();
      }
    }
  };

  const filterBox = createFilterBox();
  const container = document.querySelector('#container');
  if (container) {
    container.insertBefore(filterBox, container.firstChild);
  }
  filterBox.focus();
})();
Which I have to admit, basically worked first try:
This whole process, start to finish, probably took me about 15 minutes. Could I have written it myself? Probably. It wasn’t super complicated. But I didn’t have to. All I had to do was say “hey, make me this” and in two seconds it generated some code; I looked it over, set it up as a Chrome extension, and off I was. Notice how I said I looked it over. This is still incredibly important. Don’t go copy-pasta whatever our new AI Overlords spit out at us, or who knows what could happen next. Fortunately, I could tell that it was just searching for some CSS classes and changing the DOM, and I know it only executes on my client machine, so I feel pretty good.
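If you want to do the same, wiring a snippet like that up as a Chrome extension really only takes a minimal manifest that loads it as a content script. Here is a rough sketch, assuming Manifest V3, that the script above is saved as filter.js, and that your SAML landing page lives under signin.aws.amazon.com/saml (the file name and the match pattern are my assumptions; tweak them to fit your setup):

{
  "manifest_version": 3,
  "name": "AWS SAML Account Filter",
  "version": "1.0",
  "description": "Adds a filter box to the AWS SAML sign-in page",
  "content_scripts": [
    {
      "matches": ["https://signin.aws.amazon.com/saml*"],
      "js": ["filter.js"]
    }
  ]
}

Put both files in a folder, turn on Developer mode at chrome://extensions, click “Load unpacked”, and the script will run on every matching page load.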
Magic Moments
I have had a few ✨Magic Moments✨™️ with AI so far. One was the first time I used GitHub Copilot, another was the process I just described, using ChatGPT to build a Chrome extension I now use every day, and another has been playing with Langchain.
Langchain has a variety of features, but it really shines by letting you create what it calls “agents”:
Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end-to-end agents.
Agents can also be given tools to use, like a way to search the internet or a Python REPL. Here is an example of some Python code that uses an agent to generate cat jokes, along with the output it produced:
from dotenv import load_dotenv
from langchain.llms import OpenAI
from langchain.agents import initialize_agent, load_tools
from langchain.agents import AgentType


def main():
    llm = OpenAI(temperature=0)
    tools = load_tools(["python_repl"], llm=llm)
    agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
    agent.run("""
    Think of cat jokes and save them to a CSV files called 'catjokes.csv'.
    INDENT the code appropriately.
    """)


if __name__ == "__main__":
    load_dotenv()
    main()
> Entering new AgentExecutor chain...
I need to create a CSV file and write the jokes to it.
Action: Python REPL
Action Input:
import csv

with open('catjokes.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(['Q', 'A'])
    writer.writerow(['What do you call a pile of cats?', 'A meowntain'])
    writer.writerow(['What did the cat say when he lost all his money?', 'I'm paw!'])
    writer.writerow(['What did the cat say when he saw a bowl of milk?', 'Meeee-ow!'])
Observation: unterminated string literal (detected at line 8) (<string>, line 8)
Thought: I need to add quotation marks around the strings
Action: Python REPL
Action Input:
import csv

with open('catjokes.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(['Q', 'A'])
    writer.writerow(['What do you call a pile of cats?', 'A meowntain'])
    writer.writerow(['What did the cat say when he lost all his money?', 'I\'m paw!'])
    writer.writerow(['What did the cat say when he saw a bowl of milk?', 'Meeee-ow!'])
Observation:
Thought: I now know the final answer
Final Answer: The code to save cat jokes to a CSV file named 'catjokes.csv' is:

import csv

with open('catjokes.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(['Q', 'A'])
    writer.writerow(['What do you call a pile of cats?', 'A meowntain'])
    writer.writerow(['What did the cat say when he lost all his money?', 'I\'m paw!'])
    writer.writerow(['What did the cat say when he saw a bowl of milk?', 'Meeee-ow!'])

> Finished chain.
This did generate a CSV of cat jokes for me to read in my free time, but if you skim through the output above you might miss something incredible: it corrected itself.
Observation: unterminated string literal (detected at line 8) (<string>, line 8)
Thought: I need to add quotation marks around the strings
The agent created some Python code, ran the code, observed the error, and corrected the error. That was my magic moment.
Miracle Number Two
The last ✨Magic Moment✨™️ I’ve had is with https://github.com/oobabooga/text-generation-webui:
A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA.
Its goal is to become the AUTOMATIC1111/stable-diffusion-webui of text generation.
Langchain is compelling because you can create your own ChatGPT-Plugin-like experiences, but what if you didn’t have to use ChatGPT at all? Enter text-generation-webui, a project you can run locally to build a chat-based application on top of an LLM running on your own laptop. That removes the privacy concerns of shipping your data off to a third party, and I think it is where we are heading — at least for hackers. There has already been a ton of progress in this space, especially with Meta’s release of LLaMA.
Currently, the models you can install and run on a beefy MacBook Pro don’t hold a candle to GPT-4, but this is only the beginning.
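In the meantime, the two ideas compose nicely: Langchain does not particularly care whether the LLM behind it is the OpenAI API or something running on your own machine. As a minimal sketch (not the exact setup from this post; the model path is a placeholder for whatever llama.cpp-compatible weights you have downloaded, and it needs the llama-cpp-python package installed), you can swap the OpenAI llm from the agent example for a local one:

from langchain.llms import LlamaCpp

# Point Langchain at a local llama.cpp-compatible model instead of the OpenAI API.
# The model_path below is a placeholder; use whatever weights you have on disk.
# Requires the llama-cpp-python package to be installed.
llm = LlamaCpp(model_path="./models/llama-7b.ggmlv3.q4_0.bin", temperature=0)

print(llm("Tell me a cat joke."))

Everything stays on your laptop, which is exactly the appeal.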
I’m hoping by this time next year I’ll have three more ✨Magic Moments✨™️ and I’ll be parsing CSVs with my Neuralink in VR. Only time will tell!