AI (We Get It)

Like everyone else, I have been pretty inundated with the seemingly constant stream of AI content over the last few months. Obviously, generative AI has taken the main stage — in Google’s case, quite literally.

https://www.youtube.com/watch?v=-P-ein58laA

I was able to play with Copilot when it was in beta (see “Another GitHub Copilot Blog Post”), as well as Amazon’s competitor, CodeWhisperer. There were definitely some nice moments where they generated boilerplate for me, but that is about as far as it went. What I really wanted was to create something I could actually use that would save me time: an actual product. So I decided to try to solve an annoying problem I’ve had for a few years now, using my shiny new ChatGPT+ subscription (I know, I’m rolling my eyes at me, too).

Making Something I Would Pay For

This is pretty specific, but it’s a huge pet peeve of mine. Many, many times a day I go to the AWS SAML sign-in page. All I want is for my cursor to focus on a text box at the top of the page so I can type the account I want to sign in to, press Enter, and sign in. Basically, something like this:

[Image: the sign-in page with a filter text box at the top]

So I popped over to ChatGPT and asked it this:

[Image: screenshot of the ChatGPT prompt]

I actually started by pasting in the HTML from the page to give it context, but I kept hitting the context length limit, so I figured I would just ask whether it had seen the page before. With a little back and forth, it came back with this:

(function() {
  const createFilterBox = () => {
    const filterBox = document.createElement('input');
    filterBox.setAttribute('type', 'text');
    filterBox.setAttribute('placeholder', 'Filter AWS accounts...');
    filterBox.style.marginBottom = '10px';
    filterBox.style.width = '100%';
    filterBox.addEventListener('input', filterAccounts);
    filterBox.addEventListener('keydown', handleEnterKey);
    return filterBox;
  };

  const filterAccounts = (event) => {
    const searchString = event.target.value.toLowerCase();
    const accountElements = document.querySelectorAll('.saml-account');
    let firstVisibleAccount = null;
    
    accountElements.forEach(account => {
      const accountName = account.innerText.toLowerCase();
      if (accountName.includes(searchString)) {
        account.style.display = '';
        if (!firstVisibleAccount) {
          firstVisibleAccount = account;
        }
      } else {
        account.style.display = 'none';
      }
    });

    if (firstVisibleAccount) {
      firstVisibleAccount.querySelector('input[type="radio"]').checked = true;
    }
  };

  const handleEnterKey = (event) => {
    if (event.key === 'Enter') {
      const signInButton = document.querySelector('#signin_button');
      if (signInButton) {
        signInButton.click();
      }
    }
  };

  const filterBox = createFilterBox();
  const container = document.querySelector('#container');
  if (container) {
    container.insertBefore(filterBox, container.firstChild);
  }

  filterBox.focus();
})();

I have to admit, it basically worked on the first try:

[Image: Untitled design.gif, a recording of the filter box in use]

This whole process, start to finish, probably took me about 15 minutes. Could I have created it myself? Probably. It wasn’t super complicated. But I didn’t have to. All I had to do was say “hey, make me this” and in 2 seconds it generated some code. I looked it over, set it up as a Chrome extension, and off I went. Notice how I said I looked it over. This is still incredibly important. Don’t just copy-pasta whatever our new AI Overlords spit out at us, or who knows what could happen next. Fortunately, I could tell that it was just searching for some CSS classes and changing the DOM, and I know it only runs on my client machine, so I feel pretty good.
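For anyone curious about the “set it up as a Chrome extension” step, here is a rough sketch of the packaging, not necessarily how I did it. A Manifest V3 extension needs a manifest.json that tells Chrome which pages to inject a content script into. The extension name, the folder, the content.js filename, and the match pattern for the sign-in URL are all assumptions made for illustration.

```python
import json
from pathlib import Path


def build_manifest(script_name: str = "content.js") -> dict:
    """Return a minimal Manifest V3 definition with one content script."""
    return {
        "manifest_version": 3,
        "name": "AWS SAML Account Filter",  # hypothetical name
        "version": "0.1.0",
        "content_scripts": [
            {
                # Assumed URL pattern for the AWS SAML sign-in page
                "matches": ["https://signin.aws.amazon.com/saml*"],
                "js": [script_name],
            }
        ],
    }


def write_extension(folder: Path, script_source: str) -> Path:
    """Write content.js and manifest.json into folder; return the manifest path."""
    folder.mkdir(parents=True, exist_ok=True)
    (folder / "content.js").write_text(script_source, encoding="utf-8")
    manifest_path = folder / "manifest.json"
    manifest_path.write_text(json.dumps(build_manifest(), indent=2), encoding="utf-8")
    return manifest_path


if __name__ == "__main__":
    # Replace the placeholder string with the generated JavaScript
    write_extension(Path("aws-saml-filter"), "// paste the generated script here\n")
```

The resulting folder can then be loaded from chrome://extensions by turning on Developer mode and choosing “Load unpacked”.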

Magic Moments

I have had a few ✨Magic Moments✨™️ with AI so far. One was the first time I used GitHub Copilot, another was what I just described above: using ChatGPT to make my own Chrome extension that I use every day. A third has been playing with LangChain.

LangChain has a variety of features, but it really shines by letting you create what it calls “agents”:

Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end-to-end agents.
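To make that loop concrete, here is a minimal sketch of the decide, act, observe cycle in plain Python. It does not use LangChain at all: fake_llm is a hypothetical scripted stand-in for a real model, and calculator is a toy tool invented for this example. It only shows the shape of the loop, not how LangChain implements it.

```python
from typing import Callable, Dict, List, Tuple

# One step of history: (action, action_input, observation)
Step = Tuple[str, str, str]


def fake_llm(question: str, history: List[Step]) -> Tuple[str, str]:
    """Hypothetical stand-in for an LLM that picks the next action."""
    if not history:
        # Nothing observed yet: decide to use the calculator tool
        return ("calculator", "6 * 7")
    # One observation is enough here: finish with an answer built from it
    last_observation = history[-1][2]
    return ("finish", f"The answer is {last_observation}")


def calculator(expression: str) -> str:
    """Toy tool: multiplies two integers written as 'a * b'."""
    left, right = expression.split("*")
    return str(int(left) * int(right))


TOOLS: Dict[str, Callable[[str], str]] = {"calculator": calculator}


def run_agent(question: str, max_steps: int = 5) -> str:
    history: List[Step] = []
    for _ in range(max_steps):
        # Decide which action to take
        action, action_input = fake_llm(question, history)
        if action == "finish":
            return action_input
        # Take the action, then record what was observed
        observation = TOOLS[action](action_input)
        history.append((action, action_input, observation))
    return "Stopped: step limit reached"


if __name__ == "__main__":
    print(run_agent("What is 6 times 7?"))  # prints "The answer is 42"
```

The step limit is there because a real model can keep choosing actions forever; a cap like this is a common safety valve in agent loops.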

Agents can also be given tools to use, like a way to search the internet or a Python REPL. Here is some Python code that asks an agent to generate cat jokes, along with the output:

from dotenv import load_dotenv
from langchain.llms import OpenAI
from langchain.agents import initialize_agent, load_tools
from langchain.agents import AgentType

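def main():
    # temperature=0 keeps the model's output as deterministic as possible
    llm = OpenAI(temperature=0)
    # Give the agent a Python REPL so it can write and run code itself
    tools = load_tools(["python_repl"], llm=llm)
    agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
    agent.run("""

        Think of cat jokes and save them to a CSV file called 'catjokes.csv'.
        INDENT the code appropriately.

    """)

if __name__ == "__main__":
    # Load environment variables (such as the OpenAI API key) from a .env file
    load_dotenv()
    main()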