Integrating AI into Immersive Interactive Fiction

Part of a series on prototyping with generative AI

In a nondescript booth at the corner of California and Fullerton in Chicago, there’s a strange payphone that lets you talk to a robot from the future.

The Concept:

I was inspired by Danny DeRuntz’s post Crit-Bot Hotline, in which he uses AI in a call-in phone service to critique ideas. As soon as I experienced the service, I was reminded of a conversation with a buddy, Justin Brink, in which we discussed the idea of an alternate reality game (ARG) that strictly uses analog media. We had purchased a payphone to experiment with.

As soon as I dipped my toes into Google’s Dialogflow CX, I had an epiphany: this is exactly like building a game logic engine! Now I had to do something with this payphone idea.

I decided I would turn this payphone into a cell phone, make a piece of AI-enabled interactive fiction, program the phone to call into this AI game, then put the phone back into an empty payphone stall in Chicago.

I don’t want to spoil the fun for anyone who wants to try it, but I’ll reveal the basics of the experience: You have stumbled on a very special payphone. This phone is called TAC-2112, a “Temporal Assistance Communicator.” A cataclysmic event in the year 2112 has destroyed the ability of digital products to function. These future beings have, however, figured out how to use an analog payphone to communicate with the past as a call for help. You are tasked with helping this future.

It’s a twist on interactive fiction, and perhaps closer to immersive art because you are not asked to pretend to be someone else or pretend you are in another place.

If you live in Chicago, it’ll be more interesting to go in person. If you are elsewhere, you can play along by calling (833)-353-4824.

On the backend, I converted the payphone into a mobile phone with its own 3G connection and rechargeable batteries. When you put 25 cents in the phone and pick up the receiver, it dials a number connected to my Dialogflow phone service. There are two levels of AI. The first is Dialogflow itself, which is trained to recognize what it calls “intents.” These intents steer the participant through the game logic tree. The second is ChatGPT: at certain points in the experience, the system passes your response to it for a more immersive reply.

For me, this experiment’s value went beyond merely understanding these new tools; I also discovered the multifaceted potential of AI as a partner in coding, writing, and troubleshooting. Not only does this piece rely on AI to function, but I relied on AI to get it to work.

In the next parts of this article, I’ll get into the details of how I built the software and hardware components of the system.

TAC-2112 is currently in service.

The Software:

Prototype

I prototyped the game as a ChatGPT prompt that lays out the game and the rules in a single message, and then just played the game with ChatGPT in the browser. These single-prompt games are extremely fun to invent and play. Here’s the final prototype prompt I arrived at before transferring the gameplay into Dialogflow.

The Final Game

I was able to realize the whole project within Google Cloud, plus API access to OpenAI. As mentioned, the core engine is Dialogflow CX. It lets me map out the game states, set up the rules for moving between states, and call webhooks. Dialogflow also makes it easy to set up a phone number as a route into the experience, and it handles all of the speech-to-text and text-to-speech for you. This is what my final flow looks like:

Whenever I want to pass something the caller says to ChatGPT, I use a webhook. Inside Dialogflow it looks like this:

In this screen, the sys.no-match-default event handler is selected. It is triggered when the caller says something that does not match any expected intent. In my case, we have asked the caller to describe a dream of theirs. If they say something like “What is going on?” they are routed to a backstory. If they describe a dream, it should not match a predefined intent, so it gets routed to sys.no-match-default.

In the webhook field, I have selected “Dream Response.” This is a Google Cloud Function that is part of the same project as the Dialogflow agent. The function is written in Node.js and looks like this:
index.js:

const { Configuration, OpenAIApi } = require('openai');

const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY
});
const openai = new OpenAIApi(configuration);

exports.dialogflowCxWebhook = async (req, res) => {
  // Pull the caller's words from the request: "text" for typed input,
  // "transcript" for speech recognized on a phone call
  const msg2 = req.body.text || req.body.transcript;

  const messages = [
    {"role": "system", "content": "You are a telephone bot from the year 2112. You speak in an apocalyptic science fiction theme. You are in the middle of a conversation."},
    {"role": "user", "content": "In my dream, you'll find a link to a group called 'skimmers.' Act surprised that you found this connection. Take a word or phrase from my dream, and create a detailed connection to 'skimmers,' making up details if needed. Then, tell me about the skimmers. Skimmers can travel through time and are believed to be responsible for 'the great event' that destroys humanity. The dream is as follows: " + msg2}
  ];

  // Send the question to ChatGPT
  const response = await openai.createChatCompletion({
    model: "gpt-3.5-turbo",
    messages: messages,
    temperature: 0.8,
    max_tokens: 1500,
    // top_p: 1.0,
    // frequency_penalty: 0.0,
    // presence_penalty: 0.0,
  });

  // Extract the response from ChatGPT and send it back to Dialogflow CX
  const chatGPTResponse = response.data.choices[0].message.content;

  res.json({
    fulfillment_response: {
      messages: [{
        text: {
          text: [chatGPTResponse]
        },
      }],
    },
  });
};

package.json needs openai:

{
  "name": "chat-test",
  "version": "1.0.0",
  "description": "Your function description",
  "main": "index.js",
  "dependencies": {
    "openai": "^3.2.0"
  },
  "scripts": {
  },
  "author": "Your Name",
  "license": "ISC"
}

Your API key can be saved as a “Runtime environment variable” when you set up the cloud function.
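
If that variable is missing, the problem would likely only surface when a caller reaches the ChatGPT part of the game. A small guard like the one below could make the misconfiguration show up in the function logs right away instead. This is a minimal sketch: `requireEnv` is a hypothetical helper name, not part of the original function or of the openai package.

```javascript
// Hypothetical helper (not part of the original index.js): return a required
// environment variable, or throw a descriptive error if it is unset or blank.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Possible usage near the top of index.js:
// const configuration = new Configuration({ apiKey: requireEnv('OPENAI_API_KEY') });
```

The second parameter exists only so the helper can be exercised with a fake environment; in the cloud function it would default to `process.env`.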

In index.js, notice the system field:

{"role": "system", "content": "You are a telephone bot from the year 2112.
You speak in an apocalyptic science fiction theme. You are in the middle
of a conversation."},

This tells ChatGPT what role it is playing.

And the user field:

{"role": "user", "content": "In my dream, you'll find a link to a group called
'skimmers.' Act surprised that you found this connection. Take a word or
phrase from my dream, and create a detailed connection to 'skimmers,'
making up details if needed. Then, tell me about the skimmers. Skimmers can
travel through time and are believed to be responsible for 'the great event'
that destroys humanity. The dream is as follows: " + msg2}

This is how I steer the conversation the way I want. The caller never hears this content. What I find interesting is that my role shifts from author of the story to director: I set the direction for how the story unfolds, but ultimately the AI writes the narrative.
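
One way to keep that director role visible in the code is to separate the fixed direction from the caller’s words. The sketch below is an assumption about how the function could be organized, not how the original is written; `SYSTEM_PROMPT`, `buildMessages`, the abbreviated direction text, and the sample dream are all illustrative.

```javascript
// Illustrative refactor; the helper and constant names are hypothetical.
const SYSTEM_PROMPT =
  'You are a telephone bot from the year 2112. ' +
  'You speak in an apocalyptic science fiction theme. ' +
  'You are in the middle of a conversation.';

// Combine a scene's direction (written by the author, never heard by the
// caller) with whatever the caller actually said.
function buildMessages(direction, callerText) {
  return [
    { role: 'system', content: SYSTEM_PROMPT },
    { role: 'user', content: `${direction.trim()} ${callerText.trim()}` },
  ];
}

// Example: the dream scene, with the direction abbreviated for space.
const messages = buildMessages(
  "In my dream, you'll find a link to a group called 'skimmers.' The dream is as follows:",
  'I was flying over a frozen lake.'
);
console.log(messages[1].content);
```

With this shape, adding another ChatGPT-driven scene would mean writing a new direction string rather than a new function.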

That’s the basic plumbing of the system. It lets me use rough intent analysis for movement around the state machine, and when I want richer communication I can pass requests to ChatGPT.
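
Because those richer replies depend on an external API call made while someone is holding the receiver, it may be worth bounding how long the webhook waits. The following is a sketch under that assumption; `withTimeout`, the 4000 ms budget, and the fallback line are hypothetical and do not appear in the original function.

```javascript
// Hypothetical helper: resolve with the promise's value if it settles in time;
// on timeout or rejection, resolve with the fallback value instead.
function withTimeout(promise, ms, fallback) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  return Promise.race([promise.catch(() => fallback), timeout])
    .finally(() => clearTimeout(timer));
}

// Possible usage inside the webhook (the fallback wording is made up):
// const reply = await withTimeout(
//   openai.createChatCompletion({ model: 'gpt-3.5-turbo', messages })
//     .then((r) => r.data.choices[0].message.content),
//   4000,
//   'The signal is breaking up. Tell me again.'
// );
```

The caller then always hears something in character, even when the model is slow or the request fails.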

For more detailed Dialogflow setup info, see Crit-Bot Hotline.

The Hardware:

The materials used in this project are:

The Electronics:

The goal was to convert the payphone into a cell phone with its own signal and its own power. This way the phone can be dropped into any empty payphone stall easily. There are a number of LTE / 5G solutions for sending data that are commonly used for asset tracking. However, I wanted something that could easily be used for voice. The Adafruit FONA 3G was the best option I could find that could easily do voice and had built-in amplifiers for the speaker and microphone. The trick is that most carriers will not activate 3G phones anymore. Fortunately, US Mobile still allows this.

I didn’t want to keep opening up the payphone to replace batteries, so a low-power sleep state is important. The Seeeduino XIAO uses the Atmel SAMD21 microcontroller, which has an ultra-low-power sleep state that can be woken by hardware interrupts, and there is an existing library for it.

Wiring the FONA to the XIAO is pretty straightforward. I have the power on a switch, so when I plug USB into it I’m not competing with the battery power source. The batteries are going into the regulated input of the XIAO which is needed because their maximum voltage of 4.2V is higher than the maximum voltage the XIAO is rated for. I’m also wiring in the coin acceptor for the phone which will send a pulse to ground when the corresponding coin is inserted.

This is what the code looks like:

#include "Adafruit_FONA.h"
#include <EnergySaving.h>

#define FONA_RST 0
#define QUARTERPIN 8
#define DIMEPIN 9
#define NICKLEPIN 10
#define HANGUPPIN 5
#define DEBOUNCEDELAY 200
#define KEYPIN 4

#define NOSERIAL

HardwareSerial *fonaSerial = &Serial1;

Adafruit_FONA_3G fona = Adafruit_FONA_3G(FONA_RST);

EnergySaving nrgSave;
bool calling = false;
int moneyDeposited;

void setup() {
  pinMode(KEYPIN, OUTPUT);
  digitalWrite(KEYPIN, HIGH);

  #ifndef NOSERIAL
  Serial.begin(9600);            // start the port before using it
  while (!Serial.available());   // debug builds only: wait for a byte from the serial monitor
  Serial.print(F("Im ON"));
  #endif

  nrgSave.begin(WAKE_EXT_INTERRUPT, HANGUPPIN, dummy);  //standby setup for external interrupts,  this will also define it as input pullup
  nrgSave.begin(WAKE_EXT_INTERRUPT, QUARTERPIN, dummy); 
  nrgSave.begin(WAKE_EXT_INTERRUPT, DIMEPIN, dummy);  
  nrgSave.begin(WAKE_EXT_INTERRUPT, NICKLEPIN, dummy);  

  moneyDeposited=0;
}

void loop() {
  // read the state of the switch into a local variable:
  bool quarter = !digitalRead(QUARTERPIN);
  bool nickle = !digitalRead(NICKLEPIN);
  bool dime = !digitalRead(DIMEPIN);
  bool hungUp = digitalRead(HANGUPPIN);

  if(quarter==true){
    delay(DEBOUNCEDELAY);
    moneyDeposited+=25;
    #ifndef NOSERIAL
      Serial.print(F("money: "));
      Serial.println(moneyDeposited);
    #endif
  }

  if(nickle==true){
    delay(DEBOUNCEDELAY);
    moneyDeposited+=5;
    #ifndef NOSERIAL
      Serial.print(F("money: "));
      Serial.println(moneyDeposited);
    #endif
  }

  if(dime==true){
    delay(DEBOUNCEDELAY);
    moneyDeposited+=10;
    #ifndef NOSERIAL
      Serial.print(F("money: "));
      Serial.println(moneyDeposited);
    #endif
  }

  if (calling == false && moneyDeposited>=25 && hungUp==false) {
    #ifndef NOSERIAL
      Serial.println(F("calling!"));
    #endif
    callService();
    moneyDeposited=0;
  }

  if(hungUp==true && calling==true){
    #ifndef NOSERIAL
      Serial.println(F("hanging up!"));
    #endif
    hangUpService();
  }

}

void dummy(void)  //interrupt routine (it doesn't need to do anything; waking the MCU is enough)
{
  #ifndef NOSERIAL
  Serial.print(F("wokeup"));
  #endif
}

void callService(void){
  powerCycle();
  delay(500); 
  startUpFonaComms();
  delay(500); 

  while(!fona.callPhone("PUT THE SERVICE PHONE NUMBER HERE")){ 
    powerCycle();
    delay(500);  
    startUpFonaComms();
    delay(500);  
  }
  calling=true;
}

void hangUpService(void){
  uint16_t vbat;
  char message[100];

  fona.hangUp();
  calling=false;

  if (fona.getBattVoltage(&vbat)){
    if(vbat<4200){//replace this with 3600 if the texts get out of hand
      char batteryLife[6];  //up to 5 digits for a uint16_t plus the terminating null
      itoa(vbat,batteryLife,10);

      strcpy(message, "Battery is ");
      strcat(message, batteryLife);
      strcat(message, "mV");

      while(!fona.sendSMS("PUT YOUR PHONE NUMBER HERE", message)){
        powerCycle();
        delay(500);
        startUpFonaComms();
        delay(500);
      }
    }
  }

  if(!fona.turnOff()){
    powerCycle();
    delay(500); 
    startUpFonaComms();
    delay(500); 
  }
  #ifdef NOSERIAL
    nrgSave.standby();  //now mcu goes in standby mode
  #endif
}

void powerCycle(void){
  //digitalWrite(KEYPIN, HIGH);
  //delay(500);
  digitalWrite(KEYPIN, LOW);
  delay(3500);
  digitalWrite(KEYPIN, HIGH);
}

void startUpFonaComms(void){
  fonaSerial->begin(4800);
  if (!fona.begin(*fonaSerial)) {
    #ifndef NOSERIAL
    Serial.println(F("Couldn't find FONA"));
    #endif
    //while (1);
  }
  #ifndef NOSERIAL
  Serial.println(F("FONA is OK"));
  #endif
}

The basic idea is that the microcontroller goes into low-power sleep with hardware interrupts on the three coin acceptor pins and the hang-up switch. If the user inserts a coin or picks up the receiver, the system wakes up. If the receiver is off the hook and at least $0.25 has been deposited, it powers up the FONA and the call goes through. When they hang up, the phone texts me its battery voltage before going to sleep again, so I know when it’s time to replace the batteries. (And it’s fun to know when someone uses it.)

In sleep, the whole thing draws only 1.5 mA, so if the batteries’ mAh ratings are to be trusted (they shouldn’t be), that gives me 355 days of standby battery life.
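
The arithmetic behind that estimate is simple enough to sketch. The 1.5 mA draw and the 355-day figure come from the text above; the pack capacity is not stated, so the value below is back-calculated from those two numbers and should be read as an inference rather than a spec. The function name `standbyDays` is made up for this example.

```javascript
// Standby time for an idealized battery: capacity divided by a constant draw.
function standbyDays(capacityMah, sleepCurrentMa) {
  const hours = capacityMah / sleepCurrentMa;
  return hours / 24;
}

// Capacity implied by the article's numbers (an inference, not a stated spec):
// 355 days * 24 h/day * 1.5 mA = 12,780 mAh.
const impliedCapacityMah = 355 * 24 * 1.5;

console.log(impliedCapacityMah);                   // 12780
console.log(standbyDays(impliedCapacityMah, 1.5)); // 355
```

Real standby time will be shorter, since this ignores self-discharge, the energy spent on calls and texts, and optimistic capacity ratings.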

Tidied up and installed in the phone:

Once I had the code working, I put the mounting studs on the back and installed it into an open payphone stall with 1/4-20 bolts.

Conclusion

After having this installed for about a week, it’s been fun talking with folks who have tried it. Having read about how AI is a “bullshit generator,” I wanted to make an experience that really leans into that idea and plays with it. Although a lot of what I learned technically was about how to work with AI, I’m most excited about where others might take this in the world of interactive fiction. It feels like a new frontier in this space, and my little game is really just a demo of what could come. If you like to write interactive fiction, let me know if you are looking for a collaborator! I had a blast working on this and I hope you all enjoy it.
