Summarizing

Summarizing

Summarize text with a focus on specific topics.

Setup

import openai
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key  = os.getenv('OPENAI_API_KEY')
def get_completion(prompt, model="gpt-3.5-turbo"): # Andrew mentioned that the prompt/ completion paradigm is preferable for this class
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message["content"]

Text to summarize

prod_review = """
Got this panda plush toy for my daughter's birthday, \
who loves it and takes it everywhere. It's soft and \ 
super cute, and its face has a friendly look. It's \ 
a bit small for what I paid though. I think there \ 
might be other options that are bigger for the \ 
same price. It arrived a day earlier than expected, \ 
so I got to play with it myself before I gave it \ 
to her.
"""

Summarize with a word/sentence/character limit

prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site. 

Summarize the review below, delimited by triple 
backticks, in at most 30 words. 

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)
Soft and cute panda plush toy loved by daughter, but a bit small for the price. Arrived early.

Summarize with a focus on shipping and delivery

prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site to give feedback to the \
Shipping deparmtment. 

Summarize the review below, delimited by triple 
backticks, in at most 30 words, and focusing on any aspects \
that mention shipping and delivery of the product. 

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)
The panda plush toy arrived a day earlier than expected, but the customer felt it was a bit small for the price paid.

Summarize with a focus on price and value

prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site to give feedback to the \
pricing deparmtment, responsible for determining the \
price of the product.  

Summarize the review below, delimited by triple 
backticks, in at most 30 words, and focusing on any aspects \
that are relevant to the price and perceived value. 

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)
The panda plush toy is soft, cute, and loved by the recipient, but the price may be too high for its size.

Comment

  • Summaries include topics that are not related to the topic of focus.

Try “extract” instead of “summarize”

prompt = f"""
Your task is to extract relevant information from \ 
a product review from an ecommerce site to give \
feedback to the Shipping department. 

From the review below, delimited by triple quotes \
extract the information relevant to shipping and \ 
delivery. Limit to 30 words. 

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)
The product arrived a day earlier than expected.

Summarize multiple product reviews


review_1 = prod_review 

# review for a standing lamp
review_2 = """
Needed a nice lamp for my bedroom, and this one \
had additional storage and not too high of a price \
point. Got it fast - arrived in 2 days. The string \
to the lamp broke during the transit and the company \
happily sent over a new one. Came within a few days \
as well. It was easy to put together. Then I had a \
missing part, so I contacted their support and they \
very quickly got me the missing piece! Seems to me \
to be a great company that cares about their customers \
and products. 
"""

# review for an electric toothbrush
review_3 = """
My dental hygienist recommended an electric toothbrush, \
which is why I got this. The battery life seems to be \
pretty impressive so far. After initial charging and \
leaving the charger plugged in for the first week to \
condition the battery, I've unplugged the charger and \
been using it for twice daily brushing for the last \
3 weeks all on the same charge. But the toothbrush head \
is too small. I’ve seen baby toothbrushes bigger than \
this one. I wish the head was bigger with different \
length bristles to get between teeth better because \
this one doesn’t.  Overall if you can get this one \
around the $50 mark, it's a good deal. The manufactuer's \
replacements heads are pretty expensive, but you can \
get generic ones that're more reasonably priced. This \
toothbrush makes me feel like I've been to the dentist \
every day. My teeth feel sparkly clean! 
"""

# review for a blender
review_4 = """
So, they still had the 17 piece system on seasonal \
sale for around $49 in the month of November, about \
half off, but for some reason (call it price gouging) \
around the second week of December the prices all went \
up to about anywhere from between $70-$89 for the same \
system. And the 11 piece system went up around $10 or \
so in price also from the earlier sale price of $29. \
So it looks okay, but if you look at the base, the part \
where the blade locks into place doesn’t look as good \
as in previous editions from a few years ago, but I \
plan to be very gentle with it (example, I crush \
very hard items like beans, ice, rice, etc. in the \ 
blender first then pulverize them in the serving size \
I want in the blender then switch to the whipping \
blade for a finer flour, and use the cross cutting blade \
first when making smoothies, then use the flat blade \
if I need them finer/less pulpy). Special tip when making \
smoothies, finely cut and freeze the fruits and \
vegetables (if using spinach-lightly stew soften the \ 
spinach then freeze until ready for use-and if making \
sorbet, use a small to medium sized food processor) \ 
that you plan to use that way you can avoid adding so \
much ice if at all-when making your smoothie. \
After about a year, the motor was making a funny noise. \
I called customer service but the warranty expired \
already, so I had to buy another one. FYI: The overall \
quality has gone done in these types of products, so \
they are kind of counting on brand recognition and \
consumer loyalty to maintain sales. Got it in about \
two days.
"""

reviews = [review_1, review_2, review_3, review_4]
for i in range(len(reviews)):
    prompt = f"""
    Your task is to generate a short summary of a product \ 
    review from an ecommerce site. 

    Summarize the review below, delimited by triple \
    backticks in at most 20 words. 

    Review: ```{reviews[i]}```
    """

    response = get_completion(prompt)
    print(i, response, "\n")
0 Soft and cute panda plush toy loved by daughter, but a bit small for the price. Arrived early. 

1 Affordable lamp with storage, fast shipping, and excellent customer service. Easy to assemble and missing parts were quickly replaced. 

2 Good battery life, small toothbrush head, but effective cleaning. Good deal if bought around $50. 

3 The product was on sale for $49 in November, but the price increased to $70-$89 in December. The base doesn't look as good as previous editions, but the reviewer plans to be gentle with it. A special tip for making smoothies is to freeze the fruits and vegetables beforehand. The motor made a funny noise after a year, and the warranty had expired. Overall quality has decreased. 

Try experimenting on your own!

import json
with open('l4-summarizing.ipynb') as f:
    print(f.read())
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "87857393-6369-4b66-87c9-5f3253edf28e",
   "metadata": {},
   "source": [
    "# Summarizing\n",
    "In this lesson, you will summarize text with a focus on specific topics.\n",
    "\n",
    "## Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "8ac673e1",
   "metadata": {
    "height": 132
   },
   "outputs": [],
   "source": [
    "import openai\n",
    "import os\n",
    "\n",
    "from dotenv import load_dotenv, find_dotenv\n",
    "_ = load_dotenv(find_dotenv()) # read local .env file\n",
    "\n",
    "openai.api_key  = os.getenv('OPENAI_API_KEY')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "66de8ca6",
   "metadata": {
    "height": 166
   },
   "outputs": [],
   "source": [
    "def get_completion(prompt, model=\"gpt-3.5-turbo\"): # Andrew mentioned that the prompt/ completion paradigm is preferable for this class\n",
    "    messages = [{\"role\": \"user\", \"content\": prompt}]\n",
    "    response = openai.ChatCompletion.create(\n",
    "        model=model,\n",
    "        messages=messages,\n",
    "        temperature=0, # this is the degree of randomness of the model's output\n",
    "    )\n",
    "    return response.choices[0].message[\"content\"]\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "387b0686-bea6-41a2-b879-88721dc0ec10",
   "metadata": {},
   "source": [
    "## Text to summarize"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "0ce2cf3c",
   "metadata": {
    "height": 183
   },
   "outputs": [],
   "source": [
    "prod_review = \"\"\"\n",
    "Got this panda plush toy for my daughter's birthday, \\\n",
    "who loves it and takes it everywhere. It's soft and \\ \n",
    "super cute, and its face has a friendly look. It's \\ \n",
    "a bit small for what I paid though. I think there \\ \n",
    "might be other options that are bigger for the \\ \n",
    "same price. It arrived a day earlier than expected, \\ \n",
    "so I got to play with it myself before I gave it \\ \n",
    "to her.\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5d95eba0-7744-491a-a30a-8ee687303b7a",
   "metadata": {},
   "source": [
    "## Summarize with a word/sentence/character limit"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "0c3023c6",
   "metadata": {
    "height": 234
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Soft and cute panda plush toy loved by daughter, but a bit small for the price. Arrived early.\n"
     ]
    }
   ],
   "source": [
    "prompt = f\"\"\"\n",
    "Your task is to generate a short summary of a product \\\n",
    "review from an ecommerce site. \n",
    "\n",
    "Summarize the review below, delimited by triple \n",
    "backticks, in at most 30 words. \n",
    "\n",
    "Review: ```{prod_review}```\n",
    "\"\"\"\n",
    "\n",
    "response = get_completion(prompt)\n",
    "print(response)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "90832908-3b3a-459b-b595-bbe15c2a72fa",
   "metadata": {},
   "source": [
    "## Summarize with a focus on shipping and delivery"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "d850bdd2",
   "metadata": {
    "height": 268
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The panda plush toy arrived a day earlier than expected, but the customer felt it was a bit small for the price paid.\n"
     ]
    }
   ],
   "source": [
    "prompt = f\"\"\"\n",
    "Your task is to generate a short summary of a product \\\n",
    "review from an ecommerce site to give feedback to the \\\n",
    "Shipping deparmtment. \n",
    "\n",
    "Summarize the review below, delimited by triple \n",
    "backticks, in at most 30 words, and focusing on any aspects \\\n",
    "that mention shipping and delivery of the product. \n",
    "\n",
    "Review: ```{prod_review}```\n",
    "\"\"\"\n",
    "\n",
    "response = get_completion(prompt)\n",
    "print(response)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "01204385-1d27-420c-80ee-bd4b524550f6",
   "metadata": {},
   "source": [
    "## Summarize with a focus on price and value"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "6d865432",
   "metadata": {
    "height": 285
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The panda plush toy is soft, cute, and loved by the recipient, but the price may be too high for its size.\n"
     ]
    }
   ],
   "source": [
    "prompt = f\"\"\"\n",
    "Your task is to generate a short summary of a product \\\n",
    "review from an ecommerce site to give feedback to the \\\n",
    "pricing deparmtment, responsible for determining the \\\n",
    "price of the product.  \n",
    "\n",
    "Summarize the review below, delimited by triple \n",
    "backticks, in at most 30 words, and focusing on any aspects \\\n",
    "that are relevant to the price and perceived value. \n",
    "\n",
    "Review: ```{prod_review}```\n",
    "\"\"\"\n",
    "\n",
    "response = get_completion(prompt)\n",
    "print(response)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "21a561c4-d9a0-48a8-86c4-725746fb08df",
   "metadata": {},
   "source": [
    "#### Comment\n",
    "- Summaries include topics that are not related to the topic of focus."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9aff99cd-dc09-467c-bd09-897ffe06a232",
   "metadata": {},
   "source": [
    "## Try \"extract\" instead of \"summarize\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "190943b0",
   "metadata": {
    "height": 251
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The product arrived a day earlier than expected.\n"
     ]
    }
   ],
   "source": [
    "prompt = f\"\"\"\n",
    "Your task is to extract relevant information from \\ \n",
    "a product review from an ecommerce site to give \\\n",
    "feedback to the Shipping department. \n",
    "\n",
    "From the review below, delimited by triple quotes \\\n",
    "extract the information relevant to shipping and \\ \n",
    "delivery. Limit to 30 words. \n",
    "\n",
    "Review: ```{prod_review}```\n",
    "\"\"\"\n",
    "\n",
    "response = get_completion(prompt)\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f513da2e-f89c-4c91-8456-b79c630e70c9",
   "metadata": {},
   "source": [
    "## Summarize multiple product reviews"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "027822c2",
   "metadata": {
    "height": 1271
   },
   "outputs": [],
   "source": [
    "\n",
    "review_1 = prod_review \n",
    "\n",
    "# review for a standing lamp\n",
    "review_2 = \"\"\"\n",
    "Needed a nice lamp for my bedroom, and this one \\\n",
    "had additional storage and not too high of a price \\\n",
    "point. Got it fast - arrived in 2 days. The string \\\n",
    "to the lamp broke during the transit and the company \\\n",
    "happily sent over a new one. Came within a few days \\\n",
    "as well. It was easy to put together. Then I had a \\\n",
    "missing part, so I contacted their support and they \\\n",
    "very quickly got me the missing piece! Seems to me \\\n",
    "to be a great company that cares about their customers \\\n",
    "and products. \n",
    "\"\"\"\n",
    "\n",
    "# review for an electric toothbrush\n",
    "review_3 = \"\"\"\n",
    "My dental hygienist recommended an electric toothbrush, \\\n",
    "which is why I got this. The battery life seems to be \\\n",
    "pretty impressive so far. After initial charging and \\\n",
    "leaving the charger plugged in for the first week to \\\n",
    "condition the battery, I've unplugged the charger and \\\n",
    "been using it for twice daily brushing for the last \\\n",
    "3 weeks all on the same charge. But the toothbrush head \\\n",
    "is too small. I’ve seen baby toothbrushes bigger than \\\n",
    "this one. I wish the head was bigger with different \\\n",
    "length bristles to get between teeth better because \\\n",
    "this one doesn’t.  Overall if you can get this one \\\n",
    "around the $50 mark, it's a good deal. The manufactuer's \\\n",
    "replacements heads are pretty expensive, but you can \\\n",
    "get generic ones that're more reasonably priced. This \\\n",
    "toothbrush makes me feel like I've been to the dentist \\\n",
    "every day. My teeth feel sparkly clean! \n",
    "\"\"\"\n",
    "\n",
    "# review for a blender\n",
    "review_4 = \"\"\"\n",
    "So, they still had the 17 piece system on seasonal \\\n",
    "sale for around $49 in the month of November, about \\\n",
    "half off, but for some reason (call it price gouging) \\\n",
    "around the second week of December the prices all went \\\n",
    "up to about anywhere from between $70-$89 for the same \\\n",
    "system. And the 11 piece system went up around $10 or \\\n",
    "so in price also from the earlier sale price of $29. \\\n",
    "So it looks okay, but if you look at the base, the part \\\n",
    "where the blade locks into place doesn’t look as good \\\n",
    "as in previous editions from a few years ago, but I \\\n",
    "plan to be very gentle with it (example, I crush \\\n",
    "very hard items like beans, ice, rice, etc. in the \\ \n",
    "blender first then pulverize them in the serving size \\\n",
    "I want in the blender then switch to the whipping \\\n",
    "blade for a finer flour, and use the cross cutting blade \\\n",
    "first when making smoothies, then use the flat blade \\\n",
    "if I need them finer/less pulpy). Special tip when making \\\n",
    "smoothies, finely cut and freeze the fruits and \\\n",
    "vegetables (if using spinach-lightly stew soften the \\ \n",
    "spinach then freeze until ready for use-and if making \\\n",
    "sorbet, use a small to medium sized food processor) \\ \n",
    "that you plan to use that way you can avoid adding so \\\n",
    "much ice if at all-when making your smoothie. \\\n",
    "After about a year, the motor was making a funny noise. \\\n",
    "I called customer service but the warranty expired \\\n",
    "already, so I had to buy another one. FYI: The overall \\\n",
    "quality has gone done in these types of products, so \\\n",
    "they are kind of counting on brand recognition and \\\n",
    "consumer loyalty to maintain sales. Got it in about \\\n",
    "two days.\n",
    "\"\"\"\n",
    "\n",
    "reviews = [review_1, review_2, review_3, review_4]\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b7c39cc8",
   "metadata": {
    "height": 251
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 Soft and cute panda plush toy loved by daughter, but a bit small for the price. Arrived early. \n",
      "\n",
      "1 Affordable lamp with storage, fast shipping, and excellent customer service. Easy to assemble and missing parts were quickly replaced. \n",
      "\n",
      "2 Good battery life, small toothbrush head, but effective cleaning. Good deal if bought around $50. \n",
      "\n"
     ]
    }
   ],
   "source": [
    "for i in range(len(reviews)):\n",
    "    prompt = f\"\"\"\n",
    "    Your task is to generate a short summary of a product \\ \n",
    "    review from an ecommerce site. \n",
    "\n",
    "    Summarize the review below, delimited by triple \\\n",
    "    backticks in at most 20 words. \n",
    "\n",
    "    Review: ```{reviews[i]}```\n",
    "    \"\"\"\n",
    "\n",
    "    response = get_completion(prompt)\n",
    "    print(i, response, \"\\n\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e0c9f921-8672-4124-bad6-8bee65078ccb",
   "metadata": {},
   "source": [
    "## Try experimenting on your own!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d05d8a20-86f2-4613-835e-41c49a504b5b",
   "metadata": {
    "height": 30
   },
   "outputs": [],
   "source": [
    "import json\n",
    "with open('l2-guidelines.ipynb') as f:\n",
    "    print(f.read())"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}