Blog Summarizer App

The Blog Summarizer App showcases a common real-world use of Large Language Models (LLMs) and generative AI. Built with Python and the OpenAI API, it helps content creators and bloggers give their readers concise, coherent summaries that make the main points easier to grasp.

See example summaries here.

Key Features


Blog Summarizer is a customizable app that lets users integrate generated summaries into their blogs, articles, or other content.

Automatic Summarization: Say goodbye to manual summarization. This app automates the process, saving you time and effort.

Customization: Tailor the summarization process to your specific needs. Adjust token limits and select the OpenAI model that suits your requirements.

Markdown Integration: Seamlessly process Markdown files and insert summaries at the specified insertion markers. Your blog articles will benefit from organized and compelling summaries.

Prerequisites


Before using the Blog Summarizer App, make sure you have the following:

   – Python 3.x installed on your system.

   – An OpenAI API key for making API requests.

Installation


Follow the steps below to get started:

  1. Clone the repository to your local machine:
# Use the following commands
git clone https://github.com/codebygarrysingh/blog-summarizer-app.git
cd blog-summarizer-app
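  2. Install the required Python packages using pip (the script imports openai, tiktoken, and bs4, which the beautifulsoup4 package provides).

  3. Configure your OpenAI API key in the config.py file, as described under Configurable Parameters below.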


Configurable Parameters


You can customize the behavior of the Blog Summarizer App by modifying the constants and configurations in the config.py file. For example, you can change the OpenAI model, token limits, and insertion markers.

	
# OpenAI API Key (Replace with your actual API key)
OPENAI_API_KEY = "YOUR_API_KEY_HERE"

# Directory containing Markdown blog articles
BLOG_DIR = "YOUR_BLOG_DIR_PATH_HERE"

# File extension for blog articles, for example .markdown
BLOG_FILE_EXT = "YOUR_BLOG_FILE_EXT_HERE"

# Offset for starting to read content from Markdown files
BLOG_OFFSET = 11

# Name of the OpenAI model to use for summarization
MODEL_NAME = "text-davinci-002"

# Maximum number of tokens for the response
MAX_RESPONSE_TOKENS = 150

# Insertion marker for the blog files
INSERTION_MARKER = "<!-- Insert Summary Here -->"




Importing Necessary Libraries and Modules

import os
import openai
import tiktoken
from bs4 import BeautifulSoup
import config

The code begins by importing several essential libraries and modules. Here’s what each of them does:

– os: Allows interaction with the operating system, necessary for file operations.

– openai: The client library for the OpenAI API, used here to call a GPT-3 model for text generation.

– tiktoken: Counts the tokens in a text string, which helps keep requests within the model’s token limit.

– BeautifulSoup: Used for parsing and cleaning HTML content.

– config: Imports configuration parameters from an external file.


Constants and Configuration

OPENAI_API_KEY = config.OPENAI_API_KEY
BLOG_DIR = config.BLOG_DIR
BLOG_FILE_EXT = config.BLOG_FILE_EXT
BLOG_OFFSET = config.BLOG_OFFSET
MODEL_NAME = config.MODEL_NAME
MAX_RESPONSE_TOKENS = config.MAX_RESPONSE_TOKENS
INSERTION_MARKER = config.INSERTION_MARKER

These constants and configurations are set up to store critical parameters like API keys, file paths, model names, and token limits.


Initialize the OpenAI Client

openai.api_key = OPENAI_API_KEY

The OpenAI API key is set to enable communication with OpenAI’s GPT-3 model.


Token Counting Function

def num_tokens_in_content(content: str, model_name: str) -> int:
    # ...

This function counts the number of tokens in a text string using the tiktoken library. It takes the content and the model name as input and returns the token count.
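
The body is elided above, so here is a minimal sketch of how such a function might look, assuming tiktoken's `encoding_for_model` and `get_encoding` helpers. The fallback to `cl100k_base` and the optional `encode` parameter (a hook that lets the function run without downloading an encoding) are assumptions added for illustration, not part of the original code.

```python
from typing import Callable, Optional, Sequence


def num_tokens_in_content(
    content: str,
    model_name: str,
    encode: Optional[Callable[[str], Sequence]] = None,  # hypothetical test hook
) -> int:
    """Return the number of tokens in `content` for the given model."""
    if encode is None:
        import tiktoken  # third-party; imported lazily so the hook works without it

        try:
            encoding = tiktoken.encoding_for_model(model_name)
        except KeyError:
            # Assumption: fall back to a general-purpose encoding for unknown models.
            encoding = tiktoken.get_encoding("cl100k_base")
        encode = encoding.encode
    return len(encode(content))
```

With tiktoken installed, `num_tokens_in_content(text, MODEL_NAME)` could be compared against the model's context size before a request is sent.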


Content Summarization Function

def generate_blog_summary(content: str, model_name: str, max_tokens: int) -> str:
    # ...

This function generates a summary of the content using the OpenAI model. It takes the content, model name, and the maximum number of tokens for the generated summary as input. It returns the generated summary.
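
One possible shape for this function is sketched below, assuming the pre-1.0 `openai` package, which matches the `openai.api_key = ...` style and the `text-davinci-002` completion model used here (newer versions of the package use a client object instead). The prompt wording, the `build_summary_prompt` and `legacy_openai_complete` helpers, and the `complete` parameter are hypothetical names introduced for this example.

```python
from typing import Callable, Optional


def build_summary_prompt(content: str) -> str:
    """Wrap the blog text in a simple instruction (hypothetical prompt wording)."""
    return (
        "Summarize the following blog post in a short paragraph.\n\n"
        f"{content.strip()}\n\nSummary:"
    )


def legacy_openai_complete(prompt: str, model_name: str, max_tokens: int) -> str:
    """Call the legacy completions endpoint (openai package versions before 1.0)."""
    import openai  # third-party; imported lazily

    response = openai.Completion.create(
        model=model_name, prompt=prompt, max_tokens=max_tokens
    )
    return response["choices"][0]["text"]


def generate_blog_summary(
    content: str,
    model_name: str,
    max_tokens: int,
    complete: Optional[Callable[[str, str, int], str]] = None,  # hypothetical hook
) -> str:
    """Return a whitespace-trimmed summary of `content`."""
    if complete is None:
        complete = legacy_openai_complete
    return complete(build_summary_prompt(content), model_name, max_tokens).strip()
```

Passing the completion call in as a parameter keeps the prompt logic separate from the API client, so the function can be tried with a stub before spending API credits.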


Add Summary to Blog Function

def add_summary_to_blogs():
    # ...

This function processes blog files and updates them with generated summaries. It iterates through the files in the specified directory (BLOG_DIR) and handles only those with the configured file extension (BLOG_FILE_EXT). For each one, it reads the blog content, extracts the relevant text, generates a summary, and updates the file with that summary.
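
The file-processing loop described above can be sketched with the standard library alone. This version makes several assumptions: the offset is treated as the number of leading lines to skip (for example front matter), the summarizer is passed in as a hypothetical `summarize` callable, the settings are function parameters rather than module constants, and the HTML cleaning step with BeautifulSoup is left out for brevity.

```python
import os
from typing import Callable


def add_summary_to_blogs(
    blog_dir: str,
    file_ext: str,
    offset: int,
    marker: str,
    summarize: Callable[[str], str],  # hypothetical: maps article text to a summary
) -> int:
    """Replace `marker` with a summary in each matching file; return files updated."""
    updated = 0
    for name in sorted(os.listdir(blog_dir)):
        if not name.endswith(file_ext):
            continue  # only process files with the configured extension
        path = os.path.join(blog_dir, name)
        with open(path, encoding="utf-8") as handle:
            lines = handle.readlines()
        text = "".join(lines)
        if marker not in text:
            continue  # nothing to insert into
        # Assumption: skip the first `offset` lines, and keep the marker itself
        # out of the text that gets summarized.
        body = "".join(lines[offset:]).replace(marker, "")
        summary = summarize(body)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text.replace(marker, summary, 1))
        updated += 1
    return updated
```

Because the marker is replaced by the summary, running the function a second time leaves already-summarized files untouched.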


Main Execution

if __name__ == "__main__":
    add_summary_to_blogs()

The code checks if the script is being executed directly (not imported as a module) and then calls the add_summary_to_blogs() function to summarize the blog articles.


Usage


Follow the steps below. The app processes the blog files, generates a summary for each one, and replaces the insertion marker in each file with that summary.

  1. Place your blog articles in the designated directory.
  2. Configure the app using the constants in the config.py file. You can adjust token limits and other settings.
  3. Run the app using the following command:
    python main.py
    


License


This project is licensed under the MIT License - see the LICENSE file for details.


Acknowledgements


Special thanks to OpenAI for their powerful language models. This project was inspired by the need to simplify the process of summarizing blog articles.
