Automate Data Extraction and Analysis with OpenAI & Google Sheets

This workflow utilizes the OpenAI API and Google Sheets API to extract HTML data, analyze it with OpenAI GPT, and seamlessly merge it into Google Sheets for easy access. Ideal for data management professionals, this automation simplifies complex data processing. Requires 2 accounts: OpenAI API Key and Google Sheets OAuth. Save hours of manual data entry and streamline your reporting process, generating actionable insights in minutes.

1 Trigger
79,819 views11 nodesJan 2023Nathan Foster

Categories

Lead GenerationAI Summarization

APIs

OpenAIGoogle Sheets API

AI Features

OpenAI GPT

Credentials

2 required

Quick Actions

Copy or download to import into your n8n instance

Workflow JSON
{
  "meta": {
    "instanceId": "f0a68da631efd4ed052a324b63ff90f7a844426af0398a68338f44245d1dd9e5"
  },
  "nodes": [
    {
      "id": "04750e9b-6ce3-401b-89e7-f1f17f3a4a28",
      "name": "When clicking \"Execute Workflow\"",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        -180,
        300
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "7a8bb997-5a2d-4ee0-a1ca-bebe9fe32bc2",
      "name": "HTTP Request",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        640,
        460
      ],
      "parameters": {
        "url": "=https://www.{{ $node[\"Split In Batches\"].json[\"Domain\"] }}",
        "options": {
          "redirect": {
            "redirect": {
              "followRedirects": true
            }
          }
        }
      },
      "typeVersion": 3,
      "continueOnFail": true
    },
    {
      "id": "6409f0c4-bf93-4a1d-a74c-e294fb39895f",
      "name": "HTML Extract",
      "type": "n8n-nodes-base.htmlExtract",
      "position": [
        820,
        460
      ],
      "parameters": {
        "options": {
          "trimValues": false
        },
        "extractionValues": {
          "values": [
            {
              "key": "body",
              "cssSelector": "html"
            }
          ]
        }
      },
      "typeVersion": 1,
      "continueOnFail": true
    },
    {
      "id": "f45fcc6a-9ccd-43c9-9eaf-1797768e1e62",
      "name": "OpenAI",
      "type": "n8n-nodes-base.openAi",
      "position": [
        1140,
        460
      ],
      "parameters": {
        "prompt": "=This is the content of the website {{ $node[\"Split In Batches\"].json[\"Domain\"] }}:\"{{ $json[\"contentShort\"] }}\"\n\nIn a JSON format:\n\n- Give me the value proposition of the company. In less than 25 words. In English. Casual Tone. Format is: \"[Company Name] helps [target audience] [achieve desired outcome] and [additional benefit]\"\n\n- Give me the industry of the company. (Classify using this industry list: [Agriculture, Arts, Construction, Consumer Goods, Education, Entertainment, Finance, Other, Health Care, Legal, Manufacturing, Media & Communications, Public Administration, Advertisements, Real Estate, Recreation & Travel, Retail, Software, Transportation & Logistics, Wellness & Fitness] if it's ambiguous between Sofware and Consumer Goods, prefer Consumer Goods)\n\n- Guess the target audience of each company.(Classify and choose 1 from this list: [sales teams, marketing teams, HR teams, customer Service teams, consumers, C-levels] Write it in lowercase)\n\n- Tell me if they are B2B or B2C\n\nformat should be:\n{\"value_proposition\": value_proposition,\n\"industry\": industry,\n\"target_audience\": target_audience, \n\"market\": market }\n\nJSON:",
        "options": {
          "topP": 1,
          "maxTokens": 120,
          "temperature": 0
        }
      },
      "credentials": {
        "openAiApi": {
          "id": "67",
          "name": "Lucas Open AI"
        }
      },
      "typeVersion": 1,
      "continueOnFail": true
    },
    {
      "id": "8de6c3d4-316f-4e00-a9f5-a4deefce90b3",
      "name": "Merge",
      "type": "n8n-nodes-base.merge",
      "position": [
        1600,
        320
      ],
      "parameters": {
        "mode": "combine",
        "options": {},
        "combinationMode": "mergeByPosition"
      },
      "typeVersion": 2
    },
    {
      "id": "669f888e-1416-4291-a854-07ffbbbfcab1",
      "name": "Clean Content",
      "type": "n8n-nodes-base.code",
      "position": [
        980,
        460
      ],
      "parameters": {
        "mode": "runOnceForEachItem",
        "jsCode": "if ($input.item.json.body){\n\n\n\n$input.item.json.content = $input.item.json.body.replaceAll('/^\\s+|\\s+$/g', '').replace('/(\\r\\n|\\n|\\r)/gm', \"\").replace(/\\s+/g, ' ')\n\n\n  $input.item.json.contentShort = $input.item.json.content.slice(0, 10000)\n}\n\n\n\n\nreturn $input.item"
      },
      "executeOnce": false,
      "typeVersion": 1,
      "continueOnFail": true,
      "alwaysOutputData": true
    },
    {
      "id": "dbd5f866-2f5e-4adf-b1b5-a27b08c0425a",
      "name": "Update Google Sheets",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        1840,
        320
      ],
      "parameters": {
        "options": {},
        "fieldsUi": {
          "values": [
            {
              "column": "Market",
              "fieldValue": "={{ $json[\"market\"] }}"
            },
            {
              "column": "Industry",
              "fieldValue": "={{ $json[\"industry\"] }}"
            },
            {
              "column": "Value Proposition",
              "fieldValue": "={{ $json[\"value_proposition\"] }}"
            },
            {
              "column": "Target Audience",
              "fieldValue": "={{ $json[\"target_audience\"] }}"
            }
          ]
        },
        "operation": "update",
        "sheetName": {
          "__rl": true,
          "mode": "list",
          "value": "gid=0",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/13h8HPWKha5kZHDeKxAPQvQqAOonof5cgpxzh79tIQfY/edit#gid=0",
          "cachedResultName": "Sheet1"
        },
        "documentId": {
          "__rl": true,
          "mode": "url",
          "value": "https://docs.google.com/spreadsheets/d/18iZ59I0q2AeElqcEpyJECNlSv4M6iJll9PQzXQkqEUk/edit#gid=0",
          "__regex": "https:\\/\\/(?:drive|docs)\\.google\\.com\\/\\w+\\/d\\/([0-9a-zA-Z\\-_]+)(?:\\/.*|)"
        },
        "valueToMatchOn": "={{ $json[\"Domain\"] }}",
        "columnToMatchOn": "Domain"
      },
      "credentials": {
        "googleSheetsOAuth2Api": {
          "id": "2",
          "name": "Google Sheets account lucas"
        }
      },
      "typeVersion": 3
    },
    {
      "id": "f8bf8b70-6070-447b-af22-4d4e1ffe3539",
      "name": "Parse JSON",
      "type": "n8n-nodes-base.code",
      "position": [
        1300,
        460
      ],
      "parameters": {
        "mode": "runOnceForEachItem",
        "jsCode": "// Add a new field called 'myNewField' to the\n// JSON of the item\n$input.item.json.value_proposition=JSON.parse($input.item.json.text).value_proposition\n\n$input.item.json.industry=JSON.parse($input.item.json.text).industry\n\n$input.item.json.market=JSON.parse($input.item.json.text).market\n\n$input.item.json.target_audience=JSON.parse($input.item.json.text).target_audience\n\nreturn $input.item;"
      },
      "typeVersion": 1
    },
    {
      "id": "2754c6e1-9cf6-47d4-ad97-0797ec9155df",
      "name": "Read Google Sheets",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        40,
        300
      ],
      "parameters": {
        "options": {},
        "sheetName": {
          "__rl": true,
          "mode": "list",
          "value": "gid=0",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/13h8HPWKha5kZHDeKxAPQvQqAOonof5cgpxzh79tIQfY/edit#gid=0",
          "cachedResultName": "Sheet1"
        },
        "documentId": {
          "__rl": true,
          "mode": "url",
          "value": "https://docs.google.com/spreadsheets/d/18iZ59I0q2AeElqcEpyJECNlSv4M6iJll9PQzXQkqEUk/edit#gid=0",
          "__regex": "https:\\/\\/(?:drive|docs)\\.google\\.com\\/\\w+\\/d\\/([0-9a-zA-Z\\-_]+)(?:\\/.*|)"
        }
      },
      "credentials": {
        "googleSheetsOAuth2Api": {
          "id": "2",
          "name": "Google Sheets account lucas"
        }
      },
      "typeVersion": 3
    },
    {
      "id": "c2b93428-0dcc-4c02-bb81-496c12442284",
      "name": "Split In Batches",
      "type": "n8n-nodes-base.splitInBatches",
      "position": [
        260,
        300
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1
    },
    {
      "id": "eccf1dc8-a0bb-40f6-9471-95eac8020b02",
      "name": "Wait",
      "type": "n8n-nodes-base.wait",
      "position": [
        2060,
        560
      ],
      "webhookId": "d44bc024-1c21-44e0-b2b4-5cff6fb9f402",
      "parameters": {
        "unit": "seconds"
      },
      "typeVersion": 1
    }
  ],
  "connections": {
    "Wait": {
      "main": [
        [
          {
            "node": "Split In Batches",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Merge": {
      "main": [
        [
          {
            "node": "Update Google Sheets",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI": {
      "main": [
        [
          {
            "node": "Parse JSON",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Parse JSON": {
      "main": [
        [
          {
            "node": "Merge",
            "type": "main",
            "index": 1
          }
        ]
      ]
    },
    "HTML Extract": {
      "main": [
        [
          {
            "node": "Clean Content",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "HTTP Request": {
      "main": [
        [
          {
            "node": "HTML Extract",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Clean Content": {
      "main": [
        [
          {
            "node": "OpenAI",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Split In Batches": {
      "main": [
        [
          {
            "node": "HTTP Request",
            "type": "main",
            "index": 0
          },
          {
            "node": "Merge",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Read Google Sheets": {
      "main": [
        [
          {
            "node": "Split In Batches",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Update Google Sheets": {
      "main": [
        [
          {
            "node": "Wait",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "When clicking \"Execute Workflow\"": {
      "main": [
        [
          {
            "node": "Read Google Sheets",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}

Related Workflows

Automate AI Chat Responses with OpenAI API in n8n Workflow

This workflow utilizes the OpenAI API to facilitate seamless AI-powered chat interactions. By using HTTP requests to gather data and OpenAI's advanced GPT features to generate intelligent responses, this integration is perfect for AI-powered automation users looking to enhance customer engagement. Requires 1 account: OpenAI API. Save hours by automating chat responses and improve user satisfaction with instant, tailored interactions.

330,177 views
Market ResearchAI Summarization

Scrape Google Maps Emails to Google Sheets with Automation

This workflow utilizes the Google Sheets API to automatically scrape email addresses from Google Maps, filtering and batching the results for efficient data management. Ideal for professionals in data management, it requires just 1 account: Google Sheets OAuth. Save hours of manual data entry and generate targeted email lists quickly, driving your marketing efforts with precision.

172,223 views
Lead Generation

Transcribe YouTube Videos to Telegram with OpenAI GPT Automation

Utilize the Telegram Bot API to automatically transcribe YouTube videos using n8n's YouTube Transcriber node and deliver transcripts directly to your Telegram chat. This workflow is perfect for AI-powered automation users looking to streamline content consumption. Requires 1 account: Telegram Bot Token. Save hours on manual transcription and easily access video summaries for enhanced productivity.

92,803 views
Market ResearchAI Summarization

Optimize Conversion Rates with OpenAI Chat and AI Agent

This workflow leverages OpenAI's GPT and AI Chat to analyze form submissions, providing actionable insights for conversion rate optimization. Perfect for AI-powered automation users looking to enhance user engagement. Requires 1 account: OpenAI API. Increase conversion rates by 20% and save hours on manual analysis!

91,831 views
Market ResearchAI Summarization

How to Use This Workflow

1Import to n8n

  1. Copy the JSON using the button above
  2. Open your n8n instance
  3. Click “Import workflow” or press Ctrl+V
  4. Paste the JSON and click “Import”

2Before Running

Configure credentials and update service-specific settings before executing the workflow. Review required credentials in the Technical Details section above.

79.8K