How to create a Meta Tag Validator by scraping the website in NEXT JS

Hello friends, in this post I will show how I built a meta tag validator in Next.js by scraping a website. Along the way you will learn about Next.js API routes, regular expressions, and web scraping. I will scrape my own website, extract all of its meta tags, and check whether it has all the necessary ones. So let's get started.

You can try a live demo of this tool on our website:

Meta Tags Validator | CodeWithMarish

Step 1: Project Setup

Open VS Code, click Terminal > New Terminal (or press Ctrl + J), and run the command below:

> npx create-next-app meta-tags-validator

To fetch the website data I will use the axios package, so install it with:

> npm install axios

Step 2: Coding

First, we will create an API route for fetching a website. In the pages/api folder you will find hello.js; rename it to metatagvalidator.js. Inside that file we use axios to fetch the website's HTML and return either the response or the error whenever the metatagvalidator API is called.

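import axios from "axios";

// Fetch the raw HTML of the given URL; return the error instead of throwing
async function fetchAPI(url) {
  try {
    const { data } = await axios.get(url);
    return { data };
  } catch (err) {
    return { err };
  }
}

// API route: POST { url } and receive { response, err }
export default async function handler(req, res) {
  const { url } = req.body;
  const { data, err } = await fetchAPI(url);

  // Always reply with 200; the client checks the err field
  res.status(200).json({ response: data, err: err });
}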
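
The handler above passes whatever arrives in req.body straight to axios, so it can help to check the URL first. The following is only a minimal sketch and is not part of the original project: the helper name isHttpUrl is hypothetical, and it relies on nothing but the standard URL constructor.

```javascript
// Hypothetical helper (not in the original project): returns true only for
// strings that parse as absolute http:// or https:// URLs.
function isHttpUrl(value) {
  if (typeof value !== "string") return false;
  try {
    const parsed = new URL(value); // throws on input that is not a valid URL
    return parsed.protocol === "http:" || parsed.protocol === "https:";
  } catch {
    return false;
  }
}

console.log(isHttpUrl("https://codewithmarish.com")); // true
console.log(isHttpUrl("ftp://example.com")); // false
console.log(isHttpUrl("not a url")); // false
```

One possible use is to call isHttpUrl(url) at the top of the handler and reply with a 400 status when it returns false, instead of calling fetchAPI.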

Now, in pages/index.js, we will create response, error, and URL state variables using useState, and call this API using axios once the component is mounted.

import axios from "axios";
import { useState, useEffect } from "react";
export default function Home() {
  const [response, setResponse] = useState();
  const [error, setError] = useState();
  const [url, setURL] = useState("https://codewithmarish.com");

  const fetchData = async (url) => {
    const {
      data: { response, err },
    } = await axios.post("/api/metatagvalidator", {
      url: url,
    });

    setResponse(response);
    setError(err);
  }; 
  useEffect(() => {
    fetchData(url);
  }, [url]);

  return (
    <div>
    </div>
  );
}

After this, we will extract all meta tags into a list and also store them as key-value pairs. To extract the tags we use the regex '<meta.*?(|</meta)>', which matches every meta tag. For each tag in the list we then get the property name using the regex '((?<=name=")|(?<=property=")).*?(?=")'.

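(?<=...) is a lookbehind: it matches only at a position that is preceded by name=" or property=". (?=") is a lookahead: it matches only at a position that is followed by ". Together they capture the text between the opening and closing quotes. For example, (?<=Y)X matches X only if Y comes before it, and X(?=Y) matches X only if it is followed by Y.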
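
To make the lookbehind and lookahead rules concrete, here is a small sketch you can run in Node. The sample strings and variable names are made up for illustration; only the name/property regex comes from this post.

```javascript
// Lookbehind: "X" matches only when "Y" comes immediately before it.
const behindHit = "YX".match(/(?<=Y)X/); // matches "X"
const behindMiss = "ZX".match(/(?<=Y)X/); // null, because "Z" precedes "X"

// Lookahead: "X" matches only when "Y" comes immediately after it.
const aheadHit = "XY".match(/X(?=Y)/); // matches "X" ("Y" is not consumed)

// Both together, applied to a made-up meta tag string.
const sampleTag = '<meta name="description" content="A sample page">';
const nameMatch = sampleTag.match(/((?<=name=")|(?<=property=")).*?(?=")/);

console.log(behindHit[0], behindMiss, aheadHit[0]);
console.log(nameMatch[0]); // description
```

Note that the quotes themselves are not part of the match, because lookbehind and lookahead only check the surrounding text without consuming it.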

For the content or value there are two cases: it is either inside the content= attribute or between the tags, as in <meta>content…</meta>. So we have two regexes here: '(?<=content=").*?(?=")' and '<meta.*?>(.*?)</meta>'. Finally, we store these tags in metaTagsList and metaTags.

We call this function whenever the response variable is updated after fetching data, so we add a useEffect with response as its dependency.

import axios from "axios";
import { useState, useEffect } from "react";

export default function Home() {
  const [metaTags, setMetaTags] = useState();
  const [metaTagsList, setMetagsList] = useState([]);
  ...

  const getData = (data) => {
    const regexp = new RegExp("<meta.*?(|</meta)>", "g");
    let metaTagsContent = {};
    let metaTagsList = [];
    if (data) {
      // match() returns null when nothing is found, so fall back to []
      metaTagsList = data.match(regexp) || [];
      metaTagsList.forEach((tag) => {
        let nameRegexp = new RegExp(
          '((?<=name=")|(?<=property=")).*?(?=")',
          "g"
        );
        let contentRegexp = new RegExp('(?<=content=").*?(?=")', "g");
        let contentRegexp1 = new RegExp("<meta.*?>(.*?)</meta>", "g");
        let name = tag.match(nameRegexp);
        let content = tag.match(contentRegexp);
        content = content || tag.match(contentRegexp1);
        if (name && content) {
          metaTagsContent = {
            ...metaTagsContent,
            [`${name[0]}`]: `${content[0]}`,
          };
        }
      });
    }
    return { metaTagsList, metaTagsContent };
  };
  ...

  useEffect(() => {
    const { metaTagsList, metaTagsContent } = getData(response);
    setMetaTags(metaTagsContent);
    setMetagsList(metaTagsList);
  }, [response]);
  return (
    <div></div>
  );
}
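
If you want to try the extraction step outside React, the sketch below wraps a simplified version of it in a plain function that runs in Node. The names extractMetaTags and sampleHtml are hypothetical, the HTML is made up, and the sketch handles only the content="..." case with double-quoted attributes.

```javascript
// Standalone sketch of the extraction step (simplified from getData above).
// extractMetaTags and sampleHtml are hypothetical names used for illustration.
function extractMetaTags(html) {
  // Collect every <meta ...> tag; match() returns null when none are found.
  const tags = (html || "").match(/<meta.*?>/g) || [];
  const content = {};
  for (const tag of tags) {
    const name = tag.match(/((?<=name=")|(?<=property=")).*?(?=")/);
    const value = tag.match(/(?<=content=").*?(?=")/);
    // Keep only tags that have both a name/property and a content attribute.
    if (name && value) content[name[0]] = value[0];
  }
  return { tags, content };
}

const sampleHtml = `
<head>
  <meta charset="utf-8">
  <meta name="description" content="A sample page">
  <meta property="og:title" content="Sample">
</head>`;

const result = extractMetaTags(sampleHtml);
console.log(result.tags.length); // 3
console.log(result.content["og:title"]); // Sample
```

The charset tag is counted in the list but skipped in the key-value object, since it has no name or property attribute.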

We will install Tailwind CSS to simplify the styling.

npm install -D tailwindcss postcss autoprefixer
npx tailwindcss init -p

Update tailwind.config.js as below

/** @type {import('tailwindcss').Config} */
module.exports = {
  content: [
    "./pages/**/*.{js,ts,jsx,tsx}",
    "./components/**/*.{js,ts,jsx,tsx}",
  ],
  mode: "jit",
  theme: {
    extend: {},
  },
  plugins: [],
};

Please refer to the Tailwind CSS documentation for installation steps and other guides.

Tailwind CSS Docs

If anything goes wrong, we show a p tag with an error message. Otherwise we show a simple card containing the meta title, description, URL, and image. I have used Tailwind CSS class names for styling; you can find details of each class in the Tailwind documentation.

...

export default function Home() {
  ...
  return (
    <div className="container max-w-2xl mx-auto p-4">
      <h1 className="text-2xl md:text-3xl mb-6 font-medium">
        Meta Tags Validator
      </h1>
      <p className="text-red-400 font-bold">
        Challenge for you: add a URL input form here to make this page dynamic
      </p>
      {error ? (
        <p className="text-red-300">Please enter a valid url or try again</p>
      ) : response && metaTags && metaTagsList ? (
        <div className="flex flex-col space-y-6">
          <a
            href={url}
            target="_blank"
            rel="noreferrer"
            className="border flex flex-col-reverse sm:flex-row justify-between"
          >
            <div className="flex flex-col p-4 text-center sm:text-left">
              <p className="text-lg font-bold">
                {metaTags["title"] || metaTags["og:title"]}
              </p>
              <p className="mt-2 max-w-[65ch]">
                {metaTags["description"] || metaTags["og:description"]}
              </p>
              <p className="mt-3">{new URL(url).host}</p>
            </div>
            <img
              src={metaTags["og:image"]}
              alt="Preview image"
              className="max-h-36 sm:max-h-32 object-contain self-center"
            />
          </a>

        </div>
      ) : (
        <div></div>
      )}
    </div>
  );
}

We will create a new component, MetaTagsValidator.jsx, inside the components folder. This component holds an object listing the tags we want to validate, grouped by category. It receives metaTagsList and metaTags as props from index.js. It shows the total number of meta tags found and checks whether the page contains every tag in our list, showing a check or cross symbol for each one.

import React from "react";

const Data = ({ label, check }) => {
  let classNames = `w-7 flex-shrink-0 h-7 ${
    check ? "text-green-500" : "text-red-500"
  } rounded-full flex items-end justify-center bg-slate-100`;
  return (
    <div className="flex space-x-3">
      {check ? (
        <div className={classNames}>&#10004;</div>
      ) : (
        <div className={classNames}>&#10060;</div>
      )}
      <p>{label}</p>
    </div>
  );
};

const MetaTagsValidator = ({ metaTagsList, metaTags }) => {
  let tagsList = {
    "Basic Meta Tags": ["title", "description"],
    "Open Graph Tags": [
      "og:title",
      "og:description",
      "og:image",
      "og:site_name",
      "og:url",
    ],
    "Twitter Tags": [
      "twitter:title",
      "twitter:description",
      "twitter:image",
      "twitter:url",
      "twitter:card",
    ],
  };

  return (
    <div className="flex flex-col space-y-4">
      <Data
        label={`Total ${metaTagsList.length} Meta Tags found.`}
        check={metaTagsList.length > 0}
      />
      {Object.keys(tagsList).map((tag) => {
        return (
          <div key={tag} className="flex flex-col space-y-4">
            <Data
              label={tag}
              check={tagsList[tag].every((v) => (metaTags[v] ? true : false))}
            />
            <div className="flex flex-col space-y-3 ml-10 mt-4">
              {tagsList[tag].map((t) => {
                let c = metaTags[t] ? true : false;
                return (
                  <Data
                    key={`meta-${t}`}
                    check={c}
                    label={`meta ${t} ${c ? "found" : "not found"}`}
                  />
                );
              })}
            </div>
          </div>
        );
      })}
    </div>
  );
};

export default MetaTagsValidator;
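
The check inside the component can also be written as a plain function, which makes it easier to test without rendering anything. This is only a sketch: findMissingTags, requiredGroups, and sampleTags are hypothetical names, and the tag groups are shortened for the example.

```javascript
// Hypothetical helper: for each group, list the required tags that are missing.
function findMissingTags(metaTags, requiredGroups) {
  const missing = {};
  for (const [group, names] of Object.entries(requiredGroups)) {
    // A tag counts as missing when it has no (non-empty) value in metaTags.
    missing[group] = names.filter((name) => !metaTags[name]);
  }
  return missing;
}

const requiredGroups = {
  "Basic Meta Tags": ["title", "description"],
  "Open Graph Tags": ["og:title", "og:image"],
};
const sampleTags = { description: "A sample page", "og:title": "Sample" };

const missing = findMissingTags(sampleTags, requiredGroups);
console.log(missing["Basic Meta Tags"]); // [ 'title' ]
console.log(missing["Open Graph Tags"]); // [ 'og:image' ]
```

A group passes when its list of missing tags is empty, which corresponds to the every() check used in the component.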

Now we will use this component in index.js, below the card, and pass metaTagsList and metaTags to it as props.

Run `npm run dev` in your terminal to start the project, and you will see the output below.

Meta tag validator Demo

The challenge for you is to create an input form that accepts a URL and validates the meta tags of that page.

You can find this code in my GitHub repository.

Meta Tags Validator | Github

Thanks for reading this post. If you found it helpful, please share it as widely as you can 😊 Stay tuned.

If you are facing any issues, please contact us through our contact section.

Contact Us | CodeWithMarish

Also, please don't forget to subscribe to our YouTube channel, codewithmarish, for all web development-related challenges.

Code With Marish | Youtube

Posted with ❤️ from somewhere on the Earth

