How to categorize websites using the Categorization API

Website categorization has many uses. You might be building an online directory, analyzing a large list of websites for behavioral patterns, filtering your network traffic, or categorizing ads on your platform.

In each of these scenarios, you need a reliable URL categorization partner who offers a quick and painless solution for your needs. Our service lets you perform URL categorization in four simple steps through a REST API that you can call from your platform of choice.

Step 1 – Generate API access keys

Access keys allow you to connect to our services without sharing your username and password or other sensitive information in your application. Since they are unique, you can rest assured that no one except you will be able to use resources from your account.

To generate API keys:

  1. Sign in to your account dashboard (https://dashboard.webshrinker.com).
  2. Navigate to the “API Access Keys” area and select “Create API Key”.
  3. Add a description to your key and make sure that the “Categories” service is selected.
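
Once the key pair exists, it helps to keep it out of your source code. Here is a minimal sketch that reads the keys from environment variables. The variable names WEBSHRINKER_ACCESS_KEY and WEBSHRINKER_SECRET_KEY are hypothetical choices for this example and are not required by the service.

```python
import os


def load_api_keys(environ=os.environ):
    """Return (access_key, secret_key) read from environment variables.

    The variable names below are illustrative assumptions, not names
    that the API itself defines.
    """
    try:
        return (environ["WEBSHRINKER_ACCESS_KEY"],
                environ["WEBSHRINKER_SECRET_KEY"])
    except KeyError as missing:
        # Fail early with a clear message instead of sending empty credentials.
        raise RuntimeError("Missing environment variable: %s" % missing)
```

The returned tuple can be passed directly as the auth argument of a requests call.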

Step 2 – Send the category lookup request

In this post we will use HTTP Basic Authentication to make requests to the API. It’s also possible to generate “pre-signed URLs” which grant access to the API for specific requests without using basic authentication. To check out this method, please refer to our documentation.

Because we send HTTP requests with the Python “requests” library, we can use its built-in basic authentication support, which adds the necessary HTTP header for us.
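
To see what that header looks like, here is a rough sketch of how an HTTP Basic Authentication value is built by hand. The function name basic_auth_header is made up for this illustration, and the credentials are placeholders. In practice the auth argument of “requests” does this for you.

```python
from base64 import b64encode


def basic_auth_header(key, secret_key):
    """Build the Authorization header value for HTTP Basic Authentication."""
    # Basic auth joins the two values with a colon and Base64-encodes the result.
    credentials = ("%s:%s" % (key, secret_key)).encode("utf-8")
    return "Basic " + b64encode(credentials).decode("ascii")


# Placeholder credentials, not real keys.
headers = {"Authorization": basic_auth_header("my-access-key", "my-secret-key")}
```

Note that this is standard Base64 applied to the credentials, which is separate from the URL-safe Base64 encoding applied to the target website in the request path.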

Make sure you have your access key and secret key and substitute them into the sample code. Your next step is to send the request to our server.

To do that, we call the library's get function, passing the Base64-encoded URL and your API keys.

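import requests
from base64 import urlsafe_b64encode

target_website = ""  # the URL or domain to categorize
key = ""             # your API access key
secret_key = ""      # your API secret key

# The target is passed as a URL-safe Base64 string in the request path.
# Encode to bytes first and decode the result so this also works on Python 3.
encoded_target = urlsafe_b64encode(target_website.encode("utf-8")).decode("ascii")
api_url = "https://api.webshrinker.com/categories/v2/%s" % encoded_target

response = requests.get(api_url, auth=(key, secret_key))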

Step 3 – Parsing the response

Our categorization API returns its response in standard JSON format. Calling the json() function on the response parses it into a Python object that you can pass to your application.

status_code = response.status_code

data = response.json()

Here is an example of the response:

{
    "data": [
        {
            "categories": [
                "informationtech",
                "business"
            ],
            "url": "webshrinker.com"
        }
    ]
}
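
As a small illustration of working with this structure, the sketch below maps each URL in a payload to its categories. The helper name categories_by_url is hypothetical, and the sample string simply repeats the example response above. Real responses may carry additional fields that this sketch ignores.

```python
import json

# The example response from above, as a JSON string.
sample = '{"data": [{"categories": ["informationtech", "business"], "url": "webshrinker.com"}]}'


def categories_by_url(payload):
    """Map each URL in a parsed response payload to its list of categories."""
    return {entry["url"]: entry.get("categories", [])
            for entry in payload.get("data", [])}


result = categories_by_url(json.loads(sample))
```

A payload without a "data" key yields an empty dictionary rather than raising, which can be convenient when the same code path also sees error responses.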

Step 4 – Handling status codes

In this example, we use the HTTP status code returned by the API to tell whether the categorization is complete or still being evaluated.

if status_code == 200:
    # Do something with the JSON response
    categories = data['data'][0]['categories']
    print("'%s' belongs to the following categories: %s" % (target_website, ",".join(categories)))
elif status_code == 202:
    # The website is being categorized right now in real time; check again for an updated answer
    print("The categories for '%s' are being determined, check again in a few seconds" % target_website)
else:
    # The different status codes are covered in the documentation (http://docs.webshrinker.com/v2/website-category-api.html#category-lookup)
    print("An error occurred: HTTP %d" % status_code)
    if 'error' in data:
        print(data['error']['message'])

In this if/else statement, we covered three possible scenarios:

  • Status code 200 is returned if the submitted URL already exists in our database.
  • Status code 202 is returned if the URL does not exist in our database and our algorithm is categorizing it in real time. This typically takes a few seconds to a minute to complete.
  • Finally, the else branch runs if an error occurs. For the complete list of status codes, please refer to our documentation.
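
Because a 202 response means the answer is not ready yet, one possible approach is to repeat the lookup a few times before giving up. The sketch below is one way to do that, not an official client. The name lookup_with_retry, the attempt limit, and the delay are all assumptions chosen for illustration, and fetch stands for any function of yours that performs the request and returns a (status_code, data) tuple.

```python
import time


def lookup_with_retry(fetch, max_attempts=5, delay_seconds=5.0, sleep=time.sleep):
    """Call fetch() until it stops returning HTTP 202 or attempts run out.

    fetch must return a (status_code, data) tuple. The last result is
    returned, so the caller may still see 202 if categorization did not finish.
    """
    status_code, data = fetch()
    for _ in range(max_attempts - 1):
        if status_code != 202:
            break
        # Still being categorized: wait, then ask again.
        sleep(delay_seconds)
        status_code, data = fetch()
    return status_code, data
```

Here fetch could be a small function that calls requests.get(api_url, auth=(key, secret_key)) and returns (response.status_code, response.json()). Passing the sleep function as a parameter keeps the helper easy to test without real delays.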

