How to get root domain from URL using Python

Admin : Jun 28, 2019 10:28:10 AM

As SEO professionals, a lot of the work we do involves analysing and manipulating URL data. URL-level data is great for detail, but when it comes to digesting that data or presenting findings to stakeholders or internal teams, that granularity can be a downside. A common way around this is to group URLs by root domain, turning lots of smaller pieces of analysis into one bigger picture.

If you’re less familiar with Python, check out our other blog post, how to extract domains from URLs in Excel.

Get domain from URLs using Python

We’ll be using the tld package for Python 3 to keep our scripts short and easy to manage. You can find more information on the tld project page, and it can be installed with pip install tld. If you’re using IDLE with macOS, check out my other post, which gives a brief overview of how to install modules in IDLE.

Extracting a single domain using print

from tld import get_tld
 
url = 'https://www.honchosearch.com/blog/seo/14-elements-every-successful-outreach-campaign-need/' #URL to strip. Change this URL to whatever you want.
 
res = get_tld(url, as_object=True) #Parse the URL into a result object
 
print(res.fld) #res.fld is the root domain (here, honchosearch.com)

Extracting multiple root domains within a list using print

from tld import get_tld
 
urls = ['https://www.example.com/hello_world', 'https://www.example.co.uk/hello_uk']
#list of urls
 
for url in urls: #for loop to create iterations
    res = get_tld(url,as_object=True)
    print(res.fld)

Extracting multiple root domains from a CSV using print

from tld import get_tld
 
urls_file = "urls_file.csv"
#URLs should be in column A without a heading, in a CSV file named "urls_file.csv"
 
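with open(urls_file) as f: #the file is closed automatically after reading
    urls = [line.strip() for line in f if line.strip()] #one URL per line, skipping blank lines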
 
for url in urls:
    res = get_tld(url,as_object=True)
    print(res.fld)

This script also works with .txt files.

Extracting multiple root domains from a CSV to a CSV

import csv
from tld import get_tld

urls_file = "urls_file.csv"
#URLs should be in column A without a heading, in a CSV file named "urls_file.csv"

with open(urls_file) as source:
    urls = [line.strip() for line in source if line.strip()]

#Creates "domains.csv" in the same directory the script is run from
with open("domains.csv", "w", newline="") as the_file:
    writer = csv.writer(the_file) #csv.writer quotes any URLs that contain commas
    writer.writerow(["root domain", "url"])
    for url in urls:
        try:
            root_domain = get_tld(url, as_object=True).fld
        except Exception: #get_tld raises an error when it cannot parse the URL
            root_domain = "NO ROOT"
        writer.writerow([root_domain, url])

What is a root domain?

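Outside of the context of DNS, the root domain usually refers to the overarching structure of a domain: for example, https://www.honchosearch.com, which contains all of the site's folders (such as /services/seo), subdomains and so on. In the scripts above, res.fld returns this as honchosearch.com, without the protocol or the www subdomain.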

What is the tld Python library?

The tld Python library, created by Artur Barseghyan, allows you to easily extract the top-level domain (TLD) from a given URL. What makes this library so handy is that it includes other useful functions, such as extracting subdomains, extracting root domains and even checking the validity of a TLD.

Want to find out more? National Coding Week is coming up on 16th September. Keep an eye on our blog as we'll be sharing useful tips daily.
