In the day-to-day work of an SEO, it’s inevitable you’ll need to manipulate URLs to get what you need. Working with a large number of backlinks is a common occurrence, and you may need to extract the domains from the URLs to make the data more digestible. Luckily, this is very easy to do in Excel. Read on to find out how!
In our worked example, we have pulled backlink data from Ahrefs for a popular website. As there are thousands of individual links, we want to quickly simplify the data and break it down to domain level.
Now here’s the bit you care about…
I’m going to share with you two fast and easy methods to extract domains from URLs in Excel. Though you should definitely use method 2… (skip to method 2)
Method 1: Excel formula
1) Insert two new columns at the front of your data, so columns A and B are empty.
2) Copy and paste your backlinks into column A, then add the heading ‘Domains’ in B1. (Following steps 1 and 2 means you don’t have to tweak the formula.)
3) Enter the formula below into cell B2:
=IF(ISERROR(FIND("//www.",A2)), MID(A2,FIND(":",A2,4)+3,FIND("/",A2&"/",9)-FIND(":",A2,4)-3), MID(A2,FIND(":",A2,4)+7,FIND("/",A2&"/",9)-FIND(":",A2,4)-7))
4) Copy and paste the formula down the column, or drag the fill handle. The formula handles URLs that start with ‘http’ or ‘https’, with or without ‘www.’
5) Copy column B and paste it back as values to remove the formulas.
And there you have it! You will now have a clean column of domains.
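If you want to sanity-check the formula’s output outside Excel, here is a minimal Python sketch of the same idea: take the text between the protocol and the next slash, skipping a leading ‘www.’. The function name `extract_domain` and the sample URLs are made up for illustration, and this is only a rough mirror of the formula, not a full URL parser.

```python
def extract_domain(url: str) -> str:
    """Return the text between the protocol and the next "/", minus a leading "www."."""
    # Start just after "://" (the formula gets there with FIND(":") + 3).
    start = url.find("://")
    start = 0 if start == -1 else start + 3

    # Skip a leading "www." (this is the formula's "+7" branch).
    if url.startswith("www.", start):
        start += 4

    # Stop at the first "/" after the domain, or at the end of the string.
    end = url.find("/", start)
    if end == -1:
        end = len(url)

    return url[start:end]


# Hypothetical sample URLs, just to show the behaviour.
for sample in [
    "https://www.example.com/blog/post",
    "http://example.org",
    "https://shop.example.co.uk/item?id=1",
]:
    print(sample, "->", extract_domain(sample))
```

Note that, like the formula, this keeps subdomains such as ‘shop.’ and only strips ‘www.’.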
Method 2: Find & Replace with a wildcard (my favourite method)
Using a wildcard in conjunction with Find & Replace is a game changer! With just a couple of quick replaces, your URLs will be tidied, without needing to remember or dig out a dirty long formula like the one above.
1) Add a second column next to your backlink data, then copy and paste the first column into the second so you’ve got two identical columns.
2) Highlight the second column, then press CTRL + F (Windows) or CMD + F (Mac), then click ‘Replace…’
3) Here’s the trial-and-error part: you’ll want to first remove the protocol at the front of the URL, and then the path after the domain. Doing it in that order matters, and it tidies up the common URL scenarios (‘http’, ‘https’ and ‘www.’) in 3 simple replaces:
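To see what replaces like these do to a URL, here is a rough Python sketch. The three patterns are an assumption, a common sequence for this kind of clean-up rather than a confirmed recipe: find `*//` and replace with nothing, find `/*` and replace with nothing, then find `www.` and replace with nothing. The function name `find_replace_domain` is made up for illustration, and the regular expressions only approximate Excel’s wildcard matching.

```python
import re


def find_replace_domain(url: str) -> str:
    """Mimic three Find & Replace passes (the patterns are assumed, not confirmed)."""
    # Pass 1: find "*//", replace with nothing. Drops "http://" or "https://".
    no_protocol = re.sub(r"^.*?//", "", url)

    # Pass 2: find "/*", replace with nothing. Drops the path after the domain.
    # This has to run after pass 1, otherwise it would match the "//" in the protocol.
    no_path = re.sub(r"/.*$", "", no_protocol)

    # Pass 3: find "www.", replace with nothing. Here only a leading "www." is removed.
    return re.sub(r"^www\.", "", no_path)


# Hypothetical sample URLs, just to show the behaviour.
for sample in [
    "https://www.example.com/blog/post",
    "http://example.org",
    "https://sub.example.co.uk/a?b=c",
]:
    print(sample, "->", find_replace_domain(sample))
```

The order is the important design choice: running the `/*` pass first would wipe out everything after ‘https:’, which is why the protocol goes first.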
Done! You’ve successfully extracted domains from your list of URLs. Simple, right?
Should you want to get really crafty, check out our article on how to extract domains using Python for a clear and easy guide.
Want to learn more? Keep an eye on our blog, where we’ll be sharing more helpful Excel tips, or follow us on Twitter.