Combine Keyword Lists with Python

Keyword research is a common task for an SEO manager, and it often involves combining keywords from several sources in different ways. With fewer than 100 keywords, the task is easy to manage in a text or spreadsheet editor of your choice. With thousands of entries, it can be useful to bring some programming into the process.

The Python programming language is well suited to text manipulation tasks. In this blog post I will show how it can make the life of an SEO manager easier. I assume that the reader already has some Python programming skills. If you are a complete newbie, work through a 'getting started with Python' tutorial first; there are plenty available online.

In this post I will cover the following typical keyword list manipulation tasks:

  • Append one list of keywords to another;
  • Generate permutations of each keyword phrase in a list;
  • Append keywords from one list to another element-wise;
  • Combine each keyword from one list with each keyword from another list (loosely a dot product; strictly, the Cartesian product).

Read Keywords from Files

First, let us get the list of files that contain your keywords. The function get_file_paths takes two inputs: the path to a directory containing the keyword list files (e.g. /home/keyword_lists) and the file extension (e.g. txt). It returns a generator over the paths of all files in that directory that match the extension.


    from pathlib import Path
    from typing import Iterable

    def get_file_paths(directory: str, file_extension: str) -> Iterable[Path]:
        # glob() lazily yields a Path object for every matching file.
        p = Path(directory)
        return p.glob(f'*.{file_extension}')

Then we need a function that reads a file into a Python list.

If you are unsure how to do this, use the snippet below. Pass it the path to a file with one keyword per line; it returns the keywords as a list of lowercase strings with surrounding whitespace removed:


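    from pathlib import Path
    from typing import List, Union

    def file_reader(filepath: Union[str, Path]) -> List[str]:
        # Strip surrounding whitespace and lowercase every line.
        with open(filepath, 'r') as file_in:
            return [line.strip().lower() for line in file_in]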

Putting it together:


    keyword_lists = [file_reader(x) for x in get_file_paths('/home/keyword_lists', 'txt')]

keyword_lists is now a list of lists: one inner list of keywords per file. In this form the keywords are easy to manipulate with Python.
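
To see the reading step end to end without touching real files, here is a minimal sketch that writes a few sample files to a temporary directory and reads them back. The file names and keywords are made up for illustration, and sorting the paths is my own addition so that the order of the result is predictable; the function above returns the files in whatever order the file system provides.

```python
import tempfile
from pathlib import Path
from typing import List


def get_file_paths(directory: str, file_extension: str) -> List[Path]:
    # Sorted (an assumption of this sketch) so the file order is predictable.
    return sorted(Path(directory).glob(f'*.{file_extension}'))


def file_reader(filepath: Path) -> List[str]:
    with open(filepath, 'r') as file_in:
        return [line.strip().lower() for line in file_in]


# Hypothetical sample data, written to a throwaway directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    Path(tmp_dir, 'brands.txt').write_text('Sony\nSamsung\n')
    Path(tmp_dir, 'generic.txt').write_text('TV review\nTV buying guide\n')
    Path(tmp_dir, 'notes.csv').write_text('ignored\n')  # extension does not match

    keyword_lists = [file_reader(x) for x in get_file_paths(tmp_dir, 'txt')]

print(keyword_lists)
# [['sony', 'samsung'], ['tv review', 'tv buying guide']]
```

Only the two .txt files are picked up, and every keyword comes back lowercased.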

Append One List of Keywords to Another

To merge the lists and get a single file containing all the keywords from multiple files, you can use the chain() function from the itertools package together with the write_list_to_file function below. The output is an out.txt file with one keyword per line. The list is then ready to import into an SEO tool of your choice for further processing.


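    from itertools import chain
    from typing import Iterable

    def write_list_to_file(filepath: str, in_list: Iterable[str]) -> None:
        # Write one keyword per line.
        with open(filepath, 'w') as file_out:
            for line in in_list:
                file_out.write(f'{line}\n')

    # keyword_lists is the list of lists built in the previous section.
    write_list_to_file('out.txt', chain(*keyword_lists))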
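
As a quick check of the merge step, the sketch below runs it on two small in-memory lists and reads the output file back. The sample keywords and the temporary output location are assumptions made for the example; in practice you would pass your own keyword_lists and output path.

```python
import tempfile
from itertools import chain
from pathlib import Path
from typing import Iterable


def write_list_to_file(filepath, in_list: Iterable[str]) -> None:
    with open(filepath, 'w') as file_out:
        for line in in_list:
            file_out.write(f'{line}\n')


# Hypothetical sample data standing in for the lists read from files.
keyword_lists = [['sony', 'samsung'], ['tv review', 'tv buying guide']]

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = Path(tmp_dir, 'out.txt')
    # chain() walks the first list, then the second, without copying them.
    write_list_to_file(out_path, chain(*keyword_lists))
    written = out_path.read_text().splitlines()

print(written)
# ['sony', 'samsung', 'tv review', 'tv buying guide']
```

The keywords keep the order of the input lists, and duplicates are not removed.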

Generate Permutations of Each Keyword Phrase in a List

The permutations of a keyword phrase are all the phrases you get by rearranging the order of its words. For example, if you have the seed keyword phrase 'online seo software', then its permutations are: 'online seo software', 'online software seo', 'seo online software', 'seo software online', 'software online seo', 'software seo online'.

Let us code a simple function that takes a list of keyword phrases as input and returns an iterable over all their permutations.


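    from itertools import chain, permutations
    from typing import Iterable, List

    def generate_permutations(input_list: List[str]) -> Iterable[str]:
        # Split each phrase into words.
        split_phrases = (x.split() for x in input_list)
        # Build every ordering of the words in each phrase.
        permuted = map(permutations, split_phrases)
        # Join each ordering back into a single string.
        back_to_strings = map(lambda i: (' '.join(j) for j in i), permuted)
        return chain.from_iterable(back_to_strings)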
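
Here is a short usage sketch for the seed phrase from the example above. The function body is a compact restatement of the same idea, written so that the snippet runs on its own. Keep in mind that a phrase of n words has n! permutations, so long phrases produce very large outputs.

```python
from itertools import chain, permutations
from typing import Iterable, List


def generate_permutations(input_list: List[str]) -> Iterable[str]:
    # For every phrase, yield each ordering of its words as a string.
    return chain.from_iterable(
        (' '.join(words) for words in permutations(phrase.split()))
        for phrase in input_list
    )


result = list(generate_permutations(['online seo software']))
for phrase in result:
    print(phrase)
# online seo software
# online software seo
# seo online software
# seo software online
# software online seo
# software seo online

# Four words already give 4! = 24 phrases.
four_word_count = len(list(generate_permutations(['best online seo software'])))
print(four_word_count)
# 24
```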

Append Keywords from One List to Another Element-wise

Imagine that you have two or more columns of keywords and want to merge the elements of each row. The function below takes a list of lists as input and returns an iterator over the merged rows, with the elements of each row joined by a space. Note that the input lists should all have the same length; otherwise the output stops at the end of the shortest list and the extra elements of the longer lists are dropped.


    
    from typing import Iterable, List

    def merge_list_row_wise(input_list: List[List[str]]) -> Iterable[str]:
        return map(' '.join, zip(*input_list))  # zip() pairs up the rows
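
The sketch below shows the row-wise merge on made-up sample columns, including what happens when one column is shorter than the others. The column names and values are assumptions for illustration only.

```python
from typing import Iterable, List


def merge_list_row_wise(input_list: List[List[str]]) -> Iterable[str]:
    return map(' '.join, zip(*input_list))


# Hypothetical sample columns.
brands = ['sony', 'samsung', 'lg']
products = ['tv', 'soundbar', 'monitor']
intents = ['review', 'price']  # one entry short on purpose

same_length = list(merge_list_row_wise([brands, products]))
print(same_length)
# ['sony tv', 'samsung soundbar', 'lg monitor']

# zip() stops at the shortest input, so the third row is dropped here.
truncated = list(merge_list_row_wise([brands, products, intents]))
print(truncated)
# ['sony tv review', 'samsung soundbar price']
```

If silently losing rows is a concern, comparing the lengths of the input lists before merging is a cheap safeguard.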

Dot Product of Two Lists

This is a typical problem when working with keywords for an e-commerce website. You often have a list of phrases that apply to a generic product and a list of brand names that combine well with such phrases. For example, 'TV review' and 'TV buying guide' can be combined with the brand name 'Sony' to get more specific keywords: 'Sony TV review' and 'Sony TV buying guide'. Each element of one list is combined with each element of the other; strictly speaking this is the Cartesian product of the two lists, which I loosely call a dot product here. The function below takes two iterables as arguments and yields every combination, with each element of the first followed by each element of the second.


    
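    from typing import Iterable

    def dot_product_strings(a: Iterable[str], b: Iterable[str]) -> Iterable[str]:
        # Materialize b so it can be looped over once per element of a,
        # even when a one-shot iterator such as a generator is passed in.
        b = list(b)
        for token_a in a:
            for token_b in b:
                yield token_a + ' ' + token_b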
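
The same every-with-every pairing can also be sketched with itertools.product from the standard library. This is an alternative to the function above, not a replacement for it; the name combine_all and the sample data are hypothetical and chosen only for this example.

```python
from itertools import product
from typing import Iterable, Iterator


def combine_all(a: Iterable[str], b: Iterable[str]) -> Iterator[str]:
    # product() reads both inputs fully before pairing them,
    # so one-shot iterators such as generators are safe to pass.
    return (f'{x} {y}' for x, y in product(a, b))


# Hypothetical sample data.
brands = ['Sony', 'Samsung']
phrases = ['TV review', 'TV buying guide']

combined = list(combine_all(brands, phrases))
for keyword in combined:
    print(keyword)
# Sony TV review
# Sony TV buying guide
# Samsung TV review
# Samsung TV buying guide
```

The output has len(a) * len(b) entries, so two lists of a thousand keywords each already produce a million combinations.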

I hope these small snippets will help you speed up your keyword research efforts.