curtmerrill.com

Extracting e-mails from a CSV

A while ago I needed to extract e-mail addresses from a CSV file, but the addresses weren’t isolated in a specific field. I wrote a short Python script that Automator triggers. Now I can drop a CSV file onto the Automator “app” and it adds a new column to the file containing just the e-mail address.

The script is crude, but it works well enough for me. It only finds the first address in a row. The regular expression is simple, too, so it will likely miss a few addresses or grab false positives. It also overwrites the source file, which would be a bad idea in a more serious workflow but is fine for my current purpose:

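import sys
import re
import csv

# Deliberately loose pattern. The raw string keeps \S and \. from being
# read as (invalid) string escapes.
EMAIL = re.compile(r'[^-/][a-zA-Z0-9]\S+@\S+\.[a-zA-Z0-9]+')

for f in sys.argv[1:]:
    rows = []
    # newline='' (Python 3) lets the csv module handle line endings itself.
    with open(f, newline='') as csv_file:
        reader = csv.reader(csv_file, delimiter=',')
        for row in reader:
            # Search the whole row at once, since the address can be in any field.
            text = ', '.join(row)
            search_result = EMAIL.search(text)
            if search_result:
                # strip() drops a leading space matched by the first character class.
                row.append(search_result.group(0).strip())
            else:
                row.append(' ')
            rows.append(row)
    # Overwrite the source file with the extended rows.
    with open(f, 'w', newline='') as csv_file:
        writer = csv.writer(csv_file, delimiter=',')
        writer.writerows(rows)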
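
Two of the limitations mentioned above, finding only the first address and overwriting the source file, could be eased along the following lines. This is a minimal sketch assuming Python 3, not the script described in this post. The names `EMAIL_ALL`, `find_emails`, and `add_email_column`, and the `_emails` filename suffix, are hypothetical. The pattern is a common simplified one, not the regex above and not a full RFC 5322 matcher, so it can still miss addresses or produce false positives.

```python
import csv
import re
from pathlib import Path

# Assumption: a simplified address pattern chosen for illustration only.
EMAIL_ALL = re.compile(r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}')


def find_emails(row):
    """Return every address-like string in a row, in order of appearance."""
    return EMAIL_ALL.findall(', '.join(row))


def add_email_column(source, suffix='_emails'):
    """Write a copy of `source` with one extra column; the original is untouched."""
    source = Path(source)
    # e.g. people.csv -> people_emails.csv, in the same directory
    target = source.with_name(source.stem + suffix + source.suffix)
    with open(source, newline='') as src, open(target, 'w', newline='') as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            # Join multiple hits with ';' so the new column stays a single field.
            writer.writerow(row + [';'.join(find_emails(row))])
    return target
```

Under these assumptions, calling `add_email_column('people.csv')` would leave `people.csv` alone and write `people_emails.csv` next to it. Rows with no match get an empty last field.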