Extracting e-mails from a CSV
A while ago I needed to extract e-mail addresses from a CSV file, but the addresses weren’t isolated in a specific field. I wrote a short Python script that Automator triggers. Now I can drop a CSV file onto the Automator “app” and it appends a new column to each row containing just the e-mail address found in that row.
The script is crude, but it works well enough for me. It only finds the first address in a row; rows without a match get a single space in the new column. The regular expression is simple, too, so it will likely miss some addresses and grab a few false positives. It also overwrites the source file, which would not be acceptable in a more serious workflow, but it is fine for my current purpose:
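import sys
import re
import csv

# Raw string so \S and \. reach the regex engine unchanged.
# The leading [^-/] consumes one extra character (often a space),
# which is why the match is stripped below.
EMAIL = re.compile(r'[^-/][a-zA-Z0-9]\S+@\S+\.[a-zA-Z0-9]+')

for f in sys.argv[1:]:
    emails = []
    # newline='' is what the csv module expects on Python 3
    with open(f, newline='') as csv_file:
        reader = csv.reader(csv_file, delimiter=',')
        for row in reader:
            # Search the whole row at once; only the first match is kept
            text = ', '.join(row)
            search_result = EMAIL.search(text)
            if search_result:
                row.append(search_result.group(0).strip())
            else:
                row.append(' ')
            emails.append(row)
    # Overwrite the source file with the extended rows
    with open(f, 'w', newline='') as csv_file:
        writer = csv.writer(csv_file, delimiter=',')
        for e in emails:
            writer.writerow(e)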
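
Two of the limitations mentioned above (only the first address per row is kept, and the source file is overwritten) could be relaxed along the following lines. This is only a sketch, not the script I actually use: the function name `extract_all`, the `-emails` output suffix, and the `; ` separator are all made up for illustration, and the pattern is a slightly tightened variant of the one above (no leading `[^-/]`, and commas excluded so addresses don't swallow neighbouring fields). It is still far from a complete e-mail matcher.

```python
import csv
import re
from pathlib import Path

# Assumed variant of the pattern above: no leading guard character,
# and commas, whitespace and extra "@" signs end an address.
EMAIL = re.compile(r'[a-zA-Z0-9][^\s@,]*@[^\s@,]+\.[a-zA-Z0-9]+')


def extract_all(src, dst=None):
    """Copy src to dst, appending a column with every address found in each row."""
    src = Path(src)
    # Hypothetical naming scheme: contacts.csv -> contacts-emails.csv
    dst = Path(dst) if dst else src.with_name(src.stem + '-emails' + src.suffix)
    with open(src, newline='') as infile, open(dst, 'w', newline='') as outfile:
        writer = csv.writer(outfile)
        for row in csv.reader(infile):
            # findall returns every non-overlapping match, not just the first
            found = EMAIL.findall(', '.join(row))
            writer.writerow(row + ['; '.join(found)])
    return dst
```

Calling `extract_all('contacts.csv')` would leave `contacts.csv` untouched and write the extended rows to `contacts-emails.csv` next to it; rows with no address get an empty last column.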