In this post I’ll provide some code for parsing an .eml file and extracting images. I was able to perform the parsing with the help of a great blog post I found here. Turning the blocks of ASCII letters back into JPEGs and PNGs took some work.
Data
You can get sample data by going to your email and finding out how to view the original text. In Gmail, you can expand the menu next to the reply button, which looks like a swoopy left arrow, and select Show Original or something similar. This should open a new browser tab with a bunch of ASCII text. Copy this and save it as some.eml or something.
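If you would rather not use a real message, here is a minimal sketch that builds a sample .eml with only the standard library. The helper names (png_chunk, tiny_png, build_sample_eml), the addresses and the file name pixel.png are all made up for illustration; the attachment is a 1x1 red PNG assembled by hand so that no imaging library is needed.

```python
import struct
import zlib
from email.message import EmailMessage


def png_chunk(tag, data):
    """Pack one PNG chunk: length, tag, data, CRC (hypothetical helper)."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def tiny_png():
    """Return the bytes of a 1x1 red PNG (8-bit RGB, no interlacing)."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)
    scanline = b"\x00" + b"\xff\x00\x00"  # filter byte, then one red pixel
    return (
        b"\x89PNG\r\n\x1a\n"
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"IDAT", zlib.compress(scanline))
        + png_chunk(b"IEND", b"")
    )


def build_sample_eml(image_bytes, filename="pixel.png"):
    """Return the raw text of an email carrying one image attachment."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"      # placeholder address
    msg["To"] = "recipient@example.com"     # placeholder address
    msg["Subject"] = "Sample with an image"
    msg.set_content("See the attached image.")
    # Binary attachments are base64-encoded, which produces the
    # block of ASCII letters you see in the original text of an email
    msg.add_attachment(image_bytes, maintype="image", subtype="png",
                       filename=filename)
    return msg.as_string()


sample_text = build_sample_eml(tiny_png())
```

Writing sample_text to some.eml with open("some.eml", "w") should give you a file you can use in the rest of the post.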
Code
Go to the link in the introduction, scroll down and save parsemail.py in whatever directory you saved some.eml. Open a Python interpreter or IPython notebook from this directory and start with the following:
    import email
    import io
    from PIL import Image
    from parsemail import get_mail_contents
The email module parses the text of the .eml file into a message object. The get_mail_contents() function from the parsemail module does what you’d expect. The io module wraps the payload bytes in a file-like object that can be read like an open file. The Image module is used to convert the byte data to an actual image.
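To make the "ASCII letters back into bytes" step concrete, here is a small sketch using only the standard library. The string below is the base64 form of the 8-byte signature that starts every PNG file; it stands in for a real attachment body, which would be much longer.

```python
import base64
import io

# Base64 text for the first 8 bytes of a PNG file
encoded = "iVBORw0KGgo="

# Turn the ASCII letters back into raw bytes
decoded = base64.b64decode(encoded)

# Wrap the bytes so they can be read like an open file
fh = io.BytesIO(decoded)
signature = fh.read(8)

print(signature)  # b'\x89PNG\r\n\x1a\n'
```

A file-like object of this kind is what Image.open() accepts in place of a file name.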
Note: PIL, the Python Imaging Library, has been discontinued, so you should install the pillow package with pip instead. The import looks the same though: from PIL import Image works as usual.
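    em = "some.eml"

    # Parse the raw text of the .eml file into a message object
    with open(em, "r") as f:
        msg = email.message_from_string(f.read())

    contents = get_mail_contents(msg)
    for c in contents:
        # Only items with a filename are treated as attachments
        if c.filename is not None:
            atype, afmt = c.type.split('/')
            if atype == 'image':
                # Wrap the payload bytes in a file-like object for PIL
                fh = io.BytesIO(c.payload)
                im = Image.open(fh)
                im.save(c.filename)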
This will save the images with their original names and extensions in your working directory, like magic.
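If you cannot get hold of parsemail.py, the same idea can be sketched with the standard library alone. This is not the method from the linked post, just an alternative under a few assumptions: save_image_attachments is a made-up name, the decoded bytes are written straight to disk instead of going through PIL, and only parts that have both a filename and an image/* content type are saved.

```python
import email
import os


def save_image_attachments(raw_bytes, out_dir="."):
    """Write each named image part to out_dir and return the saved paths."""
    msg = email.message_from_bytes(raw_bytes)
    saved = []
    # walk() visits every part of a multipart message, nested ones included
    for part in msg.walk():
        filename = part.get_filename()
        if filename is None or part.get_content_maintype() != "image":
            continue
        # decode=True undoes the base64 transfer encoding and returns bytes
        data = part.get_payload(decode=True)
        # basename() drops any directory components in the attachment name
        path = os.path.join(out_dir, os.path.basename(filename))
        with open(path, "wb") as fh:
            fh.write(data)
        saved.append(path)
    return saved
```

To try it, read the file in binary mode and pass the bytes in, for example save_image_attachments(open("some.eml", "rb").read()). Because the bytes are written unchanged, the files keep whatever format the sender attached.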