In this post I’ll provide some code for parsing an .eml file and extracting images. I was able to perform the parsing with the help of a great blog post I found here. Turning the blocks of ASCII letters back into JPEGs and PNGs took some work.
You can get sample data by opening one of your emails and viewing its original text. In Gmail, expand the menu next to the reply button (the one that looks like a swoopy left arrow) and select Show original. This should open a new browser tab with a bunch of ASCII text. Copy this and save it as some.eml or something.
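Before going further, it can help to check that the text you saved really parses as an email. Here is a minimal sketch using only the standard library email module. The list_parts() helper and the hand-written sample message are made up for illustration and are not part of parsemail.py.

```python
import email


def list_parts(raw_text):
    """Return (content type, filename) for every part of a raw email string."""
    msg = email.message_from_string(raw_text)
    # walk() visits the message itself and then every nested part
    return [(part.get_content_type(), part.get_filename()) for part in msg.walk()]


# Tiny hand-written stand-in for the text copied out of the browser tab.
sample = (
    "From: a@example.com\n"
    "To: b@example.com\n"
    "Subject: hello\n"
    "Content-Type: text/plain\n"
    "\n"
    "Just a test.\n"
)

parts = list_parts(sample)
print(parts)
```

To try it on your own file, pass in the file's text instead, for example list_parts(open("some.eml").read()). An email with pictures attached should show at least one content type starting with image/.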
Go to the link in the introduction, scroll down and save parsemail.py in whatever directory you saved some.eml in. Open a Python interpreter or IPython notebook from this directory and start with the following:
import email
import io
from PIL import Image
from parsemail import get_mail_contents
The email module parses the text of the .eml file into a message object. The get_mail_contents() function from the parsemail module does what you’d expect. The io module wraps the payload bytes in a file-like object. The Image module is used to convert the byte data to an actual image.
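To make the "blocks of ASCII letters" part less mysterious, here is a small sketch of what happens to an attachment's bytes, using only the standard library. It assumes the attachment is base64 encoded, which is the usual case for images but is not guaranteed for every email. The PNG signature stands in for a real image.

```python
import base64
import io

# The first 8 bytes of every PNG file (the PNG signature).
png_signature = b"\x89PNG\r\n\x1a\n"

# Inside the .eml file, attachment bytes are stored as base64 text like this.
ascii_block = base64.b64encode(png_signature).decode("ascii")

# Decoding gives the raw bytes back; BytesIO makes them look like an open file.
raw_bytes = base64.b64decode(ascii_block)
fh = io.BytesIO(raw_bytes)
header = fh.read(8)

print(ascii_block)
print(header == png_signature)
```

A file-like object such as fh is what Image.open() accepts in place of a filename, which is why the payload gets wrapped in io.BytesIO before being handed to Pillow.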
Note: PIL, the Python Imaging Library, has been discontinued, so you should pip-install the pillow package instead. Importing looks the same though: just do from PIL import Image as usual.
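em = "some.eml"

# Read the raw message text and parse it into a message object
with open(em, "r") as f:
    msg = email.message_from_string(f.read())

attachments = get_mail_contents(msg)
for c in attachments:
    # Skip parts without a filename, such as the message body
    if c.filename is not None:
        atype, afmt = c.type.split('/')
        if atype == 'image':
            fh = io.BytesIO(c.payload)  # file-like wrapper around the image bytes
            im = Image.open(fh)
            im.save(c.filename)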
This will save the images with their original names and extensions in your working directory, like magic.
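If you would rather not depend on parsemail.py or Pillow, roughly the same result can be had with the standard library alone. The sketch below is a different technique from the code above: it writes the decoded attachment bytes straight to disk instead of opening and re-saving them as images. The save_images() helper, the fake image bytes and the temporary output directory are all made up for illustration.

```python
import email
import os
import tempfile
from email.message import EmailMessage


def save_images(msg, out_dir):
    """Write every image part that has a filename into out_dir; return the paths."""
    saved = []
    for part in msg.walk():
        filename = part.get_filename()
        if part.get_content_maintype() == "image" and filename:
            # basename() guards against path components in the attachment name
            path = os.path.join(out_dir, os.path.basename(filename))
            with open(path, "wb") as out:
                # decode=True undoes the base64 transfer encoding
                out.write(part.get_payload(decode=True))
            saved.append(path)
    return saved


# Build a small message in memory so the sketch runs without some.eml.
fake_png = b"\x89PNG\r\n\x1a\n" + b"not a real image"
demo = EmailMessage()
demo["Subject"] = "demo"
demo.set_content("See attached.")
demo.add_attachment(fake_png, maintype="image", subtype="png", filename="pic.png")

# Round-trip through text, as if it had been read from an .eml file.
msg = email.message_from_string(demo.as_string())
out_dir = tempfile.mkdtemp()
saved = save_images(msg, out_dir)
print(saved)
```

For a real email, replace the demo message with email.message_from_string(open("some.eml").read()) and pass whatever output directory you like. Going through Pillow, as in the code above, has the side benefit of failing loudly if the bytes are not actually a valid image.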