
This is the second part of the article series ‘Playing With Python And Gmail’. If you haven’t read the first part, I recommend reading it first.
This time we will see how to fetch mails from Gmail using Python.
Reading Mails
The IMAP4.fetch(message_set, message_parts) method fetches (parts of) messages. message_parts should be a string of message part names enclosed within parentheses, e.g. “(UID BODY[TEXT])”. The returned data are tuples of message part envelope and data.
Here is a minimal example (without error checking) that opens a mailbox and retrieves and prints all messages:
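    import imaplib

    # Gmail serves IMAP over SSL on port 993, so IMAP4_SSL is needed
    M = imaplib.IMAP4_SSL('imap.gmail.com', 993)
    M.login('myname@gmail.com', 'pa$$word')
    M.select()  # selects INBOX by default
    typ, data = M.search(None, 'ALL')
    for num in data[0].split():
        typ, data = M.fetch(num, '(RFC822)')
        # data[0] is an (envelope, raw message bytes) tuple
        print('Message %s\n%s\n' % (num.decode(), data[0][1].decode('utf-8', 'replace')))
    M.close()
    M.logout()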
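The example above skips error checking. As a rough sketch of what the checks could look like (written for Python 3), the hypothetical helper below takes a connection that is assumed to be logged in already, verifies the status of every call and yields the raw bytes of each message. The name fetch_raw_messages and the choice to open the mailbox read-only are my own assumptions, not part of imaplib.

```python
def fetch_raw_messages(conn, mailbox='INBOX'):
    """Yield (message number, raw RFC822 bytes) for every message in mailbox.

    conn is assumed to be an already logged-in imaplib.IMAP4_SSL object.
    """
    # readonly=True opens the mailbox without changing message flags
    typ, data = conn.select(mailbox, readonly=True)
    if typ != 'OK':
        raise RuntimeError('select failed: %r' % (data,))

    typ, data = conn.search(None, 'ALL')
    if typ != 'OK':
        raise RuntimeError('search failed: %r' % (data,))

    for num in data[0].split():
        typ, parts = conn.fetch(num, '(RFC822)')
        if typ != 'OK':
            raise RuntimeError('fetch failed for message %r' % (num,))
        # Tuples are (envelope, payload); other items are plain bytes
        # such as the closing b')'.
        for part in parts:
            if isinstance(part, tuple):
                yield num.decode(), part[1]


# Possible usage (needs a real account, so it is left commented out):
#   import imaplib
#   conn = imaplib.IMAP4_SSL('imap.gmail.com', 993)
#   conn.login('myname@gmail.com', 'pa$$word')
#   for num, raw in fetch_raw_messages(conn):
#       print(num, len(raw))
#   conn.logout()
```

Because the helper only relies on select, search and fetch, it can be tried out against a small stand-in object before pointing it at a real mailbox.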
The email package provides a standard parser that understands most email document structures, including MIME documents. You can pass the parser a string or a file object, and the parser will return to you the root Message instance of the object structure. For simple, non-MIME messages the payload of this root object will likely be a string containing the text of the message. For MIME messages, the root object will return True from its is_multipart() method, and the subparts can be accessed via the get_payload() and walk() methods.
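To see the difference between the two cases without touching a mailbox, here is a small sketch (Python 3) that parses two hand-written messages. The addresses and the boundary string are made up for illustration.

```python
import email

# A simple, non-MIME message: headers, blank line, text
SIMPLE = (
    'From: alice@example.com\n'
    'To: bob@example.com\n'
    'Subject: Hello\n'
    '\n'
    'Just plain text.\n'
)

# A MIME message carrying the same content in two forms
MULTIPART = (
    'From: alice@example.com\n'
    'To: bob@example.com\n'
    'Subject: Hello again\n'
    'MIME-Version: 1.0\n'
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    '\n'
    '--XYZ\n'
    'Content-Type: text/plain\n'
    '\n'
    'plain version\n'
    '--XYZ\n'
    'Content-Type: text/html\n'
    '\n'
    '<p>html version</p>\n'
    '--XYZ--\n'
)

simple_msg = email.message_from_string(SIMPLE)
multi_msg = email.message_from_string(MULTIPART)

print(simple_msg.is_multipart())        # False
print(repr(simple_msg.get_payload()))   # the message text as a string

print(multi_msg.is_multipart())         # True
subparts = multi_msg.get_payload()      # a list of Message objects
print(len(subparts), type(subparts[0]).__name__)
```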
Extract Mail Headers
Here is a way to retrieve the From, To and Subject headers from an email message:
    from email.parser import BytesHeaderParser

    resp, data = M.fetch('1', '(RFC822)')
    # data[0][1] holds the raw message as bytes
    msg = BytesHeaderParser().parsebytes(data[0][1])
    print(msg['From'])
    print(msg['To'])
    print(msg['Subject'])
    M.logout()
The output will be something like:
    Gmail Team
    My Name
    Gmail is different. Here's what you need to know.
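Header values usually arrive as "Name <address>" pairs, and subjects containing non-ASCII characters arrive MIME-encoded. The sketch below (Python 3) shows one way to tidy them up with email.utils.parseaddr and email.header.decode_header. The sample message is invented for the example.

```python
from email.header import decode_header, make_header
from email.parser import HeaderParser
from email.utils import parseaddr

RAW = (
    'From: Example Sender <sender@example.com>\n'
    'To: My Name <myname@example.com>\n'
    'Subject: =?utf-8?q?Caf=C3=A9_menu?=\n'
    '\n'
    'HeaderParser leaves this body unparsed.\n'
)

msg = HeaderParser().parsestr(RAW)

# Split "Name <address>" into its two halves
name, address = parseaddr(msg['From'])

# Turn the encoded-word form back into readable text
subject = str(make_header(decode_header(msg['Subject'])))

print(name)
print(address)
print(subject)
```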
Identifying the content type
The Content-Type header indicates the Internet media type of the message content, consisting of a type and a subtype. For example, text/plain is the default value of “Content-Type:”.
Gmail sends messages as alternative content: the same message in both plain text and another format such as HTML (multipart/alternative with the same content in text/plain and text/html forms).
    import email

    resp, data = M.fetch('1', '(RFC822)')
    mail = email.message_from_bytes(data[0][1])
    for part in mail.walk():
        print('Content-Type:', part.get_content_type())
        print('Main Content:', part.get_content_maintype())
        print('Sub Content:', part.get_content_subtype())
The output will be:
    Content-Type: multipart/alternative
    Main Content: multipart
    Sub Content: alternative
    Content-Type: text/plain
    Main Content: text
    Sub Content: plain
    Content-Type: text/html
    Main Content: text
    Sub Content: html
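If you want to reproduce this structure without a Gmail account, a multipart/alternative message can be built locally with the email.mime classes and walked the same way. This is only a sketch in Python 3: the message text and the helper name describe are made up for the example.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Build a message that carries the same content as plain text and HTML
mail = MIMEMultipart('alternative')
mail['Subject'] = 'Demo'
mail.attach(MIMEText('Hello in plain text', 'plain'))
mail.attach(MIMEText('<p>Hello in HTML</p>', 'html'))


def describe(message):
    """Return (content type, main type, subtype) for every part."""
    return [
        (part.get_content_type(),
         part.get_content_maintype(),
         part.get_content_subtype())
        for part in message.walk()
    ]


for content_type, main, sub in describe(mail):
    print('Content-Type:', content_type)
    print('Main Content:', main)
    print('Sub Content:', sub)
```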
Extract Message Body
Using the walk() method we can iterate through Message parts. The get_payload() method will return the current payload, which will be a list of Message objects when is_multipart() is True, or a string when is_multipart() is False.
    import email

    resp, data = M.fetch('1', '(RFC822)')
    mail = email.message_from_bytes(data[0][1])
    for part in mail.walk():
        # multipart parts are just containers, so we skip them
        if part.get_content_maintype() == 'multipart':
            continue
        # we are interested only in the plain text parts
        if part.get_content_subtype() != 'plain':
            continue
        payload = part.get_payload()
        print(payload)
    M.logout()
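One thing to watch out for: get_payload() without arguments returns the part as it was transferred, which may still be base64 or quoted-printable encoded. A possible way around that (Python 3 sketch) is to ask for the decoded bytes and then apply the charset declared by the part. The helper name get_plain_text and the fallback to UTF-8 are my own assumptions.

```python
import email
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def get_plain_text(message):
    """Return the first text/plain part of message as a str, or None."""
    for part in message.walk():
        if part.get_content_type() != 'text/plain':
            continue
        # decode=True undoes the Content-Transfer-Encoding and gives bytes
        raw = part.get_payload(decode=True)
        # assumption: fall back to UTF-8 when no charset is declared
        charset = part.get_content_charset() or 'utf-8'
        return raw.decode(charset, errors='replace')
    return None


# Build a sample message locally; MIMEText base64-encodes UTF-8 text
outgoing = MIMEMultipart('alternative')
outgoing.attach(MIMEText('Caf\u00e9 at noon?', 'plain', 'utf-8'))
outgoing.attach(MIMEText('<p>Caf\u00e9 at noon?</p>', 'html', 'utf-8'))

# Round-trip through bytes, as if the message had come from fetch()
mail = email.message_from_bytes(outgoing.as_bytes())
body = get_plain_text(mail)
print(body)
```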
Extracting Attachments
The code below extracts attached images and saves them to disk.
    import os

    counter = 1  # numbers images that carry no file name
    for part in mail.walk():
        if part.get_content_maintype() != 'image':
            continue
        file_type = part.get_content_subtype() or 'jpg'
        # get_filename() reads the Content-Disposition filename and falls
        # back to the name parameter of Content-Type
        filename = part.get_filename()
        if not filename:
            filename = 'img-%03d.%s' % (counter, file_type)
            counter += 1
        if not os.path.isfile(filename):
            # finally write the stuff
            with open(filename, 'wb') as fp:
                fp.write(part.get_payload(decode=True))
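The same idea can be wrapped into a function that writes into a directory of your choice and can be tried out on a locally built message. This is a sketch in Python 3, not a hardened implementation: the name save_images, the fake image bytes and the decision to strip any path from the attachment name are assumptions made for the example.

```python
import os
import tempfile
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def save_images(message, directory):
    """Write every image part of message into directory; return the paths."""
    saved = []
    counter = 1
    for part in message.walk():
        if part.get_content_maintype() != 'image':
            continue
        filename = part.get_filename()
        if filename:
            # keep only the last path component of the sender's file name
            filename = os.path.basename(filename)
        if not filename:
            filename = 'img-%03d.%s' % (counter, part.get_content_subtype())
            counter += 1
        path = os.path.join(directory, filename)
        if os.path.exists(path):
            continue  # do not overwrite existing files
        with open(path, 'wb') as fp:
            fp.write(part.get_payload(decode=True))
        saved.append(path)
    return saved


# A demo message with one named and one unnamed "image" (fake bytes)
demo = MIMEMultipart()
demo.attach(MIMEText('see attached', 'plain'))
named = MIMEImage(b'not really a PNG', _subtype='png')
named.add_header('Content-Disposition', 'attachment', filename='logo.png')
demo.attach(named)
demo.attach(MIMEImage(b'also fake', _subtype='gif'))

with tempfile.TemporaryDirectory() as tmp:
    for path in save_images(demo, tmp):
        print(os.path.basename(path))
```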
That’s it. In the next part I will explain searching and moving your mails using Python.