Playing With Python And Gmail – Part 2

August 19, 2010 at 2:55 pm | PYTHON | 2 comments

This is the second part of the article series 'Playing With Python And Gmail'. If you haven't read the first part, I recommend reading it first.

This time we will see how to fetch mails from Gmail using Python.

Reading Mails

The IMAP4.fetch method fetches (parts of) messages. Its message_parts argument should be a string of message part names enclosed in parentheses, e.g. "(UID BODY[TEXT])". The returned data are tuples of message part envelope and data.

Here is a minimal example (without error checking) that opens a mailbox and retrieves and prints all messages:

import imaplib
 
# port 993 is IMAP over SSL, so the SSL class is needed here
M = imaplib.IMAP4_SSL('imap.gmail.com', 993)
M.login('myname@gmail.com', 'pa$$word')
M.select()
typ, data = M.search(None, 'ALL')
for num in data[0].split():
    typ, data = M.fetch(num, '(RFC822)')
    print 'Message %s\n%s\n' % (num, data[0][1])
M.close()
M.logout()
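
The data returned by fetch can be puzzling at first: each message arrives as an (envelope, payload) tuple, usually followed by a plain closing string. The sketch below, written for Python 3, pulls out just the payloads. The helper name extract_bodies and the sample response are made up for illustration; the sample only imitates the shape of a real response.

```python
def extract_bodies(fetch_data):
    """Return the payloads from the data list of an IMAP4.fetch() call.

    Each fetched message is an (envelope, payload) tuple; the plain
    closing strings such as b')' that sit between them are skipped.
    """
    return [item[1] for item in fetch_data if isinstance(item, tuple)]


# Hypothetical response for one message, shaped like imaplib's return value.
sample = [(b'1 (RFC822 {32}', b'Subject: hello\r\n\r\nJust a test.\r\n'), b')']
bodies = extract_bodies(sample)
print(bodies[0].decode('ascii'))
```

This is why the listing above reads data[0][1]: index 0 picks the first tuple and index 1 picks its payload.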

The email package provides a standard parser that understands most email document structures, including MIME documents. You can pass the parser a string or a file object, and the parser will return to you the root Message instance of the object structure. For simple, non-MIME messages the payload of this root object will likely be a string containing the text of the message. For MIME messages, the root object will return True from its is_multipart() method, and the subparts can be accessed via the get_payload() and walk() methods.
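
The difference between the two cases is easy to try without a mailbox. Here is a small Python 3 sketch that parses one plain message and one MIME message from strings; the addresses, subjects and the boundary "XYZ" are invented sample data.

```python
import email

# A simple, non-MIME message: the payload is just a string.
simple_raw = "From: a@example.com\nSubject: plain\n\nHello there.\n"
simple = email.message_from_string(simple_raw)
print(simple.is_multipart())
print(simple.get_payload())

# A MIME message with two text parts separated by the boundary "XYZ".
multi_raw = (
    "From: a@example.com\n"
    "Subject: two parts\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "first part\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "second part\n"
    "--XYZ--\n"
)
multi = email.message_from_string(multi_raw)
print(multi.is_multipart())

# For a multipart message get_payload() returns a list of Message objects.
subparts = multi.get_payload()
print(len(subparts))
```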

Extract Mail Headers
Here is a way to retrieve the From, To and Subject headers from an email message:

from email.parser import HeaderParser
 
resp, data = M.FETCH(1, '(RFC822)')
msg = HeaderParser().parsestr(data[0][1])
 
print msg['From']
print msg['To']
print msg['Subject']
 
M.LOGOUT()

The output will be something like:

Gmail Team
My Name
Gmail is different. Here's what you need to know.
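
Real header values are often a little messier than this: a From header usually carries both a display name and an address, and a non-ASCII Subject arrives in an encoded form. The following Python 3 sketch shows one possible way to tidy them up with the standard library. The helper decode_subject, the address and the encoded subject are assumptions made up for this example.

```python
from email.header import decode_header
from email.utils import parseaddr


def decode_subject(raw):
    """Turn a possibly encoded header value into a readable string."""
    pieces = []
    for text, charset in decode_header(raw):
        # encoded chunks come back as bytes, plain chunks as str
        if isinstance(text, bytes):
            text = text.decode(charset or 'ascii', 'replace')
        pieces.append(text)
    return ''.join(pieces)


# Split a From value into display name and address (sample value).
name, address = parseaddr('Gmail Team <team@example.com>')
print(name)
print(address)

# Decode a sample encoded subject.
subject = decode_subject('=?utf-8?q?Caf=C3=A9_menu?=')
print(subject)
```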

Identifying the content type
The Content-Type header indicates the Internet media type of the message content. It consists of a type and a subtype, for example text/plain, which is the default value when the header is absent.
Gmail often sends alternative content: the same message in both plain text and another format such as HTML (multipart/alternative, with the same content in text/plain and text/html forms).

import email
 
resp, data = M.FETCH(1, '(RFC822)')
mail = email.message_from_string(data[0][1])
 
for part in mail.walk():
  print 'Content-Type:',part.get_content_type()
  print 'Main Content:',part.get_content_maintype()
  print 'Sub Content:',part.get_content_subtype()

The output will be:

Content-Type: multipart/alternative
Main Content: multipart
Sub Content: alternative
Content-Type: text/plain
Main Content: text
Sub Content: plain
Content-Type: text/html
Main Content: text
Sub Content: html
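
To experiment with walk() without touching a mailbox, a multipart/alternative message can be built locally. This is only a Python 3 sketch with made-up content, but it produces the same three content types as the listing above.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Build a message that carries the same text in plain and HTML form.
mail = MIMEMultipart('alternative')
mail.attach(MIMEText('Hello', 'plain'))
mail.attach(MIMEText('<p>Hello</p>', 'html'))

# walk() yields the container first, then each part in order.
content_types = [part.get_content_type() for part in mail.walk()]
for content_type in content_types:
    print('Content-Type:', content_type)
```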

Extract Message Body
Using the walk() method we can iterate through Message parts. The get_payload() method will return the current payload, which will be a list of Message objects when is_multipart() is True, or a string when is_multipart() is False.

import email
 
resp, data = M.FETCH(1, '(RFC822)')
mail = email.message_from_string(data[0][1])
 
for part in mail.walk():
  # multipart parts are just containers, so we skip them
  if part.get_content_maintype() == 'multipart':
    continue
 
  # we are interested only in the simple text messages
  if part.get_content_subtype() != 'plain':
    continue
 
  payload = part.get_payload()
  print payload
 
M.LOGOUT()
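
One thing the loop above does not handle is encoding: a text part may be base64 or quoted-printable encoded and may use a charset other than ASCII. Below is a hedged Python 3 sketch of a helper that takes care of both. The name get_text_body is invented here, and falling back to UTF-8 when no charset is declared is an assumption, not a rule.

```python
import email
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def get_text_body(mail):
    """Return the first text/plain part as a decoded string, or None."""
    for part in mail.walk():
        if part.get_content_type() != 'text/plain':
            continue
        # decode=True undoes base64 / quoted-printable and returns bytes
        payload = part.get_payload(decode=True)
        # assumption: use UTF-8 when the part declares no charset
        charset = part.get_content_charset() or 'utf-8'
        return payload.decode(charset, 'replace')
    return None


# Build a sample message, serialize it and parse it back, as if fetched.
original = MIMEMultipart('alternative')
original.attach(MIMEText('Gr\u00fc\u00dfe', 'plain', 'utf-8'))
original.attach(MIMEText('<p>Gr\u00fc\u00dfe</p>', 'html', 'utf-8'))
parsed = email.message_from_string(original.as_string())

body = get_text_body(parsed)
print(body)
```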

Extracting Attachments
The code below extracts attached images and saves them to disk.

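import os
import re

# matches the name="..." parameter of a Content-Type header
name_pat = re.compile(r'name="([^"]*)"')

# numbers images that carry no file name; kept outside the loop
# so that it is not reset for every part
counter = 1

for part in mail.walk():
  if part.get_content_maintype() != 'image':
    continue

  file_type = part.get_content_subtype() or 'jpg'

  filename = part.get_filename()
  if not filename:
    # fall back to the name parameter of the Content-Type header
    match = name_pat.search(part.get('Content-Type', ''))
    if match:
      filename = match.group(1)

  if not filename:
    filename = 'img-%03d.%s' % (counter, file_type)
    counter += 1

  # drop any directory part from a name supplied by the sender
  filename = os.path.basename(filename)

  payload = part.get_payload(decode=True)

  # do not overwrite files that already exist
  if not os.path.isfile(filename):
    # finally write the stuff
    fp = open(filename, 'wb')
    fp.write(payload)
    fp.close()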
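
The same idea can be tried offline by building a message with an attachment and saving it to a temporary directory. This is a Python 3 sketch, not part of the library: save_images is a hypothetical helper, the "image" is just a few placeholder bytes, and unlike the listing above it writes into a fresh directory and overwrites without checking.

```python
import os
import tempfile
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def save_images(mail, directory):
    """Write every image part of mail into directory, return the paths."""
    saved = []
    for part in mail.walk():
        if part.get_content_maintype() != 'image':
            continue
        # basename() guards against directory parts in a supplied name
        filename = os.path.basename(part.get_filename() or '')
        if not filename:
            filename = 'img-%03d.%s' % (len(saved) + 1,
                                        part.get_content_subtype())
        path = os.path.join(directory, filename)
        with open(path, 'wb') as fp:
            fp.write(part.get_payload(decode=True))
        saved.append(path)
    return saved


# Placeholder bytes standing in for image data.
fake_png = b'\x89PNG\r\n\x1a\n' + b'not a real image'

mail = MIMEMultipart()
mail.attach(MIMEText('see attachment', 'plain'))
image = MIMEImage(fake_png, _subtype='png')
image.add_header('Content-Disposition', 'attachment', filename='logo.png')
mail.attach(image)

target = tempfile.mkdtemp()
paths = save_images(mail, target)
print([os.path.basename(p) for p in paths])
```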

That's it. In the next part I will explain searching and moving your mails using Python. Don't forget to subscribe :-)

Playing With Python And Gmail – Part 1

July 28, 2010 at 4:35 pm | PYTHON | 6 comments

In addition to its web interface, Gmail also provides access via IMAP. Python's imaplib module defines three classes, IMAP4, IMAP4_SSL and IMAP4_stream, which encapsulate a connection to an IMAP4 server and implement a large subset of the IMAP4rev1 client protocol as defined in RFC 2060.

The IMAP4 class implements the actual IMAP4 protocol. The connection is created and protocol version (IMAP4 or IMAP4rev1) is determined when the instance is initialized.

Getting started with Python imaplib
To start with, we will create a simple Python program to log in to Gmail via IMAP.

import imaplib
 
IMAP_SERVER='imap.gmail.com'
IMAP_PORT=993
 
M = imaplib.IMAP4_SSL(IMAP_SERVER, IMAP_PORT)
rc, resp = M.login('username@gmail.com', 'pa$$word')
print rc, resp
 
M.logout()

imaplib.IMAP4_SSL is a subclass derived from IMAP4 that connects over an SSL encrypted socket (to use this class you need a socket module that was compiled with SSL support). If host is not specified, '' (the local host) is used. If port is omitted, the standard IMAP4-over-SSL port (993) is used. keyfile and certfile are also optional: they can contain a PEM formatted private key and certificate chain file for the SSL connection.

If authentication is successful the output will be:

OK ['username@gmail.com authenticated (Success)']

As part of our exercise we will be writing many useful functions. It is a good idea to put them in a Python class, so that at the end of the exercise we will have a cool Python Gmail library. Let's create pygmail.py:

import imaplib
 
class pygmail:
  def __init__(self):
    self.IMAP_SERVER='imap.gmail.com'
    self.IMAP_PORT=993
    self.M = None
    self.response = None
    self.mailboxes = []  # filled by get_mailboxes()
 
  def login(self, username, password):
    self.M = imaplib.IMAP4_SSL(self.IMAP_SERVER, self.IMAP_PORT)
    rc, self.response = self.M.login(username, password)
    return rc
 
  def logout(self):
    self.M.logout()

g = pygmail()
g.login('username@gmail.com', 'pa$$word')
print g.response

Listing Mailboxes
The IMAP4.list() function lists the mailbox names in a directory that match a pattern. The directory defaults to the top-level mail folder, and the pattern defaults to match anything. The returned data contains a list of LIST responses.
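
Each LIST response is a single line holding the mailbox flags, the hierarchy delimiter and the mailbox name, and a name may contain spaces. Here is a rough Python 3 sketch of taking such a line apart with a regular expression. The pattern, the helper parse_list_line and the sample line are assumptions for illustration; the sketch does not handle every form the protocol allows, such as literals or non-ASCII names.

```python
import re

# flags in parentheses, a quoted delimiter (or NIL), then the mailbox name
LIST_PATTERN = re.compile(
    r'\((?P<flags>[^)]*)\) (?P<delim>"[^"]*"|NIL) (?P<name>.*)')


def parse_list_line(line):
    """Split one LIST response into (flags, delimiter, name), or None."""
    if isinstance(line, bytes):
        line = line.decode('utf-8')
    match = LIST_PATTERN.match(line)
    if match is None:
        return None
    name = match.group('name').strip()
    # strip the surrounding quotes but keep spaces inside the name
    if name.startswith('"') and name.endswith('"'):
        name = name[1:-1]
    return match.group('flags').split(), match.group('delim').strip('"'), name


# Hypothetical response line with a space in the mailbox name.
sample = b'(\\HasNoChildren) "/" "Work/Project Notes"'
flags, delimiter, name = parse_list_line(sample)
print(name)
```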

Add the function below to our pygmail.py:

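def get_mailboxes(self):
  rc, self.response = self.M.list()
  self.mailboxes = []  # start with an empty list on every call
  for item in self.response:
    # keep the last whitespace-separated token of each LIST response
    # (a mailbox name that contains spaces gets cut short this way)
    self.mailboxes.append(item.split()[-1])
  return rc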

Use:

g.get_mailboxes()
for item in g.mailboxes:
  print item

This will output your Gmail mailboxes:

"INBOX"
"Sent"
"Trash"
"/"
"[Gmail]/All
"[Gmail]/Drafts"
"[Gmail]/Sent
"[Gmail]/Spam"
"[Gmail]/Starred"
"[Gmail]/Trash"
"freebsd-net"
"fsug-tvm"
"openflow"

Creating, Renaming, Deleting Mailboxes
The IMAP4.create, IMAP4.rename and IMAP4.delete functions create, rename and delete mailboxes respectively.
Let's add three more functions to our lib.

def rename_mailbox(self, oldmailbox, newmailbox):
  rc, self.response = self.M.rename(oldmailbox, newmailbox)
  return rc
 
def create_mailbox(self, mailbox):
  rc, self.response = self.M.create(mailbox)
  return rc
 
def delete_mailbox(self, mailbox):
  rc, self.response = self.M.delete(mailbox)
  return rc

Get Mail Count
The IMAP4.select function selects a mailbox. The returned data is the count of messages in the mailbox (the EXISTS response). The default mailbox is 'INBOX'. If the readonly flag is set, modifications to the mailbox are not allowed.

Add the get_mail_count function to our Python class.

  def get_mail_count(self, folder='Inbox'):
    rc, count = self.M.select(folder)
    return count[0]

Output:

14581

You can also specify a mailbox:

g.get_mail_count('mailbox')

Get Unread Mail Count
The IMAP4.status() function requests named status conditions for a mailbox. The standard defines these status conditions:

MESSAGES - The number of messages in the mailbox.
RECENT - The number of messages with the Recent flag set.
UIDNEXT - The next unique identifier value of the mailbox.
UIDVALIDITY - The unique identifier validity value of the mailbox.
UNSEEN - The number of messages which do not have the Seen flag set.

Requesting the UNSEEN condition returns the total number of unread messages in the mailbox.

Let's define a function in our pygmail.py to get the unread mail count.

  # needs "import re" at the top of pygmail.py
  def get_unread_count(self, folder='Inbox'):
    rc, message = self.M.status(folder, "(UNSEEN)")
    unreadCount = re.search(r"UNSEEN (\d+)", message[0]).group(1)
    return unreadCount

Output:

2
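
Several conditions can be requested in one status call, for example "(MESSAGES UNSEEN)", and the reply then lists name and value pairs inside parentheses. Here is a Python 3 sketch of turning such a reply into a dictionary. The helper parse_status is hypothetical and the sample line merely imitates the shape of a reply, reusing the counts shown in this article.

```python
import re


def parse_status(response_line):
    """Turn a STATUS reply such as '"INBOX" (UNSEEN 2)' into a dict."""
    if isinstance(response_line, bytes):
        response_line = response_line.decode('ascii')
    # the name/value pairs sit in the last parenthesized group
    inside = re.search(r'\(([^)]*)\)\s*$', response_line)
    if inside is None:
        return {}
    tokens = inside.group(1).split()
    # tokens alternate: condition name, number, condition name, number ...
    return {name: int(value)
            for name, value in zip(tokens[0::2], tokens[1::2])}


# Hypothetical reply line for illustration.
sample = b'"INBOX" (MESSAGES 14581 UNSEEN 2)'
counts = parse_status(sample)
print(counts['UNSEEN'])
```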

OK, enough for a start. In the next parts of this article I will explain sending, searching and retrieving mails from Gmail via Python. So don't forget to subscribe :-)

I am pushing our small Python Gmail library to GitHub; hopefully it will be useful to someone.
PyGmail: http://github.com/vinod85/pygmail

Catch Invisible Friends On GTalk The Python Way

July 16, 2010 at 7:51 pm | PROGRAMMING, PYTHON | 3 comments

Ever wanted to know whether someone is really offline or has just gone invisible in GTalk? Here is a small trick. The piece of Python code below gets the list of invisible users from your GTalk buddy list. It uses the XMPP module for Python, which you can install on Ubuntu/Debian via apt. It also requires the Python DNS module.

$ sudo aptitude install python-xmpp python-dnspython

Now here is our script. Open your favorite text editor and save the code as 'gchat.py'. Don't forget to fill in your GTalk username and password in the script.

import xmpp
 
# Google Talk constants
FROM_GMAIL_ID = "username@gmail.com"
GMAIL_PASS = "secret"
GTALK_SERVER = "gmail.com"
 
jid=xmpp.protocol.JID(FROM_GMAIL_ID)
C=xmpp.Client(jid.getDomain(),debug=[])
 
if not C.connect((GTALK_SERVER,5222)):
    raise IOError('Can not connect to server.')
if not C.auth(jid.getNode(),GMAIL_PASS):
    raise IOError('Can not auth with server.')
 
C.sendInitPresence(requestRoster=1)
 
def myPresenceHandler(con, event):
    if event.getType() == 'unavailable':
        print event.getFrom().getStripped()

C.RegisterHandler('presence', myPresenceHandler)
while C.Process(1):
    pass

Now simply run:

$ python gchat.py

So next time, do not let anyone fool you; catch them invisibly instead.
