Stale stats

Hi,
Database: Oracle 11g (11.2.0.2)
OS: Solaris
One of my production databases has a huge number of stale statistics. I am planning to gather stats using the command below.
EXEC DBMS_STATS.gather_schema_stats('COMMODITY', estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE, cascade => TRUE);
My question is: is there any performance degradation if I gather stats for the stale objects, or any performance benefit? Can anyone please explain, or suggest any docs or MOS notes?
Why are statistics not collected when using ESTIMATE_PERCENT => 15 while the database has stale stats? And why do we go for DBMS_STATS.AUTO_SAMPLE_SIZE? Any idea?
thanks,
Mike.

Unless you have explicitly disabled it, an 11.2 database gathers statistics automatically using sensible defaults. To check whether the job is enabled:
select client_name,status from dba_autotask_client;
If it is indeed disabled, then your question becomes relevant. But in general, unless your users have a problem, do nothing. If they do have a problem, look at the problem to determine whether stale statistics are the cause. Simply gathering statistics for no particular reason may not be a good idea.
John Watson
Oracle Certified Master DBA

Similar Messages

Error while collecting stale stats dbms_stats

SQL> exec dbms_stats.gather_table_stats('CData3','OM_HEAD',cascade=>'TRUE',options=>'GATHER STALE');
BEGIN dbms_stats.gather_table_stats('CData3','OM_HEAD',cascade=>'TRUE',options=>'GATHER STALE'); END;
ERROR at line 1:
ORA-06550: line 1, column 7:
PLS-00306: wrong number or types of arguments in call to 'GATHER_TABLE_STATS'
ORA-06550: line 1, column 7:
PL/SQL: Statement ignored
Oracle version 9.2.
Please suggest a fix for this.
Edited by: user00726 on Feb 3, 2009 4:28 AM

There is no such parameter OPTIONS for gather_table_stats:
PROCEDURE GATHER_TABLE_STATS
Argument Name     Type      In/Out  Default?
OWNNAME           VARCHAR2  IN
TABNAME           VARCHAR2  IN
PARTNAME          VARCHAR2  IN      DEFAULT
ESTIMATE_PERCENT  NUMBER    IN      DEFAULT
BLOCK_SAMPLE      BOOLEAN   IN      DEFAULT
METHOD_OPT        VARCHAR2  IN      DEFAULT
DEGREE            NUMBER    IN      DEFAULT
GRANULARITY       VARCHAR2  IN      DEFAULT
CASCADE           BOOLEAN   IN      DEFAULT
STATTAB           VARCHAR2  IN      DEFAULT
STATID            VARCHAR2  IN      DEFAULT
STATOWN           VARCHAR2  IN      DEFAULT
NO_INVALIDATE     BOOLEAN   IN      DEFAULT
STATTYPE          VARCHAR2  IN      DEFAULT
FORCE             BOOLEAN   IN      DEFAULT
So you can't use that parameter.
Use gather_schema_stats with that option instead; it will automatically gather stats for all tables with stale stats in that schema.
Or you can look into the view:
dba_tab_modifications
and from there find which tables have stale stats and gather them with gather_table_stats.
Edited by: Laura Gaigala on Feb 3, 2009 2:53 PM
Edited by: Laura Gaigala on Feb 3, 2009 2:54 PM
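As a rough illustration of the dba_tab_modifications approach above, the sketch below flags tables whose tracked changes reach a threshold. It is a minimal sketch in Python, not Oracle's actual algorithm: the 10% threshold, the (inserts + updates + deletes) / num_rows formula, the names TableMods and is_stale, and the table OM_LINES are all assumptions made for illustration. In practice the rows would come from joining dba_tab_modifications with dba_tables.

```python
from dataclasses import dataclass


@dataclass
class TableMods:
    """One row of change counts, as one might read from dba_tab_modifications."""
    table_name: str
    num_rows: int   # row count recorded at the last stats gathering
    inserts: int
    updates: int
    deletes: int


def is_stale(t: TableMods, stale_percent: float = 10.0) -> bool:
    """Assumed rule: stale when tracked changes reach stale_percent of num_rows."""
    changed = t.inserts + t.updates + t.deletes
    if t.num_rows == 0:
        # No recorded rows: any tracked change means the old stats say nothing useful.
        return changed > 0
    return 100.0 * changed / t.num_rows >= stale_percent


tables = [
    TableMods("OM_HEAD", 1_000_000, 0, 0, 120_000),   # 12% of rows changed
    TableMods("OM_LINES", 1_000_000, 10_000, 0, 0),   # 1% of rows changed (hypothetical table)
]
stale_names = [t.table_name for t in tables if is_stale(t)]
print(stale_names)
```

Only the names in stale_names would then be passed to gather_table_stats, one call per table.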

How are execution plan created with tables of stale stats

Hello
I would like to ask the group:
1. How does Oracle handle the execution plan for table joins where some tables have stale stats?
2. How would Oracle handle the execution plan where the table has a histogram but the stats are stale?
Database version 11.1.0.7.0
Thanks
Arun

ALTER SESSION SET EVENTS='10053 trace name context forever, level 1';
By doing the above before executing the SQL, you can see how the CBO arrives at the actual execution plan.

How To Handle Stale Stats Scenario.

hi,
I am using Release 10.2.0.1.0 of Oracle. I am getting poor execution plans due to stale stats. How should I tackle this scenario? Below is the part of my main query whose execution path deviates due to wrong cardinality estimation.
Column c1 of table tab1 holds Java timestamp values, i.e. it is a NUMBER datatype that encodes a date and time. We gather stats on this table each weekend.
Below is my query:
      select /*+gather_plan_statistics*/* from tab1
      where c1 BETWEEN 1346300090668 AND 1346325539486    ;
Plan hash value: 3167980259
| Id | Operation                   | Name   | Starts | E-Rows | A-Rows | A-Time      | Buffers | Reads |
|  1 | TABLE ACCESS BY INDEX ROWID | tab1   |      1 |      1 |   167K | 00:01:13.72 |    158K | 12390 |
|* 2 |  INDEX RANGE SCAN           | IDX_N1 |      1 |      1 |   167K | 00:00:13.27 |   13880 |  1736 |
The above shows a big gap between actual and estimated cardinality. It is due to the fact that the HIGH_VALUE (1346203206173, i.e. 8/29/2012 1:20:06 AM) in DBA_TAB_COLUMNS for column C1 is well below the start (1346300090668, i.e. 8/30/2012 4:14:51 AM) and end (1346325539486, i.e. 8/30/2012 11:18:59 AM) of the BETWEEN range.
So even gathering stats daily on the table won't help me, because by morning the recorded max value for column C1 will again be out of date for proper estimation. How do I handle this situation? I don't want to go with a hint; I want to make the stats right so that the optimizer automatically picks the right path.
Edited by: 930254 on Aug 30, 2012 4:41 AM
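The timestamp arithmetic in the post can be checked with a short sketch. This is Python rather than SQL, and the function names are hypothetical. It assumes the column holds milliseconds since the Unix epoch in UTC, which matches the dates quoted in the post.

```python
from datetime import datetime, timezone


def java_ts_to_utc(ms: int) -> datetime:
    """Convert a Java-style millisecond timestamp to an aware UTC datetime."""
    seconds, millis = divmod(ms, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(
        microsecond=millis * 1000
    )


def range_is_beyond_high_value(high_value: int, range_start: int) -> bool:
    """True when the whole predicate range lies above the recorded HIGH_VALUE."""
    return range_start > high_value


high_value = 1346203206173   # HIGH_VALUE recorded at the last stats gathering
range_start = 1346300090668  # lower bound of the BETWEEN predicate

print(java_ts_to_utc(high_value))
print(range_is_beyond_high_value(high_value, range_start))
```

When the second function returns True, the optimizer has no recorded data inside the predicate range, which is consistent with the E-Rows estimate of 1 in the plan.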

Um, refresh the stats on a regular basis?
Oracle 10 and later has a default job to do that. Runs at 2200 daily.
If you need an 'on demand' refresh, that's easy enough to set up.

Any way to Stale state state to active

Hi
I am using BPM Suite 11.1.1.6. I am using the Human Task functionality and I have a few questions related to redeployment of a composite application.
1) We are planning to migrate the WebLogic server to new hardware. How can I migrate the active workflow requests to the new WebLogic server?
2) Is there any way to change a stale workflow request back to active if I accidentally undeployed the composite application?
Thanks
Suneesh

Thank you Dmitri!
Actually, I was looking to do this programmatically, but thanks for pointing to the relevant class!
CacheFactory.getClusterConfig().writeXml(new PrintWriter(System.out, true), true);
Edited by: Andrey Belomutskiy on Jul 15, 2010 11:28 AM

Hibernate Stale State Exception

Hi all,
I'm facing a problem while updating an object in Hibernate.
Object1 has a mapping to Object2 as a set.
I'm trying to get Object1 and fetch Object2 from it.
After updating the contents of Object2, setting it back on Object1 and calling update for Object1, I get a "hibernate.stalestate Exception."
How can this problem be solved? Or if I'm doing anything wrong, how can that be corrected?
Any help would be highly appreciated.
//gets the main object
oObject1= (Object1) Helpers.fetchObject1(ID);
//retrieves the object2 from object1
Object2 oObj2 = new Object2
Iterator it = oObject1.getObject2().iterator();
oData = (Object2) it.next();
String sEncGreeting = oData.getGreeting();
Set oSet = new HashSet();
oSet.add(oData);
oObject1.setObject2(oSet);
return oObject1;

These warnings mean you should have a logger such as log4j configured. Since it's just a warning, you can ignore it:
log4j:WARN No appenders could be found for logger (org.hibernate.cfg.Environment).
log4j:WARN Please initialize the log4j system properly.
This means your sample.hbm.xml file has a problem. Either its syntax is invalid or it's not well formed (each opening tag needs a closing tag). Compare it to example *.hbm.xml files to see what the problem is.
Exception occur:Could not parse mapping document from resource event/Sample.hbm.xml
org.hibernate.InvalidMappingException: Could not parse mapping document from resource event/Sample.hbm.xml
This generally means the tags are supposed to occur in a certain order, which you violated:
match "(meta*,subselect?,cache?,synchronize*,comment?,tuplizer*,(id|composite-id),discriminator?,natural-id?,(version|timestamp)?,(property|many-to-one|one-to-one|component|dynamic-component|properties|any|map|set|list|bag|idbag|array|primitive-array)*,((join*,subclass*)|joined-subclass*|union-subclass*),loader?,sql-insert?,sql-update?,sql-delete?,filter*,resultset*,(query|sql-query)*)".
I can't help you further, except to suggest that you find a completely different example from a different source and create a project out of it rather than trying to debug this one. It's also better to read a book on Hibernate (such as 'Hibernate in Action'), since there are a lot of concepts to learn that a simple tutorial may not provide.

Gather Schema Statistics - GATHER AUTO option failing to gather stats

Hi ,
We recently upgraded to a 10g DB and version 11.5.10 of Oracle EBS. I want to use the GATHER AUTO option while running Gather Schema Statistics.
To test it, I created a test table with 1 million rows. Then stats were gathered for this table alone using Gather Table Stats. Next, I deleted ~12% of the rows and committed. The view all_tab_statistics shows that the table has stale statistics (STALE_STATS column = YES). After that I ran Gather Schema Stats for that particular schema, but the request did not pick up the test table.
What criterion does Oracle use to choose which tables to gather statistics for under the GATHER AUTO option? I am aware of the 10% change in data, but how is this 10% calculated? Is it based only on (insert + update + delete)?
Also, what is the difference between GATHER AUTO and GATHER STALE?
Any help is appreciated.
Thanks,
Jithin

Randalf,
FYI, this is what happens inside the concurrent program call. There are a few additional parameters for output/error messages:
procedure GATHER_SCHEMA_STATS(errbuf out varchar2,
                              retcode out varchar2,
                              schemaname in varchar2,
                              estimate_percent in number,
                              degree in number,
                              internal_flag in varchar2,
                              request_id in number,
                              hmode in varchar2 default 'LASTRUN',
                              options in varchar2 default 'GATHER',
                              modpercent in number default 10,
                              invalidate in varchar2 default 'Y')
is
  exist_insufficient exception;
  bad_input exception;
  pragma exception_init(exist_insufficient,-20000);
  pragma exception_init(bad_input,-20001);
  l_message varchar2(1000);
  Error_counter number := 0;
  Errors Error_Out;
  -- num_request_id number(15);
  conc_request_id number(15);
  degree_parallel number(2);
begin
  -- Set the package body variable.
  stathist := hmode;
  -- check first if degree is null
  if degree is null then
    degree_parallel := def_degree;
  else
    degree_parallel := degree;
  end if;
  l_message := 'In GATHER_SCHEMA_STATS , schema_name= '|| schemaname
    || ' percent= '|| to_char(estimate_percent) || ' degree = '
    || to_char(degree_parallel) || ' internal_flag= '|| internal_flag ;
  FND_FILE.put_line(FND_FILE.log,l_message);
  BEGIN
    FND_STATS.GATHER_SCHEMA_STATS(schemaname, estimate_percent,
      degree_parallel, internal_flag, Errors, request_id,stathist,
      options,modpercent,invalidate);
  exception
    when exist_insufficient then
      errbuf := sqlerrm ;
      retcode := '2';
      l_message := errbuf;
      FND_FILE.put_line(FND_FILE.log,l_message);
      raise;
    when bad_input then
      errbuf := sqlerrm ;
      retcode := '2';
      l_message := errbuf;
      FND_FILE.put_line(FND_FILE.log,l_message);
      raise;
    when others then
      errbuf := sqlerrm ;
      retcode := '2';
      l_message := errbuf;
      FND_FILE.put_line(FND_FILE.log,l_message);
      raise;
  END;
  FOR i in 0..MAX_ERRORS_PRINTED LOOP
    exit when Errors(i) is null;
    Error_counter := i+1;
    FND_FILE.put_line(FND_FILE.log,'Error #'||Error_counter||
      ': '||Errors(i));
    -- added to send back status to concurrent program manager bug 2625022
    errbuf := sqlerrm ;
    retcode := '2';
  END LOOP;
end;

Gather stats on every table in a schema

Hi,
I have a CRM application running on a 10g R2 DB. It has 5000 tables, of which less than 10% are dynamic. The gather stats job runs successfully every day at 2am.
I was monitoring the statistics (dba_tables, dba_tab_modifications, dba_tab_statistics) and noticed that only 28 tables in the CRM schema are updated with the latest stats every day, and they are mostly the same tables. During query tuning I found that some tables have stale stats that are not flagged in the STALE_STATS column of dba_tab_statistics, although dba_tab_modifications shows the number of rows inserted and updated.
My question: is there any drawback in gathering stats for all the tables every day, irrespective of whether 10% of the data has changed, but excluding tables with no rows?

Thanks for the quick response, it was helpful.
Due to application vendor recommendations, stats were disabled for some tables and optimizer parameters were changed, so dynamic sampling is not used for some queries even though they use tables with no stats. As per the documentation, stats would be calculated on the fly when a query uses tables whose stats have not been gathered.
As of now I am not gathering stats manually for this schema, as the auto job is scheduled. I will verify whether it indeed updates the stats once 10% of the data has changed; if not, I may manually gather stats for only those tables.

What are the database resources when collecting stats using dbms_stats

Hello,
We have tables with stale stats and want to collect stats using dbms_stats with an estimate of 30%. What database resources would be consumed when dbms_stats is used on a table? Also, would the table be locked during dbms_stats? Thank you.

1) I'm not sure what resources you're talking about. Obviously, gathering statistics requires I/O since you've got to read lots of data from the table. It requires CPU particularly if you are gathering histograms. It requires RAM to the extent that you'll be doing sorts in PGA and to the extent that you'll be putting blocks in the buffer cache (and thus causing other blocks to age out), etc. Depending on whether you immediately invalidate query plans, you may force other sessions to start doing a lot more hard parsing as well.
2) You cannot do DDL on a table while you are gathering statistics, but you can do DML. You would generally not want to gather statistics while an application is active.
Justin

Partition Gurus - Another help on Stale Statistics

Hi All,
A simple question for all gurus here.
My DBA says I don't need to run ANALYZE TABLE COMPUTE STATISTICS on all tables every night. He said the CBO will collect stale statistics every night; I only need to run ANALYZE INDEX COMPUTE STATISTICS every night.
Is this true? What exactly are stale stats?
Thanks and Regards,
Saff

- What version of Oracle are we talking about?
- Assuming 10g or later, is the default GATHER_STATS_JOB running?
- If so, have any of the defaults been changed?
- The ANALYZE command has been deprecated for quite some time as a means of gathering statistics for the optimizer. Unless you're on 8.1.5 or something, you should always be using DBMS_STATS rather than ANALYZE to gather optimizer statistics.
Assuming you are on 10g or later and that the default GATHER_STATS_JOB is running, there would normally be no need to manually gather statistics on tables or indexes.
Justin

Why is a workflow "stale"?

I'm writing a workflow to send an email each time a user is created, and I'm calling this workflow from a POJO. Everything looks good: there are no errors, and in debug mode in my POJO I see that the workflow starts, but there are no messages in my inbox. This is what my log says:
day.cq.workflow.impl.CQWorkflowSession Workflow instance started with model: /etc/workflow/models/send-user-email/jcr:content/model and ID: /etc/workflow/instances/2013-04-04/model_17333941154731 for payload: /home/users/group01/[email protected]
My model only includes a workflow process step to send the email. Here comes the problem: it's not sending the email. I'm debugging the workflow and it is not passing through the step that I wrote. In the workflow console the instance says "STALE". What causes a workflow to go into the stale state?
Thanks!

Hi Rudy,
It would be hard to tell from the info you have given, but I am assuming you chose the "Send Email" process step and have it correctly configured [0]. You could enable DEBUG level logging for the Workflow packages [1], and maybe also the Mail Service packages, and see if there is any extra information output from there. I would also make sure you have configured the Mail Service correctly [2] and that you have rights to send the email using the 'from' address through the mail server you have configured.
[0] http://dev.day.com/docs/en/cq/current/workflows/wf-ref.html
[1] http://dev.day.com/docs/en/cq/current/deploying/configure_logging.html#Loggers and Writers for Individual Services
[2] http://dev.day.com/docs/en/cq/current/administering/notification.html#Configuring%20the%20 Mail%20Service
Thanks -- David

Stale BPEL Process instances

Is there a configuration setting in BPEL that would allow existing BPEL instances to complete normally (instead of being made STALE) when the associated BPEL Process is updated and redeployed?

Unfortunately not, the way around this is to use versioning. If you increment the version number no process will go into a stale state.
cheers
James

SCHEMA STATS SLOW SOME TIME

Hi All,
I generate stats on a weekly basis and it usually completes in 6-8 hours, but sometimes it takes 26-40 hours.
There is no special load on the database when it takes 26-40 hours.
===SQL STATEMENT====
exec dbms_stats.gather_schema_stats(ownname => 'schemaname', estimate_percent => 20, cascade => true, options => 'GATHER');
=====================
Any advice?
Regards,
Umair

Umair,
I would suggest a few options:
Why do you collect stats for the entire schema every week? Collect stats on objects that have heavy DML. Enable table monitoring for your schema; it is the default in 10g. Enabling table monitoring has no performance impact, because it keeps the information in a separate area of memory.
Once table monitoring is enabled for the entire schema, use the 'GATHER STALE' option of the dbms_stats package to collect stale stats. For more, search Metalink.
If the system has multiple CPUs, make use of the DEGREE option of dbms_stats.
Generally a 10% estimate is acceptable. The percentage applies to very big tables; all small tables will be sampled at 100% despite your estimate percentage.
select table_name,num_rows,sample_size,(sample_size/num_rows*100) "%" from user_tables where num_rows > 0 and sample_size > 0
/

Accessing TCP connection state information

i am trying to find a method of access to the current state information (and other data would be nice) of TCP sessions. i noticed that in /usr/include/netinet/tcp_var.h it would appear as though there is support for this type of thing, however i cannot seem to locate any supporting documentation. thanks for any help,
/kris

i think perhaps we are miscommunicating. i would like to be able to access this information (yes, programmatically) and hopefully obtain some primitive read locking so i do not end up with stale state information as soon as i have obtained it. i apologize if you thought i would post a question such as how to operate netstat to this discussion group. thanks again,
/kris

Python openbox pipe menu

I somewhat hijacked a different thread and my question is more suited here.
I'm using a python script to check gmail in a pipe menu. At first it was creating problems because it would create a cache but would then not load until the file was removed. To fix this, I removed the last line (which created the cache) and it all works. However, I would prefer to have it work like it was intended.
The script:
#!/usr/bin/python
# Authors: [email protected] [email protected]
# License: GPL 2.0
# Usage:
# Put an entry in your ~/.config/openbox/menu.xml like this:
# <menu id="gmail" label="gmail" execute="~/.config/openbox/scripts/gmail-openbox.py" />
# And inside <menu id="root-menu" label="openbox">, add this somewhere (wherever you want it on your menu)
# <menu id="gmail" />
import os
import sys
import logging
name = "111111"
pw = "000000"
browser = "firefox3"
filename = "/tmp/.gmail.cache"
login = "\'https://mail.google.com/mail\'"
# Allow us to run using installed `libgmail` or the one in parent directory.
try:
    import libgmail
except ImportError:
    # Urghhh...
    sys.path.insert(1,
                    os.path.realpath(os.path.join(os.path.dirname(__file__),
                                                  os.path.pardir)))
    import libgmail
if __name__ == "__main__":
    import sys
    from getpass import getpass
    if not os.path.isfile(filename):
        ga = libgmail.GmailAccount(name, pw)
        try:
            ga.login()
        except libgmail.GmailLoginFailure:
            print "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            print "<openbox_pipe_menu>"
            print " <item label=\"login failed.\">"
            print " <action name=\"Execute\"><execute>" + browser + " " + login + "</execute></action>"
            print " </item>"
            print "</openbox_pipe_menu>"
            raise SystemExit
    else:
        ga = libgmail.GmailAccount(
            state = libgmail.GmailSessionState(filename = filename))
    msgtotals = ga.getUnreadMsgCount()
    print "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
    print "<openbox_pipe_menu>"
    print "<separator label=\"Gmail\"/>"
    if msgtotals == 0:
        print " <item label=\"no new messages.\">"
    elif msgtotals == 1:
        print " <item label=\"1 new message.\">"
    else:
        print " <item label=\"" + str(msgtotals) + " new messages.\">"
    print " <action name=\"Execute\"><execute>" + browser + " " + login + "</execute></action>"
    print " </item>"
    print "</openbox_pipe_menu>"
    state = libgmail.GmailSessionState(account = ga).save(filename)
The line I removed:
state = libgmail.GmailSessionState(account = ga).save(filename)
The error I'd get if the cache existed:
Traceback (most recent call last):
File "/home/shawn/.config/openbox/scripts/gmail.py", line 56, in <module>
msgtotals = ga.getUnreadMsgCount()
File "/home/shawn/.config/openbox/scripts/libgmail.py", line 547, in getUnreadMsgCount
q = "is:" + U_AS_SUBSET_UNREAD)
File "/home/shawn/.config/openbox/scripts/libgmail.py", line 428, in _parseSearchResult
return self._parsePage(_buildURL(**params))
File "/home/shawn/.config/openbox/scripts/libgmail.py", line 401, in _parsePage
items = _parsePage(self._retrievePage(urlOrRequest))
File "/home/shawn/.config/openbox/scripts/libgmail.py", line 369, in _retrievePage
if self.opener is None:
AttributeError: GmailAccount instance has no attribute 'opener'
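The traceback is consistent with the GmailAccount constructor in the libgmail.py listing further down: self.opener is only built in the name-and-password branch, so an account restored from saved state never gets one. The sketch below reproduces that pattern with a stand-in class in Python 3. Account, build_opener, retrieve_page and the fixed flag are hypothetical names, not libgmail's API, and building the opener in both branches is just one possible fix, not the maintainers' patch.

```python
class Account:
    """Stand-in for a class with two construction paths (not libgmail itself)."""

    def __init__(self, name="", pw="", state=None, fixed=False):
        if name and pw:
            self.name = name
            self.cookies = {}
            self.opener = self.build_opener()
        elif state:
            # Restored from cache: the original pattern never sets the opener here.
            self.name, self.cookies = state
            if fixed:
                self.opener = self.build_opener()
        else:
            raise ValueError("need either state or name and password")

    def build_opener(self):
        # Placeholder for whatever object performs the HTTP requests.
        return object()

    def retrieve_page(self):
        # Mirrors the failing check: reading self.opener raises if it was never set.
        if self.opener is None:
            raise RuntimeError("Cannot find urlopener")
        return "page"


saved_state = ("111111", {"GV": "cookie-value"})

try:
    Account(state=saved_state).retrieve_page()
    outcome = "ok"
except AttributeError:
    outcome = "AttributeError"

print(outcome)
print(Account(state=saved_state, fixed=True).retrieve_page())
```

This would also explain why removing the save(filename) line makes the script work: without a cache file, every run takes the name-and-password path.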
EDIT - you might need the libgmail.py
#!/usr/bin/env python
# libgmail -- Gmail access via Python
## To get the version number of the available libgmail version.
## Reminder: add date before next release. This attribute is also
## used in the setup script.
Version = '0.1.8' # (Nov 2007)
# Original author: [email protected]
# Maintainers: Waseem ([email protected]) and Stas Z ([email protected])
# License: GPL 2.0
# NOTE:
# You should ensure you are permitted to use this script before using it
# to access Google's Gmail servers.
# Gmail Implementation Notes
# ==========================
# * Folders contain message threads, not individual messages. At present I
# do not know any way to list all messages without processing thread list.
LG_DEBUG=0
from lgconstants import *
import os,pprint
import re
import urllib
import urllib2
import mimetypes
import types
from cPickle import load, dump
from email.MIMEBase import MIMEBase
from email.MIMEText import MIMEText
from email.MIMEMultipart import MIMEMultipart
GMAIL_URL_LOGIN = "https://www.google.com/accounts/ServiceLoginBoxAuth"
GMAIL_URL_GMAIL = "https://mail.google.com/mail/?ui=1&"
# Set to any value to use proxy.
PROXY_URL = None # e.g. libgmail.PROXY_URL = 'myproxy.org:3128'
# TODO: Get these on the fly?
STANDARD_FOLDERS = [U_INBOX_SEARCH, U_STARRED_SEARCH,
U_ALL_SEARCH, U_DRAFTS_SEARCH,
U_SENT_SEARCH, U_SPAM_SEARCH]
# Constants with names not from the Gmail Javascript:
# TODO: Move to `lgconstants.py`?
U_SAVEDRAFT_VIEW = "sd"
D_DRAFTINFO = "di"
# NOTE: All other DI_* field offsets seem to match the MI_* field offsets
DI_BODY = 19
versionWarned = False # If the Javascript version is different have we
# warned about it?
RE_SPLIT_PAGE_CONTENT = re.compile("D\((.*?)\);", re.DOTALL)
class GmailError(Exception):
    """
    Exception thrown upon gmail-specific failures, in particular a
    failure to log in and a failure to parse responses.
    """
    pass
class GmailSendError(Exception):
    """
    Exception to throw if we're unable to send a message
    """
    pass
def _parsePage(pageContent):
    """
    Parse the supplied HTML page and extract useful information from
    the embedded Javascript.
    """
    lines = pageContent.splitlines()
    data = '\n'.join([x for x in lines if x and x[0] in ['D', ')', ',', ']']])
    #data = data.replace(',,',',').replace(',,',',')
    data = re.sub(',{2,}', ',', data)
    result = []
    try:
        exec data in {'__builtins__': None}, {'D': lambda x: result.append(x)}
    except SyntaxError,info:
        print info
        raise GmailError, 'Failed to parse data returned from gmail.'
    items = result
    itemsDict = {}
    namesFoundTwice = []
    for item in items:
        name = item[0]
        try:
            parsedValue = item[1:]
        except Exception:
            parsedValue = ['']
        if itemsDict.has_key(name):
            # This handles the case where a name key is used more than
            # once (e.g. mail items, mail body etc) and automatically
            # places the values into list.
            # TODO: Check this actually works properly, it's early... :-)
            if len(parsedValue) and type(parsedValue[0]) is types.ListType:
                for item in parsedValue:
                    itemsDict[name].append(item)
            else:
                itemsDict[name].append(parsedValue)
        else:
            if len(parsedValue) and type(parsedValue[0]) is types.ListType:
                itemsDict[name] = []
                for item in parsedValue:
                    itemsDict[name].append(item)
            else:
                itemsDict[name] = [parsedValue]
    return itemsDict
def _splitBunches(infoItems):# Is this still needed ?? Stas
    """
    Utility to help make it easy to iterate over each item separately,
    even if they were bunched on the page.
    """
    result= []
    # TODO: Decide if this is the best approach.
    for group in infoItems:
        if type(group) == tuple:
            result.extend(group)
        else:
            result.append(group)
    return result
class SmartRedirectHandler(urllib2.HTTPRedirectHandler):
    def __init__(self, cookiejar):
        self.cookiejar = cookiejar
    def http_error_302(self, req, fp, code, msg, headers):
        # The location redirect doesn't seem to change
        # the hostname header appropriately, so we do
        # by hand. (Is this a bug in urllib2?)
        new_host = re.match(r'http[s]*://(.*?\.google\.com)',
                            headers.getheader('Location'))
        if new_host:
            req.add_header("Host", new_host.groups()[0])
        result = urllib2.HTTPRedirectHandler.http_error_302(
            self, req, fp, code, msg, headers)
        return result
class CookieJar:
    """
    A rough cookie handler, intended to only refer to one domain.
    Does no expiry or anything like that.
    (The only reason this is here is so I don't have to require
    the `ClientCookie` package.)
    """
    def __init__(self):
        self._cookies = {}
    def extractCookies(self, headers, nameFilter = None):
        # TODO: Do this all more nicely?
        for cookie in headers.getheaders('Set-Cookie'):
            name, value = (cookie.split("=", 1) + [""])[:2]
            if LG_DEBUG: print "Extracted cookie `%s`" % (name)
            if not nameFilter or name in nameFilter:
                self._cookies[name] = value.split(";")[0]
                if LG_DEBUG: print "Stored cookie `%s` value `%s`" % (name, self._cookies[name])
                if self._cookies[name] == "EXPIRED":
                    if LG_DEBUG:
                        print "We got an expired cookie: %s:%s, deleting." % (name, self._cookies[name])
                    del self._cookies[name]
    def addCookie(self, name, value):
        self._cookies[name] = value
    def setCookies(self, request):
        request.add_header('Cookie',
                           ";".join(["%s=%s" % (k,v)
                                     for k,v in self._cookies.items()]))
def _buildURL(**kwargs):
    return "%s%s" % (URL_GMAIL, urllib.urlencode(kwargs))
def _paramsToMime(params, filenames, files):
    mimeMsg = MIMEMultipart("form-data")
    for name, value in params.iteritems():
        mimeItem = MIMEText(value)
        mimeItem.add_header("Content-Disposition", "form-data", name=name)
        # TODO: Handle this better...?
        for hdr in ['Content-Type','MIME-Version','Content-Transfer-Encoding']:
            del mimeItem[hdr]
        mimeMsg.attach(mimeItem)
    if filenames or files:
        filenames = filenames or []
        files = files or []
        for idx, item in enumerate(filenames + files):
            # TODO: This is messy, tidy it...
            if isinstance(item, str):
                # We assume it's a file path...
                filename = item
                contentType = mimetypes.guess_type(filename)[0]
                payload = open(filename, "rb").read()
            else:
                # We assume it's an `email.Message.Message` instance...
                # TODO: Make more use of the pre-encoded information?
                filename = item.get_filename()
                contentType = item.get_content_type()
                payload = item.get_payload(decode=True)
            if not contentType:
                contentType = "application/octet-stream"
            mimeItem = MIMEBase(*contentType.split("/"))
            mimeItem.add_header("Content-Disposition", "form-data",
                                name="file%s" % idx, filename=filename)
            # TODO: Encode the payload?
            mimeItem.set_payload(payload)
            # TODO: Handle this better...?
            for hdr in ['MIME-Version','Content-Transfer-Encoding']:
                del mimeItem[hdr]
            mimeMsg.attach(mimeItem)
    del mimeMsg['MIME-Version']
    return mimeMsg
class GmailLoginFailure(Exception):
    """
    Raised whenever the login process fails--could be wrong username/password,
    or Gmail service error, for example.
    Extract the error message like this:
    try:
        foobar
    except GmailLoginFailure,e:
        mesg = e.message# or
        print e# uses the __str__
    """
    def __init__(self,message):
        self.message = message
    def __str__(self):
        return repr(self.message)
class GmailAccount:
def __init__(self, name = "", pw = "", state = None, domain = None):
global URL_LOGIN, URL_GMAIL
self.domain = domain
if self.domain:
URL_LOGIN = "https://www.google.com/a/" + self.domain + "/LoginAction"
URL_GMAIL = "http://mail.google.com/a/" + self.domain + "/?"
else:
URL_LOGIN = GMAIL_URL_LOGIN
URL_GMAIL = GMAIL_URL_GMAIL
if name and pw:
self.name = name
self._pw = pw
self._cookieJar = CookieJar()
if PROXY_URL is not None:
import gmail_transport
self.opener = urllib2.build_opener(gmail_transport.ConnectHTTPHandler(proxy = PROXY_URL),
gmail_transport.ConnectHTTPSHandler(proxy = PROXY_URL),
SmartRedirectHandler(self._cookieJar))
else:
self.opener = urllib2.build_opener(
urllib2.HTTPHandler(debuglevel=0),
urllib2.HTTPSHandler(debuglevel=0),
SmartRedirectHandler(self._cookieJar))
elif state:
# TODO: Check for stale state cookies?
self.name, self._cookieJar = state.state
else:
raise ValueError("GmailAccount must be instantiated with " \
"either GmailSessionState object or name " \
"and password.")
self._cachedQuotaInfo = None
self._cachedLabelNames = None
def login(self):
# TODO: Throw exception if we were instantiated with state?
if self.domain:
data = urllib.urlencode({'continue': URL_GMAIL,
'at' : 'null',
'service' : 'mail',
'userName': self.name,
'password': self._pw,
else:
data = urllib.urlencode({'continue': URL_GMAIL,
'Email': self.name,
'Passwd': self._pw,
headers = {'Host': 'www.google.com',
'User-Agent': 'Mozilla/5.0 (Compatible; libgmail-python)'}
req = urllib2.Request(URL_LOGIN, data=data, headers=headers)
pageData = self._retrievePage(req)
if not self.domain:
# The GV cookie no longer comes in this page for
# "Apps", so this bottom portion is unnecessary for it.
# This requests the page that provides the required "GV" cookie.
RE_PAGE_REDIRECT = 'CheckCookie\?continue=([^"\']+)'
# TODO: Catch more failure exceptions here...?
try:
link = re.search(RE_PAGE_REDIRECT, pageData).group(1)
redirectURL = urllib2.unquote(link)
redirectURL = redirectURL.replace('\\x26', '&')
except AttributeError:
raise GmailLoginFailure("Login failed. (Wrong username/password?)")
# We aren't concerned with the actual content of this page,
# just the cookie that is returned with it.
pageData = self._retrievePage(redirectURL)
def _retrievePage(self, urlOrRequest):
if self.opener is None:
raise RuntimeError("Cannot find urlopener")
if not isinstance(urlOrRequest, urllib2.Request):
req = urllib2.Request(urlOrRequest)
else:
req = urlOrRequest
self._cookieJar.setCookies(req)
req.add_header('User-Agent',
'Mozilla/5.0 (Compatible; libgmail-python)')
try:
resp = self.opener.open(req)
except urllib2.HTTPError,info:
print info
return None
pageData = resp.read()
# Extract cookies here
self._cookieJar.extractCookies(resp.headers)
# TODO: Enable logging of page data for debugging purposes?
return pageData
def _parsePage(self, urlOrRequest):
"""
Retrieve & then parse the requested page content.
"""
items = _parsePage(self._retrievePage(urlOrRequest))
# Automatically cache some things like quota usage.
# TODO: Cache more?
# TODO: Expire cached values?
# TODO: Do this better.
try:
self._cachedQuotaInfo = items[D_QUOTA]
except KeyError:
pass
#pprint.pprint(items)
try:
self._cachedLabelNames = [category[CT_NAME] for category in items[D_CATEGORIES][0]]
except KeyError:
pass
return items
def _parseSearchResult(self, searchType, start = 0, **kwargs):
params = {U_SEARCH: searchType,
U_START: start,
U_VIEW: U_THREADLIST_VIEW,
}
params.update(kwargs)
return self._parsePage(_buildURL(**params))
def _parseThreadSearch(self, searchType, allPages = False, **kwargs):
"""
Only works for thread-based results at present. # TODO: Change this?
"""
start = 0
tot = 0
threadsInfo = []
# Option to get *all* threads if multiple pages are used.
while (start == 0) or (allPages and
len(threadsInfo) < threadListSummary[TS_TOTAL]):
items = self._parseSearchResult(searchType, start, **kwargs)
#TODO: Handle single & zero result case better? Does this work?
try:
threads = items[D_THREAD]
except KeyError:
break
else:
for th in threads:
if not type(th[0]) is types.ListType:
th = [th]
threadsInfo.append(th)
# TODO: Check if the total or per-page values have changed?
threadListSummary = items[D_THREADLIST_SUMMARY][0]
threadsPerPage = threadListSummary[TS_NUM]
start += threadsPerPage
# TODO: Record whether or not we retrieved all pages..?
return GmailSearchResult(self, (searchType, kwargs), threadsInfo)
def _retrieveJavascript(self, version = ""):
"""
Note: `version` seems to be ignored.
"""
return self._retrievePage(_buildURL(view = U_PAGE_VIEW,
name = "js",
ver = version))
def getMessagesByFolder(self, folderName, allPages = False):
"""
Folders contain conversation/message threads.
`folderName` -- As set in Gmail interface.
Returns a `GmailSearchResult` instance.
*** TODO: Change all "getMessagesByX" to "getThreadsByX"? ***
"""
return self._parseThreadSearch(folderName, allPages = allPages)
def getMessagesByQuery(self, query, allPages = False):
"""
Returns a `GmailSearchResult` instance.
"""
return self._parseThreadSearch(U_QUERY_SEARCH, q = query,
allPages = allPages)
def getQuotaInfo(self, refresh = False):
"""
Return MB used, Total MB and percentage used.
"""
# TODO: Change this to a property.
if not self._cachedQuotaInfo or refresh:
# TODO: Handle this better...
self.getMessagesByFolder(U_INBOX_SEARCH)
return self._cachedQuotaInfo[0][:3]
def getLabelNames(self, refresh = False):
# TODO: Change this to a property?
if not self._cachedLabelNames or refresh:
# TODO: Handle this better...
self.getMessagesByFolder(U_INBOX_SEARCH)
return self._cachedLabelNames
def getMessagesByLabel(self, label, allPages = False):
return self._parseThreadSearch(U_CATEGORY_SEARCH,
cat=label, allPages = allPages)
def getRawMessage(self, msgId):
# U_ORIGINAL_MESSAGE_VIEW seems the only one that returns a page.
# All the other U_* results in a 404 exception. Stas
PageView = U_ORIGINAL_MESSAGE_VIEW
return self._retrievePage(
_buildURL(view=PageView, th=msgId))
def getUnreadMessages(self):
return self._parseThreadSearch(U_QUERY_SEARCH,
q = "is:" + U_AS_SUBSET_UNREAD)
def getUnreadMsgCount(self):
items = self._parseSearchResult(U_QUERY_SEARCH,
q = "is:" + U_AS_SUBSET_UNREAD)
try:
result = items[D_THREADLIST_SUMMARY][0][TS_TOTAL_MSGS]
except KeyError:
result = 0
return result
def _getActionToken(self):
try:
at = self._cookieJar._cookies[ACTION_TOKEN_COOKIE]
except KeyError:
self.getLabelNames(True)
at = self._cookieJar._cookies[ACTION_TOKEN_COOKIE]
return at
def sendMessage(self, msg, asDraft = False, _extraParams = None):
"""
`msg` -- `GmailComposedMessage` instance.
`_extraParams` -- Dictionary containing additional parameters
to put into POST message. (Not officially
for external use, more to make feature
addition a little easier to play with.)
Note: Now returns `GmailMessageStub` instance with populated
`id` (and `_account`) fields on success; raises `GmailSendError`
on failure.
"""
# TODO: Handle drafts separately?
params = {U_VIEW: [U_SENDMAIL_VIEW, U_SAVEDRAFT_VIEW][asDraft],
U_REFERENCED_MSG: "",
U_THREAD: "",
U_DRAFT_MSG: "",
U_COMPOSEID: "1",
U_ACTION_TOKEN: self._getActionToken(),
U_COMPOSE_TO: msg.to,
U_COMPOSE_CC: msg.cc,
U_COMPOSE_BCC: msg.bcc,
"subject": msg.subject,
"msgbody": msg.body,
}
if _extraParams:
params.update(_extraParams)
# Amongst other things, I used the following post to work out this:
# <http://groups.google.com/groups?
# selm=mailman.1047080233.20095.python-list%40python.org>
mimeMessage = _paramsToMime(params, msg.filenames, msg.files)
#### TODO: Ughh, tidy all this up & do it better...
## This horrible mess is here for two main reasons:
## 1. The `Content-Type` header (which also contains the boundary
## marker) needs to be extracted from the MIME message so
## we can send it as the request `Content-Type` header instead.
## 2. It seems the form submission needs to use "\r\n" for new
## lines instead of the "\n" returned by `as_string()`.
## I tried changing the value of `NL` used by the `Generator` class
## but it didn't work so I'm doing it this way until I figure
## out how to do it properly. Of course, first try, if the payloads
## contained "\n" sequences they got replaced too, which corrupted
## the attachments. I could probably encode the submission,
## which would probably be nicer, but in the meantime I'm kludging
## this workaround that replaces all non-text payloads with a
## marker, changes all "\n" to "\r\n" and finally replaces the
## markers with the original payloads.
## Yeah, I know, it's horrible, but hey it works doesn't it? If you've
## got a problem with it, fix it yourself & give me the patch!
origPayloads = {}
FMT_MARKER = "&&&&&&%s&&&&&&"
for i, m in enumerate(mimeMessage.get_payload()):
if not isinstance(m, MIMEText): #Do we care if we change text ones?
origPayloads[i] = m.get_payload()
m.set_payload(FMT_MARKER % i)
mimeMessage.epilogue = ""
msgStr = mimeMessage.as_string()
contentTypeHeader, data = msgStr.split("\n\n", 1)
contentTypeHeader = contentTypeHeader.split(":", 1)
data = data.replace("\n", "\r\n")
for k,v in origPayloads.iteritems():
data = data.replace(FMT_MARKER % k, v)
req = urllib2.Request(_buildURL(), data = data)
req.add_header(*contentTypeHeader)
items = self._parsePage(req)
# TODO: Check composeid?
# Sometimes we get the success message
# but the id is 0 and no message is sent
result = None
resultInfo = items[D_SENDMAIL_RESULT][0]
if resultInfo[SM_SUCCESS]:
result = GmailMessageStub(id = resultInfo[SM_NEWTHREADID],
_account = self)
else:
raise GmailSendError, resultInfo[SM_MSG]
return result
def trashMessage(self, msg):
# TODO: Decide if we should make this a method of `GmailMessage`.
# TODO: Should we check we have been given a `GmailMessage` instance?
params = {
U_ACTION: U_DELETEMESSAGE_ACTION,
U_ACTION_MESSAGE: msg.id,
U_ACTION_TOKEN: self._getActionToken(),
}
items = self._parsePage(_buildURL(**params))
# TODO: Mark as trashed on success?
return (items[D_ACTION_RESULT][0][AR_SUCCESS] == 1)
def _doThreadAction(self, actionId, thread):
# TODO: Decide if we should make this a method of `GmailThread`.
# TODO: Should we check we have been given a `GmailThread` instance?
params = {
U_SEARCH: U_ALL_SEARCH, #TODO:Check this search value always works.
U_VIEW: U_UPDATE_VIEW,
U_ACTION: actionId,
U_ACTION_THREAD: thread.id,
U_ACTION_TOKEN: self._getActionToken(),
}
items = self._parsePage(_buildURL(**params))
return (items[D_ACTION_RESULT][0][AR_SUCCESS] == 1)
def trashThread(self, thread):
# TODO: Decide if we should make this a method of `GmailThread`.
# TODO: Should we check we have been given a `GmailThread` instance?
result = self._doThreadAction(U_MARKTRASH_ACTION, thread)
# TODO: Mark as trashed on success?
return result
def _createUpdateRequest(self, actionId): #extraData):
"""
Helper method to create a Request instance for an update (view)
action.
Returns populated `Request` instance.
"""
params = {
U_VIEW: U_UPDATE_VIEW,
}
data = {
U_ACTION: actionId,
U_ACTION_TOKEN: self._getActionToken(),
}
#data.update(extraData)
req = urllib2.Request(_buildURL(**params),
data = urllib.urlencode(data))
return req
# TODO: Extract additional common code from handling of labels?
def createLabel(self, labelName):
req = self._createUpdateRequest(U_CREATECATEGORY_ACTION + labelName)
# Note: Label name cache is updated by this call as well. (Handy!)
items = self._parsePage(req)
print items
return (items[D_ACTION_RESULT][0][AR_SUCCESS] == 1)
def deleteLabel(self, labelName):
# TODO: Check labelName exists?
req = self._createUpdateRequest(U_DELETECATEGORY_ACTION + labelName)
# Note: Label name cache is updated by this call as well. (Handy!)
items = self._parsePage(req)
return (items[D_ACTION_RESULT][0][AR_SUCCESS] == 1)
def renameLabel(self, oldLabelName, newLabelName):
# TODO: Check oldLabelName exists?
req = self._createUpdateRequest("%s%s^%s" % (U_RENAMECATEGORY_ACTION,
oldLabelName, newLabelName))
# Note: Label name cache is updated by this call as well. (Handy!)
items = self._parsePage(req)
return (items[D_ACTION_RESULT][0][AR_SUCCESS] == 1)
def storeFile(self, filename, label = None):
# TODO: Handle files larger than single attachment size.
# TODO: Allow file data objects to be supplied?
FILE_STORE_VERSION = "FSV_01"
FILE_STORE_SUBJECT_TEMPLATE = "%s %s" % (FILE_STORE_VERSION, "%s")
subject = FILE_STORE_SUBJECT_TEMPLATE % os.path.basename(filename)
msg = GmailComposedMessage(to="", subject=subject, body="",
filenames=[filename])
draftMsg = self.sendMessage(msg, asDraft = True)
if draftMsg and label:
draftMsg.addLabel(label)
return draftMsg
## CONTACTS SUPPORT
def getContacts(self):
"""
Returns a GmailContactList object
that has all the contacts in it as
GmailContacts
"""
contactList = []
# pnl = a is necessary to get *all* contacts
myUrl = _buildURL(view='cl',search='contacts', pnl='a')
myData = self._parsePage(myUrl)
# This comes back with a dictionary
# with entry 'cl'
addresses = myData['cl']
for entry in addresses:
if len(entry) >= 6 and entry[0]=='ce':
newGmailContact = GmailContact(entry[1], entry[2], entry[4], entry[5])
#### new code used to get all the notes
#### not used yet due to lockdown problems
##rawnotes = self._getSpecInfo(entry[1])
##print rawnotes
##newGmailContact = GmailContact(entry[1], entry[2], entry[4],rawnotes)
contactList.append(newGmailContact)
return GmailContactList(contactList)
def addContact(self, myContact, *extra_args):
"""
Attempts to add a GmailContact to the gmail
address book. Returns true if successful,
false otherwise
Please note that after version 0.1.3.3,
addContact takes one argument of type
GmailContact, the contact to add.
The old signature of:
addContact(name, email, notes='') is still
supported, but deprecated.
"""
if len(extra_args) > 0:
# The user has passed in extra arguments
# He/she is probably trying to invoke addContact
# using the old, deprecated signature of:
# addContact(self, name, email, notes='')
# Build a GmailContact object and use that instead
(name, email) = (myContact, extra_args[0])
if len(extra_args) > 1:
notes = extra_args[1]
else:
notes = ''
myContact = GmailContact(-1, name, email, notes)
# TODO: In the ideal world, we'd extract these specific
# constants into a nice constants file
# This mostly comes from the Johnvey Gmail API,
# but also from the gmail.py cited earlier
myURL = _buildURL(view='up')
myDataList = [ ('act','ec'),
('at', self._cookieJar._cookies['GMAIL_AT']), # Cookie data?
('ct_nm', myContact.getName()),
('ct_em', myContact.getEmail()),
('ct_id', -1 )
]
notes = myContact.getNotes()
if notes != '':
myDataList.append( ('ctf_n', notes) )
validinfokeys = [
'i', # IM
'p', # Phone
'd', # Company
'a', # ADR
'e', # Email
'm', # Mobile
'b', # Pager
'f', # Fax
't', # Title
'o', # Other
]
moreInfo = myContact.getMoreInfo()
ctsn_num = -1
if moreInfo != {}:
for ctsf,ctsf_data in moreInfo.items():
ctsn_num += 1
# data section header, WORK, HOME,...
sectionenum ='ctsn_%02d' % ctsn_num
myDataList.append( ( sectionenum, ctsf ))
ctsf_num = -1
if isinstance(ctsf_data[0],str):
ctsf_num += 1
# data section
subsectionenum = 'ctsf_%02d_%02d_%s' % (ctsn_num, ctsf_num, ctsf_data[0]) # ie. ctsf_00_01_p
myDataList.append( (subsectionenum, ctsf_data[1]) )
else:
for info in ctsf_data:
if validinfokeys.count(info[0]) > 0:
ctsf_num += 1
# data section
subsectionenum = 'ctsf_%02d_%02d_%s' % (ctsn_num, ctsf_num, info[0]) # ie. ctsf_00_01_p
myDataList.append( (subsectionenum, info[1]) )
myData = urllib.urlencode(myDataList)
request = urllib2.Request(myURL,
data = myData)
pageData = self._retrievePage(request)
if pageData.find("The contact was successfully added") == -1:
print pageData
if pageData.find("already has the email address") != -1:
raise Exception("Someone with same email already exists in Gmail.")
elif pageData.find("https://www.google.com/accounts/ServiceLogin") != -1:
raise Exception("Login has expired.")
return False
else:
return True
def _removeContactById(self, id):
"""
Attempts to remove the contact that occupies
id "id" from the gmail address book.
Returns True if successful,
False otherwise.
This is a little dangerous since you don't really
know who you're deleting. Really,
this should return the name or something of the
person we just killed.
Don't call this method.
You should be using removeContact instead.
"""
myURL = _buildURL(search='contacts', ct_id = id, c=id, act='dc', at=self._cookieJar._cookies['GMAIL_AT'], view='up')
pageData = self._retrievePage(myURL)
if pageData.find("The contact has been deleted") == -1:
return False
else:
return True
def removeContact(self, gmailContact):
"""
Attempts to remove the GmailContact passed in
Returns True if successful, False otherwise.
"""
# Let's re-fetch the contact list to make
# sure we're really deleting the guy
# we think we're deleting
newContactList = self.getContacts()
newVersionOfPersonToDelete = newContactList.getContactById(gmailContact.getId())
# Ok, now we need to ensure that gmailContact
# is the same as newVersionOfPersonToDelete
# and then we can go ahead and delete him/her
if (gmailContact == newVersionOfPersonToDelete):
return self._removeContactById(gmailContact.getId())
else:
# We have a cache coherency problem -- someone
# else now occupies this ID slot.
# TODO: Perhaps signal this in some nice way
# to the end user?
print "Unable to delete."
print "Has someone else been modifying the contacts list while we have?"
print "Old version of person:",gmailContact
print "New version of person:",newVersionOfPersonToDelete
return False
## Don't remove this. contact stas
## def _getSpecInfo(self,id):
## Return all the notes data.
## This is currently not used due to the fact that it requests pages in
## a dos attack manner.
## myURL =_buildURL(search='contacts',ct_id=id,c=id,\
## at=self._cookieJar._cookies['GMAIL_AT'],view='ct')
## pageData = self._retrievePage(myURL)
## myData = self._parsePage(myURL)
## #print "\nmyData form _getSpecInfo\n",myData
## rawnotes = myData['cov'][7]
## return rawnotes
class GmailContact:
"""
Class for storing a Gmail Contacts list entry
"""
def __init__(self, name, email, *extra_args):
"""
Returns a new GmailContact object
(you can then call addContact on this to commit
it to the Gmail addressbook, for example)
Consider calling setNotes() and setMoreInfo()
to add extended information to this contact
"""
# Support populating other fields if we're trying
# to invoke this the old way, with the old constructor
# whose signature was __init__(self, id, name, email, notes='')
id = -1
notes = ''
if len(extra_args) > 0:
(id, name) = (name, email)
email = extra_args[0]
if len(extra_args) > 1:
notes = extra_args[1]
else:
notes = ''
self.id = id
self.name = name
self.email = email
self.notes = notes
self.moreInfo = {}
def __str__(self):
return "%s %s %s %s" % (self.id, self.name, self.email, self.notes)
def __eq__(self, other):
if not isinstance(other, GmailContact):
return False
return (self.getId() == other.getId()) and \
(self.getName() == other.getName()) and \
(self.getEmail() == other.getEmail()) and \
(self.getNotes() == other.getNotes())
def getId(self):
return self.id
def getName(self):
return self.name
def getEmail(self):
return self.email
def getNotes(self):
return self.notes
def setNotes(self, notes):
"""
Sets the notes field for this GmailContact
Note that this does NOT change the note
field on Gmail's end; only adding or removing
contacts modifies them
"""
self.notes = notes
def getMoreInfo(self):
return self.moreInfo
def setMoreInfo(self, moreInfo):
"""
moreInfo format
Use special key values::
'i' = IM
'p' = Phone
'd' = Company
'a' = ADR
'e' = Email
'm' = Mobile
'b' = Pager
'f' = Fax
't' = Title
'o' = Other
Simple example::
moreInfo = {'Home': ( ('a','852 W Barry'),
('p', '1-773-244-1980'),
('i', 'aim:brianray34') ) }
Complex example::
moreInfo = {
'Personal': (('e', 'Home Email'),
('f', 'Home Fax')),
'Work': (('d', 'Sample Company'),
('t', 'Job Title'),
('o', 'Department: Department1'),
('o', 'Department: Department2'),
('p', 'Work Phone'),
('m', 'Mobile Phone'),
('f', 'Work Fax'),
('b', 'Pager')) }
"""
self.moreInfo = moreInfo
def getVCard(self):
"""Returns a vCard 3.0 for this
contact, as a string"""
# The \r is to comply with RFC 2425, section 5.8.1
vcard = "BEGIN:VCARD\r\n"
vcard += "VERSION:3.0\r\n"
## Deal with multiline notes
##vcard += "NOTE:%s\n" % self.getNotes().replace("\n","\\n")
vcard += "NOTE:%s\r\n" % self.getNotes()
# Fake-out N by splitting up whatever we get out of getName
# This might not always do 'the right thing'
# but it's a *reasonable* compromise
fullname = self.getName().split()
fullname.reverse()
vcard += "N:%s" % ';'.join(fullname) + "\r\n"
vcard += "FN:%s\r\n" % self.getName()
vcard += "EMAIL;TYPE=INTERNET:%s\r\n" % self.getEmail()
vcard += "END:VCARD\r\n\r\n"
# Final newline in case we want to put more than one in a file
return vcard
class GmailContactList:
"""
Class for storing an entire Gmail contacts list
and retrieving contacts by Id, Email address, and name
"""
def __init__(self, contactList):
self.contactList = contactList
def __str__(self):
return '\n'.join([str(item) for item in self.contactList])
def getCount(self):
"""
Returns number of contacts
"""
return len(self.contactList)
def getAllContacts(self):
"""
Returns an array of all the
GmailContacts
"""
return self.contactList
def getContactByName(self, name):
"""
Gets the first contact in the
address book whose name is 'name'.
Returns False if no contact
could be found
"""
nameList = self.getContactListByName(name)
if len(nameList) > 0:
return nameList[0]
else:
return False
def getContactByEmail(self, email):
"""
Gets the first contact in the
address book whose email is 'email'.
As of this writing, Gmail insists
upon a unique email; i.e. two contacts
cannot share an email address.
Returns False if no contact
could be found
"""
emailList = self.getContactListByEmail(email)
if len(emailList) > 0:
return emailList[0]
else:
return False
def getContactById(self, myId):
"""
Gets the first contact in the
address book whose id is 'myId'.
REMEMBER: ID IS A STRING
Returns False if no contact
could be found
"""
idList = self.getContactListById(myId)
if len(idList) > 0:
return idList[0]
else:
return False
def getContactListByName(self, name):
"""
This function returns a LIST
of GmailContacts whose name is
'name'.
Returns an empty list if no contacts
were found
"""
nameList = []
for entry in self.contactList:
if entry.getName() == name:
nameList.append(entry)
return nameList
def getContactListByEmail(self, email):
"""
This function returns a LIST
of GmailContacts whose email is
'email'. As of this writing, two contacts
cannot share an email address, so this
should only return just one item.
But it doesn't hurt to be prepared?
Returns an empty list if no contacts
were found
"""
emailList = []
for entry in self.contactList:
if entry.getEmail() == email:
emailList.append(entry)
return emailList
def getContactListById(self, myId):
"""
This function returns a LIST
of GmailContacts whose id is
'myId'. We expect there only to
be one, but just in case!
Remember: ID IS A STRING
Returns an empty list if no contacts
were found
"""
idList = []
for entry in self.contactList:
if entry.getId() == myId:
idList.append(entry)
return idList
def search(self, searchTerm):
"""
This function returns a LIST
of GmailContacts whose name or
email address matches the 'searchTerm'.
Returns an empty list if no matches
were found.
"""
searchResults = []
for entry in self.contactList:
p = re.compile(searchTerm, re.IGNORECASE)
if p.search(entry.getName()) or p.search(entry.getEmail()):
searchResults.append(entry)
return searchResults
class GmailSearchResult:
def __init__(self, account, search, threadsInfo):
"""
`threadsInfo` -- As returned from Gmail but unbunched.
"""
#print "\nthreadsInfo\n",threadsInfo
try:
if not type(threadsInfo[0]) is types.ListType:
threadsInfo = [threadsInfo]
except IndexError:
print "No messages found"
self._account = account
self.search = search # TODO: Turn into object + format nicely.
self._threads = []
for thread in threadsInfo:
self._threads.append(GmailThread(self, thread[0]))
def __iter__(self):
return iter(self._threads)
def __len__(self):
return len(self._threads)
def __getitem__(self,key):
return self._threads.__getitem__(key)
class GmailSessionState:
def __init__(self, account = None, filename = ""):
if account:
self.state = (account.name, account._cookieJar)
elif filename:
self.state = load(open(filename, "rb"))
else:
raise ValueError("GmailSessionState must be instantiated with " \
"either GmailAccount object or filename.")
def save(self, filename):
dump(self.state, open(filename, "wb"), -1)
class _LabelHandlerMixin(object):
"""
Note: Because a message id can be used as a thread id this works for
messages as well as threads.
"""
def __init__(self):
self._labels = None
def _makeLabelList(self, labelList):
self._labels = labelList
def addLabel(self, labelName):
# Note: It appears this also automatically creates new labels.
result = self._account._doThreadAction(U_ADDCATEGORY_ACTION+labelName,
self)
if not self._labels:
self._makeLabelList([])
# TODO: Caching this seems a little dangerous; suppress duplicates maybe?
self._labels.append(labelName)
return result
def removeLabel(self, labelName):
# TODO: Check label is already attached?
# Note: An error is not generated if the label is not already attached.
result = \
self._account._doThreadAction(U_REMOVECATEGORY_ACTION+labelName,
self)
removeLabel = True
try:
self._labels.remove(labelName)
except:
removeLabel = False
pass
# If we don't check both, we might end up in some weird inconsistent state
return result and removeLabel
def getLabels(self):
return self._labels
class GmailThread(_LabelHandlerMixin):
"""
Note: As far as I can tell, the "canonical" thread id is always the same
as the id of the last message in the thread. But it appears that
the id of any message in the thread can be used to retrieve
the thread information.
"""
def __init__(self, parent, threadsInfo):
_LabelHandlerMixin.__init__(self)
# TODO Handle this better?
self._parent = parent
self._account = self._parent._account
self.id = threadsInfo[T_THREADID] # TODO: Change when canonical updated?
self.subject = threadsInfo[T_SUBJECT_HTML]
self.snippet = threadsInfo[T_SNIPPET_HTML]
#self.extraSummary = threadInfo[T_EXTRA_SNIPPET] #TODO: What is this?
# TODO: Store other info?
# Extract number of messages in thread/conversation.
self._authors = threadsInfo[T_AUTHORS_HTML]
self.info = threadsInfo
try:
# TODO: Find out if this information can be found another way...
# (Without another page request.)
self._length = int(re.search("\((\d+?)\)\Z",
self._authors).group(1))
except AttributeError,info:
# If there's no message count then the thread only has one message.
self._length = 1
# TODO: Store information known about the last message (e.g. id)?
self._messages = []
# Populate labels
self._makeLabelList(threadsInfo[T_CATEGORIES])
def __getattr__(self, name):
"""
Dynamically dispatch some interesting thread properties.
"""
attrs = { 'unread': T_UNREAD,
'star': T_STAR,
'date': T_DATE_HTML,
'authors': T_AUTHORS_HTML,
'flags': T_FLAGS,
'subject': T_SUBJECT_HTML,
'snippet': T_SNIPPET_HTML,
'categories': T_CATEGORIES,
'attach': T_ATTACH_HTML,
'matching_msgid': T_MATCHING_MSGID,
'extra_snippet': T_EXTRA_SNIPPET }
if name in attrs:
return self.info[ attrs[name] ];
raise AttributeError("no attribute %s" % name)
def __len__(self):
return self._length
def __iter__(self):
if not self._messages:
self._messages = self._getMessages(self)
return iter(self._messages)
def __getitem__(self, key):
if not self._messages:
self._messages = self._getMessages(self)
try:
result = self._messages.__getitem__(key)
except IndexError:
result = []
return result
def _getMessages(self, thread):
# TODO: Do this better.
# TODO: Specify the query folder using our specific search?
items = self._account._parseSearchResult(U_QUERY_SEARCH,
view = U_CONVERSATION_VIEW,
th = thread.id,
q = "in:anywhere")
result = []
# TODO: Handle this better?
# Note: This handles both draft & non-draft messages in a thread...
for key, isDraft in [(D_MSGINFO, False), (D_DRAFTINFO, True)]:
try:
msgsInfo = items[key]
except KeyError:
# No messages of this type (e.g. draft or non-draft)
continue
else:
# TODO: Handle special case of only 1 message in thread better?
if type(msgsInfo[0]) != types.ListType:
msgsInfo = [msgsInfo]
for msg in msgsInfo:
result += [GmailMessage(thread, msg, isDraft = isDraft)]
return result
class GmailMessageStub(_LabelHandlerMixin):
"""
Intended to be used where not all message information is known/required.
NOTE: This may go away.
"""
# TODO: Provide way to convert this to a full `GmailMessage` instance
# or allow `GmailMessage` to be created without all info?
def __init__(self, id = None, _account = None):
_LabelHandlerMixin.__init__(self)
self.id = id
self._account = _account
class GmailMessage(object):
def __init__(self, parent, msgData, isDraft = False):
"""
Note: `msgData` can be from either D_MSGINFO or D_DRAFTINFO.
"""
# TODO: Automatically detect if it's a draft or not?
# TODO Handle this better?
self._parent = parent
self._account = self._parent._account
self.author = msgData[MI_AUTHORFIRSTNAME]
self.id = msgData[MI_MSGID]
self.number = msgData[MI_NUM]
self.subject = msgData[MI_SUBJECT]
self.to = msgData[MI_TO]
self.cc = msgData[MI_CC]
self.bcc = msgData[MI_BCC]
self.sender = msgData[MI_AUTHOREMAIL]
self.attachments = [GmailAttachment(self, attachmentInfo)
for attachmentInfo in msgData[MI_ATTACHINFO]]
# TODO: Populate additional fields & cache...(?)
# TODO: Handle body differently if it's from a draft?
self.isDraft = isDraft
self._source = None
def _getSource(self):
if not self._source:
# TODO: Do this more nicely...?
# TODO: Strip initial white space & fix up last line ending
# to make it legal as per RFC?
self._source = self._account.getRawMessage(self.id)
return self._source
source = property(_getSource, doc = "")
class GmailAttachment:
def __init__(self, parent, attachmentInfo):
# TODO Handle this better?
self._parent = parent
self._account = self._parent._account
self.id = attachmentInfo[A_ID]
self.filename = attachmentInfo[A_FILENAME]
self.mimetype = attachmentInfo[A_MIMETYPE]
self.filesize = attachmentInfo[A_FILESIZE]
self._content = None
def _getContent(self):
if not self._content:
# TODO: Do this more nicely...?
self._content = self._account._retrievePage(
_buildURL(view=U_ATTACHMENT_VIEW, disp="attd",
attid=self.id, th=self._parent._parent.id))
return self._content
content = property(_getContent, doc = "")
def _getFullId(self):
"""
Returns the "full path"/"full id" of the attachment. (Used
to refer to the file when forwarding.)
The id is of the form: "<thread_id>_<msg_id>_<attachment_id>"
"""
return "%s_%s_%s" % (self._parent._parent.id,
self._parent.id,
self.id)
_fullId = property(_getFullId, doc = "")
class GmailComposedMessage:
def __init__(self, to, subject, body, cc = None, bcc = None,
filenames = None, files = None):
"""
`filenames` - list of the file paths of the files to attach.
`files` - list of objects implementing sub-set of
`email.Message.Message` interface (`get_filename`,
`get_content_type`, `get_payload`). This is to
allow use of payloads from Message instances.
TODO: Change this to be simpler class we define ourselves?
"""
self.to = to
self.subject = subject
self.body = body
self.cc = cc
self.bcc = bcc
self.filenames = filenames
self.files = files
if __name__ == "__main__":
import sys
from getpass import getpass
try:
name = sys.argv[1]
except IndexError:
name = raw_input("Gmail account name: ")
pw = getpass("Password: ")
domain = raw_input("Domain? [leave blank for Gmail]: ")
ga = GmailAccount(name, pw, domain=domain)
print "\nPlease wait, logging in..."
try:
ga.login()
except GmailLoginFailure,e:
print "\nLogin failed. (%s)" % e.message
else:
print "Login successful.\n"
# TODO: Use properties instead?
quotaInfo = ga.getQuotaInfo()
quotaMbUsed = quotaInfo[QU_SPACEUSED]
quotaMbTotal = quotaInfo[QU_QUOTA]
quotaPercent = quotaInfo[QU_PERCENT]
print "%s of %s used. (%s)\n" % (quotaMbUsed, quotaMbTotal, quotaPercent)
searches = STANDARD_FOLDERS + ga.getLabelNames()
name = None
while 1:
try:
print "Select folder or label to list: (Ctrl-C to exit)"
for optionId, optionName in enumerate(searches):
print " %d. %s" % (optionId, optionName)
while not name:
try:
name = searches[int(raw_input("Choice: "))]
except ValueError,info:
print info
name = None
if name in STANDARD_FOLDERS:
result = ga.getMessagesByFolder(name, True)
else:
result = ga.getMessagesByLabel(name, True)
if not len(result):
print "No threads found in `%s`." % name
break
name = None
tot = len(result)
i = 0
for thread in result:
print "%s messages in thread" % len(thread)
print thread.id, len(thread), thread.subject
for msg in thread:
print "\n ", msg.id, msg.number, msg.author,msg.subject
# Just as an example of other useful things
#print " ", msg.cc, msg.bcc,msg.sender
i += 1
print
print "number of threads:",tot
print "number of messages:",i
except KeyboardInterrupt:
break
print "\n\nDone."
Last edited by Reasons (2008-03-20 01:18:27)
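
The long comment in `sendMessage` above describes a workaround: swap every non-text payload for a marker, convert "\n" to "\r\n" in what is left, then put the original payloads back so attachments are not corrupted. A minimal sketch of just that idea follows. It is written for Python 3 (the pasted library is Python 2), and the function name and the `(is_text, payload)` tuples are hypothetical stand-ins for the MIME sub-messages, not part of libgmail.

```python
# Same marker format as in the pasted code.
FMT_MARKER = "&&&&&&%s&&&&&&"


def to_crlf_keeping_payloads(parts):
    # `parts` is a list of (is_text, payload) tuples; this shape is an
    # assumption made for the sketch, not the library's data structure.
    saved = {}
    chunks = []
    for i, (is_text, payload) in enumerate(parts):
        if is_text:
            chunks.append(payload)
        else:
            # Hide the non-text payload behind a marker so the newline
            # conversion below cannot touch it.
            saved[i] = payload
            chunks.append(FMT_MARKER % i)
    # Convert line endings in everything that is still visible.
    data = "\n".join(chunks).replace("\n", "\r\n")
    # Restore the hidden payloads unchanged.
    for i, payload in saved.items():
        data = data.replace(FMT_MARKER % i, payload)
    return data
```

Two caveats with this approach: text that already contains "\r\n" ends up with "\r\r\n", and a payload that happens to contain the marker string itself would be mangled.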

I thought it might help to post the relevant part of libgmail, starting at line 369, so it's easier to read:
def _retrievePage(self, urlOrRequest):
    if self.opener is None:
        raise RuntimeError("Cannot find urlopener")
    if not isinstance(urlOrRequest, urllib2.Request):
        req = urllib2.Request(urlOrRequest)
    else:
        req = urlOrRequest
    self._cookieJar.setCookies(req)
    req.add_header('User-Agent',
                   'Mozilla/5.0 (Compatible; libgmail-python)')
    try:
        resp = self.opener.open(req)
    except urllib2.HTTPError,info:
        print info
        return None
    pageData = resp.read()
    # Extract cookies here
    self._cookieJar.extractCookies(resp.headers)
    # TODO: Enable logging of page data for debugging purposes?
    return pageData

def _parsePage(self, urlOrRequest):
    """
    Retrieve & then parse the requested page content.
    """
    items = _parsePage(self._retrievePage(urlOrRequest))
    # Automatically cache some things like quota usage.
    # TODO: Cache more?
    # TODO: Expire cached values?
    # TODO: Do this better.
    try:
        self._cachedQuotaInfo = items[D_QUOTA]
    except KeyError:
        pass
    #pprint.pprint(items)
    try:
        self._cachedLabelNames = [category[CT_NAME] for category in items[D_CATEGORIES][0]]
    except KeyError:
        pass
    return items

def _parseSearchResult(self, searchType, start = 0, **kwargs):
    params = {U_SEARCH: searchType,
              U_START: start,
              U_VIEW: U_THREADLIST_VIEW,
              }
    params.update(kwargs)
    return self._parsePage(_buildURL(**params))
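
One thing that stands out in this excerpt: `_retrievePage` prints the `HTTPError` and returns `None`, and the `_parsePage` method hands that result straight to the module-level `_parsePage` parser without checking it. A failed request can therefore surface later as an unrelated-looking error inside the parser. The sketch below shows one possible guard. It is Python 3, and `PageRetrievalError`, `parse_page` and the injected `retrieve` and `parse` callables are hypothetical names used only for illustration, not libgmail's API.

```python
class PageRetrievalError(Exception):
    # Hypothetical exception type: the page could not be fetched.
    pass


def parse_page(retrieve, parse, url_or_request):
    # `retrieve` plays the role of _retrievePage (it may return None),
    # `parse` plays the role of the module-level parser.
    page_data = retrieve(url_or_request)
    if page_data is None:
        # Fail here, with a message that names the request, instead of
        # letting the parser choke on None further down.
        raise PageRetrievalError("no page data for %r" % (url_or_request,))
    return parse(page_data)
```

With a guard like this, the caller sees which request failed rather than a parsing error several frames away.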
