PGA
This is a library for making batch requests to the Google Analytics Core Reporting API v3 and extracting data from a Google Analytics property into Python 3 data structures.
Python Google Analytics Library
(Core Reporting API v3 support)
The package uses:
- OAuth 2.0 client or server access to the Google Analytics API (oauth2client==3.0.0), for connecting to Google Analytics
- Google Analytics Core Reporting API v3, for extracting data
- Google Analytics Metadata API, for the integrated dimension and metric reference lookup
- Google Analytics Management API, for getting the Account, Property and View tree
Dependencies:
- Pandas > 0.13.0, for transforming data into a pandas DataFrame object
- Numpy > 1.0.0, for slicing arrays into chunks
- google-api-python-client > 1.5.0, self-explanatory
Recommended usage:
- An interactive Jupyter shell for analyzing data
Installation
- Via pip, with the following command: # sudo pip install pga
The latest versions of Pandas, Numpy and oauth2client are installed automatically as dependencies.
Authentication
First, get a Google client_secret JSON file from the Google API Console.
You may choose either of the following Client ID types:
- Service account client
- Web application
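The two Client ID types produce differently shaped JSON files. As a minimal sketch, the file layout can be used to pick the matching type_of_connection value described in the next section. The helper names `guess_connection_type` and `load_secret` are hypothetical and not part of PGA. The sketch assumes the standard Google file formats: a service account key has a top-level `"type": "service_account"` field, while a web application secret nests its fields under a `"web"` key.

```python
import json


def guess_connection_type(secret: dict) -> str:
    """Map a parsed Google credentials file to 'Server' or 'Client'.

    Hypothetical helper for illustration only; PGA itself expects the
    caller to pass type_of_connection explicitly.
    """
    # Service account key files carry a top-level "type" field.
    if secret.get("type") == "service_account":
        return "Server"
    # Web application client secrets nest everything under "web".
    if "web" in secret:
        return "Client"
    raise ValueError("Unrecognized credentials file layout")


def load_secret(path: str) -> dict:
    """Read the JSON file downloaded from the Google API Console."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Trimmed-down stand-ins for the two layouts (placeholder values).
    service_key = {"type": "service_account", "client_email": "placeholder"}
    web_secret = {"web": {"client_id": "placeholder"}}
    print(guess_connection_type(service_key))
    print(guess_connection_type(web_secret))
```

Checking the layout up front gives a clear error before any network call is attempted with the wrong connection type.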
PGA.init

PGA.init(key_file_location=None,type_of_connection=None,facet_chunk=10,count_day_slice=1)
Constructs the instance and sets the parameters for its basic functionality.
| Parameters: | key_file_location : string. Path to the secret JSON file. |
| | type_of_connection : string. Available values are 'Client' and 'Server'. Use 'Server' for a service account and 'Client' for a web application. |
| | facet_chunk : int, optional. Number of requests per chunk that are executed in parallel. Note that Google Universal Analytics executes at most 10 parallel requests per second; to raise this limit, submit a request to Google through its form. |
| | count_day_slice : int, optional. Number of days per slice when splitting the [start-date, end-date] range of your request. For example, the input {'count_day_slice': 2, 'start_date': '2016-12-01', 'end_date': '2016-12-05'} produces the output [{'start_date': '2016-12-01', 'end_date': '2016-12-02'}, {'start_date': '2016-12-03', 'end_date': '2016-12-04'}, {'start_date': '2016-12-05', 'end_date': '2016-12-05'}]. |
| Returns: | self : the instance itself, with the settings applied. |
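The date slicing controlled by count_day_slice can be sketched in a few lines. This is an illustration of the behavior described in the table, not PGA's actual implementation; the function name `slice_date_range` is hypothetical.

```python
from datetime import date, timedelta


def slice_date_range(start_date: str, end_date: str, count_day_slice: int = 1):
    """Split [start_date, end_date] into consecutive windows of N days.

    Hypothetical helper that mirrors the documented count_day_slice
    example; dates are ISO strings such as '2016-12-01'.
    """
    if count_day_slice < 1:
        raise ValueError("count_day_slice must be at least 1")
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    windows = []
    while start <= end:
        # Each window spans count_day_slice days, clipped to end_date.
        window_end = min(start + timedelta(days=count_day_slice - 1), end)
        windows.append({
            "start_date": start.isoformat(),
            "end_date": window_end.isoformat(),
        })
        start = window_end + timedelta(days=1)
    return windows


if __name__ == "__main__":
    for window in slice_date_range("2016-12-01", "2016-12-05", 2):
        print(window)
```

With the inputs from the table, the last window is shorter than the others because it is clipped to the end date.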
Calling the constructor creates the instance and redirects the client to a browser to authenticate with Google.
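The facet_chunk parameter groups requests so that no more than the allowed number run in parallel. The sketch below shows one way such grouping might look. It uses plain list slicing rather than the NumPy-based slicing named in the dependency list, and the name `chunk_requests` is hypothetical.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunk_requests(requests: Sequence[T], facet_chunk: int = 10) -> Iterator[List[T]]:
    """Yield successive groups of at most facet_chunk requests.

    Hypothetical helper; how PGA schedules each group is not shown here.
    """
    if facet_chunk < 1:
        raise ValueError("facet_chunk must be at least 1")
    for offset in range(0, len(requests), facet_chunk):
        yield list(requests[offset:offset + facet_chunk])


if __name__ == "__main__":
    # 25 placeholder requests split into groups of at most 10.
    placeholder_requests = [f"request-{n}" for n in range(25)]
    for group in chunk_requests(placeholder_requests, facet_chunk=10):
        print(len(group))
```

Keeping each group at or below 10 matches the per-second limit on parallel requests mentioned in the parameter table.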
Request add
Add a request to an already instantiated pga object.

Request.add_settings_request
Request.add_settings_request(**settings_products)
| Parameters: | **settings_products : kwargs. Request settings in the Core Reporting v3 JSON format. For the list of query parameters, see https://developers.google.com/analytics/devguides/reporting/core/v3/reference?hl=ru#q_summary |
| Returns: | self : the instance itself, with the settings applied. |
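As a rough sketch of what the keyword arguments might look like, the block below builds a settings dictionary and checks its keys against the Core Reporting v3 query parameters. The helper `check_settings` is hypothetical and not part of PGA. The underscore spelling of the keys (start_date rather than start-date) is an assumption based on the count_day_slice example earlier in this document, and the view ID is a placeholder.

```python
# Core Reporting v3 query parameters, with hyphens written as underscores
# (assumed spelling, following the start_date / end_date example above).
CORE_V3_PARAMETERS = {
    "ids", "start_date", "end_date", "metrics", "dimensions", "sort",
    "filters", "segment", "samplingLevel", "include_empty_rows",
    "start_index", "max_results", "output",
}
# The v3 API requires these four parameters on every query.
REQUIRED_PARAMETERS = {"ids", "start_date", "end_date", "metrics"}


def check_settings(**settings_products):
    """Validate keyword names before handing them to a request (hypothetical)."""
    unknown = set(settings_products) - CORE_V3_PARAMETERS
    if unknown:
        raise ValueError(f"Unknown query parameters: {sorted(unknown)}")
    missing = REQUIRED_PARAMETERS - set(settings_products)
    if missing:
        raise ValueError(f"Missing required parameters: {sorted(missing)}")
    return dict(settings_products)


if __name__ == "__main__":
    settings = check_settings(
        ids="ga:00000000",          # placeholder view (profile) ID
        start_date="2016-12-01",
        end_date="2016-12-05",
        metrics="ga:sessions",
        dimensions="ga:date",
    )
    print(settings)
```

A dictionary validated this way could then be unpacked into the call, for example as add_settings_request(**settings) on an instantiated object, provided the library accepts the same key names.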
You can later update any query parameter that is already set and then make a new request.
Execute DataFrame
Execute all settings to get a DataFrame

PGA.get_dataframe
PGA.get_dataframe(groupby=True)
| Parameters: | groupby : boolean. Accepted values are True and False. If True, the DataFrame is grouped by all dimensions, dates, and start-index, and every column is cast to the appropriate type based on the Google Analytics Metadata API. If False, the data is not grouped. This option exists so that another library that aggregates and groups data quickly can be used instead, because in some cases the data is too large and grouping is very slow. One such project is http://dask.pydata.org/en/latest/ |
| Returns: | data : pandas.DataFrame object |
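To make the effect of groupby=True more concrete, here is a small pandas sketch of the two steps the table describes: casting metric columns to a numeric type and grouping by the dimension columns. The column names, the sample rows, and the helper `group_rows` are all illustrative assumptions; the actual columns depend on the request, and PGA takes its types from the Metadata API rather than from a hard-coded cast.

```python
import pandas as pd


def group_rows(raw: pd.DataFrame, dimensions: list, metrics: list) -> pd.DataFrame:
    """Cast metric columns to integers and sum them per dimension combination.

    Hypothetical helper; a fixed int cast stands in for the type lookup
    that PGA performs against the Metadata API.
    """
    typed = raw.copy()
    for metric in metrics:
        typed[metric] = typed[metric].astype(int)
    return typed.groupby(dimensions, as_index=False)[metrics].sum()


if __name__ == "__main__":
    # Sample rows as strings, the way an API response typically delivers them.
    raw = pd.DataFrame({
        "date": ["20161201", "20161201", "20161202"],
        "source": ["google", "google", "direct"],
        "sessions": ["10", "5", "7"],
    })
    print(group_rows(raw, ["date", "source"], ["sessions"]))
```

With groupby=False this step is skipped, which leaves the raw rows available for a tool better suited to large data.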
Get pga settings
All settings
Print all current pga settings:
PGA.get_all_settings
PGA.get_all_settings()
| Returns: | all settings : pandas.DataFrame object |
All products
Print all current pga product settings:
PGA.get_all_products
PGA.get_all_products()
| Returns: | all settings : pandas.DataFrame object |
Additional extra apps
ExtraAppsMetaCdm
Lookup through metadata of Google Analytics dimensions and metrics:

ExtraAppsMetaCdm.get_list_cdcm
ExtraAppsMetaCdm.get_list_cdcm(clarify=None)
| Parameters: | clarify : string. The attribute by which dimensions and metrics are selected. |
| Returns: | Table of information : pandas.DataFrame object |
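The kind of lookup this performs can be sketched against a response shaped like the Metadata API column list, where each item has an `id` and an `attributes` object with fields such as `type` and `uiName`. The helper `filter_columns` and the sample data are illustrative assumptions, and how the clarify argument maps onto these attributes is not specified in this document.

```python
def filter_columns(columns_response: dict, column_type: str = None) -> list:
    """Return id, type and uiName for each column, optionally filtered by type.

    Hypothetical helper; column_type is 'DIMENSION' or 'METRIC'.
    """
    rows = []
    for item in columns_response.get("items", []):
        attributes = item.get("attributes", {})
        if column_type is not None and attributes.get("type") != column_type:
            continue
        rows.append({
            "id": item["id"],
            "type": attributes.get("type"),
            "uiName": attributes.get("uiName"),
        })
    return rows


if __name__ == "__main__":
    # Two-item stand-in for a Metadata API column list response.
    sample = {
        "items": [
            {"id": "ga:userType",
             "attributes": {"type": "DIMENSION", "uiName": "User Type"}},
            {"id": "ga:sessions",
             "attributes": {"type": "METRIC", "uiName": "Sessions"}},
        ]
    }
    for row in filter_columns(sample, "METRIC"):
        print(row)
```

The resulting list of dictionaries converts directly into a table, for instance with pandas.DataFrame(rows).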
ExtraAppsManagementAPI
Get the list of Google Universal Analytics objects (Account ID, Property ID, View ID) that you have access to.

PGA.get_all_profile
PGA.get_all_profile()
| Returns: | Table of information with dimensions or metrics: pandas.DataFrame object |
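The Account, Property and View tree can be flattened into one row per view. The sketch below assumes a response shaped like the Management API account summaries list, where accounts contain `webProperties` and those contain `profiles`. Whether PGA uses this particular endpoint is an assumption, the helper `flatten_account_summaries` is hypothetical, and all IDs are placeholders.

```python
def flatten_account_summaries(response: dict) -> list:
    """Turn a nested account/property/view tree into flat rows (hypothetical)."""
    rows = []
    for account in response.get("items", []):
        for web_property in account.get("webProperties", []):
            for view in web_property.get("profiles", []):
                rows.append({
                    "account_id": account["id"],
                    "property_id": web_property["id"],
                    "view_id": view["id"],
                })
    return rows


if __name__ == "__main__":
    # Placeholder tree: one account, one property, two views.
    sample = {
        "items": [{
            "id": "00000",
            "webProperties": [{
                "id": "UA-00000-1",
                "profiles": [{"id": "11111"}, {"id": "22222"}],
            }],
        }]
    }
    for row in flatten_account_summaries(sample):
        print(row)
```

Each row carries the view ID needed for the ids query parameter, prefixed with "ga:" when used in a Core Reporting request.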