Posts

Showing posts from 2017

Using Java API to access Google Search Console Data

This post covers ingesting Google Search Console (GSC) data using the Java API. GSC offers several kinds of reports, for example Search Analytics, Sitemaps, Sites, and URL crawl errors/metrics. The API endpoints are defined at: https://developers.google.com/apis-explorer/?hl=en_US#p/webmasters/v3/

Before you begin setting up your Java environment, make sure you have access to the Google Search Console dashboard and also have admin access to create the service key for API access.

Steps to create a service account for API access:
1. Go to API Manager: https://console.developers.google.com/apis/credentials
2. Select your project from the dropdown list.
3. Click on "Create Credentials" and then select "Service Account Key".
4. Provide a service account name and select "P12" as the key type.
5. Note the email ID for this service account and save the P12 file once you have created the new...
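
To make the Search Analytics report mentioned above more concrete, here is a minimal sketch of how a query request against the Webmasters v3 API could be assembled using only the Java standard library. The class name GscQuerySketch, the example site URL, the dates, and the chosen dimensions are all assumptions for illustration and are not taken from the original post. The sketch only builds the endpoint URL and the JSON body; authenticating with the service account key is deliberately left out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper (not from the original post): builds the endpoint and
// JSON body for a Search Analytics query. It does not send the request and
// does not handle service-account authentication.
public class GscQuerySketch {
    // Base path of the Webmasters v3 REST API.
    static final String BASE = "https://www.googleapis.com/webmasters/v3";

    // The site URL is part of the path, so it must be URL-encoded.
    static String endpoint(String siteUrl) {
        try {
            String encoded = URLEncoder.encode(siteUrl, "UTF-8");
            return BASE + "/sites/" + encoded + "/searchAnalytics/query";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    // Dates are in YYYY-MM-DD format; the dimensions here are an assumed choice.
    static String requestBody(String startDate, String endDate, int rowLimit) {
        return "{\"startDate\":\"" + startDate + "\","
             + "\"endDate\":\"" + endDate + "\","
             + "\"dimensions\":[\"query\",\"page\"],"
             + "\"rowLimit\":" + rowLimit + "}";
    }

    public static void main(String[] args) {
        System.out.println(endpoint("https://www.example.com/"));
        System.out.println(requestBody("2017-06-01", "2017-06-30", 100));
    }
}
```

In a real client, the body would be sent as an HTTP POST to that endpoint with an OAuth 2.0 access token obtained for the service account. Google's Java client library wraps these steps, so this sketch is mainly useful for seeing what goes over the wire.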

Setting Up Eclipse to run Spark using Scala

All the steps given below are written for the Mac version of Eclipse. At the time of writing, these are the versions that I am using:
Platform : MacOS
OS       : 10.12.5 (MacOS Sierra)
Eclipse  : Mars.2 Release (4.5.2), Eclipse Java EE IDE for Web Developers

Disclaimer: I am not describing any best practices here. This post is just to get your Eclipse ready to work with Spark using Scala.

1. Install Eclipse.
2. Install Scala IDE.
    - Go to Help -> Eclipse Marketplace
    - Search for "Scala IDE" and install the "Scala IDE <version>"
    - After successful installation, you should see it listed under the 'Installed' tab.
3. Change the perspective to "Scala".
4. Create a new Scala Project and provide a name.
    File -> New -> Scala Project
5. Create a new Scala Object under the above project and provide a name.
    Right click the package...

Accessing Hbase table via Hive.

Create the HBase table:
hbase(main):> create 'employee', {NAME => 'e', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => '189341712', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '1'}

Load the values into the HBase table:
hbase(main):> put 'employee', 'employee_123', 'e:n' , 'my_name'
hbase(main):> put 'employee', 'employee_123', 'e:id' , '2345687'
hbase(main):> put 'employee', 'employee_123', 'e:l' , 'san_franscisco'
hbase(main):> put 'employee', 'employee_123', 'e:r' , '25'
hbase(main):> put 'employee', 'employee_123', 'e:cd' , ...