Combining Graph Traversal with Powerful Graph Analytics

The property graph feature of Oracle Big Data Spatial and Graph has two important components: the data access layer and the in-memory analyst. The data access layer lets you store, manage, index, query, and traverse property graph data in a horizontally scalable database (Apache HBase or Oracle NoSQL Database). The in-memory analyst offers a rich set of out-of-the-box graph analytics and graph operations. Together, the two components provide a solid framework for building graph-based applications.

In this post, I demonstrate how graph traversal, an important function of the data access layer, can be combined with graph analytics.

Setup

If you haven't already, download Oracle Big Data Lite Virtual Machine v4.4.0 (or newer) from the following page.
http://www.oracle.com/technetwork/database/bigdata-appliance/oracle-bigdatalite-2104726.html

Retrieve the latest property graph Hands-on-Lab/Demo scripts

- Log in to the Big Data Lite 4.4.0 VM

- Click the Refresh Samples icon on the desktop, follow the instructions, and download the latest property graph HoL/Demo scripts. (Kudos to Marty Gubar and Nigel Bayliss, who designed this very cool script that automatically fetches the latest content from GitHub!)

- Open the following page using the Firefox browser
file:///home/oracle/src/hol/property_graph_hol_2015_Nov/property_graph_hol_2015_Nov.html

Load example property graph data

- Follow the steps described in 2.3 to 2.4.2 (if you are using Oracle NoSQL Database), or the steps in 4.10 to 4.11 (if you are using Apache HBase), to load an example property graph.

Traverse the graph with Blueprints APIs and Gremlin syntax

In the built-in Groovy shell, you can easily navigate the graph using the Blueprints Java APIs, Gremlin syntax, or both. A few examples follow:

// find a start vertex using Blueprints Java API
opg-nosql> v=opg.getVertex(1l);
==>Vertex ID 1 {country:str:United States, name:str:Barack Obama, occupation:str:44th president of United States of America, political party:str:Democratic, religion:str:Christianity, role:str:political authority}

opg-nosql> din=com.tinkerpop.blueprints.Direction.IN; dout= com.tinkerpop.blueprints.Direction.OUT;
==>OUT

// get in edges (using Java API)
opg-nosql> v.getEdges(din);
==>Edge ID 1078 from Vertex ID 2 {country:str:United States, music genre:str:pop soul , name:str:Beyonce, role:str:singer actress} =[collaborates]=> Vertex ID 1 {country:str:United States, name:str:Barack Obama, occupation:str:44th president of United States of America, political party:str:Democratic, religion:str:Christianity, role:str:political authority} edgeKV[{weight:flo:1.0}]
...

// get out edges (using Gremlin Syntax)
opg-nosql> v.outE
==>Edge ID 1000 from Vertex ID 1 {country:str:United States, name:str:Barack Obama, occupation:str:44th president of United States of America, political party:str:Democratic, religion:str:Christianity, role:str:political authority} =[collaborates]=> Vertex ID 2 {country:str:United States, music genre:str:pop soul , name:str:Beyonce, role:str:singer actress} edgeKV[{weight:flo:1.0}]
...

// follow "collaborates" edges and add a filter on religion
opg-nosql> v.outE('collaborates').inV.filter{it.religion != 'Christianity'}
==>Vertex ID 3 {country:str:United States, name:str:Charlie Rose, role:str:talk show host journalist, show:str:Charlie Rose}
...
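
The last filter keeps Charlie Rose even though that vertex has no religion property at all, which suggests that a missing property counts as "not equal" to 'Christianity'. The following is a minimal plain-Java sketch of that filter semantics only. It does not use the Blueprints or Gremlin APIs; the class name, the helper methods, and the toy vertices v1 to v3 are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a vertex filter; not part of Blueprints or Gremlin.
class PropertyFilterSketch {
    // Build a toy vertex as a property map from alternating key/value arguments.
    static Map<String, String> vertex(String... keyValues) {
        Map<String, String> props = new HashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            props.put(keyValues[i], keyValues[i + 1]);
        }
        return props;
    }

    // Keep the names of vertices whose property `key` is not equal to `value`.
    // A vertex without the property is kept, because null never equals `value`.
    static List<String> namesWhereNot(List<Map<String, String>> vertices,
                                      String key, String value) {
        List<String> names = new ArrayList<>();
        for (Map<String, String> v : vertices) {
            if (!value.equals(v.get(key))) {
                names.add(v.get("name"));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Map<String, String>> neighbours = new ArrayList<>();
        neighbours.add(vertex("name", "v1", "religion", "Christianity"));
        neighbours.add(vertex("name", "v2", "religion", "Buddhism"));
        neighbours.add(vertex("name", "v3")); // no religion property at all
        System.out.println(namesWhereNot(neighbours, "religion", "Christianity"));
    }
}
```

In this toy data, v2 and v3 pass the filter and v1 does not. If vertices that lack the property should be excluded instead, the predicate would need an explicit null check.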

Use PipeFunction to combine Gremlin traversal and in-memory analysis

The following script creates a session and an in-memory analyst, computes the PageRank value of each vertex, and then starts a simple Gremlin traversal from the vertex with ID 1, limiting the visited vertices to those whose PageRank value is above a threshold.

// Create in-memory analytics session and analyst
session=Pgx.createSession("session_ID_1");
analyst=session.createAnalyst();

// Read the graph from database into memory
pgxGraph = session.readGraphWithProperties(opg.getConfig());

// Execute PageRank (arguments: graph, maximum error, damping factor, maximum iterations)
rank=analyst.pagerank(pgxGraph, 0.00000001, 0.85, 5000);

import com.tinkerpop.gremlin.java.*;
import com.tinkerpop.pipes.*;

opg-nosql> pipe = new GremlinPipeline(opg.getVertex(1).out("collaborates").filter(new PipeFunction<Vertex, Boolean>() { public Boolean compute(Vertex v) { return rank.get(v.getId()) > 0.01; } }));

// Traversal results shown below
==>Vertex ID 2 {country:str:United States, music genre:str:pop soul , name:str:Beyonce, role:str:singer actress}
...

The important part of the above traversal is that it includes a TinkerPop PipeFunction implementation which, upon receiving a vertex from the traversal, looks up the analytical result for that vertex (from a parallel in-memory PageRank computation) and uses that information to guide the traversal.
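
To make the pattern concrete without the VM, here is a minimal self-contained Java sketch of the same idea: compute a rank for every vertex first, then let a traversal step consult that result as its filter. It is not the PGX or Gremlin implementation. The class, the four-vertex toy graph, the power-iteration PageRank, and the 0.3 threshold are all assumptions made for illustration, and the stopping rule (sum of absolute rank changes below the maximum error) may differ from the one PGX uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for "analytics result guides traversal"; not PGX or Gremlin.
class RankGuidedTraversal {
    // Toy directed graph as vertex ID -> list of out-neighbour IDs.
    static Map<Long, List<Long>> sampleGraph() {
        Map<Long, List<Long>> out = new LinkedHashMap<>();
        out.put(1L, Arrays.asList(2L, 3L));
        out.put(2L, Arrays.asList(1L));
        out.put(3L, Arrays.asList(2L));
        out.put(4L, Arrays.asList(2L));
        return out;
    }

    // Plain power-iteration PageRank. Every edge target must also be a key of `out`.
    static Map<Long, Double> pageRank(Map<Long, List<Long>> out,
                                      double maxError, double damping, int maxIterations) {
        int n = out.size();
        Map<Long, Double> rank = new LinkedHashMap<>();
        for (Long v : out.keySet()) rank.put(v, 1.0 / n);
        for (int i = 0; i < maxIterations; i++) {
            // Rank held by vertices without out-edges is spread evenly over all vertices.
            double dangling = 0.0;
            for (Map.Entry<Long, List<Long>> e : out.entrySet()) {
                if (e.getValue().isEmpty()) dangling += rank.get(e.getKey());
            }
            Map<Long, Double> next = new LinkedHashMap<>();
            for (Long v : out.keySet()) {
                next.put(v, (1.0 - damping) / n + damping * dangling / n);
            }
            // Each vertex passes an equal share of its damped rank along every out-edge.
            for (Map.Entry<Long, List<Long>> e : out.entrySet()) {
                for (Long target : e.getValue()) {
                    double share = damping * rank.get(e.getKey()) / e.getValue().size();
                    next.merge(target, share, Double::sum);
                }
            }
            double change = 0.0;
            for (Long v : out.keySet()) change += Math.abs(next.get(v) - rank.get(v));
            rank = next;
            if (change < maxError) break;
        }
        return rank;
    }

    // One traversal step: out-neighbours of `start` whose rank exceeds `threshold`.
    static List<Long> outAbove(Map<Long, List<Long>> out, Map<Long, Double> rank,
                               long start, double threshold) {
        List<Long> kept = new ArrayList<>();
        for (Long target : out.get(start)) {
            if (rank.get(target) > threshold) kept.add(target);
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<Long, List<Long>> graph = sampleGraph();
        Map<Long, Double> rank = pageRank(graph, 0.00000001, 0.85, 5000);
        System.out.println(outAbove(graph, rank, 1L, 0.3));
    }
}
```

In this toy graph, vertex 1 points to vertices 2 and 3, but only vertex 2 has enough incoming links to clear the threshold, so it is the only vertex the step keeps. The design point is the same as in the PipeFunction above: the ranks are computed once, up front, and the traversal only performs a cheap lookup per vertex.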

Acknowledgement: thanks to Jay Banerjee for his input on this blog post.
