Getting started with Neo4j and Gephi Tool

Dhwanipanjwani
Oct 28, 2021

Neo4j is an open-source, NoSQL, native graph database that provides an ACID-compliant transactional backend for your applications. The source code, written in Java and Scala, is available for free on GitHub, and Neo4j can also be downloaded as a user-friendly desktop application.

What is a Graph Database?

Very simply, a graph database is a database designed to treat the relationships between data as equally important to the data itself. It is intended to hold data without constricting it to a pre-defined model. Instead, the data is stored the way we would first sketch it, showing how each individual entity connects with or relates to the others.

Three main primitives in Neo4j:

1. Nodes: the entities where the data is stored. A node is comparable to a record (row) in a relational database, and a label such as Person groups nodes much as a table groups rows.

2. Relationships: directed, named connections between two nodes.

3. Properties: key-value pairs that can be attached to both nodes and relationships and that hold the actual data. For example, a Person node can have properties like Name and Age.
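To make the three primitives concrete, here is a minimal Python sketch of a property graph. It is only an illustration of the data model, not how Neo4j stores data internally; the class names Node and Relationship, the KNOWS relationship, and the sample people are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity with one or more labels and key-value properties."""
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed connection from one node to another."""
    start: Node
    rel_type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Hypothetical sample data: two Person nodes joined by a KNOWS relationship.
alice = Node({"Person"}, {"name": "Alice", "age": 30})
bob = Node({"Person"}, {"name": "Bob", "age": 35})
knows = Relationship(alice, "KNOWS", bob, {"since": 2015})

# Walk the relationship from its start node to its end node.
print(knows.start.properties["name"], knows.rel_type, knows.end.properties["name"])
```

Note that both the nodes and the relationship carry their own properties, which is what distinguishes this model from a plain edge list.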

Let’s start the demo.

  1. Download Neo4j Desktop and install it.
  2. After the installation, run a first query.

First, I will run a "hello world" query that creates two nodes, named "Neo4j" and "Hello World!", and one relationship of type SAYS between them.

CREATE (database:Database {name:"Neo4j"})-[r:SAYS]->(message:Message {name:"Hello World!"})
RETURN database, message, r

You can see that the query created two nodes and one relationship of type SAYS.

The relationship created just by a simple query

The image below shows the table view of the nodes and relationships.

Table view of nodes & relations

Below, I have created a simple Neo4j project using the Movies dataset provided with Neo4j and run several queries to explore the data. The queries and their output are as follows:

In this database, there are 9 Person nodes, 8 Movie nodes, and 18 relationships in total. Use the command below to count all nodes.

MATCH (n) RETURN count(n)

// Find the labels in the database
CALL db.labels()

// Find the types of relationships between nodes
CALL db.relationshipTypes()

Using this query, we can see how a person can be connected to a movie, for example as its producer or as an actor in it.

// Find the movies released in the 1990s
MATCH (nineties:Movie) WHERE nineties.released >= 1990 AND nineties.released < 2000 RETURN nineties.title
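To show what these queries compute, the following Python sketch mirrors them on a tiny in-memory graph. The data is a small hand-written sample in the spirit of the Movies dataset, not the full dataset, and the dictionary layout is an assumption made for illustration only.

```python
# Hand-written sample in the spirit of the Movies dataset (not the full data).
nodes = {
    1: {"label": "Movie", "title": "The Matrix", "released": 1999},
    2: {"label": "Movie", "title": "Top Gun", "released": 1986},
    3: {"label": "Movie", "title": "A Few Good Men", "released": 1992},
    4: {"label": "Person", "name": "Keanu Reeves"},
    5: {"label": "Person", "name": "Tom Cruise"},
    6: {"label": "Person", "name": "Rob Reiner"},
}
# Each relationship is (start node id, relationship type, end node id).
relationships = [
    (4, "ACTED_IN", 1),
    (5, "ACTED_IN", 2),
    (5, "ACTED_IN", 3),
    (6, "DIRECTED", 3),
]

# Counterpart of: MATCH (n) RETURN count(n)
node_count = len(nodes)

# Counterpart of: CALL db.labels()
labels = sorted({n["label"] for n in nodes.values()})

# Counterpart of: CALL db.relationshipTypes()
rel_types = sorted({rel_type for _, rel_type, _ in relationships})

# Counterpart of the 1990s query: 1990 is included, 2000 is excluded.
nineties = [
    n["title"]
    for n in nodes.values()
    if n["label"] == "Movie" and 1990 <= n["released"] < 2000
]

print(node_count, labels, rel_types, nineties)
```

The half-open range in the last step matches the WHERE clause of the Cypher query: released >= 1990 AND released < 2000.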

Gephi Tool:

Gephi is open-source software for visualizing and analysing large network graphs. It uses a 3D render engine to display graphs in real time and speed up exploration. You can use it to explore, analyse, spatialise, filter, cluster, manipulate and export all types of graphs.

Gephi is a tool for data analysts and scientists keen to explore and understand graphs. The goal is to help them form hypotheses, intuitively discover patterns, and isolate structural singularities or faults during data sourcing.

1. Open Gephi and click New Project. Then choose File->Open and load the dataset of your choice, as shown below. When the dataset loads, Gephi shows the number of nodes and edges it contains as well as the type of the graph. Here, I have chosen the karate.gml dataset.

2. After you click OK, all the nodes and edges are displayed in their initial arrangement.

3. Now we can arrange the data in various layouts. In the left pane, open the Layout section, choose a layout, and click Run. In the image below, I have chosen the Fruchterman Reingold layout.

4. We can color and size nodes based on degree, in-degree, or out-degree. To do so, go to the top of the left panel, choose Nodes->Ranking, and select a color scale for Degree, In-Degree, or Out-Degree. Here, red nodes have a lower degree than white nodes, and the dark grey node has the highest degree.

5. To vary node sizes, select the Size option in the Appearance section of the left pane and set the minimum and maximum node size. I have set Min size to 10 and Max size to 30. In the image below, nodes with a higher degree are drawn larger than nodes with a lower degree; that is, the dark grey nodes have a higher degree than the white and red nodes.

6. Next, we generate a degree distribution graph for Degree, In-Degree, and Out-Degree and get the average degree over all nodes. To generate it, choose the Statistics tab in the right pane and run Average Degree in the Network Overview section.

7. A report is generated, and columns for Degree, In-Degree, and Out-Degree are added to the data table.

To see the data table, select Window->Data Table in the top menu bar. As in the image above, the table now contains Degree, In-Degree, and Out-Degree columns for each node, added by running the Average Degree function.
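The degree statistics and size ranking from the steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Gephi's implementation: the edge list is hypothetical (it is not the karate club data), the average is taken over total degree (Gephi's report may define it differently for directed graphs), and the linear size mapping is assumed to approximate what a Min size of 10 and Max size of 30 do.

```python
from collections import Counter

# Hypothetical directed edge list (source, target); not the karate club data.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "a")]

# Out-degree counts edges leaving a node, in-degree counts edges arriving.
out_degree = Counter(src for src, _ in edges)
in_degree = Counter(dst for _, dst in edges)
nodes = sorted(set(out_degree) | set(in_degree))

# Total degree is the sum of both directions.
degree = {n: in_degree[n] + out_degree[n] for n in nodes}

# Assumption: average over total degree of all nodes.
average_degree = sum(degree.values()) / len(nodes)

# Degree distribution: degree value -> number of nodes with that degree.
distribution = Counter(degree.values())


def scaled_size(value, lo, hi, min_size=10.0, max_size=30.0):
    """Linearly map a degree in [lo, hi] to a size in [min_size, max_size]."""
    if hi == lo:
        return min_size  # all nodes share one degree, so use the minimum size
    return min_size + (value - lo) / (hi - lo) * (max_size - min_size)


lo, hi = min(degree.values()), max(degree.values())
sizes = {n: scaled_size(d, lo, hi) for n, d in degree.items()}

print(degree, average_degree, dict(distribution), sizes)
```

With this mapping, the lowest-degree nodes get size 10, the highest-degree nodes get size 30, and everything in between is interpolated.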

Conclusion:

We have worked through an example of graph analysis of data using Neo4j and the Gephi tool.

Thank you!!!
