Node Classification using Graph Convolutional Neural Network

Node Classification on Cora Dataset in PyTorch using GCN

Renu Khandelwal
5 min read · Jun 29, 2022



Graph Basics and Application of Graph

Graph Representational Learning

Graph Neural Networks: A Deep Neural Network for Graphs

Dataset: This article uses the Cora dataset, which consists of 2708 scientific publications, each classified into one of seven classes. The citation network connecting them consists of 5429 links.

Objective: Use a GCN, implemented with PyTorch Geometric, to classify nodes: predict the subject of a paper from its words and its position in the citation network.

The Graph Convolutional Neural Network (GCN) model is a framework built on spectral graph convolutions, which generalize the convolution operation to non-Euclidean data such as graphs.

GCNs are similar to the convolutions applied to images in that they generalize the convolution operation to graph data. The filter parameters are shared across all locations in the graph. A GCN is built by stacking multiple graph convolutional layers, each followed by a point-wise non-linearity.
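For reference, the layer-wise propagation rule usually associated with GCN (the Kipf and Welling formulation) can be written as below. The symbols are not defined in this article and are introduced here as assumptions: A is the adjacency matrix, I_N the identity matrix, H^{(l)} the node features at layer l, W^{(l)} the shared filter weights of that layer, and \sigma the point-wise non-linearity.

```latex
H^{(l+1)} = \sigma\!\left( \tilde{D}^{-1/2} \, \tilde{A} \, \tilde{D}^{-1/2} \, H^{(l)} \, W^{(l)} \right),
\qquad \tilde{A} = A + I_N,
\qquad \tilde{D}_{ii} = \sum_{j} \tilde{A}_{ij}
```

The same W^{(l)} multiplies every node's features, which is what "shared over all locations" means, and stacking layers corresponds to applying this rule repeatedly with H^{(0)} set to the input features.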


In this example, you will classify the scientific papers in a citation graph where labels are available for only a small subset of nodes, and the GCN must predict the correct label for the remaining nodes.
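One way to picture this semi-supervised setup is a loss that only looks at the labeled nodes, while predictions are still produced for every node. The sketch below uses plain Python lists rather than PyTorch tensors so that it stays self-contained; the function names, the toy scores, and the use of -1 for an unknown label are all assumptions made for illustration, not part of the article's code.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def masked_cross_entropy(logits, labels, train_mask):
    """Mean negative log-likelihood computed over labeled nodes only."""
    losses = [
        -math.log(softmax(row)[label])
        for row, label, is_labeled in zip(logits, labels, train_mask)
        if is_labeled
    ]
    return sum(losses) / len(losses)


# Hypothetical toy data: 4 nodes, 3 classes, only nodes 0 and 2 are labeled.
logits = [[2.0, 0.5, 0.1], [0.3, 0.3, 0.3], [0.1, 0.2, 3.0], [1.0, 1.0, 1.0]]
labels = [0, -1, 2, -1]  # -1 marks a node whose label is unknown
train_mask = [True, False, True, False]

# The loss ignores nodes 1 and 3, but a class is still predicted for them.
loss = masked_cross_entropy(logits, labels, train_mask)
predictions = [max(range(len(row)), key=row.__getitem__) for row in logits]
```

In a PyTorch version the same idea is typically expressed by indexing the model output and the labels with a boolean training mask before computing the loss.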

The key idea of GCN is to generate node embeddings based on local network neighborhoods. Nodes aggregate information from their neighbors using neural networks. As a result, every node defines a computation graph based on its neighborhood: its embedding is computed by averaging the messages from its neighbors and then applying a neural network, as shown below.
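The averaging step can be sketched in a few lines of plain Python. This is a minimal illustration, not the article's model: the toy graph, the feature values, and the weight matrix are made up, and including each node's own features in the average (a self-loop) is an assumption of this sketch.

```python
def mean_aggregate(features, neighbors):
    """Average each node's own features with those of its neighbors."""
    aggregated = []
    for node, feats in enumerate(features):
        group = [feats] + [features[n] for n in neighbors[node]]
        dim = len(feats)
        aggregated.append(
            [sum(vec[d] for vec in group) / len(group) for d in range(dim)]
        )
    return aggregated


def linear_relu(rows, weight):
    """Apply one weight matrix shared by all nodes, then a ReLU."""
    return [
        [max(0.0, sum(x * w for x, w in zip(row, col))) for col in zip(*weight)]
        for row in rows
    ]


# Hypothetical toy graph: a path 0 - 1 - 2 with 2-dimensional node features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
weight = [[1.0, -1.0], [0.5, 2.0]]  # shape: in_dim x out_dim

# One layer: aggregate over the neighborhood, then transform.
hidden = linear_relu(mean_aggregate(features, neighbors), weight)
```

Applying the two functions a second time to `hidden` would let each node see information from neighbors two hops away, which is how stacked layers widen the computation graph.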


Exploring the Dataset

Download the .tgz archive and open the tar file inside it

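import urllib.request
import tarfile

# URL of the Cora .tgz archive (left empty in the original)
coraTarFile = ''
tarfiles = urllib.request.urlopen(coraTarFile)
# Read the HTTP response as a gzip-compressed tar stream
zip_file = tarfile.open(fileobj=tarfiles, mode="r|gz")
for tarinfo in zip_file:
    # The loop body is cut off in the original; listing each member is an assumption
    print(tarinfo.name, tarinfo.size)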
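Once the archive is unpacked, the raw files still have to be turned into features, labels, and edges. The sketch below assumes the layout described in the Cora README: a tab-separated cora.content file with one row per paper (paper ID, binary word flags, class label) and a tab-separated cora.cites file with one row per link (cited paper ID, citing paper ID). The function names and the two-row sample are hypothetical; real rows carry 1433 word flags rather than four.

```python
def parse_content(lines):
    """Parse cora.content-style rows: paper_id, word flags..., class_label."""
    features, labels = {}, {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        parts = line.strip().split("\t")
        paper_id, flags, label = parts[0], parts[1:-1], parts[-1]
        features[paper_id] = [int(flag) for flag in flags]
        labels[paper_id] = label
    return features, labels


def parse_cites(lines):
    """Parse cora.cites-style rows into (cited_id, citing_id) pairs."""
    return [tuple(line.strip().split("\t")) for line in lines if line.strip()]


# Hypothetical miniature sample in the assumed file layout.
content_sample = [
    "101\t0\t1\t0\t1\tNeural_Networks",
    "102\t1\t0\t0\t0\tRule_Learning",
]
cites_sample = ["101\t102"]

features, labels = parse_content(content_sample)
edges = parse_cites(cites_sample)
```

Keeping paper IDs as strings avoids assuming they are contiguous; they can be mapped to row indices later when building the tensors a GCN expects.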


