Data Detangle

How Knowledge Graphs Bring Order to the HRA's Data Diversity

It takes a lot of data to construct the Human Reference Atlas (HRA).
But not all data is created equal!
HRA data comes from many different sources, which may use different technologies and follow different protocols.
The data itself comes in many different formats, some of which may require specific code or software to read.
Some of it is old data, and some of it is new.
It may have been mixed with other data or repurposed.
Some data might be open research data, available to all.
Other data might have restrictions that limit access, use, and distribution.
For the HRA, we need to be able to easily find the data we want, use it for our purposes, and share it as widely as possible.
[Figure: pure data and structured data]
Of course, that data needs to be structured so that machines can read it.
Ideally, though, that data structure would also be understandable to humans.
It would not only show what data exists in the HRA but also how pieces of that data relate to each other.
[Figure: a small graph. Tuft cell is located_in Large Intestine, which is part_of Digestive System ("Eating System"); Tuft cell is also located_in Trachea, which is part_of Respiratory System ("Breathing System").]
By labeling our data and connecting our labeled nodes with relational links,
we put our data into context and create a framework for moving from data
to knowledge
to insight.
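To make the idea of labeled nodes and relational links concrete, here is a minimal Python sketch. It assumes the relationships suggested by the figure labels (a tuft cell located in the large intestine and the trachea, each of which is part of an organ system). The data layout and the function names `neighbors` and `systems_containing` are illustrative assumptions, not the HRA's actual implementation.

```python
# Toy edge list: (source node, edge label, target node).
# The relationships are an assumption read off the figure labels.
edges = [
    ("Tuft cell", "located_in", "Large Intestine"),
    ("Large Intestine", "part_of", "Digestive System"),
    ("Tuft cell", "located_in", "Trachea"),
    ("Trachea", "part_of", "Respiratory System"),
]


def neighbors(node, relation):
    """Return the targets reachable from `node` via edges labeled `relation`."""
    return [target for source, label, target in edges
            if source == node and label == relation]


def systems_containing(cell):
    """Follow located_in, then part_of, to find organ systems for a cell type."""
    systems = []
    for organ in neighbors(cell, "located_in"):
        systems.extend(neighbors(organ, "part_of"))
    return systems


print(systems_containing("Tuft cell"))
# ['Digestive System', 'Respiratory System']
```

Following two labeled links in a row answers a question that neither link answers alone, which is one small way of moving from data to knowledge.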
The type of data structure we are moving towards here is known as a "knowledge graph," and knowledge graphs are a lot more common than you might think.
Google popularized the term back in 2012.
Today, major companies such as Facebook, Amazon, and Netflix all use knowledge graphs to represent relationships between people, products, and concepts.
[Figure: a personal knowledge graph with "Me" at the center, linked to things I like: my puppy, ice cream, fluffy pillows, Mom and dad, home, food, bookstores, my city.]
A knowledge graph gathers all the things that are important to a particular group or organization.
These things can be people, places, entities, concepts, databases, documents, or really just about anything.
Each of those things is assigned a node. Then, the knowledge graph organizes all those nodes into a network of interrelations.
[Figure: research metadata, biological data, digital objects]
In the case of the Human Reference Atlas, we are interested in things like biological data, research metadata, and data about the digital objects within the HRA.
[Figure: a triple. Subject: Organ A; predicate: has relation to; object: Organ B. Example: Renal cortex (subject), is a part of (predicate), Kidney (object).]
Using the Resource Description Framework (RDF), each relationship is expressed as a subject, a predicate, and an object.
The predicate expresses the relationship between the entities.
This grouping is called a triple, and the relation between an anatomical structure and its parent organ might look like this.
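As a rough illustration, a triple can be modeled in Python as a three-field record and written out as one statement in the style of the N-Triples format. The `Triple` class, the `to_ntriples_line` helper, and the `https://example.org/hra-sketch/` identifiers are all hypothetical placeholders chosen for this sketch; they are not the identifiers the HRA actually uses.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement."""
    subject: str
    predicate: str
    obj: str


# Hypothetical base IRI, used only so the example is self-contained.
BASE = "https://example.org/hra-sketch/"


def to_ntriples_line(t: Triple) -> str:
    """Format a triple of IRIs as a single N-Triples-style statement."""
    return f"<{BASE}{t.subject}> <{BASE}{t.predicate}> <{BASE}{t.obj}> ."


renal = Triple(subject="renal_cortex", predicate="part_of", obj="kidney")
print(to_ntriples_line(renal))
```

The predicate sits between the two entities, so the statement reads left to right much like the sentence "renal cortex is a part of kidney."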
[Figure: Left Female Kidney]
Let's see how this might look for a particular digital object created for the Human Reference Atlas.
Here's a 3D reference organ for the left female kidney.
And here's how it appears in the knowledge graph.
The subject entries in the left column all point to the same thing: the HRA's 3D reference organ of the left female kidney.
The object column lists all the other data in the HRA that the reference organ is connected to.
And the predicate column indicates the nature of that relationship.
A closer look at these predicates reveals relationships such as the creation date, version number, the raw data the 3D kidney was derived from, and many more.
What we see here is actually a network of nodes and edges, with our kidney reference organ as the central node with all its related data connected to it by labeled edges.
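The step from a three-column table to a network with one central node can be sketched in a few lines of Python. Everything below is illustrative: the identifier `left_female_kidney_3d`, the predicate names, and the placeholder values are assumptions made for this example and are not taken from the HRA knowledge graph.

```python
from collections import defaultdict

# Hypothetical identifier for the central node.
SUBJECT = "left_female_kidney_3d"

# Rows of (subject, predicate, object); the values are placeholders, not real HRA data.
triples = [
    (SUBJECT, "creation_date", "<date placeholder>"),
    (SUBJECT, "version", "<version placeholder>"),
    (SUBJECT, "derived_from", "<raw data placeholder>"),
    ("some_other_object", "version", "<version placeholder>"),
]


def star_around(subject, rows):
    """Collect the labeled edges leaving `subject` as predicate -> list of objects."""
    labeled_edges = defaultdict(list)
    for s, p, o in rows:
        if s == subject:
            labeled_edges[p].append(o)
    return dict(labeled_edges)


star = star_around(SUBJECT, triples)
for predicate, objects in star.items():
    print(f"{SUBJECT} --{predicate}--> {objects}")
```

Rows that share a subject collapse into a single node, and each predicate becomes the label on an edge leading out to a related node.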
Of course, this is only one network. There are over 500 digital objects currently in the HRA, each with its own network. And each network is connected to all the others.
Using a knowledge graph helps us structure the many different types of data that power the HRA.
It will also allow us to link up with other information networks to create a wide and radically open web of knowledge about the human body.
2024 CNS at Indiana University

Medical Disclaimer: This resource is intended for research purposes only. It should not be used for emergencies or medical or professional advice.