Introduction to RDF and Linked Data
The Resource Description Framework (RDF) is the foundational data model of the Semantic Web — Tim Berners-Lee's vision of a web where data is machine-readable and interconnected across the internet. While traditional databases store data in tables and JSON stores it in trees, RDF stores data as a graph of interconnected triples: subject-predicate-object relationships that link resources together.
RDF and its principles underpin knowledge graphs such as Wikidata (the structured-data sibling of Wikipedia), scientific databases (UniProt, PubChem), government open data portals, and library cataloging systems. When you search Google and see a side panel with structured information about a person, place, or thing, you are looking at a knowledge graph built on the same linked data ideas.
In this guide, we'll cover RDF fundamentals, the Turtle serialization format, and practical techniques for converting tabular data (CSV, Excel) into linked data triples. Whether you're building a knowledge graph, publishing open data, or integrating with semantic web services, understanding RDF conversion is an essential skill.
Understanding RDF Triples
Every piece of information in RDF is expressed as a triple — three components that form a statement:
Subject → Predicate → Object
"Alice" → "knows" → "Bob"
"Alice" → "age" → 30
"Alice" → "works_at" → "Acme Corp"
Each triple is a single atomic fact. Multiple triples about the same subject build up a complete description:
:Alice :name "Alice Smith" .
:Alice :email "[email protected]" .
:Alice :age 30 .
:Alice :role "Engineer" .
:Alice :works_at :AcmeCorp .
:Alice :knows :Bob .
:Bob :name "Bob Johnson" .
:Bob :works_at :AcmeCorp .
Notice how :Alice and :Bob are connected through the :knows predicate, and both link to :AcmeCorp — this is the "linked" in linked data. Data naturally forms a graph that can be queried, merged, and extended without the rigid constraints of relational tables.
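To make the graph idea concrete, here is a minimal sketch in plain Python with no RDF library. It stores the facts above as (subject, predicate, object) tuples and finds connections by pattern matching. The `match` helper and the bare string identifiers are illustrative assumptions, not part of any RDF toolkit.

```python
# Each triple is a (subject, predicate, object) tuple; a set of them is a graph.
triples = {
    ("Alice", "name", "Alice Smith"),
    ("Alice", "age", 30),
    ("Alice", "role", "Engineer"),
    ("Alice", "works_at", "AcmeCorp"),
    ("Alice", "knows", "Bob"),
    ("Bob", "name", "Bob Johnson"),
    ("Bob", "works_at", "AcmeCorp"),
}


def match(graph, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        (t for t in graph
         if (s is None or t[0] == s)
         and (p is None or t[1] == p)
         and (o is None or t[2] == o)),
        key=str,
    )


# Everyone who works at AcmeCorp: both subjects point at the same object node.
colleagues = [s for s, _, _ in match(triples, p="works_at", o="AcmeCorp")]
print(colleagues)  # ['Alice', 'Bob']
```

Real RDF replaces the bare strings with URIs, which is what lets independently produced graphs line up on shared nodes.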
The Turtle Serialization Format
RDF is a data model, not a file format. Several serialization formats exist (RDF/XML, JSON-LD, N-Triples, N-Quads), but Turtle (Terse RDF Triple Language) is the most human-readable and widely used for authoring RDF data.
Basic Turtle Syntax
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

ex:alice
    a foaf:Person ;
    foaf:name "Alice Smith" ;
    foaf:mbox <mailto:[email protected]> ;
    ex:age "30"^^xsd:integer ;
    foaf:knows ex:bob .

ex:bob
    a foaf:Person ;
    foaf:name "Bob Johnson" ;
    foaf:mbox <mailto:[email protected]> .
Key Turtle Syntax Elements
- @prefix — Declares namespace shortcuts (like imports)
- Semicolons (;) — Separate multiple predicates for the same subject
- Periods (.) — End a statement block for a subject
- Commas (,) — Separate multiple objects for the same predicate
- a — Shorthand for rdf:type
- "value"^^xsd:type — Typed literals (integers, dates, etc.)
- <URI> — Full URI references
- prefix:localName — Abbreviated URI using a prefix
Converting Tabular Data to RDF
The Conceptual Mapping
Converting a table to RDF follows a consistent pattern:
| Table Concept | RDF Concept | Example |
|---|---|---|
| Table | Class (rdf:type) | ex:Employee |
| Row | Resource (Subject) | ex:employee_1 |
| Column header | Property (Predicate) | ex:name |
| Cell value | Literal or Resource (Object) | "Alice" |
| Primary key | URI identifier | ex:employee_1 |
| Foreign key | Resource link | ex:department_5 |
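The mapping in the table can be sketched with the standard library alone. The function below is an illustration rather than a complete converter: the `links` argument (foreign-key column to target table) and the `department_id` column are hypothetical conventions chosen for this example, and every value is emitted as a plain string literal.

```python
BASE = "http://example.org/"  # illustrative base URI
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def row_to_triples(row, table, key, links=None):
    """Map one table row (a dict) to N-Triples-style statement strings."""
    links = links or {}
    subject = f"<{BASE}{table}_{row[key]}>"           # primary key -> URI
    lines = [f"{subject} {RDF_TYPE} <{BASE}{table.capitalize()}> ."]  # table -> class
    for column, value in row.items():
        if column == key or value == "":
            continue
        predicate = f"<{BASE}{column}>"               # column header -> property
        if column in links:
            obj = f"<{BASE}{links[column]}_{value}>"  # foreign key -> resource link
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'                      # cell value -> literal
        lines.append(f"{subject} {predicate} {obj} .")
    return lines


row = {"id": "1", "name": "Alice", "department_id": "5"}
lines = row_to_triples(row, table="employee", key="id",
                       links={"department_id": "department"})
for line in lines:
    print(line)
```

The foreign-key branch is what turns a table into a graph: the object becomes another row's URI instead of a dead-end string.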
Step-by-Step CSV to RDF Conversion
Let's convert a simple CSV file to Turtle format:
Input CSV:
id,name,email,department,salary
1,Alice Smith,[email protected],Engineering,85000
2,Bob Johnson,[email protected],Design,72000
3,Carol White,[email protected],Engineering,91000
Output Turtle:
@prefix ex: <http://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

ex:employee_1
    a ex:Employee ;
    foaf:name "Alice Smith" ;
    foaf:mbox "[email protected]" ;
    ex:department "Engineering" ;
    ex:salary "85000"^^xsd:integer .

ex:employee_2
    a ex:Employee ;
    foaf:name "Bob Johnson" ;
    foaf:mbox "[email protected]" ;
    ex:department "Design" ;
    ex:salary "72000"^^xsd:integer .

ex:employee_3
    a ex:Employee ;
    foaf:name "Carol White" ;
    foaf:mbox "[email protected]" ;
    ex:department "Engineering" ;
    ex:salary "91000"^^xsd:integer .
Python Script for CSV to RDF Conversion
import csv

from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF, XSD, FOAF


def csv_to_rdf(csv_file, base_uri='http://example.org/',
               class_name='Record', id_column=None):
    """Convert CSV to RDF Turtle format using rdflib."""
    g = Graph()
    ex = Namespace(base_uri)
    g.bind('ex', ex)
    g.bind('foaf', FOAF)

    # newline='' is what the csv module expects; utf-8 is an assumed encoding
    with open(csv_file, 'r', newline='', encoding='utf-8') as f:
        reader = csv.DictReader(f)
        for i, row in enumerate(reader):
            # Generate subject URI from the id column, else from the row index
            if id_column and row.get(id_column):
                subject_id = row[id_column].strip().replace(' ', '_').lower()
                subject = ex[f'{class_name.lower()}_{subject_id}']
            else:
                subject = ex[f'{class_name.lower()}_{i}']

            # Add type triple
            g.add((subject, RDF.type, ex[class_name]))

            # Add property triples
            for header, value in row.items():
                # Skip empty cells, the id column, and cells without a header
                if header is None or not value or header == id_column:
                    continue
                # Create predicate URI in the base namespace
                predicate = ex[header.strip().replace(' ', '_').lower()]
                # Infer literal type
                g.add((subject, predicate, infer_literal(value)))

    return g.serialize(format='turtle')


def infer_literal(value):
    """Infer XSD datatype for a value."""
    # Try integer
    try:
        return Literal(int(value), datatype=XSD.integer)
    except ValueError:
        pass
    # Try float; xsd:double is the XSD type that matches a Python float
    try:
        return Literal(float(value), datatype=XSD.double)
    except ValueError:
        pass
    # Try boolean
    if value.lower() in ('true', 'false'):
        return Literal(value.lower() == 'true', datatype=XSD.boolean)
    # Default to string
    return Literal(value)


# Usage
turtle_output = csv_to_rdf(
    'employees.csv',
    base_uri='http://company.org/',
    class_name='Employee',
    id_column='id'
)
print(turtle_output)
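One caveat with inferring types through `int()` is that identifier-like cells lose information: `int("01234")` returns 1234, so a ZIP code or padded ID would silently change. The sketch below shows a stricter, pattern-based alternative using only the standard library. The function name `infer_datatype` and the exact patterns are assumptions for illustration, and it returns the datatype name rather than an rdflib literal.

```python
import re

INTEGER = re.compile(r"[+-]?(0|[1-9][0-9]*)")  # rejects leading zeros
DECIMAL = re.compile(r"[+-]?[0-9]+\.[0-9]+")


def infer_datatype(value):
    """Return the XSD datatype name that best fits a CSV cell."""
    text = value.strip()
    if INTEGER.fullmatch(text):
        return "xsd:integer"
    if DECIMAL.fullmatch(text):
        return "xsd:decimal"
    if text.lower() in ("true", "false"):
        return "xsd:boolean"
    return "xsd:string"


for cell in ["85000", "72000.50", "01234", "true", "Engineering"]:
    print(cell, "->", infer_datatype(cell))
```

Here "01234" stays a string. For production data, an explicit per-column type map is usually safer than any inference.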
Using ConvertMatrix for Quick RDF Generation
For quick conversions without writing code, ConvertMatrix converts data from any format to RDF Turtle syntax directly in your browser:
- CSV to RDF — Convert CSV data to Turtle triples
- JSON to RDF — Transform JSON objects into RDF resources
- Excel to RDF — Generate RDF from spreadsheet data
- XML to RDF — Map XML elements to RDF triples
The converter automatically generates proper prefixes, creates URI identifiers for each row, and maps column headers to predicates. All processing happens in your browser for complete privacy.
Linked Data Principles
Tim Berners-Lee defined four principles for publishing linked data, known as the Linked Data Principles:
- Use URIs as names for things — Every entity gets a unique URI identifier
- Use HTTP URIs — So people can look up those names (dereferenceable URIs)
- When someone looks up a URI, provide useful information — Using standards like RDF and SPARQL
- Include links to other URIs — So people can discover more things
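In practice, "looking up" a URI usually means an HTTP request with content negotiation: the client states which RDF format it prefers through the Accept header. The sketch below only builds such a request with the standard library. The URI is hypothetical (example.org does not serve RDF), so the request is not actually sent.

```python
from urllib.request import Request

# Hypothetical resource URI used for illustration only.
uri = "http://example.org/alice"

# Prefer Turtle, accept RDF/XML as a fallback.
request = Request(
    uri,
    headers={"Accept": "text/turtle, application/rdf+xml;q=0.8"},
)

print(request.full_url)
print(request.get_header("Accept"))

# Sending it would look like this (commented out: it needs network access
# and a server that publishes RDF at that URI):
# from urllib.request import urlopen
# with urlopen(request) as response:
#     turtle_text = response.read().decode("utf-8")
```

A server that follows the principles would answer with a description of the resource in one of the requested formats, including links to further URIs.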
Linking to External Datasets
The real power of linked data emerges when you connect your data to existing datasets:
@prefix ex: <http://example.org/> .
@prefix dbr: <http://dbpedia.org/resource/> .
@prefix schema: <http://schema.org/> .

# owl:sameAs is reserved for two URIs that name the same thing;
# the office is only located in the city, so a containment property is used.
ex:NewYorkOffice
    a schema:Place ;
    schema:name "New York Office" ;
    schema:address "123 Broadway, New York, NY" ;
    schema:containedInPlace dbr:New_York_City .

ex:alice
    schema:workLocation ex:NewYorkOffice ;
    schema:nationality dbr:United_States .
By linking to DBpedia (a dataset of structured information extracted from Wikipedia), your data becomes part of a global graph of linked datasets and can be enriched with information from billions of existing triples.
Querying RDF with SPARQL
Once your data is in RDF, you can query it with SPARQL — the SQL of the semantic web:
# Find all employees in Engineering
PREFIX ex: <http://example.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name ?email
WHERE {
    ?person a ex:Employee ;
            foaf:name ?name ;
            foaf:mbox ?email ;
            ex:department "Engineering" .
}

# Find employees earning more than 80000
PREFIX ex: <http://example.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name ?salary
WHERE {
    ?person foaf:name ?name ;
            ex:salary ?salary .
    FILTER (?salary > 80000)
}
ORDER BY DESC(?salary)

# Count employees per department
PREFIX ex: <http://example.org/>

SELECT ?dept (COUNT(?person) AS ?count)
WHERE {
    ?person a ex:Employee ;
            ex:department ?dept .
}
GROUP BY ?dept
Vocabulary Reuse
Rather than inventing your own predicates, reuse established vocabularies:
| Vocabulary | Prefix | Use Case |
|---|---|---|
| Schema.org | schema: | General-purpose (people, places, events) |
| FOAF | foaf: | People and social networks |
| Dublin Core | dc:/dcterms: | Documents and metadata |
| SKOS | skos: | Taxonomies and classifications |
| DCAT | dcat: | Data catalogs |
| PROV | prov: | Provenance and lineage |
| GeoSPARQL | geo: | Geospatial data |
| vCard | vcard: | Contact information |
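When converting a table, vocabulary reuse often comes down to a lookup from column headers to well-known property IRIs, with a local namespace as the fallback. The mapping below is a small illustrative sketch in plain Python. The chosen headers and the `predicate_for` helper are assumptions, not a standard.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
SCHEMA = "http://schema.org/"
EX = "http://example.org/"  # fallback namespace for project-specific terms

# Column header -> established property IRI (an illustrative mapping).
KNOWN_PROPERTIES = {
    "name": FOAF + "name",
    "email": FOAF + "mbox",
    "job_title": SCHEMA + "jobTitle",
}


def predicate_for(header):
    """Return a shared-vocabulary IRI when one is mapped, else a local one."""
    key = header.strip().lower().replace(" ", "_")
    return KNOWN_PROPERTIES.get(key, EX + key)


for header in ["Name", "Email", "Job Title", "Salary"]:
    print(header, "->", predicate_for(header))
```

Only "Salary" falls through to the local namespace here. Everything else lands on a term that other datasets and tools already understand.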
Validation with SHACL
SHACL (Shapes Constraint Language) validates RDF data against a schema, similar to JSON Schema for JSON:
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:EmployeeShape
    a sh:NodeShape ;
    sh:targetClass ex:Employee ;
    sh:property [
        sh:path foaf:name ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
    ] ;
    sh:property [
        sh:path ex:salary ;
        sh:datatype xsd:integer ;
        sh:minInclusive 0 ;
    ] .
Conclusion
Converting tabular data to RDF is a gateway to the semantic web and linked data ecosystem. Whether you're publishing government open data, building a corporate knowledge graph, enriching your data with external sources, or integrating with semantic web services, understanding the table-to-RDF conversion process is essential. Start with simple conversions using ConvertMatrix, graduate to Python's rdflib for programmatic transformations, and use SPARQL for graph queries, such as following chains of relationships across datasets, that are awkward to express in SQL. The semantic web is no longer only a future vision: organizations already use it in practice to connect and reuse their data.