Dataset Quality Vocabulary (daQ)

Namespace:

http://purl.org/eis/vocab/daq#

Authors:

Contributors:

Serialisations:


Abstract

The Dataset Quality Vocabulary (daQ) is a lightweight, extensible core vocabulary for attaching the result of quality benchmarking of a linked open dataset (usually an expensive process) to that dataset. daQ is designed to be extended by custom quality metrics. Use cases include filtering and ranking datasets by quality.

Ontology Statistics

Classes (5):

Metric, Dimension, Category, Observation, QualityGraph

Properties (10):

hasMetric, hasDimension, isEstimate, requires, expectedDataType, value, metric, hasObservation, computedOn, computedBy

Instances (2):

daq:dsd, sdmx-dimension:timePeriod



Term Description

This section describes the classes and properties that form the basis of the Dataset Quality Vocabulary (daQ).

Classes

daq:Metric

A metric is the smallest unit for measuring a quality dimension. A metric belongs to exactly one dimension. Each metric has one or more observations (daq:hasObservation), each of which records the quality assessment value produced by a computation. Metrics are provided as subclasses of this abstract class, which is not intended for direct usage.

Described by:

daq:hasObservation, daq:expectedDataType, daq:requires

In Range of:

daq:metric, daq:hasMetric

Equivalent Class of:

dqv:Metric

daq:Dimension

Each dimension is part of a larger group called a category (see daq:Category) and has a number of metrics associated with it. A dimension is linked with a category using the daq:hasDimension property. Dimensions are provided as subclasses of this abstract class, which is not intended for direct usage.

Described by:

daq:hasMetric

In Range of:

daq:hasDimension

Equivalent Class of:

dqv:Dimension

daq:Category

A category is the highest level of the quality hierarchy. It groups a number of related dimensions that together measure the quality of a dataset from different aspects. Categories are provided as subclasses of this abstract class, which is not intended for direct usage.

Described by:

daq:hasDimension

Equivalent Class of:

dqv:Category

daq:Observation

A quality observation represents the statistical and provenance information of the assessment activity for the attached metric.

Described by:

sdmx-dimension:timePeriod, daq:computedOn, daq:isEstimate, daq:metric, daq:value

In Range of:

daq:hasObservation

Subclass of:

prov:Entity, qb:Observation

Equivalent Class of:

dqv:QualityMeasurement
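
As an illustration of the properties listed for daq:Observation, the following is a minimal sketch that models one observation as plain triples using only the Python standard library. The `Observation` dataclass, its field names, and the `http://example.org/...` IRIs are assumptions made for this example and are not part of daQ; the sdmx-dimension namespace IRI is the one commonly used by the RDF Data Cube vocabulary.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

DAQ = "http://purl.org/eis/vocab/daq#"
SDMX_DIM = "http://purl.org/linked-data/sdmx/2009/dimension#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

Triple = Tuple[str, str, Union[str, bool, int, float]]


@dataclass
class Observation:
    """Hypothetical in-memory model of one daq:Observation."""
    uri: str            # IRI of the observation itself (assumed)
    metric: str         # IRI of the metric being observed
    computed_on: str    # IRI of the assessed resource
    value: Union[bool, int, float]
    is_estimate: bool
    time_period: str    # ISO 8601 timestamp of the assessment

    def to_triples(self) -> List[Triple]:
        # One triple per property in the "Described by" list, plus the type.
        return [
            (self.uri, RDF_TYPE, DAQ + "Observation"),
            (self.uri, DAQ + "metric", self.metric),
            (self.uri, DAQ + "computedOn", self.computed_on),
            (self.uri, DAQ + "value", self.value),
            (self.uri, DAQ + "isEstimate", self.is_estimate),
            (self.uri, SDMX_DIM + "timePeriod", self.time_period),
        ]


obs = Observation(
    uri="http://example.org/quality/obs1",
    metric="http://example.org/quality/metric1",
    computed_on="http://example.org/dataset",
    value=0.87,
    is_estimate=False,
    time_period="2015-01-01T00:00:00Z",
)
triples = obs.to_triples()
for triple in triples:
    print(triple)
```

A real application would more likely hand these triples to an RDF library; plain tuples are used here only to keep the sketch self-contained.
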

daq:QualityGraph

Defines a quality graph that contains all metadata about the quality metrics computed on the dataset.

Subclass of:

qb:DataSet, rdfg:Graph

Equivalent Class of:

dqv:QualityMeasurementDataset

Properties

daq:hasMetric

A dimension is an abstract concept that groups a number of more concrete metrics for measuring the quality of a dataset. This is an abstract property and should not be used directly. Specific sub-properties should be defined for different metrics.

Domain of:

daq:Dimension

Range of:

daq:Metric

daq:hasDimension

The category concept classifies dimensions related to the measurement of quality for a specific criterion. This is an abstract property and should not be used directly. Specific sub-properties should be defined for different dimensions.

Domain of:

daq:Category

Range of:

daq:Dimension
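
Since the abstract classes and the abstract properties daq:hasMetric and daq:hasDimension are meant to be specialised, an extension might look roughly like the sketch below. It generates Turtle text for one category, one dimension, and one metric together with the two linking sub-properties. The `ex:` namespace, the `extension_turtle` helper, the property naming pattern, and the names Accessibility, Availability, and Dereferenceability are all hypothetical choices made for illustration, not terms defined by daQ.

```python
DAQ = "http://purl.org/eis/vocab/daq#"
EX = "http://example.org/quality#"  # hypothetical extension namespace
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def extension_turtle(category: str, dimension: str, metric: str) -> str:
    """Return Turtle that specialises the abstract daQ classes and properties."""
    has_dim = f"ex:has{dimension}Dimension"  # assumed naming pattern
    has_met = f"ex:has{metric}Metric"        # assumed naming pattern
    lines = [
        f"@prefix daq: <{DAQ}> .",
        f"@prefix ex: <{EX}> .",
        f"@prefix rdfs: <{RDFS}> .",
        "",
        # Concrete classes are subclasses of the abstract daQ classes.
        f"ex:{category} rdfs:subClassOf daq:Category .",
        f"ex:{dimension} rdfs:subClassOf daq:Dimension .",
        f"ex:{metric} rdfs:subClassOf daq:Metric .",
        "",
        # Concrete linking properties specialise the abstract ones.
        f"{has_dim} rdfs:subPropertyOf daq:hasDimension ;",
        f"    rdfs:domain ex:{category} ;",
        f"    rdfs:range ex:{dimension} .",
        "",
        f"{has_met} rdfs:subPropertyOf daq:hasMetric ;",
        f"    rdfs:domain ex:{dimension} ;",
        f"    rdfs:range ex:{metric} .",
    ]
    return "\n".join(lines)


ttl = extension_turtle("Accessibility", "Availability", "Dereferenceability")
print(ttl)
```

The domain and range statements on the sub-properties are a design choice of this sketch: they narrow the abstract daq:Category to daq:Dimension and daq:Dimension to daq:Metric links to the concrete classes being introduced.
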

daq:isEstimate

This property is true if the assessed observation of a metric gives an estimated result rather than an exact one.

Domain of:

daq:Observation

Range of:

xsd:boolean

Has Cardinality:

1
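
A cardinality of 1 means that every observation carries exactly one daq:isEstimate value. The vocabulary states the same cardinality for daq:value and daq:computedOn. A minimal sketch of how such a constraint could be checked over plain triples follows; the `cardinality_violations` helper and the example IRIs are assumptions made for this illustration.

```python
from collections import Counter

DAQ = "http://purl.org/eis/vocab/daq#"

# Properties that the vocabulary declares with cardinality 1 on daq:Observation.
EXACTLY_ONE = [DAQ + "isEstimate", DAQ + "value", DAQ + "computedOn"]


def cardinality_violations(subject, triples, properties=EXACTLY_ONE):
    """Map each violated property to the number of values actually found."""
    counts = Counter(p for s, p, _ in triples if s == subject)
    # Counter returns 0 for properties that never occur.
    return {p: counts[p] for p in properties if counts[p] != 1}


obs = "http://example.org/quality/obs1"  # hypothetical observation IRI
triples = [
    (obs, DAQ + "isEstimate", False),
    (obs, DAQ + "isEstimate", True),   # second value: violates cardinality 1
    (obs, DAQ + "value", 0.87),
    # daq:computedOn is missing entirely
]
violations = cardinality_violations(obs, triples)
print(violations)
```

Here daq:isEstimate is reported with a count of 2 and daq:computedOn with a count of 0, while daq:value passes.
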

daq:requires

A metric might require external resources (e.g. a gold standard) in order to measure quality. To cater for the most generic requirement, this property links a metric to a required resource (e.g. the URI of the gold standard dataset used).

Domain of:

daq:Metric

Range of:

rdfs:Resource

daq:expectedDataType

Each metric should have an expected data type for its observed value (e.g. xsd:boolean, xsd:double).

Domain of:

daq:Metric

Range of:

xsd:anySimpleType

Has Cardinality:

1

Equivalent Property of:

dqv:expectedDataType
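
One way a consumer might use daq:expectedDataType is to check an observed value against the declared type before storing it. The sketch below covers only three XSD types; the `conforms` helper and its mapping from XSD datatypes to Python types are assumptions for illustration, not something daQ prescribes.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"


def conforms(value, expected_datatype: str) -> bool:
    """Check a Python value against a (small, assumed) set of XSD datatypes."""
    if expected_datatype == XSD + "boolean":
        return isinstance(value, bool)
    if expected_datatype == XSD + "integer":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    if expected_datatype == XSD + "double":
        return isinstance(value, float)
    raise ValueError("unsupported datatype: " + expected_datatype)


print(conforms(0.87, XSD + "double"))
print(conforms(True, XSD + "integer"))
```

The explicit exclusion of `bool` from the integer case matters because Python would otherwise accept `True` as an integer value.
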

daq:value

Each metric will have a value computed. To deal with the different return types of metric computations, this property links an observation of the metric with a value object (e.g. boolean, double, Literal).

Domain of:

daq:Observation

Has Cardinality:

1

Equivalent Property of:

dqv:value

daq:metric

Represents the metric being observed.

Domain of:

daq:Observation

Range of:

daq:Metric

Has Minimum Cardinality:

1

Inverse of:

daq:hasObservation

Equivalent Property of:

dqv:isMeasurementOf

daq:hasObservation

A computed metric can have one or more quality observations, where each resource the metric was computed on has one observation.

Domain of:

daq:Metric

Range of:

daq:Observation

Has Minimum Cardinality:

1

Inverse of:

daq:metric
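
Because daq:hasObservation and daq:metric are declared as inverses, a link stated in one direction implies the link in the other. The sketch below materialises the missing direction over plain triples; the `with_inverse_links` helper and the example IRIs are hypothetical, and an RDF reasoner would normally do this work.

```python
DAQ = "http://purl.org/eis/vocab/daq#"


def with_inverse_links(triples):
    """Return the triples plus the implied inverse daq:metric / daq:hasObservation links."""
    result = set(triples)
    for s, p, o in triples:
        if p == DAQ + "metric":
            # observation daq:metric metric  =>  metric daq:hasObservation observation
            result.add((o, DAQ + "hasObservation", s))
        elif p == DAQ + "hasObservation":
            # metric daq:hasObservation observation  =>  observation daq:metric metric
            result.add((o, DAQ + "metric", s))
    return result


metric = "http://example.org/quality/metric1"  # hypothetical IRIs
obs = "http://example.org/quality/obs1"
closed = with_inverse_links([(obs, DAQ + "metric", metric)])
print(sorted(closed))
```
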

daq:computedOn

Quality metrics can, in principle, be calculated on various forms of data (such as datasets, graphs, or sets of triples). This vocabulary allows the owner or user of such RDF data to calculate metrics on multiple, different resources.

Domain of:

daq:Observation

Range of:

rdfs:Resource

Has Cardinality:

1

Equivalent Property of:

dqv:computedOn
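
Since each observation names the resource it was computed on, observations for several resources can be compared, which supports the ranking use case mentioned in the abstract. The following is a minimal sketch under several assumptions: observations are held as plain dictionaries with hypothetical keys, timestamps are ISO 8601 strings in a uniform format (so string comparison orders them), and a higher value means better quality, which will not hold for every metric.

```python
def rank_resources(observations, metric, include_estimates=False):
    """Rank resources by their most recent value for one metric, best first."""
    latest = {}
    for obs in observations:
        if obs["metric"] != metric:
            continue
        if obs["is_estimate"] and not include_estimates:
            continue
        current = latest.get(obs["computed_on"])
        # Keep only the newest observation per resource.
        if current is None or obs["time_period"] > current["time_period"]:
            latest[obs["computed_on"]] = obs
    pairs = [(o["computed_on"], o["value"]) for o in latest.values()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


M = "http://example.org/quality/metric1"  # hypothetical metric IRI
A, B, C = (f"http://example.org/dataset/{n}" for n in "abc")
observations = [
    {"metric": M, "computed_on": A, "value": 0.60, "is_estimate": False, "time_period": "2015-01-01"},
    {"metric": M, "computed_on": A, "value": 0.75, "is_estimate": False, "time_period": "2015-06-01"},
    {"metric": M, "computed_on": B, "value": 0.90, "is_estimate": False, "time_period": "2015-03-01"},
    {"metric": M, "computed_on": C, "value": 0.99, "is_estimate": True, "time_period": "2015-03-01"},
]
ranking = rank_resources(observations, M)
print(ranking)
```

With estimates excluded, resource C drops out, and resource A is ranked by its newer value of 0.75 rather than the older 0.60.
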

daq:computedBy

Deprecated property. The computedBy property defines the agent that computed a metric on a dataset.

Domain of:

qb:Observation