Fundamentals

Data Standards

See definitions for Open Ecosystems terms.

​​ ALCOA

ALCOA is a set of principles to ensure data quality. It is an implementation of GDocP.

​​ ALCOA+

ALCOA+ has been adopted by various industries as a framework for ensuring that data security and integrity are observed and maintained. It is used by regulatory and health bodies such as the FDA and WHO. Its focus is data quality.

​​ ATC

The Anatomical Therapeutic Chemical Classification System (ATC) Code Set is the most widely used Drug classification system, which assigns Drugs a unique ATC Code.

​​ Base32

Base32 is a binary-to-text Encoding, similar to Base64, but it uses 5-bit digits instead of 6-bit digits.
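
As a small illustration, the sketch below uses Python's standard `base64` module. The sample bytes are an arbitrary choice: five input bytes (40 bits) map to eight 5-bit Base32 digits.

```python
import base64

data = b"hello"                      # 5 bytes = 40 bits (arbitrary sample input)
encoded = base64.b32encode(data)     # 40 bits / 5 bits per digit = 8 digits
decoded = base64.b32decode(encoded)  # decoding restores the original bytes

print(encoded, len(encoded))
print(decoded == data)
```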

​​ Base64

Base64 is a binary-to-text Encoding that splits binary data into groups of 24 bits (3 bytes), each of which is represented by four 6-bit Base64 digits.
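
A minimal sketch of the 24-bit grouping, using Python's standard `base64` module with an arbitrary three-byte sample input:

```python
import base64

data = b"Man"                        # 3 bytes = 24 bits (arbitrary sample input)
encoded = base64.b64encode(data)     # 24 bits / 6 bits per digit = 4 digits
decoded = base64.b64decode(encoded)  # decoding restores the original bytes

print(encoded)   # b'TWFu'
print(decoded)   # b'Man'
```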

​​ BSON

BSON is a binary data interchange format for JSON-like documents. It covers the same structures as JSON and adds some extra types, such as dates and binary data.

​​ BSR

​​ C-CDA

​​ C4 Model

The C4 Model is a Lean graphical notation technique for modelling the Architecture of Software systems. It relies on other Modelling Techniques like UML and ERD.

​​ ClinVar

​​ COSMIC

​​ CPT

​​ CSF

CSF is a framework that is used to structure and organize elements of User Interfaces. It is designed to help developers and designers break down different parts of a User Interface into discrete units called “components” that can be managed, implemented, and reused more easily.

​​ CSV

Comma Separated Values (CSV) is a file format for tabular data.
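
A short sketch of reading tabular CSV data with Python's standard `csv` module. The column names and values are made up for illustration:

```python
import csv
import io

# Hypothetical sample: a header row followed by two data rows.
text = "name,unit\nglucose,mg/dL\nsodium,mmol/L\n"

# DictReader uses the first row as column names for every following row.
rows = list(csv.DictReader(io.StringIO(text)))

for row in rows:
    print(row["name"], row["unit"])
```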

​​ Cypher

Cypher is a Query Language for Labeled Property Graphs. It aims to be easily readable by both humans and machines. It is also designed to look familiar to people who know SQL.

​​ dbVar

​​ DMC

A Data Matrix Code (DMC) is a 2D Barcode. It can be distinguished from other barcodes by the two solid black lines that form an L shape along two adjacent edges.

​​ DNS

The Domain Name System (DNS) is a network service that resolves Domain Names to their current IP Addresses.

​​ DTD

​​ EDI

Electronic Data Interchange (EDI) is the electronic interchange of business information using a standardized format; a process which allows one company to send information to another company electronically rather than with paper. Business entities conducting business electronically are called trading partners.

​​ ERD

An Entity Relationship Diagram (ERD) describes interrelated things. It is composed of Entity types and specifies the Relationships that can exist between them.

​​ FAIR

In 2016, the ‘FAIR Guiding Principles for scientific data management and stewardship’ were published in Scientific Data. The authors intended to provide guidelines to improve the Findability, Accessibility, Interoperability, and Reuse of digital assets. It is an implementation of GDocP.

​​ FHIR

FHIR is an Industry Standard for describing Health Care related Artifacts.

​​ FIPS

FIPS contains Security Functional Requirements and Non-Functional Requirements that we MUST adhere to.

​​ FTP

​​ FTPS

File Transfer Protocol, Secure (FTPS) = FTP + SSL. Or in other words: FTPS adds Security to File Transfer capabilities.

​​ GCP

​​ GDocP

Good Documentation Practices (GDocP). To achieve robust decisions, the supporting data set needs to be reliable and complete. GDocP should be followed in order to ensure all records, both paper and electronic, allow the full reconstruction and traceability of GxP activities.

​​ GDP

​​ GDPR

The General Data Protection Regulation (GDPR) is a regulation in EU law on data protection and privacy in the European Union and the European Economic Area.

​​ GitFlow

GitFlow is a Branching Model that does not depend on creating forks of central repositories. Instead, it creates separate timelines by agreeing on specific branch names.

​​ GitHub Flow

GitHub Flow is a Branching Model defined by GitHub.

​​ GLBA

The GLBA requires financial institutions – companies that offer consumers financial products or services like loans, financial or investment advice, or insurance – to explain their information-sharing practices to their customers and to safeguard sensitive data.

​​ GMP

​​ GraphQL

​​ Gremlin

Gremlin is a query language for Graph Databases.

​​ gRPC

gRPC uses a binary format to encode data, which is much faster, cheaper, and more compact than many other Message Protocols.

​​ GxP

GxP is an acronym for the group of good practice guides governing the preclinical, clinical, manufacturing, testing, storage, distribution and post-market activities for regulated pharmaceuticals, biologicals and Medical Devices, such as good laboratory practices, good clinical practices, good manufacturing practices, good pharmacovigilance practices and good distribution practices.

​​ GZIP

GZIP has a relatively high Compression ratio, but it is slow at both Compression and Decompression. As such, it is a poor fit for high-Throughput workloads.
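
A minimal round-trip sketch using Python's standard `gzip` module. The payload is a deliberately repetitive, made-up byte string, so it compresses far better than typical data would:

```python
import gzip

payload = b"ACGT" * 1000               # 4000 bytes of repetitive sample data
compressed = gzip.compress(payload)    # lossless compression
restored = gzip.decompress(compressed)

print(len(payload), len(compressed))
print(restored == payload)
```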

​​ HCL

HashiCorp Configuration Language (HCL) is a DSL created by HashiCorp. It is used in Terraform, for example.

​​ HCPS

​​ HIE

​​ HITECH

​​ HiTrust

​​ HL7v2

​​ HL7v3

​​ HTML

HyperText Markup Language. HTML is the code that is used to structure a web page and its content.

​​ HTTP

HTTP is a Transport Protocol for Web Applications.

​​ HTTPS

Hypertext Transfer Protocol Secure (HTTPS) is an extension of the Hypertext Transfer Protocol (HTTP). It is used for secure communication over a computer network, and is widely used on the Internet.

​​ ICD-10

​​ ICD-9

​​ IPv4

Internet Protocol v4 (IPv4) is a full specification of how communication should work across networks.

​​ IPv6

Internet Protocol v6 (IPv6) is a full specification of how communication should work across networks.
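
One visible difference between the two protocol versions is the address size. The sketch below uses Python's standard `ipaddress` module with addresses taken from the ranges reserved for documentation:

```python
import ipaddress

v4 = ipaddress.ip_address("192.0.2.1")    # IPv4 documentation range
v6 = ipaddress.ip_address("2001:db8::1")  # IPv6 documentation range

# max_prefixlen is the address width in bits.
print(v4.version, v4.max_prefixlen)
print(v6.version, v6.max_prefixlen)

# The "::" shorthand stands for a run of zero groups.
print(v6.exploded)
```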

​​ IRI

The Internationalized Resource Identifier (IRI) is an internet protocol standard that builds on the URI. It greatly expands the set of permitted characters. IRIs may contain most characters from the Universal Character Set.
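
As a rough illustration, an IRI path can be mapped to a plain URI path by percent-encoding the UTF-8 bytes of its non-ASCII characters. The sketch uses `urllib.parse` from Python's standard library, and the path itself is a made-up example:

```python
from urllib.parse import quote, unquote

iri_path = "/wiki/Zürich"        # hypothetical path containing a non-ASCII character
uri_path = quote(iri_path)       # percent-encodes "ü" as its UTF-8 bytes, keeps "/"
round_trip = unquote(uri_path)   # decoding restores the original characters

print(uri_path)
print(round_trip == iri_path)
```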

​​ JMX

Java Management Extensions (JMX) is built into Java and can be enabled for Java applications. If enabled, it provides an API that can be used to get Metrics that are specific to the Java application. JMX can also be used to manage the application.

​​ JSON

JSON is a way of describing any kind of artifact in a human-readable format that is also easily parsed by JavaScript. Because of the convenience of using this format, many programming languages now have implementations to parse and render objects in JSON.
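
A minimal sketch of rendering and parsing JSON with Python's standard `json` module. The record and its field names are made up for illustration:

```python
import json

# Hypothetical record; any nesting of dicts, lists, strings, numbers,
# booleans, and None can be rendered as JSON.
record = {"code": "A10", "active": True, "tags": ["drug", "atc"]}

text = json.dumps(record)    # render the object as JSON text
restored = json.loads(text)  # parse the text back into an object

print(text)
print(restored == record)
```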

​​ JSON-LD

​​ JWT

JSON Web Token (JWT) is a proposed Internet standard for creating data with optional Signature and/or optional Encryption whose payload holds JSON that asserts some number of claims. The tokens are Signed using either a shared secret or a Public Key/Private Key pair.
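
The shared-secret variant can be sketched with Python's standard library alone. This is an illustration of the token layout (header, payload, and signature joined by dots), assuming the HS256 algorithm. The helper names, claims, and secret are hypothetical, and a production system would normally rely on a vetted JWT library and also validate claims such as expiry.

```python
import base64
import hashlib
import hmac
import json


def b64url(raw: bytes) -> bytes:
    """Base64url-encode without padding, as used for JWT segments."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=")


def sign_hs256(claims: dict, secret: bytes) -> bytes:
    """Build header.payload.signature using HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    segments = [
        b64url(json.dumps(header, separators=(",", ":")).encode()),
        b64url(json.dumps(claims, separators=(",", ":")).encode()),
    ]
    signing_input = b".".join(segments)
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return signing_input + b"." + b64url(signature)


def verify_hs256(token: bytes, secret: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    signing_input, _, signature = token.rpartition(b".")
    expected = b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return hmac.compare_digest(signature, expected)


token = sign_hs256({"sub": "user-1"}, b"demo-secret")  # hypothetical claim and secret
print(token.decode())
print(verify_hs256(token, b"demo-secret"))
```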

​​ LOINC

​​ LZO

LZO is a Lossless Compression method that is optimized for Decompression.

​​ Markdown

Markdown is a very lightweight plain text markup language for writing documentation. It is easy to read, easy to write, and easy to render to other markup languages like HTML.

​​ MLLP

MLLP (Minimal Lower Layer Protocol) is a protocol used to transfer HL7 messages, most commonly HL7v2, via TCP/IP.

​​ Mono Repository

​​ N-Triples

​​ NTP

Network Time Protocol (NTP) is a Message Protocol for clock synchronization between computer systems over packet-switched, variable Latency data Networks.

​​ OAUTH

​​ OpenAPI

​​ OpenID

​​ ORC

​​ ORI

​​ OWL

The Web Ontology Language (OWL) is a semantic web language designed to represent rich and complex knowledge about things, groups of things, and relations between things. The current standard is OWL 2, which was published in 2009. It is based on RDF.

​​ Parquet

Parquet is an open source binary column-oriented data format. It is efficient for storage and for analytical queries that read only a subset of the columns.

​​ Part 11

Title 21 CFR Part 11 is the part of Title 21 of the Code of Federal Regulations that establishes the United States Food and Drug Administration (FDA) regulations on electronic records and electronic signatures (ERES).

​​ PCI-DSS

PCI-DSS is a set of regulations to protect Credit Card details.

​​ PDF

Portable Document Format (PDF) is a file format for documents, created by Adobe.

​​ PO

PO is a file format that aims to provide a solution to Multi Language Support.

​​ Protobuf

Protocol Buffers, or Protobuf, is a free and Open Source Cross-Platform data format used to serialize structured data. It is useful for programs that communicate with each other over a Network or that store data. Protobuf messages are typically more compact and Performant than the JSON payloads commonly used with REST.

​​ PROV

Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness. The PROV Family of Documents defines a model, corresponding serializations and other supporting definitions to enable the inter-operable interchange of provenance information in heterogeneous environments such as the Web.

​​ prov-dm

Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness. PROV-DM is the conceptual data model that forms a basis for the W3C provenance (PROV) family of specifications. PROV-DM distinguishes core structures, forming the essence of provenance information, from extended structures catering for more specific uses of provenance. PROV-DM is organized in six components, respectively dealing with: (1) entities and activities, and the time at which they were created, used, or ended; (2) derivations of entities from entities; (3) agents bearing responsibility for entities that were generated and activities that happened; (4) a notion of bundle, a mechanism to support provenance of provenance; (5) properties to link entities that refer to the same thing; and, (6) collections forming a logical structure for its members.

​​ PROV-O

The PROV Ontology (PROV-O) expresses the PROV Data Model (PROV-DM) using the OWL 2 Web Ontology Language (OWL2). It provides a set of classes, properties, and restrictions that can be used to represent and interchange provenance information generated in different systems and under different contexts. It can also be specialized to create new classes and properties to model provenance information for different applications and domains. The PROV Document Overview describes the overall state of PROV, and should be read before other PROV documents.

​​ QR Code

A Quick Response Code, better known as a QR-Code, is a 2D Barcode.

​​ RDF

Resource Description Framework (RDF) is a specification for expressing information about resources in a Data Graph.

​​ RDF/XML

​​ RDFa

​​ REST

A REST API is (an interface definition of) a Web Application that can return data and execute actions on data.

​​ SAML

Security Assertion Markup Language 2.0 (SAML 2.0) is a version of the SAML standard for exchanging authentication and authorization identities between security domains. It is based on XML.

​​ Section 508

People with disabilities may use assistive technology, such as screen readers, to help them understand websites. For those tools to work well, a website should meet certain accessibility standards. In the United States, these standards are set out in Section 508 of the Rehabilitation Act.

​​ Semantic Versioning

Semantic Versioning is an approach to address Dependency Hell by making incompatible dependencies more predictable.
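
A simplified sketch of how MAJOR.MINOR.PATCH numbers make compatibility predictable. The helper names are hypothetical, and the sketch assumes plain three-part versions: it ignores pre-release and build metadata and the special rules for `0.x` versions.

```python
def parse_version(text: str) -> tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_compatible(installed: str, required: str) -> bool:
    """True if 'installed' has the same MAJOR version and is at least 'required'."""
    have, need = parse_version(installed), parse_version(required)
    # A different MAJOR version signals incompatible changes.
    return have[0] == need[0] and have >= need


print(is_compatible("1.4.2", "1.2.0"))   # True: same major, newer minor
print(is_compatible("2.0.0", "1.2.0"))   # False: major version changed
print(is_compatible("1.10.0", "1.9.0"))  # True: compared as numbers, not as text
```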

​​ SFTP

SSH File Transfer Protocol (SFTP) provides File Transfer capabilities on top of SSH. Or in other words: SFTP adds File Transfer capabilities to something that is already Secure. Despite the name, it is a separate protocol and not FTP run over SSH.

​​ SGML

​​ SMTP

Simple Mail Transfer Protocol (SMTP) is used for sending Email.

​​ Snappy

Snappy is usually the best Lossless Compression method for Parquet and Avro files. It favors speed over Compression ratio: it is one of the fastest algorithms for Compression and is also fast for Decompression. It supports high data Throughput and allows Partitioning of data.

​​ SNOMED

​​ SOAP

Simple Object Access Protocol (SOAP) is an XML-based Message Protocol for calling Web Services. Its services are described in XML (see WSDL), similar to what Swagger does for REST APIs. Whether or not it’s really “simple” is debatable.

​​ SOC2

SOC2 is an Auditing procedure that ensures your service providers securely manage your data to protect the interests of your organization and the privacy of its clients.

​​ SPARQL

​​ SQL

SQL is a Query Language that is used for most Relational Databases. It assumes that data is structured in Tables that have Rows and Columns.
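
A small sketch of the Table, Row, and Column model using Python's standard `sqlite3` module with an in-memory database. The table name, column names, and sample rows are chosen for illustration only:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# A Table with two Columns; each inserted tuple becomes one Row.
conn.execute("CREATE TABLE drug (atc_code TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO drug VALUES (?, ?)",
    [("A10BA02", "metformin"), ("N02BE01", "paracetamol")],
)

# Select the names of all Rows whose code starts with 'A10'.
rows = conn.execute(
    "SELECT name FROM drug WHERE atc_code LIKE 'A10%' ORDER BY name"
).fetchall()
conn.close()

print(rows)
```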

​​ SSH

Secure Shell (SSH) is a protocol for securely operating network services over an unsecured network.

​​ SSL

Secure Sockets Layer (SSL) was a widely used cryptographic protocol for providing data security for Internet communications. SSL was superseded by TLS; however, most people still refer to Internet cryptographic protocols as SSL.

​​ SSO

SSO is an Authentication scheme that allows a User to login with a single ID to any of several related, yet independent, software systems.

​​ Swagger

Swagger is the tooling that can be used with OpenAPI specifications.

​​ Synchronous

​​ TCP

Where IPv4 and IPv6 specify how packets are sent across a Network, TCP adds the concepts of “Connections” and Ports (among others).

​​ TLS

Transport Layer Security (TLS) is a security protocol that replaces SSL for data privacy and Internet communication security. TLS encrypts communications between web applications and servers, such as when a visitor’s browser loads a website.

​​ TOML

TOML is a file format for configuration files, intended to be unambiguously mappable to a Dictionary (a list of Key-Value Pairs).

​​ TriG

​​ Turtle

​​ UCUM

​​ UML

UML is a general-purpose modeling language for Software development that is intended to provide a standard way to visualize the design of a system.

​​ UMLS

​​ URI

A Uniform Resource Identifier (URI) is a unique sequence of characters that identifies a logical or physical resource used by web technologies. URIs may be used to identify anything, including real-world objects, such as people and places, concepts, or information resources such as web pages and books.

​​ URL

A Uniform Resource Locator (URL), colloquially termed a web address, is a reference to a web resource that specifies its location on a computer network, and a mechanism for retrieving it.
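
The parts of a URL can be inspected with `urllib.parse` from Python's standard library. The URL below is a made-up example: the scheme names the retrieval mechanism, and the host, port, and path give the location.

```python
from urllib.parse import urlsplit

# Hypothetical URL used only to show the individual components.
parts = urlsplit("https://example.org:8443/docs/index.html?lang=en#intro")

print(parts.scheme)    # mechanism for retrieving the resource
print(parts.hostname)  # host on the network
print(parts.port)      # TCP port
print(parts.path)      # location on that host
print(parts.query)     # extra parameters
print(parts.fragment)  # position inside the resource
```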

​​ URN

A URN is a URI that uses the urn scheme to identify a logical or physical resource used by web technology, but it does not provide information to locate the object.

​​ USCDI

USCDI is an acronym for United States Core Data for Interoperability.

​​ Websocket

WebSocket is a Network Protocol that provides full-duplex communication channels over a single TCP/IP connection.

​​ WSDL

Web Service Description Language (WSDL) is a language to describe SOAP Web Services.

​​ XML

The Extensible Markup Language (XML) is a subset of SGML. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML.
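
A minimal sketch of parsing an XML document with Python's standard `xml.etree.ElementTree` module. The element and attribute names are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one root element with an attribute and a child element.
doc = ET.fromstring('<patient id="p1"><name>Jane</name></patient>')

print(doc.tag)                # name of the root element
print(doc.attrib["id"])       # attribute value
print(doc.find("name").text)  # text content of the child element
```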

​​ XSD

XML Schema Definition (XSD) is a language for describing the structure and allowed content of XML documents.

​​ XSL

An eXtensible Stylesheet Language (XSL) file describes how to transform an XML document from one schema to another (or basically into any text-based format).

​​ YAML

YAML addresses the same domain as JSON. One can describe the same objects as one could with JSON. YAML is easier to read by humans, though, and YAML supports comments, which JSON does not.
