Data Standards
See definitions for Open Ecosystems terms.
ALCOA
ALCOA is a set of principles to ensure data quality. It is an implementation of GDocP.
ALCOA+
ALCOA+ has been adopted by various industries as a framework for ensuring that data security and integrity are observed and maintained. It is in use by major bodies such as the FDA and WHO. Its focus is data quality.
ATC
The Anatomical Therapeutic Chemical Classification System (ATC) Code Set is the most widely used Drug classification system, which assigns Drugs a unique ATC Code.
Base32
Base32 is a binary-to-text Encoding, similar to Base64, but it uses 5-bit digits instead of 6-bit digits.
Base64
Base64 is a binary-to-text Encoding that splits binary data into groups of 24 bits (3 bytes), each of which is represented by four 6-bit Base64 digits.
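The grouping described for Base64 and Base32 can be illustrated with a minimal sketch using Python's standard `base64` module. The sample inputs are arbitrary and chosen only so that they fill exactly one group in each Encoding.

```python
import base64

# 3 bytes = 24 bits, which Base64 splits into four 6-bit digits.
raw = b"Man"
b64 = base64.b64encode(raw)

# 5 bytes = 40 bits, which Base32 splits into eight 5-bit digits.
b32 = base64.b32encode(b"Hello")

print(b64, len(b64))  # four Base64 digits for three input bytes
print(b32, len(b32))  # eight Base32 digits for five input bytes

# Both Encodings are reversible.
print(base64.b64decode(b64) == raw)
```

Inputs that do not fill a whole group are padded with `=` characters in both Encodings.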
BSON
BSON is a binary data interchange format with capabilities similar to JSON, plus additional data types such as dates and binary data.
BSR
C-CDA
C4 Model
The C4 Model is a Lean graphical notation technique for modelling the Architecture of Software systems. It relies on other Modelling Techniques like UML and ERD.
ClinVar
COSMIC
CPT
CSF
CSF is a framework that is used to structure and organize elements of User Interfaces. It is designed to help developers and designers break down the parts of a User Interface into discrete units called “components” that can be managed, implemented, and reused more easily.
CSV
Comma Separated Values (CSV) is a file format for tabular data.
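As a minimal sketch of what "tabular data" means here, the following reads a small CSV text with Python's standard `csv` module. The column names and values are made up for illustration.

```python
import csv
import io

# Hypothetical CSV content: the first line is the header row.
text = "name,unit\nglucose,mg/dL\nsodium,mmol/L\n"

# DictReader maps each data row to a dict keyed by the header names.
rows = list(csv.DictReader(io.StringIO(text)))

for row in rows:
    print(row["name"], row["unit"])
```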
Cypher
Cypher is a Query Language for Labeled Property Graphs. It aims to be easily readable by both humans and machines. It is also designed to look familiar to people who know SQL.
dbVar
DMC
A Data Matrix Code (DMC) is a 2D Barcode. It can be distinguished from other barcodes by the two solid black lines that form an L shape along two adjacent edges.
DNS
The Domain Name System (DNS) is a network service that can resolve Domain Names to current IP Addresses.
DTD
EDI
Electronic Data Interchange (EDI) is the electronic interchange of business information using a standardized format; a process which allows one company to send information to another company electronically rather than with paper. Business entities conducting business electronically are called trading partners.
ERD
An Entity Relationship Diagram (ERD) describes interrelated things. It is composed of Entity types and specifies the Relationships that can exist between them.
FAIR
In 2016, the ‘FAIR Guiding Principles for scientific data management and stewardship’ were published in Scientific Data. The authors intended to provide guidelines to improve the Findability, Accessibility, Interoperability, and Reuse of digital assets. It is an implementation of GDocP.
FHIR
FHIR is an Industry Standard for describing Health Care related Artifacts.
FIPS
FIPS contains Security Functional Requirements and Non-Functional Requirements that we MUST adhere to.
FTP
FTPS
File Transfer Protocol, Secure (FTPS) = FTP + SSL. Or in other words: FTPS adds Security to File Transfer capabilities.
GCP
GDocP
Good Documentation Practices (GDocP). To achieve robust decisions, the supporting data set needs to be reliable and complete. GDocP should be followed in order to ensure that all records, both paper and electronic, allow the full reconstruction and traceability of GxP activities.
GDP
GDPR
The General Data Protection Regulation (GDPR) is a regulation in EU law on data protection and privacy in the European Union and the European Economic Area.
GitFlow
GitFlow is a Branching Model that does not depend on creating forks of central repositories. Instead, it creates separate timelines by agreeing on specific branch names.
GitHub Flow
GitHub Flow is a Branching Model defined by GitHub.
GLBA
The GLBA requires financial institutions (companies that offer consumers financial products or services like loans, financial or investment advice, or insurance) to explain their information-sharing practices to their customers and to safeguard sensitive data.
GMP
GraphQL
Gremlin
Gremlin is a query language for Graph Databases.
gRPC
gRPC uses a binary format to encode data, which is much faster, cheaper, and more compact than many other Message Protocols.
GxP
GxP is an acronym for the group of good practice guides governing the preclinical, clinical, manufacturing, testing, storage, distribution and post-market activities for regulated pharmaceuticals, biologicals and Medical Devices, such as good laboratory practices, good clinical practices, good manufacturing practices, good pharmacovigilance practices and good distribution practices.
GZIP
GZIP has a relatively high Compression ratio, but it is slow both to Compress and to Decompress. As such, it does not do well with high Throughput.
HCL
HashiCorp Configuration Language (HCL) is a DSL created by HashiCorp, which is used in Terraform for example.
HCPS
HIE
HITECH
HiTrust
HL7v2
HL7v3
HTML
HyperText Markup Language (HTML) is the code that is used to structure a web page and its content.
HTTP
HTTP is a Transport Protocol for Web Applications.
HTTPS
Hypertext Transfer Protocol Secure (HTTPS) is an extension of the Hypertext Transfer Protocol (HTTP). It is used for secure communication over a computer network and is widely used on the Internet.
ICD-10
ICD-9
IPv4
Internet Protocol v4 (IPv4) is a full specification of how communication should work across networks.
IPv6
Internet Protocol v6 (IPv6) is a full specification of how communication should work across networks.
IRI
The Internationalized Resource Identifier (IRI) is an internet protocol standard that builds on the URI. It greatly expands the set of permitted characters. IRIs may contain most characters from the Universal Character Set.
JMX
Java Management Extension (JMX) is built into Java and can be enabled for Java applications. If enabled, it provides an API that can be used to get Metrics that are specific to the Java application. JMX can also be used to manage the application.
JSON
JSON is a way of describing any kind of artifact in a human-readable format that is also easily parsed by JavaScript. Because of the convenience of using this format, many programming languages now have implementations to parse and render objects in JSON.
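As a minimal sketch of parsing and rendering JSON outside JavaScript, the following uses Python's standard `json` module. The `record` object is a made-up example.

```python
import json

# Hypothetical object to serialize.
record = {"code": "A10", "active": True, "tags": ["drug", "atc"]}

# Render the object as human-readable JSON text.
text = json.dumps(record, indent=2)
print(text)

# Parse the text back into an equivalent object.
restored = json.loads(text)
print(restored == record)
```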
JSON-LD
JWT
JSON Web Token (JWT) is a proposed Internet standard for creating data with optional Signature and/or optional Encryption whose payload holds JSON that asserts some number of claims. The tokens are Signed using either a shared secret or a Public Key/Private Key pair.
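The structure of a Signed token can be sketched with the Python standard library alone. This is an illustrative sketch of the shared-secret (HS256) case only, not a production implementation: the helper names `b64url`, `sign_hs256`, and `verify_hs256`, the claims, and the secret are all assumptions made for this example, and a real verifier would also check claims such as expiry.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    segments = [
        b64url(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        b64url(json.dumps(claims, separators=(",", ":")).encode("utf-8")),
    ]
    signing_input = ".".join(segments)
    digest = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(digest)


def verify_hs256(token: str, secret: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    signing_input, _, signature = token.rpartition(".")
    digest = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return hmac.compare_digest(signature.encode("utf-8"), b64url(digest).encode("utf-8"))


# Hypothetical claims and secret, for illustration only.
token = sign_hs256({"sub": "user-1"}, b"demo-secret")
print(token)
print(verify_hs256(token, b"demo-secret"))
```

The token consists of three dot-separated segments: the header, the JSON payload with the claims, and the Signature over the first two.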
LOINC
LZO
LZO is a Lossless Compression method that is optimized for Decompression.
Markdown
Markdown is a very lightweight plain text markup language for writing documentation. It is easy to read, easy to write, and easy to render to other markup languages like HTML.
MLLP
The Minimal Lower Layer Protocol (MLLP) is a protocol used to transfer HL7v3 messages via TCP/IP.
Mono Repository
N-Triples
NTP
Network Time Protocol (NTP) is a Message Protocol for clock synchronization between computer systems over packet-switched, variable Latency data Networks.
OAUTH
OpenAPI
OpenID
ORC
ORI
OWL
The Web Ontology Language (OWL) is a semantic web language designed to represent rich and complex knowledge about things, groups of things, and relations between things. The current standard is OWL 2, which was published in 2009. It is based on RDF.
Parquet
Parquet is an open source binary column-oriented data format that is very efficient.
Part 11
Title 21 CFR Part 11 is the part of Title 21 of the Code of Federal Regulations that establishes the United States Food and Drug Administration (FDA) regulations on electronic records and electronic signatures (ERES).
PCI-DSS
PCI-DSS is a set of regulations to protect Credit Card details.
PDF
PDF is the Portable Document Format, which was created by Adobe.
PO
The PO file format aims to provide a solution for Multi Language Support.
Protobuf
Protocol Buffers, or Protobuf, is a free and Open Source Cross-Platform data format used to serialize structured data. It is useful for developing programs that communicate with each other over a Network, or for storing data. Protobuf is more compact and Performant than the text-based formats, such as JSON, that REST APIs commonly use.
PROV
Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness. The PROV Family of Documents defines a model, corresponding serializations and other supporting definitions to enable the inter-operable interchange of provenance information in heterogeneous environments such as the Web. This document provides an overview of this family of documents.
PROV-DM
Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness. PROV-DM is the conceptual data model that forms a basis for the W3C provenance (PROV) family of specifications. PROV-DM distinguishes core structures, forming the essence of provenance information, from extended structures catering for more specific uses of provenance. PROV-DM is organized in six components, respectively dealing with: (1) entities and activities, and the time at which they were created, used, or ended; (2) derivations of entities from entities; (3) agents bearing responsibility for entities that were generated and activities that happened; (4) a notion of bundle, a mechanism to support provenance of provenance; (5) properties to link entities that refer to the same thing; and, (6) collections forming a logical structure for its members.
PROV-O
The PROV Ontology (PROV-O) expresses the PROV Data Model (PROV-DM) using the OWL2 Web Ontology Language (OWL2). It provides a set of classes, properties, and restrictions that can be used to represent and interchange provenance information generated in different systems and under different contexts. It can also be specialized to create new classes and properties to model provenance information for different applications and domains. The PROV Document Overview describes the overall state of PROV, and should be read before other PROV documents.
QR Code
A Quick Response Code, better known as a QR-Code, is a 2D Barcode.
RDF
Resource Description Framework (RDF) is a specification for expressing information about resources in a Data Graph.
RDF/XML
RDFa
REST
A REST API is (an interface definition of) a Web Application that can return data and execute actions on data.
SAML
Security Assertion Markup Language 2.0 (SAML 2.0) is a version of the SAML standard for exchanging authentication and authorization identities between security domains. It is based on XML.
Section 508
People with disabilities may use screen readers that help them understand websites. For those readers to work well, the website should meet certain standards. These standards are called Section 508.
Semantic Versioning
Semantic Versioning is an approach to address Dependency Hell by making incompatible dependencies more predictable.
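How version numbers can make incompatibility predictable is sketched below, assuming plain `MAJOR.MINOR.PATCH` version strings. The helper names `parse_version` and `is_compatible` are hypothetical, and the sketch ignores pre-release tags, build metadata, and the special rules for `0.x` versions.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_compatible(installed: str, required: str) -> bool:
    """Compatible if the major version matches and installed is not older.

    A change in MAJOR signals an incompatible change, so a different
    major version is treated as incompatible in either direction.
    """
    have, need = parse_version(installed), parse_version(required)
    return have[0] == need[0] and have >= need


print(is_compatible("1.4.2", "1.2.0"))  # same major, newer minor
print(is_compatible("2.0.0", "1.9.9"))  # different major
```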
SFTP
Secure Shell File Transfer Protocol (SFTP) = SSH + FTP. Or in other words: SFTP adds File Transfer capabilities to something that is already Secure.
SGML
SMTP
Simple Mail Transfer Protocol (SMTP) is used for sending Email.
Snappy
Snappy is usually the best Lossless Compression method for Parquet and AVRO files. It is the fastest algorithm for Compression and a fast one for Decompression. It supports high data Throughput and allows Partitioning of data.
SNOMED
SOAP
Simple Object Access Protocol (SOAP) allows for describing services, similar to Swagger, but in XML. Whether or not it’s really “simple” is debatable.
SOC2
SOC2 is an Auditing procedure that ensures your service providers securely manage your data to protect the interests of your organization and the privacy of its clients.
SPARQL
SQL
SQL is a Query Language that is used for most Relational Databases. It assumes that data is structured in Tables that have Rows and Columns.
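The Table, Row, and Column structure can be illustrated with a minimal sketch using Python's standard `sqlite3` module and an in-memory database. The `drug` table and its contents are illustrative only.

```python
import sqlite3

# An in-memory database, discarded when the connection closes.
conn = sqlite3.connect(":memory:")

# A Table with two Columns.
conn.execute("CREATE TABLE drug (atc_code TEXT PRIMARY KEY, name TEXT)")

# Each tuple becomes one Row.
conn.executemany(
    "INSERT INTO drug (atc_code, name) VALUES (?, ?)",
    [("A10BA02", "metformin"), ("N02BE01", "paracetamol")],
)

# Query the Rows whose code starts with 'A10'.
rows = conn.execute(
    "SELECT name FROM drug WHERE atc_code LIKE 'A10%' ORDER BY name"
).fetchall()
conn.close()

print(rows)
```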
SSH
Secure SHell (SSH) is a protocol for securely operating components over an unsecured network.
SSL
Secure Sockets Layer (SSL) was a widely used cryptographic protocol for providing data security for Internet communications. SSL was superseded by TLS; however, most people still refer to Internet cryptographic protocols as SSL.
SSO
SSO is an Authentication scheme that allows a User to log in with a single ID to any of several related, yet independent, software systems.
Swagger
Swagger is the tooling that can be used with OpenAPI specifications.
Synchronous
TCP
Where IPv4 and IPv6 specify how packets are sent across a Network, TCP adds the concepts of “Connections” and TCP IP Ports (among others).
TLS
Transport Layer Security (TLS) is a security protocol that replaces SSL for data privacy and Internet communication security. TLS encrypts communications between web applications and servers, such as when a visitor’s browser loads a website.
TOML
TOML is a file format for configuration files, intended to be unambiguously mappable to a Dictionary (a list of Key-Value Pairs).
TriG
Turtle
UCUM
UML
UML is a general-purpose modeling language for Software development that is intended to provide a standard way to visualize the design of a system.
UMLS
URI
A Uniform Resource Identifier (URI) is a unique sequence of characters that identifies a logical or physical resource used by web technologies. URIs may be used to identify anything, including real-world objects, such as people and places, concepts, or information resources such as web pages and books.
URL
A Uniform Resource Locator (URL), colloquially termed a web address, is a reference to a web resource that specifies its location on a computer network and a mechanism for retrieving it.
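The parts of a URL that express the retrieval mechanism and the location can be pulled apart with Python's standard `urllib.parse` module. The address below is a made-up example.

```python
from urllib.parse import urlsplit

# Hypothetical URL, for illustration only.
parts = urlsplit("https://example.org:8443/docs/page?lang=en#intro")

print(parts.scheme)    # retrieval mechanism
print(parts.hostname)  # network location: host
print(parts.port)      # network location: port
print(parts.path)      # resource path on that host
print(parts.query)
print(parts.fragment)
```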
URN
A Uniform Resource Name (URN) is a URI that uses the urn scheme to identify a logical or physical resource used by web technology, but it does not provide information to locate the object.
USCDI
USCDI is an acronym for United States Core Data for Interoperability.
Websocket
WebSocket is a Network Protocol that provides full-duplex communication channels over a single TCP/IP connection.
WSDL
Web Service Description Language (WSDL) is a language to describe SOAP Web Services.
XML
The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML.
XSD
XML Schema Definition (XSD)
XSL
An eXtensible Stylesheet Language (XSL) file describes how to mediate an XML document from one schema to another (or basically into any text-based format).
YAML
YAML addresses the same domain as JSON. One can describe the same objects as one could with JSON. YAML is easier for humans to read, though, and it supports comments, which JSON does not.