Metadata

From iMMAP-Colombia Wiki

Metadata is defined as data about data. Although this definition is easy to remember, it is not very precise. Its strength lies in recognizing that metadata is itself data. As such, metadata can be stored and managed in a database, often called a registry or repository. However, it is impossible to identify metadata just by looking at it: we cannot tell whether a given item is metadata or simply data.<ref>METADATA STANDARDS AND METADATA REGISTRIES: AN OVERVIEW</ref>

Metadata is a concept that applies mainly to electronically archived data. It is used to describe:

  1. the definition,
  2. the structure, and
  3. the administration of data files with all their contents in context, to facilitate future use of the captured data.

Web pages frequently include metadata in the form of meta tags. Meta tags with descriptions and keywords are often used to describe the content of a web page. Most search engines use this data when adding pages to their search index.
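As an illustration of how such meta tags can be read, the following is a minimal sketch using Python's standard `html.parser` module. The sample page, the class name `MetaTagParser` and the function `extract_meta_tags` are assumptions made for this example; real search engines use far more elaborate processing.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


def extract_meta_tags(html_text):
    """Return a dict mapping meta tag names to their content."""
    parser = MetaTagParser()
    parser.feed(html_text)
    return parser.meta


# Hypothetical page used only for illustration.
SAMPLE_PAGE = """
<html><head>
<title>Flood response maps</title>
<meta name="description" content="Maps of flood-affected areas.">
<meta name="keywords" content="flood, maps, humanitarian">
</head><body></body></html>
"""

tags = extract_meta_tags(SAMPLE_PAGE)
print(tags["description"])
print(tags["keywords"].split(", "))
```

Only tags that carry both a `name` and a `content` attribute are collected; other `<meta>` variants (for example `charset`) are ignored in this sketch.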

Metadata Definition

Metadata definition provides information about distinct items, such as:

  • means of creation,
  • purpose of the data,
  • time and date of creation,
  • creator or author of the data,
  • placement on a network (electronic form) where the data was created,
  • what standards were used,
  • etc.

For example, a digital image may include metadata that describes how large the picture is, the color depth, the image resolution, when the image was created, and other data. A text document's metadata may contain information about how long the document is, who the author is, when the document was written, and a short summary of the document.

Metadata has long been used in various forms as a means of cataloguing archived information. An early example is the Dewey Decimal System employed by libraries to index books. In this system, the data was recorded on small 3x5 inch (A7) sized cards giving the name of the book, its author, subject matter, a brief synopsis and, typically, an abbreviated alphanumeric code indicating the location of the book on particular shelves. Such data helps classify, aggregate and identify the books in question so that they can be found quickly. Another older form of metadata collection is the US Census Bureau's "Long Form", which asks questions that are used to create demographic data and to find patterns of distribution.<ref>Plantilla:Cite web</ref>

The term metadata was coined in 1968 by Philip Bagley, one of the pioneers of computerized document retrieval.<ref>Plantilla:Citation</ref><ref>"The notion of "metadata" introduced by Bagley". Plantilla:Citation</ref> Since then the fields of information management, information science, information technology, librarianship and GIS have widely adopted the term. In these fields the word metadata is defined as "data about data".<ref name=NISO>Plantilla:Cite web</ref> While this is the generally accepted definition, various disciplines have adopted their own more specific explanations and uses of the term.

For the purposes of this article, an "object" refers to any of the following:

  • a physical item such as a book, CD, DVD, map, chair, table, flower pot, etc.
  • an electronic file such as a digital image, digital photo, document, program file, database table, etc.

Photographic Metadata Definition: Photographic metadata is information written into a digital photo file that identifies who owns it, gives copyright and contact information, and records what camera created the file, along with exposure information and descriptive information such as keywords about the photo. This makes the file searchable on a computer and/or the Internet. Some metadata is written by the camera and some is entered by the photographer and/or software after downloading to a computer.

Photographic metadata standards are developed and maintained by several organizations. They include, but are not limited to:

  • IPTC Information Interchange Model IIM (International Press Telecommunications Council),
  • IPTC Core Schema for XMP,
  • XMP - Extensible Metadata Platform (an Adobe standard)
  • Exif - Exchangeable image file format, maintained by CIPA (Camera & Imaging Products Association) and published by JEITA (Japan Electronics and Information Technology Industries Association)
  • Dublin Core (Dublin Core Metadata Initiative -DCMI)
  • PLUS (Picture Licensing Universal System)

Creation of Metadata

Metadata can be created either by automated information processing or by manual work. Elementary metadata captured by computers can include information about when a file was created, who created it, when it was last updated, file size and file extension.
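The elementary, automatically captured metadata described above can be read back from the file system. The following is a minimal sketch using Python's standard `os.stat`; the function name `basic_file_metadata` and the sample file are assumptions for illustration. Creation time and owner are deliberately left out because their availability and meaning differ between operating systems.

```python
import os
import tempfile
from datetime import datetime, timezone


def basic_file_metadata(path):
    """Return elementary metadata that the operating system keeps for a file."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "extension": os.path.splitext(path)[1],
        "size_bytes": info.st_size,
        "last_modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
    }


# Create a throwaway file so the example is self-contained.
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "report.txt")
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write("hello metadata")
    metadata = basic_file_metadata(sample_path)

print(metadata["name"], metadata["extension"], metadata["size_bytes"])
print(metadata["last_modified"].isoformat())
```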

Metadata Structures

Metadata is typically structured according to a standardised concept using a well-defined metadata scheme, including metadata standards and metadata models. Tools such as controlled vocabularies, taxonomies, thesauri, data dictionaries and metadata registries can be used to apply further standardisation to the metadata.

Metadata Syntax

Metadata syntax refers to the rules created to structure the fields or elements of metadata.<ref>Plantilla:Cite web </ref> A single metadata scheme may be expressed in a number of different markup or programming languages, each of which requires a different syntax. For example, Dublin Core may be expressed in plain text, HTML, XML and RDF.<ref> Plantilla:Cite web </ref>
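To make the point about one scheme and several syntaxes concrete, the sketch below writes the same small Dublin Core record as HTML `<meta>` tags (using the common `DC.` name prefix) and as XML elements in the Dublin Core element namespace. The record values, the function names and the `metadata` wrapper element are assumptions chosen for this example, not part of any standard profile.

```python
import xml.etree.ElementTree as ET
from html import escape

DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"

# Hypothetical record used only for illustration.
record = {"title": "Flood map 2010", "creator": "A. Author", "date": "2010-01-07"}


def to_html_meta(fields):
    """Express the record as HTML <meta> tags using the "DC." prefix convention."""
    return "\n".join(
        f'<meta name="DC.{name}" content="{escape(value, quote=True)}">'
        for name, value in fields.items()
    )


def to_xml(fields):
    """Express the same record as XML elements in the Dublin Core namespace."""
    ET.register_namespace("dc", DC_NAMESPACE)
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NAMESPACE}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


print(to_html_meta(record))
print(to_xml(record))
```

The element names and values are identical in both outputs; only the syntax that carries them differs.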

Metadata Types

Because metadata is applied across a large variety of fields, there are only specialised, well-accepted models for specifying types of metadata. Bretherton & Singley (1994) distinguish between two distinct classes: structural/control metadata and guide metadata.<ref>Plantilla:Cite conference </ref> Structural metadata is used to describe the structure of computer systems such as tables, columns and indexes. Guide metadata is used to help humans find specific items and is usually expressed as a set of keywords in a natural language.

According to Ralph Kimball, metadata can be divided into two similar categories: technical metadata and business metadata. Technical metadata corresponds to internal metadata, and business metadata to external metadata. Kimball adds a third category named process metadata.

NISO, on the other hand, distinguishes between three types of metadata: descriptive, structural and administrative.<ref name=NISO/> Descriptive metadata is the information used to search for and locate an object, such as title, author, subjects, keywords and publisher; structural metadata describes how the components of the object are organised; and administrative metadata refers to technical information, including file type. Two sub-types of administrative metadata are rights management metadata and preservation metadata.
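The NISO distinction can be sketched as a simple data structure that keeps the three types apart. This is only an illustration: the class `ObjectMetadata`, the field contents and the helper `matches_keyword` are hypothetical and do not come from the NISO document.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectMetadata:
    """Groups the metadata of one object by the three NISO types."""
    descriptive: dict = field(default_factory=dict)     # used to find the object
    structural: dict = field(default_factory=dict)      # how its parts are organised
    administrative: dict = field(default_factory=dict)  # technical and rights information


# Hypothetical record used only for illustration.
book = ObjectMetadata(
    descriptive={"title": "Field Atlas", "author": "A. Author", "keywords": ["maps", "atlas"]},
    structural={"chapters": ["Overview", "Regional maps", "Index"]},
    administrative={"file_type": "application/pdf", "rights": "CC BY 4.0"},
)


def matches_keyword(record, keyword):
    """Search looks at descriptive metadata only, as a catalogue lookup would."""
    keywords = record.descriptive.get("keywords", [])
    return keyword.lower() in (item.lower() for item in keywords)


print(matches_keyword(book, "Maps"))
print(matches_keyword(book, "pdf"))
```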

Hierarchical, linear and planar schemata

Metadata schemas can be hierarchical in nature, where relationships exist between metadata elements and elements are nested so that parent-child relationships exist between them. An example of a hierarchical metadata schema is the IEEE LOM schema, where metadata elements may belong to a parent metadata element. Metadata schemas can also be one-dimensional, or linear, where each element is completely discrete from other elements and classified according to one dimension only. An example of a linear metadata schema is the Dublin Core schema. Metadata schemas are often two-dimensional, or planar, where each element is completely discrete from other elements but classified according to two orthogonal dimensions.<ref>Plantilla:Cite web </ref>
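The difference between a hierarchical and a linear schema can be sketched with nested and flat dictionaries. The element names below are only loosely modelled on LOM and Dublin Core and should be read as illustrative assumptions, as should the helper `flatten`.

```python
# Hierarchical: elements nest under parent elements (LOM-like, names illustrative).
hierarchical = {
    "general": {"title": "Intro to GIS", "language": "en"},
    "lifecycle": {
        "version": "1.0",
        "contribute": {"role": "author", "entity": "A. Author"},
    },
}

# Linear: a flat set of independent elements (Dublin Core-like).
linear = {"title": "Intro to GIS", "creator": "A. Author", "language": "en"}


def flatten(tree, prefix=""):
    """Turn a nested schema instance into dotted paths, one per leaf element."""
    flat = {}
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


for path, value in flatten(hierarchical).items():
    print(path, "=", value)
```

Flattening a linear record leaves it unchanged, whereas the hierarchical record only becomes flat by encoding each element's parents in its path.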

Metadata Hypermapping

In all cases where the metadata schemata go beyond a planar depiction, some type of hypermapping is required to display and view the metadata according to a chosen aspect and to serve special views. Hypermapping frequently applies to the layering of geographical and geological information overlays<ref>[www.isprs.org/proceedings/XXXII/part4/www.ifp.uni.../kuebler51.pdf THE DESIGN AND DEVELOPMENT OF A GEOLOGIC HYPERMAP PROTOTYPE]</ref>.

Granularity

Granularity is a term that applies to data as well as to metadata. The degree to which metadata is structured is referred to as its granularity. Metadata with high granularity allows for more deeply structured information and enables greater levels of technical manipulation. A lower level of granularity means that metadata can be created at considerably lower cost, but it will not provide information that is as detailed. Granularity affects not only creation and capture but, even more, maintenance: as soon as the metadata structures become outdated, access to the data they refer to becomes outdated as well. The choice of granularity should therefore take into account both the effort to create the metadata and the effort to maintain it.

Metadata Standards

International standards apply to metadata. Much work is being accomplished in the national and international standards communities, especially ANSI (American National Standards Institute) and ISO (International Organization for Standardization) to reach consensus on standardizing metadata and registries.

The core standard is ISO/IEC 11179-1:2004<ref>ISO/IEC 11179-1:2004 Information technology - Metadata registries (MDR) - Part 1: Framework</ref> and subsequent standards (see ISO/IEC 11179). All registrations published so far under this standard cover only the definition of metadata; they do not address the structuring of metadata storage or retrieval, nor any administrative standardisation.

Metadata Usage

Statistics and Census Services

Standardisation work has had a large impact on efforts to build metadata systems in the statistical community. Several metadata standards are important to statistical agencies and have been applied at the Census Bureau, Environmental Protection Agency, Bureau of Labor Statistics, Statistics Canada, and many others. A metadata registry in particular can have a significant impact in a statistical agency.

Library and Information Science

Digital libraries widely employ metadata in library management systems. Metadata is used as a means of cataloguing resources such as books, periodicals, papers, CDs, and DVDs. This data is stored in an integrated library management system (ILMS) using the MARC metadata standard. The purpose is to allow direct querying for quick access to the titles in the repository on a given subject.

Libraries are also using the ILMS to store information about electronic resources including electronic journals, e-books and websites.

Standardisation for library operation has been a key topic in international standardisation (ISO) for decades. Standards for metadata in digital libraries include Dublin Core, METS, MODS, DDI, the ISO standard Digital Object Identifier (DOI), the ISO standard Uniform Resource Name (URN), the PREMIS schema, and OAI-PMH. Leading libraries around the world publish guidance on their metadata standards strategies.<ref>Library of Congress Washington DC on metadata</ref><ref>[www.d-nb.de/standardisierung/.../metadaten.htm Deutsche Nationalbibliothek Frankfurt on metadata]</ref>

Metadata and the Law

United States

Problems involving metadata in litigation in the United States are becoming widespread. Courts have looked at various questions involving metadata, including the discoverability of metadata by parties. Although the Federal Rules of Civil Procedure have only specified rules about electronic documents, subsequent case law has elaborated on the requirement of parties to reveal metadata.<ref>Plantilla:Cite journal</ref> In October 2009, the Arizona Supreme Court ruled that metadata records are public record.<ref>Plantilla:Cite news </ref>

Document metadata is particularly important in legal environments, where litigation can request this sensitive information, which may include many elements of private and potentially damaging data. Such data has been linked to multiple lawsuits that have brought corporations into legal difficulties.

Using metadata removal tools can mitigate the risks associated with metadata. These tools clean documents before they are sent outside the firm. This process partially protects law firms from potentially unsafe leaking of sensitive data through electronic discovery. Removal of metadata is only one aspect of redaction, a technique that notoriously must be performed thoroughly and completely.

Metadata in Healthcare

Australian researchers in medicine initiated much of the metadata definition work for applications in health care. That approach is the first recognised attempt to adhere to international standards in the medical sciences instead of first defining a proprietary standard under the WHO umbrella.

The medical community has not yet accepted the need to follow metadata standards, despite research on the subject.<ref>[ceur-ws.org/Vol-559/Paper1.pdf TIM: A Semantic Web Application for the Specification of Metadata Items in Clinical Research]</ref>

Metadata and Data Warehousing

A data warehouse (DW) is a repository of an organization's electronically stored data. Data warehouses are designed to manage and store the data, whereas business intelligence (BI) focuses on the use of data to facilitate reporting and analysis.<ref>Inmon, W.H. Tech Topic: What is a Data Warehouse? Prism Solutions. Volume 1. 1995. (http://en.wikipedia.org/wiki/Data_warehouse)</ref>

The purpose of a data warehouse is to house standardized, structured, consistent, integrated, correct, cleansed and timely data, extracted from various operational systems in an organization. The extracted data is integrated in the data warehouse environment in order to provide an enterprise-wide perspective: one version of the truth. The data is structured specifically to address reporting and analytic requirements.

An essential component of a data warehouse/business intelligence system is the metadata, together with the tools to manage and retrieve it. Ralph Kimball<ref>Ralph Kimball, The Data Warehouse Lifecycle Toolkit, Second Edition. New York, Wiley, 2008, ISBN 978-0-470-14977-5, pages 10, 115-117, 131-132, 140, 154-155</ref> describes metadata as the DNA of the data warehouse, as metadata defines the elements of the data warehouse and how they work together.

Metadata on the Internet

The HTML format used to define web pages allows for the inclusion of a variety of types of metadata, from basic descriptive text, dates and keywords to more advanced metadata schemes such as the Dublin Core, e-GMS, and AGLS<ref>National Archives of Australia, AGLS Metadata Standard, accessed 07 January 2010, [1]</ref> standards. Pages can also be geotagged with coordinates. Metadata may be included in the page's header or in a separate file. Microformats allow metadata to be added to on-page data in a way that users don't see, but computers can readily access.

Many search engines are cautious about using metadata in their ranking algorithms because metadata has been exploited through search engine optimization (SEO) to improve rankings. See the Meta element article for further discussion.

Geospatial Metadata

Metadata that describes geographic objects (such as datasets, maps, features, or simply documents with a geospatial component) has a history going back to at least 1994 (see the MIT Library page on FGDC Metadata). This class of metadata is described more fully on the Geospatial metadata page.

Metadata Administration and Management

Metadata Storage

Metadata can be stored either internally, in the same file as the data, or externally, in a separate file. Metadata that is embedded with content is called embedded metadata. A data repository typically stores the metadata detached from the data. Both ways have advantages and disadvantages:

  • Internal storage allows metadata to be transferred together with the data it describes; thus, metadata is always at hand and can be manipulated easily. However, this method creates high redundancy and does not allow metadata to be held together in one place.
  • External storage allows metadata to be bundled, for example in a database, for more efficient searching. There is no redundancy, and metadata can be transferred simultaneously when using streaming. However, as most formats use URIs for that purpose, the way the metadata is linked to its data should be treated with care. What if a resource does not have a URI (resources on a local hard disk or web pages that are created on-the-fly using a content management system)? What if metadata can only be evaluated if there is a connection to the Web, especially when using RDF? How can one detect that a resource has been replaced by another with the same name but different content?
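One possible way to approach external storage, and the last question raised above, is a sidecar file that records a checksum of the data it describes. The sketch below is an assumption-laden illustration: the `.meta.json` suffix, the JSON layout and the function names `write_sidecar` and `sidecar_matches` are invented for this example and are not a standard.

```python
import hashlib
import json
import os
import tempfile


def write_sidecar(data_path, metadata):
    """Store metadata externally in '<data file>.meta.json' with a content checksum."""
    with open(data_path, "rb") as handle:
        digest = hashlib.sha256(handle.read()).hexdigest()
    sidecar_path = data_path + ".meta.json"
    with open(sidecar_path, "w", encoding="utf-8") as handle:
        json.dump({"sha256": digest, "metadata": metadata}, handle, indent=2)
    return sidecar_path


def sidecar_matches(data_path):
    """Return True if the data file still has the content the sidecar describes."""
    with open(data_path + ".meta.json", encoding="utf-8") as handle:
        recorded = json.load(handle)["sha256"]
    with open(data_path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest() == recorded


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "survey.csv")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("id,value\n1,10\n")
    write_sidecar(path, {"creator": "A. Author"})
    unchanged = sidecar_matches(path)

    # Replace the file with different content under the same name.
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("id,value\n1,99\n")
    change_detected = not sidecar_matches(path)

print(unchanged, change_detected)
```

A checksum only detects that the content changed; it does not say which version the metadata should now describe, so the care urged above still applies.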

Moreover, there is the question of data format: storing metadata in a human-readable format such as XML can be useful because users can understand and edit it without specialized tools. On the other hand, these formats are not optimized for storage capacity; it may be useful to store metadata in a binary, non-human-readable format instead to speed up transfer and save memory.
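The size trade-off can be illustrated by encoding the same small record once as XML text and once as packed binary using Python's standard `struct` module. The record (image width, height and colour depth) and the binary layout are assumptions chosen for this sketch.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical record: image width, height and colour depth.
width, height, depth = 1920, 1080, 24

# Human-readable form: XML text, editable with any text editor.
root = ET.Element("image")
for name, value in (("width", width), ("height", height), ("depth", depth)):
    ET.SubElement(root, name).text = str(value)
xml_bytes = ET.tostring(root, encoding="unicode").encode("utf-8")

# Compact form: two unsigned 16-bit integers and one unsigned 8-bit integer,
# little-endian, with no field names stored at all.
binary_bytes = struct.pack("<HHB", width, height, depth)

print(xml_bytes.decode("utf-8"))
print(len(xml_bytes), "bytes as XML;", len(binary_bytes), "bytes as binary")

# The binary form is only meaningful to a reader that knows the layout.
print(struct.unpack("<HHB", binary_bytes))
```

The binary form is much smaller, but the field names and types live only in the program that reads it, which is the readability cost the paragraph above describes.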

Database Management

Each relational database system has its own mechanisms for storing metadata. Examples of relational-database metadata include:

  • Tables of all tables in a database, their names, sizes and number of rows in each table.
  • Tables of columns in each database, what tables they are used in, and the type of data stored in each column.

In database terminology, this set of metadata is referred to as the catalog. The SQL standard specifies a uniform means to access the catalog, called the INFORMATION_SCHEMA, but not all databases implement it, even if they implement other aspects of the SQL standard. For an example of database-specific metadata access methods, see Oracle metadata. Programmatic access to metadata is possible using APIs such as JDBC, or SchemaCrawler.<ref name=schemacrawler>Plantilla:Cite web</ref>
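As a concrete case of database-specific catalog access, the sketch below uses SQLite through Python's standard `sqlite3` module. SQLite is one of the systems that does not provide INFORMATION_SCHEMA; it exposes its catalog through the `sqlite_master` table and `PRAGMA table_info` instead. The table `incidents` and its columns are hypothetical.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE incidents (id INTEGER PRIMARY KEY, place TEXT, reported_on TEXT)"
)

# List the tables recorded in SQLite's catalog table.
tables = [
    row[0]
    for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# Describe the columns of one table. PRAGMA table_info yields
# (cid, name, type, notnull, dflt_value, pk) for each column.
columns = [
    (row[1], row[2])
    for row in connection.execute("PRAGMA table_info(incidents)")
]

print(tables)
print(columns)
connection.close()
```

On a database that does implement the standard catalog, the equivalent queries would target views such as `INFORMATION_SCHEMA.TABLES` and `INFORMATION_SCHEMA.COLUMNS`.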

References

<references group=""></references>

<references/>
