UNSTRUCTURED DATA


Unstructured data refers to (usually) computerized information that lacks a data structure a machine can easily read. Examples include audio, video, and unstructured text such as the body of an email or a word processor document.

Examples

Merrill Lynch estimates that more than 85 percent of all business information exists as unstructured data, commonly appearing in emails, memos, notes from call centers and support operations, news, user groups, chats, reports, letters, surveys, white papers, marketing material, research, presentations and Web pages.[1]

Data with some form of structure may also be referred to as unstructured data if the structure is not helpful for the desired processing task. For example, an HTML webpage is highly structured, but this structure is often oriented towards formatting rather than towards more complex processing of the page's content.
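The HTML case can be illustrated with a minimal sketch in Python, using only the standard library's html.parser module. The sample page and the names TextExtractor, visible_text and SAMPLE_PAGE are hypothetical, chosen for illustration. The tags say which words are bold or italic, but nothing in the markup identifies one value as an order number and another as a shipping date, so removing the markup leaves ordinary unstructured text.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text between tags and ignores the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for each run of text found between tags.
        text = data.strip()
        if text:
            self.chunks.append(text)


def visible_text(html):
    """Return the page's text content with all markup removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page: the markup describes presentation, not meaning.
SAMPLE_PAGE = (
    "<html><body><h1>Order update</h1>"
    "<p>Your order <b>A-1042</b> ships on <i>3 March</i>.</p>"
    "</body></html>"
)

if __name__ == "__main__":
    print(visible_text(SAMPLE_PAGE))
```

Any further interpretation of the resulting text, such as recognizing the date, has to come from analysis of the content itself rather than from the page structure.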


Dealing with unstructured data

Data mining and text analytics are two families of techniques used to find patterns in, or otherwise interpret, unstructured data. UIMA (Unstructured Information Management Architecture) provides a common framework for processing such data to extract meaning and create structured data about it.
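As a rough illustration of creating structured data from unstructured text, the sketch below pulls email addresses and ISO-style dates out of a free-text note using regular expressions. It is a deliberately simple stand-in for text analytics and does not use UIMA or reflect its API. The sample note, the simplified patterns, and the names ExtractedRecord, extract_record and SAMPLE_NOTE are assumptions made for this example.

```python
import re
from dataclasses import dataclass

# Simplified patterns, assumed for illustration; real-world text needs more care.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


@dataclass
class ExtractedRecord:
    """Structured fields derived from one piece of free text."""
    emails: list
    dates: list
    word_count: int


def extract_record(text):
    """Build a structured record from unstructured text."""
    return ExtractedRecord(
        emails=EMAIL_RE.findall(text),
        dates=DATE_RE.findall(text),
        word_count=len(text.split()),
    )


# Hypothetical call-center note.
SAMPLE_NOTE = (
    "Customer called on 2003-02-14 about a late invoice. "
    "Reply to jane.doe@example.com before 2003-02-21."
)

if __name__ == "__main__":
    print(extract_record(SAMPLE_NOTE))
```

The resulting record can be stored and queried like any other structured data, while the original note remains available for deeper analysis.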

Notes

  1. "The problem with unstructured data", DM Review, February 2003.
