Monday, 2 June 2014

Challenges on Data


  • What challenges does processing such huge amounts of data hold for Microsoft customers?
    Kromer: Businesses that have big data requirements, like search engines (think Google, Yahoo, Bing) or large social networking sites, have a need to process super-large (aka “big”) data sets very, very quickly. In these cases, it may be beneficial to utilize a distributed NoSQL approach with tools like Hadoop and MapReduce, where the database schema is minimized, with classic SQL constructs like ACID [atomicity, consistency, isolation, durability] and referential integrity put aside in favor of speed and easy data access. Microsoft is supporting our customers with big data requirements with these connectors. There are also some very exciting projects coming out of Microsoft Research and [Windows] Azure around distributed processing and big data. There is a white paper that [Microsoft watcher] Andrew Brust published for Microsoft, talking about using existing capabilities in Windows Azure, such as Azure Table Storage, for storing schema-lite structured data in key [or] value pairs for easy and quick access.
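
To make the "schema-lite key/value" idea above a little more concrete, here is a minimal Python sketch. It is only an in-memory illustration, not the Azure Table Storage API: the class KeyValueTable, its put and get methods, and the sample data are all hypothetical names made up for this example. The point it tries to show is that each entry is found directly by its key and that two entries in the same table do not have to share the same set of properties.

```python
# Hypothetical in-memory stand-in for a schema-lite key/value table.
# This is NOT the Azure SDK; every name here is an illustrative assumption.
class KeyValueTable:
    def __init__(self):
        # Maps a (partition_key, row_key) pair to a free-form dict of properties.
        self._rows = {}

    def put(self, partition_key, row_key, **properties):
        # No fixed column schema is enforced: each entity keeps whatever
        # properties the caller supplies.
        self._rows[(partition_key, row_key)] = dict(properties)

    def get(self, partition_key, row_key):
        # Direct lookup by key; returns None when the key is absent.
        return self._rows.get((partition_key, row_key))


table = KeyValueTable()
table.put("users", "alice", email="alice@example.com", age=30)
table.put("users", "bob", city="Seattle")  # different properties, same table
```

Looking up ("users", "alice") gives back that entry's own properties, and asking for a key that was never stored gives back None. There are no joins, no foreign keys and no transaction across entries here, which is the trade-off the answer above describes: less structure in exchange for simple, fast access by key.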

Database

A database is a collection of information that is organized so that it can easily be accessed, managed, and updated. In one view, databases can be classified according to types of content: bibliographic, full-text, numeric, and images.