John Graunt introduces statistical data analysis during the bubonic plague. The London haberdasher published the first collection of public health records when he recorded and analyzed death rates and their variations during the plague in England.
Herman Hollerith
Herman Hollerith invents the punch card tabulating machine, marking the beginning of data processing. The tabulating device Hollerith developed was used to process data from the 1890 U.S. Census. Later, in 1911, he founded the Computing-Tabulating-Recording Company, which would eventually become IBM.
Tesla predictions
Nikola Tesla predicts humans will one day have access to large swaths of data via an instrument that can be carried “in [one’s] vest pocket.” Tesla managed to predict our modern affinity for smartphones and other handheld devices based on his understanding of wireless technology: “When wireless is perfectly applied, the whole earth will be converted into a huge brain, which in fact it is, all things being particles of a real and rhythmic whole. We shall be able to communicate with one another instantly, irrespective of distance.”
Fritz Pfleumer
Fritz Pfleumer, a German-Austrian engineer, invents a method of storing information magnetically on tape. The principles he developed are still in use today, with the vast majority of digital data stored magnetically on computer hard disks.
Franklin D. Roosevelt’s
Franklin D. Roosevelt’s administration in the US creates the first major data project, tracking the contributions of nearly 3 million employers and 26 million Americans after the Social Security Act becomes law. IBM is given the massive bookkeeping contract and develops punch card reading machines for the job.
Richard Millar Devens coins the term “business intelligence.” In his “Cyclopædia of Commercial and Business Anecdotes,” Devens described how a banker used information gathered from his environment to turn a profit. This is thought to be the first documented account of a business putting data analysis to use for commercial purposes. As we understand it today, business intelligence is the process of analyzing data and then using it to deliver actionable information.
Dawn of Electronic Computing
1940-1945
1943
The U.K. develops Colossus, one of the first data processing machines, to decipher Nazi codes during WWII. Colossus performed Boolean and counting operations to analyze large volumes of encrypted data.
Fremont Rider, a Wesleyan University librarian, predicts that American libraries are doubling in size every 16 years. At that growth rate, he estimated, the Yale Library would hold 200,000,000 volumes by 2040, requiring a cataloguing staff of more than 6,000 people and occupying approximately 6,000 miles of shelves.
1945
ENIAC
ENIAC, the first electronic general purpose computer, was completed.
Arthur Samuel, a programmer at IBM and pioneer of artificial intelligence, coins the term machine learning (ML).
Claude Shannon
Claude Shannon, the “Father of Information Theory,” researched the storage capacity of various media, such as photographic film and punch cards. The largest item on Shannon’s list was the Library of Congress, which he estimated at 100 trillion bits of data.
IBM 305 & 650 RAMAC
IBM announced the 305 and 650 RAMAC (Random Access Method of Accounting and Control) “data processing machines”, incorporating the first-ever disk storage product.
The IBM System/360
The IBM System/360 family of mainframe computer systems was launched.
The first data center
The U.S. government plans the first data center, built to store millions of tax returns and sets of fingerprints on magnetic tape.
The first wide area network
Advanced Research Projects Agency Network (ARPANET), the first wide area network to feature distributed control, was created. It later adopted the TCP/IP protocol suite and formed the foundation of today’s internet.
ARCnet introduced
ARCnet, the first LAN, is introduced at Chase Manhattan Bank, connecting up to 255 computers.
The Internet age: the dawn of big data
1981-1999
The PC era began
IBM released its first commercially available relational database, DB2.
Beginning of Python
Implementation of the Python programming language begins. Fiction author Erik Larson makes an early use of the term “big data” in a magazine article commenting on advertisers’ use of data to target customers. Tim Berners-Lee and Robert Cailliau create the World Wide Web and develop HTML, URLs and HTTP while working at CERN. The internet age, with widespread and easy access to data, begins.
According to R.J.T. Morris and B.J. Truskowski in their 2003 paper “The Evolution of Storage Systems,” digital storage becomes more cost-effective than paper storage.
Cost-effective storage
Carlo Strozzi develops NoSQL, an open-source relational database that does not use SQL as its query language. The name was later adopted for the broader class of non-relational databases that store and retrieve data modeled differently from the traditional tabular methods of relational systems.
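To make the tabular-versus-non-tabular contrast concrete, here is a loose, hypothetical sketch in Python. It illustrates the data-modeling idea the later “NoSQL” label came to describe, not Strozzi’s actual software; the field names are invented for illustration.

```python
# A tabular (relational) record: fixed columns, one flat value per column.
row = ("jdoe", "John Doe", "1998-05-21")  # (user_id, name, signup_date)

# A document-style record, the kind of model the later "NoSQL" label
# came to describe: nested, self-describing, and schema-flexible.
doc = {
    "user_id": "jdoe",
    "name": "John Doe",
    "signup_date": "1998-05-21",
    "addresses": [  # nesting like this has no single flat-table equivalent
        {"type": "home", "city": "Pisa"},
    ],
}
```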
John R. Mashey, Chief Scientist at SGI, presents a paper titled “Big Data… and the Next Wave of Infrastress” at a USENIX meeting.
Kevin Ashton
The term Internet of Things (IoT) was used for the very first time by Kevin Ashton in a business presentation at Procter & Gamble (P&G).
Based on data from 1999, the first edition of the influential study “How Much Information?” by Peter Lyman and Hal R. Varian (published in 2000) attempts to quantify the amount of digital information available in the world to date.
1997
The term “big data” is used for the first time: a paper on visualization by David Ellsworth and Michael Cox of NASA’s Ames Research Center describes the challenges of working with data sets too large for existing computing systems. “We call this the problem of big data,” they wrote, and the term was coined in this setting.
Big Data in the 21st century
2001-2020
Doug Laney
Doug Laney of analyst firm Gartner (at the time, META Group) coins the 3Vs (volume, velocity and variety) in a research paper titled “3D Data Management: Controlling Data Volume, Velocity, and Variety,” defining the dimensions and properties of big data. The Vs encapsulate the working definition of big data and usher in a new period in which big data can be viewed as a dominant feature of the 21st century. Additional Vs, such as veracity, value and variability, have since been added to the list.
AWS as a free service
Amazon Web Services (AWS) launched as a free service.
2005
This year sees the birth of Hadoop, an open-source big data framework now developed by Apache. The same year, the user-generated web known as Web 2.0 takes shape.
O’Reilly Media is often credited with popularizing the term “big data” in 2005, but the practice of analyzing large volumes of data, and the necessity of doing so, had been recognized for quite some time before.
Cloud Computing
Amazon Web Services (AWS) starts offering web-based computing infrastructure services, now known as cloud computing. Currently, AWS dominates the cloud services industry with roughly one-third of the global market share.
First iPhone
Apple launched the first iPhone, creating the mobile internet as we know it today.
North Carolina State University established the Institute of Advanced Analytics which offered the first Master’s degree in Analytics.
Increase in Data Consumption
According to a survey by the Global Information Industry Center, Americans consumed approximately 1.3 trillion hours of information in 2008, an average of about 12 hours per person per day. The survey estimates that the average person consumed 34 gigabytes of information and 100,500 words in a single day, for a national total of 10,845 trillion words and 3.6 zettabytes.
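As a rough consistency check, the per-person figures do multiply out to roughly the reported national totals. A minimal sketch in Python, assuming about 300 million Americans and 365 days (the survey’s exact population figure is an assumption here, which is why the results land near, not exactly on, the reported totals):

```python
# Scale the survey's per-person daily figures up to national annual totals.
PEOPLE = 300e6  # approximate U.S. population in 2008 (assumption)
DAYS = 365

words_per_person_per_day = 100_500
bytes_per_person_per_day = 34e9  # 34 gigabytes, decimal (SI) units

total_words = words_per_person_per_day * DAYS * PEOPLE
total_bytes = bytes_per_person_per_day * DAYS * PEOPLE

print(f"{total_words / 1e12:,.0f} trillion words per year")  # ~11,005
print(f"{total_bytes / 1e21:.1f} zettabytes per year")       # ~3.7
```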
Google processed 20 petabytes of data in a single day.
The world’s CPUs process over 9.57 zettabytes (or 9.57 trillion gigabytes) of data, the equivalent of about 12 gigabytes per worker per day. Global production of new information hits an estimated 14.7 exabytes.
Amid the continuing explosion of data, George Gilder and Bret Swanson predict that U.S. IP traffic will reach 1 zettabyte by the end of 2015, putting U.S. internet usage at roughly 50 times its 2006 level.
Business Intelligence
Gartner reports business intelligence as the top priority for Chief Information Officers. As companies face a period of economic volatility and uncertainty due to the Great Recession, squeezing value out of data becomes paramount.
A McKinsey report estimated that, on average, a US company with 1,000 employees stores more than 200 TB of data.
Eric
Schmidt
Eric Schmidt, executive chairman of Google, tells a conference that as much data is now created every two days as was created from the beginning of human civilization to the year 2003.
Open Compute Project
Facebook launched the Open Compute Project to share specifications for energy-efficient data centers. The initiative’s goal is to deliver a 38% increase in energy efficiency at a 24% lower cost.
A McKinsey report on big data highlights a looming shortage of analytics talent: by 2018, the US alone could face a shortfall of 140,000 to 190,000 professionals with deep analytical skills, as well as 1.5 million managers and analysts able to put big data analysis to use.
Spending on Big Data
According to Gartner, 72% of organizations plan to increase their spending on big data analytics, yet 60% say they lack the personnel with the deep analytical skills required.
Digital for business
88% of business executives surveyed by GE in partnership with Accenture report that big data analytics is a top priority for their business.
According to IDC, by the end of 2020 business transactions on the web (B2B and B2C) will exceed 450 billion per day.
By the end of 2020, a third of the data produced in the digital universe will live in or pass through the cloud.
For the first time, more mobile devices access the internet than desktop computers in the U.S. The rest of the world follows suit two years later, in 2016.
1,000,000,000 GB of big data
Google and Microsoft lead massive build-outs of data centers.
Google, the largest big data company in the world, stores 10 billion gigabytes of data and handles approximately 3.5 billion search requests every day.
Amazon runs more servers than any other company: the 1,000,000,000 gigabytes of big data generated by its 152 million customers are stored on more than 1,400,000 servers across its data centers.
90% of the data in 2 years
Ninety percent of the world’s data was created in the last two years alone, and IBM reports that 2.5 quintillion bytes of data is created every day (that’s 18 zeroes).
Edge computing
Allied Market Research reports the big data and business analytics market hit $193.14 billion in 2019, and estimates it will grow to $420.98 billion by 2027 at a compound annual growth rate of 10.9%.
Edge computing, which refers to computing done near the source of data collection rather than in the cloud or a centralized data center, represents the next frontier for big data and will revise the role of the cloud in key sectors of the economy.
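As a minimal illustration of the pattern (a hypothetical device and endpoint, not any particular platform’s API), an edge node might aggregate raw sensor readings locally and forward only compact summaries upstream, rather than shipping every reading to a central data center:

```python
import random
import statistics

def read_sensor() -> float:
    # Stand-in for a real sensor read (random data for this sketch).
    return random.gauss(20.0, 2.0)

def send_upstream(summary: dict) -> None:
    # Stand-in for the network call to a cloud or central data center.
    print("upstream:", summary)

def edge_summarize(batch_size: int = 1000, batches: int = 3) -> None:
    """Aggregate raw readings at the edge; ship only compact summaries."""
    for _ in range(batches):
        batch = [read_sensor() for _ in range(batch_size)]
        # The summary is a tiny fraction of the raw data volume.
        send_upstream({
            "count": len(batch),
            "mean": round(statistics.fmean(batch), 2),
            "min": round(min(batch), 2),
            "max": round(max(batch), 2),
        })

if __name__ == "__main__":
    edge_summarize()
```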
Data Speed
Data center network speeds to exceed 1,000 Gbps.
The future of big data
2025
The role of data for businesses
Data centers will be increasingly on-device.
Due to the explosion of connected devices, our increasing reliance on the cloud and the coming edge computing revolution, among other factors, big data has a lot of growing left to do.
Technologies such as machine learning, AI and IoT analytics, for example, continue to push the envelope by vastly improving our ability to process, analyze and act upon data.
Expect significant advancements in big data and analytics to happen at a faster clip. The next few years could very well make what we’ve seen over the last 20 years look like child’s play.
Fascinating Big Data Stats
Data volumes have skyrocketed: more data was generated in the last two years than in all of human history before that.
Since 2012, big data has created 8 million jobs in the US alone and 6 million more worldwide.
Big data needs as much computing power as you can throw at it, which is why engineers aspire to reach the processing capability of the human brain in their CPUs within the next decade.
Big data holds the key to an amazing future. It reveals patterns and connections that significantly improve our lives: safer self-driving cars, more effective medical treatments, even weather forecasts reliable enough to help farmers get better yields.
The driving force behind big data is the “datafication” of information. In the past, you would just go for a walk; today you know it was 10,435 steps long and that you burned 450 calories doing it.
IT services earned the biggest share of big data and analytics (BDA) revenues in 2019, an estimated $77.5 billion, followed by hardware purchases ($23.7 billion) and business services ($20.7 billion). On the software side, big data statistics show BDA revenues reaching as high as $67.2 billion this year.